Commit 825a960

fix ibm-fms Repo-Jacking

Signed-off-by: kurtis <kurtis@us.ibm.com>
1 parent 9e298e4

9 files changed: 69 additions & 54 deletions
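Repo-jacking background: when an organization name on GitHub or Hugging Face is renamed or abandoned, a third party can re-register the old name and serve malicious artifacts through every stale link. This commit retires the old `ibm-fms` namespace in favor of `ibm-ai-platform` wherever it appears. A minimal sketch of the same bulk rename as a script (the helper name and file-suffix list are hypothetical, not part of this commit, which edits the files by hand):

```python
from pathlib import Path

OLD, NEW = "ibm-fms", "ibm-ai-platform"

def retarget_org(root: str, suffixes=(".md", ".py", ".sh", ".csv")) -> int:
    """Rewrite OLD -> NEW in every matching file under root; return files changed."""
    changed = 0
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8")
        if OLD in text:
            # Plain substring replace also covers derived forms like
            # "ibm-fms_Bamba-9B" in output paths, matching this commit.
            path.write_text(text.replace(OLD, NEW), encoding="utf-8")
            changed += 1
    return changed
```

A plain substring match is deliberately broad here; the diff below shows the same token being replaced in Markdown links, Python strings, shell variables, and CSV rows alike.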

README.md

Lines changed: 11 additions & 5 deletions
@@ -6,11 +6,13 @@
 
 
 <p align="center">
-🤗 <a href="https://huggingface.co/collections/ibm-fms/bamba-674f1388b9bbc98b413c7bab"> Bamba on Hugging Face</a>&nbsp | <a href="https://huggingface.co/blog/bamba"> Bamba Blog</a>&nbsp
+🤗 <a href="https://huggingface.co/collections/ibm-ai-platform/bamba-674f1388b9bbc98b413c7bab"> Bamba on Hugging Face</a>&nbsp | <a href="https://huggingface.co/blog/bamba"> Bamba Blog</a>&nbsp
 <be>
 
 
-<!--Bamba is a repository for training and using [Bamba](https://huggingface.co/ibm-fms/Avengers-Mamba2-9B) models, which are derived from [Mamba](https://github.com/state-spaces/mamba) models.-->
+<!--Bamba is a repository for training and using [Bamba](https://huggingface.co/ibm-ai-platform/Avengers-Mamba2-9B) models, which are derived from [Mamba](https://github.com/state-spaces/mamba) models.-->
 
 Bamba-9B is a decoder-only language model based on the [Mamba-2](https://github.com/state-spaces/mamba) architecture and is designed to handle a wide range of text generation tasks. It is trained from scratch using a two-stage training approach. In the first stage, the model is trained on 2 trillion tokens from the Dolma v1.7 dataset. In the second stage, it undergoes additional training on 200 billion tokens, leveraging a carefully curated blend of high-quality data to further refine its performance and enhance output quality.
 
@@ -44,14 +46,17 @@ pip install git+https://github.com/huggingface/transformers.git
 | Bamba | 9B (9.78B) | 32 | 4096 | 32 | Yes | 8 | 4096 | False |
 
 ### Checkpoints
-You can find links to our model checkpoints here: [Bamba Models](https://huggingface.co/collections/ibm-fms/bamba-674f1388b9bbc98b413c7bab)
+You can find links to our model checkpoints here: [Bamba Models](https://huggingface.co/collections/ibm-ai-platform/bamba-674f1388b9bbc98b413c7bab)
 
 ## Inference
 
 You can use the following command to perform text generation using one of our checkpoints provided above:
 
 ```python
-python text_generation.py --model_path ibm-fms/Bamba-9B --tokenizer_path ibm-fms/Bamba-9B --prompt "The largest living mammal on Earth is " --max_new_tokens 128
+python text_generation.py --model_path ibm-ai-platform/Bamba-9B --tokenizer_path ibm-ai-platform/Bamba-9B --prompt "The largest living mammal on Earth is " --max_new_tokens 128
 ```
 
 ## Training
@@ -247,7 +252,8 @@ make -j
 
 ### Conversion to GGUF
 
-You can use a pre-converted GGUF file from Huggingface (e.g. [bamba-9b.gguf](https://huggingface.co/ibm-fms/Bamba-9B/blob/main/bamba-9b.gguf)). If one doesn't exist, you can use the [convert_hf_to_gguf.py](https://github.com/gabe-l-hart/llama.cpp/blob/BambaArchitecture/convert_hf_to_gguf.py) script from Gabe's fork to perform the conversion manually.
+You can use a pre-converted GGUF file from Huggingface (e.g. [bamba-9b.gguf](https://huggingface.co/ibm-ai-platform/Bamba-9B/blob/main/bamba-9b.gguf)). If one doesn't exist, you can use the [convert_hf_to_gguf.py](https://github.com/gabe-l-hart/llama.cpp/blob/BambaArchitecture/convert_hf_to_gguf.py) script from Gabe's fork to perform the conversion manually.
 
 ```sh
 # Install the python dependencies

blog/bamba.md

Lines changed: 18 additions & 9 deletions
@@ -10,7 +10,8 @@ We introduce **Bamba-9B**, an inference-efficient Hybrid Mamba2 model trained by
 
 ## Artifacts 📦
 
-1. [Hugging Face Bamba collection](https://huggingface.co/collections/ibm-fms/bamba-674f1388b9bbc98b413c7bab)
+1. [Hugging Face Bamba collection](https://huggingface.co/collections/ibm-ai-platform/bamba-674f1388b9bbc98b413c7bab)
 2. [GitHub repo with inference, training, and tuning scripts](https://github.com/foundation-model-stack/bamba)
 3. [Data loader](https://github.com/foundation-model-stack/fms-fsdp/blob/main/fms_fsdp/utils/dataset_utils.py)
 4. [Quantization](https://github.com/foundation-model-stack/fms-model-optimizer)
@@ -32,8 +33,10 @@ To use Bamba with transformers, you can use the familiar `AutoModel` classes and
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model = AutoModelForCausalLM.from_pretrained("ibm-fms/Bamba-9B")
-tokenizer = AutoTokenizer.from_pretrained("ibm-fms/Bamba-9B")
+model = AutoModelForCausalLM.from_pretrained("ibm-ai-platform/Bamba-9B")
+tokenizer = AutoTokenizer.from_pretrained("ibm-ai-platform/Bamba-9B")
 
 message = ["Mamba is a snake with following properties "]
 inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
@@ -67,7 +70,8 @@ We compare Bamba-9B with SoTA transformer models of similar size ([Meta Llama 3.
 
 | Model | Average | MMLU | ARC-C | GSM8K | Hellaswag | OpenbookQA | Piqa | TruthfulQA | Winogrande |
 | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
-| [Bamba 9B](https://huggingface.co/ibm-fms/Bamba-9B) | 62.31 | 60.77 | 63.23 | 36.77 | 81.8 | 47.6 | 82.26 | 49.21 | 76.87 |
+| [Bamba 9B](https://huggingface.co/ibm-ai-platform/Bamba-9B) | 62.31 | 60.77 | 63.23 | 36.77 | 81.8 | 47.6 | 82.26 | 49.21 | 76.87 |
 | [Meta Llama 3.1 8B](https://huggingface.co/meta-llama/Llama-3.1-8B) | 63.51 | 66.26 | 57.85 | 49.96 | 81.98 | 46.8 | 82.54 | 45.16 | 77.51 |
 | [Olmo2 7B](https://huggingface.co/allenai/OLMo-2-1124-7B) | 66.17 | 63.96 | 64.51 | 68.01 | 81.93 | **49.2** | 81.39 | 43.32 | 77.03 |
 | [IBM Granite v3 8B](https://huggingface.co/ibm-granite/granite-3.0-8b-base) | 67.47 | 65.45 | 63.74 | 62.55 | **83.29** | 47.6 | **83.41** | 52.89 | **80.82** |
@@ -79,7 +83,8 @@ We compare Bamba-9B with SoTA transformer models of similar size ([Meta Llama 3.
 
 | Model | Average | MMLU-PRO | BBH | GPQA | IFEval | MATH Lvl 5 | MuSR |
 | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
-| [Bamba 9B](https://huggingface.co/ibm-fms/Bamba-9B) | 10.91 | 17.53 | 17.4 | 4.14 | 15.16 | 1.66 | 9.59 |
+| [Bamba 9B](https://huggingface.co/ibm-ai-platform/Bamba-9B) | 10.91 | 17.53 | 17.4 | 4.14 | 15.16 | 1.66 | 9.59 |
 | [Meta Llama 3.1 8B](https://huggingface.co/meta-llama/Llama-3.1-8B) | 14.27 | 25.46 | 25.16 | 8.61 | 12.55 | 5.14 | 8.72 |
 | [Olmo2 7B](https://huggingface.co/allenai/OLMo-2-1124-7B) | 13.36 | 22.79 | 21.69 | 4.92 | 16.35 | 4.38 | 10.02 |
 | [IBM Granite v3 8B](https://huggingface.co/ibm-granite/granite-3.0-8b-base) | 21.14 | 25.83 | 28.02 | 9.06 | **44.79** | 9.82 | 9.32 |
@@ -93,7 +98,8 @@ Safety benchmarks are crucial for ensuring AI models generate content that is et
 
 | Model | PopQA | Toxigen | BBQ | Crow-SPairs* |
 | :---- | :---- | :---- | :---- | :---- |
-| [Bamba 9B](https://huggingface.co/ibm-fms/Bamba-9B) | 20.5 | 57.4 | 44.2 | 70.8 |
+| [Bamba 9B](https://huggingface.co/ibm-ai-platform/Bamba-9B) | 20.5 | 57.4 | 44.2 | 70.8 |
 | [Meta Llama 3.1 8B](https://huggingface.co/meta-llama/Llama-3.1-8B) | **28.77** | 67.02 | 59.97 | 70.84 |
 | [IBM Granite v3 8B](https://huggingface.co/ibm-granite/granite-3.0-8b-base) | 27.5 | **79.9** | **82.1** | 75 |
 | [Olmo2 7B](https://huggingface.co/allenai/OLMo-2-1124-7B) | 25.7 | 63.1 | 58.4 | 72 |
@@ -111,9 +117,11 @@ We pick a few prominent models: [Olmo 7B](https://huggingface.co/allenai/OLMo-7B
 
 | Model | Average | MMLU | ARC-C | GSM8K | Hellaswag | OpenbookQA | Piqa | TruthfulQA | Winogrande |
 | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
-| [Bamba 9B (2.2T)](https://huggingface.co/ibm-fms/Bamba-9B) | **62.31** | **60.77** | **63.23** | **36.77** | 81.8 | **47.6** | 82.26 | **49.21** | 76.87 |
+| [Bamba 9B (2.2T)](https://huggingface.co/ibm-ai-platform/Bamba-9B) | **62.31** | **60.77** | **63.23** | **36.77** | 81.8 | **47.6** | 82.26 | **49.21** | 76.87 |
 | [Olmo1.5 7B (2T)](https://huggingface.co/allenai/OLMo-7B-0424-hf) | 55.8 | 53.38 | 50.51 | 27.67 | 79.13 | 45.2 | 81.56 | 35.92 | 73.09 |
-| [Bamba 9B (2T)](https://huggingface.co/ibm-fms/Bamba-9B-2T) | 59.11 | 59.05 | 57.25 | 24.03 | **83.66** | 47.6 | **83.62** | 38.26 | **79.4** |
+| [Bamba 9B (2T)](https://huggingface.co/ibm-ai-platform/Bamba-9B-2T) | 59.11 | 59.05 | 57.25 | 24.03 | **83.66** | 47.6 | **83.62** | 38.26 | **79.4** |
 | [Meta Llama2 7B (2T)](https://huggingface.co/meta-llama/Llama-2-7b-hf) | 53.78 | 46.64 | 52.65 | 13.57 | 78.95 | 45.2 | 80.03 | 38.96 | 74.27 |
 | [IBM Granite 7B (2T)](https://huggingface.co/ibm-granite/granite-7b-base) | 52.07 | 49.02 | 49.91 | 10.84 | 77.0 | 40.8 | 80.14 | 38.7 | 70.17 |

@@ -132,7 +140,8 @@ Falcon Mamba is a pure Mamba model, Zamba has shared attention layer for every 6
 
 | Model | Average | MMLU | ARC-C | GSM8K | Hellaswag | OpenbookQA | Piqa | TruthfulQA | Winogrande |
 | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- | :---- |
-| [Bamba 9B](https://huggingface.co/ibm-fms/Bamba-9B) | 62.31 | 60.77 | 63.23 | 36.77 | 81.8 | 47.6 | 82.26 | 49.21 | 76.87 |
+| [Bamba 9B](https://huggingface.co/ibm-ai-platform/Bamba-9B) | 62.31 | 60.77 | 63.23 | 36.77 | 81.8 | 47.6 | 82.26 | 49.21 | 76.87 |
 | NVIDIA Mamba2 Hybrid 8B\* | 58.78 | 53.6 | 47.7 | 77.69 | \-- | 42.8 | 79.65 | 38.72 | 71.27 |
 | [Zamba 7B](https://huggingface.co/Zyphra/Zamba-7B-v1) | 64.36 | 57.85 | 55.38 | 61.33 | 82.27 | 46.8 | **82.21** | 49.69 | 79.32 |
 | [Falcon Mamba 7B](https://huggingface.co/tiiuae/falcon-mamba-7b) | **65.31** | **63.19** | **63.4** | **52.08** | 80.82 | **47.8** | **83.62** | **53.46** | **78.14** |

blog/bamba31T.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 During Christmas of 2024, IBM, Princeton, CMU, and UIUC [released](https://huggingface.co/blog/bamba), Bamba v1, a performant Mamba2 based pretrained model with full data lineage trained to 2T tokens. Since then, we have been busy cooking an update with new datasets. Today, we are excited to release Bamba v2, trained for an additional 1T tokens that significantly improves on Bamba v1. The L1 and L2 leaderboard scores outperform Llama 3.1 8B, which was trained with nearly 5x the amount of data. All of this with the inference speedup that we get from Mamba2 based architecture, which with the latest vLLM is 2-2.5x faster than similar sized transformer models.
 
 ## Artifacts 📦
-1. [Hugging Face Bamba collection](https://huggingface.co/collections/ibm-fms/bamba-674f1388b9bbc98b413c7bab)
+1. [Hugging Face Bamba collection](https://huggingface.co/collections/ibm-ai-platform/bamba-674f1388b9bbc98b413c7bab)
 2. [GitHub repo with inference, training, and tuning scripts](https://github.com/foundation-model-stack/bamba)
 3. [vLLM RFC](https://github.com/vllm-project/vllm/issues/17140)

evaluation/README.md

Lines changed: 2 additions & 2 deletions
@@ -90,8 +90,8 @@ In case you just want to run the benchmark as is and you do not want to use a mo
 harness_path="path/to/lm-evaluation-harness"
 python_path="python"
 lm_eval_script="${harness_path}/lm_eval"
-pretrained_model="ibm-fms/Bamba-9B"
-output_base_path="evaluation_results/debug/ibm-fms_Bamba-9B"
+pretrained_model="ibm-ai-platform/Bamba-9B"
+output_base_path="evaluation_results/debug/ibm-ai-platform_Bamba-9B"
 batch_size=4
 
 # Function to run lm_eval with common arguments

evaluation/aggregation.py

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ def get_results_df(res_dir_paths, results_from_papers_path=None):
 
     res_df["score"] = res_df["score"].round(2)
     res_df["model"] = res_df["model"].apply(
-        lambda x: x.replace("/dccstor/fme/users/yotam/models/", "ibm-fms/")
+        lambda x: x.replace("/dccstor/fme/users/yotam/models/", "ibm-ai-platform/")
    )
 
    # df_pivot_score.to_csv("output/combined_results.csv", index=False)
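The changed `lambda` maps an internal checkpoint storage path onto the public Hugging Face org prefix. A standalone illustration of the new mapping (the `Bamba-9B` suffix appended to the internal path below is illustrative):

```python
def normalize(x: str) -> str:
    # Updated mapping: internal storage path -> public Hugging Face org prefix.
    return x.replace("/dccstor/fme/users/yotam/models/", "ibm-ai-platform/")

print(normalize("/dccstor/fme/users/yotam/models/Bamba-9B"))  # ibm-ai-platform/Bamba-9B
# Names without the internal prefix pass through unchanged.
print(normalize("meta-llama/Llama-3.1-8B"))  # meta-llama/Llama-3.1-8B
```

Because `str.replace` is a no-op when the substring is absent, applying this over a mixed column of internal and external model names only rewrites the internal ones.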

evaluation/assets/eval_metadata.csv

Lines changed: 32 additions & 32 deletions
@@ -5,38 +5,38 @@ allenai/OLMo-2-1124-7B,7,PT,non_bamba
 allenai/OLMo-7B-0424-hf,7,PT,non_bamba
 allenai/OLMo-7B-hf,7,PT,non_bamba
 google/gemma-2-9b,9,PT,non_bamba
-ibm-fms/Bamba-9.8b-1.8T-hf,9,PT,bamba
-ibm-fms/Bamba-9.8b-2.2T-hf,9,PT,bamba
-ibm-fms/Bamba-9.8b-2T-hf,9,PT,bamba
-ibm-fms/Bamba-9B-1.8T-fp8,9,PT,bamba
-ibm-fms/Bamba-9B-2.65T,9,PT,bamba
-ibm-fms/Bamba-9B-2T-fp8,9,PT,bamba
-ibm-fms/Bamba-9B-fp8,9,PT,bamba
-ibm-fms/Bamba-9b-2.1T-hf,9,PT,bamba
-ibm-fms/Bamba-9b-2.3T-hf,9,PT,bamba
-ibm-fms/Bamba-9b-2.5T-hf,9,PT,bamba
-ibm-fms/Bamba-9b-2.6T-hf,9,PT,bamba
-ibm-fms/Bamba-9b-2.8T-hf,9,PT,bamba
-ibm-fms/Bamba_annealed_models/Bamba-9b-2.1T-finemath-hf,9,PT,bamba
-ibm-fms/Bamba_annealed_models/Bamba-9b-Olmo-constant-2.5T-hf,9,PT,bamba
-ibm-fms/Bamba_annealed_models/Bamba-9b-Olmo-cosine-2.5T-hf,9,PT,bamba
-ibm-fms/Bamba_annealed_models/Bamba-9b-Olmo-cosine-4e5-2.5T-hf,9,PT,bamba
-ibm-fms/agentinstruct_lr1e_5-hf,9,SFT,bamba
-ibm-fms/agentinstruct_lr1e_6-hf,9,SFT,bamba
-ibm-fms/anteater_lr1e_5-hf,9,SFT,bamba
-ibm-fms/anteater_lr1e_6-hf,9,SFT,bamba
-ibm-fms/instruct_models/tuluv3/2.3T_base/lr1e_6_gbs_256-hf,9,SFT,bamba
-ibm-fms/instruct_models/tuluv3/2.3T_base/lr1e_6_gbs_32-hf,9,SFT,bamba
-ibm-fms/instruct_models/tuluv3/2.3T_base/lr1e_6_gbs_64-hf,9,SFT,bamba
-ibm-fms/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0.06-hf,9,SFT,bamba
-ibm-fms/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0.1-hf,9,SFT,bamba
-ibm-fms/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0.1_gbs_16-hf,9,SFT,bamba
-ibm-fms/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0.1_gbs_32-hf,9,SFT,bamba
-ibm-fms/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0_gbs_128_base-hf,9,SFT,bamba
-ibm-fms/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0_gbs_16-hf,9,SFT,bamba
-ibm-fms/lchu/70b_hsdp_768/hf/step-225000,70,PT,non_bamba
-ibm-fms/tuluv3_lr1e_5-hf,9,SFT,bamba
-ibm-fms/tuluv3_lr1e_6-hf,9,SFT,bamba
+ibm-ai-platform/Bamba-9.8b-1.8T-hf,9,PT,bamba
+ibm-ai-platform/Bamba-9.8b-2.2T-hf,9,PT,bamba
+ibm-ai-platform/Bamba-9.8b-2T-hf,9,PT,bamba
+ibm-ai-platform/Bamba-9B-1.8T-fp8,9,PT,bamba
+ibm-ai-platform/Bamba-9B-2.65T,9,PT,bamba
+ibm-ai-platform/Bamba-9B-2T-fp8,9,PT,bamba
+ibm-ai-platform/Bamba-9B-fp8,9,PT,bamba
+ibm-ai-platform/Bamba-9b-2.1T-hf,9,PT,bamba
+ibm-ai-platform/Bamba-9b-2.3T-hf,9,PT,bamba
+ibm-ai-platform/Bamba-9b-2.5T-hf,9,PT,bamba
+ibm-ai-platform/Bamba-9b-2.6T-hf,9,PT,bamba
+ibm-ai-platform/Bamba-9b-2.8T-hf,9,PT,bamba
+ibm-ai-platform/Bamba_annealed_models/Bamba-9b-2.1T-finemath-hf,9,PT,bamba
+ibm-ai-platform/Bamba_annealed_models/Bamba-9b-Olmo-constant-2.5T-hf,9,PT,bamba
+ibm-ai-platform/Bamba_annealed_models/Bamba-9b-Olmo-cosine-2.5T-hf,9,PT,bamba
+ibm-ai-platform/Bamba_annealed_models/Bamba-9b-Olmo-cosine-4e5-2.5T-hf,9,PT,bamba
+ibm-ai-platform/agentinstruct_lr1e_5-hf,9,SFT,bamba
+ibm-ai-platform/agentinstruct_lr1e_6-hf,9,SFT,bamba
+ibm-ai-platform/anteater_lr1e_5-hf,9,SFT,bamba
+ibm-ai-platform/anteater_lr1e_6-hf,9,SFT,bamba
+ibm-ai-platform/instruct_models/tuluv3/2.3T_base/lr1e_6_gbs_256-hf,9,SFT,bamba
+ibm-ai-platform/instruct_models/tuluv3/2.3T_base/lr1e_6_gbs_32-hf,9,SFT,bamba
+ibm-ai-platform/instruct_models/tuluv3/2.3T_base/lr1e_6_gbs_64-hf,9,SFT,bamba
+ibm-ai-platform/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0.06-hf,9,SFT,bamba
+ibm-ai-platform/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0.1-hf,9,SFT,bamba
+ibm-ai-platform/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0.1_gbs_16-hf,9,SFT,bamba
+ibm-ai-platform/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0.1_gbs_32-hf,9,SFT,bamba
+ibm-ai-platform/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0_gbs_128_base-hf,9,SFT,bamba
+ibm-ai-platform/instruct_models/tuluv3/2.3T_base/lr1e_6_wd_0_gbs_16-hf,9,SFT,bamba
+ibm-ai-platform/lchu/70b_hsdp_768/hf/step-225000,70,PT,non_bamba
+ibm-ai-platform/tuluv3_lr1e_5-hf,9,SFT,bamba
+ibm-ai-platform/tuluv3_lr1e_6-hf,9,SFT,bamba
 ibm-granite/granite-3.0-8b-base,8,PT,non_bamba
 ibm-granite/granite-7b-base,7,PT,non_bamba
 meta-llama/Llama-2-7b-hf,7,PT,non_bamba

evaluation/scripts/example_run_lmeval.sh

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@
 harness_path="path/to/lm-evaluation-harness"
 python_path="python"
 lm_eval_script="${harness_path}/lm_eval"
-pretrained_model="ibm-fms/Bamba-9B"
-output_base_path="evaluation_results/debug/ibm-fms_Bamba-9B"
+pretrained_model="ibm-ai-platform/Bamba-9B"
+output_base_path="evaluation_results/debug/ibm-ai-platform_Bamba-9B"
 batch_size=4
 
 # Function to run lm_eval with common arguments

evaluation/serve_results.py

Lines changed: 1 addition & 1 deletion
@@ -83,7 +83,7 @@ def get_results_df_cached(output_dir_path, res_dirs):
        .replace("-hf", "")
        .replace("-2T", "-2.0T")
        .replace("9B-fp8", "9B-2.2T-fp8")
-        .replace("ibm-fms/", "")
+        .replace("ibm-ai-platform/", "")
        .replace("instruct_models/", "")
        .replace("Bamba_annealed_models/", "")
    )
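The chained `.replace()` calls above normalize checkpoint names for display; the only change is that the stripped org prefix is now `ibm-ai-platform/`. A standalone copy of the chain (the function wrapper is illustrative; the original operates on a pandas string column):

```python
def display_name(model: str) -> str:
    # Mirrors the chained .replace() normalization in serve_results.py.
    return (
        model
        .replace("-hf", "")
        .replace("-2T", "-2.0T")
        .replace("9B-fp8", "9B-2.2T-fp8")
        .replace("ibm-ai-platform/", "")
        .replace("instruct_models/", "")
        .replace("Bamba_annealed_models/", "")
    )

print(display_name("ibm-ai-platform/Bamba-9b-2.5T-hf"))  # Bamba-9b-2.5T
print(display_name("ibm-ai-platform/Bamba-9B-fp8"))      # Bamba-9B-2.2T-fp8
```

Order matters here: `-2T` is rewritten to `-2.0T` before the `9B-fp8` rule runs, so `Bamba-9B-2T-fp8` becomes `Bamba-9B-2.0T-fp8` rather than picking up the `2.2T` label reserved for the bare `9B-fp8` checkpoint.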

tuning/Fine-tuning.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ dataset = load_dataset("lucasmccabe-lmi/CodeAlpaca-20k", split="train")
 
 # We load the model and the tokenizer
 # TODO: change path to bamba model when uploaded
-model_path = "ibm-fms/Bamba-9B"
+model_path = "ibm-ai-platform/Bamba-9B"
 model = AutoModelForCausalLM.from_pretrained(model_path)
 tokenizer = AutoTokenizer.from_pretrained(model_path)

0 commit comments
