chtxxxxx committed on
Commit ce0db78 · verified · 1 Parent(s): 4820c08

Update README.md

Files changed (1)
  1. README.md +11 -11
README.md CHANGED
@@ -15,15 +15,15 @@ language:
  - km
  - ta
  ---
- # SEA-LION-7B-Instruct-Research
+ # SEA-LION-7B-IT-Research
 
  SEA-LION is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
  The size of the models range from 3 billion to 7 billion parameters.
  This is the card for the SEA-LION 7B Instruct (Non-Commercial) model.
 
- For more details on the base model, please refer to the [base model's model card](https://huggingface.co/aisingapore/sea-lion-7b).
+ For more details on the base model, please refer to the [base model's model card](https://huggingface.co/aisingapore/SEA-LION-v1-7B).
 
- For the commercially permissive model, please refer to the [SEA-LION-7B-Instruct](https://huggingface.co/aisingapore/sea-lion-7b-instruct).
+ For the commercially permissive model, please refer to the [SEA-LION-7B-IT](https://huggingface.co/aisingapore/SEA-LION-v1-7B-IT).
 
  SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.
 
@@ -49,9 +49,9 @@ The model was then further instruction-tuned on <b>Indonesian data only</b>.
 
  ### Benchmark Performance
 
- SEA-LION-7B-Instruct-NC performs better than other models of comparable size when tested on tasks in the Indonesian language.
+ SEA-LION-7B-IT-Research performs better than other models of comparable size when tested on tasks in the Indonesian language.
 
- We evaluated SEA-LION-7B-Instruct-NC on the [BHASA benchmark](https://arxiv.org/abs/2309.06085) and
+ We evaluated SEA-LION-7B-IT-Research on the [BHASA benchmark](https://arxiv.org/abs/2309.06085) and
  compared it against [Llama-2-7B](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
  and [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b-instruct).
 
@@ -69,8 +69,8 @@ For Natural Language Reasoning (NLR) tasks, we tested the model on Natural Langu
 
  | Model                          | QA (F1) | Sentiment (F1) | Toxicity (F1) | Eng>Indo (ChrF++) | Indo>Eng (ChrF++) | Summary (ROUGE-L) | NLI (Acc) | Causal (Acc) |
  |--------------------------------|---------|----------------|---------------|-------------------|-------------------|-------------------|-----------|--------------|
- | SEA-LION-7B-Instruct-Research  | 24.86   | 76.13          | 24.45         | 52.50             | 46.82             | 15.44             | 33.20     | 23.80        |
- | SEA-LION-7B-Instruct           | **68.41**| **91.45**     | 17.98         | 57.48             | 58.04             | **17.54**         | 53.10     | 60.80        |
+ | SEA-LION-7B-IT-Research        | 24.86   | 76.13          | 24.45         | 52.50             | 46.82             | 15.44             | 33.20     | 23.80        |
+ | SEA-LION-7B-IT                 | **68.41**| **91.45**     | 17.98         | 57.48             | 58.04             | **17.54**         | 53.10     | 60.80        |
  | SeaLLM 7B v1                   | 30.96   | 56.29          | 22.60         | 62.23             | 41.55             | 14.03             | 26.50     | 56.60        |
  | SeaLLM 7B v2                   | 44.40   | 80.13          | **55.24**     | 64.01             | **63.28**         | 17.31             | 43.60     | 82.00        |
  | Sailor-7B (Base)               | 65.43   | 59.48          | 20.48         | **64.27**         | 60.68             | 8.69              | 15.10     | 38.40        |
@@ -83,9 +83,9 @@ For Natural Language Reasoning (NLR) tasks, we tested the model on Natural Langu
 
  ### Model Architecture and Objective
 
- SEA-LION is a decoder model using the MPT architecture.
+ SEA-LION-7B-IT-Research is a decoder model using the MPT architecture.
 
- | Parameter | SEA-LION 7B |
+ | Parameter | SEA-LION-7B-IT-Research |
  |-----------------|:-----------:|
  | Layers | 32 |
  | d_model | 4096 |
@@ -107,8 +107,8 @@ The tokenizer type is Byte-Pair Encoding (BPE).
 
  from transformers import AutoModelForCausalLM, AutoTokenizer
 
- tokenizer = AutoTokenizer.from_pretrained("aisingapore/sea-lion-7b-instruct-nc", trust_remote_code=True)
- model = AutoModelForCausalLM.from_pretrained("aisingapore/sea-lion-7b-instruct-nc", trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained("aisingapore/SEA-LION-v1-7B-IT-Research", trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained("aisingapore/SEA-LION-v1-7B-IT-Research", trust_remote_code=True)
 
  prompt_template = "### USER:\n{human_prompt}\n\n### RESPONSE:\n"
  prompt = """Apa sentimen dari kalimat berikut ini?
15
  - km
16
  - ta
17
  ---
18
+ # SEA-LION-7B-IT-Research
19
 
20
  SEA-LION is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
21
  The size of the models range from 3 billion to 7 billion parameters.
22
  This is the card for the SEA-LION 7B Instruct (Non-Commercial) model.
23
 
24
+ For more details on the base model, please refer to the [base model's model card](https://huggingface.co/aisingapore/SEA-LION-v1-7B).
25
 
26
+ For the commercially permissive model, please refer to the [SEA-LION-7B-IT](https://huggingface.co/aisingapore/SEA-LION-v1-7B-IT).
27
 
28
  SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.
29
 
 
49
 
50
  ### Benchmark Performance
51
 
52
+ SEA-LION-7B-IT-Research performs better than other models of comparable size when tested on tasks in the Indonesian language.
53
 
54
+ We evaluated SEA-LION-7B-IT-Research on the [BHASA benchmark](https://arxiv.org/abs/2309.06085) and
55
  compared it against [Llama-2-7B](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
56
  and [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b-instruct).
57
 
 
69
 
70
  | Model                          | QA (F1) | Sentiment (F1) | Toxicity (F1) | Eng>Indo (ChrF++) | Indo>Eng (ChrF++) | Summary (ROUGE-L) | NLI (Acc) | Causal (Acc) |
71
  |--------------------------------|---------|----------------|---------------|-------------------|-------------------|-------------------|-----------|--------------|
72
+ | SEA-LION-7B-IT-Research  | 24.86   | 76.13          | 24.45         | 52.50             | 46.82             | 15.44             | 33.20     | 23.80        |
73
+ | SEA-LION-7B-IT           | **68.41**| **91.45**     | 17.98         | 57.48             | 58.04             | **17.54**         | 53.10     | 60.80        |
74
  | SeaLLM 7B v1                   | 30.96   | 56.29          | 22.60         | 62.23             | 41.55             | 14.03             | 26.50     | 56.60        |
75
  | SeaLLM 7B v2                   | 44.40   | 80.13          | **55.24**     | 64.01           | **63.28**         | 17.31             | 43.60     | 82.00   |
76
  | Sailor-7B (Base)               | 65.43   | 59.48          | 20.48         | **64.27**         | 60.68             | 8.69              | 15.10     | 38.40        |
 
83
 
84
  ### Model Architecture and Objective
85
 
86
+ SEA-LION-7B-IT-Research is a decoder model using the MPT architecture.
87
 
88
+ | Parameter | SEA-LION-7B-IT-Research |
89
  |-----------------|:-----------:|
90
  | Layers | 32 |
91
  | d_model | 4096 |
 
107
 
108
  from transformers import AutoModelForCausalLM, AutoTokenizer
109
 
110
+ tokenizer = AutoTokenizer.from_pretrained("aisingapore/SEA-LION-v1-7B-IT-Research", trust_remote_code=True)
111
+ model = AutoModelForCausalLM.from_pretrained("aisingapore/SEA-LION-v1-7B-IT-Research", trust_remote_code=True)
112
 
113
  prompt_template = "### USER:\n{human_prompt}\n\n### RESPONSE:\n"
114
  prompt = """Apa sentimen dari kalimat berikut ini?