Usage indications
#1 by maxoul - opened
Hey,
Thanks for releasing the models. I'd like to test them, but they lack documentation! I have a few questions:
I'm concerned because the models seem to use LlamaForCausalLM, but they should also use bidirectional attention, right? How is this handled?
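For reference, this is how I'm checking what the repo declares (I'd expect it to report a causal LM architecture, which is what worries me):

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained('hzeng/Lion-SP-1B-llama3-marco-mntp')
# I expect this to print something like ['LlamaForCausalLM'],
# i.e. causal attention by default unless some wrapper patches it.
print(cfg.architectures)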
When loading the model I get:
from transformers import AutoModel
lion = AutoModel.from_pretrained('hzeng/Lion-SP-1B-llama3-marco-mntp')
config.json: 100%|██████████| 883/883 [00:00<00:00, 7.88MB/s]
adapter_config.json: 100%|██████████| 816/816 [00:00<00:00, 7.85MB/s]
adapter_model.bin: 100%|██████████| 45.2M/45.2M [00:05<00:00, 7.59MB/s]
Loading adapter weights from hzeng/Lion-SP-1B-llama3-marco-mntp led to unexpected keys not found in the model: model.layers.0.self_attn.q_proj.lora_A.default.weight, model.layers.0.self_attn.q_proj.lora_B.default.weight, model.layers.0.self_attn.k_proj.lora_A.default.weight, model.layers.0.self_attn.k_proj.lora_B.default.weight, model.layers.0.self_attn.v_proj.lora_A.default.weight, model.layers.0.self_attn.v_proj.lora_B.default.weight, model.layers.0.self_attn.o_proj.lora_A.default.weight, model.layers.0.self_attn.o_proj.lora_B.default.weight, model.layers.0.mlp.gate_proj.lora_A.default.weight, model.layers.0.mlp.gate_proj.lora_B.default.weight, model.layers.0.mlp.up_proj.lora_A.default.weight, model.layers.0.mlp.up_proj.lora_B.default.weight, model.layers.0.mlp.down_proj.lora_A.default.weight, model.layers.0.mlp.down_proj.lora_B.default.weight, model.layers.1.self_attn.q_proj.lora_A.default.weight, model.layers.1.self_attn.q_proj.lora_B.default.weight, model.layers.1.self_attn.k_proj.lora_A.default.weight, model.layers.1.self_attn.k_proj.lora_B.default.weight, model.layers.1.self_attn.v_proj.lora_A.default.weight, model.layers.1.self_attn.v_proj.lora_B.default.weight, model.layers.1.self_attn.o_proj.lora_A.default.weight, model.layers.1.self_attn.o_proj.lora_B.default.weight,
...
Could you indicate a fix for that? (It may be the transformers version, but I'm not sure.)
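In case it is relevant, my naive guess at a workaround is to load the base model and attach the adapter with peft explicitly, along these lines (untested sketch; the base checkpoint name is a guess on my part, and I picked AutoModelForCausalLM because the unexpected keys are prefixed with model., which suggests the adapter was saved against LlamaForCausalLM rather than the bare LlamaModel):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Untested guess: the base checkpoint name below is an assumption, please correct it.
base = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-3.2-1B')
lion = PeftModel.from_pretrained(base, 'hzeng/Lion-SP-1B-llama3-marco-mntp')
# Optionally fold the LoRA weights into the base model:
# lion = lion.merge_and_unload()

Is that the intended way to load it?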
- Can you provide a quick snippet to obtain document/query embeddings and compute relevance scores? My naive attempt is sketched below.
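To make the question concrete, this is what I would naively try (mean-pooling the last hidden states and scoring with a dot product), reusing the lion model object from the sketch above. I have no idea whether this matches how Lion-SP is actually meant to be used, especially if it produces sparse representations:

import torch
from transformers import AutoTokenizer

# Assumption on my side: load the tokenizer from the base checkpoint,
# since the adapter repo may not ship one.
tok = AutoTokenizer.from_pretrained('meta-llama/Llama-3.2-1B')
if tok.pad_token is None:
    tok.pad_token = tok.eos_token  # Llama tokenizers often lack a pad token

def embed(texts, model):
    inputs = tok(texts, padding=True, truncation=True, return_tensors='pt')
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    hidden = out.hidden_states[-1]                  # (batch, seq_len, dim)
    mask = inputs['attention_mask'].unsqueeze(-1)   # (batch, seq_len, 1)
    return (hidden * mask).sum(1) / mask.sum(1)     # masked mean pooling

q = embed(["what is learned sparse retrieval"], lion)
d = embed(["Learned sparse retrieval combines lexical matching with learned term weights."], lion)
print(q @ d.T)  # dot-product relevance score

Is this roughly right, or does Lion-SP expect a different pooling / scoring scheme?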
Thanks in advance,