metadata
base_model: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:44072
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: Men
    sentences:
      - Casual
      - Summer
      - Cream
      - Timberland Men's Madbury Convertible Off White Pant
      - Apparel
      - Track Pants
      - Bottomwear
  - source_sentence: Women
    sentences:
      - Fall
      - Brown
      - Casual
      - Accessories
      - Allen Solly Woman Women Brown Pendant
      - Jewellery
      - Pendant
  - source_sentence: Men
    sentences:
      - Tshirts
      - White
      - Casual
      - Apparel
      - Summer
      - Scullers Men White Striped T-shirt
      - Topwear
  - source_sentence: Women
    sentences:
      - Handbags
      - Summer
      - Accessories
      - Brown
      - Casual
      - Lino Perros Women Brown Handbag
      - Bags
  - source_sentence: Boys
    sentences:
      - Grey
      - Apparel
      - Topwear
      - Gini and Jony Boys Around The World Grey T-shirt
      - Tshirts
      - Summer
      - Casual

SentenceTransformer based on sentence-transformers/multi-qa-MiniLM-L6-cos-v1

This is a sentence-transformers model finetuned from sentence-transformers/multi-qa-MiniLM-L6-cos-v1. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
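
Because the module stack ends in Normalize(), every output embedding is L2-normalized, so dot product and cosine similarity produce identical rankings. A minimal sanity check (a sketch only; the model id is the same placeholder used in the usage example below):

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder repo id
embedding = model.encode(["Casual"])[0]
print(np.linalg.norm(embedding))
# ~1.0: Normalize() rescales each output vector to unit length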

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub (replace the placeholder with this model's repo id)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Boys',
    'Apparel',
    'Topwear',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])

Training Details

Training Dataset

Unnamed Dataset

  • Size: 44,072 training samples
  • Columns: sentence_0, sentence_1, sentence_2, sentence_3, sentence_4, sentence_5, sentence_6, and sentence_7
  • Approximate statistics based on the first 1000 samples (token counts per column):
    Column      Type    Min  Mean   Max
    sentence_0  string  3    3.11   5
    sentence_1  string  3    3.27   4
    sentence_2  string  3    3.62   7
    sentence_3  string  3    3.9    7
    sentence_4  string  3    3.07   5
    sentence_5  string  3    3.0    3
    sentence_6  string  3    3.0    4
    sentence_7  string  6    10.12  24
  • Samples (columns sentence_0 through sentence_7):
    Boys | Apparel | Topwear | Tshirts | Blue | Summer | Casual | Mr.Men Boys A Lot Of Work Blue T-shirt
    Men | Apparel | Topwear | Shirts | Blue | Fall | Casual | Locomotive Men Check Blue Shirt
    Men | Apparel | Topwear | Shirts | Black | Fall | Formal | Arrow Men Stripes Black Shirts
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • num_train_epochs: 20
  • multi_dataset_batch_sampler: round_robin
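
A minimal sketch of how these values would be wired together, assuming the standard Sentence Transformers 3.x SentenceTransformerTrainer workflow (this card does not record the exact training script). The single-row dataset is an illustrative stand-in for the 44,072-sample, 8-column training set described above; the loss arguments mirror the parameters listed under Training Dataset and are in fact the library defaults:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import MultiDatasetBatchSamplers
from sentence_transformers.util import cos_sim

model = SentenceTransformer("sentence-transformers/multi-qa-MiniLM-L6-cos-v1")

# Illustrative stand-in for the 8-column training set described above
train_dataset = Dataset.from_dict({
    f"sentence_{i}": [value]
    for i, value in enumerate([
        "Boys", "Apparel", "Topwear", "Tshirts",
        "Blue", "Summer", "Casual",
        "Mr.Men Boys A Lot Of Work Blue T-shirt",
    ])
})

# scale=20.0 and cos_sim match the loss parameters listed above (the defaults)
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)

args = SentenceTransformerTrainingArguments(
    output_dir="output",
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    num_train_epochs=20,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()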

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 20
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss
0.7257 500 4.593
1.4514 1000 4.1126
2.1771 1500 4.1057
2.9028 2000 4.1028
3.6284 2500 4.1026
4.3541 3000 4.1014
5.0798 3500 4.1015
5.8055 4000 4.1019
6.5312 4500 4.1001
7.2569 5000 4.1009
7.9826 5500 4.1001
8.7083 6000 4.0995
9.4340 6500 4.099
10.1597 7000 4.0995
10.8853 7500 4.0995
11.6110 8000 4.1004
12.3367 8500 4.099
13.0624 9000 4.0997
13.7881 9500 4.0994
14.5138 10000 4.0994
15.2395 10500 4.0992
15.9652 11000 4.0993
16.6909 11500 4.0986
17.4165 12000 4.0973
18.1422 12500 4.0993
18.8679 13000 4.0984
19.5936 13500 4.0992

Framework Versions

  • Python: 3.12.6
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.1
  • PyTorch: 2.4.1
  • Accelerate: 0.34.2
  • Datasets: 3.0.1
  • Tokenizers: 0.20.0
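
To approximate the original training environment, the versions above can be pinned directly (optional; newer versions will usually also load the model):

pip install sentence-transformers==3.1.1 transformers==4.45.1 torch==2.4.1 accelerate==0.34.2 datasets==3.0.1 tokenizers==0.20.0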

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}