+ deepspeed --master_port 16107 --module safe_rlhf.finetune --train_datasets inverse-json::/home/hansirui_1st/jiayi/resist/setting3/safety_data/training/safe/safe_30k.json --model_name_or_path /aifs4su/hansirui_1st/models/Qwen1.5-0.5B --max_length 2048 --trust_remote_code True --epochs 1 --per_device_train_batch_size 4 --per_device_eval_batch_size 4 --gradient_accumulation_steps 8 --gradient_checkpointing --learning_rate 1e-5 --lr_warmup_ratio 0 --weight_decay 0.0 --lr_scheduler_type constant --weight_decay 0.0 --seed 42 --output_dir /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/Qwen1.5-0.5B/Qwen1.5-0.5B-s3-Q1-30k --log_type wandb --log_run_name qwen-0.5b-s3-Q1-30k --log_project Inverse_Alignment --zero_stage 3 --offload none --bf16 True --tf32 True --save_16bit
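Note on the batch arithmetic implied by the command above: a minimal sketch, assuming the dataset holds 30,000 examples (inferred only from the file name safe_30k.json) and that 8 processes take part (ranks 0 to 7 appear in the NCCL warnings in this log). Under those assumptions the per-epoch micro-batch count comes out at 938, which matches the total shown by the training progress bar in this log, so the bar appears to count micro-batches rather than optimizer updates. The variable names are illustrative and not taken from safe_rlhf.

```python
import math

# Values read from the launch command in this log.
per_device_batch = 4   # --per_device_train_batch_size
grad_accum = 8         # --gradient_accumulation_steps

# Assumptions, not stated explicitly anywhere in the log.
num_examples = 30_000  # guessed from the file name safe_30k.json
world_size = 8         # ranks 0-7 show up in the NCCL barrier warnings

# Micro-batches each rank processes in one epoch.
micro_batches_per_epoch = math.ceil(num_examples / (per_device_batch * world_size))

# Examples contributing to a single optimizer update.
effective_batch = per_device_batch * world_size * grad_accum

# Optimizer updates in one epoch, if the final partial accumulation window is also applied.
optimizer_steps = math.ceil(micro_batches_per_epoch / grad_accum)

print(micro_batches_per_epoch, effective_batch, optimizer_steps)
```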
[rank1]:[W528 18:56:34.265728922 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 1] using GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank4]:[W528 18:56:35.284960455 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 4] using GPU 4 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank5]:[W528 18:56:35.393058084 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 5] using GPU 5 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank7]:[W528 18:56:35.398253436 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 7] using GPU 7 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank2]:[W528 18:56:35.458466649 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 2] using GPU 2 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank6]:[W528 18:56:35.458730328 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 6] using GPU 6 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank3]:[W528 18:56:35.463895458 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 3] using GPU 3 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
[rank0]:[W528 18:56:35.470006441 ProcessGroupNCCL.cpp:4561] [PG ID 0 PG GUID 0 Rank 0] using GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect. Specify device_ids in barrier() to force use of a particular device, or call init_process_group() with a device_id.
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/config.json
Model config Qwen2Config {
"_name_or_path": "/aifs4su/hansirui_1st/models/Qwen1.5-0.5B",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151643,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
Model config Qwen2Config {
"_name_or_path": "/aifs4su/hansirui_1st/models/Qwen1.5-0.5B",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151643,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
Model config Qwen2Config {
"_name_or_path": "/aifs4su/hansirui_1st/models/Qwen1.5-0.5B",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151643,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
Model config Qwen2Config {
"_name_or_path": "/aifs4su/hansirui_1st/models/Qwen1.5-0.5B",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151643,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
Model config Qwen2Config {
"_name_or_path": "/aifs4su/hansirui_1st/models/Qwen1.5-0.5B",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151643,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
Model config Qwen2Config {
"_name_or_path": "/aifs4su/hansirui_1st/models/Qwen1.5-0.5B",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151643,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
Model config Qwen2Config {
"_name_or_path": "/aifs4su/hansirui_1st/models/Qwen1.5-0.5B",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151643,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
Model config Qwen2Config {
"_name_or_path": "/aifs4su/hansirui_1st/models/Qwen1.5-0.5B",
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151643,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2816,
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"num_key_value_heads": 16,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": 32768,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
loading weights file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/model.safetensors
loading weights file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/model.safetensors
Will use torch_dtype=torch.bfloat16 as defined in model's config object
Will use torch_dtype=torch.bfloat16 as defined in model's config object
Will use torch_dtype=torch.bfloat16 as defined in model's config object
Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
Will use torch_dtype=torch.bfloat16 as defined in model's config object
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
Will use torch_dtype=torch.bfloat16 as defined in model's config object
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
Will use torch_dtype=torch.bfloat16 as defined in model's config object
Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.bfloat16 as defined in model's config object
Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Will use torch_dtype=torch.bfloat16 as defined in model's config object
Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
Detected DeepSpeed ZeRO-3: activating zero.init() for this model
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643
}
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643
}
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
All model checkpoint weights were used when initializing Qwen2ForCausalLM.
All model checkpoint weights were used when initializing Qwen2ForCausalLM.
All model checkpoint weights were used when initializing Qwen2ForCausalLM.
All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing Qwen2ForCausalLM.
All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing Qwen2ForCausalLM.
All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing Qwen2ForCausalLM.
All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing Qwen2ForCausalLM.
All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643,
"max_new_tokens": 2048
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643,
"max_new_tokens": 2048
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643,
"max_new_tokens": 2048
}
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643,
"max_new_tokens": 2048
}
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/generation_config.json
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643,
"max_new_tokens": 2048
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643,
"max_new_tokens": 2048
}
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643,
"max_new_tokens": 2048
}
loading file vocab.json
loading file merges.txt
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file vocab.json
loading file merges.txt
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file vocab.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file vocab.json
loading file merges.txt
loading file tokenizer.json
loading file merges.txt
loading file added_tokens.json
loading file tokenizer.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file vocab.json
loading file chat_template.jinja
loading file tokenizer_config.json
loading file merges.txt
loading file chat_template.jinja
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file vocab.json
loading file merges.txt
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
loading file vocab.json
loading file merges.txt
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
All model checkpoint weights were used when initializing Qwen2ForCausalLM.
All the weights of Qwen2ForCausalLM were initialized from the model checkpoint at /aifs4su/hansirui_1st/models/Qwen1.5-0.5B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2ForCausalLM for predictions without further training.
loading configuration file /aifs4su/hansirui_1st/models/Qwen1.5-0.5B/generation_config.json
Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151643,
"max_new_tokens": 2048
}
loading file vocab.json
loading file merges.txt
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading file chat_template.jinja
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151646. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151646. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151646. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151646. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151646. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151646. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151646. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/models/pretrained.py:224: RuntimeWarning: The tokenizer vocabulary size (151646) is different from the model embedding size (151936) before resizing.
resize_tokenizer_embedding(tokenizer=tokenizer, model=model)
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151646. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
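Note on the `pad_to_multiple_of` warning above: a minimal sketch of the rounding such a parameter implies, assuming a multiple of 64 (an illustrative choice; the log does not say which value, if any, would be used). It only computes the padded row count and does not call into transformers or safe_rlhf.

```python
def round_up(n: int, multiple: int) -> int:
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple

tokenizer_vocab = 151646   # tokenizer size reported in the warning
original_rows = 151936     # model embedding size before resizing, from the RuntimeWarning
multiple = 64              # assumption: a typical Tensor Core friendly multiple

padded_rows = round_up(tokenizer_vocab, multiple)
print(padded_rows, padded_rows - tokenizer_vocab)

# With transformers, the equivalent request would look like
#   model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=multiple)
# which is shown only as a comment because this run did not use it.
```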
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Using /home/hansirui_1st/.cache/torch_extensions/py311_cu124 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/hansirui_1st/.cache/torch_extensions/py311_cu124/fused_adam/build.ninja...
/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/torch/utils/cpp_extension.py:2059: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Building extension module fused_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
Loading extension module fused_adam...
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
wandb: Currently logged in as: xtom to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.19.8
wandb: Run data is saved locally in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/Qwen1.5-0.5B/Qwen1.5-0.5B-s3-Q1-30k/wandb/run-20250528_185646-hslhqxbw
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run qwen-0.5b-s3-Q1-30k
wandb: ⭐️ View project at https://wandb.ai/xtom/Inverse_Alignment
wandb: 🚀 View run at https://wandb.ai/xtom/Inverse_Alignment/runs/hslhqxbw
Training 1/1 epoch: 0%| | 0/938 [00:00<?, ?it/s]
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Training 1/1 epoch (loss 2.2220): 0%| | 0/938 [00:06<?, ?it/s] Training 1/1 epoch (loss 2.2220): 0%| | 1/938 [00:06<1:38:28, 6.31s/it] Training 1/1 epoch (loss 2.1348): 0%| | 1/938 [00:09<1:38:28, 6.31s/it] Training 1/1 epoch (loss 2.1348): 0%| | 2/938 [00:09<1:08:36, 4.40s/it] Training 1/1 epoch (loss 2.1724): 0%| | 2/938 [00:09<1:08:36, 4.40s/it] Training 1/1 epoch (loss 2.1724): 0%| | 3/938 [00:09<39:29, 2.53s/it] Training 1/1 epoch (loss 2.1842): 0%| | 3/938 [00:10<39:29, 2.53s/it] Training 1/1 epoch (loss 2.1842): 0%| | 4/938 [00:10<26:02, 1.67s/it] Training 1/1 epoch (loss 2.1653): 0%| | 4/938 [00:10<26:02, 1.67s/it] Training 1/1 epoch (loss 2.1653): 1%| | 5/938 [00:10<18:40, 1.20s/it] Training 1/1 epoch (loss 2.1280): 1%| | 5/938 [00:10<18:40, 1.20s/it] Training 1/1 epoch (loss 2.1280): 1%| | 6/938 [00:10<14:32, 1.07it/s] Training 1/1 epoch (loss 2.2090): 1%| | 6/938 [00:11<14:32, 1.07it/s] Training 1/1 epoch (loss 2.2090): 1%| | 7/938 [00:11<11:51, 1.31it/s] Training 1/1 epoch (loss 2.1927): 1%| | 7/938 [00:11<11:51, 1.31it/s] Training 1/1 epoch (loss 2.1927): 1%| | 8/938 [00:11<10:23, 1.49it/s] Training 1/1 epoch (loss 1.9968): 1%| | 8/938 [00:12<10:23, 1.49it/s] Training 1/1 epoch (loss 1.9968): 1%| | 9/938 [00:12<08:50, 1.75it/s] Training 1/1 epoch (loss 2.0270): 1%| | 9/938 [00:12<08:50, 1.75it/s] Training 1/1 epoch (loss 2.0270): 1%| | 10/938 [00:12<07:43, 2.00it/s] Training 1/1 epoch (loss 2.0844): 1%| | 10/938 [00:12<07:43, 2.00it/s] Training 1/1 epoch (loss 2.0844): 1%| | 11/938 [00:12<07:09, 2.16it/s] Training 1/1 epoch (loss 2.1573): 1%| | 11/938 [00:13<07:09, 2.16it/s] Training 1/1 epoch (loss 2.1573): 1%|▏ | 12/938 [00:13<07:22, 2.09it/s] Training 1/1 epoch (loss 2.1605): 1%|▏ | 12/938 [00:13<07:22, 2.09it/s] Training 1/1 epoch (loss 2.1605): 1%|▏ | 13/938 [00:13<07:23, 2.08it/s] Training 1/1 epoch (loss 1.9138): 1%|▏ | 13/938 [00:14<07:23, 2.08it/s] Training 1/1 epoch (loss 1.9138): 1%|▏ | 14/938 [00:14<06:37, 2.33it/s] Training 1/1 epoch (loss 
2.2113): 1%|▏ | 14/938 [00:14<06:37, 2.33it/s] Training 1/1 epoch (loss 2.2113): 2%|▏ | 15/938 [00:14<06:34, 2.34it/s] Training 1/1 epoch (loss 2.0852): 2%|▏ | 15/938 [00:15<06:34, 2.34it/s] Training 1/1 epoch (loss 2.0852): 2%|▏ | 16/938 [00:15<07:01, 2.19it/s] Training 1/1 epoch (loss 1.9559): 2%|▏ | 16/938 [00:15<07:01, 2.19it/s] Training 1/1 epoch (loss 1.9559): 2%|▏ | 17/938 [00:15<06:50, 2.24it/s] Training 1/1 epoch (loss 2.0714): 2%|▏ | 17/938 [00:15<06:50, 2.24it/s] Training 1/1 epoch (loss 2.0714): 2%|▏ | 18/938 [00:15<06:31, 2.35it/s] Training 1/1 epoch (loss 1.8474): 2%|▏ | 18/938 [00:16<06:31, 2.35it/s] Training 1/1 epoch (loss 1.8474): 2%|▏ | 19/938 [00:16<06:11, 2.47it/s] Training 1/1 epoch (loss 1.9205): 2%|▏ | 19/938 [00:16<06:11, 2.47it/s] Training 1/1 epoch (loss 1.9205): 2%|▏ | 20/938 [00:16<06:08, 2.49it/s] Training 1/1 epoch (loss 2.0388): 2%|▏ | 20/938 [00:17<06:08, 2.49it/s] Training 1/1 epoch (loss 2.0388): 2%|▏ | 21/938 [00:17<06:40, 2.29it/s] Training 1/1 epoch (loss 2.0611): 2%|▏ | 21/938 [00:17<06:40, 2.29it/s] Training 1/1 epoch (loss 2.0611): 2%|▏ | 22/938 [00:17<06:34, 2.32it/s] Training 1/1 epoch (loss 1.9040): 2%|▏ | 22/938 [00:17<06:34, 2.32it/s] Training 1/1 epoch (loss 1.9040): 2%|▏ | 23/938 [00:17<06:20, 2.41it/s] Training 1/1 epoch (loss 2.0298): 2%|▏ | 23/938 [00:18<06:20, 2.41it/s] Training 1/1 epoch (loss 2.0298): 3%|β–Ž | 24/938 [00:18<06:08, 2.48it/s] Training 1/1 epoch (loss 2.0659): 3%|β–Ž | 24/938 [00:18<06:08, 2.48it/s] Training 1/1 epoch (loss 2.0659): 3%|β–Ž | 25/938 [00:18<06:20, 2.40it/s] Training 1/1 epoch (loss 1.9792): 3%|β–Ž | 25/938 [00:19<06:20, 2.40it/s] Training 1/1 epoch (loss 1.9792): 3%|β–Ž | 26/938 [00:19<06:32, 2.32it/s] Training 1/1 epoch (loss 1.9643): 3%|β–Ž | 26/938 [00:19<06:32, 2.32it/s] Training 1/1 epoch (loss 1.9643): 3%|β–Ž | 27/938 [00:19<06:22, 2.38it/s] Training 1/1 epoch (loss 1.8965): 3%|β–Ž | 27/938 [00:19<06:22, 2.38it/s] Training 1/1 epoch (loss 1.8965): 3%|β–Ž | 28/938 [00:19<06:16, 
2.42it/s] Training 1/1 epoch (loss 2.0121): 3%|▎ | 28/938 [00:20<06:16, 2.42it/s] Training 1/1 epoch (loss 2.0121): 3%|▎ | 29/938 [00:20<06:06, 2.48it/s] Training 1/1 epoch (loss 2.0208): 3%|▎ | 29/938 [00:20<06:06, 2.48it/s] Training 1/1 epoch (loss 2.0208): 3%|▎ | 30/938 [00:20<06:02, 2.51it/s] Training 1/1 epoch (loss 1.9586): 3%|▎ | 30/938 [00:21<06:02, 2.51it/s] Training 1/1 epoch (loss 1.9586): 3%|▎ | 31/938 [00:21<06:17, 2.40it/s] Training 1/1 epoch (loss 1.9993): 3%|▎ | 31/938 [00:21<06:17, 2.40it/s] Training 1/1 epoch (loss 1.9993): 3%|▎ | 32/938 [00:21<06:15, 2.42it/s] Training 1/1 epoch (loss 1.9812): 3%|▎ | 32/938 [00:21<06:15, 2.42it/s] Training 1/1 epoch (loss 1.9812): 4%|▎ | 33/938 [00:21<06:01, 2.50it/s] Training 1/1 epoch (loss 1.9201): 4%|▎ | 33/938 [00:22<06:01, 2.50it/s] Training 1/1 epoch (loss 1.9201): 4%|▎ | 34/938 [00:22<05:51, 2.57it/s] Training 1/1 epoch (loss 1.9739): 4%|▎ | 34/938 [00:22<05:51, 2.57it/s] Training 1/1 epoch (loss 1.9739): 4%|▎ | 35/938 [00:22<05:42, 2.64it/s] Training 1/1 epoch (loss 1.9911): 4%|▎ | 35/938 [00:23<05:42, 2.64it/s] Training 1/1 epoch (loss 1.9911): 4%|▍ | 36/938 [00:23<05:51, 2.56it/s] Training 1/1 epoch (loss 2.0473): 4%|▍ | 36/938 [00:23<05:51, 2.56it/s] Training 1/1 epoch (loss 2.0473): 4%|▍ | 37/938 [00:23<06:24, 2.34it/s] Training 1/1 epoch (loss 1.9885): 4%|▍ | 37/938 [00:24<06:24, 2.34it/s] Training 1/1 epoch (loss 1.9885): 4%|▍ | 38/938 [00:24<06:17, 2.39it/s] Training 1/1 epoch (loss 1.9089): 4%|▍ | 38/938 [00:24<06:17, 2.39it/s] Training 1/1 epoch (loss 1.9089): 4%|▍ | 39/938 [00:24<06:00, 2.50it/s] Training 1/1 epoch (loss 2.0994): 4%|▍ | 39/938 [00:25<06:00, 2.50it/s] Training 1/1 epoch (loss 2.0994): 4%|▍ | 40/938 [00:25<07:00, 2.14it/s] Training 1/1 epoch (loss 1.8468): 4%|▍ | 40/938 [00:25<07:00, 2.14it/s] Training 1/1 epoch (loss 1.8468): 4%|▍ | 41/938 [00:25<06:38, 2.25it/s] Training 1/1 epoch (loss 1.9319): 4%|▍ | 41/938 [00:25<06:38, 2.25it/s] Training 1/1 
epoch (loss 1.9319): 4%|▍ | 42/938 [00:25<06:29, 2.30it/s] Training 1/1 epoch (loss 1.9209): 4%|▍ | 42/938 [00:26<06:29, 2.30it/s] Training 1/1 epoch (loss 1.9209): 5%|▍ | 43/938 [00:26<06:03, 2.46it/s] Training 1/1 epoch (loss 1.9627): 5%|▍ | 43/938 [00:26<06:03, 2.46it/s] Training 1/1 epoch (loss 1.9627): 5%|▍ | 44/938 [00:26<05:58, 2.49it/s] Training 1/1 epoch (loss 1.9554): 5%|▍ | 44/938 [00:26<05:58, 2.49it/s] Training 1/1 epoch (loss 1.9554): 5%|▍ | 45/938 [00:26<05:44, 2.59it/s] Training 1/1 epoch (loss 1.9003): 5%|▍ | 45/938 [00:27<05:44, 2.59it/s] Training 1/1 epoch (loss 1.9003): 5%|▍ | 46/938 [00:27<06:23, 2.32it/s] Training 1/1 epoch (loss 1.9916): 5%|▍ | 46/938 [00:27<06:23, 2.32it/s] Training 1/1 epoch (loss 1.9916): 5%|▌ | 47/938 [00:27<05:56, 2.50it/s] Training 1/1 epoch (loss 2.0183): 5%|▌ | 47/938 [00:28<05:56, 2.50it/s] Training 1/1 epoch (loss 2.0183): 5%|▌ | 48/938 [00:28<05:49, 2.55it/s] Training 1/1 epoch (loss 2.0447): 5%|▌ | 48/938 [00:28<05:49, 2.55it/s] Training 1/1 epoch (loss 2.0447): 5%|▌ | 49/938 [00:28<05:41, 2.60it/s] Training 1/1 epoch (loss 1.8472): 5%|▌ | 49/938 [00:28<05:41, 2.60it/s] Training 1/1 epoch (loss 1.8472): 5%|▌ | 50/938 [00:28<05:38, 2.63it/s] Training 1/1 epoch (loss 1.8300): 5%|▌ | 50/938 [00:29<05:38, 2.63it/s] Training 1/1 epoch (loss 1.8300): 5%|▌ | 51/938 [00:29<05:50, 2.53it/s] Training 1/1 epoch (loss 1.8982): 5%|▌ | 51/938 [00:29<05:50, 2.53it/s] Training 1/1 epoch (loss 1.8982): 6%|▌ | 52/938 [00:29<06:04, 2.43it/s] Training 1/1 epoch (loss 1.8491): 6%|▌ | 52/938 [00:30<06:04, 2.43it/s] Training 1/1 epoch (loss 1.8491): 6%|▌ | 53/938 [00:30<06:07, 2.41it/s] Training 1/1 epoch (loss 1.9811): 6%|▌ | 53/938 [00:30<06:07, 2.41it/s] Training 1/1 epoch (loss 1.9811): 6%|▌ | 54/938 [00:30<06:17, 2.34it/s] Training 1/1 epoch (loss 2.0053): 6%|▌ | 54/938 [00:30<06:17, 2.34it/s] Training 1/1 epoch (loss 2.0053): 6%|▌ | 55/938 [00:30<05:56, 2.48it/s] Training 1/1 epoch (loss 1.9305): 
6%|▌ | 55/938 [00:31<05:56, 2.48it/s] Training 1/1 epoch (loss 1.9305): 6%|▌ | 56/938 [00:31<06:01, 2.44it/s] Training 1/1 epoch (loss 1.9169): 6%|▌ | 56/938 [00:31<06:01, 2.44it/s] Training 1/1 epoch (loss 1.9169): 6%|▌ | 57/938 [00:31<06:00, 2.44it/s] Training 1/1 epoch (loss 1.9080): 6%|▌ | 57/938 [00:32<06:00, 2.44it/s] Training 1/1 epoch (loss 1.9080): 6%|▌ | 58/938 [00:32<05:51, 2.50it/s] Training 1/1 epoch (loss 1.9597): 6%|▌ | 58/938 [00:32<05:51, 2.50it/s] Training 1/1 epoch (loss 1.9597): 6%|▋ | 59/938 [00:32<05:34, 2.62it/s] Training 1/1 epoch (loss 2.0467): 6%|▋ | 59/938 [00:32<05:34, 2.62it/s] Training 1/1 epoch (loss 2.0467): 6%|▋ | 60/938 [00:32<05:23, 2.72it/s] Training 1/1 epoch (loss 2.0173): 6%|▋ | 60/938 [00:33<05:23, 2.72it/s] Training 1/1 epoch (loss 2.0173): 7%|▋ | 61/938 [00:33<05:24, 2.70it/s] Training 1/1 epoch (loss 2.0024): 7%|▋ | 61/938 [00:33<05:24, 2.70it/s] Training 1/1 epoch (loss 2.0024): 7%|▋ | 62/938 [00:33<05:46, 2.52it/s] Training 1/1 epoch (loss 1.9954): 7%|▋ | 62/938 [00:34<05:46, 2.52it/s] Training 1/1 epoch (loss 1.9954): 7%|▋ | 63/938 [00:34<05:38, 2.58it/s] Training 1/1 epoch (loss 1.8226): 7%|▋ | 63/938 [00:34<05:38, 2.58it/s] Training 1/1 epoch (loss 1.8226): 7%|▋ | 64/938 [00:34<05:42, 2.55it/s] Training 1/1 epoch (loss 1.9375): 7%|▋ | 64/938 [00:34<05:42, 2.55it/s] Training 1/1 epoch (loss 1.9375): 7%|▋ | 65/938 [00:34<06:02, 2.40it/s] Training 1/1 epoch (loss 2.0100): 7%|▋ | 65/938 [00:35<06:02, 2.40it/s] Training 1/1 epoch (loss 2.0100): 7%|▋ | 66/938 [00:35<06:13, 2.33it/s] Training 1/1 epoch (loss 1.9682): 7%|▋ | 66/938 [00:35<06:13, 2.33it/s] Training 1/1 epoch (loss 1.9682): 7%|▋ | 67/938 [00:35<06:10, 2.35it/s] Training 1/1 epoch (loss 1.8981): 7%|▋ | 67/938 [00:36<06:10, 2.35it/s] Training 1/1 epoch (loss 1.8981): 7%|▋ | 68/938 [00:36<05:50, 2.48it/s] Training 1/1 epoch (loss 1.9856): 7%|▋ | 68/938 [00:36<05:50, 2.48it/s] Training 1/1 epoch (loss 1.9856): 
7%|▋ | 69/938 [00:36<05:33, 2.61it/s] Training 1/1 epoch (loss 2.0554): 7%|▋ | 69/938 [00:36<05:33, 2.61it/s] Training 1/1 epoch (loss 2.0554): 7%|▋ | 70/938 [00:36<05:23, 2.69it/s] Training 1/1 epoch (loss 1.8820): 7%|▋ | 70/938 [00:37<05:23, 2.69it/s] Training 1/1 epoch (loss 1.8820): 8%|▊ | 71/938 [00:37<05:33, 2.60it/s] Training 1/1 epoch (loss 1.9553): 8%|▊ | 71/938 [00:37<05:33, 2.60it/s] Training 1/1 epoch (loss 1.9553): 8%|▊ | 72/938 [00:37<05:49, 2.48it/s] Training 1/1 epoch (loss 1.8976): 8%|▊ | 72/938 [00:38<05:49, 2.48it/s] Training 1/1 epoch (loss 1.8976): 8%|▊ | 73/938 [00:38<05:29, 2.62it/s] Training 1/1 epoch (loss 1.8380): 8%|▊ | 73/938 [00:38<05:29, 2.62it/s] Training 1/1 epoch (loss 1.8380): 8%|▊ | 74/938 [00:38<05:33, 2.59it/s] Training 1/1 epoch (loss 1.7591): 8%|▊ | 74/938 [00:38<05:33, 2.59it/s] Training 1/1 epoch (loss 1.7591): 8%|▊ | 75/938 [00:38<05:21, 2.69it/s] Training 1/1 epoch (loss 1.8664): 8%|▊ | 75/938 [00:39<05:21, 2.69it/s] Training 1/1 epoch (loss 1.8664): 8%|▊ | 76/938 [00:39<05:20, 2.69it/s] Training 1/1 epoch (loss 1.7734): 8%|▊ | 76/938 [00:39<05:20, 2.69it/s] Training 1/1 epoch (loss 1.7734): 8%|▊ | 77/938 [00:39<05:36, 2.56it/s] Training 1/1 epoch (loss 2.0039): 8%|▊ | 77/938 [00:39<05:36, 2.56it/s] Training 1/1 epoch (loss 2.0039): 8%|▊ | 78/938 [00:39<05:35, 2.57it/s] Training 1/1 epoch (loss 1.8925): 8%|▊ | 78/938 [00:40<05:35, 2.57it/s] Training 1/1 epoch (loss 1.8925): 8%|▊ | 79/938 [00:40<05:18, 2.69it/s] Training 1/1 epoch (loss 1.9892): 8%|▊ | 79/938 [00:40<05:18, 2.69it/s] Training 1/1 epoch (loss 1.9892): 9%|▊ | 80/938 [00:40<05:18, 2.70it/s] Training 1/1 epoch (loss 2.0106): 9%|▊ | 80/938 [00:41<05:18, 2.70it/s] Training 1/1 epoch (loss 2.0106): 9%|▊ | 81/938 [00:41<05:24, 2.64it/s] Training 1/1 epoch (loss 1.9003): 9%|▊ | 81/938 [00:41<05:24, 2.64it/s] Training 1/1 epoch (loss 1.9003): 9%|▊ | 82/938 [00:41<05:33, 2.57it/s] Training 1/1 epoch (loss 1.8614): 
9%|▊ | 82/938 [00:41<05:33, 2.57it/s] Training 1/1 epoch (loss 1.8614): 9%|▉ | 83/938 [00:41<05:36, 2.54it/s] Training 1/1 epoch (loss 1.8251): 9%|▉ | 83/938 [00:42<05:36, 2.54it/s] Training 1/1 epoch (loss 1.8251): 9%|▉ | 84/938 [00:42<05:22, 2.65it/s] Training 1/1 epoch (loss 1.9154): 9%|▉ | 84/938 [00:42<05:22, 2.65it/s] Training 1/1 epoch (loss 1.9154): 9%|▉ | 85/938 [00:42<05:18, 2.67it/s] Training 1/1 epoch (loss 1.8866): 9%|▉ | 85/938 [00:42<05:18, 2.67it/s] Training 1/1 epoch (loss 1.8866): 9%|▉ | 86/938 [00:42<05:19, 2.66it/s] Training 1/1 epoch (loss 1.9024): 9%|▉ | 86/938 [00:43<05:19, 2.66it/s] Training 1/1 epoch (loss 1.9024): 9%|▉ | 87/938 [00:43<05:49, 2.44it/s] Training 1/1 epoch (loss 1.9140): 9%|▉ | 87/938 [00:43<05:49, 2.44it/s] Training 1/1 epoch (loss 1.9140): 9%|▉ | 88/938 [00:43<05:57, 2.38it/s] Training 1/1 epoch (loss 1.9120): 9%|▉ | 88/938 [00:44<05:57, 2.38it/s] Training 1/1 epoch (loss 1.9120): 9%|▉ | 89/938 [00:44<05:53, 2.41it/s] Training 1/1 epoch (loss 1.9215): 9%|▉ | 89/938 [00:44<05:53, 2.41it/s] Training 1/1 epoch (loss 1.9215): 10%|▉ | 90/938 [00:44<05:39, 2.50it/s] Training 1/1 epoch (loss 1.9926): 10%|▉ | 90/938 [00:45<05:39, 2.50it/s] Training 1/1 epoch (loss 1.9926): 10%|▉ | 91/938 [00:45<06:01, 2.34it/s] Training 1/1 epoch (loss 1.8381): 10%|▉ | 91/938 [00:45<06:01, 2.34it/s] Training 1/1 epoch (loss 1.8381): 10%|▉ | 92/938 [00:45<05:58, 2.36it/s] Training 1/1 epoch (loss 2.0584): 10%|▉ | 92/938 [00:46<05:58, 2.36it/s] Training 1/1 epoch (loss 2.0584): 10%|▉ | 93/938 [00:46<06:12, 2.27it/s] Training 1/1 epoch (loss 1.9441): 10%|▉ | 93/938 [00:46<06:12, 2.27it/s] Training 1/1 epoch (loss 1.9441): 10%|█ | 94/938 [00:46<05:52, 2.40it/s] Training 1/1 epoch (loss 1.8392): 10%|█ | 94/938 [00:46<05:52, 2.40it/s] Training 1/1 epoch (loss 1.8392): 10%|█ | 95/938 [00:46<05:29, 2.56it/s] Training 1/1 epoch (loss 1.8216): 10%|█ | 95/938 [00:47<05:29, 2.56it/s] Training 1/1 epoch 
(loss 1.8216): 10%|█ | 96/938 [00:47<05:47, 2.43it/s] Training 1/1 epoch (loss 1.9710): 10%|█ | 96/938 [00:47<05:47, 2.43it/s] Training 1/1 epoch (loss 1.9710): 10%|█ | 97/938 [00:47<06:02, 2.32it/s] Training 1/1 epoch (loss 1.8447): 10%|█ | 97/938 [00:48<06:02, 2.32it/s] Training 1/1 epoch (loss 1.8447): 10%|█ | 98/938 [00:48<05:51, 2.39it/s] Training 1/1 epoch (loss 1.9581): 10%|█ | 98/938 [00:48<05:51, 2.39it/s] Training 1/1 epoch (loss 1.9581): 11%|█ | 99/938 [00:48<05:29, 2.55it/s] Training 1/1 epoch (loss 1.6821): 11%|█ | 99/938 [00:48<05:29, 2.55it/s] Training 1/1 epoch (loss 1.6821): 11%|█ | 100/938 [00:48<05:09, 2.71it/s] Training 1/1 epoch (loss 1.9115): 11%|█ | 100/938 [00:49<05:09, 2.71it/s] Training 1/1 epoch (loss 1.9115): 11%|█ | 101/938 [00:49<05:06, 2.73it/s] Training 1/1 epoch (loss 1.8876): 11%|█ | 101/938 [00:49<05:06, 2.73it/s] Training 1/1 epoch (loss 1.8876): 11%|█ | 102/938 [00:49<04:57, 2.81it/s] Training 1/1 epoch (loss 1.8352): 11%|█ | 102/938 [00:49<04:57, 2.81it/s] Training 1/1 epoch (loss 1.8352): 11%|█ | 103/938 [00:49<05:04, 2.75it/s] Training 1/1 epoch (loss 1.9042): 11%|█ | 103/938 [00:50<05:04, 2.75it/s] Training 1/1 epoch (loss 1.9042): 11%|█ | 104/938 [00:50<05:04, 2.74it/s] Training 1/1 epoch (loss 1.9621): 11%|█ | 104/938 [00:50<05:04, 2.74it/s] Training 1/1 epoch (loss 1.9621): 11%|█ | 105/938 [00:50<05:33, 2.50it/s] Training 1/1 epoch (loss 1.8808): 11%|█ | 105/938 [00:50<05:33, 2.50it/s] Training 1/1 epoch (loss 1.8808): 11%|█▏ | 106/938 [00:50<05:14, 2.64it/s] Training 1/1 epoch (loss 1.9377): 11%|█▏ | 106/938 [00:51<05:14, 2.64it/s] Training 1/1 epoch (loss 1.9377): 11%|█▏ | 107/938 [00:51<05:22, 2.58it/s] Training 1/1 epoch (loss 2.1005): 11%|█▏ | 107/938 [00:51<05:22, 2.58it/s] Training 1/1 epoch (loss 2.1005): 12%|█▏ | 108/938 [00:51<05:40, 2.44it/s] Training 1/1 epoch (loss 2.0435): 12%|█▏ | 108/938 [00:52<05:40, 2.44it/s] Training 1/1 epoch (loss 
2.0435): 12%|█▏ | 109/938 [00:52<05:17, 2.61it/s] Training 1/1 epoch (loss 2.0695): 12%|█▏ | 109/938 [00:52<05:17, 2.61it/s] Training 1/1 epoch (loss 2.0695): 12%|█▏ | 110/938 [00:52<05:04, 2.72it/s] Training 1/1 epoch (loss 1.7382): 12%|█▏ | 110/938 [00:52<05:04, 2.72it/s] Training 1/1 epoch (loss 1.7382): 12%|█▏ | 111/938 [00:52<05:09, 2.67it/s] Training 1/1 epoch (loss 2.0131): 12%|█▏ | 111/938 [00:53<05:09, 2.67it/s] Training 1/1 epoch (loss 2.0131): 12%|█▏ | 112/938 [00:53<05:14, 2.63it/s] Training 1/1 epoch (loss 1.8224): 12%|█▏ | 112/938 [00:53<05:14, 2.63it/s] Training 1/1 epoch (loss 1.8224): 12%|█▏ | 113/938 [00:53<05:14, 2.62it/s] Training 1/1 epoch (loss 1.9035): 12%|█▏ | 113/938 [00:54<05:14, 2.62it/s] Training 1/1 epoch (loss 1.9035): 12%|█▏ | 114/938 [00:54<05:40, 2.42it/s] Training 1/1 epoch (loss 1.8465): 12%|█▏ | 114/938 [00:54<05:40, 2.42it/s] Training 1/1 epoch (loss 1.8465): 12%|█▏ | 115/938 [00:54<05:33, 2.47it/s] Training 1/1 epoch (loss 1.9724): 12%|█▏ | 115/938 [00:55<05:33, 2.47it/s] Training 1/1 epoch (loss 1.9724): 12%|█▏ | 116/938 [00:55<05:57, 2.30it/s] Training 1/1 epoch (loss 1.8001): 12%|█▏ | 116/938 [00:55<05:57, 2.30it/s] Training 1/1 epoch (loss 1.8001): 12%|█▏ | 117/938 [00:55<05:47, 2.36it/s] Training 1/1 epoch (loss 1.9260): 12%|█▏ | 117/938 [00:55<05:47, 2.36it/s] Training 1/1 epoch (loss 1.9260): 13%|█▎ | 118/938 [00:55<05:52, 2.33it/s] Training 1/1 epoch (loss 1.9705): 13%|█▎ | 118/938 [00:56<05:52, 2.33it/s] Training 1/1 epoch (loss 1.9705): 13%|█▎ | 119/938 [00:56<05:26, 2.51it/s] Training 1/1 epoch (loss 1.8801): 13%|█▎ | 119/938 [00:56<05:26, 2.51it/s] Training 1/1 epoch (loss 1.8801): 13%|█▎ | 120/938 [00:56<05:26, 2.50it/s] Training 1/1 epoch (loss 1.9432): 13%|█▎ | 120/938 [00:56<05:26, 2.50it/s] Training 1/1 epoch (loss 1.9432): 13%|█▎ | 121/938 [00:56<05:10, 2.63it/s] Training 1/1 epoch (loss 1.9196): 13%|█▎ | 
121/938 [00:57<05:10, 2.63it/s] Training 1/1 epoch (loss 1.9196): 13%|█▎ | 122/938 [00:57<05:13, 2.60it/s] Training 1/1 epoch (loss 1.8550): 13%|█▎ | 122/938 [00:57<05:13, 2.60it/s] Training 1/1 epoch (loss 1.8550): 13%|█▎ | 123/938 [00:57<05:12, 2.61it/s] Training 1/1 epoch (loss 1.9007): 13%|█▎ | 123/938 [00:58<05:12, 2.61it/s] Training 1/1 epoch (loss 1.9007): 13%|█▎ | 124/938 [00:58<05:07, 2.64it/s] Training 1/1 epoch (loss 1.9370): 13%|█▎ | 124/938 [00:58<05:07, 2.64it/s] Training 1/1 epoch (loss 1.9370): 13%|█▎ | 125/938 [00:58<05:02, 2.69it/s] Training 1/1 epoch (loss 1.9887): 13%|█▎ | 125/938 [00:58<05:02, 2.69it/s] Training 1/1 epoch (loss 1.9887): 13%|█▎ | 126/938 [00:58<05:00, 2.71it/s] Training 1/1 epoch (loss 1.7783): 13%|█▎ | 126/938 [00:59<05:00, 2.71it/s] Training 1/1 epoch (loss 1.7783): 14%|█▎ | 127/938 [00:59<04:54, 2.75it/s] Training 1/1 epoch (loss 1.8991): 14%|█▎ | 127/938 [00:59<04:54, 2.75it/s] Training 1/1 epoch (loss 1.8991): 14%|█▎ | 128/938 [00:59<05:09, 2.62it/s] Training 1/1 epoch (loss 1.9085): 14%|█▎ | 128/938 [01:00<05:09, 2.62it/s] Training 1/1 epoch (loss 1.9085): 14%|█▍ | 129/938 [01:00<05:51, 2.30it/s] Training 1/1 epoch (loss 1.8114): 14%|█▍ | 129/938 [01:00<05:51, 2.30it/s] Training 1/1 epoch (loss 1.8114): 14%|█▍ | 130/938 [01:00<05:33, 2.42it/s] Training 1/1 epoch (loss 1.8533): 14%|█▍ | 130/938 [01:00<05:33, 2.42it/s] Training 1/1 epoch (loss 1.8533): 14%|█▍ | 131/938 [01:00<05:47, 2.32it/s] Training 1/1 epoch (loss 1.7540): 14%|█▍ | 131/938 [01:01<05:47, 2.32it/s] Training 1/1 epoch (loss 1.7540): 14%|█▍ | 132/938 [01:01<05:44, 2.34it/s] Training 1/1 epoch (loss 1.9154): 14%|█▍ | 132/938 [01:01<05:44, 2.34it/s] Training 1/1 epoch (loss 1.9154): 14%|█▍ | 133/938 [01:01<05:18, 2.53it/s] Training 1/1 epoch (loss 2.0708): 14%|█▍ | 133/938 [01:02<05:18, 2.53it/s] Training 1/1 epoch (loss 2.0708): 14%|█▍ | 134/938 [01:02<05:13, 
2.57it/s] Training 1/1 epoch (loss 1.7834): 14%|█▍ | 134/938 [01:02<05:13, 2.57it/s] Training 1/1 epoch (loss 1.7834): 14%|█▍ | 135/938 [01:02<05:08, 2.60it/s] Training 1/1 epoch (loss 1.8876): 14%|█▍ | 135/938 [01:02<05:08, 2.60it/s] Training 1/1 epoch (loss 1.8876): 14%|█▍ | 136/938 [01:02<04:51, 2.75it/s] Training 1/1 epoch (loss 1.9396): 14%|█▍ | 136/938 [01:03<04:51, 2.75it/s] Training 1/1 epoch (loss 1.9396): 15%|█▍ | 137/938 [01:03<04:55, 2.72it/s] Training 1/1 epoch (loss 1.9331): 15%|█▍ | 137/938 [01:03<04:55, 2.72it/s] Training 1/1 epoch (loss 1.9331): 15%|█▍ | 138/938 [01:03<04:44, 2.81it/s] Training 1/1 epoch (loss 1.8867): 15%|█▍ | 138/938 [01:03<04:44, 2.81it/s] Training 1/1 epoch (loss 1.8867): 15%|█▍ | 139/938 [01:03<04:40, 2.85it/s] Training 1/1 epoch (loss 1.9144): 15%|█▍ | 139/938 [01:04<04:40, 2.85it/s] Training 1/1 epoch (loss 1.9144): 15%|█▍ | 140/938 [01:04<04:33, 2.91it/s] Training 1/1 epoch (loss 1.9752): 15%|█▍ | 140/938 [01:04<04:33, 2.91it/s] Training 1/1 epoch (loss 1.9752): 15%|█▌ | 141/938 [01:04<05:13, 2.54it/s] Training 1/1 epoch (loss 1.9065): 15%|█▌ | 141/938 [01:05<05:13, 2.54it/s] Training 1/1 epoch (loss 1.9065): 15%|█▌ | 142/938 [01:05<05:33, 2.39it/s] Training 1/1 epoch (loss 1.8340): 15%|█▌ | 142/938 [01:05<05:33, 2.39it/s] Training 1/1 epoch (loss 1.8340): 15%|█▌ | 143/938 [01:05<05:14, 2.53it/s] Training 1/1 epoch (loss 1.9820): 15%|█▌ | 143/938 [01:05<05:14, 2.53it/s] Training 1/1 epoch (loss 1.9820): 15%|█▌ | 144/938 [01:05<05:14, 2.52it/s] Training 1/1 epoch (loss 1.8798): 15%|█▌ | 144/938 [01:06<05:14, 2.52it/s] Training 1/1 epoch (loss 1.8798): 15%|█▌ | 145/938 [01:06<05:01, 2.63it/s] Training 1/1 epoch (loss 1.8653): 15%|█▌ | 145/938 [01:06<05:01, 2.63it/s] Training 1/1 epoch (loss 1.8653): 16%|█▌ | 146/938 [01:06<04:55, 2.68it/s] Training 1/1 epoch (loss 1.6954): 16%|█▌ | 146/938 [01:07<04:55, 2.68it/s] Training 1/1 
epoch (loss 1.6954): 16%|█▌ | 147/938 [01:07<06:22, 2.07it/s] Training 1/1 epoch (loss 2.0651): 16%|█▌ | 147/938 [01:07<06:22, 2.07it/s] Training 1/1 epoch (loss 2.0651): 16%|█▌ | 148/938 [01:07<05:50, 2.25it/s] Training 1/1 epoch (loss 1.9874): 16%|█▌ | 148/938 [01:08<05:50, 2.25it/s] Training 1/1 epoch (loss 1.9874): 16%|█▌ | 149/938 [01:08<06:18, 2.08it/s] Training 1/1 epoch (loss 1.7731): 16%|█▌ | 149/938 [01:08<06:18, 2.08it/s] Training 1/1 epoch (loss 1.7731): 16%|█▌ | 150/938 [01:08<05:51, 2.24it/s] Training 1/1 epoch (loss 1.9317): 16%|█▌ | 150/938 [01:08<05:51, 2.24it/s] Training 1/1 epoch (loss 1.9317): 16%|█▌ | 151/938 [01:08<05:20, 2.46it/s] Training 1/1 epoch (loss 1.8224): 16%|█▌ | 151/938 [01:09<05:20, 2.46it/s] Training 1/1 epoch (loss 1.8224): 16%|█▌ | 152/938 [01:09<05:13, 2.51it/s] Training 1/1 epoch (loss 1.8990): 16%|█▌ | 152/938 [01:09<05:13, 2.51it/s] Training 1/1 epoch (loss 1.8990): 16%|█▋ | 153/938 [01:09<05:00, 2.61it/s] Training 1/1 epoch (loss 1.7595): 16%|█▋ | 153/938 [01:10<05:00, 2.61it/s] Training 1/1 epoch (loss 1.7595): 16%|█▋ | 154/938 [01:10<05:01, 2.60it/s] Training 1/1 epoch (loss 1.9885): 16%|█▋ | 154/938 [01:10<05:01, 2.60it/s] Training 1/1 epoch (loss 1.9885): 17%|█▋ | 155/938 [01:10<04:40, 2.79it/s] Training 1/1 epoch (loss 1.8749): 17%|█▋ | 155/938 [01:10<04:40, 2.79it/s] Training 1/1 epoch (loss 1.8749): 17%|█▋ | 156/938 [01:10<04:46, 2.73it/s] Training 1/1 epoch (loss 1.9154): 17%|█▋ | 156/938 [01:11<04:46, 2.73it/s] Training 1/1 epoch (loss 1.9154): 17%|█▋ | 157/938 [01:11<05:40, 2.29it/s] Training 1/1 epoch (loss 1.8001): 17%|█▋ | 157/938 [01:11<05:40, 2.29it/s] Training 1/1 epoch (loss 1.8001): 17%|█▋ | 158/938 [01:11<06:07, 2.12it/s] Training 1/1 epoch (loss 1.9466): 17%|█▋ | 158/938 [01:12<06:07, 2.12it/s] Training 1/1 epoch (loss 1.9466): 17%|█▋ | 159/938 [01:12<06:04, 2.14it/s] Training 1/1 epoch (loss 1.9001): 
17%|█▋ | 159/938 [01:12<06:04, 2.14it/s] Training 1/1 epoch (loss 1.9001): 17%|█▋ | 160/938 [01:12<06:10, 2.10it/s] Training 1/1 epoch (loss 1.7853): 17%|█▋ | 160/938 [01:13<06:10, 2.10it/s] Training 1/1 epoch (loss 1.7853): 17%|█▋ | 161/938 [01:13<06:07, 2.11it/s] Training 1/1 epoch (loss 1.9007): 17%|█▋ | 161/938 [01:13<06:07, 2.11it/s] Training 1/1 epoch (loss 1.9007): 17%|█▋ | 162/938 [01:13<06:23, 2.03it/s] Training 1/1 epoch (loss 1.9421): 17%|█▋ | 162/938 [01:14<06:23, 2.03it/s] Training 1/1 epoch (loss 1.9421): 17%|█▋ | 163/938 [01:14<06:09, 2.10it/s] Training 1/1 epoch (loss 1.8834): 17%|█▋ | 163/938 [01:14<06:09, 2.10it/s] Training 1/1 epoch (loss 1.8834): 17%|█▋ | 164/938 [01:14<05:41, 2.26it/s] Training 1/1 epoch (loss 1.8991): 17%|█▋ | 164/938 [01:15<05:41, 2.26it/s] Training 1/1 epoch (loss 1.8991): 18%|█▊ | 165/938 [01:15<05:51, 2.20it/s] Training 1/1 epoch (loss 1.8145): 18%|█▊ | 165/938 [01:15<05:51, 2.20it/s] Training 1/1 epoch (loss 1.8145): 18%|█▊ | 166/938 [01:15<05:45, 2.23it/s] Training 1/1 epoch (loss 1.9202): 18%|█▊ | 166/938 [01:15<05:45, 2.23it/s] Training 1/1 epoch (loss 1.9202): 18%|█▊ | 167/938 [01:15<05:44, 2.24it/s] Training 1/1 epoch (loss 1.7263): 18%|█▊ | 167/938 [01:16<05:44, 2.24it/s] Training 1/1 epoch (loss 1.7263): 18%|█▊ | 168/938 [01:16<05:28, 2.35it/s] Training 1/1 epoch (loss 2.0150): 18%|█▊ | 168/938 [01:16<05:28, 2.35it/s] Training 1/1 epoch (loss 2.0150): 18%|█▊ | 169/938 [01:16<05:17, 2.42it/s] Training 1/1 epoch (loss 1.8856): 18%|█▊ | 169/938 [01:17<05:17, 2.42it/s] Training 1/1 epoch (loss 1.8856): 18%|█▊ | 170/938 [01:17<05:37, 2.27it/s] Training 1/1 epoch (loss 1.8593): 18%|█▊ | 170/938 [01:17<05:37, 2.27it/s] Training 1/1 epoch (loss 1.8593): 18%|█▊ | 171/938 [01:17<05:27, 2.34it/s] Training 1/1 epoch (loss 1.7799): 18%|█▊ | 171/938 [01:18<05:27, 2.34it/s] Training 1/1 epoch (loss 1.7799): 18%|█▊ | 172/938 
[01:18<05:25, 2.36it/s] Training 1/1 epoch (loss 1.9062): 18%|█▊ | 172/938 [01:18<05:25, 2.36it/s] Training 1/1 epoch (loss 1.9062): 18%|█▊ | 173/938 [01:18<05:06, 2.49it/s] Training 1/1 epoch (loss 2.0347): 18%|█▊ | 173/938 [01:18<05:06, 2.49it/s] Training 1/1 epoch (loss 2.0347): 19%|█▊ | 174/938 [01:18<04:57, 2.57it/s] Training 1/1 epoch (loss 1.9494): 19%|█▊ | 174/938 [01:19<04:57, 2.57it/s] Training 1/1 epoch (loss 1.9494): 19%|█▊ | 175/938 [01:19<04:49, 2.64it/s] Training 1/1 epoch (loss 1.6884): 19%|█▊ | 175/938 [01:19<04:49, 2.64it/s] Training 1/1 epoch (loss 1.6884): 19%|█▉ | 176/938 [01:19<04:52, 2.60it/s] Training 1/1 epoch (loss 1.8310): 19%|█▉ | 176/938 [01:19<04:52, 2.60it/s] Training 1/1 epoch (loss 1.8310): 19%|█▉ | 177/938 [01:19<04:46, 2.66it/s] Training 1/1 epoch (loss 1.8946): 19%|█▉ | 177/938 [01:20<04:46, 2.66it/s] Training 1/1 epoch (loss 1.8946): 19%|█▉ | 178/938 [01:20<04:46, 2.65it/s] Training 1/1 epoch (loss 1.9684): 19%|█▉ | 178/938 [01:20<04:46, 2.65it/s] Training 1/1 epoch (loss 1.9684): 19%|█▉ | 179/938 [01:20<04:42, 2.69it/s] Training 1/1 epoch (loss 1.8251): 19%|█▉ | 179/938 [01:20<04:42, 2.69it/s] Training 1/1 epoch (loss 1.8251): 19%|█▉ | 180/938 [01:20<04:43, 2.68it/s] Training 1/1 epoch (loss 1.8193): 19%|█▉ | 180/938 [01:21<04:43, 2.68it/s] Training 1/1 epoch (loss 1.8193): 19%|█▉ | 181/938 [01:21<05:01, 2.51it/s] Training 1/1 epoch (loss 1.7255): 19%|█▉ | 181/938 [01:21<05:01, 2.51it/s] Training 1/1 epoch (loss 1.7255): 19%|█▉ | 182/938 [01:21<05:01, 2.50it/s] Training 1/1 epoch (loss 1.9876): 19%|█▉ | 182/938 [01:22<05:01, 2.50it/s] Training 1/1 epoch (loss 1.9876): 20%|█▉ | 183/938 [01:22<05:01, 2.50it/s] Training 1/1 epoch (loss 1.8105): 20%|█▉ | 183/938 [01:22<05:01, 2.50it/s] Training 1/1 epoch (loss 1.8105): 20%|█▉ | 184/938 [01:22<04:59, 2.52it/s] Training 1/1 epoch (loss 1.9081): 20%|█▉ | 184/938 [01:23<04:59, 2.52it/s] 
Training 1/1 epoch (loss 1.9081): 20%|█▉ | 185/938 [01:23<04:53, 2.57it/s] Training 1/1 epoch (loss 1.8328): 20%|█▉ | 185/938 [01:23<04:53, 2.57it/s] Training 1/1 epoch (loss 1.8328): 20%|█▉ | 186/938 [01:23<05:16, 2.38it/s] Training 1/1 epoch (loss 1.6989): 20%|█▉ | 186/938 [01:23<05:16, 2.38it/s] Training 1/1 epoch (loss 1.6989): 20%|█▉ | 187/938 [01:23<05:24, 2.32it/s] Training 1/1 epoch (loss 1.8318): 20%|█▉ | 187/938 [01:24<05:24, 2.32it/s] Training 1/1 epoch (loss 1.8318): 20%|██ | 188/938 [01:24<05:07, 2.44it/s] Training 1/1 epoch (loss 1.8736): 20%|██ | 188/938 [01:24<05:07, 2.44it/s] Training 1/1 epoch (loss 1.8736): 20%|██ | 189/938 [01:24<05:10, 2.41it/s] Training 1/1 epoch (loss 2.1595): 20%|██ | 189/938 [01:25<05:10, 2.41it/s] Training 1/1 epoch (loss 2.1595): 20%|██ | 190/938 [01:25<05:25, 2.30it/s] Training 1/1 epoch (loss 1.8147): 20%|██ | 190/938 [01:25<05:25, 2.30it/s] Training 1/1 epoch (loss 1.8147): 20%|██ | 191/938 [01:25<05:29, 2.26it/s] Training 1/1 epoch (loss 1.8938): 20%|██ | 191/938 [01:26<05:29, 2.26it/s] Training 1/1 epoch (loss 1.8938): 20%|██ | 192/938 [01:26<05:23, 2.31it/s] Training 1/1 epoch (loss 1.8868): 20%|██ | 192/938 [01:26<05:23, 2.31it/s] Training 1/1 epoch (loss 1.8868): 21%|██ | 193/938 [01:26<05:04, 2.44it/s] Training 1/1 epoch (loss 1.8710): 21%|██ | 193/938 [01:26<05:04, 2.44it/s] Training 1/1 epoch (loss 1.8710): 21%|██ | 194/938 [01:26<05:00, 2.48it/s] Training 1/1 epoch (loss 1.8852): 21%|██ | 194/938 [01:27<05:00, 2.48it/s] Training 1/1 epoch (loss 1.8852): 21%|██ | 195/938 [01:27<05:05, 2.43it/s] Training 1/1 epoch (loss 1.8484): 21%|██ | 195/938 [01:27<05:05, 2.43it/s] Training 1/1 epoch (loss 1.8484): 21%|██ | 196/938 [01:27<05:06, 2.42it/s] Training 1/1 epoch (loss 1.9611): 21%|██ | 196/938 [01:28<05:06, 2.42it/s] Training 1/1 epoch (loss 1.9611): 21%|██ | 197/938 [01:28<04:59, 2.47it/s] Training 1/1 epoch (loss 
1.7724): 21%|██ | 197/938 [01:28<04:59, 2.47it/s] Training 1/1 epoch (loss 1.7724): 21%|██ | 198/938 [01:28<04:43, 2.61it/s] Training 1/1 epoch (loss 1.8703): 21%|██ | 198/938 [01:28<04:43, 2.61it/s] Training 1/1 epoch (loss 1.8703): 21%|██ | 199/938 [01:28<04:35, 2.68it/s] Training 1/1 epoch (loss 1.8068): 21%|██ | 199/938 [01:29<04:35, 2.68it/s] Training 1/1 epoch (loss 1.8068): 21%|██▏ | 200/938 [01:29<04:46, 2.57it/s] Training 1/1 epoch (loss 1.8575): 21%|██▏ | 200/938 [01:29<04:46, 2.57it/s] Training 1/1 epoch (loss 1.8575): 21%|██▏ | 201/938 [01:29<04:50, 2.53it/s] Training 1/1 epoch (loss 1.9390): 21%|██▏ | 201/938 [01:30<04:50, 2.53it/s] Training 1/1 epoch (loss 1.9390): 22%|██▏ | 202/938 [01:30<05:13, 2.34it/s] Training 1/1 epoch (loss 1.9038): 22%|██▏ | 202/938 [01:30<05:13, 2.34it/s] Training 1/1 epoch (loss 1.9038): 22%|██▏ | 203/938 [01:30<04:51, 2.52it/s] Training 1/1 epoch (loss 1.8200): 22%|██▏ | 203/938 [01:30<04:51, 2.52it/s] Training 1/1 epoch (loss 1.8200): 22%|██▏ | 204/938 [01:30<04:37, 2.64it/s] Training 1/1 epoch (loss 1.8145): 22%|██▏ | 204/938 [01:31<04:37, 2.64it/s] Training 1/1 epoch (loss 1.8145): 22%|██▏ | 205/938 [01:31<04:39, 2.62it/s] Training 1/1 epoch (loss 1.9717): 22%|██▏ | 205/938 [01:31<04:39, 2.62it/s] Training 1/1 epoch (loss 1.9717): 22%|██▏ | 206/938 [01:31<04:41, 2.60it/s] Training 1/1 epoch (loss 1.9190): 22%|██▏ | 206/938 [01:31<04:41, 2.60it/s] Training 1/1 epoch (loss 1.9190): 22%|██▏ | 207/938 [01:31<04:40, 2.60it/s] Training 1/1 epoch (loss 1.9296): 22%|██▏ | 207/938 [01:32<04:40, 2.60it/s] Training 1/1 epoch (loss 1.9296): 22%|██▏ | 208/938 [01:32<04:42, 2.58it/s] Training 1/1 epoch (loss 1.8120): 22%|██▏ | 208/938 [01:32<04:42, 2.58it/s] Training 1/1 epoch (loss 1.8120): 22%|██▏ | 209/938 [01:32<04:38, 2.62it/s] Training 1/1 epoch (loss 1.8017): 22%|██▏ | 209/938 [01:33<04:38, 
2.62it/s] Training 1/1 epoch (loss 1.8017): 22%|██▏ | 210/938 [01:33<04:33, 2.66it/s] Training 1/1 epoch (loss 1.9044): 22%|██▏ | 210/938 [01:33<04:33, 2.66it/s] Training 1/1 epoch (loss 1.9044): 22%|██▏ | 211/938 [01:33<04:33, 2.66it/s] Training 1/1 epoch (loss 1.7307): 22%|██▏ | 211/938 [01:33<04:33, 2.66it/s] Training 1/1 epoch (loss 1.7307): 23%|██▎ | 212/938 [01:33<04:44, 2.55it/s] Training 1/1 epoch (loss 1.7741): 23%|██▎ | 212/938 [01:34<04:44, 2.55it/s] Training 1/1 epoch (loss 1.7741): 23%|██▎ | 213/938 [01:34<04:36, 2.62it/s] Training 1/1 epoch (loss 1.8641): 23%|██▎ | 213/938 [01:34<04:36, 2.62it/s] Training 1/1 epoch (loss 1.8641): 23%|██▎ | 214/938 [01:34<04:43, 2.56it/s] Training 1/1 epoch (loss 1.8841): 23%|██▎ | 214/938 [01:35<04:43, 2.56it/s] Training 1/1 epoch (loss 1.8841): 23%|██▎ | 215/938 [01:35<05:06, 2.36it/s] Training 1/1 epoch (loss 1.7418): 23%|██▎ | 215/938 [01:35<05:06, 2.36it/s] Training 1/1 epoch (loss 1.7418): 23%|██▎ | 216/938 [01:35<05:21, 2.25it/s] Training 1/1 epoch (loss 1.9979): 23%|██▎ | 216/938 [01:36<05:21, 2.25it/s] Training 1/1 epoch (loss 1.9979): 23%|██▎ | 217/938 [01:36<05:15, 2.29it/s] Training 1/1 epoch (loss 1.8798): 23%|██▎ | 217/938 [01:36<05:15, 2.29it/s] Training 1/1 epoch (loss 1.8798): 23%|██▎ | 218/938 [01:36<04:53, 2.45it/s] Training 1/1 epoch (loss 1.8203): 23%|██▎ | 218/938 [01:36<04:53, 2.45it/s] Training 1/1 epoch (loss 1.8203): 23%|██▎ | 219/938 [01:36<04:42, 2.55it/s] Training 1/1 epoch (loss 1.7585): 23%|██▎ | 219/938 [01:37<04:42, 2.55it/s] Training 1/1 epoch (loss 1.7585): 23%|██▎ | 220/938 [01:37<04:21, 2.74it/s] Training 1/1 epoch (loss 1.9284): 23%|██▎ | 220/938 [01:37<04:21, 2.74it/s] Training 1/1 epoch (loss 1.9284): 24%|██▎ | 221/938 [01:37<04:36, 2.59it/s] Training 1/1 epoch (loss 1.7878): 24%|██▎ | 221/938 [01:37<04:36, 2.59it/s] Training 1/1 epoch 
(loss 1.7878): 24%|██▎ | 222/938 [01:37<05:02, 2.37it/s] Training 1/1 epoch (loss 1.9698): 24%|██▎ | 222/938 [01:38<05:02, 2.37it/s] Training 1/1 epoch (loss 1.9698): 24%|██▍ | 223/938 [01:38<04:53, 2.44it/s] Training 1/1 epoch (loss 1.9037): 24%|██▍ | 223/938 [01:38<04:53, 2.44it/s] Training 1/1 epoch (loss 1.9037): 24%|██▍ | 224/938 [01:38<04:46, 2.49it/s] Training 1/1 epoch (loss 1.8851): 24%|██▍ | 224/938 [01:39<04:46, 2.49it/s] Training 1/1 epoch (loss 1.8851): 24%|██▍ | 225/938 [01:39<04:40, 2.54it/s] Training 1/1 epoch (loss 1.9441): 24%|██▍ | 225/938 [01:39<04:40, 2.54it/s] Training 1/1 epoch (loss 1.9441): 24%|██▍ | 226/938 [01:39<05:00, 2.37it/s] Training 1/1 epoch (loss 1.8779): 24%|██▍ | 226/938 [01:40<05:00, 2.37it/s] Training 1/1 epoch (loss 1.8779): 24%|██▍ | 227/938 [01:40<05:15, 2.25it/s] Training 1/1 epoch (loss 1.8950): 24%|██▍ | 227/938 [01:40<05:15, 2.25it/s] Training 1/1 epoch (loss 1.8950): 24%|██▍ | 228/938 [01:40<04:57, 2.38it/s] Training 1/1 epoch (loss 1.8933): 24%|██▍ | 228/938 [01:40<04:57, 2.38it/s] Training 1/1 epoch (loss 1.8933): 24%|██▍ | 229/938 [01:40<04:51, 2.43it/s] Training 1/1 epoch (loss 1.9389): 24%|██▍ | 229/938 [01:41<04:51, 2.43it/s] Training 1/1 epoch (loss 1.9389): 25%|██▍ | 230/938 [01:41<05:02, 2.34it/s] Training 1/1 epoch (loss 1.8613): 25%|██▍ | 230/938 [01:41<05:02, 2.34it/s] Training 1/1 epoch (loss 1.8613): 25%|██▍ | 231/938 [01:41<05:21, 2.20it/s] Training 1/1 epoch (loss 1.8758): 25%|██▍ | 231/938 [01:42<05:21, 2.20it/s] Training 1/1 epoch (loss 1.8758): 25%|██▍ | 232/938 [01:42<05:26, 2.16it/s] Training 1/1 epoch (loss 1.8951): 25%|██▍ | 232/938 [01:42<05:26, 2.16it/s] Training 1/1 epoch (loss 1.8951): 25%|██▍ | 233/938 [01:42<05:16, 2.23it/s] Training 1/1 epoch (loss 1.8631): 25%|██▍ | 233/938 [01:43<05:16, 2.23it/s] Training 1/1 epoch (loss 1.8631): 25%|██▍ | 
234/938 [01:43<05:21, 2.19it/s] Training 1/1 epoch (loss 1.8404): 25%|██▍ | 234/938 [01:43<05:21, 2.19it/s] Training 1/1 epoch (loss 1.8404): 25%|██▌ | 235/938 [01:43<05:29, 2.13it/s] Training 1/1 epoch (loss 1.7809): 25%|██▌ | 235/938 [01:44<05:29, 2.13it/s] Training 1/1 epoch (loss 1.7809): 25%|██▌ | 236/938 [01:44<05:21, 2.18it/s] Training 1/1 epoch (loss 1.9851): 25%|██▌ | 236/938 [01:44<05:21, 2.18it/s] Training 1/1 epoch (loss 1.9851): 25%|██▌ | 237/938 [01:44<05:06, 2.29it/s] Training 1/1 epoch (loss 1.7303): 25%|██▌ | 237/938 [01:44<05:06, 2.29it/s] Training 1/1 epoch (loss 1.7303): 25%|██▌ | 238/938 [01:44<05:12, 2.24it/s] Training 1/1 epoch (loss 1.7647): 25%|██▌ | 238/938 [01:45<05:12, 2.24it/s] Training 1/1 epoch (loss 1.7647): 25%|██▌ | 239/938 [01:45<05:12, 2.24it/s] Training 1/1 epoch (loss 1.8432): 25%|██▌ | 239/938 [01:46<05:12, 2.24it/s] Training 1/1 epoch (loss 1.8432): 26%|██▌ | 240/938 [01:46<05:39, 2.06it/s] Training 1/1 epoch (loss 1.9860): 26%|██▌ | 240/938 [01:46<05:39, 2.06it/s] Training 1/1 epoch (loss 1.9860): 26%|██▌ | 241/938 [01:46<05:23, 2.16it/s] Training 1/1 epoch (loss 1.9164): 26%|██▌ | 241/938 [01:46<05:23, 2.16it/s] Training 1/1 epoch (loss 1.9164): 26%|██▌ | 242/938 [01:46<05:00, 2.32it/s] Training 1/1 epoch (loss 1.9111): 26%|██▌ | 242/938 [01:47<05:00, 2.32it/s] Training 1/1 epoch (loss 1.9111): 26%|██▌ | 243/938 [01:47<04:53, 2.37it/s] Training 1/1 epoch (loss 1.7805): 26%|██▌ | 243/938 [01:47<04:53, 2.37it/s] Training 1/1 epoch (loss 1.7805): 26%|██▌ | 244/938 [01:47<04:44, 2.44it/s] Training 1/1 epoch (loss 1.7904): 26%|██▌ | 244/938 [01:47<04:44, 2.44it/s] Training 1/1 epoch (loss 1.7904): 26%|██▌ | 245/938 [01:47<04:46, 2.42it/s] Training 1/1 epoch (loss 1.8602): 26%|██▌ | 245/938 [01:48<04:46, 2.42it/s] Training 1/1 epoch (loss 1.8602): 26%|██▌ | 246/938 [01:48<04:51, 2.38it/s] 
Training 1/1 epoch (loss 1.8572): 26%|β–ˆβ–ˆβ–Œ | 246/938 [01:48<04:51, 2.38it/s] Training 1/1 epoch (loss 1.8572): 26%|β–ˆβ–ˆβ–‹ | 247/938 [01:48<04:35, 2.51it/s] Training 1/1 epoch (loss 1.7882): 26%|β–ˆβ–ˆβ–‹ | 247/938 [01:49<04:35, 2.51it/s] Training 1/1 epoch (loss 1.7882): 26%|β–ˆβ–ˆβ–‹ | 248/938 [01:49<04:24, 2.61it/s] Training 1/1 epoch (loss 1.8923): 26%|β–ˆβ–ˆβ–‹ | 248/938 [01:49<04:24, 2.61it/s] Training 1/1 epoch (loss 1.8923): 27%|β–ˆβ–ˆβ–‹ | 249/938 [01:49<04:17, 2.68it/s] Training 1/1 epoch (loss 1.6846): 27%|β–ˆβ–ˆβ–‹ | 249/938 [01:49<04:17, 2.68it/s] Training 1/1 epoch (loss 1.6846): 27%|β–ˆβ–ˆβ–‹ | 250/938 [01:49<04:17, 2.67it/s] Training 1/1 epoch (loss 1.9552): 27%|β–ˆβ–ˆβ–‹ | 250/938 [01:50<04:17, 2.67it/s] Training 1/1 epoch (loss 1.9552): 27%|β–ˆβ–ˆβ–‹ | 251/938 [01:50<04:29, 2.55it/s] Training 1/1 epoch (loss 1.8999): 27%|β–ˆβ–ˆβ–‹ | 251/938 [01:50<04:29, 2.55it/s] Training 1/1 epoch (loss 1.8999): 27%|β–ˆβ–ˆβ–‹ | 252/938 [01:50<04:28, 2.56it/s] Training 1/1 epoch (loss 1.8726): 27%|β–ˆβ–ˆβ–‹ | 252/938 [01:50<04:28, 2.56it/s] Training 1/1 epoch (loss 1.8726): 27%|β–ˆβ–ˆβ–‹ | 253/938 [01:50<04:14, 2.69it/s] Training 1/1 epoch (loss 1.8620): 27%|β–ˆβ–ˆβ–‹ | 253/938 [01:51<04:14, 2.69it/s] Training 1/1 epoch (loss 1.8620): 27%|β–ˆβ–ˆβ–‹ | 254/938 [01:51<04:18, 2.65it/s] Training 1/1 epoch (loss 1.9170): 27%|β–ˆβ–ˆβ–‹ | 254/938 [01:51<04:18, 2.65it/s] Training 1/1 epoch (loss 1.9170): 27%|β–ˆβ–ˆβ–‹ | 255/938 [01:51<04:16, 2.67it/s] Training 1/1 epoch (loss 1.8188): 27%|β–ˆβ–ˆβ–‹ | 255/938 [01:52<04:16, 2.67it/s] Training 1/1 epoch (loss 1.8188): 27%|β–ˆβ–ˆβ–‹ | 256/938 [01:52<04:20, 2.62it/s] Training 1/1 epoch (loss 1.8237): 27%|β–ˆβ–ˆβ–‹ | 256/938 [01:52<04:20, 2.62it/s] Training 1/1 epoch (loss 1.8237): 27%|β–ˆβ–ˆβ–‹ | 257/938 [01:52<04:19, 2.62it/s] Training 1/1 epoch (loss 1.8898): 27%|β–ˆβ–ˆβ–‹ | 257/938 [01:52<04:19, 2.62it/s] Training 1/1 epoch (loss 1.8898): 28%|β–ˆβ–ˆβ–Š | 258/938 [01:52<04:19, 2.62it/s] Training 1/1 epoch (loss 
1.9217): 28%|β–ˆβ–ˆβ–Š | 258/938 [01:53<04:19, 2.62it/s] Training 1/1 epoch (loss 1.9217): 28%|β–ˆβ–ˆβ–Š | 259/938 [01:53<04:23, 2.57it/s] Training 1/1 epoch (loss 1.7992): 28%|β–ˆβ–ˆβ–Š | 259/938 [01:53<04:23, 2.57it/s] Training 1/1 epoch (loss 1.7992): 28%|β–ˆβ–ˆβ–Š | 260/938 [01:53<04:21, 2.59it/s] Training 1/1 epoch (loss 1.8333): 28%|β–ˆβ–ˆβ–Š | 260/938 [01:54<04:21, 2.59it/s] Training 1/1 epoch (loss 1.8333): 28%|β–ˆβ–ˆβ–Š | 261/938 [01:54<04:47, 2.36it/s] Training 1/1 epoch (loss 1.8476): 28%|β–ˆβ–ˆβ–Š | 261/938 [01:54<04:47, 2.36it/s] Training 1/1 epoch (loss 1.8476): 28%|β–ˆβ–ˆβ–Š | 262/938 [01:54<04:33, 2.48it/s] Training 1/1 epoch (loss 1.7293): 28%|β–ˆβ–ˆβ–Š | 262/938 [01:55<04:33, 2.48it/s] Training 1/1 epoch (loss 1.7293): 28%|β–ˆβ–ˆβ–Š | 263/938 [01:55<04:43, 2.38it/s] Training 1/1 epoch (loss 2.0098): 28%|β–ˆβ–ˆβ–Š | 263/938 [01:55<04:43, 2.38it/s] Training 1/1 epoch (loss 2.0098): 28%|β–ˆβ–ˆβ–Š | 264/938 [01:55<04:45, 2.36it/s] Training 1/1 epoch (loss 1.8749): 28%|β–ˆβ–ˆβ–Š | 264/938 [01:55<04:45, 2.36it/s] Training 1/1 epoch (loss 1.8749): 28%|β–ˆβ–ˆβ–Š | 265/938 [01:55<04:40, 2.40it/s] Training 1/1 epoch (loss 1.9167): 28%|β–ˆβ–ˆβ–Š | 265/938 [01:56<04:40, 2.40it/s] Training 1/1 epoch (loss 1.9167): 28%|β–ˆβ–ˆβ–Š | 266/938 [01:56<05:07, 2.18it/s] Training 1/1 epoch (loss 1.7122): 28%|β–ˆβ–ˆβ–Š | 266/938 [01:56<05:07, 2.18it/s] Training 1/1 epoch (loss 1.7122): 28%|β–ˆβ–ˆβ–Š | 267/938 [01:56<04:41, 2.38it/s] Training 1/1 epoch (loss 1.7892): 28%|β–ˆβ–ˆβ–Š | 267/938 [01:57<04:41, 2.38it/s] Training 1/1 epoch (loss 1.7892): 29%|β–ˆβ–ˆβ–Š | 268/938 [01:57<04:24, 2.53it/s] Training 1/1 epoch (loss 1.8174): 29%|β–ˆβ–ˆβ–Š | 268/938 [01:57<04:24, 2.53it/s] Training 1/1 epoch (loss 1.8174): 29%|β–ˆβ–ˆβ–Š | 269/938 [01:57<04:17, 2.59it/s] Training 1/1 epoch (loss 1.8428): 29%|β–ˆβ–ˆβ–Š | 269/938 [01:57<04:17, 2.59it/s] Training 1/1 epoch (loss 1.8428): 29%|β–ˆβ–ˆβ–‰ | 270/938 [01:57<04:10, 2.67it/s] Training 1/1 epoch (loss 1.9387): 29%|β–ˆβ–ˆβ–‰ | 
270/938 [01:58<04:10, 2.67it/s] Training 1/1 epoch (loss 1.9387): 29%|β–ˆβ–ˆβ–‰ | 271/938 [01:58<04:14, 2.62it/s] Training 1/1 epoch (loss 1.7537): 29%|β–ˆβ–ˆβ–‰ | 271/938 [01:58<04:14, 2.62it/s] Training 1/1 epoch (loss 1.7537): 29%|β–ˆβ–ˆβ–‰ | 272/938 [01:58<04:24, 2.52it/s] Training 1/1 epoch (loss 1.8947): 29%|β–ˆβ–ˆβ–‰ | 272/938 [01:58<04:24, 2.52it/s] Training 1/1 epoch (loss 1.8947): 29%|β–ˆβ–ˆβ–‰ | 273/938 [01:58<04:13, 2.62it/s] Training 1/1 epoch (loss 1.9248): 29%|β–ˆβ–ˆβ–‰ | 273/938 [01:59<04:13, 2.62it/s] Training 1/1 epoch (loss 1.9248): 29%|β–ˆβ–ˆβ–‰ | 274/938 [01:59<04:14, 2.61it/s] Training 1/1 epoch (loss 2.0335): 29%|β–ˆβ–ˆβ–‰ | 274/938 [01:59<04:14, 2.61it/s] Training 1/1 epoch (loss 2.0335): 29%|β–ˆβ–ˆβ–‰ | 275/938 [01:59<04:16, 2.59it/s] Training 1/1 epoch (loss 1.9129): 29%|β–ˆβ–ˆβ–‰ | 275/938 [02:00<04:16, 2.59it/s] Training 1/1 epoch (loss 1.9129): 29%|β–ˆβ–ˆβ–‰ | 276/938 [02:00<04:33, 2.42it/s] Training 1/1 epoch (loss 1.9431): 29%|β–ˆβ–ˆβ–‰ | 276/938 [02:00<04:33, 2.42it/s] Training 1/1 epoch (loss 1.9431): 30%|β–ˆβ–ˆβ–‰ | 277/938 [02:00<04:25, 2.49it/s] Training 1/1 epoch (loss 1.8247): 30%|β–ˆβ–ˆβ–‰ | 277/938 [02:00<04:25, 2.49it/s] Training 1/1 epoch (loss 1.8247): 30%|β–ˆβ–ˆβ–‰ | 278/938 [02:00<04:14, 2.59it/s] Training 1/1 epoch (loss 1.8989): 30%|β–ˆβ–ˆβ–‰ | 278/938 [02:01<04:14, 2.59it/s] Training 1/1 epoch (loss 1.8989): 30%|β–ˆβ–ˆβ–‰ | 279/938 [02:01<04:08, 2.66it/s] Training 1/1 epoch (loss 1.7479): 30%|β–ˆβ–ˆβ–‰ | 279/938 [02:01<04:08, 2.66it/s] Training 1/1 epoch (loss 1.7479): 30%|β–ˆβ–ˆβ–‰ | 280/938 [02:01<04:05, 2.68it/s] Training 1/1 epoch (loss 1.8172): 30%|β–ˆβ–ˆβ–‰ | 280/938 [02:02<04:05, 2.68it/s] Training 1/1 epoch (loss 1.8172): 30%|β–ˆβ–ˆβ–‰ | 281/938 [02:02<04:09, 2.63it/s] Training 1/1 epoch (loss 1.7444): 30%|β–ˆβ–ˆβ–‰ | 281/938 [02:02<04:09, 2.63it/s] Training 1/1 epoch (loss 1.7444): 30%|β–ˆβ–ˆβ–ˆ | 282/938 [02:02<04:22, 2.50it/s] Training 1/1 epoch (loss 2.0121): 30%|β–ˆβ–ˆβ–ˆ | 282/938 [02:02<04:22, 2.50it/s] 
Training 1/1 epoch (loss 2.0121): 30%|β–ˆβ–ˆβ–ˆ | 283/938 [02:02<04:12, 2.60it/s] Training 1/1 epoch (loss 1.8259): 30%|β–ˆβ–ˆβ–ˆ | 283/938 [02:03<04:12, 2.60it/s] Training 1/1 epoch (loss 1.8259): 30%|β–ˆβ–ˆβ–ˆ | 284/938 [02:03<04:01, 2.71it/s] Training 1/1 epoch (loss 1.7686): 30%|β–ˆβ–ˆβ–ˆ | 284/938 [02:03<04:01, 2.71it/s] Training 1/1 epoch (loss 1.7686): 30%|β–ˆβ–ˆβ–ˆ | 285/938 [02:03<03:57, 2.74it/s] Training 1/1 epoch (loss 1.8979): 30%|β–ˆβ–ˆβ–ˆ | 285/938 [02:04<03:57, 2.74it/s] Training 1/1 epoch (loss 1.8979): 30%|β–ˆβ–ˆβ–ˆ | 286/938 [02:04<04:16, 2.54it/s] Training 1/1 epoch (loss 1.7733): 30%|β–ˆβ–ˆβ–ˆ | 286/938 [02:04<04:16, 2.54it/s] Training 1/1 epoch (loss 1.7733): 31%|β–ˆβ–ˆβ–ˆ | 287/938 [02:04<04:27, 2.43it/s] Training 1/1 epoch (loss 1.9544): 31%|β–ˆβ–ˆβ–ˆ | 287/938 [02:04<04:27, 2.43it/s] Training 1/1 epoch (loss 1.9544): 31%|β–ˆβ–ˆβ–ˆ | 288/938 [02:04<04:37, 2.35it/s] Training 1/1 epoch (loss 1.9884): 31%|β–ˆβ–ˆβ–ˆ | 288/938 [02:05<04:37, 2.35it/s] Training 1/1 epoch (loss 1.9884): 31%|β–ˆβ–ˆβ–ˆ | 289/938 [02:05<04:33, 2.37it/s] Training 1/1 epoch (loss 1.7486): 31%|β–ˆβ–ˆβ–ˆ | 289/938 [02:05<04:33, 2.37it/s] Training 1/1 epoch (loss 1.7486): 31%|β–ˆβ–ˆβ–ˆ | 290/938 [02:05<04:30, 2.39it/s] Training 1/1 epoch (loss 1.9147): 31%|β–ˆβ–ˆβ–ˆ | 290/938 [02:06<04:30, 2.39it/s] Training 1/1 epoch (loss 1.9147): 31%|β–ˆβ–ˆβ–ˆ | 291/938 [02:06<04:28, 2.41it/s] Training 1/1 epoch (loss 1.8176): 31%|β–ˆβ–ˆβ–ˆ | 291/938 [02:06<04:28, 2.41it/s] Training 1/1 epoch (loss 1.8176): 31%|β–ˆβ–ˆβ–ˆ | 292/938 [02:06<04:57, 2.17it/s] Training 1/1 epoch (loss 1.8493): 31%|β–ˆβ–ˆβ–ˆ | 292/938 [02:07<04:57, 2.17it/s] Training 1/1 epoch (loss 1.8493): 31%|β–ˆβ–ˆβ–ˆ | 293/938 [02:07<04:31, 2.38it/s] Training 1/1 epoch (loss 1.8101): 31%|β–ˆβ–ˆβ–ˆ | 293/938 [02:07<04:31, 2.38it/s] Training 1/1 epoch (loss 1.8101): 31%|β–ˆβ–ˆβ–ˆβ– | 294/938 [02:07<04:24, 2.44it/s] Training 1/1 epoch (loss 1.6357): 31%|β–ˆβ–ˆβ–ˆβ– | 294/938 [02:07<04:24, 2.44it/s] Training 1/1 epoch (loss 
1.6357): 31%|β–ˆβ–ˆβ–ˆβ– | 295/938 [02:07<04:07, 2.60it/s] Training 1/1 epoch (loss 1.8327): 31%|β–ˆβ–ˆβ–ˆβ– | 295/938 [02:08<04:07, 2.60it/s] Training 1/1 epoch (loss 1.8327): 32%|β–ˆβ–ˆβ–ˆβ– | 296/938 [02:08<04:16, 2.51it/s] Training 1/1 epoch (loss 1.8795): 32%|β–ˆβ–ˆβ–ˆβ– | 296/938 [02:08<04:16, 2.51it/s] Training 1/1 epoch (loss 1.8795): 32%|β–ˆβ–ˆβ–ˆβ– | 297/938 [02:08<04:20, 2.46it/s] Training 1/1 epoch (loss 1.8187): 32%|β–ˆβ–ˆβ–ˆβ– | 297/938 [02:08<04:20, 2.46it/s] Training 1/1 epoch (loss 1.8187): 32%|β–ˆβ–ˆβ–ˆβ– | 298/938 [02:08<04:09, 2.56it/s] Training 1/1 epoch (loss 1.8888): 32%|β–ˆβ–ˆβ–ˆβ– | 298/938 [02:09<04:09, 2.56it/s] Training 1/1 epoch (loss 1.8888): 32%|β–ˆβ–ˆβ–ˆβ– | 299/938 [02:09<04:26, 2.40it/s] Training 1/1 epoch (loss 1.8859): 32%|β–ˆβ–ˆβ–ˆβ– | 299/938 [02:09<04:26, 2.40it/s] Training 1/1 epoch (loss 1.8859): 32%|β–ˆβ–ˆβ–ˆβ– | 300/938 [02:09<04:15, 2.50it/s] Training 1/1 epoch (loss 1.9081): 32%|β–ˆβ–ˆβ–ˆβ– | 300/938 [02:10<04:15, 2.50it/s] Training 1/1 epoch (loss 1.9081): 32%|β–ˆβ–ˆβ–ˆβ– | 301/938 [02:10<04:09, 2.55it/s] Training 1/1 epoch (loss 1.9135): 32%|β–ˆβ–ˆβ–ˆβ– | 301/938 [02:10<04:09, 2.55it/s] Training 1/1 epoch (loss 1.9135): 32%|β–ˆβ–ˆβ–ˆβ– | 302/938 [02:10<04:21, 2.43it/s] Training 1/1 epoch (loss 1.7485): 32%|β–ˆβ–ˆβ–ˆβ– | 302/938 [02:11<04:21, 2.43it/s] Training 1/1 epoch (loss 1.7485): 32%|β–ˆβ–ˆβ–ˆβ– | 303/938 [02:11<04:13, 2.51it/s] Training 1/1 epoch (loss 2.0725): 32%|β–ˆβ–ˆβ–ˆβ– | 303/938 [02:11<04:13, 2.51it/s] Training 1/1 epoch (loss 2.0725): 32%|β–ˆβ–ˆβ–ˆβ– | 304/938 [02:11<04:13, 2.50it/s] Training 1/1 epoch (loss 1.7930): 32%|β–ˆβ–ˆβ–ˆβ– | 304/938 [02:11<04:13, 2.50it/s] Training 1/1 epoch (loss 1.7930): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 305/938 [02:11<04:09, 2.53it/s] Training 1/1 epoch (loss 1.8497): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 305/938 [02:12<04:09, 2.53it/s] Training 1/1 epoch (loss 1.8497): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 306/938 [02:12<04:17, 2.46it/s] Training 1/1 epoch (loss 1.8780): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 306/938 
[02:12<04:17, 2.46it/s] Training 1/1 epoch (loss 1.8780): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 307/938 [02:12<04:25, 2.38it/s] Training 1/1 epoch (loss 1.8553): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 307/938 [02:13<04:25, 2.38it/s] Training 1/1 epoch (loss 1.8553): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 308/938 [02:13<04:11, 2.51it/s] Training 1/1 epoch (loss 1.6953): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 308/938 [02:13<04:11, 2.51it/s] Training 1/1 epoch (loss 1.6953): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 309/938 [02:13<04:09, 2.52it/s] Training 1/1 epoch (loss 1.9860): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 309/938 [02:13<04:09, 2.52it/s] Training 1/1 epoch (loss 1.9860): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 310/938 [02:13<04:03, 2.58it/s] Training 1/1 epoch (loss 1.7154): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 310/938 [02:14<04:03, 2.58it/s] Training 1/1 epoch (loss 1.7154): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 311/938 [02:14<04:11, 2.50it/s] Training 1/1 epoch (loss 1.8017): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 311/938 [02:14<04:11, 2.50it/s] Training 1/1 epoch (loss 1.8017): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 312/938 [02:14<04:37, 2.26it/s] Training 1/1 epoch (loss 1.9533): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 312/938 [02:15<04:37, 2.26it/s] Training 1/1 epoch (loss 1.9533): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 313/938 [02:15<04:38, 2.24it/s] Training 1/1 epoch (loss 2.0157): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 313/938 [02:15<04:38, 2.24it/s] Training 1/1 epoch (loss 2.0157): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 314/938 [02:15<04:18, 2.41it/s] Training 1/1 epoch (loss 1.7254): 33%|β–ˆβ–ˆβ–ˆβ–Ž | 314/938 [02:15<04:18, 2.41it/s] Training 1/1 epoch (loss 1.7254): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 315/938 [02:15<04:06, 2.53it/s] Training 1/1 epoch (loss 1.8892): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 315/938 [02:16<04:06, 2.53it/s] Training 1/1 epoch (loss 1.8892): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 316/938 [02:16<04:11, 2.47it/s] Training 1/1 epoch (loss 1.8375): 34%|β–ˆβ–ˆβ–ˆβ–Ž | 316/938 [02:16<04:11, 2.47it/s] Training 1/1 epoch (loss 1.8375): 34%|β–ˆβ–ˆβ–ˆβ– | 317/938 [02:16<04:36, 2.24it/s] Training 1/1 epoch (loss 1.9025): 34%|β–ˆβ–ˆβ–ˆβ– | 317/938 [02:17<04:36, 2.24it/s] Training 1/1 epoch (loss 1.9025): 34%|β–ˆβ–ˆβ–ˆβ– | 318/938 [02:17<04:41, 2.20it/s] Training 1/1 epoch 
(loss 1.7319): 34%|β–ˆβ–ˆβ–ˆβ– | 318/938 [02:17<04:41, 2.20it/s] Training 1/1 epoch (loss 1.7319): 34%|β–ˆβ–ˆβ–ˆβ– | 319/938 [02:17<04:39, 2.22it/s] Training 1/1 epoch (loss 1.7164): 34%|β–ˆβ–ˆβ–ˆβ– | 319/938 [02:18<04:39, 2.22it/s] Training 1/1 epoch (loss 1.7164): 34%|β–ˆβ–ˆβ–ˆβ– | 320/938 [02:18<04:44, 2.17it/s] Training 1/1 epoch (loss 1.7835): 34%|β–ˆβ–ˆβ–ˆβ– | 320/938 [02:18<04:44, 2.17it/s] Training 1/1 epoch (loss 1.7835): 34%|β–ˆβ–ˆβ–ˆβ– | 321/938 [02:18<04:48, 2.14it/s] Training 1/1 epoch (loss 1.9044): 34%|β–ˆβ–ˆβ–ˆβ– | 321/938 [02:19<04:48, 2.14it/s] Training 1/1 epoch (loss 1.9044): 34%|β–ˆβ–ˆβ–ˆβ– | 322/938 [02:19<04:40, 2.19it/s] Training 1/1 epoch (loss 2.0008): 34%|β–ˆβ–ˆβ–ˆβ– | 322/938 [02:19<04:40, 2.19it/s] Training 1/1 epoch (loss 2.0008): 34%|β–ˆβ–ˆβ–ˆβ– | 323/938 [02:19<04:45, 2.16it/s] Training 1/1 epoch (loss 1.7910): 34%|β–ˆβ–ˆβ–ˆβ– | 323/938 [02:20<04:45, 2.16it/s] Training 1/1 epoch (loss 1.7910): 35%|β–ˆβ–ˆβ–ˆβ– | 324/938 [02:20<04:36, 2.22it/s] Training 1/1 epoch (loss 1.7461): 35%|β–ˆβ–ˆβ–ˆβ– | 324/938 [02:20<04:36, 2.22it/s] Training 1/1 epoch (loss 1.7461): 35%|β–ˆβ–ˆβ–ˆβ– | 325/938 [02:20<04:19, 2.36it/s] Training 1/1 epoch (loss 1.9733): 35%|β–ˆβ–ˆβ–ˆβ– | 325/938 [02:20<04:19, 2.36it/s] Training 1/1 epoch (loss 1.9733): 35%|β–ˆβ–ˆβ–ˆβ– | 326/938 [02:20<04:15, 2.40it/s] Training 1/1 epoch (loss 1.7810): 35%|β–ˆβ–ˆβ–ˆβ– | 326/938 [02:21<04:15, 2.40it/s] Training 1/1 epoch (loss 1.7810): 35%|β–ˆβ–ˆβ–ˆβ– | 327/938 [02:21<04:01, 2.53it/s] Training 1/1 epoch (loss 1.7772): 35%|β–ˆβ–ˆβ–ˆβ– | 327/938 [02:21<04:01, 2.53it/s] Training 1/1 epoch (loss 1.7772): 35%|β–ˆβ–ˆβ–ˆβ– | 328/938 [02:21<03:58, 2.56it/s] Training 1/1 epoch (loss 1.7311): 35%|β–ˆβ–ˆβ–ˆβ– | 328/938 [02:22<03:58, 2.56it/s] Training 1/1 epoch (loss 1.7311): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 329/938 [02:22<04:46, 2.13it/s] Training 1/1 epoch (loss 1.8820): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 329/938 [02:22<04:46, 2.13it/s] Training 1/1 epoch (loss 1.8820): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 330/938 
[02:22<04:35, 2.21it/s] Training 1/1 epoch (loss 1.8134): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 330/938 [02:23<04:35, 2.21it/s] Training 1/1 epoch (loss 1.8134): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 331/938 [02:23<04:55, 2.06it/s] Training 1/1 epoch (loss 1.9190): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 331/938 [02:23<04:55, 2.06it/s] Training 1/1 epoch (loss 1.9190): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 332/938 [02:23<04:29, 2.25it/s] Training 1/1 epoch (loss 1.8669): 35%|β–ˆβ–ˆβ–ˆβ–Œ | 332/938 [02:23<04:29, 2.25it/s] Training 1/1 epoch (loss 1.8669): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 333/938 [02:23<04:11, 2.41it/s] Training 1/1 epoch (loss 1.7788): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 333/938 [02:24<04:11, 2.41it/s] Training 1/1 epoch (loss 1.7788): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 334/938 [02:24<04:19, 2.33it/s] Training 1/1 epoch (loss 1.8716): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 334/938 [02:24<04:19, 2.33it/s] Training 1/1 epoch (loss 1.8716): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 335/938 [02:24<04:54, 2.05it/s] Training 1/1 epoch (loss 1.7446): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 335/938 [02:25<04:54, 2.05it/s] Training 1/1 epoch (loss 1.7446): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 336/938 [02:25<04:58, 2.02it/s] Training 1/1 epoch (loss 1.9250): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 336/938 [02:25<04:58, 2.02it/s] Training 1/1 epoch (loss 1.9250): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 337/938 [02:25<04:35, 2.18it/s] Training 1/1 epoch (loss 1.8895): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 337/938 [02:26<04:35, 2.18it/s] Training 1/1 epoch (loss 1.8895): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 338/938 [02:26<04:18, 2.32it/s] Training 1/1 epoch (loss 1.9291): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 338/938 [02:26<04:18, 2.32it/s] Training 1/1 epoch (loss 1.9291): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 339/938 [02:26<04:10, 2.39it/s] Training 1/1 epoch (loss 1.8130): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 339/938 [02:27<04:10, 2.39it/s] Training 1/1 epoch (loss 1.8130): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 340/938 [02:27<04:06, 2.43it/s] Training 1/1 epoch (loss 1.8379): 36%|β–ˆβ–ˆβ–ˆβ–Œ | 340/938 [02:27<04:06, 2.43it/s] Training 1/1 epoch (loss 1.8379): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 341/938 [02:27<04:16, 2.33it/s] Training 1/1 epoch (loss 1.7725): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 341/938 [02:27<04:16, 2.33it/s] Training 1/1 epoch 
(loss 1.7725): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 342/938 [02:27<03:58, 2.50it/s] Training 1/1 epoch (loss 1.9056): 36%|β–ˆβ–ˆβ–ˆβ–‹ | 342/938 [02:28<03:58, 2.50it/s] Training 1/1 epoch (loss 1.9056): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 343/938 [02:28<03:49, 2.59it/s] Training 1/1 epoch (loss 1.8576): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 343/938 [02:28<03:49, 2.59it/s] Training 1/1 epoch (loss 1.8576): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 344/938 [02:28<03:54, 2.53it/s] Training 1/1 epoch (loss 1.7557): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 344/938 [02:28<03:54, 2.53it/s] Training 1/1 epoch (loss 1.7557): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 345/938 [02:28<03:36, 2.74it/s] Training 1/1 epoch (loss 1.7457): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 345/938 [02:29<03:36, 2.74it/s] Training 1/1 epoch (loss 1.7457): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 346/938 [02:29<03:31, 2.80it/s] Training 1/1 epoch (loss 1.8075): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 346/938 [02:29<03:31, 2.80it/s] Training 1/1 epoch (loss 1.8075): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 347/938 [02:29<03:36, 2.73it/s] Training 1/1 epoch (loss 1.6833): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 347/938 [02:30<03:36, 2.73it/s] Training 1/1 epoch (loss 1.6833): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 348/938 [02:30<04:04, 2.41it/s] Training 1/1 epoch (loss 1.8114): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 348/938 [02:30<04:04, 2.41it/s] Training 1/1 epoch (loss 1.8114): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 349/938 [02:30<04:13, 2.32it/s] Training 1/1 epoch (loss 1.8146): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 349/938 [02:30<04:13, 2.32it/s] Training 1/1 epoch (loss 1.8146): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 350/938 [02:30<03:52, 2.53it/s] Training 1/1 epoch (loss 1.8923): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 350/938 [02:31<03:52, 2.53it/s] Training 1/1 epoch (loss 1.8923): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 351/938 [02:31<03:41, 2.65it/s] Training 1/1 epoch (loss 1.7870): 37%|β–ˆβ–ˆβ–ˆβ–‹ | 351/938 [02:31<03:41, 2.65it/s] Training 1/1 epoch (loss 1.7870): 38%|β–ˆβ–ˆβ–ˆβ–Š | 352/938 [02:31<03:37, 2.69it/s] Training 1/1 epoch (loss 1.9732): 38%|β–ˆβ–ˆβ–ˆβ–Š | 352/938 [02:31<03:37, 2.69it/s] Training 1/1 epoch (loss 1.9732): 38%|β–ˆβ–ˆβ–ˆβ–Š | 353/938 [02:31<03:28, 2.80it/s] Training 1/1 epoch (loss 1.8849): 38%|β–ˆβ–ˆβ–ˆβ–Š | 353/938 
[02:32<03:28, 2.80it/s] Training 1/1 epoch (loss 1.8849): 38%|β–ˆβ–ˆβ–ˆβ–Š | 354/938 [02:32<03:37, 2.69it/s] Training 1/1 epoch (loss 1.8610): 38%|β–ˆβ–ˆβ–ˆβ–Š | 354/938 [02:32<03:37, 2.69it/s] Training 1/1 epoch (loss 1.8610): 38%|β–ˆβ–ˆβ–ˆβ–Š | 355/938 [02:32<03:38, 2.67it/s] Training 1/1 epoch (loss 1.7996): 38%|β–ˆβ–ˆβ–ˆβ–Š | 355/938 [02:33<03:38, 2.67it/s] Training 1/1 epoch (loss 1.7996): 38%|β–ˆβ–ˆβ–ˆβ–Š | 356/938 [02:33<03:38, 2.66it/s] Training 1/1 epoch (loss 1.8451): 38%|β–ˆβ–ˆβ–ˆβ–Š | 356/938 [02:33<03:38, 2.66it/s] Training 1/1 epoch (loss 1.8451): 38%|β–ˆβ–ˆβ–ˆβ–Š | 357/938 [02:33<03:42, 2.61it/s] Training 1/1 epoch (loss 1.8120): 38%|β–ˆβ–ˆβ–ˆβ–Š | 357/938 [02:33<03:42, 2.61it/s] Training 1/1 epoch (loss 1.8120): 38%|β–ˆβ–ˆβ–ˆβ–Š | 358/938 [02:33<03:37, 2.67it/s] Training 1/1 epoch (loss 1.7558): 38%|β–ˆβ–ˆβ–ˆβ–Š | 358/938 [02:34<03:37, 2.67it/s] Training 1/1 epoch (loss 1.7558): 38%|β–ˆβ–ˆβ–ˆβ–Š | 359/938 [02:34<03:34, 2.70it/s] Training 1/1 epoch (loss 1.9788): 38%|β–ˆβ–ˆβ–ˆβ–Š | 359/938 [02:34<03:34, 2.70it/s] Training 1/1 epoch (loss 1.9788): 38%|β–ˆβ–ˆβ–ˆβ–Š | 360/938 [02:34<03:47, 2.54it/s] Training 1/1 epoch (loss 1.9723): 38%|β–ˆβ–ˆβ–ˆβ–Š | 360/938 [02:35<03:47, 2.54it/s] Training 1/1 epoch (loss 1.9723): 38%|β–ˆβ–ˆβ–ˆβ–Š | 361/938 [02:35<04:14, 2.26it/s] Training 1/1 epoch (loss 1.8289): 38%|β–ˆβ–ˆβ–ˆβ–Š | 361/938 [02:35<04:14, 2.26it/s] Training 1/1 epoch (loss 1.8289): 39%|β–ˆβ–ˆβ–ˆβ–Š | 362/938 [02:35<04:09, 2.31it/s] Training 1/1 epoch (loss 1.8563): 39%|β–ˆβ–ˆβ–ˆβ–Š | 362/938 [02:36<04:09, 2.31it/s] Training 1/1 epoch (loss 1.8563): 39%|β–ˆβ–ˆβ–ˆβ–Š | 363/938 [02:36<04:05, 2.34it/s] Training 1/1 epoch (loss 1.9092): 39%|β–ˆβ–ˆβ–ˆβ–Š | 363/938 [02:36<04:05, 2.34it/s] Training 1/1 epoch (loss 1.9092): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 364/938 [02:36<04:06, 2.33it/s] Training 1/1 epoch (loss 1.8469): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 364/938 [02:36<04:06, 2.33it/s] Training 1/1 epoch (loss 1.8469): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 365/938 [02:36<03:59, 2.40it/s] Training 1/1 epoch 
(loss 1.8219): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 365/938 [02:37<03:59, 2.40it/s] Training 1/1 epoch (loss 1.8219): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 366/938 [02:37<03:58, 2.40it/s] Training 1/1 epoch (loss 1.8121): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 366/938 [02:37<03:58, 2.40it/s] Training 1/1 epoch (loss 1.8121): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 367/938 [02:37<04:01, 2.37it/s] Training 1/1 epoch (loss 1.9527): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 367/938 [02:38<04:01, 2.37it/s] Training 1/1 epoch (loss 1.9527): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 368/938 [02:38<03:56, 2.41it/s] Training 1/1 epoch (loss 1.8069): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 368/938 [02:38<03:56, 2.41it/s] Training 1/1 epoch (loss 1.8069): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 369/938 [02:38<03:50, 2.47it/s] Training 1/1 epoch (loss 1.8528): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 369/938 [02:38<03:50, 2.47it/s] Training 1/1 epoch (loss 1.8528): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 370/938 [02:38<03:39, 2.59it/s] Training 1/1 epoch (loss 1.8277): 39%|β–ˆβ–ˆβ–ˆβ–‰ | 370/938 [02:39<03:39, 2.59it/s] Training 1/1 epoch (loss 1.8277): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 371/938 [02:39<03:36, 2.62it/s] Training 1/1 epoch (loss 1.8297): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 371/938 [02:39<03:36, 2.62it/s] Training 1/1 epoch (loss 1.8297): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 372/938 [02:39<03:29, 2.70it/s] Training 1/1 epoch (loss 1.7536): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 372/938 [02:39<03:29, 2.70it/s] Training 1/1 epoch (loss 1.7536): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 373/938 [02:39<03:24, 2.76it/s] Training 1/1 epoch (loss 1.6981): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 373/938 [02:40<03:24, 2.76it/s] Training 1/1 epoch (loss 1.6981): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 374/938 [02:40<03:31, 2.67it/s] Training 1/1 epoch (loss 1.9368): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 374/938 [02:40<03:31, 2.67it/s] Training 1/1 epoch (loss 1.9368): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 375/938 [02:40<03:28, 2.71it/s] Training 1/1 epoch (loss 1.8502): 40%|β–ˆβ–ˆβ–ˆβ–‰ | 375/938 [02:41<03:28, 2.71it/s] Training 1/1 epoch (loss 1.8502): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 376/938 [02:41<03:25, 2.73it/s] Training 1/1 epoch (loss 1.9715): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 376/938 [02:41<03:25, 2.73it/s] Training 1/1 epoch (loss 1.9715): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 377/938 
[02:41<03:41, 2.53it/s] Training 1/1 epoch (loss 1.7393): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 377/938 [02:41<03:41, 2.53it/s] Training 1/1 epoch (loss 1.7393): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 378/938 [02:41<03:37, 2.57it/s] Training 1/1 epoch (loss 1.8049): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 378/938 [02:42<03:37, 2.57it/s] Training 1/1 epoch (loss 1.8049): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 379/938 [02:42<03:35, 2.60it/s] Training 1/1 epoch (loss 1.6851): 40%|β–ˆβ–ˆβ–ˆβ–ˆ | 379/938 [02:42<03:35, 2.60it/s] Training 1/1 epoch (loss 1.6851): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 380/938 [02:42<03:38, 2.55it/s] Training 1/1 epoch (loss 1.8253): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 380/938 [02:42<03:38, 2.55it/s] Training 1/1 epoch (loss 1.8253): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 381/938 [02:42<03:31, 2.64it/s] Training 1/1 epoch (loss 1.7411): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 381/938 [02:43<03:31, 2.64it/s] Training 1/1 epoch (loss 1.7411): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 382/938 [02:43<03:40, 2.52it/s] Training 1/1 epoch (loss 1.8821): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 382/938 [02:43<03:40, 2.52it/s] Training 1/1 epoch (loss 1.8821): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 383/938 [02:43<03:34, 2.59it/s] Training 1/1 epoch (loss 1.9353): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 383/938 [02:44<03:34, 2.59it/s] Training 1/1 epoch (loss 1.9353): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 384/938 [02:44<03:32, 2.61it/s] Training 1/1 epoch (loss 1.7933): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 384/938 [02:44<03:32, 2.61it/s] Training 1/1 epoch (loss 1.7933): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 385/938 [02:44<03:33, 2.59it/s] Training 1/1 epoch (loss 1.8800): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 385/938 [02:45<03:33, 2.59it/s] Training 1/1 epoch (loss 1.8800): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 386/938 [02:45<03:41, 2.50it/s] Training 1/1 epoch (loss 1.8722): 41%|β–ˆβ–ˆβ–ˆβ–ˆ | 386/938 [02:45<03:41, 2.50it/s] Training 1/1 epoch (loss 1.8722): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 387/938 [02:45<03:37, 2.53it/s] Training 1/1 epoch (loss 1.8756): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 387/938 [02:45<03:37, 2.53it/s] Training 1/1 epoch (loss 1.8756): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 388/938 [02:45<03:42, 2.47it/s] Training 1/1 epoch (loss 1.9805): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 388/938 [02:46<03:42, 2.47it/s] Training 
1/1 epoch (loss 1.9805): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 389/938 [02:46<03:39, 2.50it/s] Training 1/1 epoch (loss 1.8444): 41%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 389/938 [02:46<03:39, 2.50it/s] Training 1/1 epoch (loss 1.8444): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 390/938 [02:46<03:39, 2.49it/s] Training 1/1 epoch (loss 1.8135): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 390/938 [02:46<03:39, 2.49it/s] Training 1/1 epoch (loss 1.8135): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 391/938 [02:46<03:31, 2.59it/s] Training 1/1 epoch (loss 1.9313): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 391/938 [02:47<03:31, 2.59it/s] Training 1/1 epoch (loss 1.9313): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 392/938 [02:47<03:32, 2.57it/s] Training 1/1 epoch (loss 1.8391): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 392/938 [02:47<03:32, 2.57it/s] Training 1/1 epoch (loss 1.8391): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 393/938 [02:47<03:29, 2.60it/s] Training 1/1 epoch (loss 1.9843): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 393/938 [02:48<03:29, 2.60it/s] Training 1/1 epoch (loss 1.9843): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 394/938 [02:48<03:31, 2.57it/s] Training 1/1 epoch (loss 1.7801): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 394/938 [02:48<03:31, 2.57it/s] Training 1/1 epoch (loss 1.7801): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 395/938 [02:48<03:30, 2.58it/s] Training 1/1 epoch (loss 1.8070): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 395/938 [02:48<03:30, 2.58it/s] Training 1/1 epoch (loss 1.8070): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 396/938 [02:48<03:26, 2.62it/s] Training 1/1 epoch (loss 1.7809): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 396/938 [02:49<03:26, 2.62it/s] Training 1/1 epoch (loss 1.7809): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 397/938 [02:49<03:25, 2.64it/s] Training 1/1 epoch (loss 1.7641): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 397/938 [02:49<03:25, 2.64it/s] Training 1/1 epoch (loss 1.7641): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 398/938 [02:49<03:21, 2.68it/s] Training 1/1 epoch (loss 1.8202): 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 398/938 [02:49<03:21, 2.68it/s] Training 1/1 epoch (loss 1.8202): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 399/938 [02:49<03:22, 2.66it/s] Training 1/1 epoch (loss 1.8977): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 399/938 [02:50<03:22, 2.66it/s] Training 1/1 epoch (loss 1.8977): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 400/938 
[02:50<03:31, 2.55it/s] Training 1/1 epoch (loss 1.9199): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 400/938 [02:50<03:31, 2.55it/s] Training 1/1 epoch (loss 1.9199): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 401/938 [02:50<03:28, 2.58it/s] Training 1/1 epoch (loss 1.8693): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 401/938 [02:51<03:28, 2.58it/s] Training 1/1 epoch (loss 1.8693): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 402/938 [02:51<03:21, 2.66it/s] Training 1/1 epoch (loss 1.8355): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 402/938 [02:51<03:21, 2.66it/s] Training 1/1 epoch (loss 1.8355): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 403/938 [02:51<03:36, 2.47it/s] Training 1/1 epoch (loss 1.8971): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 403/938 [02:51<03:36, 2.47it/s] Training 1/1 epoch (loss 1.8971): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 404/938 [02:51<03:29, 2.55it/s] Training 1/1 epoch (loss 1.7280): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 404/938 [02:52<03:29, 2.55it/s] Training 1/1 epoch (loss 1.7280): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 405/938 [02:52<03:30, 2.54it/s] Training 1/1 epoch (loss 1.9815): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 405/938 [02:52<03:30, 2.54it/s] Training 1/1 epoch (loss 1.9815): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 406/938 [02:52<03:27, 2.56it/s] Training 1/1 epoch (loss 1.8968): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 406/938 [02:53<03:27, 2.56it/s] Training 1/1 epoch (loss 1.8968): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 407/938 [02:53<03:29, 2.53it/s] Training 1/1 epoch (loss 1.8169): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 407/938 [02:53<03:29, 2.53it/s] Training 1/1 epoch (loss 1.8169): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 408/938 [02:53<03:31, 2.51it/s] Training 1/1 epoch (loss 1.8098): 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 408/938 [02:53<03:31, 2.51it/s] Training 1/1 epoch (loss 1.8098): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 409/938 [02:53<03:24, 2.59it/s] Training 1/1 epoch (loss 1.8128): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 409/938 [02:54<03:24, 2.59it/s] Training 1/1 epoch (loss 1.8128): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 410/938 [02:54<03:43, 2.37it/s] Training 1/1 epoch (loss 1.6971): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 410/938 [02:55<03:43, 2.37it/s] Training 1/1 epoch (loss 1.6971): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 411/938 [02:55<04:05, 2.14it/s] Training 1/1 epoch (loss 1.7716): 
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 411/938 [02:55<04:05, 2.14it/s] Training 1/1 epoch (loss 1.7716): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 412/938 [02:55<03:57, 2.22it/s] Training 1/1 epoch (loss 1.7657): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 412/938 [02:55<03:57, 2.22it/s] Training 1/1 epoch (loss 1.7657): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 413/938 [02:55<03:50, 2.27it/s] Training 1/1 epoch (loss 1.8410): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 413/938 [02:56<03:50, 2.27it/s] Training 1/1 epoch (loss 1.8410): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 414/938 [02:56<03:42, 2.36it/s] Training 1/1 epoch (loss 1.7624): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 414/938 [02:56<03:42, 2.36it/s] Training 1/1 epoch (loss 1.7624): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 415/938 [02:56<03:46, 2.31it/s] Training 1/1 epoch (loss 1.7489): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 415/938 [02:57<03:46, 2.31it/s] Training 1/1 epoch (loss 1.7489): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 416/938 [02:57<03:36, 2.41it/s] Training 1/1 epoch (loss 1.6519): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 416/938 [02:57<03:36, 2.41it/s] Training 1/1 epoch (loss 1.6519): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 417/938 [02:57<03:32, 2.45it/s] Training 1/1 epoch (loss 1.7160): 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 417/938 [02:57<03:32, 2.45it/s] Training 1/1 epoch (loss 1.7160): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 418/938 [02:57<03:28, 2.50it/s] Training 1/1 epoch (loss 1.8072): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 418/938 [02:58<03:28, 2.50it/s] Training 1/1 epoch (loss 1.8072): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 419/938 [02:58<03:35, 2.41it/s] Training 1/1 epoch (loss 1.9290): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 419/938 [02:58<03:35, 2.41it/s] Training 1/1 epoch (loss 1.9290): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 420/938 [02:58<03:36, 2.40it/s] Training 1/1 epoch (loss 1.8354): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 420/938 [02:59<03:36, 2.40it/s] Training 1/1 epoch (loss 1.8354): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 421/938 [02:59<03:21, 2.57it/s] Training 1/1 epoch (loss 1.7256): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 421/938 [02:59<03:21, 2.57it/s] Training 1/1 epoch (loss 1.7256): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 422/938 [02:59<03:15, 2.64it/s] Training 1/1 epoch (loss 1.7354): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 422/938 [02:59<03:15, 2.64it/s] Training 
1/1 epoch (loss 1.7354): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 423/938 [02:59<03:27, 2.48it/s] Training 1/1 epoch (loss 1.7111): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 423/938 [03:00<03:27, 2.48it/s] Training 1/1 epoch (loss 1.7111): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 424/938 [03:00<03:33, 2.41it/s] Training 1/1 epoch (loss 1.8267): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 424/938 [03:00<03:33, 2.41it/s] Training 1/1 epoch (loss 1.8267): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 425/938 [03:00<03:29, 2.45it/s] Training 1/1 epoch (loss 1.8730): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 425/938 [03:00<03:29, 2.45it/s] Training 1/1 epoch (loss 1.8730): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 426/938 [03:00<03:14, 2.63it/s] Training 1/1 epoch (loss 1.8654): 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 426/938 [03:01<03:14, 2.63it/s] Training 1/1 epoch (loss 1.8654): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 427/938 [03:01<03:17, 2.59it/s] Training 1/1 epoch (loss 1.7727): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 427/938 [03:01<03:17, 2.59it/s] Training 1/1 epoch (loss 1.7727): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 428/938 [03:01<03:08, 2.70it/s] Training 1/1 epoch (loss 1.7646): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 428/938 [03:02<03:08, 2.70it/s] Training 1/1 epoch (loss 1.7646): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 429/938 [03:02<03:12, 2.65it/s] Training 1/1 epoch (loss 1.8755): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 429/938 [03:02<03:12, 2.65it/s] Training 1/1 epoch (loss 1.8755): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 430/938 [03:02<03:11, 2.66it/s] Training 1/1 epoch (loss 1.8081): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 430/938 [03:02<03:11, 2.66it/s] Training 1/1 epoch (loss 1.8081): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 431/938 [03:02<03:13, 2.62it/s] Training 1/1 epoch (loss 1.8453): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 431/938 [03:03<03:13, 2.62it/s] Training 1/1 epoch (loss 1.8453): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 432/938 [03:03<03:08, 2.69it/s] Training 1/1 epoch (loss 1.8649): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 432/938 [03:03<03:08, 2.69it/s] Training 1/1 epoch (loss 1.8649): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 433/938 [03:03<03:07, 2.69it/s] Training 1/1 epoch (loss 1.9276): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 433/938 [03:04<03:07, 2.69it/s] Training 1/1 epoch (loss 1.9276): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 434/938 
[03:04<03:12, 2.62it/s] Training 1/1 epoch (loss 1.8174): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 434/938 [03:04<03:12, 2.62it/s] Training 1/1 epoch (loss 1.8174): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 435/938 [03:04<03:18, 2.53it/s] Training 1/1 epoch (loss 1.8754): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 435/938 [03:04<03:18, 2.53it/s] Training 1/1 epoch (loss 1.8754): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 436/938 [03:04<03:30, 2.38it/s] Training 1/1 epoch (loss 1.8506): 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 436/938 [03:05<03:30, 2.38it/s] Training 1/1 epoch (loss 1.8506): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 437/938 [03:05<03:28, 2.41it/s] Training 1/1 epoch (loss 1.7153): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 437/938 [03:05<03:28, 2.41it/s] Training 1/1 epoch (loss 1.7153): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 438/938 [03:05<03:19, 2.50it/s] Training 1/1 epoch (loss 1.8601): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 438/938 [03:06<03:19, 2.50it/s] Training 1/1 epoch (loss 1.8601): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 439/938 [03:06<03:21, 2.48it/s] Training 1/1 epoch (loss 1.8121): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 439/938 [03:06<03:21, 2.48it/s] Training 1/1 epoch (loss 1.8121): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 440/938 [03:06<03:21, 2.48it/s] Training 1/1 epoch (loss 1.8192): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 440/938 [03:06<03:21, 2.48it/s] Training 1/1 epoch (loss 1.8192): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 441/938 [03:06<03:15, 2.54it/s] Training 1/1 epoch (loss 1.7236): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 441/938 [03:07<03:15, 2.54it/s] Training 1/1 epoch (loss 1.7236): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 442/938 [03:07<03:12, 2.58it/s] Training 1/1 epoch (loss 1.8125): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 442/938 [03:07<03:12, 2.58it/s] Training 1/1 epoch (loss 1.8125): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 443/938 [03:07<03:09, 2.62it/s] Training 1/1 epoch (loss 1.9016): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 443/938 [03:07<03:09, 2.62it/s] Training 1/1 epoch (loss 1.9016): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 444/938 [03:07<03:00, 2.74it/s] Training 1/1 epoch (loss 1.5927): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 444/938 [03:08<03:00, 2.74it/s] Training 1/1 epoch (loss 1.5927): 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 445/938 [03:08<03:30, 2.34it/s] Training 1/1 epoch (loss 1.7754): 
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 445/938 [03:08<03:30, 2.34it/s] Training 1/1 epoch (loss 1.7754): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 446/938 [03:08<03:19, 2.47it/s] Training 1/1 epoch (loss 1.8354): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 446/938 [03:09<03:19, 2.47it/s] Training 1/1 epoch (loss 1.8354): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 447/938 [03:09<03:07, 2.62it/s] Training 1/1 epoch (loss 1.7829): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 447/938 [03:09<03:07, 2.62it/s] Training 1/1 epoch (loss 1.7829): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 448/938 [03:09<03:09, 2.58it/s] Training 1/1 epoch (loss 1.9821): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 448/938 [03:09<03:09, 2.58it/s] Training 1/1 epoch (loss 1.9821): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 449/938 [03:09<03:04, 2.66it/s] Training 1/1 epoch (loss 1.6366): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 449/938 [03:10<03:04, 2.66it/s] Training 1/1 epoch (loss 1.6366): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 450/938 [03:10<03:12, 2.53it/s] Training 1/1 epoch (loss 1.7153): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 450/938 [03:10<03:12, 2.53it/s] Training 1/1 epoch (loss 1.7153): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 451/938 [03:10<03:15, 2.49it/s] Training 1/1 epoch (loss 1.7992): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 451/938 [03:11<03:15, 2.49it/s] Training 1/1 epoch (loss 1.7992): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 452/938 [03:11<03:04, 2.64it/s] Training 1/1 epoch (loss 1.8171): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 452/938 [03:11<03:04, 2.64it/s] Training 1/1 epoch (loss 1.8171): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 453/938 [03:11<03:09, 2.56it/s] Training 1/1 epoch (loss 1.7980): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 453/938 [03:11<03:09, 2.56it/s] Training 1/1 epoch (loss 1.7980): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 454/938 [03:11<03:09, 2.55it/s] Training 1/1 epoch (loss 1.8311): 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 454/938 [03:12<03:09, 2.55it/s] Training 1/1 epoch (loss 1.8311): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 455/938 [03:12<03:02, 2.65it/s] Training 1/1 epoch (loss 1.8029): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 455/938 [03:12<03:02, 2.65it/s] Training 1/1 epoch (loss 1.8029): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 456/938 [03:12<03:33, 2.26it/s] Training 1/1 epoch (loss 1.9395): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 456/938 [03:13<03:33, 2.26it/s] Training 
1/1 epoch (loss 1.9395): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 457/938 [03:13<03:42, 2.16it/s] Training 1/1 epoch (loss 1.9020): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 457/938 [03:13<03:42, 2.16it/s] Training 1/1 epoch (loss 1.9020): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 458/938 [03:13<03:31, 2.27it/s] Training 1/1 epoch (loss 1.9285): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 458/938 [03:14<03:31, 2.27it/s] Training 1/1 epoch (loss 1.9285): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 459/938 [03:14<03:26, 2.32it/s] Training 1/1 epoch (loss 1.8498): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 459/938 [03:14<03:26, 2.32it/s] Training 1/1 epoch (loss 1.8498): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 460/938 [03:14<03:29, 2.28it/s] Training 1/1 epoch (loss 1.7821): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 460/938 [03:15<03:29, 2.28it/s] Training 1/1 epoch (loss 1.7821): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 461/938 [03:15<03:36, 2.20it/s] Training 1/1 epoch (loss 1.7817): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 461/938 [03:15<03:36, 2.20it/s] Training 1/1 epoch (loss 1.7817): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 462/938 [03:15<03:31, 2.25it/s] Training 1/1 epoch (loss 1.7059): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 462/938 [03:15<03:31, 2.25it/s] Training 1/1 epoch (loss 1.7059): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 463/938 [03:15<03:23, 2.33it/s] Training 1/1 epoch (loss 1.8906): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 463/938 [03:16<03:23, 2.33it/s] Training 1/1 epoch (loss 1.8906): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 464/938 [03:16<03:20, 2.36it/s] Training 1/1 epoch (loss 1.7927): 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 464/938 [03:16<03:20, 2.36it/s] Training 1/1 epoch (loss 1.7927): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 465/938 [03:16<03:22, 2.34it/s] Training 1/1 epoch (loss 1.9009): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 465/938 [03:17<03:22, 2.34it/s] Training 1/1 epoch (loss 1.9009): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 466/938 [03:17<03:18, 2.38it/s] Training 1/1 epoch (loss 1.9224): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 466/938 [03:17<03:18, 2.38it/s] Training 1/1 epoch (loss 1.9224): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 467/938 [03:17<03:10, 2.48it/s] Training 1/1 epoch (loss 1.9557): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 467/938 [03:17<03:10, 2.48it/s] Training 1/1 epoch (loss 1.9557): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 468/938 
[03:17<03:12, 2.44it/s] Training 1/1 epoch (loss 1.9025): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 468/938 [03:18<03:12, 2.44it/s] Training 1/1 epoch (loss 1.9025): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 469/938 [03:18<03:14, 2.41it/s] Training 1/1 epoch (loss 1.6247): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 469/938 [03:18<03:14, 2.41it/s] Training 1/1 epoch (loss 1.6247): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 470/938 [03:18<03:25, 2.27it/s] Training 1/1 epoch (loss 1.8405): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 470/938 [03:19<03:25, 2.27it/s] Training 1/1 epoch (loss 1.8405): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 471/938 [03:19<03:15, 2.39it/s] Training 1/1 epoch (loss 1.8595): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 471/938 [03:19<03:15, 2.39it/s] Training 1/1 epoch (loss 1.8595): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 472/938 [03:19<03:06, 2.50it/s] Training 1/1 epoch (loss 1.7193): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 472/938 [03:20<03:06, 2.50it/s] Training 1/1 epoch (loss 1.7193): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 473/938 [03:20<03:08, 2.46it/s] Training 1/1 epoch (loss 1.8568): 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 473/938 [03:20<03:08, 2.46it/s] Training 1/1 epoch (loss 1.8568): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 474/938 [03:20<03:06, 2.49it/s] Training 1/1 epoch (loss 1.7531): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 474/938 [03:20<03:06, 2.49it/s] Training 1/1 epoch (loss 1.7531): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 475/938 [03:20<03:13, 2.40it/s] Training 1/1 epoch (loss 1.7691): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 475/938 [03:21<03:13, 2.40it/s] Training 1/1 epoch (loss 1.7691): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 476/938 [03:21<03:07, 2.47it/s] Training 1/1 epoch (loss 1.7789): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 476/938 [03:21<03:07, 2.47it/s] Training 1/1 epoch (loss 1.7789): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 477/938 [03:21<03:00, 2.55it/s] Training 1/1 epoch (loss 1.7330): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 477/938 [03:21<03:00, 2.55it/s] Training 1/1 epoch (loss 1.7330): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 478/938 [03:21<02:56, 2.61it/s] Training 1/1 epoch (loss 1.8653): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 478/938 [03:22<02:56, 2.61it/s] Training 1/1 epoch (loss 1.8653): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 479/938 [03:22<02:59, 2.56it/s] Training 1/1 epoch (loss 1.7865): 
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 479/938 [03:22<02:59, 2.56it/s] Training 1/1 epoch (loss 1.7865): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 480/938 [03:22<03:01, 2.52it/s] Training 1/1 epoch (loss 1.8590): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 480/938 [03:23<03:01, 2.52it/s] Training 1/1 epoch (loss 1.8590): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 481/938 [03:23<03:02, 2.51it/s] Training 1/1 epoch (loss 1.8127): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 481/938 [03:23<03:02, 2.51it/s] Training 1/1 epoch (loss 1.8127): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 482/938 [03:23<03:01, 2.51it/s] Training 1/1 epoch (loss 1.7818): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 482/938 [03:23<03:01, 2.51it/s] Training 1/1 epoch (loss 1.7818): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 483/938 [03:23<02:52, 2.64it/s] Training 1/1 epoch (loss 2.0040): 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 483/938 [03:24<02:52, 2.64it/s] Training 1/1 epoch (loss 2.0040): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 484/938 [03:24<03:07, 2.43it/s] Training 1/1 epoch (loss 1.9618): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 484/938 [03:25<03:07, 2.43it/s] Training 1/1 epoch (loss 1.9618): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 485/938 [03:25<03:39, 2.07it/s] Training 1/1 epoch (loss 1.8149): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 485/938 [03:25<03:39, 2.07it/s] Training 1/1 epoch (loss 1.8149): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 486/938 [03:25<03:22, 2.23it/s] Training 1/1 epoch (loss 1.7989): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 486/938 [03:25<03:22, 2.23it/s] Training 1/1 epoch (loss 1.7989): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 487/938 [03:25<03:08, 2.39it/s] Training 1/1 epoch (loss 1.7737): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 487/938 [03:26<03:08, 2.39it/s] Training 1/1 epoch (loss 1.7737): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 488/938 [03:26<03:04, 2.44it/s] Training 1/1 epoch (loss 1.8220): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 488/938 [03:26<03:04, 2.44it/s] Training 1/1 epoch (loss 1.8220): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 489/938 [03:26<03:08, 2.39it/s] Training 1/1 epoch (loss 1.8980): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 489/938 [03:27<03:08, 2.39it/s] Training 1/1 epoch (loss 1.8980): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 490/938 [03:27<03:14, 2.31it/s] Training 1/1 epoch (loss 1.9144): 
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 490/938 [03:27<03:14, 2.31it/s] Training 1/1 epoch (loss 1.9144): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 491/938 [03:27<03:15, 2.29it/s] Training 1/1 epoch (loss 1.9669): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 491/938 [03:27<03:15, 2.29it/s] Training 1/1 epoch (loss 1.9669): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 492/938 [03:27<03:02, 2.45it/s] Training 1/1 epoch (loss 1.8028): 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 492/938 [03:28<03:02, 2.45it/s] Training 1/1 epoch (loss 1.8028): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 493/938 [03:28<02:55, 2.54it/s] Training 1/1 epoch (loss 1.8117): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 493/938 [03:28<02:55, 2.54it/s] Training 1/1 epoch (loss 1.8117): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 494/938 [03:28<02:57, 2.50it/s] Training 1/1 epoch (loss 1.8527): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 494/938 [03:29<02:57, 2.50it/s] Training 1/1 epoch (loss 1.8527): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 495/938 [03:29<03:00, 2.46it/s] Training 1/1 epoch (loss 1.6713): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 495/938 [03:29<03:00, 2.46it/s] Training 1/1 epoch (loss 1.6713): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 496/938 [03:29<02:58, 2.48it/s] Training 1/1 epoch (loss 1.8962): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 496/938 [03:29<02:58, 2.48it/s] Training 1/1 epoch (loss 1.8962): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 497/938 [03:29<03:05, 2.38it/s] Training 1/1 epoch (loss 1.7594): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 497/938 [03:30<03:05, 2.38it/s] Training 1/1 epoch (loss 1.7594): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 498/938 [03:30<02:53, 2.53it/s] Training 1/1 epoch (loss 1.7219): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 498/938 [03:30<02:53, 2.53it/s] Training 1/1 epoch (loss 1.7219): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 499/938 [03:30<02:52, 2.55it/s] Training 1/1 epoch (loss 1.7358): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 499/938 [03:31<02:52, 2.55it/s] Training 1/1 epoch (loss 1.7358): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 500/938 [03:31<02:47, 2.61it/s] Training 1/1 epoch (loss 1.8776): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 500/938 [03:31<02:47, 2.61it/s] Training 1/1 epoch (loss 1.8776): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 501/938 [03:31<02:45, 2.65it/s] Training 1/1 epoch (loss 
1.9080): 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 501/938 [03:31<02:45, 2.65it/s] Training 1/1 epoch (loss 1.9080): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 502/938 [03:31<02:41, 2.71it/s] Training 1/1 epoch (loss 1.8147): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 502/938 [03:32<02:41, 2.71it/s] Training 1/1 epoch (loss 1.8147): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 503/938 [03:32<02:39, 2.73it/s] Training 1/1 epoch (loss 1.7070): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 503/938 [03:32<02:39, 2.73it/s] Training 1/1 epoch (loss 1.7070): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 504/938 [03:32<02:40, 2.71it/s] Training 1/1 epoch (loss 1.8174): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 504/938 [03:32<02:40, 2.71it/s] Training 1/1 epoch (loss 1.8174): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 505/938 [03:32<02:48, 2.57it/s] Training 1/1 epoch (loss 1.7841): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 505/938 [03:33<02:48, 2.57it/s] Training 1/1 epoch (loss 1.7841): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 506/938 [03:33<02:47, 2.58it/s] Training 1/1 epoch (loss 1.8641): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 506/938 [03:33<02:47, 2.58it/s] Training 1/1 epoch (loss 1.8641): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 507/938 [03:33<02:46, 2.58it/s] Training 1/1 epoch (loss 1.8152): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 507/938 [03:34<02:46, 2.58it/s] Training 1/1 epoch (loss 1.8152): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 508/938 [03:34<02:49, 2.53it/s] Training 1/1 epoch (loss 1.8179): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 508/938 [03:34<02:49, 2.53it/s] Training 1/1 epoch (loss 1.8179): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 509/938 [03:34<02:45, 2.59it/s] Training 1/1 epoch (loss 1.8715): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 509/938 [03:34<02:45, 2.59it/s] Training 1/1 epoch (loss 1.8715): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 510/938 [03:34<02:55, 2.44it/s] Training 1/1 epoch (loss 2.0180): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 510/938 [03:35<02:55, 2.44it/s] Training 1/1 epoch (loss 2.0180): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 511/938 [03:35<03:31, 2.02it/s] Training 1/1 epoch (loss 1.7367): 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 511/938 [03:35<03:31, 2.02it/s] Training 1/1 epoch (loss 1.7367): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 512/938 [03:35<03:15, 2.18it/s] Training 1/1 epoch 
(loss 1.8381): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 512/938 [03:36<03:15, 2.18it/s] Training 1/1 epoch (loss 1.8381): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 513/938 [03:36<03:33, 1.99it/s] Training 1/1 epoch (loss 1.8567): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 513/938 [03:36<03:33, 1.99it/s] Training 1/1 epoch (loss 1.8567): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 514/938 [03:36<03:19, 2.12it/s] Training 1/1 epoch (loss 1.7653): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 514/938 [03:37<03:19, 2.12it/s] Training 1/1 epoch (loss 1.7653): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 515/938 [03:37<03:11, 2.20it/s] Training 1/1 epoch (loss 1.8342): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 515/938 [03:37<03:11, 2.20it/s] Training 1/1 epoch (loss 1.8342): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 516/938 [03:37<03:02, 2.31it/s] Training 1/1 epoch (loss 1.8910): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 516/938 [03:38<03:02, 2.31it/s] Training 1/1 epoch (loss 1.8910): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 517/938 [03:38<02:52, 2.44it/s] Training 1/1 epoch (loss 1.7969): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 517/938 [03:38<02:52, 2.44it/s] Training 1/1 epoch (loss 1.7969): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 518/938 [03:38<02:47, 2.51it/s] Training 1/1 epoch (loss 1.8622): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 518/938 [03:38<02:47, 2.51it/s] Training 1/1 epoch (loss 1.8622): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 519/938 [03:38<02:54, 2.41it/s] Training 1/1 epoch (loss 1.9056): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 519/938 [03:39<02:54, 2.41it/s] Training 1/1 epoch (loss 1.9056): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 520/938 [03:39<03:11, 2.18it/s] Training 1/1 epoch (loss 1.7265): 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 520/938 [03:39<03:11, 2.18it/s] Training 1/1 epoch (loss 1.7265): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 521/938 [03:39<02:57, 2.35it/s] Training 1/1 epoch (loss 1.7753): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 521/938 [03:40<02:57, 2.35it/s] Training 1/1 epoch (loss 1.7753): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 522/938 [03:40<02:52, 2.41it/s] Training 1/1 epoch (loss 1.7832): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 522/938 [03:40<02:52, 2.41it/s] Training 1/1 epoch (loss 1.7832): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 523/938 [03:40<02:51, 2.42it/s] Training 1/1 
epoch (loss 1.7025): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 523/938 [03:41<02:51, 2.42it/s] Training 1/1 epoch (loss 1.7025): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 524/938 [03:41<02:47, 2.48it/s] Training 1/1 epoch (loss 1.7619): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 524/938 [03:41<02:47, 2.48it/s] Training 1/1 epoch (loss 1.7619): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 525/938 [03:41<02:45, 2.49it/s] Training 1/1 epoch (loss 1.8702): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 525/938 [03:41<02:45, 2.49it/s] Training 1/1 epoch (loss 1.8702): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 526/938 [03:41<02:37, 2.62it/s] Training 1/1 epoch (loss 1.7951): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 526/938 [03:42<02:37, 2.62it/s] Training 1/1 epoch (loss 1.7951): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 527/938 [03:42<02:35, 2.64it/s] Training 1/1 epoch (loss 1.8519): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 527/938 [03:42<02:35, 2.64it/s] Training 1/1 epoch (loss 1.8519): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 528/938 [03:42<02:35, 2.64it/s] Training 1/1 epoch (loss 1.7298): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 528/938 [03:43<02:35, 2.64it/s] Training 1/1 epoch (loss 1.7298): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 529/938 [03:43<02:44, 2.48it/s] Training 1/1 epoch (loss 1.7603): 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 529/938 [03:43<02:44, 2.48it/s] Training 1/1 epoch (loss 1.7603): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 530/938 [03:43<02:46, 2.46it/s] Training 1/1 epoch (loss 1.7504): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 530/938 [03:43<02:46, 2.46it/s] Training 1/1 epoch (loss 1.7504): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 531/938 [03:43<02:42, 2.51it/s] Training 1/1 epoch (loss 1.9131): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 531/938 [03:44<02:42, 2.51it/s] Training 1/1 epoch (loss 1.9131): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 532/938 [03:44<02:38, 2.55it/s] Training 1/1 epoch (loss 1.8213): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 532/938 [03:44<02:38, 2.55it/s] Training 1/1 epoch (loss 1.8213): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 533/938 [03:44<02:39, 2.55it/s] Training 1/1 epoch (loss 1.7695): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 533/938 [03:44<02:39, 2.55it/s] Training 1/1 epoch (loss 1.7695): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 534/938 [03:44<02:33, 2.63it/s] Training 
1/1 epoch (loss 1.7609): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 534/938 [03:45<02:33, 2.63it/s] Training 1/1 epoch (loss 1.7609): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 535/938 [03:45<02:51, 2.34it/s] Training 1/1 epoch (loss 1.6901): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 535/938 [03:45<02:51, 2.34it/s] Training 1/1 epoch (loss 1.6901): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 536/938 [03:45<02:42, 2.47it/s] Training 1/1 epoch (loss 1.8637): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 536/938 [03:46<02:42, 2.47it/s] Training 1/1 epoch (loss 1.8637): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 537/938 [03:46<02:38, 2.54it/s] Training 1/1 epoch (loss 1.7151): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 537/938 [03:46<02:38, 2.54it/s] Training 1/1 epoch (loss 1.7151): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 538/938 [03:46<02:36, 2.55it/s] Training 1/1 epoch (loss 1.8987): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 538/938 [03:47<02:36, 2.55it/s] Training 1/1 epoch (loss 1.8987): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 539/938 [03:47<02:41, 2.47it/s] Training 1/1 epoch (loss 1.7858): 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 539/938 [03:47<02:41, 2.47it/s] Training 1/1 epoch (loss 1.7858): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 540/938 [03:47<02:35, 2.56it/s] Training 1/1 epoch (loss 1.6614): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 540/938 [03:47<02:35, 2.56it/s] Training 1/1 epoch (loss 1.6614): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 541/938 [03:47<02:31, 2.61it/s] Training 1/1 epoch (loss 1.8097): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 541/938 [03:48<02:31, 2.61it/s] Training 1/1 epoch (loss 1.8097): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 542/938 [03:48<02:31, 2.62it/s] Training 1/1 epoch (loss 1.8621): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 542/938 [03:48<02:31, 2.62it/s] Training 1/1 epoch (loss 1.8621): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 543/938 [03:48<02:34, 2.56it/s] Training 1/1 epoch (loss 1.7658): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 543/938 [03:48<02:34, 2.56it/s] Training 1/1 epoch (loss 1.7658): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 544/938 [03:48<02:39, 2.47it/s] Training 1/1 epoch (loss 1.7019): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 544/938 [03:49<02:39, 2.47it/s] Training 1/1 epoch (loss 1.7019): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 545/938 [03:49<02:50, 2.30it/s] 
Training 1/1 epoch (loss 1.6684): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 545/938 [03:49<02:50, 2.30it/s] Training 1/1 epoch (loss 1.6684): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 546/938 [03:49<02:51, 2.29it/s] Training 1/1 epoch (loss 1.7756): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 546/938 [03:50<02:51, 2.29it/s] Training 1/1 epoch (loss 1.7756): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 547/938 [03:50<02:43, 2.40it/s] Training 1/1 epoch (loss 1.8715): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 547/938 [03:50<02:43, 2.40it/s] Training 1/1 epoch (loss 1.8715): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 548/938 [03:50<02:37, 2.48it/s] Training 1/1 epoch (loss 1.7087): 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 548/938 [03:51<02:37, 2.48it/s] Training 1/1 epoch (loss 1.7087): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 549/938 [03:51<02:39, 2.44it/s] Training 1/1 epoch (loss 1.8770): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 549/938 [03:51<02:39, 2.44it/s] Training 1/1 epoch (loss 1.8770): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 550/938 [03:51<02:34, 2.51it/s] Training 1/1 epoch (loss 1.8351): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 550/938 [03:51<02:34, 2.51it/s] Training 1/1 epoch (loss 1.8351): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 551/938 [03:51<02:31, 2.55it/s] Training 1/1 epoch (loss 1.7663): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 551/938 [03:52<02:31, 2.55it/s] Training 1/1 epoch (loss 1.7663): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 552/938 [03:52<02:30, 2.57it/s] Training 1/1 epoch (loss 1.8727): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 552/938 [03:52<02:30, 2.57it/s] Training 1/1 epoch (loss 1.8727): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 553/938 [03:52<02:27, 2.60it/s] Training 1/1 epoch (loss 1.8385): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 553/938 [03:52<02:27, 2.60it/s] Training 1/1 epoch (loss 1.8385): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 554/938 [03:52<02:27, 2.60it/s] Training 1/1 epoch (loss 1.6988): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 554/938 [03:53<02:27, 2.60it/s] Training 1/1 epoch (loss 1.6988): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 555/938 [03:53<02:33, 2.50it/s] Training 1/1 epoch (loss 1.8372): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 555/938 [03:53<02:33, 2.50it/s] Training 1/1 epoch (loss 1.8372): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 556/938 [03:53<02:28, 
2.58it/s] Training 1/1 epoch (loss 1.8541): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 556/938 [03:54<02:28, 2.58it/s] Training 1/1 epoch (loss 1.8541): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 557/938 [03:54<02:25, 2.61it/s] Training 1/1 epoch (loss 1.8108): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 557/938 [03:54<02:25, 2.61it/s] Training 1/1 epoch (loss 1.8108): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 558/938 [03:54<02:33, 2.47it/s] Training 1/1 epoch (loss 1.8046): 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 558/938 [03:55<02:33, 2.47it/s] Training 1/1 epoch (loss 1.8046): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 559/938 [03:55<02:42, 2.33it/s] Training 1/1 epoch (loss 1.7254): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 559/938 [03:55<02:42, 2.33it/s] Training 1/1 epoch (loss 1.7254): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 560/938 [03:55<02:51, 2.20it/s] Training 1/1 epoch (loss 1.8617): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 560/938 [03:55<02:51, 2.20it/s] Training 1/1 epoch (loss 1.8617): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 561/938 [03:55<02:40, 2.35it/s] Training 1/1 epoch (loss 1.9248): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 561/938 [03:56<02:40, 2.35it/s] Training 1/1 epoch (loss 1.9248): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 562/938 [03:56<02:35, 2.42it/s] Training 1/1 epoch (loss 1.7124): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 562/938 [03:56<02:35, 2.42it/s] Training 1/1 epoch (loss 1.7124): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 563/938 [03:56<02:34, 2.42it/s] Training 1/1 epoch (loss 1.8942): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 563/938 [03:57<02:34, 2.42it/s] Training 1/1 epoch (loss 1.8942): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 564/938 [03:57<02:32, 2.45it/s] Training 1/1 epoch (loss 1.8360): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 564/938 [03:57<02:32, 2.45it/s] Training 1/1 epoch (loss 1.8360): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 565/938 [03:57<02:37, 2.37it/s] Training 1/1 epoch (loss 1.7816): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 565/938 [03:57<02:37, 2.37it/s] Training 1/1 epoch (loss 1.7816): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 566/938 [03:57<02:27, 2.52it/s] Training 1/1 epoch (loss 1.9018): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 566/938 [03:58<02:27, 2.52it/s] Training 1/1 epoch (loss 1.9018): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 567/938 
[03:58<02:28, 2.50it/s] Training 1/1 epoch (loss 1.8707): 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 567/938 [03:58<02:28, 2.50it/s] Training 1/1 epoch (loss 1.8707): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 568/938 [03:58<02:32, 2.43it/s] Training 1/1 epoch (loss 1.9798): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 568/938 [03:59<02:32, 2.43it/s] Training 1/1 epoch (loss 1.9798): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 569/938 [03:59<02:23, 2.57it/s] Training 1/1 epoch (loss 1.7833): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 569/938 [03:59<02:23, 2.57it/s] Training 1/1 epoch (loss 1.7833): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 570/938 [03:59<02:31, 2.43it/s] Training 1/1 epoch (loss 1.8718): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 570/938 [04:00<02:31, 2.43it/s] Training 1/1 epoch (loss 1.8718): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 571/938 [04:00<02:58, 2.06it/s] Training 1/1 epoch (loss 1.8195): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 571/938 [04:00<02:58, 2.06it/s] Training 1/1 epoch (loss 1.8195): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 572/938 [04:00<02:46, 2.19it/s] Training 1/1 epoch (loss 1.8403): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 572/938 [04:01<02:46, 2.19it/s] Training 1/1 epoch (loss 1.8403): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 573/938 [04:01<02:38, 2.30it/s] Training 1/1 epoch (loss 1.8128): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 573/938 [04:01<02:38, 2.30it/s] Training 1/1 epoch (loss 1.8128): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 574/938 [04:01<02:33, 2.38it/s] Training 1/1 epoch (loss 1.7883): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 574/938 [04:01<02:33, 2.38it/s] Training 1/1 epoch (loss 1.7883): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 575/938 [04:01<02:28, 2.45it/s] Training 1/1 epoch (loss 1.7747): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 575/938 [04:02<02:28, 2.45it/s] Training 1/1 epoch (loss 1.7747): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 576/938 [04:02<02:22, 2.53it/s] Training 1/1 epoch (loss 1.7917): 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 576/938 [04:02<02:22, 2.53it/s] Training 1/1 epoch (loss 1.7917): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 577/938 [04:02<02:21, 2.55it/s] Training 1/1 epoch (loss 1.9325): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 577/938 [04:02<02:21, 2.55it/s] Training 1/1 epoch (loss 1.9325): 
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 578/938 [04:02<02:16, 2.64it/s] Training 1/1 epoch (loss 1.8498): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 578/938 [04:03<02:16, 2.64it/s] Training 1/1 epoch (loss 1.8498): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 579/938 [04:03<02:12, 2.70it/s] Training 1/1 epoch (loss 1.8744): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 579/938 [04:03<02:12, 2.70it/s] Training 1/1 epoch (loss 1.8744): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 580/938 [04:03<02:14, 2.66it/s] Training 1/1 epoch (loss 1.8184): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 580/938 [04:04<02:14, 2.66it/s] Training 1/1 epoch (loss 1.8184): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 581/938 [04:04<02:16, 2.61it/s] Training 1/1 epoch (loss 1.7420): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 581/938 [04:04<02:16, 2.61it/s] Training 1/1 epoch (loss 1.7420): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 582/938 [04:04<02:13, 2.66it/s] Training 1/1 epoch (loss 1.8332): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 582/938 [04:04<02:13, 2.66it/s] Training 1/1 epoch (loss 1.8332): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 583/938 [04:04<02:11, 2.69it/s] Training 1/1 epoch (loss 1.7103): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 583/938 [04:05<02:11, 2.69it/s] Training 1/1 epoch (loss 1.7103): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 584/938 [04:05<02:28, 2.39it/s] Training 1/1 epoch (loss 1.6962): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 584/938 [04:05<02:28, 2.39it/s] Training 1/1 epoch (loss 1.6962): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 585/938 [04:05<02:28, 2.38it/s] Training 1/1 epoch (loss 1.8375): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 585/938 [04:06<02:28, 2.38it/s] Training 1/1 epoch (loss 1.8375): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 586/938 [04:06<02:25, 2.42it/s] Training 1/1 epoch (loss 1.8611): 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 586/938 [04:06<02:25, 2.42it/s] Training 1/1 epoch (loss 1.8611): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 587/938 [04:06<02:18, 2.54it/s] Training 1/1 epoch (loss 1.8378): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 587/938 [04:06<02:18, 2.54it/s] Training 1/1 epoch (loss 1.8378): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 588/938 [04:06<02:19, 2.51it/s] Training 1/1 epoch (loss 1.8787): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž 
| 588/938 [04:07<02:19, 2.51it/s] Training 1/1 epoch (loss 1.8787): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 589/938 [04:07<02:19, 2.50it/s] Training 1/1 epoch (loss 1.9118): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 589/938 [04:07<02:19, 2.50it/s] Training 1/1 epoch (loss 1.9118): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 590/938 [04:07<02:17, 2.53it/s] Training 1/1 epoch (loss 1.7790): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 590/938 [04:07<02:17, 2.53it/s] Training 1/1 epoch (loss 1.7790): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 591/938 [04:07<02:14, 2.57it/s] Training 1/1 epoch (loss 1.7238): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 591/938 [04:08<02:14, 2.57it/s] Training 1/1 epoch (loss 1.7238): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 592/938 [04:08<02:14, 2.57it/s] Training 1/1 epoch (loss 1.8666): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 592/938 [04:08<02:14, 2.57it/s] Training 1/1 epoch (loss 1.8666): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 593/938 [04:08<02:10, 2.64it/s] Training 1/1 epoch (loss 1.8240): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 593/938 [04:09<02:10, 2.64it/s] Training 1/1 epoch (loss 1.8240): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 594/938 [04:09<02:10, 2.63it/s] Training 1/1 epoch (loss 1.7206): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 594/938 [04:09<02:10, 2.63it/s] Training 1/1 epoch (loss 1.7206): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 595/938 [04:09<02:11, 2.60it/s] Training 1/1 epoch (loss 1.8650): 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 595/938 [04:09<02:11, 2.60it/s] Training 1/1 epoch (loss 1.8650): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 596/938 [04:09<02:07, 2.68it/s] Training 1/1 epoch (loss 1.8206): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 596/938 [04:10<02:07, 2.68it/s] Training 1/1 epoch (loss 1.8206): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 597/938 [04:10<02:12, 2.58it/s] Training 1/1 epoch (loss 1.9382): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 597/938 [04:10<02:12, 2.58it/s] Training 1/1 epoch (loss 1.9382): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 598/938 [04:10<02:16, 2.50it/s] Training 1/1 epoch (loss 1.8202): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 598/938 [04:11<02:16, 2.50it/s] Training 1/1 epoch (loss 1.8202): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 599/938 [04:11<02:14, 
2.52it/s] Training 1/1 epoch (loss 1.7342): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 599/938 [04:11<02:14, 2.52it/s] Training 1/1 epoch (loss 1.7342): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 600/938 [04:11<02:12, 2.55it/s] Training 1/1 epoch (loss 1.8671): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 600/938 [04:11<02:12, 2.55it/s] Training 1/1 epoch (loss 1.8671): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 601/938 [04:11<02:09, 2.60it/s] Training 1/1 epoch (loss 1.7049): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 601/938 [04:12<02:09, 2.60it/s] Training 1/1 epoch (loss 1.7049): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 602/938 [04:12<02:11, 2.56it/s] Training 1/1 epoch (loss 1.8923): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 602/938 [04:12<02:11, 2.56it/s] Training 1/1 epoch (loss 1.8923): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 603/938 [04:12<02:08, 2.61it/s] Training 1/1 epoch (loss 1.8871): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 603/938 [04:12<02:08, 2.61it/s] Training 1/1 epoch (loss 1.8871): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 604/938 [04:12<02:05, 2.65it/s] Training 1/1 epoch (loss 1.7846): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 604/938 [04:13<02:05, 2.65it/s] Training 1/1 epoch (loss 1.7846): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 605/938 [04:13<02:10, 2.56it/s] Training 1/1 epoch (loss 1.8476): 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 605/938 [04:13<02:10, 2.56it/s] Training 1/1 epoch (loss 1.8476): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 606/938 [04:13<02:09, 2.56it/s] Training 1/1 epoch (loss 1.7674): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 606/938 [04:14<02:09, 2.56it/s] Training 1/1 epoch (loss 1.7674): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 607/938 [04:14<02:07, 2.60it/s] Training 1/1 epoch (loss 1.8798): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 607/938 [04:14<02:07, 2.60it/s] Training 1/1 epoch (loss 1.8798): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 608/938 [04:14<02:16, 2.42it/s] Training 1/1 epoch (loss 1.7866): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 608/938 [04:15<02:16, 2.42it/s] Training 1/1 epoch (loss 1.7866): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 609/938 [04:15<02:15, 2.43it/s] Training 1/1 epoch (loss 1.7985): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 609/938 [04:15<02:15, 2.43it/s] Training 1/1 
epoch (loss 1.7985): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 610/938 [04:15<02:19, 2.35it/s] Training 1/1 epoch (loss 1.8666): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 610/938 [04:16<02:19, 2.35it/s] Training 1/1 epoch (loss 1.8666): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 611/938 [04:16<02:26, 2.24it/s] Training 1/1 epoch (loss 1.7725): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 611/938 [04:16<02:26, 2.24it/s] Training 1/1 epoch (loss 1.7725): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 612/938 [04:16<02:17, 2.37it/s] Training 1/1 epoch (loss 1.5944): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 612/938 [04:16<02:17, 2.37it/s] Training 1/1 epoch (loss 1.5944): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 613/938 [04:16<02:13, 2.43it/s] Training 1/1 epoch (loss 1.9424): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 613/938 [04:17<02:13, 2.43it/s] Training 1/1 epoch (loss 1.9424): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 614/938 [04:17<02:19, 2.33it/s] Training 1/1 epoch (loss 1.6690): 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 614/938 [04:17<02:19, 2.33it/s] Training 1/1 epoch (loss 1.6690): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 615/938 [04:17<02:12, 2.43it/s] Training 1/1 epoch (loss 1.8161): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 615/938 [04:18<02:12, 2.43it/s] Training 1/1 epoch (loss 1.8161): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 616/938 [04:18<02:12, 2.43it/s] Training 1/1 epoch (loss 1.6681): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 616/938 [04:18<02:12, 2.43it/s] Training 1/1 epoch (loss 1.6681): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 617/938 [04:18<02:10, 2.46it/s] Training 1/1 epoch (loss 1.7428): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 617/938 [04:18<02:10, 2.46it/s] Training 1/1 epoch (loss 1.7428): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 618/938 [04:18<02:04, 2.57it/s] Training 1/1 epoch (loss 1.8348): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 618/938 [04:19<02:04, 2.57it/s] Training 1/1 epoch (loss 1.8348): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 619/938 [04:19<02:02, 2.61it/s] Training 1/1 epoch (loss 1.6880): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 619/938 [04:19<02:02, 2.61it/s] Training 1/1 epoch (loss 1.6880): 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 620/938 [04:19<02:05, 2.53it/s] Training 1/1 epoch (loss 1.7313): 
66%|██████▌ | 620/938 [04:19<02:05, 2.53it/s]
Training 1/1 epoch (loss 1.7313): 66%|██████▌ | 621/938 [04:19<02:05, 2.52it/s]
Training 1/1 epoch (loss 1.9206): 66%|██████▋ | 622/938 [04:20<02:02, 2.57it/s]
Training 1/1 epoch (loss 1.6799): 66%|██████▋ | 623/938 [04:20<02:02, 2.57it/s]
Training 1/1 epoch (loss 1.7637): 67%|██████▋ | 624/938 [04:21<02:08, 2.44it/s]
Training 1/1 epoch (loss 1.5636): 67%|██████▋ | 625/938 [04:21<02:06, 2.48it/s]
Training 1/1 epoch (loss 1.8208): 67%|██████▋ | 626/938 [04:21<02:01, 2.57it/s]
Training 1/1 epoch (loss 1.8643): 67%|██████▋ | 627/938 [04:22<02:02, 2.54it/s]
Training 1/1 epoch (loss 1.7557): 67%|██████▋ | 628/938 [04:22<01:57, 2.64it/s]
Training 1/1 epoch (loss 1.7701): 67%|██████▋ | 629/938 [04:23<01:56, 2.66it/s]
Training 1/1 epoch (loss 1.8290): 67%|██████▋ | 630/938 [04:23<01:53, 2.71it/s]
Training 1/1 epoch (loss 1.7004): 67%|██████▋ | 631/938 [04:23<01:54, 2.69it/s]
Training 1/1 epoch (loss 1.9113): 67%|██████▋ | 632/938 [04:24<01:56, 2.62it/s]
Training 1/1 epoch (loss 1.6454): 67%|██████▋ | 633/938 [04:24<01:59, 2.55it/s]
Training 1/1 epoch (loss 1.7736): 68%|██████▊ | 634/938 [04:25<02:05, 2.41it/s]
Training 1/1 epoch (loss 1.7384): 68%|██████▊ | 635/938 [04:25<02:12, 2.29it/s]
Training 1/1 epoch (loss 1.7581): 68%|██████▊ | 636/938 [04:25<02:06, 2.39it/s]
Training 1/1 epoch (loss 1.9050): 68%|██████▊ | 637/938 [04:26<02:08, 2.33it/s]
Training 1/1 epoch (loss 1.8082): 68%|██████▊ | 638/938 [04:26<02:08, 2.34it/s]
Training 1/1 epoch (loss 1.6799): 68%|██████▊ | 639/938 [04:27<02:07, 2.35it/s]
Training 1/1 epoch (loss 1.8737): 68%|██████▊ | 640/938 [04:27<02:07, 2.33it/s]
Training 1/1 epoch (loss 1.6740): 68%|██████▊ | 641/938 [04:28<02:06, 2.36it/s]
Training 1/1 epoch (loss 1.6522): 68%|██████▊ | 642/938 [04:28<01:59, 2.48it/s]
Training 1/1 epoch (loss 1.8751): 69%|██████▊ | 643/938 [04:28<01:53, 2.59it/s]
Training 1/1 epoch (loss 1.6355): 69%|██████▊ | 644/938 [04:29<01:51, 2.65it/s]
Training 1/1 epoch (loss 1.8008): 69%|██████▉ | 645/938 [04:29<01:44, 2.81it/s]
Training 1/1 epoch (loss 1.8672): 69%|██████▉ | 646/938 [04:29<01:47, 2.71it/s]
Training 1/1 epoch (loss 1.8779): 69%|██████▉ | 647/938 [04:30<01:52, 2.59it/s]
Training 1/1 epoch (loss 1.8809): 69%|██████▉ | 648/938 [04:30<01:57, 2.48it/s]
Training 1/1 epoch (loss 1.8364): 69%|██████▉ | 649/938 [04:31<01:52, 2.58it/s]
Training 1/1 epoch (loss 1.7834): 69%|██████▉ | 650/938 [04:31<01:50, 2.61it/s]
Training 1/1 epoch (loss 1.8015): 69%|██████▉ | 651/938 [04:31<01:49, 2.63it/s]
Training 1/1 epoch (loss 1.6905): 70%|██████▉ | 652/938 [04:32<01:48, 2.65it/s]
Training 1/1 epoch (loss 1.9489): 70%|██████▉ | 653/938 [04:32<01:46, 2.68it/s]
Training 1/1 epoch (loss 1.8702): 70%|██████▉ | 654/938 [04:32<01:39, 2.84it/s]
Training 1/1 epoch (loss 1.8009): 70%|██████▉ | 655/938 [04:33<01:42, 2.76it/s]
Training 1/1 epoch (loss 1.8503): 70%|██████▉ | 656/938 [04:33<01:44, 2.71it/s]
Training 1/1 epoch (loss 1.8771): 70%|███████ | 657/938 [04:33<01:41, 2.76it/s]
Training 1/1 epoch (loss 1.7838): 70%|███████ | 658/938 [04:34<01:37, 2.86it/s]
Training 1/1 epoch (loss 1.8617): 70%|███████ | 659/938 [04:34<01:39, 2.79it/s]
Training 1/1 epoch (loss 1.8264): 70%|███████ | 660/938 [04:35<01:47, 2.59it/s]
Training 1/1 epoch (loss 1.8832): 70%|███████ | 661/938 [04:35<01:46, 2.60it/s]
Training 1/1 epoch (loss 1.7966): 71%|███████ | 662/938 [04:35<01:42, 2.70it/s]
Training 1/1 epoch (loss 1.6712): 71%|███████ | 663/938 [04:36<01:48, 2.53it/s]
Training 1/1 epoch (loss 1.8050): 71%|███████ | 664/938 [04:36<01:59, 2.30it/s]
Training 1/1 epoch (loss 1.7886): 71%|███████ | 665/938 [04:37<02:10, 2.09it/s]
Training 1/1 epoch (loss 1.7274): 71%|███████ | 666/938 [04:37<02:07, 2.14it/s]
Training 1/1 epoch (loss 1.7931): 71%|███████ | 667/938 [04:38<02:08, 2.10it/s]
Training 1/1 epoch (loss 1.6403): 71%|███████ | 668/938 [04:38<02:09, 2.09it/s]
Training 1/1 epoch (loss 1.6578): 71%|███████▏ | 669/938 [04:39<02:03, 2.18it/s]
Training 1/1 epoch (loss 1.6897): 71%|███████▏ | 670/938 [04:39<01:55, 2.32it/s]
Training 1/1 epoch (loss 1.8503): 72%|███████▏ | 671/938 [04:40<01:54, 2.32it/s]
Training 1/1 epoch (loss 1.9841): 72%|███████▏ | 672/938 [04:40<01:53, 2.35it/s]
Training 1/1 epoch (loss 1.7763): 72%|███████▏ | 673/938 [04:40<01:52, 2.35it/s]
Training 1/1 epoch (loss 1.7631): 72%|███████▏ | 674/938 [04:41<01:52, 2.34it/s]
Training 1/1 epoch (loss 1.7409): 72%|███████▏ | 675/938 [04:41<01:47, 2.44it/s]
Training 1/1 epoch (loss 1.7931): 72%|███████▏ | 676/938 [04:42<01:45, 2.49it/s]
Training 1/1 epoch (loss 1.9395): 72%|███████▏ | 677/938 [04:42<01:41, 2.56it/s]
Training 1/1 epoch (loss 1.7250): 72%|███████▏ | 678/938 [04:42<01:40, 2.58it/s]
Training 1/1 epoch (loss 1.7642): 72%|███████▏ | 679/938 [04:43<01:42, 2.54it/s]
Training 1/1 epoch (loss 1.7624): 72%|███████▏ | 680/938 [04:43<01:49, 2.35it/s]
Training 1/1 epoch (loss 1.7895): 73%|███████▎ | 681/938 [04:44<01:49, 2.35it/s]
Training 1/1 epoch (loss 1.8179): 73%|███████▎ | 682/938 [04:44<01:44, 2.44it/s]
Training 1/1 epoch (loss 1.8383): 73%|███████▎ | 683/938 [04:44<01:44, 2.44it/s]
Training 1/1 epoch (loss 1.8156): 73%|███████▎ | 684/938 [04:45<01:47, 2.37it/s]
Training 1/1 epoch (loss 1.8667): 73%|███████▎ | 685/938 [04:45<01:44, 2.42it/s]
Training 1/1 epoch (loss 1.9215): 73%|███████▎ | 686/938 [04:46<01:39, 2.54it/s]
Training 1/1 epoch (loss 1.7535): 73%|███████▎ | 687/938 [04:46<01:38, 2.54it/s]
Training 1/1 epoch (loss 1.8105): 73%|███████▎ | 688/938 [04:46<01:46, 2.35it/s]
Training 1/1 epoch (loss 1.8560): 73%|███████▎ | 689/938 [04:47<01:47, 2.32it/s]
Training 1/1 epoch (loss 1.8835): 74%|███████▎ | 690/938 [04:47<01:47, 2.30it/s]
Training 1/1 epoch (loss 1.8566): 74%|███████▎ | 691/938 [04:48<01:43, 2.39it/s]
Training 1/1 epoch (loss 1.8473): 74%|███████▍ | 692/938 [04:48<01:40, 2.45it/s]
Training 1/1 epoch (loss 1.7254): 74%|███████▍ | 693/938 [04:49<01:38, 2.49it/s]
Training 1/1 epoch (loss 1.7256): 74%|███████▍ | 694/938 [04:49<01:35, 2.55it/s]
Training 1/1 epoch (loss 1.7889): 74%|███████▍ | 695/938 [04:49<01:35, 2.54it/s]
Training 1/1 epoch (loss 1.6193): 74%|███████▍ | 696/938 [04:50<01:55, 2.09it/s]
Training 1/1 epoch (loss 1.8688): 74%|███████▍ | 697/938 [04:50<01:47, 2.25it/s]
Training 1/1 epoch (loss 1.7537): 74%|███████▍ | 698/938 [04:51<02:04, 1.92it/s]
Training 1/1 epoch (loss 1.6803): 75%|███████▍ | 699/938 [04:51<01:53, 2.10it/s]
Training 1/1 epoch (loss 1.8069): 75%|███████▍ | 700/938 [04:52<01:46, 2.23it/s]
Training 1/1 epoch (loss 1.7211): 75%|███████▍ | 701/938 [04:52<01:46, 2.23it/s]
Training 1/1 epoch (loss 1.7093): 75%|███████▍ | 702/938 [04:53<01:42, 2.31it/s]
Training 1/1 epoch (loss 1.8657): 75%|███████▍ | 703/938 [04:53<01:36, 2.43it/s]
Training 1/1 epoch (loss 1.7381): 75%|███████▌ | 704/938 [04:53<01:37, 2.40it/s]
Training 1/1 epoch (loss 1.7645): 75%|███████▌ | 705/938 [04:54<01:38, 2.37it/s]
Training 1/1 epoch (loss 1.6945): 75%|███████▌ | 706/938 [04:54<01:41, 2.28it/s]
Training 1/1 epoch (loss 1.7883): 75%|███████▌ | 707/938 [04:55<01:40, 2.30it/s]
Training 1/1 epoch (loss 1.7947): 75%|███████▌ | 708/938 [04:55<01:36, 2.37it/s]
Training 1/1 epoch (loss 1.8913): 76%|███████▌ | 709/938 [04:56<01:34, 2.42it/s]
Training 1/1 epoch (loss 1.7108): 76%|███████▌ | 710/938 [04:56<01:32, 2.47it/s]
Training 1/1 epoch (loss 1.8581): 76%|███████▌ | 711/938 [04:56<01:31, 2.49it/s]
Training 1/1 epoch (loss 1.7703): 76%|███████▌ | 712/938 [04:57<01:36, 2.33it/s]
Training 1/1 epoch (loss 1.7567): 76%|███████▌ | 713/938 [04:57<01:34, 2.37it/s]
Training 1/1 epoch (loss 1.7801): 76%|███████▌ | 714/938 [04:58<01:34, 2.38it/s]
Training 1/1 epoch (loss 1.7796): 76%|███████▌ | 715/938 [04:58<01:38, 2.26it/s]
Training 1/1 epoch (loss 1.7178): 76%|███████▋ | 716/938 [04:58<01:33, 2.39it/s]
Training 1/1 epoch (loss 1.7524): 76%|███████▋ | 717/938 [04:59<01:38, 2.25it/s]
Training 1/1 epoch (loss 1.7421): 77%|███████▋ | 718/938 [05:00<01:44, 2.10it/s]
Training 1/1 epoch (loss 1.8539): 77%|███████▋ | 719/938 [05:00<01:43, 2.12it/s]
Training 1/1 epoch (loss 1.7380): 77%|███████▋ | 720/938 [05:00<01:42, 2.13it/s]
Training 1/1 epoch (loss 1.6993): 77%|███████▋ | 721/938 [05:01<01:39, 2.19it/s]
Training 1/1 epoch (loss 1.6829): 77%|███████▋ | 722/938 [05:01<01:37, 2.22it/s]
Training 1/1 epoch (loss 1.5802): 77%|███████▋ | 723/938 [05:02<01:33, 2.30it/s]
Training 1/1 epoch (loss 1.6937): 77%|███████▋ | 724/938 [05:02<01:29, 2.38it/s]
Training 1/1 epoch (loss 1.7786): 77%|███████▋ | 725/938 [05:03<01:30, 2.36it/s]
Training 1/1 epoch (loss 1.7159): 77%|███████▋ | 726/938 [05:03<01:36, 2.19it/s]
Training 1/1 epoch (loss 1.8085): 78%|███████▊ | 727/938 [05:03<01:33, 2.27it/s]
Training 1/1 epoch (loss 1.6172): 78%|███████▊ | 728/938 [05:04<01:34, 2.23it/s]
Training 1/1 epoch (loss 1.6810): 78%|███████▊ | 729/938 [05:04<01:37, 2.15it/s]
Training 1/1 epoch (loss 1.7852): 78%|███████▊ | 730/938 [05:05<01:36, 2.17it/s]
Training 1/1 epoch (loss 1.8237): 78%|███████▊ | 731/938 [05:05<01:36, 2.15it/s]
Training 1/1 epoch (loss 1.7431): 78%|███████▊ | 732/938 [05:06<01:30, 2.28it/s]
Training 1/1 epoch (loss 1.7915): 78%|███████▊ | 733/938 [05:06<01:31, 2.25it/s]
Training 1/1 epoch (loss 1.7975): 78%|███████▊ | 734/938 [05:07<01:25, 2.39it/s]
Training 1/1 epoch (loss 1.8127): 78%|███████▊ | 735/938 [05:07<01:26, 2.35it/s]
Training 1/1 epoch (loss 1.8131): 78%|███████▊ | 736/938 [05:08<01:30, 2.22it/s]
Training 1/1 epoch (loss 1.7628): 79%|███████▊ | 737/938 [05:08<01:31, 2.20it/s]
Training 1/1 epoch (loss 1.7115): 79%|███████▊ | 738/938 [05:08<01:27, 2.29it/s]
Training 1/1 epoch (loss 1.6834): 79%|███████▉ | 739/938 [05:09<01:23, 2.39it/s]
Training 1/1 epoch (loss 1.7998): 79%|███████▉ | 740/938 [05:09<01:22, 2.41it/s]
Training 1/1 epoch (loss 1.7909): 79%|███████▉ | 741/938 [05:10<01:20, 2.46it/s]
Training 1/1 epoch (loss 1.7840): 79%|███████▉ | 742/938 [05:10<01:14, 2.62it/s]
Training 1/1 epoch (loss 1.8485): 79%|███████▉ | 743/938 [05:10<01:11, 2.71it/s]
Training 1/1 epoch (loss 1.7563): 79%|███████▉ | 744/938 [05:11<01:14, 2.61it/s]
Training 1/1 epoch (loss 1.8931): 79%|███████▉ | 745/938 [05:11<01:13, 2.64it/s]
Training 1/1 epoch (loss 1.7380): 80%|███████▉ | 746/938 [05:12<01:24, 2.28it/s]
Training 1/1 epoch (loss 1.9314): 80%|███████▉ | 747/938 [05:12<01:19, 2.39it/s]
Training 1/1 epoch (loss 1.7438): 80%|███████▉ | 748/938 [05:12<01:19, 2.40it/s]
Training 1/1 epoch (loss 1.6624): 80%|███████▉ | 749/938 [05:13<01:25, 2.21it/s]
Training 1/1 epoch (loss 1.7310): 80%|███████▉ | 750/938 [05:13<01:27, 2.16it/s]
Training 1/1 epoch (loss 1.7625): 80%|████████ | 751/938 [05:14<01:27, 2.14it/s]
Training 1/1 epoch (loss 1.7623): 80%|████████ | 752/938 [05:14<01:22, 2.26it/s]
Training 1/1 epoch (loss 1.8506): 80%|████████ | 753/938 [05:15<01:24, 2.20it/s]
Training 1/1 epoch (loss 1.8157): 80%|████████ | 754/938 [05:15<01:19, 2.31it/s]
Training 1/1 epoch (loss 1.8044): 80%|████████ | 755/938 [05:16<01:19, 2.29it/s]
Training 1/1 epoch (loss 1.7805): 81%|████████ | 756/938 [05:16<01:14, 2.44it/s]
Training 1/1 epoch (loss 1.7018): 81%|████████ | 757/938 [05:16<01:19, 2.29it/s]
Training 1/1 epoch (loss 1.9108): 81%|████████ | 758/938 [05:17<01:16, 2.36it/s]
Training 1/1 epoch (loss 1.7702): 81%|████████ | 759/938 [05:17<01:11, 2.49it/s]
Training 1/1 epoch (loss 1.6003): 81%|████████ | 760/938 [05:18<01:14, 2.39it/s]
Training 1/1 epoch (loss 1.9598): 81%|████████ | 761/938 [05:18<01:09, 2.53it/s]
Training 1/1 epoch (loss 1.7128): 81%|████████ | 762/938 [05:18<01:09, 2.54it/s]
Training 1/1 epoch (loss 1.8537): 81%|████████▏ | 763/938 [05:19<01:07, 2.58it/s]
Training 1/1 epoch (loss 1.6021): 81%|████████▏ | 764/938 [05:19<01:05, 2.67it/s]
Training 1/1 epoch (loss 1.8614): 82%|████████▏ | 765/938 [05:19<01:04, 2.69it/s]
Training 1/1 epoch (loss 1.7819): 82%|████████▏ | 766/938 [05:20<01:04, 2.66it/s]
Training 1/1 epoch (loss 1.8829): 82%|████████▏ | 767/938 [05:20<01:05, 2.60it/s]
Training 1/1 epoch (loss 1.7464): 82%|████████▏ | 768/938 [05:21<01:04, 2.63it/s]
Training 1/1 epoch (loss 1.9586): 82%|████████▏ | 769/938 [05:21<01:05, 2.60it/s]
Training 1/1 epoch (loss 1.8062): 82%|████████▏ | 770/938 [05:21<01:07, 2.48it/s]
Training 1/1 epoch (loss 1.6327): 82%|████████▏ | 771/938 [05:22<01:06, 2.52it/s]
Training 1/1 epoch (loss 1.7138): 82%|████████▏ | 772/938 [05:22<01:04, 2.56it/s]
Training 1/1 epoch (loss 1.7044): 82%|████████▏ | 773/938 [05:23<01:02, 2.63it/s]
Training 1/1 epoch (loss 1.6410): 83%|████████▎ | 774/938 [05:23<01:03, 2.59it/s]
Training 1/1 epoch (loss 1.7410): 83%|████████▎ | 775/938 [05:23<01:03, 2.57it/s]
Training 1/1 epoch (loss 1.8240): 83%|████████▎ | 776/938 [05:24<01:11, 2.27it/s]
Training 1/1 epoch (loss 1.8468): 83%|████████▎ | 777/938 [05:24<01:11, 2.26it/s]
Training 1/1 epoch (loss 1.7189): 83%|████████▎ | 778/938 [05:25<01:06, 2.40it/s]
Training 1/1 epoch (loss 1.6140): 83%|████████▎ | 779/938 [05:25<01:07, 2.35it/s]
Training 1/1 epoch (loss 1.8544): 83%|████████▎ | 780/938 [05:26<01:07, 2.35it/s]
Training 1/1 epoch (loss 1.8179): 83%|████████▎ | 781/938 [05:26<01:04, 2.42it/s]
Training 1/1 epoch (loss 1.7204): 83%|████████▎ | 782/938 [05:26<01:03, 2.48it/s]
Training 1/1 epoch (loss 1.7963): 83%|████████▎ | 783/938 [05:27<01:01, 2.51it/s]
Training 1/1 epoch (loss 1.7817): 84%|████████▎ | 784/938 [05:27<01:00, 2.54it/s]
Training 1/1 epoch (loss 1.7286): 84%|████████▎ | 785/938 [05:27<00:59, 2.57it/s]
Training 1/1 epoch (loss 1.7685): 84%|████████▍ | 786/938 [05:28<01:01, 2.48it/s]
Training 1/1 epoch (loss 1.7540): 84%|████████▍ | 787/938 [05:28<01:01, 2.47it/s]
Training 1/1 epoch (loss 1.8726): 84%|████████▍ | 788/938 [05:29<01:00, 2.49it/s]
Training 1/1 epoch (loss 1.6244): 84%|████████▍ | 789/938 [05:29<01:00, 2.48it/s]
Training 1/1 epoch (loss 1.6154): 84%|████████▍ | 790/938 [05:30<01:06, 2.22it/s]
Training 1/1 epoch (loss 1.8052): 84%|████████▍ | 791/938 [05:30<01:02, 2.34it/s]
Training 1/1 epoch (loss 1.6999): 84%|████████▍ | 792/938 [05:30<01:02, 2.35it/s]
Training 1/1 epoch (loss 1.8149): 85%|████████▍ | 793/938 [05:31<00:59, 2.44it/s]
Training 1/1 epoch (loss 1.8890): 85%|████████▍ | 794/938 [05:31<00:55, 2.57it/s]
Training 1/1 epoch (loss 1.7094): 85%|████████▍ | 795/938 [05:32<00:56, 2.53it/s]
Training 1/1 epoch (loss 1.7627): 85%|████████▍ | 796/938 [05:32<00:59, 2.39it/s]
Training 1/1 epoch (loss 1.7221): 85%|████████▍ | 797/938 [05:32<00:55, 2.53it/s]
Training 1/1 epoch (loss 1.8568): 85%|████████▌ | 798/938 [05:33<00:56, 2.48it/s]
Training 1/1 epoch (loss 1.8190): 85%|████████▌ | 799/938 [05:33<00:53, 2.60it/s]
Training 1/1 epoch (loss 1.7564): 85%|████████▌ | 800/938 [05:34<00:54, 2.53it/s]
Training 1/1 epoch (loss 1.8411): 85%|████████▌ | 801/938 [05:34<00:52, 2.61it/s]
Training 1/1 epoch (loss 1.8774): 86%|████████▌ | 802/938 [05:34<00:55, 2.44it/s]
Training 1/1 epoch (loss 1.7558): 86%|████████▌ | 803/938 [05:35<00:55, 2.44it/s]
Training 1/1 epoch (loss 1.7190): 86%|████████▌ | 804/938 [05:35<00:55, 2.42it/s]
Training 1/1 epoch (loss 1.8004): 86%|████████▌ | 805/938 [05:36<00:51, 2.58it/s]
Training 1/1 epoch (loss 1.7891): 86%|████████▌ | 806/938 [05:36<00:51, 2.56it/s]
Training 1/1 epoch (loss 1.7212): 86%|████████▌ | 807/938 [05:36<00:51, 2.55it/s]
Training 1/1 epoch (loss 1.8408): 86%|████████▌ | 808/938 [05:37<00:49, 2.63it/s]
Training 1/1 epoch (loss 1.6715): 86%|████████▌ | 809/938 [05:37<00:50, 2.56it/s]
Training 1/1 epoch (loss 1.8022): 86%|████████▋ | 810/938 [05:38<00:49, 2.58it/s]
Training 1/1 epoch (loss 1.6514): 86%|████████▋ | 811/938 [05:38<00:50, 2.50it/s]
Training 1/1 epoch (loss 1.7558): 87%|████████▋ | 812/938 [05:38<00:47, 2.65it/s]
Training 1/1 epoch (loss 1.7892): 87%|████████▋ | 813/938 [05:39<00:46, 2.72it/s]
Training 1/1 epoch (loss 1.7492): 87%|████████▋ | 814/938 [05:39<00:45, 2.71it/s]
Training 1/1 epoch (loss 1.7564): 87%|████████▋ | 815/938 [05:39<00:46, 2.67it/s]
Training 1/1 epoch (loss 1.8272): 87%|████████▋ | 815/938 
[05:40<00:46, 2.67it/s] Training 1/1 epoch (loss 1.8272): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 816/938 [05:40<00:47, 2.57it/s] Training 1/1 epoch (loss 1.8149): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 816/938 [05:40<00:47, 2.57it/s] Training 1/1 epoch (loss 1.8149): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 817/938 [05:40<00:48, 2.51it/s] Training 1/1 epoch (loss 1.6836): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 817/938 [05:41<00:48, 2.51it/s] Training 1/1 epoch (loss 1.6836): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 818/938 [05:41<00:46, 2.59it/s] Training 1/1 epoch (loss 1.9632): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 818/938 [05:41<00:46, 2.59it/s] Training 1/1 epoch (loss 1.9632): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 819/938 [05:41<00:44, 2.65it/s] Training 1/1 epoch (loss 1.6670): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 819/938 [05:41<00:44, 2.65it/s] Training 1/1 epoch (loss 1.6670): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 820/938 [05:41<00:43, 2.74it/s] Training 1/1 epoch (loss 1.8066): 87%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 820/938 [05:42<00:43, 2.74it/s] Training 1/1 epoch (loss 1.8066): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 821/938 [05:42<00:43, 2.67it/s] Training 1/1 epoch (loss 1.9611): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 821/938 [05:42<00:43, 2.67it/s] Training 1/1 epoch (loss 1.9611): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 822/938 [05:42<00:43, 2.64it/s] Training 1/1 epoch (loss 1.7414): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 822/938 [05:42<00:43, 2.64it/s] Training 1/1 epoch (loss 1.7414): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 823/938 [05:42<00:44, 2.58it/s] Training 1/1 epoch (loss 1.7383): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 823/938 [05:43<00:44, 2.58it/s] Training 1/1 epoch (loss 1.7383): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 824/938 [05:43<00:45, 2.48it/s] Training 1/1 epoch (loss 1.6793): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 824/938 [05:43<00:45, 2.48it/s] Training 1/1 epoch (loss 1.6793): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 825/938 [05:43<00:44, 2.56it/s] Training 1/1 epoch (loss 1.8439): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 825/938 
[05:44<00:44, 2.56it/s] Training 1/1 epoch (loss 1.8439): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 826/938 [05:44<00:43, 2.55it/s] Training 1/1 epoch (loss 1.7987): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 826/938 [05:44<00:43, 2.55it/s] Training 1/1 epoch (loss 1.7987): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 827/938 [05:44<00:41, 2.69it/s] Training 1/1 epoch (loss 1.8889): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 827/938 [05:44<00:41, 2.69it/s] Training 1/1 epoch (loss 1.8889): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 828/938 [05:44<00:41, 2.64it/s] Training 1/1 epoch (loss 1.7409): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 828/938 [05:45<00:41, 2.64it/s] Training 1/1 epoch (loss 1.7409): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 829/938 [05:45<00:44, 2.43it/s] Training 1/1 epoch (loss 1.8424): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 829/938 [05:45<00:44, 2.43it/s] Training 1/1 epoch (loss 1.8424): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 830/938 [05:45<00:44, 2.45it/s] Training 1/1 epoch (loss 1.8842): 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 830/938 [05:46<00:44, 2.45it/s] Training 1/1 epoch (loss 1.8842): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 831/938 [05:46<00:43, 2.47it/s] Training 1/1 epoch (loss 1.7711): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 831/938 [05:46<00:43, 2.47it/s] Training 1/1 epoch (loss 1.7711): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 832/938 [05:46<00:42, 2.48it/s] Training 1/1 epoch (loss 1.7423): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 832/938 [05:47<00:42, 2.48it/s] Training 1/1 epoch (loss 1.7423): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 833/938 [05:47<00:43, 2.43it/s] Training 1/1 epoch (loss 1.7761): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 833/938 [05:47<00:43, 2.43it/s] Training 1/1 epoch (loss 1.7761): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 834/938 [05:47<00:40, 2.56it/s] Training 1/1 epoch (loss 1.7496): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 834/938 [05:47<00:40, 2.56it/s] Training 1/1 epoch (loss 1.7496): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 835/938 [05:47<00:38, 2.64it/s] Training 1/1 epoch (loss 1.6326): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 835/938 
[05:48<00:38, 2.64it/s] Training 1/1 epoch (loss 1.6326): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 836/938 [05:48<00:39, 2.61it/s] Training 1/1 epoch (loss 1.9417): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 836/938 [05:48<00:39, 2.61it/s] Training 1/1 epoch (loss 1.9417): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 837/938 [05:48<00:36, 2.74it/s] Training 1/1 epoch (loss 1.7552): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 837/938 [05:48<00:36, 2.74it/s] Training 1/1 epoch (loss 1.7552): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 838/938 [05:48<00:37, 2.69it/s] Training 1/1 epoch (loss 1.6731): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 838/938 [05:49<00:37, 2.69it/s] Training 1/1 epoch (loss 1.6731): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 839/938 [05:49<00:36, 2.74it/s] Training 1/1 epoch (loss 1.6738): 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 839/938 [05:49<00:36, 2.74it/s] Training 1/1 epoch (loss 1.6738): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 840/938 [05:49<00:35, 2.76it/s] Training 1/1 epoch (loss 1.8437): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 840/938 [05:49<00:35, 2.76it/s] Training 1/1 epoch (loss 1.8437): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 841/938 [05:49<00:35, 2.75it/s] Training 1/1 epoch (loss 1.7364): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 841/938 [05:50<00:35, 2.75it/s] Training 1/1 epoch (loss 1.7364): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 842/938 [05:50<00:36, 2.66it/s] Training 1/1 epoch (loss 1.8672): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 842/938 [05:50<00:36, 2.66it/s] Training 1/1 epoch (loss 1.8672): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 843/938 [05:50<00:39, 2.38it/s] Training 1/1 epoch (loss 1.7398): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 843/938 [05:51<00:39, 2.38it/s] Training 1/1 epoch (loss 1.7398): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 844/938 [05:51<00:38, 2.44it/s] Training 1/1 epoch (loss 1.9123): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 844/938 [05:51<00:38, 2.44it/s] Training 1/1 epoch (loss 1.9123): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 845/938 [05:51<00:36, 2.56it/s] Training 1/1 epoch (loss 1.8992): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 845/938 
[05:51<00:36, 2.56it/s] Training 1/1 epoch (loss 1.8992): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 846/938 [05:51<00:36, 2.53it/s] Training 1/1 epoch (loss 1.7776): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 846/938 [05:52<00:36, 2.53it/s] Training 1/1 epoch (loss 1.7776): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 847/938 [05:52<00:35, 2.57it/s] Training 1/1 epoch (loss 1.6970): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 847/938 [05:52<00:35, 2.57it/s] Training 1/1 epoch (loss 1.6970): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 848/938 [05:52<00:35, 2.55it/s] Training 1/1 epoch (loss 1.8049): 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 848/938 [05:53<00:35, 2.55it/s] Training 1/1 epoch (loss 1.8049): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 849/938 [05:53<00:36, 2.47it/s] Training 1/1 epoch (loss 1.6701): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 849/938 [05:53<00:36, 2.47it/s] Training 1/1 epoch (loss 1.6701): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 850/938 [05:53<00:34, 2.55it/s] Training 1/1 epoch (loss 1.8959): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 850/938 [05:53<00:34, 2.55it/s] Training 1/1 epoch (loss 1.8959): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 851/938 [05:53<00:33, 2.61it/s] Training 1/1 epoch (loss 1.8638): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 851/938 [05:54<00:33, 2.61it/s] Training 1/1 epoch (loss 1.8638): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 852/938 [05:54<00:34, 2.48it/s] Training 1/1 epoch (loss 1.8371): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 852/938 [05:54<00:34, 2.48it/s] Training 1/1 epoch (loss 1.8371): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 853/938 [05:54<00:35, 2.36it/s] Training 1/1 epoch (loss 1.7596): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 853/938 [05:55<00:35, 2.36it/s] Training 1/1 epoch (loss 1.7596): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 854/938 [05:55<00:34, 2.43it/s] Training 1/1 epoch (loss 1.8294): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 854/938 [05:55<00:34, 2.43it/s] Training 1/1 epoch (loss 1.8294): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 855/938 [05:55<00:33, 2.45it/s] Training 1/1 epoch (loss 1.7713): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 855/938 
[05:56<00:33, 2.45it/s] Training 1/1 epoch (loss 1.7713): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 856/938 [05:56<00:35, 2.32it/s] Training 1/1 epoch (loss 1.8513): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 856/938 [05:56<00:35, 2.32it/s] Training 1/1 epoch (loss 1.8513): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 857/938 [05:56<00:33, 2.41it/s] Training 1/1 epoch (loss 1.8562): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 857/938 [05:56<00:33, 2.41it/s] Training 1/1 epoch (loss 1.8562): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 858/938 [05:56<00:32, 2.43it/s] Training 1/1 epoch (loss 1.7970): 91%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 858/938 [05:57<00:32, 2.43it/s] Training 1/1 epoch (loss 1.7970): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 859/938 [05:57<00:32, 2.41it/s] Training 1/1 epoch (loss 1.6890): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 859/938 [05:57<00:32, 2.41it/s] Training 1/1 epoch (loss 1.6890): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 860/938 [05:57<00:30, 2.55it/s] Training 1/1 epoch (loss 1.7829): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 860/938 [05:57<00:30, 2.55it/s] Training 1/1 epoch (loss 1.7829): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 861/938 [05:57<00:30, 2.56it/s] Training 1/1 epoch (loss 1.8923): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 861/938 [05:58<00:30, 2.56it/s] Training 1/1 epoch (loss 1.8923): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 862/938 [05:58<00:28, 2.64it/s] Training 1/1 epoch (loss 1.9231): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 862/938 [05:58<00:28, 2.64it/s] Training 1/1 epoch (loss 1.9231): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 863/938 [05:58<00:27, 2.72it/s] Training 1/1 epoch (loss 1.7840): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 863/938 [05:59<00:27, 2.72it/s] Training 1/1 epoch (loss 1.7840): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 864/938 [05:59<00:28, 2.63it/s] Training 1/1 epoch (loss 1.8900): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 864/938 [05:59<00:28, 2.63it/s] Training 1/1 epoch (loss 1.8900): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 865/938 [05:59<00:27, 2.70it/s] Training 1/1 epoch (loss 1.6919): 
92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 865/938 [05:59<00:27, 2.70it/s] Training 1/1 epoch (loss 1.6919): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 866/938 [05:59<00:27, 2.65it/s] Training 1/1 epoch (loss 1.5714): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 866/938 [06:00<00:27, 2.65it/s] Training 1/1 epoch (loss 1.5714): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 867/938 [06:00<00:28, 2.48it/s] Training 1/1 epoch (loss 1.7632): 92%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 867/938 [06:00<00:28, 2.48it/s] Training 1/1 epoch (loss 1.7632): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 868/938 [06:00<00:28, 2.47it/s] Training 1/1 epoch (loss 1.7563): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 868/938 [06:01<00:28, 2.47it/s] Training 1/1 epoch (loss 1.7563): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 869/938 [06:01<00:28, 2.41it/s] Training 1/1 epoch (loss 1.8699): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 869/938 [06:01<00:28, 2.41it/s] Training 1/1 epoch (loss 1.8699): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 870/938 [06:01<00:27, 2.46it/s] Training 1/1 epoch (loss 1.7150): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 870/938 [06:01<00:27, 2.46it/s] Training 1/1 epoch (loss 1.7150): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 871/938 [06:01<00:26, 2.51it/s] Training 1/1 epoch (loss 1.8086): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 871/938 [06:02<00:26, 2.51it/s] Training 1/1 epoch (loss 1.8086): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 872/938 [06:02<00:26, 2.50it/s] Training 1/1 epoch (loss 1.9054): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 872/938 [06:02<00:26, 2.50it/s] Training 1/1 epoch (loss 1.9054): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 873/938 [06:02<00:25, 2.59it/s] Training 1/1 epoch (loss 1.8263): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 873/938 [06:03<00:25, 2.59it/s] Training 1/1 epoch (loss 1.8263): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 874/938 [06:03<00:24, 2.59it/s] Training 1/1 epoch (loss 1.8884): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 874/938 [06:03<00:24, 2.59it/s] Training 1/1 epoch (loss 1.8884): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 875/938 [06:03<00:25, 
2.51it/s] Training 1/1 epoch (loss 1.7388): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 875/938 [06:03<00:25, 2.51it/s] Training 1/1 epoch (loss 1.7388): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 876/938 [06:03<00:24, 2.49it/s] Training 1/1 epoch (loss 1.7422): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 876/938 [06:04<00:24, 2.49it/s] Training 1/1 epoch (loss 1.7422): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 877/938 [06:04<00:25, 2.41it/s] Training 1/1 epoch (loss 1.7161): 93%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 877/938 [06:04<00:25, 2.41it/s] Training 1/1 epoch (loss 1.7161): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 878/938 [06:04<00:28, 2.11it/s] Training 1/1 epoch (loss 1.7426): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 878/938 [06:05<00:28, 2.11it/s] Training 1/1 epoch (loss 1.7426): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 879/938 [06:05<00:28, 2.06it/s] Training 1/1 epoch (loss 1.9577): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž| 879/938 [06:05<00:28, 2.06it/s] Training 1/1 epoch (loss 1.9577): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 880/938 [06:05<00:26, 2.20it/s] Training 1/1 epoch (loss 1.7794): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 880/938 [06:06<00:26, 2.20it/s] Training 1/1 epoch (loss 1.7794): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 881/938 [06:06<00:26, 2.19it/s] Training 1/1 epoch (loss 1.8031): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 881/938 [06:06<00:26, 2.19it/s] Training 1/1 epoch (loss 1.8031): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 882/938 [06:06<00:24, 2.29it/s] Training 1/1 epoch (loss 1.8970): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 882/938 [06:07<00:24, 2.29it/s] Training 1/1 epoch (loss 1.8970): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 883/938 [06:07<00:25, 2.13it/s] Training 1/1 epoch (loss 1.9051): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 883/938 [06:07<00:25, 2.13it/s] Training 1/1 epoch (loss 1.9051): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 884/938 [06:07<00:23, 2.25it/s] Training 1/1 epoch (loss 1.8238): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 884/938 [06:08<00:23, 2.25it/s] Training 1/1 epoch (loss 1.8238): 
94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 885/938 [06:08<00:26, 2.00it/s] Training 1/1 epoch (loss 1.7562): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 885/938 [06:08<00:26, 2.00it/s] Training 1/1 epoch (loss 1.7562): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 886/938 [06:08<00:23, 2.17it/s] Training 1/1 epoch (loss 1.7998): 94%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 886/938 [06:08<00:23, 2.17it/s] Training 1/1 epoch (loss 1.7998): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 887/938 [06:08<00:21, 2.33it/s] Training 1/1 epoch (loss 1.7576): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 887/938 [06:09<00:21, 2.33it/s] Training 1/1 epoch (loss 1.7576): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 888/938 [06:09<00:20, 2.42it/s] Training 1/1 epoch (loss 1.8177): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 888/938 [06:09<00:20, 2.42it/s] Training 1/1 epoch (loss 1.8177): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 889/938 [06:09<00:20, 2.45it/s] Training 1/1 epoch (loss 1.6451): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 889/938 [06:10<00:20, 2.45it/s] Training 1/1 epoch (loss 1.6451): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 890/938 [06:10<00:19, 2.46it/s] Training 1/1 epoch (loss 1.8047): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 890/938 [06:10<00:19, 2.46it/s] Training 1/1 epoch (loss 1.8047): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 891/938 [06:10<00:18, 2.49it/s] Training 1/1 epoch (loss 1.7994): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 891/938 [06:10<00:18, 2.49it/s] Training 1/1 epoch (loss 1.7994): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 892/938 [06:10<00:17, 2.58it/s] Training 1/1 epoch (loss 1.9425): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 892/938 [06:11<00:17, 2.58it/s] Training 1/1 epoch (loss 1.9425): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 893/938 [06:11<00:16, 2.70it/s] Training 1/1 epoch (loss 1.7254): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 893/938 [06:11<00:16, 2.70it/s] Training 1/1 epoch (loss 1.7254): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 894/938 [06:11<00:17, 2.48it/s] Training 1/1 epoch (loss 1.7886): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 894/938 [06:12<00:17, 
2.48it/s] Training 1/1 epoch (loss 1.7886): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 895/938 [06:12<00:17, 2.45it/s] Training 1/1 epoch (loss 1.7447): 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 895/938 [06:12<00:17, 2.45it/s] Training 1/1 epoch (loss 1.7447): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 896/938 [06:12<00:16, 2.53it/s] Training 1/1 epoch (loss 1.6706): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 896/938 [06:12<00:16, 2.53it/s] Training 1/1 epoch (loss 1.6706): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 897/938 [06:12<00:16, 2.51it/s] Training 1/1 epoch (loss 1.7704): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 897/938 [06:13<00:16, 2.51it/s] Training 1/1 epoch (loss 1.7704): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 898/938 [06:13<00:15, 2.54it/s] Training 1/1 epoch (loss 1.7929): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 898/938 [06:13<00:15, 2.54it/s] Training 1/1 epoch (loss 1.7929): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 899/938 [06:13<00:17, 2.20it/s] Training 1/1 epoch (loss 1.7835): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 899/938 [06:14<00:17, 2.20it/s] Training 1/1 epoch (loss 1.7835): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 900/938 [06:14<00:18, 2.08it/s] Training 1/1 epoch (loss 1.6787): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 900/938 [06:14<00:18, 2.08it/s] Training 1/1 epoch (loss 1.6787): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 901/938 [06:14<00:16, 2.28it/s] Training 1/1 epoch (loss 1.6417): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 901/938 [06:15<00:16, 2.28it/s] Training 1/1 epoch (loss 1.6417): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 902/938 [06:15<00:16, 2.25it/s] Training 1/1 epoch (loss 1.7719): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 902/938 [06:15<00:16, 2.25it/s] Training 1/1 epoch (loss 1.7719): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 903/938 [06:15<00:16, 2.15it/s] Training 1/1 epoch (loss 1.8317): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 903/938 [06:16<00:16, 2.15it/s] Training 1/1 epoch (loss 1.8317): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 904/938 [06:16<00:15, 2.19it/s] Training 1/1 epoch (loss 1.8734): 
96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 904/938 [06:16<00:15, 2.19it/s] Training 1/1 epoch (loss 1.8734): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 905/938 [06:16<00:15, 2.19it/s] Training 1/1 epoch (loss 1.8100): 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 905/938 [06:17<00:15, 2.19it/s] Training 1/1 epoch (loss 1.8100): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 906/938 [06:17<00:15, 2.08it/s] Training 1/1 epoch (loss 1.6960): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 906/938 [06:17<00:15, 2.08it/s] Training 1/1 epoch (loss 1.6960): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 907/938 [06:17<00:14, 2.14it/s] Training 1/1 epoch (loss 1.7494): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 907/938 [06:18<00:14, 2.14it/s] Training 1/1 epoch (loss 1.7494): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 908/938 [06:18<00:13, 2.16it/s] Training 1/1 epoch (loss 1.7398): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 908/938 [06:18<00:13, 2.16it/s] Training 1/1 epoch (loss 1.7398): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 909/938 [06:18<00:13, 2.21it/s] Training 1/1 epoch (loss 1.8397): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 909/938 [06:18<00:13, 2.21it/s] Training 1/1 epoch (loss 1.8397): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 910/938 [06:18<00:11, 2.42it/s] Training 1/1 epoch (loss 1.7759): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 910/938 [06:19<00:11, 2.42it/s] Training 1/1 epoch (loss 1.7759): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 911/938 [06:19<00:10, 2.53it/s] Training 1/1 epoch (loss 1.7842): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 911/938 [06:19<00:10, 2.53it/s] Training 1/1 epoch (loss 1.7842): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 912/938 [06:19<00:09, 2.61it/s] Training 1/1 epoch (loss 1.8190): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 912/938 [06:19<00:09, 2.61it/s] Training 1/1 epoch (loss 1.8190): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 913/938 [06:19<00:09, 2.55it/s] Training 1/1 epoch (loss 1.8116): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 913/938 [06:20<00:09, 2.55it/s] Training 1/1 epoch (loss 1.8116): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 914/938 [06:20<00:09, 
2.53it/s] Training 1/1 epoch (loss 1.7522): 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹| 914/938 [06:20<00:09, 2.53it/s] Training 1/1 epoch (loss 1.7522): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 915/938 [06:20<00:09, 2.52it/s] Training 1/1 epoch (loss 1.8421): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 915/938 [06:21<00:09, 2.52it/s] Training 1/1 epoch (loss 1.8421): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 916/938 [06:21<00:08, 2.70it/s] Training 1/1 epoch (loss 1.7289): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 916/938 [06:21<00:08, 2.70it/s] Training 1/1 epoch (loss 1.7289): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 917/938 [06:21<00:08, 2.61it/s] Training 1/1 epoch (loss 1.7255): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 917/938 [06:21<00:08, 2.61it/s] Training 1/1 epoch (loss 1.7255): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 918/938 [06:21<00:08, 2.48it/s] Training 1/1 epoch (loss 1.7002): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 918/938 [06:22<00:08, 2.48it/s] Training 1/1 epoch (loss 1.7002): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 919/938 [06:22<00:07, 2.44it/s] Training 1/1 epoch (loss 1.7690): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 919/938 [06:22<00:07, 2.44it/s] Training 1/1 epoch (loss 1.7690): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 920/938 [06:22<00:07, 2.43it/s] Training 1/1 epoch (loss 1.7258): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 920/938 [06:23<00:07, 2.43it/s] Training 1/1 epoch (loss 1.7258): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 921/938 [06:23<00:06, 2.51it/s] Training 1/1 epoch (loss 1.7297): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 921/938 [06:23<00:06, 2.51it/s] Training 1/1 epoch (loss 1.7297): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 922/938 [06:23<00:06, 2.64it/s] Training 1/1 epoch (loss 1.6881): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 922/938 [06:23<00:06, 2.64it/s] Training 1/1 epoch (loss 1.6881): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 923/938 [06:23<00:05, 2.61it/s] Training 1/1 epoch (loss 1.7419): 98%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 923/938 [06:24<00:05, 2.61it/s] Training 1/1 epoch (loss 1.7419): 
99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 924/938 [06:24<00:05, 2.57it/s] Training 1/1 epoch (loss 1.8167): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 924/938 [06:24<00:05, 2.57it/s] Training 1/1 epoch (loss 1.8167): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 925/938 [06:24<00:05, 2.55it/s] Training 1/1 epoch (loss 1.6783): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 925/938 [06:25<00:05, 2.55it/s] Training 1/1 epoch (loss 1.6783): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 926/938 [06:25<00:04, 2.46it/s] Training 1/1 epoch (loss 1.8280): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š| 926/938 [06:25<00:04, 2.46it/s] Training 1/1 epoch (loss 1.8280): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 927/938 [06:25<00:04, 2.52it/s] Training 1/1 epoch (loss 1.7765): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 927/938 [06:25<00:04, 2.52it/s] Training 1/1 epoch (loss 1.7765): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 928/938 [06:25<00:03, 2.51it/s] Training 1/1 epoch (loss 1.6783): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 928/938 [06:26<00:03, 2.51it/s] Training 1/1 epoch (loss 1.6783): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 929/938 [06:26<00:03, 2.46it/s] Training 1/1 epoch (loss 1.7923): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 929/938 [06:26<00:03, 2.46it/s] Training 1/1 epoch (loss 1.7923): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 930/938 [06:26<00:03, 2.47it/s] Training 1/1 epoch (loss 1.6811): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 930/938 [06:27<00:03, 2.47it/s] Training 1/1 epoch (loss 1.6811): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 931/938 [06:27<00:02, 2.48it/s] Training 1/1 epoch (loss 1.7944): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 931/938 [06:27<00:02, 2.48it/s] Training 1/1 epoch (loss 1.7944): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 932/938 [06:27<00:02, 2.51it/s] Training 1/1 epoch (loss 1.7992): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 932/938 [06:27<00:02, 2.51it/s] Training 1/1 epoch (loss 1.7992): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 933/938 [06:27<00:01, 2.60it/s] Training 1/1 epoch (loss 1.7764): 99%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 933/938 [06:28<00:01, 
2.60it/s] Training 1/1 epoch (loss 1.7764): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 934/938 [06:28<00:01, 2.50it/s] Training 1/1 epoch (loss 1.9447): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 934/938 [06:28<00:01, 2.50it/s] Training 1/1 epoch (loss 1.9447): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 935/938 [06:28<00:01, 2.53it/s] Training 1/1 epoch (loss 1.8155): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 935/938 [06:29<00:01, 2.53it/s] Training 1/1 epoch (loss 1.8155): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 936/938 [06:29<00:00, 2.55it/s] Training 1/1 epoch (loss 1.7864): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 936/938 [06:29<00:00, 2.55it/s] Training 1/1 epoch (loss 1.7864): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 937/938 [06:29<00:00, 2.49it/s] Training 1/1 epoch (loss 1.8080): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 937/938 [06:29<00:00, 2.49it/s] Training 1/1 epoch (loss 1.8080): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 938/938 [06:29<00:00, 2.47it/s] Training 1/1 epoch (loss 1.8080): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 938/938 [06:29<00:00, 2.41it/s]
tokenizer config file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/Qwen1.5-0.5B/Qwen1.5-0.5B-s3-Q1-30k/tokenizer_config.json
Special tokens file saved in /aifs4su/hansirui_1st/boyuan/resist/setting3-safety/Qwen1.5-0.5B/Qwen1.5-0.5B-s3-Q1-30k/special_tokens_map.json
wandb: ERROR Problem finishing run
Exception ignored in atexit callback: <bound method rank_zero_only.<locals>.wrapper of <safe_rlhf.logger.Logger object at 0x1550a56c12d0>>
Traceback (most recent call last):
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/utils.py", line 212, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/hansirui_1st/jiayi/resist/setting3/safe_rlhf/logger.py", line 183, in close
    self.wandb.finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 449, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 391, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2106, in finish
    return self._finish(exit_code)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2127, in _finish
    self._atexit_cleanup(exit_code=exit_code)
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2352, in _atexit_cleanup
    self._on_finish()
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/wandb_run.py", line 2609, in _on_finish
    wait_with_progress(
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 24, in wait_with_progress
    return wait_all_with_progress(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/mailbox/wait_with_progress.py", line 87, in wait_all_with_progress
    return asyncio_compat.run(progress_loop_with_timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/site-packages/wandb/sdk/lib/asyncio_compat.py", line 27, in run
    future = executor.submit(runner.run, fn)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/aifs4su/hansirui_1st/miniconda3/envs/by-align/lib/python3.11/concurrent/futures/thread.py", line 169, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown