```python
from transformers import BertConfig, BertModel
import torch

config = BertConfig(
    vocab_size_or_config_json_file=32000,
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
    torchscript=True,
)

# Instantiating the model
model = BertModel(config)

# The model needs to be in evaluation mode
model.eval()

# If you are instantiating the model with `from_pretrained` you can also easily set the TorchScript flag
model = BertModel.from_pretrained("google-bert/bert-base-uncased", torchscript=True)

# Creating the trace (tokens_tensor and segments_tensors are the dummy inputs prepared earlier)
traced_model = torch.jit.trace(model, [tokens_tensor, segments_tensors])
torch.jit.save(traced_model, "traced_bert.pt")
```
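`torch.jit.trace` records the operations executed for the given example inputs and freezes them into a graph. The mechanics can be sketched end-to-end with a small stand-in module (a hypothetical `TinyEncoder`, not a Transformers class) that mimics BERT's two-tensor call signature, without downloading any weights:

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Stand-in module with a BERT-like (tokens, segments) signature."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(100, 16)

    def forward(self, tokens_tensor, segments_tensors):
        return self.embed(tokens_tensor) + self.embed(segments_tensors)


model = TinyEncoder().eval()  # evaluation mode, as above
tokens_tensor = torch.randint(0, 100, (1, 7))
segments_tensors = torch.zeros(1, 7, dtype=torch.long)

# Trace with example inputs; the traced module replays the recorded graph
traced_model = torch.jit.trace(model, [tokens_tensor, segments_tensors])

# In eval mode (no dropout), the traced output matches eager execution
assert torch.allclose(
    traced_model(tokens_tensor, segments_tensors),
    model(tokens_tensor, segments_tensors),
)
```

Because tracing only records the operations seen for these particular inputs, data-dependent control flow would be baked in; that is why the example inputs should be representative of what the model will see at inference time.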
## Loading a model

Now you can load the previously saved BertModel, `traced_bert.pt`, from disk and use it on the previously initialized `dummy_input`:

```python
loaded_model = torch.jit.load("traced_bert.pt")
loaded_model.eval()

all_encoder_layers, pooled_output = loaded_model(*dummy_input)
```
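The save/load roundtrip can be exercised with the same kind of small stand-in module; this sketch writes the traced model to an in-memory buffer (saving to a file path such as `"traced_bert.pt"` works identically) and checks the loaded copy against the original:

```python
import io

import torch
from torch import nn

model = nn.Sequential(nn.Linear(8, 8), nn.Tanh()).eval()
dummy_input = (torch.randn(2, 8),)

traced = torch.jit.trace(model, dummy_input)

# torch.jit.save/load accept file paths or file-like objects
buffer = io.BytesIO()
torch.jit.save(traced, buffer)
buffer.seek(0)

loaded = torch.jit.load(buffer)
loaded.eval()

# Unpack the dummy input exactly as in loaded_model(*dummy_input) above
assert torch.allclose(loaded(*dummy_input), model(*dummy_input))
```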
## Using a traced model for inference

Use the traced model for inference by calling its `__call__` dunder method:

```python
traced_model(tokens_tensor, segments_tensors)
```
## Deploy Hugging Face TorchScript models to AWS with the Neuron SDK

AWS introduced the Amazon EC2 Inf1 instance family for low-cost, high-performance machine learning inference in the cloud.