# The next line has to be run before loading the model with AutoModelForSeq2SeqLM.from_pretrained(model_name).
# Otherwise the model is first loaded normally and only partitioned at forward time, which is
# less efficient and may fail when little CPU RAM is available.
dschf = HfDeepSpeedConfig(ds_config)  # keep this object alive
# Now the model can be loaded.
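The ordering described above can be sketched as follows. This is a minimal illustration, not the original author's full script: the helper names `build_zero3_config` and `load_partitioned` are hypothetical, the config holds only a few assumed keys, and a real run also needs `deepspeed` installed and a distributed launcher.

```python
def build_zero3_config(micro_batch_size: int = 1) -> dict:
    """Return a minimal DeepSpeed config dict with ZeRO stage 3 enabled (assumed values)."""
    return {
        "zero_optimization": {"stage": 3},
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "fp16": {"enabled": False},
    }


def load_partitioned(model_name: str, ds_config: dict):
    """Create HfDeepSpeedConfig first, then load the model (hypothetical helper)."""
    # Imported lazily so the config helper above works without these packages.
    from transformers import AutoModelForSeq2SeqLM
    from transformers.integrations import HfDeepSpeedConfig

    # Must be created before from_pretrained so the model can be partitioned at load time.
    dschf = HfDeepSpeedConfig(ds_config)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # Return dschf as well so the caller keeps a live reference to it.
    return dschf, model


ds_config = build_zero3_config()
# dschf, model = load_partitioned(model_name, ds_config)  # requires deepspeed and GPUs
```

Returning `dschf` together with the model is one way to keep the object alive for as long as the model is in use; holding it in a module-level variable would serve the same purpose.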