D
DataParallel (DP)
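Parallelism technique for training on multiple GPUs in which the same model setup is replicated on every GPU, with each
replica receiving a distinct slice of the data. The replicas process their slices in parallel and are synchronized at
the end of each training step.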
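The idea can be sketched in plain Python, without any GPU or framework. The following is only a toy illustration under stated assumptions: the one-parameter model `y = w * x`, the squared-error loss, and the helper names `shard`, `local_grad_sum`, and `data_parallel_step` are all hypothetical and not part of any library. The replicas are also simulated one after another here, whereas on real hardware they would run at the same time.

```python
# Toy sketch of data parallelism (hypothetical names, no GPUs involved).
# Model: y = w * x with squared-error loss; every "replica" shares the same w.

def shard(batch, num_replicas):
    """Split a batch into num_replicas contiguous, near-equal slices."""
    size, rem = divmod(len(batch), num_replicas)
    shards, start = [], 0
    for i in range(num_replicas):
        end = start + size + (1 if i < rem else 0)
        shards.append(batch[start:end])
        start = end
    return shards


def local_grad_sum(w, data):
    """Sum of d/dw (w * x - y)**2 over one replica's slice."""
    return sum(2.0 * (w * x - y) * x for x, y in data)


def data_parallel_step(w, batch, num_replicas, lr):
    """One training step: shard the batch, compute local gradients, synchronize."""
    shards = shard(batch, num_replicas)
    # Each replica uses the same w but sees only its own slice of the batch.
    partial_sums = [local_grad_sum(w, s) for s in shards]
    # Synchronization: combine the partial sums into the full-batch mean gradient,
    # so every replica would apply the identical update.
    grad = sum(partial_sums) / len(batch)
    return w - lr * grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # points on y = 2x
w = 0.0
for _ in range(50):
    w = data_parallel_step(w, batch, num_replicas=2, lr=0.05)
```

Because the partial gradient sums are combined before the update, the step gives the same result (up to floating-point rounding) as processing the whole batch on a single device; only the work is divided.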