Smart partitioning and tiling algorithms let each GPU send and receive only very small amounts of data during offloading, so a modern NVMe drive is fast enough to extend the total memory pool available to your training process even further.
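As a rough illustration, the sketch below builds a configuration that offloads both optimizer state and parameters to NVMe. It assumes the offloading described here is DeepSpeed's ZeRO stage 3 NVMe offload, which the text above does not name explicitly. The helper `build_nvme_offload_config` and the path `/local_nvme` are hypothetical placeholders, and only a minimal subset of options is shown.

```python
import json


def build_nvme_offload_config(nvme_path: str) -> dict:
    """Return a minimal config dict that offloads to NVMe.

    Assumption: the keys follow DeepSpeed's `zero_optimization` config
    section; `nvme_path` must point at a directory on a local NVMe drive.
    """
    return {
        "zero_optimization": {
            # Stage 3 partitions parameters, gradients and optimizer state.
            "stage": 3,
            # Optimizer state is moved off the GPU onto the NVMe drive.
            "offload_optimizer": {
                "device": "nvme",
                "nvme_path": nvme_path,
                "pin_memory": True,
            },
            # Model parameters are offloaded the same way.
            "offload_param": {
                "device": "nvme",
                "nvme_path": nvme_path,
                "pin_memory": True,
            },
        }
    }


# "/local_nvme" is a placeholder, not a path taken from the text above.
config = build_nvme_offload_config("/local_nvme")
config_json = json.dumps(config, indent=2)
```

The resulting `config_json` string could be written to a file and passed to the training launcher. Buffer sizes and asynchronous I/O settings are left at their defaults here and would likely need tuning for a real drive.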