Big Model Warehouse. ModelCenter implements pre-trained language models (PLMs) on top of the BMTrain backend. It supports efficient, low-resource, and extensible model usage and distributed training.
Easy To Use
Compared with DeepSpeed and Megatron, ModelCenter offers better and more flexible code packaging, Python environments that are easy to configure, and training code written in a uniform PyTorch style.
More Efficient Memory Utilization
Our implementation reduces the memory footprint several-fold, allowing larger batch sizes and thus more efficient use of the GPU's computational power.
Efficient Distributed Training With Low Resources
With the support of BMTrain, ModelCenter can easily extend ZeRO-3 optimization to any PLM, and it optimizes communication and time scheduling for faster distributed training.
Powerful Performance
Thanks to BMTrain, ModelCenter achieves strong performance compared with other popular frameworks.
Easy Usage
ModelCenter follows the usage conventions of Hugging Face Transformers, which lowers the barrier to getting started; the training speedup can be obtained with a simple drop-in replacement.
Original code vs. code after replacement:
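As a hedged illustration of the drop-in replacement described above (the exact class names and import paths are assumptions based on ModelCenter's Transformers-like API; both snippets need the respective packages installed and will download model weights):

```python
# Original: Hugging Face Transformers
from transformers import BertModel
model = BertModel.from_pretrained("bert-base-uncased")

# After replacement: ModelCenter (class name `Bert` assumed)
from model_center.model import Bert
model = Bert.from_pretrained("bert-base-uncased")
```

The rest of the training loop stays in plain PyTorch style; only the model import and construction change, which is what makes the speedup a near drop-in swap.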
Supported Models
BERT: bert-base-cased, bert-base-uncased, bert-large-cased, bert-large-uncased, bert-base-chinese, bert-base-multilingual-cased
CPM-1 / GPT: CPM-1 (large), GPT-2 (base), GPT-2 (medium), GPT-2 (large), GPT-2 (XL), GPT-J (6B)
CPM-2 / T5: CPM-2 (large), T5-small, T5-base, T5-large, T5 (3B), T5 (11B)