Neural Magic (@neuralmagic) · 2 years ago
We pushed 🤗 BERT performance to new heights by also supporting quantization on top of sparsity. End result: a 7x speedup over the dense model. 📈 You can easily benchmark sparse performance and apply it to your own dataset by following this guide:
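(The post doesn't include the underlying code, but the core idea, quantizing weights that have already been pruned, can be sketched in a few lines. This is an illustrative numpy example, not Neural Magic's actual SparseML/DeepSparse pipeline: magnitude-prune a weight matrix to 80% sparsity, then apply symmetric int8 quantization. The key observation is that zeros quantize exactly to zero, so the sparsity pattern survives and both optimizations compound.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Dense weight matrix; magnitude-prune the smallest 80% of entries to zero.
w = rng.normal(size=(64, 64)).astype(np.float32)
threshold = np.quantile(np.abs(w), 0.8)
w_sparse = np.where(np.abs(w) >= threshold, w, 0.0).astype(np.float32)

# Symmetric int8 quantization on top of the sparse weights.
scale = float(np.abs(w_sparse).max()) / 127.0
w_int8 = np.clip(np.round(w_sparse / scale), -127, 127).astype(np.int8)

# Dequantize to check reconstruction error. Exact zeros map to int8 zero,
# so quantization preserves the pruned sparsity pattern.
w_deq = w_int8.astype(np.float32) * scale

sparsity = float((w_int8 == 0).mean())
max_err = float(np.abs(w_sparse - w_deq).max())
print(f"sparsity={sparsity:.2f} max_abs_error={max_err:.4f}")
```

In a real deployment the win comes from an engine that exploits both properties at once: skipping the zeroed weights and executing the remaining ones in int8 arithmetic, which is where the reported 7x over dense fp32 comes from.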