@neuralmagic
Neural Magic
1 year
Optimizing ML models for deployment is not optional, particularly for enterprises that want to lower their computing costs while improving production performance. Our team suggests three model optimization techniques to consider before deploying your next model. A 🧵:
1. AC/DC (Alternating Compressed/DeCompressed training) is a training-aware compression technique that trains dense and sparse models simultaneously. It simplifies training workflows and outperforms existing sparse training methods in accuracy. @dtransposed explains AC/DC in this video:
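At a high level, AC/DC alternates between dense ("decompressed") and sparse ("compressed") training phases so that both models are co-trained. A minimal pure-Python sketch of that schedule on a toy quadratic objective (the function names, per-step mask recomputation, and all hyperparameters here are our own illustrative assumptions, not Neural Magic's implementation):

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask keeping the largest-magnitude (1 - sparsity) fraction."""
    k = int(len(weights) * (1.0 - sparsity))
    keep = set(sorted(range(len(weights)), key=lambda i: -abs(weights[i]))[:k])
    return [1.0 if i in keep else 0.0 for i in range(len(weights))]

def acdc_train(weights, grad_fn, steps, phase_len=5, sparsity=0.5, lr=0.1):
    """Alternate decompressed (dense) and compressed (sparse) training phases."""
    for step in range(steps):
        if (step // phase_len) % 2 == 1:
            mask = magnitude_mask(weights, sparsity)   # compressed phase: prune
        else:
            mask = [1.0] * len(weights)                # decompressed phase: dense
        grads = grad_fn(weights)
        weights = [(w - lr * g) * m for w, g, m in zip(weights, grads, mask)]
    final_mask = magnitude_mask(weights, sparsity)     # ship the sparse model
    return [w * m for w, m in zip(weights, final_mask)]

# Toy objective: sum_i (w_i - t_i)^2, so the gradient is 2 * (w_i - t_i).
targets = [3.0, 0.1, -2.0, 0.05]
grad_fn = lambda w: [2 * (wi - ti) for wi, ti in zip(w, targets)]
sparse_w = acdc_train([0.0] * 4, grad_fn, steps=40)
# The two large-target weights survive; the near-zero ones are pruned.
```

In the real method the sparse phases let the surviving weights adapt to the pruning mask, which is why the final sparse model recovers more accuracy than pruning a fully dense model once at the end.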
2. OBC/OBQ is a model quantization and optimization technique that compresses a model in a single shot, without retraining and with minimal computing cost. Minutes of work can yield a 4X speedup. Learn how at our next webinar:
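OBQ itself makes greedy, second-order-aware per-weight decisions; as a much simpler stand-in for the "one shot, no retraining" idea, here is plain symmetric round-to-nearest int8 post-training quantization (the function names and toy weights are our own illustration, not the OBC/OBQ algorithm):

```python
def quantize_int8(weights):
    """One-shot symmetric int8 quantization: a single pass, no retraining."""
    scale = max(abs(w) for w in weights) / 127.0   # map the largest weight to 127
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [qi * scale for qi in q]

weights = [0.5, -1.27, 0.0, 1.27]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Per-weight quantization error is bounded by scale / 2.
```

Storing 8-bit codes plus one scale in place of 32-bit floats is where the memory savings come from; the speedup comes from executing the compressed model with int8 kernels.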
3. The OBS (Optimal Brain Surgeon) pruning technique uses second-order derivatives to remove unimportant weights from a network while adjusting the remaining weights to minimize the resulting error. It maintains accuracy and increases performance while shrinking model size up to 10X. See our NLP example:
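The classic OBS step picks the weight q with the smallest saliency w_q^2 / (2 [H^-1]_qq), where H is the loss Hessian, and compensates the remaining weights by delta_w = -(w_q / [H^-1]_qq) H^-1 e_q. A small pure-Python sketch with a hand-inverted 2x2 Hessian (the example numbers are our own):

```python
def obs_prune_one(w, H_inv):
    """One Optimal Brain Surgeon step: prune the least-salient weight,
    then adjust the remaining weights to minimize the loss increase."""
    n = len(w)
    # Saliency of removing weight i: w_i^2 / (2 * [H^-1]_ii)
    q = min(range(n), key=lambda i: w[i] ** 2 / (2 * H_inv[i][i]))
    coef = w[q] / H_inv[q][q]
    new_w = [w[i] - coef * H_inv[i][q] for i in range(n)]  # delta_w compensation
    new_w[q] = 0.0  # the pruned weight is exactly zero
    return q, new_w

# 2x2 example: H = [[2, 1], [1, 2]] has inverse (1/3) * [[2, -1], [-1, 2]].
H_inv = [[2 / 3, -1 / 3], [-1 / 3, 2 / 3]]
w = [1.0, 0.2]
q, new_w = obs_prune_one(w, H_inv)
# The small weight is pruned and the surviving weight shifts to compensate.
```

The compensation term is what separates OBS from plain magnitude pruning: because the Hessian couples the weights, the survivors move to absorb the error introduced by the removal.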
Our research team collaborates with @ISTAustria to create these cutting-edge model optimization techniques for everyone to use. 🙏 Start with #SparseML to apply these and other methods to your models to enable faster and more efficient AI use cases:
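To give a flavor of the SparseML workflow, pruning is driven by a YAML recipe of modifiers along these lines (the modifier name and fields below are from our recollection of the SparseML docs; treat them as assumptions and verify against the current documentation before use):

```yaml
# Illustrative gradual-magnitude-pruning recipe (field names are assumptions;
# check the SparseML documentation for the exact schema)
modifiers:
    - !GMPruningModifier
        init_sparsity: 0.05
        final_sparsity: 0.85
        start_epoch: 1.0
        end_epoch: 25.0
        update_frequency: 1.0
        params: __ALL_PRUNABLE__
```

The recipe is applied to an existing training loop, so sparsification becomes a configuration change rather than a code rewrite.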