The model performs intent classification for text accepted by smart devices, covering 60 intent categories; examples can be found on Hugging Face. It uses BERT + NN + knowledge distillation + BiLSTM, deployed on Heroku. This is a personal exploration project, so the models were not deliberately tuned. Before knowledge distillation, the model achieved an accuracy of around 0.84; after distillation, it still retained an accuracy of around 0.79 😄. GitHub Link
I then converted the model to ONNX (Open Neural Network Exchange) format and leveraged ONNX Runtime to accelerate inference by roughly 6x, reducing time per call from 0.026 to 0.0043 seconds.
Finally, I deployed the compressed model on a Heroku server, using Gunicorn and Flask-RESTful for the app backend, with the model stored on Amazon S3.
• Used BERT as the encoder and a Neural Network as the decoder to classify text intentions.
• Self-studied Knowledge Distillation. Used the trained BERT-NN as the teacher model and BiLSTM as the student model to compress the model size from 439MB to 70MB while preserving comparable accuracy.
• Leveraged ONNX Runtime to accelerate inference speed by 6x, reducing time per call from 0.026 to 0.0043 seconds.
• Deployed the compressed model on a Heroku server, using Gunicorn and Flask-RESTful for the app backend, with the model stored on Amazon S3.
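The distillation objective behind the teacher–student setup above can be sketched in plain Python. This is an illustrative implementation of the standard soft-target loss (Hinton-style, with temperature scaling), not the project's actual training code; the `T` and `alpha` values are example choices.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T yields a softer distribution
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_idx, T=2.0, alpha=0.5):
    # Soft loss: cross-entropy between the teacher's and student's
    # temperature-softened distributions, scaled by T^2 so gradient
    # magnitudes stay comparable to the hard loss
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = -sum(t * math.log(s) for t, s in zip(p_teacher, p_student)) * T * T
    # Hard loss: standard cross-entropy against the ground-truth label
    hard = -math.log(softmax(student_logits)[true_idx])
    # Blend the two terms; alpha trades off imitation vs. supervision
    return alpha * soft + (1 - alpha) * hard

teacher = [2.0, 0.0, 0.0]          # BERT-NN teacher logits (toy example)
good = distillation_loss([2.0, 0.0, 0.0], teacher, true_idx=0)
bad = distillation_loss([0.0, 2.0, 0.0], teacher, true_idx=0)
print(good < bad)  # a student matching the teacher scores a lower loss
```

The soft targets carry more information than one-hot labels (e.g., which wrong intents the teacher considers plausible), which is what lets the much smaller BiLSTM recover most of the teacher's accuracy.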
UPDATE: Deprecated because Heroku stopped providing free Dyno resources, but here is a video demonstration of the app.
| Top likely intents | Probability |
|---|---|