<p align="center">
<img src="https://avatars.githubusercontent.com/u/12619994?s=200&v=4" width="150">
<br />
<a href="LICENSE"><img alt="Apache License" src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" /></a>
</p>
--------------------------------------------------------------------------------
# KD-NLP
This repository is a collection of Knowledge Distillation (KD) methods implemented by the Huawei Montreal NLP team.
<details><summary>Included Projects</summary><p>

* [**MATE-KD**](MATE-KD)
  * KD for model compression; studies adversarial training to improve student accuracy using only the teacher's logits, as in standard KD (sketched below).
  * [MATE-KD: Masked Adversarial TExt, a Companion to Knowledge Distillation](https://arxiv.org/abs/2105.05912v1)
* [**Combined-KD**](Combined-KD)
  * Proposes Combined-KD (ComKD), which takes advantage of data augmentation and progressive training.
  * [How to Select One Among All? An Extensive Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding](https://arxiv.org/abs/2109.05696v1)
* [**Minimax-kNN**](Minimax-kNN)
  * A sample-efficient semi-supervised kNN data augmentation technique.
  * [Not Far Away, Not So Close: Sample Efficient Nearest Neighbour Data Augmentation via MiniMax](https://aclanthology.org/2021.findings-acl.309/)
* [**Glitter**](Glitter)
  * A universal sample-efficient framework for incorporating augmented data into training.
  * [When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation](https://aclanthology.org/2022.findings-acl.84/)

</p></details>
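The projects above build on the standard KD objective, where a compact student is trained to match the softened output distribution of a larger teacher while also fitting the ground-truth labels. As a rough orientation only (not code from any of these repositories), a minimal PyTorch sketch of that loss, assuming student and teacher logits of shape `(batch, num_classes)` and illustrative hyperparameters `T` and `alpha`, might look like:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard KD loss: a blend of a soft-target KL term and hard-label cross-entropy.

    student_logits, teacher_logits: (batch, num_classes) tensors.
    T: softmax temperature; alpha: weight on the distillation term.
    (Names and default values here are illustrative, not taken from these repos.)
    """
    # Soft targets from the teacher; detach so no gradient flows into the teacher.
    soft_targets = F.softmax(teacher_logits.detach() / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # KL divergence between teacher and student distributions, scaled by T^2
    # (as in Hinton et al.) so gradient magnitudes stay comparable across temperatures.
    distill = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    # Ordinary supervised loss on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1.0 - alpha) * ce
```

In a training loop this would be called as, e.g., `loss = kd_loss(student(batch), teacher(batch), batch_labels)`; each project in this repository refines this basic setup in its own way (adversarial perturbations, data augmentation, progressive training, or sample selection).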
# License
This project is licensed under the Apache License 2.0.