Name of Project: DeBERTa v3 Large NLI
Proposal in one sentence: The idea is to train a DeBERTa v3 large model on multiple NLI datasets such as MNLI, SNLI, and ANLI.
Description of the project and what problem is it solving: DeBERTa v3 large is the latest upgrade in the DeBERTa series of language models, and NLI is a very popular and useful task. The hope with this project is to build a state-of-the-art NLI model that performs well not only on test and validation sets but also on real-world problems. There are 2500+ text classification models on the Hugging Face model hub alone. Both the research and the applications in this area are growing rapidly, and hopefully this model can become the staple model that people use directly and build on in other projects such as text generation, image captioning, and text-to-image models.
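For anyone who is not aware of what NLI is:
NLI stands for natural language inference: given a premise and a hypothesis, the model decides whether the premise entails the hypothesis, contradicts it, or is neutral towards it. For instance, the premise "A man is playing a guitar" entails the hypothesis "A person is making music". Some examples of projects/products that can use this are sentiment analysis, question answering (from a paragraph of text), etc. For further explanation, see the Stanford Natural Language Processing Group's page on the topic.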
Also, since an NLI model can be used as a zero-shot classifier, the model I am proposing can be adapted to your own purpose very easily, for example as a document similarity model or a feature extraction model.
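To make the zero-shot idea concrete, here is a minimal sketch of how entailment scores can be turned into class probabilities. It is not the proposed model: `zero_shot_classify`, the hypothesis template, and `toy_entailment_logit` are all hypothetical names made up for illustration, and the toy scorer only counts shared words so that the example runs without downloading anything. In practice the scorer would be the trained NLI model's entailment logit for a (premise, hypothesis) pair.

```python
import math
from typing import Callable, Dict, Sequence


def zero_shot_classify(
    text: str,
    labels: Sequence[str],
    entailment_logit: Callable[[str, str], float],
    template: str = "This example is about {}.",  # assumed template, not from the proposal
) -> Dict[str, float]:
    """Score each label by how strongly `text` entails a hypothesis built from it."""
    # One hypothesis per candidate label, e.g. "This example is about football."
    hypotheses = [template.format(label) for label in labels]
    logits = [entailment_logit(text, hyp) for hyp in hypotheses]

    # Softmax over the entailment logits (shifted by the max for numerical stability).
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


def toy_entailment_logit(premise: str, hypothesis: str) -> float:
    """Stand-in scorer (NOT a real NLI model): number of words shared by both texts."""
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().rstrip(".").split())
    return float(len(premise_words & hypothesis_words))


if __name__ == "__main__":
    scores = zero_shot_classify(
        "the team won the football match",
        ["football", "cooking"],
        toy_entailment_logit,
    )
    print(scores)
```

Swapping `toy_entailment_logit` for a function that calls a real NLI model is the only change needed to classify against an arbitrary label set, which is why no task-specific fine-tuning is required.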
- A model that at least reaches SOTA on multiple datasets