Buzz - Whisper-based transcription and beyond
Proposal in One Sentence
Use OpenAI’s newly available Whisper model to transcribe content such as meetings, then build on the resulting transcripts to perform tasks such as topic detection and summarization.
Description of the project and what problem it is solving:
Meetings and online content offer a wealth of information. However, unlike text, most audio and video content isn’t easily searchable or discoverable. Such content, especially technical and educational material, can also be harder to follow without proper captions or transcripts. Transcripts improve accessibility and allow further text processing, such as translation, to be carried out on top of them.
OpenAI recently released Whisper, a general-purpose speech recognition model that can also perform speech translation and language identification. It supports languages other than English, which makes it possible to transcribe and translate content on non-English Discord channels as well. The aim of the project is to use the Whisper model to transcribe content such as Algovera’s recorded meetings, making it easier for the community to search through the meeting contents, and to provide this service as a Discord bot. An additional goal is to create a pipeline that combines this multilingual speech recognition with downstream natural language processing tasks: topic modelling to find out which major topics were discussed, summarization to automatically generate Minutes of Meeting (from TL;DR to Too Long Didn’t Listen), emotion detection, etc. An example demo of one such application, where emotion is detected directly from voice/speech in many languages, can be found here
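To make the "searchable meetings" goal concrete, here is a minimal sketch of keyword search over timestamped transcript segments. It assumes segments shaped like the `segments` list in the result of Whisper's `transcribe` call (dictionaries with `start`, `end`, and `text` keys). The helper names `format_timestamp` and `search_segments` and the sample data are hypothetical, chosen only for illustration; this is not the project's actual implementation.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def search_segments(segments: List[Dict], query: str) -> List[str]:
    """Return '[timestamp] text' lines for segments containing the query (case-insensitive)."""
    needle = query.lower()
    return [
        f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}"
        for seg in segments
        if needle in seg["text"].lower()
    ]


# Hypothetical sample data, hand-written to mimic Whisper's segment format.
# With the openai-whisper package installed, real segments could be obtained via:
#   import whisper
#   result = whisper.load_model("base").transcribe("meeting.mp3")
#   segments = result["segments"]
SAMPLE_SEGMENTS = [
    {"start": 0.0, "end": 4.2, "text": " Welcome to the weekly community call."},
    {"start": 4.2, "end": 9.8, "text": " First, an update on the grant proposals."},
    {"start": 3725.0, "end": 3731.5, "text": " Grant voting closes on Friday."},
]

if __name__ == "__main__":
    for line in search_segments(SAMPLE_SEGMENTS, "grant"):
        print(line)
```

A Discord bot command could call something like `search_segments` on stored transcripts and reply with the matching lines, so that members can jump to the relevant point in a recording.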
Grant Deliverables:
- Transcribed demo files (such as recorded meetings/podcasts) chosen by the community
- ML pipeline containing Voice2X (where X is one of: topic detection/summary/emotion)
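One possible shape for the Voice2X pipeline is sketched below: a transcription step followed by pluggable downstream tasks that each consume the transcript. Everything here is an assumption made for illustration. `run_voice2x`, `top_keywords`, `lead_summary`, and `fake_transcribe` are hypothetical names, and the keyword counter and first-sentence summary are deliberately naive stand-ins for real topic-detection and summarization models.

```python
import re
from collections import Counter
from typing import Callable, Dict, List

# Tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "for", "from", "are"}


def top_keywords(transcript: str, k: int = 3) -> List[str]:
    """Naive stand-in for topic detection: the k most frequent non-stop words."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(k)]


def lead_summary(transcript: str, n_sentences: int = 1) -> str:
    """Naive stand-in for summarization: keep the first n sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return " ".join(sentences[:n_sentences])


def run_voice2x(
    audio_path: str,
    transcribe: Callable[[str], str],
    tasks: Dict[str, Callable[[str], object]],
) -> Dict[str, object]:
    """Transcribe once, then run every downstream task on the same transcript."""
    transcript = transcribe(audio_path)
    results: Dict[str, object] = {"transcript": transcript}
    for name, task in tasks.items():
        results[name] = task(transcript)
    return results


def fake_transcribe(audio_path: str) -> str:
    """Placeholder for a Whisper call; returns fixed text so the sketch runs offline."""
    return (
        "Whisper handles the transcription. "
        "The Discord bot calls Whisper for each recording. "
        "Results from Whisper are posted back to Discord."
    )


if __name__ == "__main__":
    output = run_voice2x(
        "meeting.mp3",  # hypothetical file name
        fake_transcribe,
        {"topics": lambda t: top_keywords(t, 2), "summary": lead_summary},
    )
    print(output["topics"])
    print(output["summary"])
```

Passing the transcription function and the tasks in as arguments means the audio is transcribed once and each X (topic detection, summary, emotion) can be added or swapped without touching the rest of the pipeline.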
Spread the Love:
I am currently working solo on this and would be delighted to welcome anyone with similar interests to work on it together. Suggestions and contributions are always welcome.
Squad:
Ram - Machine Learning Researcher with experience in Reinforcement Learning and a new kid on the Block.
Discord : shinjeki007#8768