Inspiration
University of Bath lecturers hold a special place in all of our hearts, and we have all had times when we longed for more of their tantalising explanations, so we wanted to build an application that lets anyone learn more about Computer Science and effectively revise its toughest parts.
We’re fascinated by artificial intelligence and wanted to find a way to use an LLM to make revision more engaging and interesting.
What it does
Virtual Lecturer allows your favourite lecturer to always be with you. Armed with the contextual knowledge of your course, they can answer questions which other LLMs may get wrong, in their own voice. Ask Nicolai about integers, “The gift from almighty God”, let Ben explain what AI is, or even have Fabio reveal his favourite love islander.
We used Retrieval-Augmented Generation (RAG) to inject context-specific extracts of lecture notes into your question, ensuring responses reflect the specific nuances of the course. The answer is then passed into a token-infilling neural codec language model, fine-tuned on the lecturer’s voice.
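The RAG flow can be sketched in a few lines: retrieve the lecture-note extracts most relevant to the question, then prepend them to the prompt before it reaches the LLM. This is a toy illustration with a word-overlap scorer; the names and example notes are hypothetical, and the real system uses LangChain with proper embeddings.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word set used by the toy relevance scorer."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, notes: list[str], k: int = 1) -> list[str]:
    """Return the k lecture-note extracts sharing the most words with the query."""
    return sorted(notes, key=lambda n: len(tokens(query) & tokens(n)), reverse=True)[:k]

def build_prompt(query: str, notes: list[str]) -> str:
    """Inject the retrieved context into the prompt sent to the LLM."""
    context = "\n".join(retrieve(query, notes))
    return (
        "Answer using only the lecture notes below.\n"
        f"Notes:\n{context}\n\n"
        f"Question: {query}"
    )

# Illustrative lecture notes, not real course material.
notes = [
    "Integers in this course are unbounded mathematical objects.",
    "Neural networks are trained by gradient descent.",
]
prompt = build_prompt("What are integers?", notes)
```

Because the course extract is placed directly in the prompt, the model is grounded in the lecturer’s actual material rather than its general training data.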
This allows the user to have an experience as close as possible to the real thing, with realistic voice clones of the University of Bath lecturers, each equipped with expert knowledge.
How we built it
The software consists of three main components:
- The VoiceCraft voice model
- The RAG System and LLM prompter
- The React Native web app
While we were already familiar with React Native for building the front end, both the voice model and LangChain (for RAG and LLM prompting) were completely new to us, and offered a lot to learn.
Challenges we ran into
Downloading the voice model to run it locally proved challenging, as there were a lot of strange dependency issues and we had to do a lot of tweaking to the initial parameters to get good results.
We also had to experiment with new technologies with steep learning curves to ensure all components of the app worked seamlessly together. Getting everything to a functional stage took a lot of research and planning, and we had many issues that we had to work through.
Accomplishments that we're proud of
It works! The voices not only sound human, but are also genuinely recognisable as the lecturers they are based on. Additionally, all responses are technically accurate and grounded in the specific lecture notes that we used in our RAG system.
What we learned
Modern voice models are scarily powerful. With very little training data, we managed to create incredibly realistic voices, and it shows how easy it would be to use this technology for harm.
Also, designing such a complex system as a team of two was not easy. There were times when we questioned whether the project was even feasible, so we’re incredibly glad the finished product works as well as it does.
What's next for Virtual Lecturer
We could extend the system with more lecturers, and also potentially improve both the speed and performance of the voice models by fine-tuning them with larger amounts of training data.
Additionally, expanding the RAG database and cleaning up the UI to make it easier and more accessible for everyone would make this a genuinely good tool for people who want to revise by having these concepts explained to them by the lecturers that they’re familiar with.
Built With
- flask
- gpt
- javascript
- langchain
- python
- pytorch
- react-native
- retrieval-augmented-generation