Inspiration

Currently, there is a big communication gap between those who use ASL to communicate and the rest of the population. This leads to interactions where both parties seem helpless in getting their point across. Our application serves as a proof of concept of a real time ASL translator bridging the communication gap.

What it does

Our application allows users to record a video of themselves communicating a word or phrase in ASL. Then we parse the video and feed it to Gemini, which performs a translation from frames to text.

How we built it

We used a flask backend to interact with the Gemini API and a react frontend.

Challenges we ran into

Prompting Gemini and determining how many frames per second to send to the model were the biggest challenges.

Accomplishments that we're proud of

The model is able to detect very simple phrases.

What we learned

This task is difficult without the ability to finetune using images.

What's next for Gemini ASL Translator

Finetuning using images and translation into multiple languages.

Built With

Share this project:

Updates