Inspiration
Helping visually impaired users navigate the internet
What it does
LLM-based browser navigation driven by speech-to-text recognition
How we built it
Built using GPT-4 as the LLM and Selenium to drive the browser. The backend is Python Flask; the frontend is JavaScript/HTML/CSS. The user speaks a command, speech-to-text transcribes it, the LLM maps the command to a browser action, and Selenium executes that action on the page.
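The command loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the JSON action schema, the `parse_action`/`run_action` names, and the system prompt are assumptions about how the LLM output might be structured and dispatched to Selenium.

```python
import json
from dataclasses import dataclass


@dataclass
class BrowserAction:
    action: str       # "click", "type", or "navigate"
    selector: str     # CSS selector chosen by the model (or a URL for "navigate")
    text: str = ""    # text to type, if any


# Hypothetical prompt asking the model to reply with structured JSON.
SYSTEM_PROMPT = (
    "You control a web browser. Given the user's spoken command and a "
    "simplified DOM, reply with JSON: "
    '{"action": "click|type|navigate", "selector": "...", "text": "..."}'
)


def parse_action(llm_reply: str) -> BrowserAction:
    """Turn the model's JSON reply into a structured browser action."""
    data = json.loads(llm_reply)
    return BrowserAction(
        action=data["action"],
        selector=data.get("selector", ""),
        text=data.get("text", ""),
    )


def run_action(driver, act: BrowserAction) -> None:
    """Execute one action with a Selenium WebDriver (sketch only)."""
    if act.action == "navigate":
        driver.get(act.selector)  # selector holds a URL in this case
    elif act.action == "click":
        driver.find_element("css selector", act.selector).click()
    elif act.action == "type":
        element = driver.find_element("css selector", act.selector)
        element.send_keys(act.text)
```

Returning structured JSON rather than free text makes the model's choice easy to validate before Selenium acts on it, which matters when the user cannot visually confirm what happened.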
Challenges we ran into
- Parsing the large DOM of real webpages down to something that fits in the model's context window
- Getting the model to select the correct DOM elements to act on
- Combining voice input and voice output into one system for hands-free navigation by people with visual impairments