Inspiration
Helping visually impaired users navigate the internet
What it does
LLM-based browser navigation driven by speech-to-text recognition
How we built it
Built using GPT-4 as the LLM and Selenium to drive the browser. The backend is Python Flask; the frontend is JavaScript/HTML/CSS. The user speaks a command, speech-to-text transcribes it, the LLM maps the command to a browser action, and Selenium executes that action on the page.
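The command loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the JSON action schema, the `parse_action`/`run_action` names, and the system prompt are assumptions about how the LLM output might be structured and dispatched to Selenium.

```python
import json
from dataclasses import dataclass


@dataclass
class BrowserAction:
    action: str       # "click", "type", or "navigate"
    selector: str     # CSS selector chosen by the model (or a URL for "navigate")
    text: str = ""    # text to type, if any


# Hypothetical prompt asking the model to reply with structured JSON.
SYSTEM_PROMPT = (
    "You control a web browser. Given the user's spoken command and a "
    "simplified DOM, reply with JSON: "
    '{"action": "click|type|navigate", "selector": "...", "text": "..."}'
)


def parse_action(llm_reply: str) -> BrowserAction:
    """Turn the model's JSON reply into a structured browser action."""
    data = json.loads(llm_reply)
    return BrowserAction(
        action=data["action"],
        selector=data.get("selector", ""),
        text=data.get("text", ""),
    )


def run_action(driver, act: BrowserAction) -> None:
    """Execute one action with a Selenium WebDriver (sketch only)."""
    if act.action == "navigate":
        driver.get(act.selector)  # selector holds a URL in this case
    elif act.action == "click":
        driver.find_element("css selector", act.selector).click()
    elif act.action == "type":
        element = driver.find_element("css selector", act.selector)
        element.send_keys(act.text)
```

Returning structured JSON rather than free text makes the model's choice easy to validate before Selenium acts on it, which matters when the user cannot visually confirm what happened.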
Challenges we ran into
- Parsing the large DOM of real webpages down to something that fits in the model's context window
- Getting the model to select the correct DOM elements to act on
- Combining voice input and voice output into one system for hands-free navigation by people with visual impairments