Inspiration

A few months ago, I was playing the piano when I heard someone behind me ask what I was playing. I turned around and was surprised to see that he was completely blind. He told me he was studying math and computer science and showed me how he navigates the world. It was astonishing to watch him get around without sight, relying only on a cane to sense his surroundings. That encounter inspired me to set a goal: build a device that helps visually impaired people understand their surroundings simply by asking questions, giving them an intuitive way to gather information about the world around them. We also thought it would be pretty cool to build, as wearables are gaining popularity and are increasingly able to augment how humans and computers interact.

What it does

Insight is a pair of smart glasses that captures your environment, interprets what you ask about your surroundings, and talks back to you about them. It also automatically remembers details about sounds, objects, people, places, and more from previous conversations, giving you an external memory you can reference at any time.
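For illustration, one way each remembered interaction could be stored is as a simple JSON log that the frontend reads. The field names, file path, and example values below are a hypothetical sketch, not the exact schema the device uses.

```python
# Hypothetical sketch of a stored "memory": one record per interaction,
# appended to a JSON file that the frontend can list and display.
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class Memory:
    timestamp: float   # when the interaction happened
    image_path: str    # camera frame captured alongside the question
    question: str      # transcribed speech from the wearer
    response: str      # the spoken answer that was read back

def save_memory(memory: Memory, path: str = "memories.json") -> None:
    """Append one interaction to the JSON log (creating the file if needed)."""
    try:
        with open(path) as f:
            records = json.load(f)
    except FileNotFoundError:
        records = []
    records.append(asdict(memory))
    with open(path, "w") as f:
        json.dump(records, f, indent=2)

# Example usage with made-up values:
save_memory(Memory(time.time(), "frames/0001.jpg",
                   "What's in front of me?",
                   "A wooden bench next to a bike rack."))
```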

Use Cases

Healthcare

Assisting surgeons by displaying patient vitals, scans, and relevant medical information hands-free during operations, allowing them to stay focused on the procedure.

Education

Helping students with learning disabilities or ADHD stay on task by providing reminders, breaking down assignments into steps, and minimizing distractions.

Retail

Enabling retail managers to quickly access real-time store performance data, employee locations, and customer traffic patterns while walking the sales floor.

Travel & Tourism

Providing international travelers with real-time translations of signs, menus, and conversations, and narrating key information about the destinations they encounter.

How we built it

We built Insight by connecting a Raspberry Pi to a speaker, microphone, and camera. The backend is written in Python and handles all of the input/output logic; the frontend is built with Next.js and showcases the collected memories. While the program is running, it transcribes the captured audio with Google's speech-to-text. That transcript, along with the camera frame, is fed into Gemini 1.5 Pro, which reasons over them and replies through Google's text-to-speech, so the user hears a response based on what the glasses saw and heard and what they asked.
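The sketch below shows the rough shape of that pipeline. It assumes Google Cloud credentials are already configured and that the Pi's capture code (not shown) has saved the recorded question as a 16 kHz WAV and the camera frame as a JPEG; the file names, prompt, and API key placeholder are for illustration only, not the exact code running on the device.

```python
# Minimal sketch of the question -> transcription -> Gemini -> speech pipeline.
from google.cloud import speech, texttospeech
import google.generativeai as genai
from PIL import Image

def transcribe(wav_path: str) -> str:
    """Convert the recorded question to text with Google speech-to-text."""
    client = speech.SpeechClient()
    with open(wav_path, "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    response = client.recognize(config=config, audio=audio)
    if not response.results:
        return ""
    return response.results[0].alternatives[0].transcript

def answer(question: str, frame_path: str) -> str:
    """Ask Gemini 1.5 Pro about the question in the context of the camera frame."""
    model = genai.GenerativeModel("gemini-1.5-pro")
    frame = Image.open(frame_path)
    prompt = ("You are a pair of smart glasses describing the wearer's "
              "surroundings. Answer the question using the attached photo.")
    return model.generate_content([prompt, frame, question]).text

def speak(text: str, out_path: str = "reply.mp3") -> None:
    """Render the answer to an MP3 with Google text-to-speech for playback."""
    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(language_code="en-US"),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )
    with open(out_path, "wb") as f:
        f.write(response.audio_content)

if __name__ == "__main__":
    genai.configure(api_key="YOUR_GEMINI_API_KEY")   # placeholder key
    question = transcribe("question.wav")            # placeholder file names
    speak(answer(question, "frame.jpg"))
```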

Challenges we ran into

  • Working with a Raspberry Pi was a challenge, as none of us had ever used one before.
  • Our primary challenge was dealing with a last-minute camera malfunction, which required us to drive to Office Max 30 minutes before the deadline to switch from an Arduino v2 camera to a USB webcam.
  • Using getStaticProps in Next.js to display multiple captured images as a slideshow was also a challenge.
  • Getting the wake word to work was challenging: the mic had to stay on at all times, continuously listening, converting speech to text, and checking each transcript for the trigger phrase (a rough sketch of this loop follows the list).
  • Many times throughout the process, we realized we were missing parts or had wired something incorrectly. We made more than five late-night trips to different stores to scavenge the parts we needed to make it work.
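For reference, here is a rough sketch of that always-on listening behavior. It uses the speech_recognition package as a stand-in for our actual microphone handling, and handle_request() is a placeholder for the capture-and-answer pipeline described above.

```python
# Rough sketch of an always-on wake-word loop: transcribe short chunks of
# audio and hand off to the main pipeline when the trigger phrase appears.
import speech_recognition as sr

WAKE_PHRASE = "hey google"  # the trigger phrase we listen for

def handle_request() -> None:
    """Placeholder for capturing a camera frame plus question and answering it."""
    print("Wake word detected - capturing image and question...")

def listen_forever() -> None:
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        while True:
            audio = recognizer.listen(source, phrase_time_limit=3)
            try:
                heard = recognizer.recognize_google(audio).lower()
            except (sr.UnknownValueError, sr.RequestError):
                continue  # nothing intelligible (or no connection); keep listening
            if WAKE_PHRASE in heard:
                handle_request()

if __name__ == "__main__":
    listen_forever()
```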

Accomplishments that we're proud of

  • Successfully developing a fully functional personal assistant that utilizes a camera and speaker connected to the Raspberry Pi, complete with memory capabilities.
  • Designing an AI program that detects the "Hey Google" trigger and then captures the image and audio that follow.
  • Creating an attractive and user-friendly interface to showcase the images, questions, and responses.

What we learned

  • We discovered that we are capable of building things that initially seem beyond our abilities, as long as we are dedicated to learning and persevering through challenges.
  • None of our team members had worked with hardware before. This project taught us a great deal about hardware and how it can be used to analyze and interpret the world around us, opening up many possibilities for products that combine hardware and software.

What's next for Insight

  • Our future plans include developing a pair of wireless smart glasses, improving the speed of the responses, and enhancing the overall aesthetics of the product.
  • We also aim to integrate a smart assistant with internet access to enable web searching capabilities.

Built With

  • Raspberry Pi
  • Python
  • Next.js
  • Gemini 1.5 Pro
  • Google Cloud Speech-to-Text and Text-to-Speech