Human vs AI Speech

The main page before uploading a file.
The main page after uploading a file.
The page where you record your own audio and are given an AI-generated sentence, before inputting a word.
The page where you record your own audio and are given an AI-generated sentence, before inputting a word and with the AI-generated image.
Future web pages that could be created; hovering over a picture provides a description.

Inspiration

We wanted to create something relevant to the issues raised by the rise of AI. One issue we highlighted is that it is sometimes impossible to tell a human voice from an AI-generated one. For example, because AI-generated celebrity voices are so easy to access, people have been led to believe a celebrity said something they never said. We therefore wanted to make large datasets of AI and human voice files more accessible, so that machine learning models can be trained to tell the two apart.

What it does

Our project has several features. First, you can upload an audio file, and the site transcribes your speech to text. It then generates an AI voice file of the same words and plays it back to you. Second, you can record your voice directly on the website; if you can't think of something to say, the site gives you a random AI-generated sentence, then produces the matching AI voice file in the same way. This page also displays an AI-generated image of the sentence, so you can see the random sentence visually. Finally, we included a page showing where the project could be taken further.
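The upload flow described above (audio in, transcript out, AI voice file out) can be sketched roughly as follows. This is a minimal illustration, not our actual code: the names `human_to_ai_speech`, `PipelineResult`, `fake_transcribe`, and `fake_synthesize` are hypothetical, and the speech-to-text and text-to-speech steps are passed in as plain functions so that any service could be swapped in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    """A human recording paired with its AI-generated counterpart."""
    transcript: str
    human_audio: bytes
    ai_audio: bytes


def human_to_ai_speech(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    synthesize: Callable[[str], bytes],
) -> PipelineResult:
    """Transcribe the uploaded audio, then synthesize the same words.

    `transcribe` and `synthesize` are assumptions standing in for
    whichever speech-to-text and text-to-speech services are used.
    """
    if not audio:
        raise ValueError("uploaded audio is empty")
    transcript = transcribe(audio).strip()
    ai_audio = synthesize(transcript)
    return PipelineResult(transcript, audio, ai_audio)


# Stand-in services so the sketch runs without any network access.
def fake_transcribe(audio: bytes) -> str:
    return " hello world "


def fake_synthesize(text: str) -> bytes:
    return ("AI:" + text).encode("utf-8")


result = human_to_ai_speech(b"\x00\x01", fake_transcribe, fake_synthesize)
print(result.transcript)
```

Keeping the human audio and the AI audio in one result means each upload yields a labelled human/AI pair, which is the kind of data the dataset is meant to collect.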

How we built it

We built this using Django, with Python for the back-end and JavaScript, CSS, and HTML for the front-end. As a two-man team, we initially split the work so that one person handled the front-end and the other the back-end. Once we felt we had a prototype ready, we merged the two halves and collaborated more closely to improve the website. For the page with additional concepts, we used Figma to design four webpages that weren't going to make it into the project within the 24 hours, and uploaded these designs to the website.

Challenges we ran into

Initially, we had trouble with the web development, as neither of us had done it before. We ran into issues with hosting the website and with Firebase storage, so we eventually switched to Django midway through. We also had problems with how slowly the model transcribed the file and produced its AI-generated voice file, so we tried the OpenAI API, which was much quicker.

Accomplishments that we're proud of

We managed to build a functioning website despite never having done web development before.

What we learned

We became much more familiar with web development and utilising AI APIs.

What's next for Human vs AI Speech

We had several further ideas that would improve the functionality of the website, which we included in the website as concepts. One is a page that lets you upload several voice files and shows a progress bar telling you how many more minutes of audio you need to upload before you can have your own AI voice, which would be created by training a machine learning model to sound like the uploaded files. Another page would display statistics about the number of users, how many minutes of audio the dataset contains, and how many audio files have been uploaded. It would also include a button to download the dataset and, just for fun, the top 4 highest-voted AI images that the prompt created. A further page could contain a game that tests whether you can tell the difference between a real voice and an AI voice.
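As a rough illustration of the progress-bar idea, the calculation could look something like the sketch below. The function name `voice_progress` and the 30-minute default target are assumptions made for the example; we have not decided how much audio a custom voice would actually need.

```python
from typing import Iterable, Tuple


def voice_progress(
    clip_seconds: Iterable[float],
    target_minutes: float = 30.0,  # hypothetical target, not a measured requirement
) -> Tuple[float, float]:
    """Return (percent complete, minutes still needed) for the progress bar."""
    if target_minutes <= 0:
        raise ValueError("target_minutes must be positive")
    uploaded_minutes = sum(clip_seconds) / 60.0
    # Cap at 100% so extra uploads do not overflow the bar.
    percent = min(100.0, 100.0 * uploaded_minutes / target_minutes)
    remaining = max(0.0, target_minutes - uploaded_minutes)
    return percent, remaining


# Two clips of 2 and 3 minutes against a 10-minute target.
percent, remaining = voice_progress([120, 180], target_minutes=10)
print(f"{percent:.0f}% complete, {remaining:.1f} minutes to go")
```

The same total of uploaded minutes could also feed the statistics page.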

Built With

Updates

Josh Carter started this project — Apr 14, 2024 06:52 AM EDT
