You Can Talk to ChatGPT Using Your Voice Now!

In one of the biggest updates to ChatGPT so far, OpenAI has added two new ways to use the popular app.

Now, ChatGPT can talk. Pick one of five realistic computer voices, and you can talk to the chatbot like you’re on a phone call. It will answer your spoken questions right away.

ChatGPT can also now answer questions about pictures. OpenAI first showed this ability in March, when it introduced GPT-4 (the model behind ChatGPT). The feature wasn’t publicly available before, but now you can upload pictures to the app and ask it what’s in them.

Last week, OpenAI said that they will connect DALL-E 3, which is their newest picture-making tool, with ChatGPT. This means the chatbot will be able to make pictures for you.

ChatGPT’s voice feature relies on two separate models. One converts your spoken words into text, and the other converts ChatGPT’s text responses into speech.

Last week, in a presentation, Joanne Jang, a product manager at OpenAI, demonstrated several of ChatGPT’s synthetic voices. These voices were made by training the text-to-speech model on the voices of actors that OpenAI hired. Someday, OpenAI might let users make their own voices.

“In fashioning the voices, the number-one criterion was whether this is a voice you could listen to all day.”

Joanne Jang, Product Manager

OpenAI is also making this text-to-speech technology available to a few other companies, including Spotify. Spotify said today that it is using the same synthetic voice technology to translate well-known podcasts, such as Lex Fridman’s podcast and Trevor Noah’s upcoming show, into other languages. The translated voices will sound like the original podcasters.

This batch of updates shows how quickly OpenAI is turning its experimental models into products people want. OpenAI has been busy improving its technology since ChatGPT became popular last November, and it is now selling that technology to both consumers and businesses.

The premium version of ChatGPT, called ChatGPT Plus, is now more capable. It combines OpenAI’s top models, GPT-4 and DALL-E, into one easy-to-use app for your phone. The app now competes with Apple’s Siri, Google Assistant, and Amazon’s Alexa.

A year ago, only certain software developers could use this technology. Now, anyone can get it for $20 a month.

“We’re trying to make ChatGPT more useful and more helpful.”

Joanne Jang, Product Manager

Last week, Raul Puri, a scientist working on GPT-4, showed me how the image recognition works. He uploaded a photo of a child’s math homework, circled a Sudoku-like puzzle on the screen, and asked ChatGPT how to solve it. ChatGPT responded with the right steps.

Puri says he used this tool to assist him in repairing his fiancée’s computer. He uploaded pictures of error messages and asked ChatGPT for advice on what to do.

“This was a very painful experience that it helped me get through.”

Raul Puri, GPT-4 Scientist

ChatGPT’s image recognition has already been tested by a company called Be My Eyes, which makes an app for people with impaired vision. Users can take a picture of something and ask human volunteers to tell them what it is. Now, with help from OpenAI, Be My Eyes lets people ask a chatbot instead of a person.

“Sometimes my kitchen is a little messy, or it’s just very early Monday morning and I don’t want to talk to a human being.”

“Now you can ask the photo questions.”

Hans Jørgen Wiberg, Be My Eyes founder, interviewed at EmTech Digital

OpenAI knows there are risks in releasing these updates to everyone. Combining models makes things more complicated, according to Puri. He says his team has spent months thinking about how the new features could be misused. For instance, ChatGPT will not answer questions about photos of private individuals.

“Right now if you ask ChatGPT to make a bomb it will refuse. But instead of saying, ‘Hey, tell me how to make a bomb,’ what if you showed it an image of a bomb and said, ‘Can you tell me how to make this?’”

Joanne Jang, Product Manager

“You have all the problems with computer vision; you have all the problems of large language models. Voice fraud is a big problem.”

“You have to consider not just our users, but also the people that aren’t using the product.”

Raul Puri, GPT-4 Scientist

Other problems could also arise. Adding voice recognition to the app could make ChatGPT harder to use for people with less common accents, says Joel Fischer, a researcher at the University of Nottingham in the UK who studies how people and computers interact.

Artificial voices also carry social and cultural baggage that will shape how people perceive the app and what they expect from it, he says. This is a topic that still requires research.

OpenAI says it has addressed the most serious problems and believes the updates to ChatGPT are safe to release.

“It’s been a remarkably good learning experience getting all these sharp edges sorted out.”

Raul Puri, GPT-4 Scientist