ChatGPT Update Enables Chatbot to “See, Hear and Speak” with Users

The upgrade rolls out today, only for the subscription-based version of ChatGPT.

byDeeba Ahmed

September 26, 2023

3 minute read

OpenAI’s flagship chatbot, ChatGPT, is getting a major tweak that will transform how users interact with it.

Reportedly, the world-famous chatbot is getting a brand-new open-source text-to-speech model that can quickly generate a human-like voice from a brief speech sample.

It will now be able to see, hear, and speak, becoming more intuitive.

The voice feature is available in the English language for now.

History was created last November when OpenAI launched its flagship chatbot, ChatGPT. The product took the internet world by storm, making generating text and solving queries hassle-free.

The AI firm has been improving its features, and now it has announced adding a voice identification capability dubbed Whisper to the already feature-rich chatbot. For your information, Whisper is the chatbot’s speech recognition system that can perform various functions, such as settling a debate or creating a bedtime story. OpenAI has upgraded the app to use its reasoning skills for answering queries about images.

For example, it can offer insight into various image types, including documents containing pictures or text, photos, and screenshots. It can also troubleshoot when inquired about any issue, such as why the grill isn’t working or how to plan a meal. Moreover, the chatbot can analyze complex graphs.

“You can now use voice to engage in a back-and-forth conversation with your assistant,” OpenAI explained when announcing the new iOS and Android upgrade.

ChatGPT is tweaked with a new open-source text-to-speech model and will now generate human-like audio after listening to a few seconds-long sample speech or text. This means the world’s favorite chatbot will become chattier now.

Users can directly send audio queries; the app will respond with a synthesized voice. Users can choose between five different voices to interact with the chatbot. To create these voices, OpenAI collaborated with professional voice artists.

In addition, the new version will feature visual smarts, allowing users to upload/snap a photo from ChatGPT, and it will quickly generate a description of the image, just like it happens with Google Lens. Users can also interact with the app regarding selected images or parts of an image through the newly added drawing tool using voice commands. OpenAI has added audio/visual data feed into its chatbot’s machine-learning model to make it as smart as a human.

Use your voice to engage in a back-and-forth conversation with ChatGPT. Speak with it on the go, request a bedtime story, or settle a dinner table debate.

Sound on ? pic.twitter.com/3tuWzX0wtS
— OpenAI (@OpenAI) September 25, 2023

The addition of such a wide range of speaking and image-analyzing features indicates that OpenAI is working constantly on its AI models to make them more intuitive and interactive. ChatGPT can easily compete with Amazon’s Alexa or Apple’s Siri with the new features and may even become much more attractive to consumers than other AI chatbots with its rich data feed.

ChatGPT’s new voice generation tech will also allow other companies to license this technology as it has been created in-house by OpenAI. For instance, Spotify can use OpenAI’s speech synthesis algorithms to offer a podcast translation feature, delivering the content in different languages but in the original podcaster’s voice.

The new features will be available today only for the subscription-based version of ChatGPT, at $20 per month. It will be available in all the markets where ChatGPT is available, but currently, the voice feature will work in English.

The Latest

Tego AI Discloses Second Claude Flaw in a Week: Hidden Link Silently Sends Files to Attackers

Russian Hackers Used a Zimbra Zero-Day to Steal Emails Without Link Clicks

New Dolphin X Malware Uses AI Profiler to Rank High-Value Victims

What Is Cryptocurrency and How Does It Actually Work?

ChatGPT Update Enables Chatbot to “See, Hear and Speak” with Users

Tego AI Discloses Second Claude Flaw in a Week: Hidden Link Silently Sends Files to Attackers

Blockchain Life Returns to Dubai — Featuring the Debut of AI Future

AppViewX Arms Enterprise CLM Teams for Post-Quantum Migration and AI Adoption in Latest Product Release

AccuKnox Wins Best AI Startup Award for Enterprise Agentic AI Security at BSides Bangalore

Pulse Security Debuts Operational Management Platform Built for Security Leaders

ChatGPT Update Enables Chatbot to “See, Hear and Speak” with Users

RELATED ARTICLES

Related Posts