Google Pixel 4: Speech to Text

Google is publicizing the new Pixel 4 and Pixel 4 XL by emphasizing a “breakthrough new feature”: a recorder app that can transcribe spoken audio in real time with impressive accuracy. The tool is being touted for recording lectures, interviews, important events, etc. But imagine what a genuinely disruptive technology could do for Voice services.

Google Assistant has long claimed better Voice dictation error rates (5%) than we could document. Looking across platforms, we have consistently reported dictation error rates of 10%, a figure that effectively limits Voice’s viability as a dictation service. Of course, many factors affect speech-to-text accuracy, including microphone quality, vocabulary standards, speech standards, algorithm accuracy, Voice training, etc. Some of these factors are beyond the platform’s control, especially users’ use of slang or unusual pronunciations.
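For readers who want to check figures like these against their own recordings, here is a minimal sketch in Python. It assumes the error rate is a word error rate: the number of word substitutions, insertions, and deletions needed to turn the transcript into what was actually said, divided by the number of words spoken. That is a common convention, but it is our assumption here, not a description of how Google or anyone else arrived at the 5% and 10% figures. The function name and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    `reference` is what was actually said; `hypothesis` is the transcript.
    Both names are illustrative, not part of any vendor's API.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # a reference word was dropped
                curr[j - 1] + 1,     # an extra word was inserted
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One wrong word in a ten-word sentence (made-up example).
spoken = "please send the quarterly report to the team by friday"
heard = "please send the quarterly report to the teen by friday"
print(f"{word_error_rate(spoken, heard):.0%}")
```

By this measure, a 10% rate means roughly one wrong word in every ten spoken, which is why it matters so much for long-form dictation.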

In the past, we’ve complained about Voice’s limited ability to handle dictation-type activities, and we’ve commiserated with all the programmers and entrepreneurs whose applications have struggled with such error rates. (Check out our YouTube Channel.) So we’re naturally cautious about changing our recommendations.

However, we have explored the tool, which is available for download, on our Pixels (2, 2L & 3). (According to Gizmodo, the app could be “… sideloaded via an APK on other Android phones, but the real-time voice transcription won’t work—this is one of those features (like Motion Sense) that Google is hoping will get you to buy one of its new flagship phones.”) We were impressed by the app’s ability to take accurate dictation, particularly given the phones’ microphones, which are typically of lower quality than those found in a standalone Voice device.

The question is whether Google’s latest speech-to-text is actually a disruptive innovation. If yes, then when will it be applied to Google Assistant? If it is already a part of Google Assistant, why haven’t we seen an improvement in error rates? And as with any “breakthrough” technology, what will the real-world impact be?
