Google recently made audio playback available for some Knol articles. For example, if you check out the article on how to treat and prevent skin allergies, you’ll notice a link that says Listen. If you click the link, a small embedded media player appears and you begin to hear the article read out loud.
Google is certainly no stranger to bringing together analog and digital. A little over two years ago, Google released HP’s Tesseract OCR (optical character recognition) engine as open source. Last year, Google announced GOOG-411, its free telephone directory service. From what I can tell, the voice you hear on GOOG-411 is the same voice you hear on Knol audio playback. The Knol team has likely drawn on some of the GOOG-411 team’s work to provide a realistic, natural-sounding voice for article playback.
More recently, Google showed off its speech-to-text technology with the announcement of an audio indexing product that recognizes speech within YouTube videos. For now, they’re showcasing it with speeches made by both candidates in this year’s presidential election. Google has even filed a patent for recognizing text within images and video, so I’m sure we’ll be seeing more from Google in the area of digital-to-analog and analog-to-digital in the near future. Given that Google’s mission is to organize the world’s information and make it universally accessible and useful, it is no surprise that Google has taken an interest in recognizing information embedded within multimedia.