Introduction, Dragon NaturallySpeaking and TalkTyper
For a long time, speech recognition was going to be the next big thing. People would dictate documents naturally and flawlessly in half the time it took to type them, freeing up time for other activities. Eventually, however, people tired of waiting for speech recognition to reach an adequate standard for regular use, and the concept was derided as a novelty that was unusable for day-to-day work.
Now, however, speech recognition has come of age and it's sneaked back into people's lives via a side entrance. Google, Apple and Microsoft all have speech recognition functionality built into their mobile operating systems, and you don't have to go out and buy a CD from which to install it as was the case during the first wave of interest.
Speech recognition software has followed the classic Hype Cycle: an initial burst of inflated expectations followed by a trough of disillusionment, a gradual enlightenment about the technology's actual usefulness, and an eventual levelling out of productivity.
There is logic to using speech recognition software on mobile devices, because it can often be quicker and easier than fiddling around with an on-screen keyboard. Mainstream adoption of speech recognition software on desktop PCs, however, has never happened. That may be partly because people recognised some of its limitations, because it was seen as unsuited to an office environment, or simply because people moved on and forgot about it.
Whatever the reason, speech recognition software is now at the level for which people had initially hoped. It is accurate, useful and gradually beginning to bleed into our lives.
There is a variety of speech recognition tools and software packages, each offering specific functionality. Some are more fully featured and advanced than others, but generally they can be used for controlling a user's computer and dictating into a document. This article provides an overview of some of the most popular speech recognition tools and software packages available.
Nuance's Dragon NaturallySpeaking software is regarded as the market leader where speech recognition is concerned. NaturallySpeaking was launched in 1997 and is now up to its thirteenth iteration. There are a host of versions available depending on the user's requirements and the software offers a huge amount of functionality. According to Nuance, NaturallySpeaking is the world's best-selling speech recognition software for the PC.
NaturallySpeaking Professional Edition aims to provide business users with a means of controlling their computers and dictating documents, and Nuance claims it is three times faster than typing. As a result, it says, productivity can be improved and cost savings made. Among the features provided are the ability to manage email, search the web and automate business processes.
NaturallySpeaking allows users to dictate into Microsoft Office applications and OpenOffice, create emails, tasks and meetings in Microsoft Outlook, search the web using any major browser, and post to social media services such as Facebook and Twitter. The software recognises a number of standard commands, such as creating files, scheduling calendar entries and searching a user's computer. It's also possible to set up custom commands.
Beyond its desktop functionality, NaturallySpeaking can automatically transcribe dictations recorded on approved voice recorders, and mobile apps are available for iOS and Android. iOS users can record audio files, whilst Android devices can be used as a wireless microphone.
NaturallySpeaking Professional Edition costs £549. Prices vary for other editions, depending on the functionality. For more information on the package, check out our full Dragon NaturallySpeaking Premium 13 review.
If you only need basic speech-to-text dictation functionality, then TalkTyper may well be adequate. TalkTyper is a simple, free-to-use website that captures user speech and renders it as plain text, ready to be copied and pasted elsewhere. There is no account to sign up for; the website is designed simply for immediate and straightforward use.
TalkTyper was created with the aim of making voice dictation freely available to anyone who needed it. According to TalkTyper, the service first became possible when Google added speech input functionality to its Chrome browser.
Once a user has loaded the TalkTyper website, they can click the microphone button and begin dictating. In addition to plain dictation, users can insert punctuation and paragraph breaks with spoken commands such as "period", "question mark" and "new paragraph".
If they are happy with the resulting text, they can add it to their saved text pad. Having finished their dictation, the user is able to add symbols, copy the text, print it, send it to Twitter, send it via an email or translate it into a different language.
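As a rough illustration of how spoken punctuation commands of this kind can be turned into text, here is a minimal Python sketch. It is not TalkTyper's actual implementation: the command table, the function name apply_spoken_commands and the matching rules are all assumptions made for the example, and it does not attempt to capitalise the start of sentences.

```python
import re

# Hypothetical command table (an assumption for this sketch):
# spoken phrase -> symbol to insert.
PUNCTUATION = {"period": ".", "question mark": "?"}
PARAGRAPH_BREAK = "new paragraph"


def apply_spoken_commands(transcript: str) -> str:
    """Replace spoken punctuation commands in a raw transcript."""
    text = transcript
    for phrase, symbol in PUNCTUATION.items():
        # Drop the spaces before the spoken phrase so the symbol
        # attaches directly to the preceding word.
        pattern = r" *\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, symbol, text, flags=re.IGNORECASE)
    # A paragraph break swallows the spaces on both sides of the phrase.
    pattern = r" *\b" + re.escape(PARAGRAPH_BREAK) + r"\b *"
    text = re.sub(pattern, "\n\n", text, flags=re.IGNORECASE)
    return text.strip()


raw = ("hello world period how are you question mark "
       "new paragraph fine thanks period")
print(apply_spoken_commands(raw))
```

Matching on whole words means that a word such as "periodic" is left alone, although a real dictation tool would also need some way of telling when the speaker wants the literal word "period".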
Windows, Google, Compadre Interact, Sonic Extractor
Microsoft Windows Speech Recognition
Many people won't realise this, but Microsoft has been building native speech recognition software into Windows for some time. Although the functionality existed previously, it has been more fully integrated since Vista under the guise of Windows Speech Recognition.
Windows Speech Recognition can be used to both control a computer with voice commands and dictate text. A short setup process is required in order to calibrate the user's microphone, and the software can be trained to better understand a user's speech by creating a voice profile that it uses to recognise the individual.
Windows Speech Recognition can be used to give a number of common commands such as opening specific applications, scrolling in any direction and switching between open programs. The software can be used in conjunction with programs such as browsers. Its dictation functionality can be used with word processing applications or, for example, when filling out forms online.
The software is compatible with English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese.
Google Voice Search
As well as being available on its Android mobile devices, Google's Voice Search functionality can now also be used directly from its Chrome browser and as a Windows 8 application. Users may find it quicker or simply preferable to use voice commands when searching, or it may be necessary for accessibility.
Most Google Voice Search requests will simply take you to a list of results, but certain requests will have Google speak back a direct answer to you. Examples include, "What's the weather like?" and "What is $100 in pounds?"
In addition to Voice Search, it has also been reported that Google will be bringing voice input to Google Docs. There's no word on whether it will allow users to give commands or just dictate, but the additional functionality will mean that an increasing number of Google tools can be controlled with speech input.
SpeechGear Compadre Interact
SpeechGear was founded in 2001 and focuses on instantly translating what people say, hear, read, write or type. Its Compadre product suite has a variety of uses, including translating documents. Compadre Interact, however, is used for providing instant translation of spoken conversations.
Interact allows users to say something out loud and have it spoken back in another language. Likewise, something spoken in a foreign language can be translated and spoken back in English. SpeechGear says that translation is instant, so there is no need to wait in order to hear something repeated back. The software also transcribes conversations automatically.
Digital Syphon Sonic Extractor
Digital Syphon's Sonic Extractor allows users to make automated transcriptions from any audio source. Users can simply input source content, such as a YouTube video link, and the software will analyse and transcribe the content. According to Digital Syphon, the software can be configured to work with most European and Asian languages, and up to 16 hours of content can be transcribed in an hour.
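To put the quoted throughput in context, the short Python sketch below works out the best-case processing time implied by "16 hours of content in an hour". The function name and the default speed-up are assumptions for illustration only; because the vendor's figure is stated as an upper limit, real processing times may be longer.

```python
def best_case_transcription_minutes(audio_minutes: float,
                                    speedup: float = 16.0) -> float:
    """Best-case processing time for a recording, in minutes.

    speedup is the hours of audio handled per hour of processing;
    16.0 reflects the vendor's "up to" claim and is an assumption here.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_minutes / speedup


# A 90-minute recording at the claimed maximum rate.
print(best_case_transcription_minutes(90))
```

At the claimed rate, a 90-minute recording would take a little under six minutes to process.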