Voice recognition program in Russian. Speech recognition programs

Express Scribe is perhaps the most convenient text transcription program for Windows and Mac OS: it combines an audio player and a text editor. The principle of operation is very simple - load an audio file into the program, play it back using keyboard hotkeys (you can assign them yourself) and type the text at the same time. Playback speed and volume are also adjusted from the keyboard, so your hands never leave it and there is no need to use the mouse or switch between programs. Keep in mind that the built-in text editor does not flag errors and lacks many other familiar functions, for example, automatic replacement of hyphens with dashes. However, you can use another text editor alongside Express Scribe and control audio playback with the hotkeys. The program is shareware; the full version costs $17-50.


02. Transcriber-pro



A Russian-language program for Windows that lets you not only listen to audio but also watch video files. The built-in text editor can insert timestamps and the names of speakers. The resulting text can be imported into “interactive transcripts” and edited as part of a group project. The application is available only by annual subscription, which costs 689 rubles per year.
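To make the idea of timestamps and speaker names concrete, here is a minimal sketch of how a transcript line could be assembled. The function names and the `[HH:MM:SS] Speaker: text` layout are assumptions made for illustration; they are not taken from Transcriber-pro.

```python
def format_timestamp(seconds: float) -> str:
    """Convert a playback position in seconds to HH:MM:SS (fractions dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def transcript_line(seconds: float, speaker: str, text: str) -> str:
    """Build one transcript line; the layout is a hypothetical convention."""
    return f"[{format_timestamp(seconds)}] {speaker}: {text}"


if __name__ == "__main__":
    print(transcript_line(65, "Interviewer", "Shall we begin?"))
```

A transcriber would call `transcript_line` with the current playback position each time the speaker changes.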


03. RPlayer V1.4



A simple program for playing back and transcribing audio files, with hotkey support and the ability to type directly in Microsoft Word. Unlike the programs above, it can be downloaded for free, but it is unstable on newer versions of Windows.

04. Voco

A professional Windows application for converting speech to text. It supports voice typing in any text editor, has a large collection of thematic dictionaries and does not require an Internet connection for speech recognition. The extended versions "Voco.Professional" and "Voco.Enterprise" can work with ready-made audio files. The only drawback is the high cost of the application.


05. Dragon Dictation



A free mobile application for recognizing dictated speech. The program can recognize about 40 languages and their varieties, and lets you edit the text and send it by email, post it to social networks or copy it to the clipboard. An Internet connection is required.


06. RealSpeaker



A unique application that can recognize not only audio files but also live speech spoken to the camera. Thanks to a special video extension, “RealSpeaker” reads lip movements, which improves speech recognition by up to 20-30% compared to other similar algorithms. Currently, the application supports 11 languages: Russian, English (American and British dialects), French, German, Chinese, Korean, Japanese, Turkish, Spanish, Italian and Ukrainian. The program is shareware: the cost depends on the duration of the subscription, and the unlimited version costs about 2 thousand rubles.

In our modern, eventful world, the speed of working with information is one of the cornerstones of success. Our productivity, and therefore our income, depends on how quickly we receive, create, and process information. Among the tools that can help, programs for converting speech into text occupy an important place, because they can significantly speed up typing. In this article I will describe the popular programs for converting speech into text and their features.

Most existing programs for converting voice into text are paid and place a number of requirements on the microphone (when the program is intended for a computer). Working with a microphone built into a webcam or into the body of a standard laptop is strongly discouraged, because the quality of speech recognition from such devices is quite low. A quiet environment is also important: background noise directly lowers recognition quality.

Moreover, most of these programs are capable of not only transforming speech into text on the computer screen, but also using voice commands to control your computer (launching and closing programs, receiving and sending email, opening and closing websites, and so on).

Speech to text program

Let's move on to a direct description of programs that can help translate speech into text.

Laitis program

The free Russian-language voice recognition program “Laitis” has a good quality of speech understanding, and, according to its creators, can almost completely replace the user’s usual keyboard. The program also works well with voice commands, allowing you to perform many actions to control your computer.

To work, the program requires a high-speed Internet connection on the PC (it uses the network voice recognition services of Google and Yandex). The program also lets you control your browser with voice commands, which requires installing a special “Laitis” extension (Chrome, Mozilla, Opera) in the browser.

"Dragon Professional" - transcribing audio recordings into text

At the time of writing, the English-language product “Dragon Professional Individual” is one of the world leaders in the quality of recognized text. The program understands seven languages (so far, only the Dragon Anywhere mobile application works with Russian), offers high-quality voice recognition, and can execute a number of voice commands. This product is strictly paid: the main program costs 300 US dollars, and the “home” version, Dragon Home, costs 75 US dollars.

To operate, this product from Nuance Communications requires you to create your own profile, which adapts the program to the specifics of your voice. In addition to dictating text directly, you can train the program to perform a number of commands, making your interaction with the computer even more natural and convenient.

"RealSpeaker" - ultra-accurate speech recognizer

In addition to the standard functions of programs of this kind, the voice-to-text program “RealSpeaker” can use your PC’s webcam. The program not only processes the audio, but also tracks the movement of the corners of the speaker’s lips, and so recognizes the spoken words more accurately.


"RealSpeaker" reads not only the audio, but also the visual component of the speech process

The application supports more than ten languages (including Russian), recognizes speech with accents and dialects taken into account, transcribes audio and video, gives access to the cloud and much more. The program is shareware; the full version is paid.
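How audio and lip-movement information might be combined can be illustrated with a simple "late fusion" sketch: each channel proposes probabilities for candidate words, and the two are merged with a weight. This is an assumption made purely for illustration; the source does not describe RealSpeaker's actual algorithm, and the function name, weight and probabilities below are invented.

```python
import math
from typing import Dict


def fuse_scores(audio: Dict[str, float],
                visual: Dict[str, float],
                audio_weight: float = 0.7) -> str:
    """Return the candidate word with the best weighted log-probability.

    audio / visual map candidate words to probabilities from each channel.
    A small floor avoids log(0) for words one channel did not propose.
    """
    floor = 1e-6
    best_word, best_score = "", -math.inf
    for word in sorted(set(audio) | set(visual)):
        a = math.log(max(audio.get(word, 0.0), floor))
        v = math.log(max(visual.get(word, 0.0), floor))
        score = audio_weight * a + (1.0 - audio_weight) * v
        if score > best_score:
            best_word, best_score = word, score
    return best_word


if __name__ == "__main__":
    # In noise, "night" and "might" sound alike, but the lips close for "m".
    audio = {"night": 0.55, "might": 0.45}
    visual = {"night": 0.10, "might": 0.90}
    print(fuse_scores(audio, visual))
```

With the audio channel alone (`audio_weight=1.0`) the sketch picks "night"; adding the visual channel tips the decision to "might".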

“Voco” - the program will quickly translate your voice into a text document

Another voice-to-text converter is the paid digital product “Voco”, the price of the “home” version of which is now about 1,700 rubles. More advanced and expensive versions of this program - “Voco.Professional” and “Voco.Enterprise” have a number of additional features, one of which is speech recognition from the user’s audio recordings.

Among the features of Voco, I would like to note the ability to expand the program’s vocabulary (it currently includes more than 85 thousand words), as well as its ability to work offline, so you do not depend on your Internet connection.


Among the advantages of Voco is how well the program can be trained.

The application is turned on quite simply - just press the “Ctrl” key twice. The application is absolutely free, supports several dozen languages, including Russian.

Conclusion

Above, I listed programs for converting voice recordings into text and described their general functionality and characteristic features. Most of these products are paid, and Russian-language programs are inferior to their English-language counterparts in both range and quality. When working with such applications, I recommend paying special attention to your microphone and its settings: a bad microphone can negate even the highest-quality software of the kind reviewed here.

There are two types of speech recognition programs:

1. Speaker-dependent - these programs learn constantly and, over time, understand the voice of “their owner” better and better. The more often the user works with the program, the better it understands him. Fortunately, the learning process is quite quick - in about 20 minutes the program will learn to understand you quite well.

2. Speaker-independent - you can start speaking immediately, and the program will respond to voice commands. Unlike the first type, these programs do not need to learn to understand you. On the contrary, you need to learn to speak in such a way that the program understands you.

Why is speech recognition software used on a PC?

Installing a speech recognition program will not make the keyboard and mouse unnecessary, but it will make working on your PC much easier.

1. Dictation - using speech recognition programs, many users dictate the text of documents. This opportunity is relevant, for example, for doctors conducting an examination (during which their hands are usually busy) and at the same time recording its results. For an ordinary user who finds it difficult to type text for some reason (or is simply too lazy), it may also be useful.

2. Entering commands - PC users can use a “recognizer” to enter commands, that is, the spoken word will be perceived by the system as a mouse click. The user commands: “Open file”, “Send mail” or “New window”, and the computer performs the corresponding actions. This is especially true for people with disabilities - instead of a mouse and keyboard, they will be able to control the computer using their voice.
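The command mode described above boils down to a lookup: a recognized phrase is normalized and mapped to an action. Below is a minimal sketch of that idea. The command table and the action functions are hypothetical and only log what they would do; a real program would call the operating system instead.

```python
from typing import Callable, Dict, List

# Log of performed actions; stands in for real OS calls in this sketch.
performed: List[str] = []


def open_file() -> None:
    performed.append("open_file")


def send_mail() -> None:
    performed.append("send_mail")


def new_window() -> None:
    performed.append("new_window")


# Hypothetical command table: normalized phrase -> action.
COMMANDS: Dict[str, Callable[[], None]] = {
    "open file": open_file,
    "send mail": send_mail,
    "new window": new_window,
}


def normalize(phrase: str) -> str:
    """Lowercase, trim surrounding punctuation and collapse whitespace."""
    return " ".join(phrase.lower().strip(" .!?").split())


def handle_phrase(phrase: str) -> bool:
    """Run the action bound to a recognized phrase; return False if unknown."""
    action = COMMANDS.get(normalize(phrase))
    if action is None:
        return False
    action()
    return True


if __name__ == "__main__":
    for spoken in ["Open file", "Make coffee"]:
        print(spoken, "->", handle_phrase(spoken))
```

Because the table is small and fixed, the recognizer only has to tell a handful of phrases apart, which is one reason command mode tends to be more reliable than free dictation.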

What is required for speech recognition?

1. Speech recognition program - English-speaking Windows users can use, for example, Dragon Naturally Speaking or IBM ViaVoice. Russian is understood by the programs “Gorynych” and “Dictograph”. Windows Vista already has speech recognition software built in.

2. Microphone or headset (a hybrid of an earphone and a microphone) - to “get” words into the computer.

3. Sufficiently powerful computer – the computer does not need to be super fast for the speech recognition function to work. 1 GB of RAM is sufficient (for Windows Vista it is better to have 2 GB) and a processor clock speed of at least 1 GHz.

Which devices use speech recognition?

The speech recognition function can be used not only in PCs, but also in many other devices. This is especially true if the “gadget” has a compact keyboard with tiny keys (or none at all).

1. Mobile phones - models with voice control have been available for several years. But this has nothing to do with voice recognition - the device does not translate the voice into text, but compares the spoken phrase with a pre-recorded one (the latter is a “reference” and is usually called a “voice tag”). A voice tag can correspond to an entry in the address book (voice dialing) or a menu item (voice control). If the phone does not initially have the appropriate functions, it will be impossible to “train” it.

2. Mobile navigators - in new navigation devices, for example, Tom Tom Go 720T, the driver can enter the destination by voice. If you pronounce words clearly and, if possible, in silence, this function works very well. Although this operation takes the same amount of time as keyboard input, it is still safer and more convenient to use voice control while driving. True, you can’t do it completely without your hands - to launch a voice command you need to press the on-screen button.

3. Cars - Some new cars from brands such as Mercedes, Audi, Toyota, Ford or BMW can be controlled by voice (though the set of commands is limited). For example, in some BMW models, pressing a button on the steering wheel (see figure) activates voice control of the stereo system or navigation system.

4. Multimedia discs for learning foreign languages - some educational programs check your pronunciation. The program asks you to read a certain sentence and, after processing the result with speech recognition, tells you whether your pronunciation is correct.
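The "voice tag" matching described for mobile phones above, where a spoken phrase is compared with a pre-recorded reference, can be sketched with dynamic time warping (DTW), a classic way to compare two sequences spoken at different speeds. DTW is chosen here as an illustration; the source does not say which algorithm phones actually use. The one-dimensional "features", tag names and threshold are invented for the example.

```python
from typing import Dict, Optional, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


def match_voice_tag(spoken: Sequence[float],
                    tags: Dict[str, Sequence[float]],
                    max_distance: float) -> Optional[str]:
    """Return the closest stored tag, or None if nothing is close enough."""
    best_name, best_dist = None, float("inf")
    for name, reference in tags.items():
        dist = dtw_distance(spoken, reference)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


if __name__ == "__main__":
    tags = {"call home": [1.0, 3.0, 4.0, 3.0, 1.0],
            "menu": [5.0, 5.0, 1.0, 1.0, 5.0]}
    slow_version = [1.0, 1.0, 3.0, 4.0, 4.0, 3.0, 1.0]
    print(match_voice_tag(slow_version, tags, max_distance=2.0))
```

The slowly spoken version still matches "call home" because DTW lets one reference frame align with several spoken frames. No text is produced at any point, which is exactly why such phones cannot be "trained" to take dictation.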

What problems arise when working with recognition programs?

Controlling devices or dictating texts works well enough, but unfortunately not perfectly. And this is caused by a number of reasons:

1. Words don't always sound the same - The biggest difficulty in speech recognition is that no one pronounces the same word in exactly the same way twice, even when trying hard.

2. Everyone speaks differently - so the speech recognition program will function more clearly if a new user “trains” it a little first. True, this is not always possible, and sometimes it is not even necessary, for example, when using programs that are not tied to the interlocutor. Many speech recognition programs can adjust to a new user automatically.

3. Background noise can significantly distort the sound of the spoken word. This significantly limits speech recognition functions, and in crowded or noisy places makes it completely impossible.

4. Fast speech - some users speak so quickly that the words almost merge. A human listener will easily understand such speech, but for the program the task is too tough.

5. Words with the same (or very similar) sound - speech recognition programs have particular difficulty with so-called homophones - words that are pronounced almost the same but spelled differently (“lez” and “les”, “rot” and “rod”). The program must determine which word is meant from the context of the sentence.
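Choosing between homophones from context, as in the last point, can be sketched with word-pair (bigram) counts: the candidate that occurs more often next to its neighbors wins. The English pair "there"/"their" stands in for the Russian examples, and the counts are invented for illustration; real recognizers use far larger language models.

```python
from typing import Dict, List, Tuple

# Toy bigram counts, invented for illustration only.
BIGRAM_COUNTS: Dict[Tuple[str, str], int] = {
    ("over", "there"): 40,
    ("over", "their"): 1,
    ("their", "house"): 35,
    ("there", "house"): 1,
    ("there", "is"): 50,
    ("their", "is"): 1,
}

# Words the acoustic stage cannot tell apart (hypothetical table).
HOMOPHONES: Dict[str, List[str]] = {
    "there": ["there", "their"],
    "their": ["there", "their"],
}


def score(prev: str, word: str, nxt: str) -> int:
    """How often the candidate was seen next to its left and right neighbor."""
    return BIGRAM_COUNTS.get((prev, word), 0) + BIGRAM_COUNTS.get((word, nxt), 0)


def disambiguate(words: List[str]) -> List[str]:
    """Replace each ambiguous word with the candidate that fits the context best."""
    result = list(words)
    for i, word in enumerate(words):
        candidates = HOMOPHONES.get(word)
        if not candidates:
            continue
        prev = result[i - 1] if i > 0 else ""
        nxt = words[i + 1] if i + 1 < len(words) else ""
        result[i] = max(candidates, key=lambda c: score(prev, c, nxt))
    return result


if __name__ == "__main__":
    print(" ".join(disambiguate("we visited there house".split())))
```

When a neighbor has never been seen with either candidate, both scores are zero and the sketch simply keeps the first candidate, which is where a real system would need a better fallback.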

What's the future for speech recognition?

In mobile phones, the role of the speech recognition function will increase significantly, because typing text on small keyboards of mobile phones is very tedious.

1. Dictation of SMS messages - soon you will not need to type text messages on your phone - you can simply dictate. Samsung promises to implement this function in some of its phone models (they should appear on the market in the near future).

2. Translation – by the time of the 2008 Olympic Games in Beijing, a mobile phone with a built-in translator is expected to appear. If you, while in the Middle Kingdom, want, for example, to dine at a restaurant, then you will only need to speak your order in Russian into your mobile phone - everything will be translated into Chinese, and an electronic voice from the speaker will transmit the order to the waiter.

It can be assumed that over time, more and more devices will understand the human voice. So don't be surprised if one morning your coffee machine not only asks you whether to make a cappuccino or an espresso, but also understands your answer.

Speech recognition in Windows Vista

Windows Vista includes speech recognition software. Unfortunately, this component only understands English, German, French, Spanish, Japanese and Chinese. When you launch the component for the first time (in the Control Panel, select Ease of Access and then Speech Recognition), a training wizard opens and spends about half an hour introducing you to the principles of Windows voice control. After completing a few exercises, you will know how to dictate and control Windows using voice commands. Since the speech recognition software is speaker-dependent, it learns your voice at the same time. Once you have completed the introductory part, Windows will respond to your call “Listen!” and begin to accept voice commands. Disadvantage: voice input only works in Microsoft programs (for example, Windows itself, Word or Internet Explorer). In other programs (for example, Open Office or Firefox), the computer will be “deaf.”

Updated: Monday, July 31, 2017

What does the semi-fantastic idea of talking to a computer have to do with professional photography? Almost nothing, unless you are a fan of the idea of endless development of the entire technical environment of man. Imagine for a moment that you give your camera voice orders to change the focal length and apply exposure compensation of plus half a stop. Remote control of the camera has already been implemented, but there you have to press buttons silently; here we would have a camera that listens!

It has become a tradition to cite a science fiction film as an example of voice communication between a person and a computer, for example “2001: A Space Odyssey” directed by Stanley Kubrick. There, the on-board computer not only conducts a meaningful dialogue with the astronauts, but can also read lips like a deaf person. In other words, the machine has learned to recognize human speech without errors. Perhaps remote voice control of the camera will seem superfluous to some, but many would like to say "Take our picture, baby" and have a photo of the whole family against the background of a palm tree ready.

Well, so I paid tribute to tradition and dreamed up a little. But, speaking from the heart, this article was difficult to write, and it all started with a gift in the form of a smartphone with Android 4 OS. This HUAWEI U8815 model has a small four-inch touch screen and an on-screen keyboard. It’s a little unusual to type on it, but it turns out it’s not particularly necessary. (image01)

1. Voice recognition in a smartphone running Android OS

While trying out the new toy, I noticed a microphone icon in the Google search bar and on the keyboard in Notes. Previously, I had not been interested in what this symbol meant. I talked on Skype and typed letters on the keyboard, as most Internet users do. But as was later explained to me, voice search in Russian had been added to the Google search engine, and programs had appeared that let you dictate short messages in the "Chrome" browser.

I said a phrase of three words, the program identified them and showed them in a cell with a blue background. There was something to be surprised about here, because all the words were written correctly. If you click on this cell, the phrase appears in the text field of the Android notepad. So I said a couple more phrases and sent a message to the assistant via SMS.


2. A brief history of voice recognition programs.

It was not a discovery for me that modern advances in the field of voice control make it possible to give commands to household appliances, cars, and robots. Command mode was introduced in previous versions of Windows, OS/2 and Mac OS. I've come across talking programs, but what's the use of them? Perhaps it’s my peculiarity that it’s easier for me to speak than to type on a keyboard, but on a cell phone I can’t type anything at all. You have to write down contacts on a laptop with a normal keyboard and transfer them via USB cable. But to simply speak into a microphone and have the computer type the text itself without errors was a dream for me. The atmosphere of hopelessness was maintained by discussions on the forums. There was such a sad thought everywhere in them:

“However, in reality, to date, programs for real speech recognition (and even in Russian) practically do not exist, and they will obviously not be created soon. Moreover, even the inverse problem of recognition - speech synthesis, which, it would seem, is much simpler than recognition, has not been fully solved." (ComputerPress No. 12, 2004)

“There are still no normal speech recognition programs (not just Russian) because the task is quite difficult for a computer. And the worst thing is that the mechanism of word recognition by humans has not yet been realized, so there is nothing to start from when creating recognition programs.” (Another discussion on the forum).

At the same time, reviews of English-language voice text entry programs indicated clear successes. For example, IBM ViaVoice 98 Executive Edition had a basic vocabulary of 64,000 words and the ability to add the same number of your own words. The percentage of word recognition without training the program was about 80% and with subsequent work with a specific user reached 95%.
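Figures such as "80%" and "95%" are usually obtained by comparing the recognized text with a reference transcript word by word. Here is a minimal sketch of one common way to do this, using word-level edit distance (substitutions, insertions and deletions). It is shown as an illustration; the source does not say how the ViaVoice reviewers measured their percentages, and the sample sentences are invented.

```python
from typing import List


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Minimum number of word substitutions, insertions and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,                       # deletion
                           cur[j - 1] + 1,                    # insertion
                           prev[j - 1] + (ref_word != hyp_word)))  # substitution
        prev = cur
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Share of reference words recognized correctly, clipped at zero."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    errors = word_edit_distance(ref, hyp)
    return max(0.0, 1.0 - errors / len(ref))


if __name__ == "__main__":
    dictated = "the quick brown fox jumps over the lazy dog today"
    recognized = "the quick brown box jumps over the lazy dog"
    print(f"{word_accuracy(dictated, recognized):.0%}")
```

In the example, one substituted word and one dropped word out of ten give an accuracy of about 80%.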

Among the Russian language recognition programs, it is worth noting “Gorynych” - an addition to the English-language Dragon Dictate 2.5. I will tell you about the search and then the “battle with the five Gorynychs” in the second part of the review. The first I found was the "English Dragon".

3. Continuous speech recognition program “Dragon Naturally Speaking”

A recent version of the program from "Nuance" had ended up with an old friend of mine from the Minsk Institute of Foreign Languages. She brought it back from a trip abroad, having bought it in the hope that it could serve as a “computer secretary.” But something didn’t work out, and the program remained on her laptop, almost forgotten. Since she had no real experience to share, I had to go to my friend and try it myself. All this lengthy introduction is necessary for a correct understanding of the conclusions that I have drawn.

The full name of my first dragon was: . The program is in English and everything in it is clear even without a manual. The first step is to create a profile for a specific user so that the program can determine how words sound when he pronounces them. That’s what I did - the speaker’s age, country, and pronunciation features matter. My choices were: age 22–54, UK English, standard pronunciation. Several windows for configuring the microphone follow. (image04)

The next stage for serious speech recognition programs is training for the pronunciation features of a particular person. You are asked to choose the nature of the text: my choice is a short dictation instruction, but you can also “order” a humorous story.

The essence of this stage is extremely simple - text is displayed in the window, with a yellow arrow above it. When you pronounce the text correctly, the arrow moves through the phrases, and a training progress bar at the bottom fills up. I had pretty much forgotten my conversational English, so I made progress with difficulty. Time was also limited - the computer was not mine and I had to interrupt the training. But my friend said she completed it in less than half an hour. (image05)

Having skipped adapting the program to my pronunciation, I went to the main window and launched the built-in text editor. I spoke individual words from some texts that I found on the computer. The program typed the words I pronounced correctly and replaced those I pronounced poorly with something “English.” When I clearly pronounced the command “erase line” in English, the program executed it. This means that I read the commands correctly, and the program recognizes them without prior training.

But it was important to me how this “dragon” writes in Russian. As you understood from the previous description, when training the program, you can only select English text; there is simply no Russian there. It is clear that it will not be possible to train Russian speech recognition. In the next photo you can see what phrase the program typed when pronouncing the Russian word “Hello”. (image06)

The outcome of the conversation with the first dragon turned out to be slightly comical. If you carefully read the text on the official website, you can see the English “specialization” of this software product. In addition, when loading, we read “English” in the program window. So why was all this necessary? It is clear that forums and rumors are to blame...

But there was also a useful experience. My friend asked me to check the condition of her laptop, which had started working rather slowly. This was not surprising - the system partition had only 5% free space. While deleting unnecessary programs, I saw that the official version took up more than 2.3 GB. This figure will be useful to us later. (image.07)



Recognition of Russian speech, as it turned out, was a non-trivial task. In Minsk I managed to get “Gorynych” from a friend. He searched for the disc for a long time among his old junk and, according to him, it is the official release. The program installed instantly, and I found out that its dictionary contains 5,000 Russian words plus 100 commands and 600 English words plus 31 commands.

First you need to set up the microphone, which I did. Then I opened the dictionary and added the word "examination" because it was not in the program dictionary. I tried to speak clearly and monotonously. Finally, I opened the Gorynych Pro 3.0 program, turned on the dictation mode and received this list of “close-sounding words.” (image.09)

The result puzzled me, because it was clearly worse than what the Android smartphone had produced, and I decided to try other programs from the “Google Chrome online store”. I put off dealing with the “Gorynych snakes” until later. Putting things off, I thought, is very much in the Russian spirit.

5. Google's voice capabilities

To work with voice on a regular Windows computer, you will need to install the Google Chrome browser. With it open and connected to the Internet, you can click on the software store link at the bottom right. There, completely free, I found two programs and two extensions for voice text input. The programs are called "Voice notepad" and "Voicenot - voice to text". After installation, they can be found on the "Applications" tab of the "Chrome" browser. (image. 10)

The extensions are called "Google Voice Search Hotword (Beta) 0.1.0.5" and "Voice text input - Speechpad.ru 5.4". After installation, they can be turned off or removed on the "Extensions" tab. (image. 11)

VoiceNote. In the application tab in the Chrome browser, double-click the program icon. A dialog box will open as in the picture below. By clicking on the microphone icon, you speak short phrases into the microphone. The program transmits your words to the speech recognition server and types the text in the window. All words and phrases shown in the illustration were typed the first time. Obviously, this method only works when there is an active Internet connection. (image. 12)

Voice notepad. If you launch the program from the applications tab, the Speechpad.ru page opens in a new tab. It offers detailed instructions on how to use the service, as well as a compact form. The latter is shown in the illustration below. (image. 13)

Voice text input allows you to fill out text fields on web pages using your voice. For example, I went to my "Google+" page. In the new message input field, I right-clicked and selected "SpeechPad". The input field turning pink indicates that you can dictate your text. (image. 14)

Google Voice Search allows you to search by voice. When you install and activate this extension, a microphone symbol appears in the search bar. When you click it, the symbol appears inside a large red circle. Just say your search phrase, and the search results for it will appear. (image. 15)

Important note: For the microphone to work with Chrome extensions, you need to allow microphone access in your browser settings. It is disabled by default for security reasons. Go to Settings→Personal information→Content settings. (To see all settings, click Show advanced settings at the end of the list.) The Page content settings dialog box will open. Scroll down the list and select Multimedia→microphone.

6. Results of working with Russian speech recognition programs

My brief experience with voice text input programs has shown that this feature is implemented excellently on the servers of the Internet company Google. Words are recognized correctly without any preliminary training. This indicates that the problem of Russian speech recognition has been solved.

Now we can say that Google's results will be the new benchmark for evaluating products from other manufacturers. I would like the recognition system to work offline, without accessing the company’s servers - that would be more convenient and faster. But it is unknown when a standalone program for working with a continuous stream of Russian speech will be released. It is reasonable to assume, however, that given the ability to be trained, this “creation” will become a real breakthrough.

I will go into detail about the programs of Russian developers, "Gorynych", "Dictographer" and "Combat", in the second part of this review. This article was written very slowly because the original discs are now hard to find. At the moment, I already have all versions of the Russian voice-to-text recognition engines except “Combat 2.52”. None of my friends or colleagues have this program, and all I have myself is a few laudatory reviews from the forums. True, there was one strange option - downloading “Combat” via SMS, but I don’t like it. (image16)


A short video clip will show you how speech recognition works on a smartphone running Android OS. A peculiarity of voice typing is that it needs a connection to Google's servers, so your Internet connection must be working.

Price: $199.99
Developer: ScanSoft
Website: www.scansoft.com
Size: n/a
Download page: n/a
+ Widest functionality; work in all Windows applications; powerful dictionary databases
- High price
! The best speech recognition software available

Definitely the best existing speech recognition module! Over its long history, Dragon has gone the whole hard way from soldier to marshal; well, perhaps not quite to marshal, but it has certainly earned the rank of army general. Working with the program is extremely simple - we connect headphones and a microphone to the corresponding jacks of the sound card and launch the utility itself. First, the user is asked to calibrate the microphone level and dictate a number of ready-made texts so that Dragon Naturally Speaking can tune itself to the user's timbre, intonation and pronunciation. Finally, there is an interactive tutorial that teaches the basic voice commands.

It is worth noting that the PC is not a living interlocutor: it cannot fill in “swallowed” syllables or understand an indistinctly spoken sentence. The speaker's accent is no less important - the level of English heard, for example, at various international scientific conferences is, in principle, unsuitable for this work. On the other hand, you can always teach yourself: if Dragon refuses to recognize a certain word, take the time to look it up in Lingvo and pronounce it according to the correct transcription. I assure you, in a week or two at most you will not only dictate kilobytes of text with ease, but also show off your authentic English pronunciation to your friends.

Still not satisfied with the recognition quality? Turn to the Accuracy Center to optimize your user profile and learn how to add popular neologisms to the vocabulary. More exotic actions are also possible, such as recognizing the text content of a wav file (including one from a Pocket PC or taken directly from the line output of a sound card). In addition, Dragon Naturally Speaking can launch various programs, switch between them and even control a number of their functions (for example, start or pause music playback in a media player, or work directly with menus). The Preferred and Professional versions additionally include the company's own speech engine, Real-Speech 2, one of the most advanced today.

But let's get back to recording speech. What is especially pleasing is that you can dictate text not only in the native DragonPad text processor, but also in any other similar application - MS Word, Outlook Express, Internet Explorer and Corel WordPerfect. The program works just as well with ICQ, network chat (Network Assistant) and other instant messengers; some commands become unavailable there, but you do not even need to press Enter to send a message: just say “New paragraph” and ICQ will send it automatically. In more specialized applications, in particular Word, additional commands are available for text formatting, spell checking and editing - all by voice alone. If the standard set of commands is not enough, you can always create your own, further expanding Dragon's functionality.

With a little effort, it is quite possible to dictate a page of text without any edits. The main thing is the right combination of intonation and, of course, pronunciation. Don’t draw out your phrases, but don’t rattle them off like a machine gun either, otherwise the share of correctly understood material will tend to zero. Moreover, you do not have to look in the dictionary constantly - even if you pronounce a phrase known to the program (for example, I’m very happy) not quite correctly, it will “guess” it and correct the text automatically. Amazing? It's all thanks to the huge vocabulary, which, along with advanced speech recognition technology, leaves competitors no chance. How can one not recall the early versions of Dragon, with which the author of these lines struggled a great deal in the past without ever getting high-quality work out of them...

Intelligent Voice Recognition System (IVOS) 2.0.2A
Shareware (30 days trial, registration - $50)
Developer ComunX
Website www.ivos.biz
Size 2.69 MB
Download page ftp://ftp.download.com/pub/ppd/1007091810190380/setup_ivos.exe
+ Microscopic distribution size; excellent functionality
- Dictation (shorthand) mode is not yet up to the level of Dragon
! One of the best utilities in this area

The most modest program in the review (in terms of distribution size) proved surprisingly capable and largely lived up to its ambitious name. The reason is its versatility, which is meant to do away with “manual” data entry altogether. IVOS allows you to: a) recognize speech and convert it into text in any Windows-compatible word processor; b) control your PC using a variety of voice commands, as well as create your own; c) have e-books read aloud by external voice engines. Add to that such small things as extracting text from wav files, a convenient control panel that does not clutter the screen, and an affordable price (compared with Dragon). After registration, the user gains access to VoiceTouch technology, which lets you teach the PC your own spoken commands.

Command execution is surprisingly reliable, perhaps even better than in Realize Voice. The recognition of dictated “lectures” is weaker, which is not surprising: it is one thing to understand a couple of words and quite another to understand a whole sentence. It should be noted that IVOS, like many other speech recognition programs except Dragon, relies on Microsoft's Speech API for this, so its effectiveness in this area depends directly on how well that corporation does its job. Nevertheless, you can already get good results from IVOS by reading it all the training texts it ships with. Of course, it will still not reach the level of Dragon Naturally Speaking, but it is quite capable of typing documents that are not too complex. And if you regularly update the user dictionary, scientific terms will not cause any particular problems either. True, a dilemma arises here: in the week you would spend teaching the utility all the intricacies of working with speech, you could just as well master ten-finger touch typing... On the other hand, a PC user's skills only grow when they know several ways of entering information into a computer.

Realize Voice 4.0

Shareware (15 days trial, registration - $49.00)
Developer Realize Software Corporation
Website www.realizesoftware.com
Size 55 MB
Download page www.realizesoftware.com/download/RzRV40download.exe (Web installer)
+ Tolerant of the user's pronunciation; very wide set of commands
- The quality of work could still be better; installs only on the English version of Windows
! Control your PC with just your voice

Realize Voice, unlike the previously reviewed Dragon Naturally Speaking, is not very good at dictation (although it does have such a function in its arsenal), but it copes brilliantly with voice commands. Notably, you don’t need an exceptionally deep knowledge of English: thanks to its smart heuristic analyzer module, the program will easily find a common language with almost any speaker. The range of Realize Voice functions is quite wide, from launching executable files and program shortcuts to working with correspondence and complex macros. As with other similar programs, the user only needs a connected microphone and a couple of minutes to get the hang of things. Before you actually start talking to the utility, it’s worth defining its scope of work. By default, this covers shortcuts from the system menu, the Desktop, the Favorites folder and the Quick Launch panel, as well as recently opened documents and programs. The whole process is fully automated and completes almost instantly. True, some inconvenience is caused by the inability to use numbers in command names: for example, you can launch DOOM 3 by voice only after renaming its shortcut to “DOOM Three”. The same, by the way, applies to Cyrillic, which is not such a cheerful prospect, is it? In such cases, however, you can always set up the program manually by directly specifying the path to the file, document, image and so on that you are interested in. Here the name of the file and its location do not matter at all: it can even be abvgd.exe, and you won't have to mangle the Desktop either. I was also very pleased with the set of built-in system commands for working with Windows. Although it is not too large, it can switch between open windows and emulate the most common keys (Spacebar, Insert, Home, etc.), and shutting down or locking the system with its help is quite possible too.
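
The renaming workaround mentioned above (DOOM 3 becoming “DOOM Three”) is easy to automate for a whole folder of shortcuts. Below is a minimal Python sketch of the idea. The helper name speakable_name and the digit-by-digit spelling are assumptions made for illustration; they are not part of Realize Voice.

```python
import re

# Words for the digits 0-9, indexed by the digit's value.
DIGIT_WORDS = ["Zero", "One", "Two", "Three", "Four",
               "Five", "Six", "Seven", "Eight", "Nine"]


def speakable_name(name):
    """Hypothetical helper: spell out every digit in a shortcut name.

    Each run of ASCII digits is replaced by the corresponding words,
    one word per digit, so the result contains no numerals.
    """
    def spell(match):
        return " ".join(DIGIT_WORDS[int(d)] for d in match.group())

    # Pad the spelled-out digits with spaces so that "Quake2" becomes
    # "Quake Two", then collapse any repeated whitespace.
    spaced = re.sub(r"[0-9]+", lambda m: " " + spell(m) + " ", name)
    return " ".join(spaced.split())


print(speakable_name("DOOM 3"))
```

Note that numbers with several digits are spelled digit by digit (a name containing 2004 ends in “Two Zero Zero Four”), which is a deliberate simplification; Cyrillic names would still need transliteration by some other means.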

A little about macros. The utility lets you combine a whole series of operations under one command, from keyboard input and system commands to speech synthesis using the built-in voice engine. True, an idyll such as burning a CD with a single phrase is still far away, but time will tell... The main thing is that you can already (and quite successfully!) “steer” your pet without anachronisms like a mouse and keyboard. Try it - you won't regret it!
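
To make the idea of a macro more concrete, here is a rough Python sketch of a table that maps one spoken phrase to a series of operations executed in order. It does not reproduce Realize Voice's actual macro format, which this review does not describe; the class MacroTable, the phrase and the placeholder actions are all hypothetical.

```python
from typing import Callable, Dict, List

Action = Callable[[], None]


class MacroTable:
    """Hypothetical registry: one spoken phrase triggers several actions."""

    def __init__(self) -> None:
        self._macros: Dict[str, List[Action]] = {}

    def define(self, phrase: str, actions: List[Action]) -> None:
        # Normalize the phrase so matching ignores case and outer spaces.
        self._macros[phrase.strip().lower()] = list(actions)

    def run(self, phrase: str) -> bool:
        """Run the macro for a recognized phrase; return False if unknown."""
        actions = self._macros.get(phrase.strip().lower())
        if actions is None:
            return False
        for action in actions:
            action()
        return True


# Placeholder actions simply record what a real macro step would do.
log: List[str] = []
macros = MacroTable()
macros.define("Morning routine", [
    lambda: log.append("launch mail client"),
    lambda: log.append("type greeting"),
    lambda: log.append("speak: good morning"),
])

macros.run("morning routine")
print(log)
```

In a real utility each placeholder would be replaced by a keystroke emulation, a system command or a call to the speech synthesizer, while the lookup by recognized phrase would stay the same.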

Voice Studio 1.4.6

Shareware (7 days trial, registration - $20.97)
Developer Ultimate Interactive Desktop's
Website www.voicestudio.us
Size 57 MB
Download page ftp://ftp.voicestudio.us/pub/dl2/vssetup.exe
+ Excellent functionality; the presence of a “live” animated character; very low price
- MS SAPI is used for speech recognition; quite high resource consumption
! A great addition to Dragon for controlling your PC with your voice

This is perhaps one of the few such programs, if not the only one, in which our virtual interlocutor on the other side of the monitor has finally taken on a visible form. And although the MS Agent technology used for this can hardly be called a prototype of artificial intelligence, it has all the prerequisites for it. The animated assistant is not only endowed with a certain degree of independence, but can also respond to a number of standard phrases (such as “Hello!”, “How do you feel?”, “Bad computer”, etc.). If desired, its vocabulary and stock of phrases can easily be extended, and its actions can be made to depend on its “mood.” Although such chatter with a PC is limited by the scope of the program’s knowledge, nothing prevents you from expanding it almost indefinitely. And from there it’s just a stone’s throw to the notorious AI... However, I digress.

As for functionality, Voice Studio has everything in order: dictation (though Dragon is much better at it), a variety of voice commands (which can be printed out for convenience and faster memorization), and acceptable speech synthesis. The more serious features include macros that launch a whole series of operations with a single keyword, and even the recording and playback of mouse movements! Let me remind you that this last feature is widely used in alternative browsers such as GreenBrowser or MyIE2 to perform a number of actions (going to another page, opening a new window, etc.). Now you don’t need any unnecessary gestures - just say the appropriate command, and the computer will automatically replay the previously recorded script. Who knows, maybe soon we will be able to play games using just a microphone? Time will tell…

In the meantime, Voice Studio undoubtedly deserves the highest rating for its remarkable friendliness and ease of use. It may not yet be able to take dictation accurately, but for controlling a PC by voice it is simply unmatched. The best of these utilities and a worthy addition to Dragon!

Dictation 2004 v.4.5.2399

Shareware (7 days trial, registration - $49.99)
Developer United Research Labs
Website www.research-lab.com
Size 41 MB
Download page www.bandwidthsaver.com/downloads/dict2002.zip
+ Basic set of functions for PC control and speech recording; great work with wav files
- Not the best speech recognition performance; annoying text editing module
! Too little for this price

Despite a seemingly standard set of basic skills, Dictation 2004 still has something to boast about. First of all, there is Point-and-Speak technology, which lets you easily create commands for entering passwords, launching software and dictating in almost all Windows applications. Integration with MS Word is also claimed, as is intelligent technology for correctly identifying phrases. True, the latter is implemented in an extremely inconvenient way: as a pop-up window that appears with every spoken word and kills any desire to work. It's good that you can turn it off. Dictation 2004 uses the same SAPI 5.1, so its quality is not fundamentally different from other software based on that technology (Voxx, IVOS, Realize Voice, etc.). Among the additional functions, WAV Recorder is worth noting: it captures audio from cassettes, mobile devices and microphones and saves it to wav files, from which the text is then extracted by a separate Dictation applet, Wave-to-Text. For now, of course, it is still far from ideal, but if the speaker has clear speech and good pronunciation, there will be no problems.

+ Versatility in work; variety of possibilities
- “Training” the program will take a lot of time
! Interesting product, but could be better...

Another “jack of all trades” that lets you chat with your PC to your heart's content. The feature list is very similar to that of IVOS (dictation, voice commands, text reading), with one useful bonus: the program scrupulously announces aloud your every action, be it typing or opening a file. It uses the same Microsoft Speech API as IVOS, so its recognition quality is similar. There is a good set of voice commands for browser navigation, basic text editor operations (cut/copy/paste, etc.) and window management, plus shortcuts for calling system applets and even opening and closing the optical drive tray. In short, everything needed for comfortable work. As for speech synthesis, it depends directly on the speech modules installed in the system. The free Microsoft engines that come with the program are far from ideal, but in principle you can get used to them. A more convenient, though alas not free, option is to try third-party products, in particular Digit PC, which also offers a very good Russian-language voice. Weighing all the pros and cons, Voxx is a good candidate for purchase. By the way, the trial version is limited only by the number of phrases/commands per session; to start a new session, just restart the program...

Conclusion

Despite their still numerous shortcomings, speech recognition programs have already moved from the rank of toys to that of serious tools for business users. If previously they were of little use, now they can really make the user's life easier and destroy the once unshakable stereotype that a computer is just an iron box that crunches numbers. And of course, the most pleasant part is the opportunity to experience, right now, the 21st-century technological progress that science fiction writers have so often written about. Join us!