Controlling a computer with speech


Vedics

The Vedics [12] program, also written in Python, integrates with the Gnome 2 and Gnome 3 desktop environments as well as Unity. Like Palaver, it comes without any user interface, but it understands only a selection of predefined English speech commands. To start Firefox, for example, you say "Run application," wait a few seconds, and then say "Firefox."

The program uses PocketSphinx as its engine, and the recognition accuracy is poor. When we said "Move down," for example, the program understood "Minimize window." It also put a heavy load on the processor of our test system, reacted to faint background noise, and crashed repeatedly.

To try Vedics, you can find it as a ready-made DEB package [13] on SourceForge. There, you can also download a PDF describing all the commands. Alternatively, you can download the tar archive, unpack it, and install the program using ./configure && make && sudo make install. In either case, you should start the program by typing vedics in a terminal. Only then can you see what text the program recognizes and whether it has crashed (Figure 7).

Figure 7: If you start Vedics from a terminal, you can tell immediately whether the program crashed. In such a case, use the kill command to terminate any leftover processes.
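
Because the terminal is the only place where Vedics reports what it understood and whether it is still running, a small wrapper script can make that check easier. The following is a minimal sketch, not part of Vedics: it assumes the launcher is on the PATH under the name vedics and that it prints its messages to standard output or standard error. The function names run_and_watch and describe_exit are hypothetical.

```python
import subprocess
import sys


def run_and_watch(command):
    """Run command, echo each output line, and return (exit_code, lines)."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge error messages into the same stream
        text=True,
    )
    lines = []
    for line in process.stdout:
        line = line.rstrip("\n")
        lines.append(line)
        print(line)
    exit_code = process.wait()
    return exit_code, lines


def describe_exit(exit_code):
    """Turn a process exit code into a short human-readable verdict."""
    if exit_code == 0:
        return "exited normally"
    if exit_code < 0:
        # On POSIX systems, a negative code means a signal ended the process.
        return f"killed by signal {-exit_code}"
    return f"failed with exit code {exit_code}"


if __name__ == "__main__":
    # Assumption: the installed launcher is called "vedics" and is on the PATH.
    command = sys.argv[1:] or ["vedics"]
    code, _ = run_and_watch(command)
    print(f"{command[0]}: {describe_exit(code)}")
```

Started without arguments, the script launches vedics, passes its output through to the terminal, and prints a one-line verdict as soon as the program ends. A verdict such as "killed by signal 11" would point to a crash rather than a regular exit.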

Conclusion

None of the candidates could compete with Siri or the commercial Windows programs. Speech recognition turned out to be a matter of luck, and the PocketSphinx engine [14] used by most programs lagged miles behind the commercial Dragon NaturallySpeaking.

Installation proved rocky because of all the required dependencies, and the operating concept was often cumbersome. Disabled users in particular would have a hard time getting the programs to work without assistance.

Palaver proved to have the best speech recognition capabilities, but it is inextricably linked with Google. The huge range of functions Simon provides can only be achieved with a massive amount of configuring – if you can get the program to work at all. Blather and FreeSpeech seem incomplete, and Vedics proved altogether useless in its current state. FreeSpeech at least allows input of English texts, as long as the extensive reworking isn't a bother.

Because the work on programs and engines is proceeding at its current slow rate, controlling the PC via speech may remain wishful thinking for Linux users for some time.
