Federal Computer
Week
Can You Hear Me Now?
Federal Computer Week - 11th July 2005.
By Paul Ferrill.
The computer on the original "Star Trek" TV series
set the bar awfully high for voice recognition. It not only
understood human speech — even slang — but also replied clearly and
with personality. Although earthbound voice-recognition technology
is rapidly improving and is now useful for many office tasks, it has
not yet attained the standard the starship Enterprise set.
Indeed, the technology needs a combination of vastly
improved artificial intelligence and a more sophisticated
speech-recognition engine before it can match the performance of the USS
Enterprise's system.
The good news is that the latest versions of Dragon
NaturallySpeaking and IBM's ViaVoice do a pretty good job of
figuring out what you're saying. As those products improve, their
range of uses broadens.
Transcription for the medical and legal fields
continues to be one of the most frequent applications of
voice-recognition technology. Enabling accessibility for users with
disabilities runs a close second. Although most organizations that
use a computer-assisted transcription process won't totally replace
manual labor, they often use employees in quality assurance and
editing roles rather than as professional transcribers.
For this review we looked at Dragon NaturallySpeaking
Professional Version 8 and IBM ViaVoice Pro USB Edition Version 10.
Both products come with a high-quality noise-cancelling headset from
Andrea Electronics, although they use different models. Dragon
NaturallySpeaking comes with Model 91, while ViaVoice includes Model
61.
Both products offer similar speech-to-text
capabilities, although their target markets clearly differ.
Dragon NaturallySpeaking comes with a number of features
aimed at enterprises, including the ability to
store voice profiles on a central server and to transcribe audio files
from digital recorders or any handheld device that supports the
Microsoft PocketPC operating system. ViaVoice focuses more on
individual users, providing most of the same functions as Dragon
NaturallySpeaking without the enterprise extras.
Getting started
During the setup process, both packages require you
to configure the software to match the hardware, such as headsets or
microphones, and specific users. During the first step, you speak
into the microphone to set the audio level.
Then you must train the algorithms to match your speech.
To complete this learning process, you read large portions of text
to train the software to recognize your speech patterns. Lastly, the
program searches your computer for text files or e-mail messages,
which helps the software learn your writing style.
I first tried to train Dragon NaturallySpeaking
in a room with a high level of ambient noise. The first
step of the calibration process adjusts the volume level while the
second step adjusts for the noise level. I was able to get past the
first but not the second step in that room. Moving to a quieter room
made all the difference, and the process proceeded without incident.
Dragon NaturallySpeaking also supports input from
external recording devices, including PocketPCs. Recognition of
audio from one of those devices is not as accurate as recognition of
audio from a good noise-cancelling headset. To alleviate
this problem, you can spend 15 minutes reading a long passage
from one of eight literary works when setting up the PocketPC as an
input device. Be careful
which passage you select because I had trouble focusing — and not
laughing — when reading "Dogbert's Top Secret Management Handbook."
IBM's ViaVoice product uses a similar setup process.
I had no problem completing the configuration steps in a quiet room.
I tried both products in the noisier room — noise created by a
window air conditioner and occasionally by a high-speed server fan —
after the training session and both performed acceptably.
In use
Both products instruct you to speak in your normal
tone of voice and at the pace you would typically use. They also
encourage you to pronounce your words clearly and distinctly to help
the recognition process.
You need to become accustomed to watching text appear
on the screen while speaking. Depending on your configuration and
how fast you talk, you could speak an entire sentence before
anything shows up on the screen.
To test the speech-recognition software, I used a
second-grade grammar textbook and read a paragraph with a number of
homonyms in it.
Both programs did a pretty good job of recognizing
the difference between words such as "see" and "sea" using the
context of the sentence. Dragon NaturallySpeaking couldn't seem to
understand the word "homonym" while ViaVoice picked it right up. For
other words, ViaVoice had problems while Dragon NaturallySpeaking
got them right.
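As a rough illustration of what "using the context of the sentence" can mean, here is a minimal Python sketch. It is not how either product works; commercial engines rely on statistical language models. The cue-word table and the function name `pick_homonym` are hypothetical and exist only for this example.

```python
# Hypothetical cue words for each homonym (illustration only).
CONTEXT_CUES = {
    "see": {"i", "you", "can", "we", "to"},
    "sea": {"the", "ocean", "ship", "sailed", "waves"},
}


def pick_homonym(candidates, context_words):
    """Return the candidate whose cue words overlap most with the context.

    Ties go to the candidate listed first.
    """
    context = {word.lower() for word in context_words}
    return max(
        candidates,
        key=lambda word: len(CONTEXT_CUES.get(word, set()) & context),
    )


if __name__ == "__main__":
    print(pick_homonym(["see", "sea"], ["The", "ship", "sailed", "across", "the"]))
    print(pick_homonym(["see", "sea"], ["I", "can", "it"]))
```

A toy table like this fails as soon as a sentence lacks the listed cue words, which is one reason each product got some words right that the other missed.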
If the software misunderstands a word or phrase, you
can correct the mistake so that it won't happen again. ViaVoice uses
a correction pop-up menu activated by selecting the wrong word and
speaking "Correct < text >." The menu then presents a list of
possible replacements. If you find the right word in the list, you
say, "Pick < n >" to select that word. You can also type in the
correct word if the program can't figure it out.
Remember to speak your punctuation
when dictating. Both programs interpret keywords such as "period" as
commands, in this case to end the sentence and insert a period. Other phrases, such as
"new paragraph," instruct the software to end the sentence and start
a new paragraph.
To get the software to recognize a keyword as text,
you must speak the word as part of a sentence without pausing.
There's also a spell mode that lets you spell out license plate
numbers or proper names with multiple capital letters, for example.
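The keyword behavior described above can be sketched in a few lines of Python. This is a simplified model, not either vendor's implementation. It assumes the recognizer hands over a list of utterances separated by pauses; the `PUNCTUATION` table and the function name `render_dictation` are hypothetical.

```python
# Hypothetical keyword table; the real products define their own commands.
PUNCTUATION = {"period": ".", "comma": ",", "question mark": "?"}


def render_dictation(utterances):
    """Build text from pause-separated utterances.

    An utterance that exactly matches a keyword is treated as a command.
    The same word spoken inside a longer utterance stays as plain text.
    """
    out = ""
    for utterance in utterances:
        spoken = utterance.strip().lower()
        if spoken == "new paragraph":
            # End the current sentence if needed, then start a new paragraph.
            if out and not out.endswith((".", "?", "!")):
                out += "."
            out += "\n\n"
        elif spoken in PUNCTUATION:
            out += PUNCTUATION[spoken]
        else:
            # Plain dictation: separate it from earlier text with a space.
            if out and not out.endswith("\n"):
                out += " "
            out += utterance.strip()
    return out


if __name__ == "__main__":
    print(render_dictation([
        "the sentence ends with a period",
        "period",
        "next one",
        "new paragraph",
        "hello",
    ]))
```

In the first utterance the word "period" is spoken without a pause, so it stays in the text. Spoken alone, it becomes punctuation.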
Dragon NaturallySpeaking includes a set of tools
under the Accuracy Center to add words to your vocabulary or perform
additional training. You can add individual words or make the
program analyze a document and let you add words to the software's
vocabulary in bulk. Accuracy Center also lets you adjust your
microphone settings in case you change environments or hardware.
Both programs use a toolbar that loads at the top of
your screen by default. The Dragon NaturallySpeaking toolbar
displays a number of color-coded menu items along with the name of
the current user and the default input device. ViaVoice uses the
toolbar to display what it thinks you said and to communicate error
messages if it doesn't understand you.
ViaVoice includes a macro command feature to define
new commands to insert special text or automate a particular
function. One feature exclusive to ViaVoice is the ability to create
a macro template form that you can fill out later.
ViaVoice's documentation uses the example of a form
for a doctor's office that always includes patient information,
symptoms and diagnoses. Both programs allow you to import and export
those custom commands or macros for other users or computers.
Enterprise attention
Dragon NaturallySpeaking includes a number of
features intended for enterprise users. For example, it can store
user profile information on a server for access from more than one
computer.
The professional version of Dragon NaturallySpeaking
also includes software for personal digital assistants and digital
dictation devices. I tried the software on a Hewlett-Packard iPaq
hx2415 and found it to be more than adequate.
The product also supports multiple dictation sources
for specific users. But you still must train each dictation device.
Once you train the new device, you simply add it as another input
device for a specific user. Dragon NaturallySpeaking automatically
backs up user speech files after every fifth update, but you can
change the frequency.
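The backup rule is simple enough to model in Python. The sketch below is only an illustration of "back up after every Nth update" with a configurable frequency. The class name `SpeechProfile` and its attributes are hypothetical and do not reflect Dragon NaturallySpeaking's internals.

```python
class SpeechProfile:
    """Toy model of a user profile that is backed up every Nth update."""

    def __init__(self, backup_every=5):
        if backup_every < 1:
            raise ValueError("backup_every must be at least 1")
        self.backup_every = backup_every  # default mirrors "every fifth update"
        self.updates = 0
        self.backups = 0

    def save_update(self):
        """Record one update; return True if it triggered a backup."""
        self.updates += 1
        if self.updates % self.backup_every == 0:
            self.backups += 1
            return True
        return False


if __name__ == "__main__":
    profile = SpeechProfile()
    print([profile.save_update() for _ in range(10)])
```

Passing a different `backup_every` value models the option to change the frequency.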
A Manage Users dialog box lets you choose options for
setting backup and restore functions, importing/exporting custom
commands, and selecting multiple dictation devices. You set those
preferences at an individual computer used by multiple users or on a
central file server for roaming users.
Accessibility options
Both programs make it possible to operate a computer
virtually hands-free for individuals with physical challenges. The
user manuals for Dragon NaturallySpeaking and ViaVoice show how to
verbally execute basic Microsoft Windows functions, such as moving
the cursor on the screen and clicking the mouse. They also include
basic operating instructions for the most popular productivity
applications.
Options for the visually impaired are limited to
reading text from within a word-processing program or the scratch
pad application. Both programs provide a simple scratch pad
application that allows you to dictate text, copy it and then paste
it into another application.
Bottom line
Don't expect to see "Star Trek"-level speech
recognition anytime soon. Although some users have adopted voice
recognition to cope with carpal tunnel
syndrome or other physical limitations, you won't find a headset or
microphone on most desks.
Curiously, this lack of general acceptance seems to
have little to do with the technology's performance, as I found in
reviewing these two products. Rather, user perception and lack of
motivation rank as the two biggest challenges to widespread
adoption.
Many people get along fine with the way they use the
computer now and don't want or need another input device that has
some limitations and takes some customization.
Dragon NaturallySpeaking works well at what it does:
text transcription and dictation support. Although it costs more
than ViaVoice, it also offers more features and functions to justify
the price difference.
Ferrill, based in Lancaster, Calif., has been writing
about computers and software for more than 15 years.