Link to
Computerworld
Hands Off: Using Dragon NaturallySpeaking
Professional writer finds it twice as fast as keyboarding ...
By Lamont Wood (Computerworld Magazine, May 18, 2007)
You decide what you are going to say. You say the words. They
appear on the screen. You're done.
That's what writing with speech recognition -- specifically, Dragon
NaturallySpeaking Version 9.0 -- amounts to. The lifetime it took to
achieve a smooth keyboarding rate of 60 words per minute no longer
matters. The skill you effortlessly mastered as a child -- talking
-- is all that's required to input text to your computer.
Of course, there's a little more to it than that, but the
requirements are trivial compared to, say, a semester of typing
class. First, you'll want a fast processor, but everything sold new
today will probably suffice. You will want to install all the RAM
your machine can take -- a gigabyte is a good start. You'll want a fairly
quiet place to work, but any place you feel comfortable talking on
the phone will probably suffice. And you have to believe that
consistent pronunciation is a worthy goal, and not an artificial,
elitist imposition.
The software installs from two CDs in a conventional manner. The
package also comes with a headphone/microphone, and you'll want to
make sure it's connected correctly before beginning -- the correct
I/O ports are not always obvious.
After it's installed, the software will examine the vocabulary of
your "My Documents" folder, and ask you to read at least one short
canned passage while it analyzes the way you pronounce the words.
This reading is called the enrollment process and takes only a few
minutes -- unlike versions of the software from the last
decade, in which the process took nearly an hour.
Once it's running, Dragon NaturallySpeaking will install a command
bar along the top of the screen, with a microphone icon and various menu
items. To begin dictation, you position the cursor on the screen,
just as if you were about to type something. But instead of typing,
you click the microphone icon and begin speaking.
At this point, accuracy will probably be about 95%. That means
there will be about 10 mistakes per double-spaced page. That's well
short of perfection, but the mistakes will be correctly spelled
words that just happen to be the wrong words. To correct an
incorrect word, you select it and say the correct word. In most
cases, that suffices. Correcting the text, in other words, takes
hardly more time than it takes to proofread it. Inputting that text,
meanwhile, happens at conversational speed, which for most people is
between 120 and 150 words per minute. (The software claims to be
able to handle 160 words a minute.)
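The relationship between accuracy and mistakes per page is simple arithmetic, sketched below in Python. The page length of 200 words is an assumption chosen because it reproduces the "about 10 mistakes" figure at 95% accuracy; the article does not state a page length, and the second accuracy value is included only for comparison.

```python
def expected_errors(words: int, accuracy: float) -> float:
    """Expected number of misrecognized words at a given word-level accuracy."""
    return words * (1.0 - accuracy)


# Assumption: a double-spaced page holds roughly 200 words.
# This value is inferred, not taken from the article.
WORDS_PER_PAGE = 200

for accuracy in (0.95, 0.99):
    errors = expected_errors(WORDS_PER_PAGE, accuracy)
    print(f"{accuracy:.0%} accuracy: about {errors:.1f} mistakes per page")
```

With a longer page (say 250 words) the same 95% accuracy gives proportionally more mistakes, so the per-page figure is best read as a rough guide.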
By comparison, the average typing speed for an office worker is
about 40 to 50 words a minute. My own is about 60 with an accuracy
of 93%, so on the whole, I've found that using speech recognition is
about twice as fast as typing. Those who type at hunt-and-peck
speeds will experience results that are even more dramatic.
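The "twice as fast" estimate can be checked with a short calculation. This is a minimal sketch that compares raw input rates only; it deliberately ignores correction and proofreading time, which is a simplifying assumption. The function names are illustrative.

```python
def minutes_to_input(words: int, words_per_minute: float) -> float:
    """Time needed to enter a text of the given length at a steady rate."""
    return words / words_per_minute


def speedup(dictation_wpm: float, typing_wpm: float) -> float:
    """How many times faster dictation is than typing, on raw input rate alone."""
    return dictation_wpm / typing_wpm


# Rates quoted in the article: conversational speech at 120 to 150 wpm,
# the author's typing at 60 wpm, office workers at 40 to 50 wpm.
print(speedup(120, 60), speedup(150, 60))   # author: low and high estimates
print(speedup(120, 50), speedup(150, 40))   # office worker: low and high estimates
print(minutes_to_input(600, 60), minutes_to_input(600, 120))
```

On these numbers the author's speedup falls between 2 and 2.5, which is consistent with "about twice as fast" once some time for corrections is allowed.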
Dragon NaturallySpeaking in action
After the user starts speaking, the yellow "results box" appears
near the text insertion point, showing what Dragon thinks it has
heard so far. The results will change as the phrase lengthens and
Dragon is able to perform further analysis. For instance, it decided
that "period" was punctuation in all three sentences.
In the third sentence, it initially placed "to" after "brought" but
moments later changed it to "two," successfully analyzing the
homonyms. Dragon was less successful with the homonyms in the second
sentence, failing to differentiate "which" and "witch."
Dragon types the resulting text after analyzing a phrase, so that
what appears on the screen may fall a sentence behind what the
speaker is saying (as was the case in this example). This can be
disorienting, and users are advised not to watch the screen during
dictation.
When finished, the user selected the second "which" and spoke the
word "witch" again. Dragon knew enough to respell it as the other
homonym.
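The way a guess can change from "to" to "two" as more of the phrase arrives can be illustrated with a toy model. This is not how Dragon works internally, which the article does not describe; it is a minimal sketch that scores each homophone against its neighboring words. The score table, the word "apples" and the function name are all hypothetical, invented purely for illustration.

```python
# Hypothetical word-pair scores, made up for this example only.
PAIR_SCORE = {
    ("brought", "to"): 2,
    ("brought", "two"): 1,
    ("two", "apples"): 3,
}

CANDIDATES = ("to", "too", "two")


def pick_homophone(prev_word, candidates, next_word=None):
    """Choose the candidate that fits best with the words heard so far."""
    def score(candidate):
        total = PAIR_SCORE.get((prev_word, candidate), 0)
        if next_word is not None:
            total += PAIR_SCORE.get((candidate, next_word), 0)
        return total
    return max(candidates, key=score)


# With only the left-hand context, "to" looks like the best fit.
print(pick_homophone("brought", CANDIDATES))
# Once the following word is heard, the choice is revised.
print(pick_homophone("brought", CANDIDATES, next_word="apples"))
```

The point of the sketch is only that a longer phrase supplies more context, so an early guess may be overturned a moment later.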
But when used seriously, the software presents several minor
annoyances, the sum of which may drive users back to their familiar
keyboard.
First, there's the discouragement of seeing the computer generate
mistakes that you didn't make. To counteract that, the user needs to
learn how to employ the software's correction functions so that
accuracy will gradually improve, until the user's day-to-day
vocabulary is mastered. (After six weeks of daily use, my system's
accuracy has reached almost 99%.) Basically, when a word is
misrecognized, you select it and say "correct (that word)." The
software will learn as it goes along, and your user experience will
gradually improve.
Frankly, the software does an excellent job recognizing long words,
but will stumble over one-syllable words, using an/and, he/me/the or
in/on interchangeably. The correction process can tame that
tendency, but you also have to learn a different emphasis when
proofreading. With typing, by comparison, you have to check the long
words but can assume that the one-syllable ones are accurate.
Good at homonyms
On the other hand, Dragon is quite good at differentiating
homonyms. "I too took two shoes to the beach," usually came out
correctly, albeit after weeks of learning my vocabulary.
Dragon includes a text-to-speech facility to read the text back to
you. This will bring out mistakes that your eye would pass over,
especially the misused one-syllable words.
Second, you need to stop "thinking with your fingers" and learn to
dictate. You have to pretend that you are a BBC announcer and punch
out the words clearly and consistently. Slips of the tongue will
result in errors, so try not to slip -- and try not to take it
personally when Dragon doesn't understand you. Also, you have to learn
to pronounce all of the punctuation marks. And while it sounds
counterintuitive, not watching the screen while you dictate will make
the process go more smoothly.
Beyond that, to truly master the software, you have to get a feel
for its rhythms. For instance, if you want to capitalize a name at
the end of the sentence, you need to say it, pause and then say the
words "cap that" before dictating the punctuation mark.
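The dictation sequence described above (say the name, say "cap that", then dictate the punctuation) can be mimicked with a toy interpreter. This is a sketch, not Dragon's command grammar: only "period" and "cap that" come from the article, "comma" is an assumed extra entry, and the `render` function is hypothetical. Unlike Dragon, it always treats "period" as punctuation and ignores pauses.

```python
# Spoken punctuation names mapped to symbols. "comma" is an assumption.
PUNCTUATION = {"period": ".", "comma": ","}


def render(spoken: str) -> str:
    """Turn a string of spoken tokens into text, applying simple commands."""
    out: list[str] = []
    tokens = spoken.lower().split()
    i = 0
    while i < len(tokens):
        token = tokens[i]
        # "cap that" capitalizes the most recently dictated word.
        if token == "cap" and i + 1 < len(tokens) and tokens[i + 1] == "that":
            if out:
                out[-1] = out[-1].capitalize()
            i += 2
            continue
        # A punctuation name attaches its symbol to the previous word.
        if token in PUNCTUATION:
            if out:
                out[-1] += PUNCTUATION[token]
            else:
                out.append(PUNCTUATION[token])
            i += 1
            continue
        out.append(token)
        i += 1
    text = " ".join(out)
    # Capitalize the first letter of the result.
    return text[:1].upper() + text[1:]


print(render("we met with lamont cap that period"))
```

The example input follows the order the article recommends: the name, then "cap that", then the punctuation mark.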
On the other hand, keyboards are not going to go away. Speech
recognition is a great way to get your first draft on paper,
especially as you can grab the thought in flight and send it
directly to the screen, as it were, instead of having to arrest it
and then wrestle with the keyboard to render your thought as
text. But editing is easier with a keyboard, especially when you
need to move text around. Meanwhile, text does not always read as
well as it sounds.
Also, the speed advantage is not automatic, since the speed at
which you can mentally compose publishable text is not much faster
than typing speed. But when composition is not an issue (as when
dictating notes or first drafts) the speed advantage can be
dramatic.
In a home office, the chief disadvantage stems from the
fact that the cat cannot be convinced you're not talking to it. But
that's another story.