In a few years,
people will be able to order pizzas over the phone without
ever talking to a person, change a TV channel with a word or
two and file expense reports from their cars via cell phone.
Technology that allows computers to understand speech has
been in the works for more than 30 years, but getting it right
and widely accepted has proven more difficult than researchers
expected.
Experts say researchers have
finally worked through some major stumbling blocks, such as
unreliable cell phone connections and background noise, that
make it tough for computers to pick up a voice. The technology,
they say, will likely go mainstream within five years.
They expect cell phones and cars to be the areas where
voice recognition could see the biggest growth. Concern over
distracted driving makes using speech to dial phones and
handle other tasks in cars even more urgent.
Some of the new technology, reminiscent of HAL, the talking
computer in the movie "2001: A Space Odyssey," is already
here:
- The Infiniti Q45 from Nissan allows passengers to change
the temperature and audio controls with voice commands.
- OnStar, a Troy, Mich.-based subsidiary of General Motors, has
launched a system in which subscribers pay a fee so they can
make phone calls and get weather, stock quotes and e-mail read
to them through computers that understand voice commands.
- United Airlines has a system that allows callers to get
flight information by saying a flight number, or the departure
and destination cities for a flight. The computer asks
questions and the caller responds by voice.
- Sprint PCS recently launched a system that lets customers
buy plans under which basic traffic reports or horoscopes are
read to them when they speak commands into their phones.
But speech recognition technology is expected to become
more sophisticated. Companies are spending millions to make
that happen. The 185 or so worldwide companies working on
speech recognition hope to cash in on what could be a vibrant,
new market.
Growing volume
Forrester Research estimates that voice-based commerce will
reach or exceed $450 billion, or three times the projected
amount for online retail sales, by 2003.
"It's going to be very big," said Ira Brodsky, president of
Datacomm Research Co. of Chesterfield, Mo. "Anybody who has a
phone will be using these kinds of services whether they know
it or not."
Brodsky said the technology will become popular because it
will focus on practical uses like getting directions or
traffic reports while driving.
Improvements in the underlying technology have made practical
speech recognition possible.
The cost of hardware and software has gone down. Faster
microprocessors handle information in seconds versus the
several minutes it took a few years ago. Wireless networks are
more reliable.
"It all depends on having the infrastructure in place,"
said Judith Markowitz of Markowitz Consultants in Chicago.
Not quite conversational
Software that understands conversational speech has yet to
be fully developed.
Emmett Coin, CEO of ejTalk Research, a speech recognition
research and consulting company in Detroit, said speech
recognition systems will need to understand conversation to be
fully accepted by the average person. "The more it mimics our
human innate pattern of speech, the easier it will be and the
less we will know it is there," he said.
Most commercial systems require specific voice commands or
work by recognizing particular words that act as commands.
For instance, JustTalk Inc., an Ann Arbor, Mich., speech
recognition company, has developed software that allows
businesspeople on the go to update their computer calendars,
contact lists and even their sales inventories with a phone
call. While the system can't understand conversation, it picks
up particular words from a caller's sentence.
Commands essentially limit the field of words a system has
to recognize, which helps increase the rate at which computers
can accurately understand speech.
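The effect of a limited command list can be sketched in a few lines of Python. This is only an illustration, not how JustTalk or any other vendor's software works: it matches text rather than audio, and the command list, the larger vocabulary and the function names are all made up for the example. It shows a misheard word snapping to the nearest command when the field is small, and a command being picked out of a longer sentence.

```python
import difflib

# Hypothetical command list for a phone or in-car service (illustrative only).
COMMANDS = ["dial", "weather", "traffic", "stocks", "email"]

# Hypothetical open vocabulary: the same commands plus similar-looking words.
OPEN_VOCABULARY = COMMANDS + ["stacks", "dual", "feather", "tariff"]


def best_match(heard, vocabulary, cutoff=0.6):
    """Return the vocabulary word most similar to the heard text, or None."""
    matches = difflib.get_close_matches(heard, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def spot_command(sentence, vocabulary=COMMANDS):
    """Return the first word in a sentence that closely matches a command."""
    for word in sentence.lower().split():
        match = best_match(word, vocabulary, cutoff=0.8)
        if match is not None:
            return match
    return None


if __name__ == "__main__":
    # With only five commands, the misheard word resolves to a real command.
    print(best_match("stacks", COMMANDS))
    # With a larger vocabulary, the same input is accepted as a different word.
    print(best_match("stacks", OPEN_VOCABULARY))
    # Picking a command out of a longer sentence.
    print(spot_command("please give me the traffic report"))
```

The fewer words the system has to choose among, the less room there is for a near-miss to be mistaken for something else.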
Much of today's software can reach up to 90 percent
accuracy, even higher with training, experts say. But strong
accents, high-pitched voices and speech impediments can throw
accuracy rates off.
Background noise can also confuse voice-recognition
systems. Unlike people, computers cannot focus on a particular
speaker and tune out the background noise,
explains Raymond Gunn, CEO of Clarity, a Troy company that
makes software that enables computers to separate a speaker's
voice from other noise.
Gunn said Clarity's software can boost a system's accuracy
rate from 90 percent up to as high as 97 percent.
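Put in terms of mistakes rather than accuracy, those figures imply a drop from about 10 errors per 100 attempts to about 3. A minimal Python sketch of that arithmetic follows; the function names are illustrative, and it assumes the percentages count correctly recognized words or commands, which the figures quoted here do not specify.

```python
def errors_per_hundred(accuracy_percent):
    """Number of misrecognized items out of 100 at a given accuracy."""
    return 100 - accuracy_percent


def relative_error_reduction(before_percent, after_percent):
    """Fraction of the original errors eliminated by the improvement."""
    before = errors_per_hundred(before_percent)
    after = errors_per_hundred(after_percent)
    return (before - after) / before


if __name__ == "__main__":
    print(errors_per_hundred(90))            # errors per 100 before
    print(errors_per_hundred(97))            # errors per 100 after
    print(relative_error_reduction(90, 97))  # share of errors removed
```

On that reading, a seven-point gain in accuracy removes roughly 70 percent of the errors.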
Background noise is a particularly big issue in cars and
public places.