Automatic Speech Recognition (ASR) technology is making waves
around the world. Deb Mukherjee says enterprises should be geared
to embrace ASR in developing applications that reduce expenditure
and lead to better customer satisfaction and retention.
Most of us 'love to talk' and would rather get things done that
way than punch buttons and keys, or fill in forms. It looks like
the time has come for us to do precisely this and jettison our
keyboards. Yes, Automatic Speech Recognition (ASR) is making this
happen, and it is only a matter of time before voice, the most
natural of all interfaces, becomes the best available alternative.
However, the benefits of ASR are not limited to merely speaking
to one's PC. It can redefine the way business is done across
the globe.
ASR systems are making round-the-clock access possible to
voice portals, providing information ranging from stock quotes
and business news to weather and traffic reports, over telephones.
These voice portals use speech recognition technology that
enables the telephone to give callers access to information
repositories, in much the same way that Web portals serve as
information hubs for Internet users.
Although ASR systems are making headway, they will see limited
buy-in unless they are integrated with traditional interactive
voice response (IVR) systems. By integrating ASR with IVR,
organisations can transform telephony applications (in conjunction
with transaction processing systems) to automate functions
such as customer order entry, call centres, help desks, and
other customer support.
We are already seeing a move in this direction with the front-runners
being brokerage houses, financial institutions, banks, retail
and healthcare organisations. The highly competitive nature
of the retail, brokerage and financial services markets is
pushing these sectors to adopt speech recognition consumer
interfaces to sustain their market presence.
The situation today is that if brokerages and financial firms
do not integrate speech recognition capabilities with their
systems, they run the risk of losing a significant number
of customers to those that do. Higher call volumes are also
making the addition of speech recognition more cost-effective
in call centres and other customer service organisations.
The primary benefit that motivates these organisations to
embrace speech solutions is the potential for dramatic reduction
in operational costs. Speech solutions can enhance the productivity
of customer service personnel through partial or complete
automation of customer calls. Increased automation frees the
customer service agent from many routine administrative tasks
and reduces costs related to customer service staffing, as
fewer agents are able to serve more customers.
Three types of ASR applications readily come to mind: dictation
and transcription, command and control, and telephony applications.
Dictation applications include simple dictation into windows,
for example to create and edit documents. A few ASR
technology providers have succeeded in developing exclusive
dictation applications with customised vocabularies for specialised
markets such as medicine and law. The dictation products
we see today are also extremely helpful to individuals
with physical disabilities.
Beyond the benefits of dictation, ASR can also help us
control our desktops using customised speech commands.
We may just be able to say 'shut down' or 'restart' and our
machine will obey. But beyond these simple activities, in
the near future, I envisage widespread adoption of ASR technology
in telephony applications for call centres, brokerage
houses, financial institutions and airline reservations.
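
The command-and-control idea can be sketched in a few lines. This is only an illustration, assuming that some speech engine has already turned the spoken phrase into text; the names `normalise`, `build_commands` and `dispatch` are hypothetical, and the handlers merely record what they would do rather than touching the operating system.

```python
from typing import Callable, Dict, List


def normalise(transcript: str) -> str:
    """Lower-case the recognised text and collapse extra whitespace."""
    return " ".join(transcript.lower().split())


def build_commands(log: List[str]) -> Dict[str, Callable[[], None]]:
    """Map each spoken phrase to a handler.

    Assumption: the handlers only append to a log. A real desktop
    integration would call the operating system at this point.
    """
    return {
        "shut down": lambda: log.append("shutdown requested"),
        "restart": lambda: log.append("restart requested"),
    }


def dispatch(transcript: str, commands: Dict[str, Callable[[], None]]) -> bool:
    """Run the handler for a recognised phrase; return False if unknown."""
    handler = commands.get(normalise(transcript))
    if handler is None:
        return False
    handler()
    return True


if __name__ == "__main__":
    actions: List[str] = []
    commands = build_commands(actions)
    dispatch("Shut down", commands)
    dispatch("play some music", commands)  # not in the vocabulary, ignored
    print(actions)
```

Keeping the vocabulary to a small fixed set of phrases, as here, is what makes command and control a simpler problem than free dictation.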
The list of companies offering speech recognition technology
or products and solutions is fairly long. A majority
of blue-chip computer software and hardware vendors and communication
companies are investing considerably in speech recognition
technology.
Undoubtedly, financial justification emerges as a critical
factor in evaluating new technology solutions for any business.
Currently, implementing ASR solutions in an enterprise is
indeed expensive. In fact, building a state-of-the-art speech
recognition and text-to-speech engine makes up most of
the cost. However, with developments in natural language processing
(NLP) techniques and speech engines with high recognition
accuracy, we can expect the implementation cost to come down
significantly. The cost of implementation varies widely depending
on the amount of traffic that the set-up is expected to handle.
A research report on speech enabling a telephony application
suggests that the return on investment (ROI) for a typical ASR
solution is realised in a mere six months!
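
To see how a six-month figure of this kind can arise, here is a rough payback calculation. It is a sketch only: the function name `payback_months` is hypothetical, and the cost and saving figures are invented for illustration and are not taken from the research report.

```python
def payback_months(upfront_cost: float,
                   monthly_saving: float,
                   monthly_running_cost: float = 0.0) -> float:
    """Months needed for net monthly savings to cover the upfront cost."""
    net_saving = monthly_saving - monthly_running_cost
    if net_saving <= 0:
        raise ValueError("the deployment never pays for itself")
    return upfront_cost / net_saving


if __name__ == "__main__":
    # Hypothetical figures, chosen only to illustrate the arithmetic.
    months = payback_months(
        upfront_cost=300_000,         # engine, integration, rollout
        monthly_saving=60_000,        # agent time freed by automation
        monthly_running_cost=10_000,  # hosting and maintenance
    )
    print(f"Payback period: {months:.1f} months")
```

Because the saving grows with the number of calls automated while much of the upfront cost is fixed, higher traffic shortens the payback period.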
In the post-PC era, as computers become more ubiquitous,
ASR will become highly significant, and existing interface
options such as keyboards and mice will lose out to ASR as
the primary user interface. ASR will make our lives much
simpler by helping us with a number of activities, from
browsing the Internet and sending our mails to placing stock
orders or retrieving stock quotes and carrying out banking
transactions, from anywhere, at any time. And at the current rate
of development in ASR technology, in the near future we may well
be able to talk to, or control, our washing machines and
refrigerators using voice commands.
ASR, coupled with the combined strengths of the telephone and the
computer, would offer great benefits and rapid return on investment.
In the current era of fierce competition, where time is
money, enterprises should be geared to embrace speech solutions
in developing applications that reduce expenditure and lead
to better customer satisfaction and retention.
The author is CTO, Cognizant Technology Solutions. He can
be contacted at Debmukherjee@cognizant.com