Inverting Automated Voice Response

Khare (
Wed, 7 Jan 1998 23:46:39 +0530

One of the most intriguing techniques Jan van de Snepscheut taught us at
was program inversion: designing a computational system to work forwards and
backwards. This has a crucial physical property: energy can be conserved in
such systems; otherwise, in 'regular' chips & devices, waste heat is
generated each time a bit is destroyed. For example, 2+2 -> 4 throws away
information because 4 could have come from 1+3, 5+(-1), etc.
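
To make the 2+2 -> 4 point concrete, here is a minimal sketch of a lossy step next to an invertible one. It is only an illustration of the idea, not van de Snepscheut's own formulation, and the function names are made up for this example.

```python
def add_lossy(a, b):
    # Ordinary addition: the inputs cannot be recovered from the output alone.
    return a + b


def add_reversible(a, b):
    # Keep one operand alongside the sum so the step can be undone.
    return a, a + b


def add_reversible_inverse(a, s):
    # Run the step backwards: recover the original pair from (a, a + b).
    return a, s - a


# Several distinct inputs collapse to the same lossy output...
pairs = [(2, 2), (1, 3), (5, -1)]
lossy_outputs = {add_lossy(a, b) for a, b in pairs}

# ...but each stays recoverable when the reversible version is used.
recovered = [add_reversible_inverse(*add_reversible(a, b)) for a, b in pairs]
```

The reversible version pays for its invertibility by carrying an extra value around; nothing is discarded, so nothing has to be thrown away as a destroyed bit.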

Anyway, this riff isn't about the physics of computation, it's about those
press-one-for-yes, two-for-no, telephone response systems; and how they
mirror CGI-BIN gateway programs and need XML markup.

I was stuck at my brokers', trying to close out my Boston accounts this
(97% of my money is in indexes...) The only way to access it from outside
was by 800 number. I was sitting there like a glorified modem, punching in
digits. What would it take to put a scripting language on the client, so my
PC can store the relevant parameters (card #, amt to transfer, etc) and map
voice I/O to symbolic I/O? In other words, simulate an ATM over this voice
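
A rough sketch of what the client-side script might boil down to, assuming the speech has already been turned into text by some recognizer (not shown). The parameter names, the placeholder values, and the "#" terminator are all assumptions for illustration, not any bank's actual interface.

```python
# Hypothetical stored parameters; the values are placeholders.
PARAMS = {"card number": "0000", "amount": "150"}


def respond(prompt_text, params):
    """Return the keypad digits to send for a recognized prompt, or None."""
    text = prompt_text.lower()
    for name, value in params.items():
        if name in text:
            # Assumed convention: the entry is terminated with the pound key.
            return value + "#"
    # Unknown prompt: hand control back to the human.
    return None


card_reply = respond("Please enter your card number, then press pound", PARAMS)
amount_reply = respond("Enter the amount to transfer", PARAMS)
unknown_reply = respond("Thank you for calling. Goodbye.", PARAMS)
```

Anything the script doesn't recognize falls through to None, which is where the coaching would have to come in.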

Sure, the heavyweight solution is "home banking" -- just define a
protocol for BayBankBoston transactions and write a mission-specific edition of
QuickIntuit. Uh-uh. Build a SINGLE, flexible system, and have it LEARN how to
deal with new bank VR systems. How much coaching does it take? Recognizing one of
those clearly enunciated Telephone Peoples' voices should be easy enough.

Computer-computer communication doesn't always have to sound like dolphins

It also doesn't have to look like punch cards: this voice scenario is the same as
making a travel agent wizard that backs into reservation websites.

In planning the latest trip to Switzerland, I need to fuse data from
Expedia, and Swiss Rail to get a complete set of possible routings by airfare and
air+train. All in all, I probably spent six hours worrying about this

The tree menu of a voice response is the same kind of schema automation tools need
to extract as a WIDL meta-form. Unfortunately, in both cases, we are stymied because
there is no stable binding from meat-data to meta-data. You can't tell from
bland syntho-voice, which revision of the bank menu you're hearing; you can't tell
what format table the fare data will appear in from this form. There is reason to
hope, though: we learn, we adapt, we roll back errors; soon, automated tools will do
the same.
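
For what it's worth, the "tree menu as schema" idea can be sketched as plain data plus a search. The menu below is entirely made up (no real bank's layout), and the `revision` field is a hypothetical stand-in for exactly the stable binding that's missing today.

```python
# Hypothetical menu schema: each node is a prompt plus digit -> child node.
MENU = {
    "revision": "hypothetical-1",  # the version tag real systems don't announce
    "prompt": "main menu",
    "options": {
        "1": {"prompt": "account balance", "options": {}},
        "2": {
            "prompt": "transfers",
            "options": {
                "1": {"prompt": "transfer between accounts", "options": {}},
                "2": {"prompt": "close account", "options": {}},
            },
        },
    },
}


def digits_for(menu, goal, prefix=""):
    """Depth-first search; return the key presses that reach `goal`, or None."""
    if menu["prompt"] == goal:
        return prefix
    for digit, child in menu["options"].items():
        found = digits_for(child, goal, prefix + digit)
        if found is not None:
            return found
    return None


close_path = digits_for(MENU, "close account")
missing_path = digits_for(MENU, "mortgage rates")
```

With a schema like this in hand, punching in digits becomes a lookup; without a reliable revision marker, the tool can't know whether the tree it stored still matches the one on the line.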

This isn't a holy grail of language modeling, like Zue's Galaxy group is
it's possible with VERY dumb systems. It just hasn't been seen as a killer app, yet.
Maybe it isn't. But my task list for my someday-equerry is growing


PS. One of the latest is a standard daily sort-and-scan of environmental
contracts in the gov't's Commerce Business Daily site. I can't even bookmark the search,
much less automate it. WIDL? No, it's got to be MUCH easier to bring scripting to the
masses. For that, info publishers will have to craft schemas, too, to format their
results for later reuse.


Those of you who don't know what DARPA is, should leave now.

whaddya think of the whitemale percentage?
I think the woman who gave you your badge is the last woman you'll see
until wed night.
So, what are the ethics of hitting on the chair's daughter?
hey, as long as your interests are pure...
purely professional?
oh, absolutely
But what about gin, then?
stay with baby blue bombay

not many grad students here. Maybe 5?
It's rare to see a meeting of old white geeks where Kiniry has the least

"can we get OMG to talk to Microsoft?" -- clearly an academic talking.

no xml -- old, buzzword-noncompliant slide?

I haven't seen this many Generic PhD technologists in a while.
heh heh, he said break functional requirements!
Is he insane? What kind of sw should HAVE 15 year lifecycles?
Oh, yeah, this is the guy who thinks intrinsic ilities aren't
putting the feh in professional

To look up