Aging with Technology: Help with the details.

May 23rd, 2009

A crisis is on the horizon with the “Baby Boomers.” (Yawn) This, of course, is not news.

Soon there will be more of them who need assistance than society can pay for.

Soon there could be more of them than there are people employed in the workforce. This goes beyond money: There won’t be enough time for the following generations to attend to all the details of their aging parents and grandparents. And the list of details will grow larger as they grow older.

Because of recent improvements in medical care, this group will remain healthier for much longer than previous generations did. This will lead to a span of years (a gray area?) when most of these individuals will be alone yet would benefit greatly from (require?) some personal assistance.

Cassandra, a dedicated synthetic agent, could be a big help with appointments (doctor, classes, parties, etc.), lists (shopping, to-dos, gifts, etc.), reminders (telephone calls, birthday cards, trash day, etc.) and other such things. But she could be more than just a schedule manager; she could get a sense of the health and well-being of the person she assists: Sleeping more? Skipping doctor appointments? Cognition lapses?
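
As a rough illustration (nothing here is a real product; the data fields and thresholds are invented), a first cut at that “sense of well-being” could be as simple as comparing a couple of weeks of routine data against a baseline:

    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class DailyLog:
        day: date
        hours_slept: float
        missed_appointments: int

    def wellbeing_flags(logs: list[DailyLog], baseline_sleep: float = 7.5) -> list[str]:
        """Return simple, human-readable flags worth mentioning to family or a caregiver."""
        flags = []
        recent = logs[-14:]  # the last two weeks of entries
        if recent:
            avg_sleep = sum(d.hours_slept for d in recent) / len(recent)
            if avg_sleep > baseline_sleep + 1.5:
                flags.append("Sleeping noticeably more than usual")
            missed = sum(d.missed_appointments for d in recent)
            if missed >= 2:
                flags.append(f"Missed {missed} appointments in the last two weeks")
        return flags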

We have been told (and suspect it is true) that playing games forestalls dementia. Cassandra could be a more engaging opponent (or teammate) than a computer alone.

Cassandra is unlikely to convince anyone that she is “a person,” but it is very likely that she would eventually be embraced as a “technology assist.” Think of all the non-human things we delegate details of our lives to: snooze alarms, thermostats, cruise control, ABS brakes, toasters, MS Outlook, “out-of-office” email responses, auto-pilots, blood glucose meters, spell checkers, scooter shopping carts, … The list of simple agents that we use (trust and rely on) is very long.

I trust my coffee maker to grind and brew my java at 6:15am and then keep it warm for an hour.

I am ready to delegate more. Who wouldn’t?

Voice in Medicine: Is it just too obvious?

April 30th, 2009

Medical costs are rising at extraordinary rates, and much of that cost is a result of too many highly trained professionals doing mundane clerical tasks: filling in forms, updating charts, writing prescriptions, creating reports … Now, couple that with the “hands and eyes busy” nature of most medical encounters and it seems clear that an intelligent voice-based synthetic agent/assistant would not only be less expensive than live medical personnel but would also lead to faster, more accurate, more legible, correctly filed information.

There is already a movement afoot to create an Electronic Medical Record (EMR) infrastructure across the industry.

This is exactly the kind of base technology that is needed to support synthetic agents. The data entry for an exam becomes almost as simple as the doctor or nurse narrating as they observe and measure: “Raised pink rash on the forearms,” “temperature is 98.9,” “complaining about pain and stiffness in the left elbow,” etc. As a patient, I would like to know what is going into my record, and I would like to stress any points I think the doctor missed.
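
To make that concrete, here is a minimal sketch (not any real EMR interface; the field names are invented) of how a narrated observation might become a structured, reviewable record:

    import re
    from datetime import datetime

    def note_from_narration(text: str, patient_id: str) -> dict:
        """Very naive parse of a clinician's spoken observation into a chart entry."""
        record = {
            "patient": patient_id,
            "timestamp": datetime.now().isoformat(timespec="seconds"),
            "narration": text,  # always keep the original wording for review
        }
        temp = re.search(r"temperature is ([\d.]+)", text, re.IGNORECASE)
        if temp:
            record["temperature_f"] = float(temp.group(1))
        return record

    print(note_from_narration("Temperature is 98.9, raised pink rash on the forearms", "jones-001"))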

This is only the beginning of what could be done, and it can be done without relying on any new magical technology. For fun, out of an immense pool of possible scenarios, let’s imagine one:

Mr. Jones [the patient]: “… like I was saying, I am having trouble when I straighten this arm. I have to do it slowly or else it hurts.”

Dr. [to patient]: “Let’s see that arm. [manipulating arm] Does this hurt?”

Mr. Jones: “Yes, right there.”

Dr.: “Can you straighten it for me?”

Mr. Jones [straightening arm and wincing]: “Yes, but it’s harder to move here.”

Dr. [still looking at Mr. Jones’ arm]: “Cassandra, note that the patient is complaining of pain and stiffness in the left elbow.”

[Cassandra is the synthetic agent that the Dr. speaks and listens to through an unobtrusive “on the ear” headset. She knew that the Dr. was speaking to her because she listens all the time and knows when and what it means to “take a note.”]

Cassandra [speaking into the Dr.’s ear]: “Note: complaining of pain and stiffness in the left elbow.”

Cassandra [interjecting]: “History: Patient mentioned a tightness of his left elbow last year about this time.”

Dr. [all while looking at Mr. Jones]: “Is there some sport or activity you start in the spring?”

Mr. Jones [thinking]: “Well, I started my golfing…”
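
Even the “take a note” behavior in this scenario is mostly bookkeeping. Here is a toy sketch (hypothetical, not ejTalker or any real product) of the trigger logic: dictation addressed to the agent goes to the chart and is read back; everything else is ignored as ordinary exam-room talk.

    def handle_utterance(speaker: str, text: str, chart: list[str]) -> str | None:
        """Toy 'note' trigger: only the doctor's 'Cassandra, note that ...' is acted on."""
        prefix = "cassandra, note that"
        if speaker == "doctor" and text.lower().startswith(prefix):
            note = text[len(prefix):].strip(" .")
            chart.append(note)                 # file it in the record
            return f"Note: {note}."            # read back into the doctor's earpiece
        return None                            # not addressed to the agent; stay quiet

    chart: list[str] = []
    print(handle_utterance(
        "doctor",
        "Cassandra, note that the patient is complaining of pain and stiffness in the left elbow.",
        chart))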

Nothing in this scenario is far-fetched.

We can create this synthetic agent today.

We should.

People Want To Learn Conversational Skills Too.

February 3rd, 2009

How many of us have ever tried to learn a foreign language? You get the books, you get the tapes, you get the software that listens to you repeat phrases. You think you have the vocabulary and the pronunciation down, and you can daydream some fairly interesting conversations with native speakers of that language. Then, emboldened, you venture out to that country (or that part of town) and you try out your new skill. At this point you find yourself in a position that most voice-based applications find themselves in.

If you’re like me, I expect your first forays into conversation in that new language broke down very rapidly. You were lucky to get past two exchanges before you were in uncharted conversational territory. All manner of unexpected things happened: odd pronunciations, slang, idiomatic phrases, unexpected domain shifts; the list goes on. This, by the way, is why today’s voice applications avoid conversation with the human, and for the most part just lead the human along, asking questions that must be answered in a restricted and controlled format.

But back to the issue: your imagined conversations are never quite like real conversations. What we’re missing in our imagined conversations is any hint of unexpected variation. I’m not talking about crazy non sequiturs or random leaps from one domain to another, but just the normal subtle variations that always happen, even when the topic is as mundane as picking up your dry cleaning or depositing a check with the bank teller. Something as simple as mentioning the weather might get the response “yes, it is cold,” or “boy, it’s freezing,” or “yeah, I needed a hat.”

This might be an excellent job for a synthetic agent. The domains are reasonably narrow and very well-defined; remember, we’re not inventing HAL 9000 here. Next-generation dialog managers, like ejTalker, can automatically inject plausible variability into the phrasing and flow of the conversation. These types of dialog managers also employ automatic behaviors such as conversational ellipsis and confusion metrics that improve the chances of repairing a conversation much like humans do. So even if things go badly from the human’s perspective, the exercise may still succeed, giving some measure of positive feedback. In fact, I think it might be really fun to practice one’s conversational skills with a patient and forgiving tutor. Of course, with present technology, this will not be as good as talking to a real human tutor, but it will be a definite step in the right direction. One I would take.
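
As a small illustration of the “plausible variability” idea (this is just a sketch, not the ejTalker API), even picking among a few equivalent phrasings per intent keeps the learner from memorizing one fixed exchange:

    import random

    # Several equivalent phrasings for the same conversational intent.
    RESPONSES = {
        "weather_cold": ["Yes, it is cold.", "Boy, it's freezing.", "Yeah, I needed a hat today."],
        "greeting": ["Hi, what can I do for you?", "Hello! Picking something up?"],
    }

    def respond(intent: str) -> str:
        """Pick one phrasing at random so no two practice sessions sound identical."""
        return random.choice(RESPONSES.get(intent, ["Sorry, could you say that again?"]))

    print(respond("weather_cold"))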

I wonder why language-teaching facilities, such as ESL (English as a Second Language) schools, have not at least explored this kind of technology. It seems like a natural fit.

Why is talking to someone physically sitting next to you in the car NOT like talking on the phone?

January 31st, 2009

There have been enough studies to show that talking on the phone while driving impairs the driver. In fact, driving while talking on the phone is just about as debilitating as driving drunk. And it doesn’t matter whether you’re holding the phone or using it hands-free; the problem is something inherent in the phone call itself.

Now contrast that with talking to someone in the car. Most of us drivers do that all the time, and I think you’d agree that, especially on longer trips, you actually feel safer having a conversational companion. You feel more attentive and alert. Obviously, I’m ruling out the annoying passenger who is attempting to make you mad or tickle you, as well as attempts to talk with your toddler strapped into a rear-facing safety seat in the back. I’m talking about a socially appropriate, intelligent passenger who is in the front seat and (here is the key) is aware of the second-by-second situations that you, the driver, are dealing with.

Think about it: you’re driving along and something surprising happens. Perhaps a fire truck crosses an intersection against the light, an animal runs onto the road ahead, or all the brake lights on the cars ahead suddenly flash. Your passenger reacts immediately, even stopping in mid-sentence if they were speaking. They immediately enlist their eyes and ears and provide additional information like “there’s another fire truck coming on the right,” “here comes the dog that was chasing that cat,” or “I think it’s icy up there.”

Now contrast that with the phone call. You, the driver, are responsible not only for your driving crisis, but you must also instruct the person on the other end of the phone to “hold on a second.” Considering the phone call from the non-driver’s side, I’m sure many of us have had the experience of talking to someone who was dealing with some sort of minor driving crisis but decided just to “keep talking through it.” You sensed that the conversation lost a little of its focus, and it’s likely that you used normal conversational techniques to try to regain it. You might have said something like “are you saying that…” or “that’s a little vague…” and the result is that you tasked the driver with more cognitive load at the exact instant they needed to apply their attention to the crisis. It’s not your fault; you didn’t know.

Talking on the phone while driving doesn’t make you crash, but it puts you at risk of being robbed of your cognitive skills when you most need them. Most of the time our driving experience is ordinary and uneventful; that’s probably why most drunk drivers make it to their destination without catastrophe.

Sadly, most of the voice applications designed for the car today are just as bad, or even worse, when it comes to cognitive load on the driver. They have no awareness of the driver’s situation. Even sadder, the designers of these applications opt for an interface of a few specific commands and rigid formalisms. Understandably, they do this to overcome weaknesses in speech recognition performance, but they also (and erroneously) believe that this small, tight structure reduces the cognitive load on the driver. I know how I feel using such systems that force me to concentrate on “getting it right.” I have watched people involved with the application development try to demonstrate it: they are sitting in a car parked on the carpet at an auto show, and I watch their eyes glaze over when it doesn’t work and they stammer, “I don’t remember the exact command.” I think about them careening down the highway in two tons of steel in that state of mind.

And here we come to Cassandra, your SA (Synthetic Agent) in the car. First of all, with small improvements in the microphone input, she is vastly better at recognizing your speech. And since she uses a sophisticated dialog management engine (her brain), she can engage in much more realistic conversations. She can ask for clarification, or suspend an activity in order to do something of higher priority first, because she has a memory. Now, with all the instrumentation already in new cars (with much more to come), Cassandra can know when you “hit the brake” a little more abruptly than usual, turn the steering wheel a little more erratically than expected, or when the automatic traction control system detects that it might be slippery. Maybe Cassandra even uses her own “ears” to recognize emergency vehicle sirens?
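
For illustration only (the signal names and thresholds below are invented, not any carmaker’s API), the core of that situational awareness can be a very plain rule: when the vehicle signals suggest a sudden maneuver or a hazard, hold the conversational turn.

    from dataclasses import dataclass

    @dataclass
    class VehicleState:
        brake_pressure: float        # 0.0 (off) to 1.0 (panic stop)
        steering_rate_deg_s: float   # how quickly the wheel is being turned
        traction_slip: bool          # traction control reporting low grip
        siren_detected: bool         # from the cabin microphones

    def driver_is_busy(state: VehicleState) -> bool:
        """Heuristic: any sign of a sudden maneuver or hazard means stop talking."""
        return (state.brake_pressure > 0.7
                or abs(state.steering_rate_deg_s) > 180
                or state.traction_slip
                or state.siren_detected)

    def agent_turn(state: VehicleState, pending_utterance: str) -> str | None:
        if driver_is_busy(state):
            return None              # hold the thought, like a good passenger
        return pending_utterance     # safe to speak now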

In the next few years cars will have more proximity sensors and they will know what’s in front, behind and alongside them.  Most of us have had the experience of something surprising happening ahead on the road and our passenger said something like “the left lane is clear.”  Cassandra could say that.

I’d like to have Cassandra in my car.

Where might we first develop a relationship with an SA?

January 30th, 2009

ejTalk is all about the long-term relationship issues with conversational Synthetic Agents (SAs). And clearly the kinds of encounters we have today with speech technology on the phone are not likely to be much more than quick inquiries (when is my flight?) or simple commands (move $200 to my checking account).

Where do we:

  • spend large chunks of time, eyes and hands busy, and yearn for a way to make that time productive?
  • already have a nearly anthropomorphic relationship with an embodied technology?
  • forge a relationship lasting for years?
  • care about an object enough to bathe and groom it and take it to the doctor?

Why, of course, I’m talking about the car.

Many of my cars have had names. Many times I have spoken to my car, even though I knew it did not in any sense perceive my speech. In years past I have had cars that literally spoke to me. Definitely a one-sided interaction, but one that I grew to rely on so much that it led to me running out of gas with a subsequent car: The new one didn’t tell me it was hungry!

Now we have the technology to have the car speak and listen, such as systems like Sync where you press a button and say some “atomic” command (with the attendant cognitive load). But with the addition of just a little more (a better microphone array, a brain like ejTalker, an awareness of its environment) it is possible to converse and build a richer working relationship. You and the car will form a common ground. You and the car will automatically (pun intended?) grow to understand each other better over time:

  • Better raw speech recognition
  • Better anticipation of what you want to do
  • Better awareness of your preferences

This car/SA will become a worthy partner. It will remember stuff like:

  • Who you call
  • What phrases you tend to use
  • Things you ask it to remember
  • When you last got an oil change
  • Your supervisor’s birthday
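
A bare-bones sketch of that kind of long-lived memory might be nothing more than a small store that persists between drives (the file name and fields here are purely illustrative):

    import json
    from pathlib import Path

    MEMORY_FILE = Path("driver_memory.json")

    def load_memory() -> dict:
        """Load what the car remembers about its driver, or start fresh."""
        if MEMORY_FILE.exists():
            return json.loads(MEMORY_FILE.read_text())
        return {"frequent_calls": {}, "notes": [], "last_oil_change": None}

    def remember_call(memory: dict, contact: str) -> None:
        """Count who the driver calls so the agent can anticipate the next request."""
        memory["frequent_calls"][contact] = memory["frequent_calls"].get(contact, 0) + 1

    def save_memory(memory: dict) -> None:
        MEMORY_FILE.write_text(json.dumps(memory, indent=2))

    memory = load_memory()
    remember_call(memory, "Pat")                                 # "who you call"
    memory["notes"].append("Supervisor's birthday is in March")  # "things you ask it to remember"
    memory["last_oil_change"] = "2009-01-15"                     # "when you last got an oil change"
    save_memory(memory)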

With all this conversation, won’t it be distracting, like a phone call? A very good question. The short answer is no, and perhaps the opposite! But that is for another entry, coming soon…

Let the games begin

December 6th, 2008

Well here we go!

So this is the kickoff for the ejTalk blog. I’m looking forward to having a lot of fun here and finding a lot of like-minded souls. It’s very likely that if you got to this blog you have a deep and abiding fascination with talking to synthetic life forms. In my talks I usually refer to them as synthetic agents (SAs).

In the fictional accounts of books and movies, these synthetic agents appear with a wide range of capabilities. Some are embodied, such as Robby the Robot, C-3PO, and Rosie the robot maid on The Jetsons. Some are unfettered intellects, such as HAL or the network-spanning sentience imagined by Heinlein in The Moon is a Harsh Mistress. Some of these imagined synthetics are a little more practical. For the most part the computer in the Star Trek series was devoid of much personality. It was generally conversationally efficient and cooperative, although not likely to be much fun at a party.

In the real world we have all experienced the nearly mindless, form-filling voice-based applications. Rarely do we say more than one or two utterances to accomplish some simple action. Even all the cool voice apps you read about in the press are no more than one or two utterances long, and they lead you to some atomic goal such as a phone number. We are beginning to see some more ambitious applications, for instance technical troubleshooting. Quite possibly, if you’ve called your cable company recently with a technical problem, you have gone through a series of diagnostic questions from a synthetic agent with the goal of discovering simple problems, or at the very least making a more focused statement of your actual problem.

Granted, these conversations are still rudimentary, but they do address some of the issues that come up in real conversation: pauses, filler words, conversational ellipsis, restatements, etc. One thing that all of these current applications have in common is that the creator has to micromanage all of the potential twists and turns.
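
To make that micromanagement concrete, here is a toy, hand-authored troubleshooting flow of the kind a designer has to script today (purely illustrative, not any vendor’s product). Every branch the caller might take has to be spelled out in advance, and anything off the scripted path falls through to a human.

    # Each state names its prompt and the next state for each allowed answer.
    FLOW = {
        "start":        {"prompt": "Is the cable box powered on?",
                         "yes": "check_signal", "no": "power_on"},
        "power_on":     {"prompt": "Please plug in the box, then say yes when it is on.",
                         "yes": "check_signal", "no": "transfer"},
        "check_signal": {"prompt": "Do you see a picture now?",
                         "yes": "done", "no": "transfer"},
        "transfer":     {"prompt": "Let me connect you to a technician."},
        "done":         {"prompt": "Great, glad that fixed it."},
    }

    def next_state(state: str, answer: str) -> str:
        """Anything outside the scripted yes/no path simply has nowhere else to go."""
        return FLOW[state].get(answer, "transfer")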

The next phase of conversational synthetic agents requires a fundamental change in development methodology. What to do and how to do it is going to be interesting.

There is a lot to talk about!