Monday, July 25, 2016

"John hit Mary" and other AI problems

With articles being written about AI constantly now, I feel it is time to think about the basics. How do you get a computer to understand a simple sentence such as John hit Mary?

When I first started working on computational linguistics (as it was called then), the linguists made clear that they thought this was easy to do on a computer. You just used a syntactic parser and identified the noun phrases and the verb phrases.
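To make concrete what that approach gives you, here is a minimal sketch of my own, not any particular linguist's parser. The lexicon, the single "NP V NP" rule, and the function name `parse` are all hypothetical and only cover this one sentence shape.

```python
# Toy illustration only: a hand-written lexicon and one sentence pattern.
# LEXICON and parse() are hypothetical names, not a real parser's API.
LEXICON = {"John": "NP", "Mary": "NP", "hit": "V"}


def parse(sentence):
    """Return a nested-tuple parse tree for a simple 'NP V NP' sentence."""
    words = sentence.split()
    tags = [LEXICON.get(word) for word in words]
    if tags != ["NP", "V", "NP"]:
        raise ValueError("this toy grammar only covers 'NP V NP' sentences")
    subject, verb, obj = words
    # S -> NP VP, VP -> V NP
    return ("S", ("NP", subject), ("VP", ("V", verb), ("NP", obj)))


print(parse("John hit Mary"))
```

The tree says who is the subject and who is the object. It says nothing about who John and Mary are or what kind of hitting took place.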

I thought this was absurd, just as absurd as the idea that AI is coming tomorrow to eat us all.

To explain why, let’s discuss this sentence. What happens when you hear it? You react in some way. You might think that this was spousal abuse and that the police should be called. But then, I could tell you that John is 5 and Mary is his mother. Then you might wonder whether and how Mary punished him. Or, I could tell you that John is 50 and Mary is his mother. Now you are wondering about the police again and also about what is wrong with John. Or, I could tell you that Mary is 5 and John is her father and you would be wondering about his parenting skills.

Absent any of this information, you make assumptions. I could ask you what Mary was wearing or what color her hair was and you might very well have an answer, or at least a guess. We comprehend through visualization and imagination. No sentence makes full sense out of context and we rarely have the full context, so we imagine it. People are constantly figuring out the parts they don’t know for sure. We make mistakes all the time. That is how human comprehension works. Did I mention that John and Mary were each driving their own cars at the time? Did I mention that John is a blackjack dealer and Mary was playing blackjack? Perhaps John is a baseball pitcher and Mary was the batter. Maybe they are both boxers.

Sentences don’t mean much out of context but two things are true:

1. we never have the full context so understanders make inferences, draw pictures in their minds and attempt to do the best they can

2. computers, in order to do this effectively, would have to have what people have: a model of the world. 
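The two points above can be illustrated with a toy sketch. It is nowhere near a world model, just a hand-written table of a few of the situations mentioned earlier. The function name `interpret`, the context keys, and the age cutoff are all assumptions made up for illustration.

```python
# Toy illustration only: a few hard-coded situations standing in for
# the enormous amount of world knowledge a real understander would need.
def interpret(context):
    """Guess what 'John hit Mary' describes, given whatever facts we have."""
    setting = context.get("setting")
    if setting == "blackjack":
        return "John, the dealer, gave Mary another card"
    if setting == "driving":
        return "John's car collided with Mary's car"
    if setting == "boxing":
        return "John landed a punch during a match"
    john_age = context.get("john_age")
    if john_age is not None and john_age < 10:  # arbitrary cutoff for "small child"
        return "a small child struck an adult"
    # With no context at all we still commit to a default reading,
    # which is exactly the kind of assumption that can be wrong.
    return "one person struck another"


print(interpret({}))
print(interpret({"setting": "blackjack"}))
print(interpret({"john_age": 5}))
```

Every new fact can overturn the previous reading, and the table never ends. That is why a lookup like this cannot replace a model of why people do what they do.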

When you hear about all the AI programs being worked on today, it is safe to assume that they are not even thinking about building complex world models.

For fun, I typed “John hit Mary” into Google. The first thing that comes up is an article on empathy that discusses some of what I have been saying here. The second thing that comes up is an excerpt from a book of mine (Explanation Patterns) which discusses the belief structures underlying the comprehension of such a sentence.  

Since the media and many companies would have us believe that they have solved the natural language problem, let’s consider a real example of “John hit Mary”. This is from the New York Post (July 25, 2016):

Cops smashed their way into Hollywood star Lindsay Lohan’s posh London flat after a furious bust up with her Russian lover.
Police were called in as Lohan, 30, suffered a meltdown on the balcony of her Knightsbridge apartment with boyfriend Egor Tarabasov, 22 – claiming she had been attacked.
Waking up neighbors, the A-lister shouted, “He just strangled me. He almost killed me.”
In footage taken by a neighbor she could be heard begging for help outside of her $4.2 million home at 5 a.m. on Saturday morning.
Shouting her name and address across the street, she screamed, “Please please please. He just strangled me. He almost killed me. Everybody will know. Get out of my house.”
She added, “Do it. I dare you again. You’re f—ing crazy. You sick f—. You need help. It’s my house, get out of my house.”
She was also heard shouting to Egor — who was also near the balcony, “I’m done. I don’t love you anymore. You tried to kill me. You’re a f—ing psycho,” before adding. “We are finished.”
She added, “No Egor, you’ve been strangling me constantly. You can’t strangle a woman constantly and beat the s— out of her and think it’s ok. Everybody saw you touch me. It’s filmed. Get out! Get out.”
Ten minutes later police arrived after receiving reports of a “woman in distress” – forcing their way into the property – only to find it empty after going inside.
They said no crime was committed and no arrests were made.

How would a computer understand this story? It would need a better model of the world than I have. I don't know much about Lindsay Lohan except that she was a popular child actor who now seems to be in trouble quite often.

Reading this story, I wonder what is wrong with her. Why does she make such bad choices? How is a 22-year-old Russian the right man for her?

Then, also, I wonder why no arrests were made. Was she making it all up? And how about “you’ve been strangling me constantly”? Really? Why would someone put up with that? Sometimes, I know, poor women put up with abuse because they have nowhere to go. But isn’t she the rich one?

So, when I read this, I wonder many things about what is wrong with this woman, why her life went so wrong, why someone isn’t helping her, and what parts of this story have been left out.

A computer would need a deep model of why people do what they do, better than the one I have, because I am even having trouble understanding why this is news. (Of course, it is the New York Post.)

When computers can tell me what the real issues are here and enlighten me in some way about the questions I asked, then I will be impressed. Not afraid of this AI, simply happy to know that a computer could explain stuff to me that I don't have a good world model for, and hence don’t understand very well. It is the building of complex world models about why people do what they do that underlies all understanding. The AI being worked on today doesn't even attempt to solve that problem.

Jack Elson said...

Right on. Just the word "hit" has many possible interpretations, and choosing among them requires knowing that John and Mary are humans as opposed to gorillas or boats. Context is everything. We as humans make so many assumptions, e.g. John and Mary are humans, and we immediately know how it feels to be hit, and we tend to make a value judgment before we seek more info. How do we build the world model into AI? Do we start with an infant computer and let it learn or can we start with a more comprehensive model? Do we need to think of the computer as a human or as just a machine? Still a long way to go.