Tuesday, May 3, 2016

AI is nowhere near working; let's think about what people can do that AI can't

When I started working in AI in the 1960s, it wasn’t really one field, just a set of people trying to get computers to do some interesting things that we knew people were capable of doing. These days, unfortunately, AI seems to mean “deep learning,” whatever that is, and stuff IBM talks about that uses the word “cognitive.”

I have recently been thinking about some of the aspects of AI that I did not work on. (I worked on natural language processing, memory, and learning.) I think there are things worth discussing about the other areas of AI that might shed light on what is really going on in today’s so-called AI.

Let’s start with Face Recognition. It is clear that face recognition technology is pretty good. Facebook can tell when your picture has been posted by someone and can add your name to it. I am sure there are all kinds of surveillance technologies that make use of face recognition as well.

But there is an aspect of face recognition that people do naturally and that computers cannot come close to doing today. I don’t mean to be political here, but my best example is recognizing Ted Cruz’s face. I can recognize him, but every time I do, I see a man who is angry, mean, and just a little Satanic looking. Commentators say these things all the time, and I am not trying to comment on that; rather, I am trying to ask the question: what is it that we see when we look at someone and immediately distrust them and are slightly afraid of them?

To put this another way, when you are walking down the street and someone scary walks by, what is it that you see in his face that makes him seem scary? I have been running some experiments about how people react to talking-head videos that we’ve captured of experts telling stories about their expertise. Every time someone looks at one of our videos, they have an instant reaction to the human qualities of the person as well as to the story the person is telling. They like or dislike people in about ten seconds. What is it that they are seeing?

This is an interesting topic, but my point is about AI. AI is nowhere near ready, no matter how well it does at face recognition, to tell us “I find this guy scary” or “untrustworthy” or “he seems to be lying,” even though people do this all the time without conscious thought.

What does this tell us? It tells us that AI has a long way to go before it can do stuff that nearly any human can easily do. Actually, any dog can do this. They too have instant reactions to a person. What are they seeing? This is the AI question. My guess is that Facebook, even with its hundreds of AI people, is not working on this problem and, moreover, doesn’t care. But it is a very important aspect of cognition. (Sorry, IBM, you don’t actually own that word.) Facebook is only working on the “you can count the pixels and pattern match” part of face recognition. When we feel attracted to someone, or we want to avoid them, we are using our innate ability to do a more subtle kind of face recognition.

I have this same problem with speech synthesis and speech recognition. I was riding in my wife’s car the other day and the navigation system she was using told her to turn on “puggah” boulevard. We were in an area we know and both laughed out loud. The street is called PGA Blvd, named with the acronym for the Professional Golfers’ Association. The program had never heard of acronyms, I guess. Later it told us to get on the ramp for W Palm Beach. Now a reader would think I am abbreviating west with the W, but I am not. The navigation system actually said “W.” My reaction was that this device is really stupid. How hard would it be to make an intelligent navigator with respect to speech synthesis? Well, apparently too hard for the company that made this one. (It would also be too hard for it to tell me about a new restaurant that I was passing and might want to check out, which is the kind of AI that I am interested in.) That is AI too, but it is not “deep learning,” so no one is funding it.

Which leads me to what I really wanted to talk about here, speech recognition. Someone said to me the other day that AI has made real strides in speech recognition. I laughed. Now, I realize people talk to Siri and other devices. And sometimes Siri “knows” what you are saying in the sense that it can find a response. As a way of pointing the real AI problem out to my friend, the next thing I said was: “szeretlek nudunuca.” To which he responded, “huh?” I said it again. He said he didn’t understand. I asked if he could tell me one word that I said. He said “no.” I said, “can you even report a part of what I said?” He said “no.” It all sounded like gibberish to him. Of course it did. I was speaking Hungarian. When someone speaks an unfamiliar language you cannot hear where the word breaks are, and you cannot even decipher the sounds. This is because human speech recognition involves having heard everything before and understanding the context to which the spoken words belong.

It is difficult for people to understand a sentence that is out of context. What is a normal response to a completely unexpected sentence? People generally have to be listening for something in order to understand it. Understanding involves guessing about what someone is likely to say. Those guesses are made on the basis of our knowledge of each other and of the possible things we are or might be talking about. To do that right in AI, we need to determine intentions and motivations and we need to have a model of the person we are talking to, including what they know and what their interests are.

The other day someone who I play softball with (who often asks me questions) asked me “what is Zion?” I had to ask him to explain what he was actually asking about. I heard the words, but had no idea what he was trying to find out.  After a bunch of sentences from him I got what the question was about. I didn't have any trouble with the words, but I had absolutely no idea what he wanted to learn from me. Siri and the others would not be able to have that conversation with him because there is no AI there. Apple, Google and the rest don’t care about that. It is the pretense of AI that seems to interest them.


We are very far from a computer that can do the things I’ve just discussed. AI will be hyped as much as IBM’s marketers and others choose to hype it in an effort to make money. As for me, I would prefer that they actually worked on AI instead of trying to convince everyone that AI is already here.
