Monday, June 27, 2016

Attempting to understand Bob Dylan (just like all those big firms' Natural Language Processing programs claim they can do)







Suddenly, natural language processing (NLP) is back in the news. (Oddly, this is a term I made up around 1970 because I didn’t like the previous term: computational linguistics.) I should be very happy that a field in which I spent a lot of time is having a resurgence, but I am not. People say they are working on NLP, but they seem to universally misunderstand the problem. To explain the problem, I will discuss the meaning of some Bob Dylan lyrics. (I chose these because IBM chose Bob Dylan to be in their Watson commercials, and Watson summarized his work as “love fades.”)

I have selected a verse from a few of what I consider to be some of his most popular songs:


Blowin' In The Wind (1963)

Yes, and how many times must a man look up
Before he can see the sky?
Yes, and how many ears must one man have
Before he can hear people cry?
Yes, and how many deaths will it take 'til he knows
That too many people have died?


What do those lyrics mean? To me, this is a song about people’s insensitivity to the plight of others. It was written when the Viet Nam War was just beginning, and Civil Rights protestors were getting killed.

What would modern-day natural language programs be able to get out of this verse? That he says “yes” a lot? That some people need more ears?
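As a rough illustration of what "he says yes a lot" amounts to, here is a minimal sketch of surface-level word counting applied to the verse above. It is only an assumption about the simplest kind of analysis a program might do, not a description of how any particular company's system works; `word_counts`, `VERSE`, and `counts` are hypothetical names chosen for this example.

```python
import re
from collections import Counter

# The verse quoted above, copied as plain text.
VERSE = """Yes, and how many times must a man look up
Before he can see the sky?
Yes, and how many ears must one man have
Before he can hear people cry?
Yes, and how many deaths will it take 'til he knows
That too many people have died?"""


def word_counts(text):
    """Lowercase the text, split it into words, and count each word.

    This is pure surface counting: it knows nothing about what
    the words mean or what idea the verse is trying to convey.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


counts = word_counts(VERSE)

# Show the most frequent words in the verse.
print(counts.most_common(5))
```

A count like this can report that "yes" and "many" recur, but nothing in it touches the idea of insensitivity to the plight of others.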

Let’s look at another verse from another song:


A Hard Rain's A-Gonna Fall (1963)

Oh, who did you meet, my blue-eyed son?
Who did you meet, my darling young one?
I met a young child beside a dead pony
I met a white man who walked a black dog
I met a young woman whose body was burning
I met a young girl, she gave me a rainbow
I met one man who was wounded in love
I met another man who was wounded in hatred
And it's a hard, it's a hard, it's a hard, it's a hard
And it's a hard rain's a-gonna fall.

What is this about? To me it seems to be about the hard knocks of life, and it makes the prediction that things will be getting even worse. Current NLP programs would see this as being about people, I assume, and maybe rain. Would any modern NLP program be able to understand the metaphor about hard rain or giving a gift of a rainbow? I doubt it. Yet understanding metaphor is critical to NLP, since metaphor is everywhere. (This food tastes like crap.)

Stanford offers an NLP course (via Coursera). This is what they say about it:

This course covers a broad range of topics in natural language processing, including word and sentence tokenization, text classification and sentiment analysis, spelling correction, information extraction, parsing, meaning extraction, and question answering. We will also introduce the underlying theory from probability, statistics, and machine learning that are crucial for the field, and cover fundamental algorithms like n-gram language modeling, naive bayes and maxent classifiers, sequence models like Hidden Markov Models, probabilistic dependency and constituent parsing, and vector-space models of meaning.

So, using a lot of math you can figure out that a gift of a rainbow is about helping someone appreciate the beauty around them? I guess a Hidden Markov Model would do that for you.

Here are more lyrics from another song:

The Times They Are A-Changin’ (1964)

Come writers and critics
Who prophesize with your pen
And keep your eyes wide
The chance won't come again
And don't speak too soon
For the wheel's still in spin
And there's no tellin' who
That it's namin'
For the loser now
Will be later to win
For the times they are a-changin’.

Was Dylan speaking out against the Viet Nam War here? It seems to me he was asking the media to stop reporting on the war as a wonderful glory for the U.S. and start speaking up about its horrors. How did I figure that out? I read it, thought about it, and recalled its context. Nothing miraculous. (But imagine any of these NLP programs doing that!) To understand, you need to be thinking about what something means. Would your typical modern-day NLP program think this was about prophecy, or losing?


Maggie's Farm (1965)

I ain't gonna work for Maggie's pa no more
No, I ain't gonna work for Maggie's pa no more
Well, he puts his cigar
Out in your face just for kicks
His bedroom window
It is made out of bricks
The National Guard stands around his door
Ah, I ain't gonna work for Maggie's pa no more.

This is a hard one to understand, even for a person. I saw it as a song about dropping out of the system. Here is what Wikipedia says about it:

The song, essentially a protest song against protest folk, represents Dylan's transition from a folk singer who sought authenticity in traditional song-forms and activist politics to an innovative stylist whose self-exploration made him a cultural muse for a generation.

On the other hand, this biographical context provides only one of many lenses through which to interpret the text. While some may see "Maggie's Farm" as a repudiation of the protest-song tradition associated with folk music, it can also (ironically) be seen as itself a deeply political protest song. We are told, for example, that the "National Guard" stands around the farm door, and that Maggie's mother talks of "Man and God and Law." The "farm" that Dylan sings of can in this case easily represent racism, state oppression and capitalist exploitation.

How would Microsoft’s NLP group get their programs to understand this? Here is what they say about themselves:

The Redmond-based Natural Language Processing group is focused on developing efficient algorithms to process texts and to make their information accessible to computer applications. Since text can contain information at many different granularities, from simple word or token-based representations, to rich hierarchical syntactic representations, to high-level logical representations across document collections, the group seeks to work at the right level of analysis for the application concerned.

In other words, since this isn’t a document, it is unlikely that Microsoft could do anything with “Maggie’s Farm” at all. Or, maybe my own ability to process language is off and they would get that the “farm” referred to the state’s exploitation of its own people.

Let’s try another:

Rainy Day Women #12 & 35 (1966)

Well, they'll stone ya when you're trying to be so good
They'll stone ya just a-like they said they would
They'll stone ya when you're tryin' to go home
Then they'll stone ya when you're there all alone
But I would not feel so all alone
Everybody must get stoned.


I have always liked this song because it says two different things at the same time. To me, it says that if you try to do anything at all, someone will always be trying to stop you. It also says drugs are a good way of dealing with all this.

Maybe Google knows how to deal with this kind of thing. Here is what Google says about their NLP work:

Natural Language Processing (NLP) research at Google focuses on algorithms that apply at scale, across languages, and across domains. Our systems are used in numerous ways across Google, impacting user experience in search, mobile, apps, ads, translate and more.

Our work spans the range of traditional NLP tasks, with general-purpose syntax and semantic algorithms underpinning more specialized systems. We are particularly interested in algorithms that scale well and can be run efficiently in a highly distributed environment.

Our syntactic systems predict part-of-speech tags for each word in a given sentence, as well as morphological features such as gender and number. They also label relationships between words, such as subject, object, modification, and others. We focus on efficient algorithms that leverage large amounts of unlabeled data, and recently have incorporated neural net technology.

On the semantic side, we identify entities in free text, label them with types (such as person, location, or organization), cluster mentions of those entities within and across documents (coreference resolution), and resolve the entities to the Knowledge Graph.

Recent work has focused on incorporating multiple sources of knowledge and information to aid with analysis of text, as well as applying frame semantics at the noun phrase, sentence, and document level.

So, they would probably get the second stoned reference, but the idea that people will try to prevent anything you might do for no good reason would be lost on Google.

Finally, one more song to contemplate:

The Boxer (1970)

I'm just a poor boy
Though my story's seldom told
I have squandered my resistance
For a pocketful of mumbles
Such are promises, all lies and jest
Still a man hears what he wants to hear
And disregards the rest.


I have always liked this song a great deal. But I cannot tell you what it is about from looking at these lyrics. Here is the rest of it:

When I left my home and family
I was no more than a boy
In the company of strangers
In the quiet of the railway station
Running scared, laying low
Seeking out the poorer quarters
Where the ragged people go
Looking for the places only they would know.

Asking only workman's wages
I come looking for a job
But I get no offers
Just a come-on from the whores on Seventh Avenue
I do declare
There were times when I was so lonesome
I took some comfort there.

Then I'm laying out my winter clothes
And wishing I was gone, going home
Where the New York City winters aren't bleeding me
Leading me
Going home.

In the clearing stands a boxer
And a fighter by his trade
And he carries the reminders
Of every glove that laid him down
Or cut him till he cried out
In his anger and his shame
"I am leaving, I am leaving"
But the fighter still remains.

Seeing the entire song makes it seem to me like a song about hope. But when you Google it, you find out Dylan was very interested in boxing and that Paul Simon wrote this song as a “dig against Dylan”.

Well, who knows? I don’t really care what these songs mean. But, oddly, I can’t listen to them without taking meaning from them. A song resonates because you get something out of it that stays with you. It may not teach you anything. You may not learn anything from it. But you understand it as best you can nevertheless. To understand means to figure out what words mean in a context and what ideas they are trying to convey. Notice that “ideas” are never mentioned in the write-ups I have quoted above. Google is not trying to figure out what ideas are being expressed, but they do expect humans and computers to “merge” sometime soon (which would mean people were suddenly a lot dumber).

The hype about NLP these days is about Siri or other imitators that haven’t a clue what you just said but can respond with some words that may or may not be relevant to you.

It would be nice if all these research firms with piles of money to spend would work on the real NLP problem, which is figuring out how humans understand what is said to them and then automatically alter their memories accordingly. When we listen to someone talk, we attempt to discern what ideas they are trying to convey and then we grow in some small way from having participated in the conversation. To put this another way, NLP is really about learning and memory, as I said 35 years ago. Too bad that nowadays we only care about selling better ads to people or answering questions about where they can find a restaurant.

The times they are a changing.

Monday, June 20, 2016

I don't care about Odysseus, Mr. Kelly, and neither did Jimmy Cagney

I like old movies. The other day I was watching a Jimmy Cagney movie when my mind went to one of my fixations, education. What is the connection between Cagney and education? Something personal.

I attended Stuyvesant High School, which was (and is) a school for smart, science-oriented kids that one needs to pass a test to get into. I should have liked Stuyvesant, I suppose, but I am sorry to say I didn’t. I was reminded of one of the reasons I didn’t by watching Jimmy Cagney. Jimmy Cagney and I had the same English teacher. (Oh come on, Roger, you are not that old.)

His name was Mr. Kelly and he had taught at Stuyvesant High School all his life. Jimmy Cagney was born in 1899, so let’s assume he went to high school in 1917. I started Stuyvesant in 1962. So Mr. Kelly had to have been there for 45 years, I suppose, and indeed he was. Was Jimmy a science superstar? No. Stuyvesant was a local school for the Lower East Side in New York back in those days. Mr. Kelly used to brag about what he told Jimmy or what Jimmy had said. Jimmy was his most famous student (and his students, of course, included many of New York’s best and brightest over many years).

I remember this about Mr. Kelly, in part, because he tended to say it a lot. What else do I remember about Mr. Kelly’s English class? I remember he used to sit in the back of the room and in a booming voice say “Why did Odysseus…” followed by whatever the action was. When I typed “Why did Odysseus” into Google, these questions came up:

Why did Odysseus leave Ithaca?
Why did Odysseus go to fight in Troy?

Now, as an adult, I have spent a great deal of time in Greece. I have been to Ithaca and Troy. And I can tell you that I simply have no idea why Odysseus did anything or why Mr. Kelly, or more accurately the New York City school system, wanted me to know. And, moreover, I don’t care.

Now, I realize that intellectuals like to claim that knowledge about the Ancient Greeks is important. I am, at least in theory, an intellectual, and I still don’t care.

Now imagine how many of our students care.

Why do we insist on teaching things that kids don’t care about and have no reason to care about?

Is this a very clever way to behave? How do students who don’t care manage to get by? Is their future made more difficult by not caring about such stuff? 

I argue that it is. I got by despite not caring about this kind of thing. Most of the school population does not get by with this attitude and so, although there is no reason to know anything about Odysseus, most kids are punished severely for not knowing because they can’t pass tests and get good grades and get into college. It is time to re-think what we do in high school. Some kids can survive it. Many cannot.

I am sure that someone somewhere now wants to lecture me on what I missed out on and why I should care about Odysseus. But I care about other things, like computers and Artificial Intelligence and how the mind works, none of which were taught at Stuyvesant High School at the time, and I managed to get by just fine.


Can we please let kids choose to learn what interests them?

Monday, June 6, 2016

A little IBM Watson irony

Last year, IBM asked me if they could produce an “art visual” with a quote of mine on it. In light of IBM’s complete disregard of the implications of the quote that they selected with respect to their claims for Watson, I thought it would be fun to show the visual here:



Since the print is kind of small, here is what it says:


"number crunching can only get you so far. Intelligence, artificial or otherwise, requires knowing why things happen, what emotions they stir up, and being able to predict possible consequences of actions"