Discussion about this post

Peter O'Connor:

The Bob-AI takes are generally insightful, but I submit that they would be more insightful with some minimal knowledge of concepts like vectors.

A vector is just a list of numbers, like [1.5, -2, 0.5]. So Hinton is saying a "thought" is just represented by a list of numbers in the model. You can add or subtract two vectors of the same length, element by element: [1.5, -2, 0.5] - [1, 0.5, 0] = [0.5, -2.5, 0.5].
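
A minimal sketch of that arithmetic in Python (NumPy is my choice here, not something the comment specifies):

    import numpy as np

    # Two vectors ("lists of numbers") of the same length.
    a = np.array([1.5, -2.0, 0.5])
    b = np.array([1.0, 0.5, 0.0])

    # Subtraction works element by element.
    print(a - b)  # [ 0.5 -2.5  0.5]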

The reason that matters is that one of the first indicators that something freaky was going on in these language models, way back when people started training them (which Michal Kosinski alluded to in an earlier podcast), was this:

You train these models to predict the next word, and in the process they learn an internal vector representation for every word (they turn each word into a list of 1000 numbers, and this mapping from word to vector evolves as they learn). Then, after training, researchers looked at these vectors and asked: what happens if you take [vector for king] - [vector for man] + [vector for woman]? Guess what: the answer is really close to [vector for queen]. Same goes for London - England + France ≈ Paris. So these things have learned analogies, even though all they were trained to do was predict the next word. Once you realize that these models learn to structure knowledge like this even though they're not explicitly trained to, you start thinking "ok, maybe these are not just stochastic parrots after all."
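
A minimal sketch of that analogy test in Python, assuming gensim and its downloadable pretrained GloVe vectors (the comment doesn't name a library or embedding set, so treat this as illustrative):

    import gensim.downloader as api

    # Load pretrained GloVe word vectors (downloaded on first use).
    vectors = api.load("glove-wiki-gigaword-100")

    # most_similar computes king - man + woman and returns the nearest words.
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
    # 'queen' typically comes out on top.

    # Same idea for london - england + france (GloVe tokens are lowercase).
    print(vectors.most_similar(positive=["london", "france"], negative=["england"], topn=3))
    # 'paris' typically comes out on top.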

Mark Jensen:

Good piece. François Chollet's book Deep Learning with Python, now in its 2nd edition (Manning, 2021), has some accessible passages that help clarify the 'vector' notion you mention.

