What’s behind the interface?
The most recent generation of AI is based on LLMs. Interestingly, ChatGPT combines an LLM with an interaction layer that uses reinforcement learning.
An LLM is a neural network trained with self-supervised learning to predict the next token in a sequence. Even among AI models, LLMs are notoriously difficult to explain. Language models (as distinct from large language models) have existed for decades and can predict the next word or phrase in a sentence. They use simpler techniques than LLMs and have different applications: auto-correct is a common use.
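To make the contrast concrete, here is a minimal sketch of the kind of simple, counting-based next-word prediction that classic language models rely on (a toy bigram model; the corpus and function names are illustrative, not from any real auto-correct system):

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count which word follows which in a toy corpus."""
    counts = defaultdict(Counter)
    words = corpus.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(model, word):
    """Return the most frequently observed follower of `word`, or None."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "cat" (it follows "the" twice, "mat" only once)
```

Unlike an LLM, this model has no notion of meaning or context beyond the single preceding word, which is why such techniques suit narrow tasks like auto-correct rather than open-ended conversation.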
So why has this particular application become so popular, so quickly? Partly because people from non-technical backgrounds can use it for a wide range of tasks. Many professionals, creators and writers have already tried it: they are the very users, customers and citizens who would need to accept AI for it to have a real impact.
But there’s another factor. ChatGPT also adds a layer of reinforcement learning built from human feedback (the humans providing it are known as ‘labelers’ in AI-speak).
To create this, the ‘labelers’ first gave OpenAI examples of what a “good” answer to a prompt would look like. Then they ranked ChatGPT’s outputs for particular prompts from worst to best, and those rankings were used to train a separate ‘reward’ model. Finally, the reward model guided a reinforcement learning exercise that produced a policy: the logic that makes ChatGPT’s user experience (UX) so good.
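The ranking step can be sketched in a few lines. A common approach (and an assumption here, since the article doesn’t specify the exact loss) is a pairwise loss that rewards the model for scoring a labeler-preferred answer above a less-preferred one; the scores below are made-up toy numbers:

```python
import math

def pairwise_ranking_loss(reward_better, reward_worse):
    """Pairwise loss of the kind used to train reward models:
    small when the 'better' answer already scores higher than the 'worse' one."""
    return -math.log(1 / (1 + math.exp(-(reward_better - reward_worse))))

def mean_ranking_loss(rewards_in_ranked_order):
    """Given model scores for answers ranked best-to-worst by labelers,
    average the loss over every (better, worse) pair."""
    losses = [
        pairwise_ranking_loss(rewards_in_ranked_order[i], rewards_in_ranked_order[j])
        for i in range(len(rewards_in_ranked_order))
        for j in range(i + 1, len(rewards_in_ranked_order))
    ]
    return sum(losses) / len(losses)

# Hypothetical scores: one reward model agrees with the labelers' ranking,
# the other has it backwards, so it incurs a larger loss.
agrees = mean_ranking_loss([2.0, 1.0, 0.0])
disagrees = mean_ranking_loss([0.0, 1.0, 2.0])
print(agrees < disagrees)  # True
```

Training pushes the reward model toward the low-loss case, so it learns to score answers the way human labelers would; the reinforcement learning step then tunes ChatGPT to produce answers that reward model rates highly.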