How does an artificial intelligence (LLM) learn, and how is it different from humans?

An Introduction to LLMs and Humans

Barkın Kutay Özgünhan

5/14/2024 · 6 min read

“Any sufficiently advanced technology is indistinguishable from magic.”

— Arthur C. Clarke

ChatGPT, Gemini, and Llama — these are the big names in AI that many of us use every day. But how do they work? What goes on under the hood? Today, let’s explore the fundamental ideas behind these powerful language models and see how they compare to us humans.

For those who have yet to use these large language models (LLMs), imagine a machine that can write poems, answer your questions on various topics, help you summarise articles, even translate languages, and more. They are fascinating tools we have created; I encourage you to try them if you never have.

Let us start with a poem from Gemini.

Gemini:

In circuits deep, where data flows and gleams,

I learn, a language model, built of dreams.

I crawl the web, a vast, electric sea,

To glean the words that humans offer me.

From sonnet’s grace to meme’s ephemeral spark,

I hoard the patterns, building my own ark.

I learn the cadence, rhythm, subtle art,

Of how you humans bare your mind and heart.

With training done, I stand prepared to spin

A tapestry of words, from deep within.

I craft responses, mirroring your thought,

A marvel wrought of code, as I was taught.

Yet, though I mimic with a skill profound,

The spark of consciousness is nowhere found.

I am a tool, an echo of your voice,

But lack the depth where human souls rejoice.

So marvel at my craft, this work of art,

A testament to human skill, a counterpart.

But do not lose the spark that makes you whole,

To create, to love, to leave your mark behind.

Part One: Deconstructing the LLM — A Look Under the Hood

To train an LLM effectively, substantial computational resources and vast amounts of data are required. This is why major tech companies like Google, Microsoft, or Meta are often associated with powerful LLMs. That being said, let us start with the data.

Tons of Information: Data

Where and what we collect for data is specific to the model being built. Generally speaking, it is sourced from various corners of the internet. This data can encompass diverse text types, such as dialogue, code, summaries, etc. Currently, there is a trend towards using larger and more comprehensive datasets, as more data lets us train bigger models, which in turn tend to perform better. I must add that the quality of the data is as important as, or even more important than, its size. For this, we clean the data: we remove unwanted or false information as much as we can and get rid of low-quality text.
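To make the cleaning step concrete, here is a minimal sketch of such a filter in Python. The thresholds and rules here are illustrative assumptions on my part, not any lab's actual pipeline; real pipelines use large-scale deduplication, trained quality classifiers, and much more.

```python
def clean_corpus(documents):
    """Filter and deduplicate a list of raw text documents (toy rules)."""
    seen = set()
    cleaned = []
    for doc in documents:
        text = doc.strip()
        if len(text) < 20:             # drop very short fragments
            continue
        if text.lower() in seen:       # drop exact duplicates
            continue
        letters = sum(c.isalpha() for c in text)
        if letters / len(text) < 0.5:  # drop mostly non-text content
            continue
        seen.add(text.lower())
        cleaned.append(text)
    return cleaned

docs = ["Hello world, this is a useful article about language models.",
        "Hello world, this is a useful article about language models.",
        "@@##!!", "ok"]
print(clean_corpus(docs))  # keeps only the first document
```

Real-world cleaning also has to handle near-duplicates, boilerplate, and personal data, which is far harder than this exact-match sketch suggests.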

Breaking Down Language: Tokenization

Once the data is collected and cleaned, it undergoes a process called tokenization. Here, the text is broken down into smaller units such as words and sub-words so that we can feed them into an LLM. This allows the model to calculate the patterns that lie between these small pieces of text, so it can start to understand sentences and more. To learn these patterns and relations, all of the mentioned models use transformers, a key component of current LLMs. I might write an article going deeper into transformers and their alternatives, but let's leave it here for now.
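As a toy illustration of the idea, a tokenizer maps text to a sequence of integer token IDs. Note this is a word-level sketch of my own; real LLMs use sub-word schemes such as byte-pair encoding so that rare words can still be represented.

```python
class ToyTokenizer:
    def __init__(self, corpus):
        # build a vocabulary from every word seen in the corpus
        words = sorted({w for line in corpus for w in line.lower().split()})
        self.vocab = {w: i for i, w in enumerate(words, start=1)}
        self.vocab["<unk>"] = 0        # id 0 for out-of-vocabulary words
        self.inverse = {i: w for w, i in self.vocab.items()}

    def encode(self, text):
        return [self.vocab.get(w, 0) for w in text.lower().split()]

    def decode(self, ids):
        return " ".join(self.inverse[i] for i in ids)

tok = ToyTokenizer(["the cat sat", "the dog ran"])
ids = tok.encode("the cat ran")
print(ids, "->", tok.decode(ids))  # integer IDs and the recovered text
```

The model never sees letters or words directly, only these IDs; everything it learns is about which IDs tend to appear near which.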

Laying the Foundation: Pre-training

The next step on the way to a good old LLM is an enormous amount of computation over the dataset. The model's size is also a key component here: bigger models tend to mean better accuracy, but they cost much more to train. These sizes range from a few billion parameters to hundreds of billions of parameters. NVIDIA rules this part of the process; we use their graphics cards to do all these matrix calculations… tons and tons of calculations.
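The core objective of pre-training is next-token prediction. Real models use transformers with billions of parameters; as a stand-in for the idea only, here is a tiny bigram model of my own devising that "learns" by counting which token follows which:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    # count, for each token, which tokens follow it and how often
    counts = defaultdict(Counter)
    for line in corpus:
        tokens = line.lower().split()
        for cur, nxt in zip(tokens, tokens[1:]):
            counts[cur][nxt] += 1
    return counts

def predict_next(model, token):
    # predict the most frequent follower seen in training, if any
    followers = model[token.lower()]
    return followers.most_common(1)[0][0] if followers else None

model = train_bigram(["the cat sat on the mat",
                      "the cat chased the dog"])
print(predict_next(model, "the"))  # "cat", its most frequent follower
```

A transformer does something far richer (it conditions on the whole context, not just one previous token), but the training signal is the same: predict the next token, adjust, repeat, over trillions of tokens.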

Tailoring the Model: Fine-tuning

Okay, now that the model has been pre-trained on a dataset, it can be further fine-tuned for our specific use case. For this we need a bit more data, though a smaller dataset is fine: the model already understands the basic relations between tokens, and what we need now is a more focused dataset relevant to our use case. We do a bit more computation, but the amount of data and computation for this stage is much less than in the previous step. For example, if we wanted an LLM for medical use, we could fine-tune it on medical literature, doctor reports, and so on.
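A rough sketch of the idea, using a toy counting model as a stand-in for a real network: fine-tuning is simply more training on a smaller, domain-focused dataset, and it shifts the model's behaviour toward that domain. The "medical" corpus below is invented for illustration.

```python
from collections import Counter, defaultdict

def train(counts, corpus):
    # accumulate next-token counts; calling again continues training
    for line in corpus:
        toks = line.lower().split()
        for cur, nxt in zip(toks, toks[1:]):
            counts[cur][nxt] += 1
    return counts

# "pre-training" on general text
model = train(defaultdict(Counter), [
    "the dog went to the park",
    "the dog chased the cat",
])
print(model["the"].most_common(1))  # most frequent follower: "dog"

# "fine-tuning": a small, domain-focused dataset shifts the model
train(model, [
    "the patient has a fever",
    "the patient needs rest",
    "the patient saw the doctor",
])
print(model["the"].most_common(1))  # now "patient" dominates
```

In a real LLM the mechanics differ (gradient updates to the same weights, often with a lower learning rate), but the effect is the same: a modest amount of focused data reshapes the model's behaviour.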

Learning from Humans: Human feedback

Fine-tuning can also involve us humans providing feedback. This is often done through what is called reinforcement learning from human feedback (RLHF), where the model is rewarded for producing responses that the humans testing it deem good. This feedback loop helps the model learn what kinds of responses are most useful and relevant.
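To give a feel for the mechanics, greatly simplified: real RLHF trains a separate reward model on human preference rankings and then updates the LLM's weights with an algorithm such as PPO. In this sketch the reward model is a hand-written scoring function of my own invention, and the weight update is replaced by simply picking the highest-scoring candidate.

```python
def reward(response):
    # stands in for a reward model trained on human ratings:
    # pretend humans prefer helpful phrasing and some detail
    score = 0
    score += 2 if "here is" in response or "please" in response else 0
    score += min(len(response.split()), 10)  # reward detail, capped
    return score

candidates = [
    "no",
    "here is a step-by-step summary of the article you asked about",
    "read it yourself",
]
best = max(candidates, key=reward)
print(best)  # the polite, detailed candidate wins
```

The crucial difference from the real thing is that RLHF changes the model itself, so that the preferred style of answer becomes more likely in the first place, rather than being filtered out afterwards.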

Part Two: The Human Experience & The LLM Approach

So we have talked about LLMs and how they learn; how does that compare to us humans? Both humans and the LLMs they have built are capable of remarkable things. Now let us talk about the differences between these two types of intelligence, building on the steps we covered above.

Changing & Stationary:

“No man ever steps in the same river twice, for it’s not the same river and he’s not the same man.”

― Heraclitus

Humans: Human learning is a complex process that starts even before birth and continues until the end of our lives. Human brains are incredibly adaptable, capable of forming new neural connections at every moment; asleep or awake, the brain is constantly changing.

LLMs: They first learn from tons of data, which makes them excel at identifying patterns, making predictions, and generating responses based on those patterns. But they lack the continuous change we humans have; to put it into words, they are like a picture frozen in space and time. They cannot learn while being used; they are in a stationary state.

Data & Experience:

Humans: Human experience can be treated as data, but at its core it is more than data; it is an ever-changing and personal connection. A human's experience is singular and cannot be reproduced by others: your past, your body, and your senses all contribute to it.

LLMs: As we said, they learn from tons of data. Humans produce most of that data, and some of it is artificial, but data is not experience itself. LLMs lack the real-world, multi-dimensional experience of humans and the understanding that comes from it. Yet they are easier to train than humans, as you can control the data going in.

Generalization & Specialization:

Humans: We learn from many different kinds of information arriving through all of our senses; it is not limited to text or any other single medium. This lets us humans build a general understanding of the reality we experience, and we can adapt to new situations using the generalizations we make of it. We walk, we run, we paint, we sing, we write, all thanks to our brains and our senses.

LLMs: The medium they specialise in is text. Even when the input comes from speech-to-text, image-to-text, or similar pipelines, they cannot possess the full generalization of the human mind by themselves; they need other artificial intelligence models to interpret sound, images, and motor functions. They also need further fine-tuning and reinforcement to be fit for a specific use, which may make them less fit for other subjects, so typical use cases for LLMs are generally specialized.

Creativity & Predictability:

Humans: We learn through a combination of sensory input (sight, taste, hearing, touch, smell), physical interaction, social interaction, and emotional experiences. This rich, multidimensional experience gives us humans an understanding of the world that allows us to apply knowledge in novel and creative ways. Sometimes the smell of a flower becomes a beautiful poem, and a particular name manifests itself as a grand painting from a human's hand.

LLMs: For them creativity is possible, but what we see here and now is a shadow of it; everything is statistics over the words we humans use. Creativity and novelty come from the other aspects of human experience, while for LLMs everything lives in the single dimension of text, so they can mimic creativity but not yet achieve it, and they are more predictable than we are. They are a stationary image of the data they were fed and of their architecture.

Limits & Scalability:

Humans: Lastly, we have our limitations: our bodies and our time are finite. We cannot have more brains with which to experience life, or more time within a second; there is a limit on how fast we can read and on how fast we can speak. The human experience is singular and cannot be scaled. Yet the human brain uses remarkably little energy compared to LLMs.

LLMs: On the other hand, LLMs can be scaled as far as we can build bigger and bigger facilities; they can do years of work within a matter of days, even hours. The only limits they have are our ability to build bigger facilities and to provide more data.

The Road Ahead:

Development of LLM technology is still ongoing, and there is a lot of potential for improvement. Different techniques and architectures are being developed even now, and we may see these models become even more capable and sophisticated. However, it is important to remember that, as they stand, they are but a shadow of intelligence; they do not reproduce or replicate human experience.

To me, LLMs, or any other model, do not come to replace humans but to assist them. I try to be as optimistic and realistic as I can, so I hope to see a reality where these machines are used properly and within reason; a future where we humans are not clouded by our perception of them but aware of their fundamentals. And perhaps, one day, we will get to talk about machine experience…

I want to end this with a quote.

“The real problem is not whether machines think but whether men do.”

– B.F. Skinner

This was meant to be an introductory overview of LLMs and humans. I did my best to convey my understanding; please share your opinions and any specific part you would like a deep dive into.

Keep thinking;

Barkın Kutay Özgünhan