Self-taught AI, not as complex as you think: The anatomy of ChatGPT

I remember my first ever prompt was something very basic like ‘explain what causes food to be spicy as if you were Morpheus from the Matrix’ – I was really into spicy food at the time and Morpheus is pretty damn cool – and I was totally blown away by the accuracy and depth of the response.

After using Amazon’s Alexa for several years, and finding its responses somewhat lacking and very boilerplate, I began to see the concept of a Jarvis-like AI from Iron Man as a tangible reality; the sad nerd that I am.

Since then, I’ve explored a range of the new AI tools, evaluating their utility in marketing contexts and personal scenarios (I even used AI to draft my wife’s birthing plan, earning some serious #supportivehusband points). But my curiosity didn’t stop at surface-level applications. I’ve delved into how AI works, keen to uncover what lies under the hood: how it operates, what it can do for you, how to harness its capabilities, and the impact it could have on our lives, for better or worse.

Unsurprisingly, there’s a vast expanse to cover when it comes to AI, far more than a single blog post can encapsulate. Thus, this is the inaugural post of a series I’m embarking on. Consider this a go-to guide to AI. This isn’t about pitting man against machine. Instead, it represents a quest to sift through the buzz and the sensationalism surrounding AI, striving for a grounded perspective. All from the viewpoint of someone who couldn’t tell coding from crochet. 

For now, let’s start right at the beginning. How does AI actually work? In this blog, I’ll guide you through:

Why I’m learning more about AI

My fascination with AI comes from its omnipresence in our lives, from the subtle algorithms curating our social feeds, to voice assistants like Siri and Alexa, to the most sophisticated systems predicting global trends. My search for knowledge here isn’t just to grasp the technicalities but to understand the broader picture: how AI will redefine our jobs, enhance our daily experiences, and challenge our ethical boundaries.

AI is inevitably going to bring about a seismic shift in the job market, arguably eclipsing the transformations brought about by the PC and the internet. Many people have predictions for where AI could take us: Mo Gawdat shares his concerns on the dangers of AI with Steven Bartlett, and Rob Toews from Forbes talks about where AI will be in 2030. But I don’t think anyone really knows what the world will look like five years from now. Even six months from now is a stretch given the rapid development.

I do have some small predictions for the end of the year if you’re interested:

Beyond some assumptions on practical applications and outcomes, we can’t predict where AI will be in terms of its power and capability, but we can do things to keep up to speed with development. My ethos is that it’s better to lean into it and to remain agile as we navigate and adapt to these unprecedented changes. There were a lot of naysayers when the internet came out (watch this interview from 1995 where David Letterman and the audience mock Bill Gates and his view on the internet) and look how that turned out. 

Spotlight on GPT

The release of GPT-3 was, in my opinion, the watershed moment for businesses where they really started to see the practical use cases for Gen AI across their workforce. There’s a reason there’s been a surge of Gen AI tools being released by the big players – Google with Gemini, Microsoft (who back OpenAI) with Copilot,  Meta with Llama, and X with Grok – and that’s because they know the potential and they want to get their naughty little fingers in the pie of AI’s rapidly expanding market value. That’s not to say they weren’t developing these tools beforehand, but the spotlight on GPT-3 certainly sped up their timelines. What OpenAI did for the Gen AI market isn’t too dissimilar to what Tesla did to the Electric Vehicle market.

For the purpose of this blog and my exploration of AI, Generative Pre-trained Transformer (GPT) emerges as my primary use case, as this was the first significant mover in the Gen AI space and the tool I’ve engaged with most extensively. 

The coding behind AI

At its core, the magic of AI lies in its coding. Programming languages like Python serve as the foundation, allowing developers to build the algorithms that guide AI’s learning process. Chief among these are neural networks, which are loosely inspired by the way the brain learns from experience. Earlier language models relied on Recurrent Neural Networks (RNNs), which read text one word at a time and carry a memory of what came before, much like the brain storing and recalling past experiences to make sense of a sequence. GPT uses a newer design, the Transformer (the ‘T’ in GPT), which can weigh up all the words in a passage at once. These algorithms dictate how AI interprets data, learns from it, and applies its acquired knowledge to make informed decisions or generate nuanced responses.

Training AI: A simplified analogy

GPT’s learning journey combines self-supervised and supervised methods. Self-supervised learning comes first: you feed the model so much text that it can teach itself, repeatedly predicting the next word in a passage and checking its guess against the word that was actually there. Supervised learning follows, where humans review outputs and guide the model to do better, praising good responses and redirecting bad ones.

Bit of a stretch, but it’s not too dissimilar to the way one might train a puppy, with rewards for good behaviour and corrections for errors. 

Through extensive training on diverse datasets and this blend of learning techniques, GPT learns to recognise patterns and make decisions, fine-tuning its ability to generate precise responses to natural language prompts. 
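To make the ‘learning patterns from text’ idea a bit more concrete, here’s a toy sketch in Python. It is not how GPT works internally (GPT uses a huge Transformer neural network, not a word-count table), and the function names and the tiny training text are made up for illustration. What it does show is the self-supervised trick: the text supplies its own answers, because the ‘correct’ next word is simply whatever word came next.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following

def predict_next(model, word):
    """Return the word seen most often after `word`, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Hypothetical training text, invented for this example.
corpus = "the dog chased the ball . the dog ate the bone . the dog slept ."
model = train_bigram_model(corpus)

print(predict_next(model, "the"))      # prints: dog
print(predict_next(model, "unicorn"))  # prints: None (never seen in training)
```

Nobody labelled anything by hand here. The model only counted what followed what, which is the same basic idea scaled down by many orders of magnitude.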

Creating your own AI

If we boil it down to basics, the steps to crafting an AI might look something like this:

Boom! You, my friend, just created AI.

And here’s the kicker: regardless of the AI application—be it text, image, video, music, or anything else—they all come to life following these foundational steps.

GPT-3: The one that really got people talking

Believe it or not, the original GPT model was introduced in 2018, but most of us, myself included, were blissfully unaware of this disruptor lurking in the shadows. I’ll skip over the earlier models and move straight to GPT-3, first released in 2020 and the model family behind ChatGPT, which really got people talking after its launch in late 2022.

This model’s dataset, which includes a wide swath of the web via Common Crawl, internet text from WebText2, and a vast collection of digital books from Books2, underscores the scale of GPT-3’s operations. Most sources quote a figure of around 45 terabytes of text data, although that describes the raw web text collected before it was filtered down for training.

I did some rough maths on this* and worked out that it would take the average person 71,298 years of non-stop reading to get through this amount of information.

GPT-3 is then guided by 175 billion parameters** to write its responses. 

When you send it a prompt, it takes the prompt and generates what it believes is the best possible resolution to the sequence, based on that 45 terabytes of data and its 175 billion parameters. It’s pretty insane!

*45 terabytes is 45,000,000,000,000 bytes. One byte represents one character, so 1 kB is 1,000 characters, and if we say the average word is made up of 5 letters plus a space, that’s 167 words per kilobyte. Across 45 terabytes, that’s around 7.5 trillion words of information, knowledge, and storytelling that the model has analysed. If we take that another step further: at an average reading speed of 200 words per minute, that would take someone 71,298 years of non-stop reading.
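If you want to check the rough maths yourself, here it is as a few lines of Python. The assumptions are the ones in the footnote (one byte per character, 200 words per minute), plus two I’ve made explicit: a word takes 6 bytes once you count the space after it, and a year is 365.25 days.

```python
BYTES_OF_TEXT = 45 * 10**12          # 45 terabytes, one byte per character
BYTES_PER_WORD = 6                   # assumption: 5 letters plus a space
WORDS_PER_MINUTE = 200               # assumed average reading speed
MINUTES_PER_YEAR = 60 * 24 * 365.25  # non-stop reading, no sleep

total_words = BYTES_OF_TEXT / BYTES_PER_WORD
reading_years = total_words / WORDS_PER_MINUTE / MINUTES_PER_YEAR

print(f"{total_words:,.0f} words")    # prints: 7,500,000,000,000 words
print(f"{reading_years:,.0f} years")  # prints: 71,298 years
```

Change any of the assumptions and the answer moves, but it stays in the tens of thousands of years.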

**Parameters in AI can be likened to adjusting the settings on a DJ deck, where each knob and slider fine-tunes how the AI “listens” and “speaks” in human language. Just as a DJ manipulates these controls to perfect the sound for their audience, tweaking AI parameters adjusts its ability to process and generate language.
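One thing the DJ analogy hides is that nobody turns these knobs by hand: training adjusts them automatically. Here’s a deliberately tiny Python sketch with a single parameter instead of 175 billion. The example data, the starting value and the learning rate are all invented for illustration, and real models use far more sophisticated versions of this loop.

```python
# Made-up training examples that follow the rule: output = 3 * input.
# The "right" knob setting is therefore 3, but the code is never told that.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

weight = 0.0          # the single parameter, starting at an arbitrary value
learning_rate = 0.05  # how far to turn the knob after each mistake

for _ in range(200):
    for x, target in examples:
        prediction = weight * x
        error = prediction - target
        # Nudge the weight in the direction that shrinks the error.
        weight -= learning_rate * error * x

print(round(weight, 3))  # prints: 3.0
```

Each pass makes the guess a little less wrong, until the knob settles on the value that fits the examples.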

GPT-4: It’s still only just getting started

Building on the foundation laid by its predecessors, GPT-4 further refines these capabilities. Although specific details about GPT-4’s training data remain under wraps, it’s plausible to assume it processed an even larger lake of text data than GPT-3, with even more parameters built into the model.

Even then, though, it’s still only trained on a minuscule portion of all the information out there. It’s been estimated that there are 175 zettabytes of data in the digital world. Let’s take a vast portion of this out as the ‘unsavoury’ side of the internet and, for argument’s sake, say there’s 50 zettabytes of useful information. The 45 terabytes of information GPT-3 was built with is only around 0.00000009% of that. Even if GPT-4 was trained on 1,000 times more data, that’s still a minuscule fraction.
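The proportion is easy to sanity-check in Python. The 50 zettabyte figure is the for-argument’s-sake number from above, not a measured one, and I’m using decimal units (a terabyte is 10^12 bytes and a zettabyte is 10^21 bytes).

```python
TERABYTE = 10**12
ZETTABYTE = 10**21

gpt3_text_bytes = 45 * TERABYTE
useful_bytes = 50 * ZETTABYTE  # assumed figure, for argument's sake

share_percent = gpt3_text_bytes / useful_bytes * 100

print(f"{share_percent:.8f}%")         # prints: 0.00000009%
print(f"{share_percent * 1000:.5f}%")  # prints: 0.00009% (1,000 times more data)
```

So even a thousand-fold jump in training data would still leave the model looking at less than a ten-thousandth of one percent of that pile.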

We’re not even close to AI that can apply real-time information. The reality is we’re still in the baby-steps phase of what AI could become.

AI’s exponential growth and technological limitations

In my view, there’s a significant journey ahead for AI. The limitations we face aren’t solely from data restrictions due to copyright and privacy concerns but also stem from the computational horsepower needed to fuel these models. Picture a future where AI can sift through the entirety of the internet, engaging in both supervised and self-supervised learning continuously, all the while digesting real-time information influx from the web.

Currently, our technological infrastructure for AI, primarily powered by GPUs originally designed for gaming, and the global shortage of semiconductors both limit AI’s growth. However, the advent of technology specifically designed for AI, such as Language Processing Units (LPUs), promises a future where AI’s capabilities could expand even more.

Imagine what will happen when we can get an AI to program an AI, creating an AI that’s 1,000 times more powerful than its predecessor, then that AI creating another AI that’s 10,000 times more powerful than that. 

At some point, AI will be able to perform tasks autonomously. It will find issues to fix, problems to solve – things we might not even have thought of ourselves.

You thought it was rapid growth so far, just you wait. It’s still early days and it’s operating with a metaphorical arm tied behind its back.

Conclusion

Right, that’s all from me this time. Hopefully there’s something in here that you’re walking away with. Next time, I’ll delve deeper into the practical applications of AI and how to write a good prompt, focusing primarily on marketing and sales. However, I’ll also highlight some compelling use cases from various other sectors to provide a broader perspective.
