truck8.ai

How AI Actually Works: A Plain-English Guide

You're already using AI — but do you know what's actually happening under the hood? An honest, practical breakdown for business owners who don't have a technical background.

7 min readTom Mekenkamp

How AI works — and why you should care

Most business owners I talk to are already using AI in some form. ChatGPT for a proposal, Copilot for a summary, or a tool that automatically sorts emails. They can see that it works, but they don't really know how. And that's fine — until you want to build something yourself, or critically evaluate a vendor's pitch.

I've noticed that a lack of understanding makes business owners unnecessarily dependent. They accept explanations they don't really follow, and they don't dare ask why an AI solution is so expensive, or why it keeps giving half-answers. Yet the fundamentals behind how AI works are genuinely not that complicated — you don't need a maths degree.

In this article I'll walk you through what AI is, how a chat model like ChatGPT or Claude actually thinks, and how you can put that 'intelligence' to work inside your own tool. No hype, no buzzwords — just the essentials.

AI isn't one thing — it's a map of nested circles

When someone says 'we're doing something with AI,' that's about as informative as saying 'we're doing something with technology.' Not meaningless, but not concrete either. AI is an umbrella term that covers several sub-fields nested inside one another.

The outermost circle is Artificial Intelligence: anything that makes machines do what we'd call 'intelligent.' Inside that sits Machine Learning — the idea that a system learns patterns from examples rather than a programmer writing every rule by hand. One layer deeper is Deep Learning, which works with large neural networks: many layers of simple computing units that tune themselves against enormous amounts of data.

And then there are two overlapping areas at the core: Large Language Models (LLMs), which generate text, and perception models, which 'read' the world — for example, a transcription model that converts speech to text. ChatGPT, Claude, and similar tools are LLMs. Siri understanding what you dictate is a perception model.

The practical takeaway: when someone sells you an 'AI solution,' always ask which layer they mean. Almost everything that matters today — the chatbots, the summaries, the smart search — lives in those innermost, overlapping circles.

Machine learning: learned, not programmed

The core difference between traditional software and machine learning is simple but fundamental. Traditional software follows rules a human wrote. Think of an accounting package that says: if an invoice exceeds €10,000, send it to the manager. Every exception has to be added by a programmer.

Machine learning flips this. Instead of writing rules, you provide examples: you show the system thousands of invoices together with the right decisions, and the model figures out the patterns itself. You don't tell it how to decide — you show it what the correct decision is, over and over again.

The result is powerful and treacherous at the same time: the quality of the model is entirely dependent on the quality of the data. Garbage in, garbage out — AI amplifies what's already in there, good or bad. If your training data is biased or incomplete, the model learns biased or incomplete behaviour. That's not a bug; it's the architecture.

How a chat model really works: 'spicy autocomplete'

A Large Language Model — the type of AI behind ChatGPT, Claude, and similar tools — is at its core a next-word predictor. More precisely: a next-token predictor, where a token is a chunk of text slightly smaller than a word. During training the model processed billions of sentences and texts, and learned patterns about which words typically follow which other words.

Every time you type something, the model looks at everything already on the page — your question, the context, earlier messages — and calculates which token is most likely to come next. It does this token by token until the answer is complete. Hence the popular description: 'spicy autocomplete.' It's not far from the truth.

This has one important consequence that surprises a lot of people: an LLM doesn't look anything up. It has no database of facts it consults. It generates the most likely continuation of the text. That means it can sound convincing and still be wrong — what's called a hallucination. Not intentional, not lazy: structural, because that's how it works.

Two terms you need to know

  • Context window: the model's short-term memory. Everything it 'sees' — your prompt, the conversation, any documents — must fit inside this window. Whatever falls outside it simply doesn't exist for the model. No context, no memory.
  • Temperature: the creativity dial. Low means predictable and conservative; high means more varied and creative. Same model, different mood — literally a setting you pass in with every call.

The system prompt: how a bare model becomes an assistant

When you open ChatGPT and ask a question, you're not talking to a blank model that just answers. There's a layer on top: a system prompt. That's a hidden instruction sent before your question, telling the model who it is, what it can and can't do, and how it should behave.

Imagine hiring an assistant who speaks excellent English and knows a little about everything. Without any instructions, that assistant does all sorts of things. But if you say: 'You are our company's customer-service rep, you only answer questions about our products, and you are always friendly but brief' — suddenly you have a very specific employee. That's essentially what a system prompt does.

This is also why the same underlying technology can produce such different products. The model is the engine. The system prompt is the steering wheel. You can put a sports-car engine in a truck and still end up with a completely different vehicle — it all depends on how you configure it.

For business owners, this is a practical insight: when you buy an AI tool, always ask how it's configured. What instructions does the model receive? What constraints are built in? That determines the behaviour far more than which underlying model is being used.

Intelligence inside a tool: the difference between looking smart and actually thinking

This is where it gets genuinely interesting for business owners who want to build something themselves. There's a fundamental difference between a tool that looks clever and a tool that actually reasons.

A 'hardcoded' demo looks beautiful: you fill in a form, an analysis appears, everything is neatly formatted. But that analysis is baked into the code. Change the input and the output doesn't change — nothing is calculated, nothing is reasoned. It's a Potemkin village.

A tool with real intelligence works differently. The moment a user enters something, the application sends a live request — via an API — to a language model. It passes the user's input along with the system prompt that tells the model how to respond. The model thinks, generates an answer, and that answer comes back to the tool. Every time, for every user, for every situation.

The flow is simple: user input → API call with system prompt → model generates answer → tool displays result. That's it. A smart tool isn't a smarter form — it's a form that calls a model when you use it.

And you rent that model per use. You pay per amount of text going in and out — tokens, as we mentioned earlier. A small experiment costs you literally a few cents. A mature application called hundreds of times a day costs more, but still a fraction of a human employee doing the same work.

Chaining models together: AI's real power

One of the most underrated insights: most useful AI applications aren't a single model — they're a chain of models, each doing one thing well.

A workflow I see regularly with small-business clients: record a meeting, generate a summary, send out action points. That sounds like one step, but it's actually two. First, a transcription model — a perception model — converts the audio to text. That's its only job, and it does it better than the built-in speech recognition on your phone, which is small and outdated. Then an LLM takes that raw text and writes a summary with action points, guided by a system prompt that specifies what the summary should look like.

Two models, two API calls, one working workflow. The transcription step and the summarisation step are independently replaceable — if a better transcription model comes out, you swap it in without touching the rest.

This is why I always start with the same question when I work with a client: which step in your work takes the most time and is the most repetitive? Nine times out of ten, there's a combination of two or three models that can take over almost that entire step.

Key takeaways

  • AI is an umbrella term: Machine Learning contains Deep Learning, which covers both generative models (LLMs) and perception models (transcription, image recognition).
  • An LLM predicts the most likely next token — it doesn't look anything up. That's why it can sound convincing and still be wrong.
  • The system prompt is the steering wheel: the same engine (the model) behaves completely differently depending on the instructions you send with it.
  • Real intelligence in a tool lives in the live API call, not in a smart-looking form with fixed text.
  • Most practical AI applications are chains of models — each does one thing well, connected via APIs.
TM

Written by

Tom Mekenkamp

AI consultant & founder of truck8.ai

15+ years leading transformations at AB-InBev, Royal BAM and beyond — now building AI products and helping SMEs implement AI.

From understanding to building

Now you know how AI works. The next step is putting it into practice yourself — with your own business data, your own use cases, and direct feedback. That's exactly what we do in the AI cohort: a small group of business owners who go from theory to working applications in eight weeks.

Explore the AI cohort