Journey with AI 2026

Ah, AI. I've long held that man is lazy, and AI is a kind of laziness enabling machine that the world has never seen before. Not even slavery got us this close to living out our dreams of lazing around.

But enough with my random ramblings.

I recently ran an event with the GeeksHacking community on Vibe coding, and of course the attendees had opinions on AI and how this would pan out in the world and all of our careers. I myself have been experimenting with AI since early 2023 both for work and for my own use, and have my own opinions. Since this is my blog, and you're here, I assume you want to know my opinions. Here we go.

My read on the current AI technology landscape

Large Language Models

In the last few years, we've moved from Chat interface + LLM API, to... Many different kinds of Chat interface + LLM API. I'm not trying to be reductive here, but this is really what it is. LLMs are by definition token machines, and the basic token is a fragment of text/data.

What has been really interesting is seeing how the LLM reacts to different combinations of these fragments of text, in conjunction with some actually interesting engineering going on in every stage of the pipeline between the user's keyboard, to the power-hungry data centres, back to the user's screen.

Diffusion Models

Diffusion models seem to be moving slower than LLMs from a consumer's perspective. I can't really tell whether this is because the technology is just harder to harness and control, or if it's because there's just less interest (and therefore less investment and community) in it.

I think this is really interesting however, and although fewer players are in the game (and mostly big ones like Google, OpenAI, Bytedance), the advancements they have been making are really amazing and getting rather concerning. This is where we get deepfake videos on a scale that even pornographic sites couldn't have dreamed of two years ago.

Agents

If you thought AI was a confusing term, wait till you hear about Agents. There have been many definitions, and I like some of them. Simon Willison's definition is a good technical one:

'An LLM agent runs tools in a loop to achieve a goal.'

I quite like that, but I don't think it helps the general layman to understand the concept, so let me offer another possible - less technical - definition:

'An LLM agent is a software program that can do something on your behalf with or without your involvement'

This sounds scary, but this can describe anything from a search agent (like Gemini) or an answer agent (like a Hubspot AI agent), all the way up to a cybersecurity hacker agent (like Claude Mythos) or an engineering team agent (like Cursor) and everything in between.

This is also what you might use to describe a new-ish fad; OpenClaw.

I know OpenClaw claims to be the AI that actually does things, but try to look past the marketing spiel. They all do things. It's just a question of which things.

My Work Stack

I'm probably a little too deep to say that I have no bias. I do, and I'll try to explain why, but take my words with a pinch of salt.

I really liked Sean Geodecke's articles on AI; he has quite a balanced view of the ideal versus the reality. In this post he made a point that I agree with.

'Software engineers are paid to ship, not to learn.'

It's a hard lesson to learn, but it's very true. The business model isn't about the engineers improving, but about the engineers providing more value - sometimes future value - than their salary.

With this in mind, my company has been experimenting with AI in as practical a manner as we can. Let me talk through our experience.

We started pretty early; after ChatGPT came out, we were already using it to try and understand the technology. This was fun and sometimes funny, but not too useful just yet.

This opinion changed when GPT 3.5 came out, and we started considering the use of some open source solutions like Open-WebUI (then called Ollama WebUI) to access some of these models through the OpenAI API, and also to see how the open source solutions would run. We ran these in a local sandbox at first, and then later opened it up to the rest of the R&D team, and later to the company, to use.

We also started testing out some IDE plugins like Cline, Roo Code, and Continue, and attempted various different projects with these to get a sense of the cost and effectiveness of the models given some code editing harness.

And then Anthropic released Claude Code, and we hopped on this pretty early on. This was very interesting and for some of us, game changing. We found that certain teams got a lot out of this, and other teams really struggled to find effectiveness. Frontend and Backend and R&D teams all got a big boost, while QA and mobile teams found this less useful.

More recently, Anthropic has also introduced Claude Cowork, which is now making it possible for non-dev teams to use Claude with their own workflows, and finding some real value from this. We're now in the middle of a company-wide rollout of Claude to the whole company.

My Personal Stack

The work stack and my personal stack developed in parallel, but with very different constraints. At work, it's a team decision driven by company policy — there are tools I use personally that the company would never touch. At home, it's a different balance: security consciousness (no free tiers for me) and cost consciousness (I want efficiency or predictability for my token spend). That's what led me to Google AI Pro and DeepSeek.

As you might have seen from my previous posts, I've been using various AI models to do AI-assisted coding for my own side-projects. My current stack is rather Google focused, but only because I decided that Google AI Pro (which includes Jules, Antigravity, and Gemini) was really worth the $30 a month, and paid for a year upfront.

Gemini handles a bunch of search → data wrangling work, Jules does small and scheduled tasks, and Antigravity helps me to iron out most of the bumps. The models get better too, so Gemini 3.1 Pro is better than Gemini 3, and better than Gemini 2.5, and so on.

When Gemini fails, I use Claude Code or OpenCode together with DeepSeek API, and this has been a very cost-effective way of getting personal AI-assisted coding.

I also gave OpenClaw a try, but it wasn't transparent enough for my liking. So I've pivoted instead to Nanobot. This runs on a Raspberry Pi 4, and I use it mainly to help me write this blog. It manages my content calendar, runs scheduled checks on my drafts, integrates with my git workflow, and helps me draft ideas from my quick thoughts so I can come back around and edit the writing to my liking.

What I Dropped

ChatGPT

Like everyone, I started here, but eventually moved off. For work use, I don't prefer it - I like Claude better, and the Dev team landed on Claude anyway - and for personal use it's just too expensive. I keep up with it just to be aware of the technology.

Open-WebUI

Initially I used Open-WebUI to give me access to both local and cloud models, but as these models get bigger, and I get more impatient, the local models became much more of a nice to have. Eventally I started just using it as a router for the frontier models, and then pretty much don't use it now that I have Gemini.

Continue

This was an early player, and I'm sure they've changed since I last tried them. It was pretty interesting to use it, although initially very frustrating because I had to approve every single action it took. Of course, that's the better and more secure way, but it's also not very useful for saving time. I did manage to build Cash Register with this, but it was very time consuming when I did it.

LangFlow/Flowise

I spent quite a bit of time playing with this to see if I could form a LLM 'machine' to do something more complex than a simple model. As it turns out, the bitter lesson is indeed real. Successive model updates made this kind of LLM orchestration a little redundant, and more complex than necessary.

Qoder

This was a little interesting, and the model was actually pretty good. But it quickly faded away because the efficiency wasn't very high and I didn't want to maintain yet another pool of tokens for yet another model provider. Lock-in is quite real.

Cost Considerations

My monthly AI spending for personal use breaks down roughly as:

Google AI Pro (Jules, Antigravity, Gemini): $30
DeepSeek API: $3-5
nanobot: $0 (runs on existing hardware)
Total: ~$35/month

Getting Started

If you're new to AI assistants in 2026, here's my advice for you.

Start with one tool—don't try to master everything at once
Identify your primary use case—coding, writing, research, or automation?
Set a budget—AI can get expensive quickly if you subscribe to everything
Experiment freely—most platforms offer free trials
Be willing to switch—the landscape changes monthly

There are some pretty good standard stacks for certain work-focused use cases, but for personal use there's probably a lot more customizability that you can find, if you look hard enough. And why not look for it?

Wrapping Up

So that's my journey with AI so far in 2026. I started with ChatGPT like everyone else, went through a phase of trying everything, and have settled into a stack that works for me—Google AI Pro for the daily stuff, DeepSeek for cost-effective coding, and a Raspberry Pi running nanobot that helps me keep this blog going.

The landscape keeps shifting, and I'm sure my stack will look different a year from now. But that's the fun of it. If you're experimenting too, I'd love to hear what's working for you.