Why I'm not a fan of AI everywhere

But it does have use cases

These days it seems that AI is everywhere. It’s being pushed in your face; every application has an AI component. Even your OS is going to be agentic.

In many cases I’m not a fan. But I can see some benefits under some circumstances.

Because of how many areas AI is getting into, this has become something of a long post (I’ve been putting it off for a while ‘cos I expected this!). And I’ve only covered a fraction of the potential use cases.

Background

When I’m talking about “AI” in this post, I’m mostly talking about the modern “Large Language Model” style of systems. These are, inherently, prediction systems. Based on the input, the trained model, and constraints on the output, the system will generate responses that are statistically likely to be relevant.
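
As a toy illustration of what “prediction system” means, here’s a sketch of my own in Python (no real model is anywhere near this simple): a tiny bigram “language model” that emits whichever word most often followed the current one in its training text. It has no concept of truth, only of frequency. Scale the idea up enormously and you have the shape of the problem.

    # A toy "language model": predict the next word purely from
    # statistics gathered over training text. No understanding,
    # no facts; just "what word usually follows this one?"
    from collections import Counter, defaultdict

    training_text = (
        "the cat sat on the mat "
        "the dog sat on the rug "
        "the cat chased the dog"
    ).split()

    # Count which word follows which (a bigram model).
    following = defaultdict(Counter)
    for current, nxt in zip(training_text, training_text[1:]):
        following[current][nxt] += 1

    def predict_next(word):
        # Return the statistically most likely next word, if any.
        counts = following[word]
        return counts.most_common(1)[0][0] if counts else None

    print(predict_next("the"))  # "cat": the most frequent answer, not the "true" one
    print(predict_next("sat"))  # "on"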

John Scalzi, a pretty successful SciFi author, has written about how xAI’s Grokipedia makes shit up, using himself as the subject of a query. He’s done this with other AIs and got similarly wrong results.

This statistical basis is also why AIs have trouble with basic maths and why they “hallucinate”.

Despite these failures, somehow these AIs can pass bar association exams or do well in maths tests. Yet they fail to be accurate in the real world, famously inventing court cases in legal filings. I suspect this is because exam questions and answers are widely published, and so show up in training data.

With this background I’m going to look at a few use cases and give my opinion on the suitability (or not) of AI.

I’m not going to discuss whether the current generation of AI is true AI (AGI), or whether the current path is going to lead to it. I’m not going to discuss how marketing has caused a terminology shift. I’m not going to discuss the potential AI economic bubble, nor the energy and water used. Nor the data and intellectual property stolen to create the current models.

What I’m going to talk about is the use of this AI.

Agentic OS

I’ll start here since I mentioned it earlier. And I really really don’t want this.

The problem, in my mind, is one of determinism. We have spent decades making computers do exactly what we tell them. We’ve invented specialized languages to try and be as precise as possible in making these statements.

Natural language is inherently imprecise. We see this, all the time, in online conversations. Emoticons (and later emojis) appeared in online text communities (eg Usenet) to try and avoid the problems caused by language ambiguities and by the lack of secondary information channels (tone of voice, facial expressions, body language); I might call my friend an arsehole or wanker as a mild humorous jab, but then use the same words to denigrate a stranger. An outsider, reading just the text, can’t tell the difference.

So if two humans can’t understand each other perfectly (even worse if one is English and the other is American), can we expect some AI to understand my intent? I really don’t want my computer to do unexpected things because it interpreted my natural language commands differently to what I intended.

A common counter to this would be a “Human In The Loop” (HITL) process, but this really won’t work. Imagine this scenario…

  • Computer, open the last document I was working on
  • Do you want to open pornographic_document.jpg?
  • Dammit, no! That’s not what I meant! Go away, I’ll do it using keyboard and mouse

It would get even worse as your PC starts to control more of your life; for example, if you have a smart house you won’t want to be prompted every time you ask it to turn a light on.

  • Computer, turn on the light
  • Do you want me to turn on the light?
  • Oh STFU, and just do it without asking me in the future.

As an Alexa user since 2016 I would have smashed the thing with a hammer if it had kept asking me for verification.

We will tune the algorithm (some things don’t need HITL, others will) but there will always be edge cases.

So, no; I don’t want my computer guessing my intent. I don’t want an agentic OS. I want my interactions with my OS to be deterministic, not probabilistic.

Speech recognition

But since I mentioned Alexa, something AI should be good at is speech recognition. Because it’s predictive it should be able to better recognise speech and perform a more accurate speech-to-text conversion. Words it doesn’t clearly recognise can be statistically determined based on what it did recognise. And, similarly, it can help detect false-positive activations.
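
As a minimal sketch of the idea (my own toy, not how any particular engine works; the scores are invented), imagine the acoustic front end offering several candidate transcriptions and a language model picking the statistically most plausible one:

    # Toy rescoring: the acoustic model offers several candidate
    # transcriptions; a (hand-waved) language-model score picks the
    # statistically most plausible one.

    # Hypothetical plausibility scores for words in a dictation
    # context; real systems use sequence log-probabilities, but the
    # principle is the same.
    WORD_SCORE = {
        "recognise": 5, "speech": 5, "wreck": 1,
        "a": 3, "nice": 2, "beach": 1,
    }

    def lm_score(sentence):
        return sum(WORD_SCORE.get(word, 0) for word in sentence.split())

    # Two acoustically similar hypotheses; statistics break the tie.
    candidates = ["recognise speech", "wreck a nice beach"]
    print(max(candidates, key=lm_score))  # "recognise speech"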

This can definitely assist in dictation scenarios, or even when talking to “smart devices”. It should also help with IVR (Interactive Voice Response) systems.

I distinguish this from the “agentic”, above, because we’re not trying to determine the intent of the speech, just recognise the speech itself.

Vibe coding

Agh, OMG, no. I hate this. This is the worst of everything I don’t want about an agentic OS, made 10x worse.

Programming languages exist as a means of being explicit to the computer about what we want. Yes, it’s a skill; not just learning the language but also learning to think in a logical manner, to break problems down. The coding aspect of programming is, really, easy when compared to the brain power needed to understand the problem and decompose it.

It’s this latter aspect that vibe coding is replacing, and why I feel it’s ultimately a bad idea.

With vibe coding you describe the end state and, magically, the AI produces code for you. Except it doesn’t quite work so you refine the description and it changes the code. And you iterate on this dozens of times and eventually you get something you think does what you intended. Success, right? You haven’t had to learn a programming language, you haven’t had to break the problem down. You just used natural language to do stuff.

Well, it depends.

If this is for a home personal project and you’re not concerned about security or efficiency or even 100% accuracy then this is fine. Linus Torvalds has famously vibe coded and people point to this and say “Hey, if the inventor of Linux does it then it must be OK”. But this was for a personal project.

Linus has also said “I’m not a programmer anymore”. Indeed, if you read that article, he “vibe codes” in real life: he writes an email stating intent (pseudo-code). There’s a HITL process: the person at the other end of the email, and the maintainers who control the merge process.

My small attempt at vibe coding resulted in working code… but not good code. There were architectural issues, even in the small program I wanted to write.

And there’s a big difference between “personal use” and “enterprise use”. Code you write today will be modified tomorrow (for some value of “tomorrow”). The vibed code in your repo won’t carry any of the context that produced it. Someone, or something, is going to have to understand what was written, why it does what it does, how it works. That person might be you (I joke “I try to write good code and document it because the next person to maintain it could just as easily be me”).

There’s a big difference between “working code” and “maintainable code”.

I’ve seen some people just say “Eh, the AI of tomorrow will be good enough to understand the code and build the context”, but that’s just magical thinking.

The other thing I’ve seen is “this is what code reviews are for”; this is the HITL part of the process. But this is, essentially, just moving the coding cost from the original coder to the reviewer. If a review now takes twice as long because the supplied code is harder to understand, then we’re not actually winning. The original developer may code faster (although this may be a misperception) but the whole workflow could be slower.

In the past we’ve complained about “Stack Overflow programmers” who just blindly cut’n’paste code. I suspect we’ll also start seeing the same about vibe programmers.

Especially since the AI was trained on Stack Overflow (and has probably killed it, judging by the traffic the site now gets) and so carries all of the bugs and security issues that come with that.

So, sure, vibe code for personal projects. But don’t take it anywhere near the enterprise. It will cost you in the long term.

Coding assistance

Now here I’m distinguishing coding assistance from actual coding itself. Indeed, almost the opposite; instead of telling the developer what to do we tell them what not to do.

We could have an agent sitting in the IDE that predicts mistakes and highlights them. There are already linters and static-analysis tools that do this, but I could see an LLM-based AI being better at detecting bad coding patterns.
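
To make that concrete, here’s a deliberately simple sketch of my own showing the kind of check such an assistant might run as you type, using a fixed AST rule. The pitch for an LLM-based version is that it could flag suspect patterns nobody has written an explicit rule for.

    # A simple "bad pattern" detector: walk the syntax tree and flag
    # mutable default arguments, a classic Python trap. An LLM-based
    # assistant would aim to spot patterns like this without needing
    # a hand-written rule for each one.
    import ast

    # The code under inspection.
    SOURCE = """
    def add_item(item, items=[]):
        items.append(item)
        return items
    """

    tree = ast.parse(SOURCE)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            for default in node.args.defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    print(f"line {default.lineno}: mutable default "
                          f"argument in '{node.name}()'")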

Today an LLM takes too many resources to be efficient locally (umm, unless we give every developer a beefy GPU in their laptop?) so this may be better as part of a SaaS scan, but smaller models may be more efficient in the future.

Having a model that can predict errors and warn the developer while they’re coding can cut down the detect/fix cycle time, improve developer efficiency and reduce overall development time. The HITL process is immediate.

Meeting summaries

I’ve worked with people who said “wow, this is great; it’s saved me so much time”. But whenever I’ve tried to use it I’ve found it… lacking. The problem, for me, is that it loses nuance and can easily overlook an essential statement. When you try to summarise an hour-long contentious meeting, getting a summary of “we discussed foo; bar was mentioned; baz was suggested as a solution” is just not good enough (although AI will expand that out to 5 or 6 paragraphs ‘cos it loves to be verbose). I’ve not seen it be wrong, but I’ve seen it miss important stuff.

This is definitely a scenario where HITL is essential. Don’t rely on AI here.

Document/website/whatever generation

I hate this. I mentioned, in the intro, how AI has caused lawyers problems because the documents it produced referenced non-existent cases.

I’ve seen marketing websites and “apps” (whatever was meant by that) that are AI generated, and came away wondering “just WTH did any of that mean”.

I’ve seen proposals that are so generic and meaningless that I’ve laughed at them and told the vendor to go away (to be fair, I’ve also told vendors with lazy sales people to do the same thing).

But the thing that gets me about a lot of AI generated content is its banality. Now maybe I’m just being exposed to the worst of the worst (eg on LinkedIn), but there seems to be a lack of tone to the documents. They’re just bland, lacking anything to keep me reading.

I’ve been told by people I know personally that when they read this blog they hear my voice in their head; I write this as I would speak it. AI generated content seems to lack this. And that leads to a form of uncanny valley. I get a gut aversion to the content.

Can I tell AI content from human content all the time? Heck, no. I probably get it wrong more often than not. But that just means other content is just as bad :-)

Using an AI to create bad content isn’t excused by the fact that humans also create bad content.

(And, hush; no calling this blog “bad content”!).

Marketing emails

Just FOAD.

I hate marketing spam at the best of times; sending AI generated slop is a quick way of getting me to hate you and your company and to tell everyone to avoid you.

In my career I’ve had people cold-email me; I’ve had these cold-emails lie to me (“I’ve tried to contact you…”; no you haven’t, I’d have blocked your company if you had!). I really don’t want AI spam in my inbox.

Fortunately I’ve retired, but this still triggers me :-)

As a joke (which I formulated over a decade ago), I’ve wondered if spam is going to be the path to true AI. See, the spammers use better and better systems to generate email that looks like it’s from a human, which means we need better and better systems to detect it. And this arms race is what is ultimately going to lead to a true AI… who might just destroy the world because of all the content it learned from!

Image generation

Y’know; I’m of two minds on this one. For an idiot like me I can see the use of image generation to help get messages across. You might notice a distinct lack of pictures on this blog; I’m not a visual person and my attempts at drawing might be considered “scribbles” on a good day. So a “vibe coded” image might be beneficial? Maybe?

This is kinda like vibe coding in general; for personal use I could see a benefit, but be very aware of using it in an enterprise or commercial situation. And doubly so if using it for marketing; you don’t want a potential customer to laugh at you because your images are so obviously bad.

And here’s another minefield. I said I wasn’t going to get into stuff like IP theft and the like, but this is an area that is ripe for abuse. I can’t stay silent when things like Grok can create Child Sexual Abuse Material (CSAM).

Now it’s not just Grok that can do this, but it’s the one with the worst guard rails.

Guard rails

And this kinda takes me all the way back to the beginning; when we’re parsing the intent of human language, how can we prevent abuse? Humans are already bad at this. Con men are adept at bypassing guard rails. Business Email Compromise is based on the premise that people can be convinced to bypass guard rails.

We call it “prompt injection” but it’s really the same thing; we convince the AI to do things it’s not meant to do. Whether it is creating porn images of celebrities, or CSAM, or revenge porn, or…

We’re already seeing use of this in scam calls.

In the future we’re going to see a LOT of fake images during election season. Previously Trump used AI to generate fake images promoting himself (eg hanging out with people of colour). Soon we’re going to see fake videos of political opponents doing and saying things they never did.

How can we build the necessary guard rails, especially since this stuff is open source and can be run locally? Fact checkers won’t help because we already have people believing “alternative facts”. Consensus reality appears to be a thing of the past, and everyone lives in their own private reality.

(This is depressing. Sorry! And it’s not really an AI problem, but it’s something AI will make worse).

I think we’ve entered a future where you can’t believe anything you don’t see or hear, in person, with your own eyes and ears. And, as noted, even that’s not infallible!

Summary

I’m not, in general, a fan of AI. I don’t want it on my machine, I don’t want it writing my code. I definitely don’t want it writing code for my employer (if I still had one). I’ve modified my web browser configs so that my search excludes AI results, because they’re frequently wrong.
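
For anyone wanting to do the same, the tweak is small. As of writing, for example, Google honours a udm=14 URL parameter that returns plain “web” results with no AI Overview, so a custom search-engine entry in the browser does the trick (other search engines have their own knobs):

    # Added via the browser's "manage search engines" settings;
    # %s is replaced by the query, udm=14 asks for plain web results.
    https://www.google.com/search?q=%s&udm=14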

But I can see areas where, as a consumer of services, it can make my life better (eg IVR systems).

Wherever this form of AI is used, it’s essential that HITL is present. And you need to evaluate the cost of that; it might turn out that you’ve shifted cost from stage 1 over to stage 3 of a process (in programming terms you’ve shifted right!) and potentially increased overall costs as a result.

I suspect this whole post will be controversial simply because of the areas I’m covering. Everyone has their own opinions. But that’s what this blog is here for; for me to opine on things!

If you have an AI area you want to hear me ramble on about, let me know :-)

And this post was written without the help of any AI; indeed I wrote it using vi in an ssh window.