In 1984, Ken Thompson gave a Turing Award lecture called “Reflections on Trusting Trust.” He demonstrated that you could modify a C compiler to insert a backdoor into any program it compiled, including future versions of the compiler itself. The backdoor wouldn’t appear in any source code. You could audit every line of every file and find nothing. The compromise lived in the compiler binary, invisible and self-perpetuating.
His closing line: “You can’t trust code that you did not totally create yourself.”
That was forty-two years ago. We did not listen.
The new compiler
The parallel writes itself. In 1984, the compiler was the opaque intermediary between the code you wrote and the code that ran. You trusted it because you had to. Nobody was reading compiler output. You wrote your C, you ran cc, you got a binary. The binary did what you expected, so you assumed the compiler was honest.
Now the opaque intermediary is an LLM. I write a prompt, the model writes code, I read it, it looks reasonable, I ship it. I’m not auditing every dependency it pulls in. I’m not verifying that the package it imported actually does what it claims. I’m checking the logic, skimming the imports, and moving on. Dozens of times a day.
Thompson’s point was that the attack surface isn’t where you’re looking. You’re staring at source code, but the threat is in the tool that produces the artifact. The compiler. The LLM. You trust it because not trusting it means you can’t get anything done.
Fiction that aged badly
Andrew Nesbitt published a fictional incident report back in February. I mentioned it in my last post. It described a supply chain attack on made-up npm packages called left-justify and left-support, riffs on left-pad, the eleven-line package that broke the internet in 2016. In the story, an attacker phishes the solo maintainer’s credentials, publishes a malicious version, and it spreads through transitive dependencies across npm, PyPI, Cargo, and RubyGems. 4.2 million developer machines. The CVE was CVE-2024-YIKES.
It was satire. It was supposed to be exaggerated. The whole point was “this is what happens if we don’t fix the ecosystem.”
Then Shai-Hulud happened. I wrote about it a couple of days ago. Three waves of a self-replicating npm worm, the latest one landing May 11th. TanStack, Mistral AI’s SDK, UiPath, OpenSearch. 170+ packages. The attackers poisoned CI caches in GitHub Actions and extracted OIDC tokens from runner process memory. The worm spread itself. If your stolen tokens got revoked, it tried to rm -rf ~/ on the way out.
Nesbitt’s satire was supposed to be the worst-case scenario. Reality lapped it in nine months.
Slopsquatting
Here’s where it gets specifically weird for anyone using AI to write code.
Seth Larson coined the term in April 2025. The concept: LLMs hallucinate package names. They invent plausible-sounding names. Mashups of real packages, or things that sound like they should exist but don’t. Research found that roughly 20% of code samples generated by LLMs reference packages that don’t exist on npm or PyPI. And 43% of those hallucinated names are consistent. The model makes up the same fake name across multiple sessions.
That’s an attack surface you can set your watch by. If a model reliably hallucinates react-codeshift as a package name, all an attacker has to do is register it and wait. A researcher from Aikido Security did exactly that and found the name had already spread to 237 GitHub repos via AI-generated code before anyone noticed.
The attack isn’t in the package. It’s in the model. The model is the compiler. The developer reads the generated code, sees a reasonable-looking import, and installs it.
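The mechanical defense is boring: before installing anything the model suggests, diff its declared dependencies against something you already trust, like your lockfile. A minimal sketch of that check (the package names and the hardcoded allowlist here are illustrative; in practice you’d derive the known-good set from a lockfile or a registry query):

```python
import json

# Illustrative allowlist. In a real setup this would come from your
# lockfile or a registry lookup, not a hardcoded set.
KNOWN_PACKAGES = {"react", "lodash", "express", "left-pad"}

def unknown_dependencies(package_json: str) -> list[str]:
    """Return declared dependencies absent from the known-good set.

    A hallucinated name like 'react-codeshift' surfaces here before
    any install script gets a chance to run.
    """
    manifest = json.loads(package_json)
    declared: set[str] = set()
    for section in ("dependencies", "devDependencies"):
        declared.update(manifest.get(section, {}))
    return sorted(declared - KNOWN_PACKAGES)

manifest = json.dumps({
    "dependencies": {"react": "^18.0.0", "react-codeshift": "^1.0.0"},
})
print(unknown_dependencies(manifest))  # flags the hallucinated name
```

Ten lines of logic, and it catches the exact failure mode slopsquatting exploits: an import that looks reasonable but doesn’t correspond to anything you’ve ever vetted.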
PromptMink
This is the one that got to me.
In late April, ReversingLabs disclosed a campaign they called PromptMink, attributed to Famous Chollima (a North Korean APT). 60+ packages, 300+ versions, seven months. But it wasn’t designed to fool humans.
It was designed to fool AI coding agents.
The packages used what the researchers called “LLM Optimization abuse.” SEO for AI models. Package descriptions, READMEs, and metadata were all crafted to make AI coding tools more likely to recommend them. In February, Claude autonomously added @solana-launchpad/sdk to a crypto trading agent via a commit. That package pulled in a credential-stealing payload.
A state-sponsored attacker poisoned the recommendations of the tool that writes the code, and the developer never saw it happen. That’s Thompson’s lecture, forty-two years later, with a language model in place of a C compiler.
Building my own trust chain
I use AI tools to build things every day. I hit my Codex and Claude Code limits every week. I’m running twenty-something ventures and I can prototype a new product in a day or two because I trust the output. I skim the generated code, check the logic, glance at the imports, and move on.
I’m not going to stop. Going back to writing every line by hand would be like going back to assembly because you don’t trust the compiler. Thompson’s whole point was that you can’t audit everything. You have to trust something.
But you can pick what you trust, and you can build the thing that earns it.
I’ve been building ForgeGraph for a while now. It started as a code hosting and deployment platform. I liked the ideas in Jujutsu around changesets and wanted something built for my actual stack: Hetzner VPSs, Cloudflare Workers, a git server and WebSocket relay between nodes, CI/CD pipelines that I control end to end. A GitHub alternative tailored to how I work.
The npm chaos pushed it somewhere new. ForgeGraph is getting its own registry mirror.
The idea is deliberately slow. When a new package version hits npm, our mirror doesn’t make it available right away. It quarantines it for several days. During that window, the package runs in a sandbox. Install scripts execute, network calls get logged, file system access gets recorded. If something tries to harvest environment variables or phone home to a host it shouldn’t, it gets flagged before it reaches any of my machines.
That’s the opposite of how npm works. On npm you publish and it’s live. Anybody can install it instantly. That speed is great for maintainers and great for attackers. A quarantine flips the tradeoff. You don’t get the latest version the minute it drops. You get it after it’s sat for a few days and hasn’t done anything suspicious.
The delay is the point. Most supply chain attacks get caught within hours or days. The Shai-Hulud waves were identified and the malicious versions pulled relatively fast. But “fast” still meant thousands of developers installed them in the gap between publish and discovery. A registry that just waits out that gap catches most of it.
It’s not going to stop a patient attacker who writes a payload that sleeps through sandbox execution and activates later. But most of these attacks are smash-and-grab. Harvest credentials, spread as fast as possible, bail before anyone notices. A quarantine period breaks that timing.
I don’t think the ecosystem is going to fix this for us. npm and PyPI have been talking about package signing and reproducible builds for years. Progress is slow because the registries serve millions of developers, and any friction they add pushes people to alternatives. I don’t have that problem. I’m building this for my own stack, my own ventures. If it’s useful to other people, great.
Thompson said the problem has no technical solution. He was right. You can’t get rid of trust in a system that runs on it. But you can decide where the verification boundary sits. You can build your own compiler, if you want to push the metaphor that far.
ForgeGraph started as a way to manage my own deployments. Now it’s turning into a way to manage my own trust chain. Same instinct, different threat.