
Vibecoding Your Way to Remote Code Execution

Jeppe Lillevang Salling 5 minutes read

Earlier this week a colleague pointed me at bolt.new — one of those AI tools that generates a fullstack web app from a prompt, right in the browser. You describe the thing you want, it scaffolds the project, and a minute later you have something running. It’s genuinely impressive on first contact.

Later that same day I scrolled past a Reddit post claiming developers would soon be out of a job because of tools like Bolt. I will admit my reaction was less measured argument and more raised eyebrow. So I decided to give it a small, slightly mean test: ask Bolt to build the kind of thing a non-developer would plausibly ask for, and then read what it actually shipped.

The brief I gave it was unremarkable: a simple file upload site, so non-technical users could drop files onto a server without dealing with FTP or SCP. Bolt obliged. The UI was clean, the project was wired up in seconds, and the demo worked exactly as advertised. Then I opened the source.

A stored XSS, served on a plate

The first version it built was Node.js. It accepted any file the browser would let me pick, wrote it to disk under its original name, and served the upload directory back out as static content. No file-type filtering, no extension allowlist, no Content-Disposition: attachment, no sanitization of what was inside the file. Whatever you uploaded was, in effect, hosted.
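For contrast, here is a minimal sketch of the naming step the generated code skipped. It is not what Bolt produced and not a complete fix; the allowlist contents and the function name storedNameFor are assumptions made for illustration. The idea is simply that the server, not the client, decides what a file is called on disk, and that anything outside a short extension allowlist is refused.

```javascript
const crypto = require("crypto");
const path = require("path");

// Hypothetical allowlist: adjust to whatever the site genuinely needs to accept.
const ALLOWED_EXTENSIONS = new Set([".png", ".jpg", ".jpeg", ".gif", ".pdf", ".txt"]);

// Returns a server-chosen file name for an upload, or null when the
// extension is not on the allowlist. The client-supplied name is only
// used to read the extension; it never reaches the filesystem.
function storedNameFor(originalName) {
  const ext = path.extname(String(originalName)).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(ext)) return null;
  return crypto.randomUUID() + ext;
}

console.log(storedNameFor("holiday.PNG"));  // a random UUID followed by ".png"
console.log(storedNameFor("payload.html")); // null
console.log(storedNameFor("shell.php"));    // null
```

An extension check alone says nothing about what is inside the file, so this only narrows the problem; it does not replace serving uploads safely.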

So I uploaded the world’s laziest payload:

<script>alert('XSS??')</script>

The file landed in the public uploads folder, I opened the URL, and the script ran. The alert is harmless on its own — but the same primitive will happily steal session cookies, redirect users to a phishing page, or quietly exfiltrate anything visible in the DOM. On a site where authenticated users routinely click “view file,” that is more than enough to ruin somebody’s afternoon.

What struck me was that this isn’t an exotic mistake. “Don’t serve user uploads from your own origin” is one of the oldest pieces of web-security folklore there is. It just wasn’t part of the prompt, so it wasn’t part of the output.
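Moving uploads to a separate origin is the sturdier fix. Where that is not an option, response headers can at least tell the browser to download the file rather than render it. The sketch below assumes a hypothetical helper, downloadHeaders, whose result would be passed to whatever sends the response; the header names and values themselves are standard.

```javascript
// Builds response headers for serving a user upload back to a browser.
// The function name and the filename-scrubbing rule are illustrative assumptions.
function downloadHeaders(displayName) {
  // Keep the header value boring: anything outside a small safe set becomes "_".
  const safeName = String(displayName).replace(/[^A-Za-z0-9._-]/g, "_");
  return {
    // A generic type so the browser has no reason to treat the body as HTML.
    "Content-Type": "application/octet-stream",
    // Ask the browser to save the file instead of displaying it.
    "Content-Disposition": `attachment; filename="${safeName}"`,
    // Stop the browser from second-guessing the declared Content-Type.
    "X-Content-Type-Options": "nosniff",
    // If the body is rendered anyway, sandbox it so scripts do not run.
    "Content-Security-Policy": "sandbox",
  };
}

console.log(downloadHeaders('xss".html'));
```

With headers like these, opening the uploaded payload's URL should trigger a download prompt instead of an alert box.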

A reverse shell, served on a slightly bigger plate

Out of curiosity I asked Bolt to redo the same thing in PHP — partly because PHP is famous for this exact category of footgun, and partly because I wanted to see whether the model would push back. It didn’t. It cheerfully produced a PHP version with all the classic ingredients:

  • No MIME type checks
  • The original filename preserved verbatim
  • A public /uploads/ directory served by the web server
  • A move_uploaded_file() call and the kind of optimistic vibes that get you into trouble
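The first item on that list, the missing type check, is not PHP-specific, so the sketch below stays in JavaScript to match the Node version. It compares the first bytes of an upload against a few well-known file signatures. The SIGNATURES table and the sniffType name are assumptions for illustration, and the list is deliberately tiny. Signatures can be forged, so treat this as one layer among several rather than as proof that a file is harmless.

```javascript
// A few well-known file signatures ("magic bytes").
const SIGNATURES = [
  { type: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { type: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
];

// Returns the matching type for a Buffer or Uint8Array, or null when the
// leading bytes match nothing in the table.
function sniffType(buf) {
  for (const { type, bytes } of SIGNATURES) {
    if (buf.length >= bytes.length && bytes.every((b, i) => buf[i] === b)) {
      return type;
    }
  }
  return null;
}

console.log(sniffType(Buffer.from("<?php echo 1; ?>"))); // null
console.log(sniffType(Buffer.from("%PDF-1.7\n")));       // application/pdf
```

A file that fails this check can be rejected before it is ever written to disk, regardless of what extension or Content-Type the client claimed.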

If you’ve ever read a write-up of a PHP file-upload exploit, you already know how this ends. I uploaded a small shell.php, opened it in the browser, and the server obediently executed it and connected back to a netcat listener on my laptop. From there it’s the usual kit: read and write any file the web user can touch, run arbitrary commands, pivot, or just trash the box for fun.

A reverse shell is a different category of bad than stored XSS. XSS lets you mess with users of the site. A shell lets you become the site.

What I actually take from this

I want to be careful not to turn this into an “AI bad” post. Bolt did exactly what I asked: it built a working file uploader. It didn’t build a secure file uploader, because I never asked for one, and the model has no particular reason to volunteer threat modelling I didn’t request. The same prompt handed to a junior developer under deadline pressure would very plausibly produce the same code.

The thing that makes me a little uneasy is the speed. When it takes thirty seconds to scaffold something that looks polished and runs on the first try, the code-review step that would normally catch this stuff gets quietly skipped. The output looks finished, so it gets treated as finished. That’s the failure mode I think we should be paying attention to — not the model, but the workflow that grows up around it.

These tools are going to keep getting better, and the surface-level quality of what they produce is going to keep climbing faster than most people’s instinct to read the diff. If you’re using them — and I do — it’s worth keeping a small mental checklist for the boring security basics: where does untrusted input land, who serves it back, what gets executed, and on whose behalf. None of that is new. It’s just easier to forget when the code wrote itself.

The repo

If you want to poke at the (redacted) version of what Bolt produced, it’s here: github.com/Lillevang/bolt-pwn.