LLMs Alone Won't Design Rockets
Every week there’s a new product release or demo that makes people lean forward. One model writes tighter code. Another crushes a benchmark. Another explains quantum mechanics to a six-year-old in the voice of a pirate. It feels like acceleration, and the subtext is absorbed almost unconsciously:
The breakthrough we’re waiting for is already inside the model. We just need better access.
Scale it more. Tune it more. Prompt it better. Wrap it in an agent loop. And somewhere in that fog of parameters, the next propulsion system, the next material, the next cure will fall out.
It’s an appealing notion. It’s also wrong in ways that matter.
People talk about the “latent space” as if it were a mysterious reservoir of undiscovered truths. It isn’t. It’s a map — vast, dense, flexible — constructed from the material the model consumed during training. The model can interpolate between points on that map. It can blend distant regions. It can produce combinations no human would think to try.
But the envelope of that map doesn’t move on its own.
A language model is essentially static — a frozen crystal of implied knowledge.
Inside the boundary, you get impressive recombination.
Outside it, there’s no ground to stand on.
So when people suggest that a model will stumble into novel physics by rummaging around its own interior, they’re really claiming that rearranging old information will somehow generate new information. It might produce something novel within the envelope of established understanding. But discovery only occurs when the boundary itself shifts — and that shift requires something the model cannot do.
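A deliberately toy sketch makes the envelope point concrete. Everything here is invented for illustration (a two-dimensional "latent space," a handful of hand-picked points); no real model is organized this way. Blending known points with convex weights produces combinations that were never in the set, but never a point outside the region they span:

```python
import numpy as np

# Toy "latent space": a few known points standing in for training data.
known_points = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
])

rng = np.random.default_rng(seed=0)

# Blend the known points with random convex weights (non-negative, summing to 1).
for _ in range(5):
    weights = rng.dirichlet(np.ones(len(known_points)))
    blend = weights @ known_points
    # Every blend lands inside the square spanned by the known points;
    # no amount of recombination produces a point outside that envelope.
    print(blend)
```

A real model works in far higher dimensions and does more than take convex combinations, but the analogy is the point: recombination explores the interior of the map. It does not extend it.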
Progress — in science, engineering, craft, or anything that touches the world — comes from running a hypothesis and seeing what the world does back.
You build the prototype, fire the engine, check the telemetry, see what failed, figure out why, adjust. That loop is where the unexpected appears. It’s where constraints enter the picture. It’s how the map gets extended.
A model can describe that loop.
It can comment on it, speculate about it.
But it doesn’t perform it.
And that’s a subtle but crucial distinction: the model isn’t the part that moves.
The world is the part that moves. The model just helps you talk about the movement.
When a model gives you a correct answer and when it gives you a hallucination, the same mechanism produces both. A hallucination isn’t a malfunction; it’s simply output with no grounding. There’s no internal meter that says, “This aligns with reality.” No internal sense that anything is off. No built-in warning when it fabricates.
To the model, a polished truth and a polished falsehood are functionally identical.
It’s essentially a radio that can’t tell whether it’s tuned to a signal or to static.
Sound comes out either way, and we decide which is which.
Breakthroughs don’t live inside the weights.
This isn’t a question of intelligence.
It’s a matter of verification.
You can feed a model real results — measurements, failures, telemetry — and it may reason over them beautifully. Sometimes better than the humans who produced the data. But notice what’s happening:
You touched the world.
You ran the experiment.
You curated the information that mattered and shaped the context.
The model didn’t generate the new constraint that updated your understanding.
It processed the inputs; the world did the rest.
That’s its role: powerful inside the loop, but not the loop itself.
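As a sketch of that division of labor, the loop looks roughly like this. All the names are hypothetical: propose_adjustment stands in for the model call, run_experiment for the part only the world can supply.

```python
import random

def propose_adjustment(history):
    """Stand-in for the model: recombines past results into a plausible next guess."""
    last = history[-1]["setting"] if history else 1.0
    return {"setting": last * random.uniform(0.9, 1.2)}

def run_experiment(design):
    """Stand-in for reality: the only place a genuinely new constraint can come from."""
    measured = design["setting"] * random.gauss(0.95, 0.05)  # toy telemetry
    return {"setting": design["setting"], "measured": measured}

history = []
for trial in range(5):
    design = propose_adjustment(history)   # the model is inside the loop
    result = run_experiment(design)        # the world closes it
    history.append(result)                 # curated results shape the next pass
    print(f"trial {trial}: proposed {design['setting']:.2f}, "
          f"measured {result['measured']:.2f}")
```

Swap the toy functions for a real model call and a real test stand and the shape stays the same: the model proposes, the world answers, and you carry the answer back in.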
The same limit appears in the everyday domains people project these systems into. They imagine the model assisting their life — scheduling, finances, goal-aware meal planning, the whole personal cockpit.
But steering means aiming at a destination, noticing what actually happened, and adjusting.
It requires catching the drift, the mismatch, the change in conditions.
It requires real contact.
A language model can support that.
It can’t replace it.
The radio helps you navigate.
It doesn’t know where you are.
Discovery, engineering, and agency all depend on interaction with reality.
Models don’t replace that. At their best, they amplify it.
Which is why — whether you’re aiming for the moon or just trying to run your day...
LLMs alone won’t design rockets.