The Reverse Junior: Debunking the Main Myth of Vibe-Coding

AI doesn't struggle with complex code — it struggles with simple things like loops, arithmetic, and array indices. That inversion makes it far more dangerous than people think.

Yesterday Habr held an "Author's Fireside" event. It was very interesting, and one claim by the speaker stuck with me. He argued that AI can help write simple pieces of code but doesn't work well with complex things — in other words, large language models resemble a junior programmer.

I decided to write an article about this in the morning, drawing on my knowledge and experience in computational mathematics. I believe this is the main myth of vibe-coding. The reality is exactly the opposite — AI writes fairly complex things quite well and retrieves hard-to-find information with ease. But it trips up on elementary matters. This is a junior in reverse.

The problem is that this creates a dangerous illusion. I will explain clearly why, and what the consequences can be. Get your coffee ready and prepare for a debunking that may someday save you millions, your career, or even human lives.

The Core Argument

LLMs like ChatGPT or Claude do a decent job of imitating "complexity" — architecture, patterns, dataclasses, even reasonable module decomposition. But on "simple" things — basic math, loops, numerical methods, physics — they reliably produce exactly the kinds of errors that render all the UML diagrams worthless.

To be concrete, I'll walk through a specific, real example: a simulation of supersonic ballistic rocket flight that an AI generated for me. The full Jupyter notebook with the code is available via a link at the end. The code is far too long to fit in this article, so I'll present only the characteristic fragments.

The speed at which AI creates such a large codebase is impressive. But we're going to look at why it may sometimes be simpler to write everything yourself than to clean up afterward.

One important point: a fresh paid version of ChatGPT, when asked to review this code, does find some of the problems — regarding the Euler method, integration step size, and other "grown-up" issues. But it remains quite satisfied with the spots where far more mundane yet fatal bugs reside. Those are what we'll examine.

The "LLC Horns and Code" Effect

To illustrate the concept, let's introduce a term: the "LLC Horns and Code" effect.

Imagine a hypothetical outsourcing firm with that name. They excel at one thing: writing code that looks extremely convincingly like the real thing.

You open the repository — beautiful. There are core, domain, infrastructure layers. The code has UserRepository, PaymentService, Factory, Strategy, sometimes even AggregateRoot. A three-hundred-line file contains not a single goto and only a solitary TODO. The manager is happy, the client is calm, the architect neatly moves squares around in Miro.

Then the system goes to production. Under load, quadratic complexity surfaces on the hot path. Money is lost somewhere in decimal places because "I used double — what's the problem?" Retry is implemented as an infinite loop with sleep(100) in hopes of magic.

The facade is senior-level. The internals are a typical student who was given a book on patterns but never took a course on algorithms and system analysis. That's exactly what an LLM is today — "LLC Horns and Code on steroids." Very convincing form, completely random quality of substance wherever precise numbers — not words — are required.

A Personal Anecdote

I've done tutoring for a long time, and my clients have ranged well beyond school students. A few years ago, I was approached by employees of a typical firm engaged in Russian import-substitution software. They were creating a CAD system for pipe-joint calculations, and the people who came to me were specifically the ones building an import-substituted computational geometry library. They gave me problems from school-level plane geometry — 3,000 rubles per problem — each of which I solved and fully documented within 15 minutes.

It turned out these CAD developers:

  • Had no command whatsoever of matrices, derivatives, or even the basics of 7th–8th grade geometry
  • Had never even heard of computational stability of algorithms
  • Were unaware that everything they had been doing for the past six months was available as open-source, properly written, on the internet — and when I showed them, they said it wouldn't work because their management would realize they had downloaded the code and not written it themselves
  • Had spent six months writing a great deal of code that would fail on obvious tests — yet their tests (written by equally unqualified specialists) all passed

There are many such firms on the market, which I call "LLC Horns and Code."

The Supersonic Rocket: What the AI Generated

I used LLMs (Gemini and Claude) to model the flight of a rocket through the atmosphere on the active supersonic segment. After discussing the problem formulation with the AI, I settled on the following setup:

  • An atmosphere model: altitude, temperature, pressure, density, and speed of sound
  • Tabulated aerodynamic drag and lift coefficients as functions of Mach number
  • Engine thrust depending on fuel consumption and pressure differential
  • Rocket mass decreasing linearly with time
  • Numerical integration of an ODE system for velocity, trajectory angle, coordinates, angular velocity, and pitch angle
  • Comparison of several integration methods
  • Output: plots of trajectory, velocity, angle, and atmosphere parameters

I didn't generate it all at once — I acted as an experienced vibe-coder. I broke the task into small steps, verified and tested each step with the LLM, asked clarifying questions, checked the sources of the tables, and gradually built something working.

The result looks exactly like a respectable engineering project — proper atmosphere functions, hand-typed tables, analytical formulas, clean labels. If you're a manager or customer without experience in numerical methods, you'd probably think it's solid work. And here is where the real analysis begins.

Bug 1: The int() and Step Count Problem

Look at this line:

n = int(0.01 + (t_end - t0) / h)

At first glance, this is a known "trick": add a small constant to compensate for floating-point representation errors, then take int() as a floor substitute. Many wrote this in homework before proper libraries existed.

Let's look at real numbers used in the notebook:

t0 = 0.0
t_end = 4.23
h = 0.1

The machine representation of 4.23 / 0.1 is not a clean 42.3 but something like 42.300000000000004. After adding 0.01 we get 42.31000000000001, and int() truncates that to 42 — even though 42 steps of 0.1 cover only 4.2 seconds, not 4.23.

Then the time grid is built:

t_values = np.linspace(t0, t_end, n + 1)

That gives np.linspace(0.0, 4.23, 43) — an array from 0.0 to 4.23 with step approximately 0.100714... But the integration loop runs with step exactly h = 0.1. The state y_values[i] corresponds to time t = t0 + i * h, while the plot draws against t_values, which covers an entirely different grid.

The last point by t_values is 4.23 seconds, but the physical state it corresponds to is 4.2 seconds (42 steps of 0.1). A small but systematic discrepancy. In a real task where you need to analyze values at engine cutoff, this is already a serious desynchronization. The paid ChatGPT missed this entirely — it talked readily about method stability, but not about float and int().
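The desynchronization is easy to reproduce in a few lines. Here is a minimal sketch using the same constants as the notebook, contrasting the grid the plot uses with the grid the integration loop actually walks:

```python
import numpy as np

t0, t_end, h = 0.0, 4.23, 0.1

# The generated step-count "trick": nudge by 0.01, then truncate with int().
n = int(0.01 + (t_end - t0) / h)        # 42

# Plotting grid: n + 1 points stretched evenly from 0.0 to 4.23.
t_plot = np.linspace(t0, t_end, n + 1)  # step ≈ 0.100714, not 0.1

# Integration grid: the loop actually advances in steps of exactly h.
t_sim = t0 + h * np.arange(n + 1)       # last point is 4.2, not 4.23

print(n)                                # 42
print(t_plot[1] - t_plot[0])            # ≈ 0.100714 — not the h used to integrate
```

Every plotted point after the first is labeled with a slightly later time than the state it shows, and the gap grows to 0.03 seconds by engine cutoff.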

Bug 2: Pseudo-Abstraction — the Parameter f That Does Nothing

def euler_method(f, y0, t0, t_end, h, params):
    ...
    for i in range(n):
        y_values[i + 1] = y_values[i] + h * np.array(
            [f(t_values[i], y_values[i], params) for f in [f1, f2, f3, f4, f5, f6]]
        )

The signature implies you can pass any ODE system via the argument f. Inside, that f is immediately shadowed in the list comprehension for f in [f1, f2, f3, f4, f5, f6]. The real system is hardwired to global f1...f6, and the parameter f is window dressing that is never used.

The LLM has seen countless examples where a numerical method accepts a right-hand-side function and copies that pattern — but cannot verify that the argument actually participates in the computation. The result is a pseudo-abstraction: looks clean and flexible, internally nailed to a specific global implementation.
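For contrast, here is a minimal sketch of what an honest version of that signature could look like — a single vector-valued right-hand side that is genuinely used inside the loop, with the step count derived consistently with the grid (the function and variable names are mine, not the notebook's):

```python
import numpy as np

def euler_method(f, y0, t0, t_end, h, params=None):
    """Explicit Euler where the argument f really is the system's RHS.

    f(t, y, params) must return an array dy/dt with the same shape as y.
    """
    n = int(round((t_end - t0) / h))      # round, not a magic 0.01 nudge
    t_values = t0 + h * np.arange(n + 1)  # grid consistent with the loop below
    y_values = np.empty((n + 1, np.size(y0)))
    y_values[0] = y0
    for i in range(n):
        # The passed-in f participates in the computation — no shadowing.
        y_values[i + 1] = y_values[i] + h * np.asarray(f(t_values[i], y_values[i], params))
    return t_values, y_values

# Sanity check on dy/dt = -y, y(0) = 1: Euler gives exactly (1 - h)^n.
t, y = euler_method(lambda t, y, p: -y, [1.0], 0.0, 1.0, 0.1)
```

With the RHS injected this way, the same solver can be unit-tested against a system with a known closed-form answer — which is exactly the check the generated version made impossible.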

Bug 3: Silent Type Conversion and Fake Rounding

In the fragment that prepares tabular data:

P_table_01 = np.array([0] * L_01)
...
P_table_01[i] = int((P_table_01 // 1)[i] + ((P_table_01[i] % 1) // 0.5))

When the array is initialized with [0], NumPy creates an integer-typed array. Subsequent float results from calculate_P() are silently truncated. The fractional part vanishes without a trace.

The rounding formula itself is theatre: if P_table_01 is integer, P_table_01[i] % 1 is always zero, and the entire construction reduces to int(P_table_01[i]) — no rounding happens at all. The LLM generated a rounding-shaped template without verifying it actually rounds anything.
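Both traps fit in a few lines. This sketch (variable names mine) shows the integer dtype swallowing fractional values, the "rounding" formula collapsing to the value itself, and the one-argument fix:

```python
import numpy as np

# An array seeded with Python ints gets an integer dtype...
p = np.array([0] * 3)

# ...so any float assigned into it is silently truncated.
p[0] = 3.999
print(p[0])          # 3, not 4 — the fractional part is gone

# On an integer array, p[0] % 1 is always zero, so the whole
# "rounding" expression reduces to the stored value:
rounded = int((p // 1)[0] + ((p[0] % 1) // 0.5))
print(rounded)       # 3 — no rounding happened

# The fix is to declare a float dtype up front:
q = np.zeros(3, dtype=float)
q[0] = 3.999         # stored as a float, nothing truncated
```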

Bug 4: Aerodynamics Disabled by One Line

def alpha(y):
    return 0   # ideal control case

Since alpha(y) always returns zero, the lift force vanishes, the moment disappears, and all angular motion in the model becomes purely decorative. Meanwhile, beautiful graphs of angle of attack, pitch angle, and angular velocity are plotted — suggesting these quantities are being computed. One trivial line switches off an entire subsystem of equations.

Bug 5: The Wrong "Velocity Vector Angle"

def get_speed_vector_angle(y):
    return radians_to_degrees(np.arctan2(y[:, 3], y[:, 2]))  # arctan(y_c / x_c)

The comment says this is the velocity vector angle. In reality, it takes the x and y coordinates and computes the angle of the position vector from the origin — unrelated to velocity direction. The correct velocity angle should be computed from the time derivatives of the coordinates, (dx/dt, dy/dt). Elsewhere in the code the author has those variables, but the final plot uses the wrong arguments. This is not a complex physics mistake — it's the wrong arguments to arctangent.
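One way to fix it — assuming, as the buggy code itself implies, that columns 2 and 3 of the state array hold the x and y coordinates — is to differentiate the coordinates numerically and feed the derivatives to arctangent. The helper below is my sketch, not the notebook's code:

```python
import numpy as np

def get_speed_vector_angle(y, t):
    """Velocity-direction angle in degrees, from coordinate time derivatives.

    Assumes y[:, 2] is x(t) and y[:, 3] is y(t), as in the notebook's layout.
    """
    vx = np.gradient(y[:, 2], t)            # dx/dt
    vy = np.gradient(y[:, 3], t)            # dy/dt
    return np.degrees(np.arctan2(vy, vx))   # angle of velocity, not position

# A point moving along the line y = x + 5 has a 45-degree velocity direction
# everywhere, while its position angle from the origin is something else entirely.
t = np.linspace(0.0, 1.0, 11)
traj = np.zeros((11, 4))
traj[:, 2] = t            # x(t) = t
traj[:, 3] = t + 5.0      # y(t) = t + 5
print(get_speed_vector_angle(traj, t))     # 45.0 at every point
```

The original function, given this trajectory, would report about 90 degrees at the start — the direction toward the point, not the direction of its motion.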

What AI Caught vs. What It Missed

To be fair: I fed this notebook to the latest paid ChatGPT and asked it to find problems. At the high level it did well — it flagged that the explicit Euler method for a stiff nonlinear system where aerodynamic drag grows as V² is risky, that the step h = 0.1 is suspiciously large, and that an adaptive-step solve_ivp would be more robust. These are not trivial observations.

But the same model did not notice:

  • The int() step count error causing temporal desynchronization
  • The mismatched time grids between integration and plotting
  • The shadowed f parameter in euler_method
  • The integer array swallowing fractional values
  • The zeroed-out angle of attack disabling all aerodynamics
  • The position angle masquerading as a velocity angle

Each of those bugs lives in one or two lines. Each looks "simple." And each, individually, can entirely invalidate all the effort.

Where Else AI Reliably Fails on Simple Things

The rocket story is not unique. The same patterns appear in other tasks.

Ask an LLM to implement binary search by hand and it will very often mix up strict and non-strict comparisons, update low and high incorrectly, leave one element never checked, or hang in an infinite loop on a boundary case. By signature and structure it will be exactly the binary search from a textbook. But the boundary condition where < must become <= will be in the wrong place.
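For reference, here is the version with the boundary conditions in the right places, and the invariant that makes each comparison checkable stated explicitly:

```python
def binary_search(a, target):
    """Index of target in sorted list a, or -1 if absent.

    Invariant: if target is present, its index lies in [low, high].
    """
    low, high = 0, len(a) - 1
    while low <= high:          # <=, so a single remaining element is still checked
        mid = (low + high) // 2
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            low = mid + 1       # mid itself is excluded — no infinite loop
        else:
            high = mid - 1
    return -1

assert binary_search([], 1) == -1               # empty input
assert binary_search([7], 7) == 0               # single element
assert binary_search([1, 3, 5, 7], 7) == 3      # boundary: last element
assert binary_search([1, 3, 5, 7], 4) == -1     # absent value
```

Every one of the four asserts targets a boundary case of exactly the kind generated code tends to get wrong.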

With floating point it gets more interesting. The construct if a == b: for computed results appears in generated code with remarkable frequency. SQL queries are assembled via f-strings, opening the door to injection attacks — the model has seen a thousand examples of string interpolation and does "what people do."
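Both habits have one-line cures. A minimal sketch (the table and values here are illustrative, not from any real codebase): tolerance-based comparison instead of ==, and placeholders instead of f-strings so user data never becomes query text:

```python
import math
import sqlite3

# Comparing computed floats with == fails on representation error:
print(0.1 + 0.2 == 0.3)              # False — the classic surprise
print(math.isclose(0.1 + 0.2, 0.3))  # True — tolerance-based check

# SQL via f-strings invites injection; placeholders keep data out of the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

name = "alice' OR '1'='1"            # hostile input
# BAD: f"SELECT * FROM users WHERE name = '{name}'" would match every row.
rows = conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()
print(rows)                          # [] — the hostile string matched nothing
```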

Array bounds, empty containers, mixed-up radians and degrees, off-by-one errors in loops — all of these surface constantly in generated code, including in places any experienced programmer would catch immediately because they've seen the same bug in production a hundred times.

Four Practical Conclusions

First, AI cannot be considered the author of code. It is a draft generator. Architectural scaffolding, library boilerplate, dead-boring glue code — yes. Everything involving mathematics, physics, or discrete logic must be verified exactly as if it were written by a person who is completely illiterate in those subjects.

Second, do not delegate "simple" things to AI. On the contrary, simple things — steps, indices, types, invariants — must be reviewed with particular care. It will draw the architecture for you. But it will compute dt through int() division and drift 0.03 seconds in the wrong direction.

Third, junior developers still need to learn the fundamentals — not just "how to phrase a prompt." If someone doesn't understand the difference between the explicit Euler method and Runge–Kutta, or what numerical stability means, they won't see the problem in the integration scheme, the mismatched time grids, or the disabled aerodynamics subsystem. And the fundamentals must be taught especially well, starting from the very basics — including errors caused by machine arithmetic.

Fourth, the fact that AI can sometimes find its own errors does not make it safe. It is a useful tool for a second pass. But the responsibility for ensuring that the rocket in your model doesn't become a numerical fireworks display remains yours.

The AI Trust Rule: the more primitive the operation (addition, loop, conditional), the higher the probability that the LLM got it wrong. Think of it as code written by a philosophy professor who skipped arithmetic in elementary school.