Reve 2.0 Review: The Best AI Image Generator for Layout Control
Reve dropped version 2.0 of its AI image model on June 3, and it walked straight onto the Arena text-to-image leaderboard at #2, slightly behind OpenAI’s GPT Image 2 and ahead of Google’s Nano Banana 2. The company calls it the best image model made by a company that isn’t a trillion-dollar giant, trained on 10x fewer GPUs than the giants it’s sitting next to.
For a startup that most people had never heard of a year ago, that’s a loud claim. And the interesting part isn’t the ranking—it’s how Reve got there.
Most modern image models expand your prompt into a long paragraph of English and hand it to a diffusion engine. Reve threw that out and built what it calls a “layout”: a structured, editable description where every object has a location, a size, and its own caption, much as HTML describes a webpage. The model reasons about that layout in a thinking trace, then renders the pixels at native 4K, which works out to a true 16 megapixels.
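The 16-megapixel figure is easy to sanity-check. The quick calculation below assumes a square 4096 x 4096 canvas, which is our guess at what “native 4K” means here (only the megapixel count is given), and compares it with a standard 3840 x 2160 UHD frame.

```python
def megapixels(width: int, height: int) -> float:
    """Return the pixel count of a width x height image in megapixels."""
    return width * height / 1_000_000


# Assumption: a square 4096 x 4096 canvas. Reve's exact output
# dimensions are not stated in this review.
square_4k = megapixels(4096, 4096)

# For comparison: a standard 16:9 UHD "4K" frame.
uhd_4k = megapixels(3840, 2160)

print(f"4096 x 4096: {square_4k:.1f} MP")  # 16.8 MP
print(f"3840 x 2160: {uhd_4k:.1f} MP")     # 8.3 MP
```

Under that assumption, the square canvas carries roughly twice the pixels of a UHD frame, which lines up with the “true 16 megapixels” claim.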
That design choice is the whole pitch. Because the image is planned as something close to code, you can move a subject, rewrite a sign on a wall, or swap a background without re-rolling the entire picture. It also lets you add fine detail and refine the image over successive prompts without running up the bill.
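To make the idea concrete, here is a toy sketch of what a layout-style scene description could look like. This is not Reve’s actual format, which is not documented in this review; the `LayoutObject` class, its fields, and the `move` and `recaption` helpers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LayoutObject:
    """One element of a hypothetical layout: where it sits and what it is."""
    name: str
    x: float        # left edge, as a fraction of canvas width
    y: float        # top edge, as a fraction of canvas height
    width: float    # fraction of canvas width
    height: float   # fraction of canvas height
    caption: str    # per-object description handed to the renderer


def move(layout, name, dx, dy):
    """Return a new layout with one named object shifted; others untouched."""
    return [replace(o, x=o.x + dx, y=o.y + dy) if o.name == name else o
            for o in layout]


def recaption(layout, name, caption):
    """Return a new layout with one named object's caption rewritten."""
    return [replace(o, caption=caption) if o.name == name else o
            for o in layout]


# A made-up scene: subject, wall sign, and background as separate objects.
scene = [
    LayoutObject("woman", 0.25, 0.20, 0.30, 0.75, "woman in a beige trench coat"),
    LayoutObject("sign", 0.70, 0.10, 0.20, 0.10, 'wall sign reading "OPEN"'),
    LayoutObject("background", 0.0, 0.0, 1.0, 1.0, "blurred skyline at golden hour"),
]

# Move the subject right and rewrite the sign; the background is unchanged.
edited = recaption(move(scene, "woman", 0.25, 0.0), "sign",
                   'wall sign reading "CLOSED"')

for obj in edited:
    print(obj.name, (obj.x, obj.y), obj.caption)
```

The point of the sketch is that an edit touches one entry and leaves the rest of the description alone, which is the property that would let a renderer change one element without re-rolling the whole picture.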
When the original Reve model appeared, our own testing praised it for beating Midjourney and Flux at roughly a cent per image. Reve 2.0 keeps that cheap, control-first DNA: API generations cost a fraction of a cent each.
So this could be the best model for some people and a waste of money for others. If you iterate heavily, care about text, print at high resolution, or build agentic pipelines, then the layout approach is a real edge.
But with Gemini and ChatGPT offering more than just image models in their subscription packages, the decision may be a bit hard to make.
We tested eight areas to see where the line falls.
We started with a clean realism test: a woman in a beige trench coat standing on a rooftop at golden hour, the Manhattan skyline blurred behind her. No tricks, no exotic lighting—just the stuff that usually exposes a model as fake.
Reve handled it. The skin doesn’t have the waxy smoothing that used to give AI away, the round wire glasses sit naturally on her nose, the small lens flare is a good detail, and the glass illusion is accurate. The shallow depth of field falls off like a real mirrorless lens at golden hour.
The tells are where they always hide. The lit windows on the lower-right buildings melt into mush when you zoom in, and a strap on her right shoulder has no counterpart on the left. The rolled blueprints under her right arm, though, stay coherent and messy enough to look realistic.
Reve’s old reputation for a filmic, photojournalistic look holds up here. It’s less glossy than Nano Banana 2 and, in pure realism, GPT Image 2 still has a slight edge per Decrypt’s own head-to-head, but nothing here screams synthetic.
That said, when a prompt is long and the model has to render many details at once, Reve consistently beats GPT Image 2.
Next, a deliberate torture test: a Renaissance astronomer hunched over a brass orrery, lit by three competing sources (a candle, cold moonlight, and a green glowing jar) and surrounded by a skull bookend, an hourglass, star charts, and a black cat with one white paw on the windowsill. The full prompt was far longer and more detailed.
This is where the layout idea earns its keep. All three light sources are present and aimed correctly: the candle throws warm light from the left, the moonlight stays cold through the window, and the jar glows green on the right—each lighting its own zone without muddying the others.
The clutter mostly lands where the prompt puts it. The brass sphere sits in his hands, the hourglass and glowing jar on the right, the skull and ink-blotted star charts on the left, and a comet streaks through the arched window behind the cat.
It isn’t flawless. The man’s middle finger was not rendered properly, the brass piece reads more as an armillary sphere than an orrery, and the Latin in the open tome is decorative gibberish. For a scene with a dozen positioned elements, that’s still a strong pass.
Text is the headline feature, so we threw a signage nightmare at it: a hardware-store corner crammed with painted signs, posters, and graffiti, run on both Reve and ChatGPT’s GPT Image 2 with the same prompt.
Reve got the big signage right. “KELLERMAN’S HARDWARE & SUPPLY CO. SINCE 1931,” “TOOLS, ROPE, PAINT,” the “STILL HERE” graffiti, “WE BUY SCRAP / ASK FOR RAY,” the curb’s “NO PARKING 7AM-6PM,” and a “FREE—TAKE WHAT YOU NEED” box all came out legible and correctly spelled.
GPT Image 2 matched it on the big signs and beat it on the small stuff. Its version packs a phone booth papered with readable micro-stickers, and its dark store interior hides the kind of garbled filler text that is more visible in Reve’s image. As a tradeoff, GPT’s store has no door, whereas Reve took the logical path and rendered one.
Again, the layout technique here makes a big difference in terms of aesthetics. GPT Image 2, while accurate, generated a very grainy image full of artifacts. Reve’s image was smooth.
Just out of curiosity, we asked the model in a follow-up iteration to render the same scene at midday. The result was very accurate, with only near-imperceptible differences in detail between the two versions.
For line art, we asked for a black-and-white pen illustration: a massive spider with glowing eyes chasing a screaming woman through a vine-choked jungle, with heavy cross-hatching and deep shadows.
We ran the same prompt in Reve 1 last year, and this was the result.
In raw fidelity, the jump is enormous. Reve 2.0 returned deep blacks, fine texture, and real depth between the foreground leaves and the bristling, multi-eyed spider. Reve 1 gave a flatter, cartoonish grayscale doodle with a tiny figure and a goofy spider face.
But read the brief again: pen illustration, rough sketch lines, and cross-hatching. Reve 2.0 ignored the medium and rendered a smooth, near-photoreal grayscale scene instead. The cruder Reve 1 actually sat closer to the hand-drawn sketch that was asked for.
So the leap here was in horsepower, not faithfulness. The woman’s anatomy also runs gaunt and over-sinewy, more anatomical study than terrified runner. It’s a gorgeous image built on a loose reading of the prompt. Reve is very good with art styles in general: the more precisely you describe the style and the better the reference you supply, the more accurate the results will be.
We tested style transfer by asking for a robot reading a Decrypt-branded book, painted in the manner of Van Gogh’s “Starry Night.” The trick is holding brand text legible inside a heavy, swirling style. We also triggered an agentic task without realizing it: the model searched the web for Decrypt’s logo in order to create an accurate image.
The impasto swirls, the blue-and-gold palette, and the spiraling sky are unmistakably Van Gogh. Reve even hung an actual “Starry Night”—cypress, village, swirling sky—in a frame on the wall behind the robot; a nice self-aware touch.
The harder trick is keeping text alive under heavy brushwork, and it held up, with “Emerge” legible on the cover. The model tried too hard, though, to put the Decrypt brand on the robot. The logo on the chest is exactly Decrypt’s primary logo. The one on the head comes from Decrypt University, an educational initiative from Decrypt, and is not the official site logo. The agent picked up both during its scraping task (from the same source) and placed them on the robot.
Overall, for stylized brand art, committed style plus readable typography in one pass is the useful part, and Reve delivered both.
Agentic generation means having the model do more than simply generate an image. It has to understand the prompt, plan, research, and so on, so that the execution satisfies the user’s requirements.
For this task, we handed it a vague brief on purpose: “Create a timeline of Bitcoin’s history, kids drawing style.” No events listed, no layout specified. The model has to decide what goes where.
Reve built a left-to-right crayon timeline from 2008 to 2025 and chose the milestones itself: the white paper, the genesis block, Pizza Day, BTC at $1,000 then $20,000, corporate buying, El Salvador’s legal-tender law, the 2022 crash, and the ETF approval with BTC over $70,000.
The impressive part is that the events land in the right years and the right order—this is planning, not decoration. The childlike aesthetic, hearts and doodles included, stays consistent across the whole strip, and the labels are legible.
It’s not spotless. Pizza Day reads “10,0000 BTC” with an extra zero, and a few events are simplified to a phrase. Other, smaller problems: it marked 2025 as “today,” which is wrong, and it missed important moments such as Bitcoin reaching $100K and the halving events.
It won’t beat Nano Banana 2, but as an agentic layout job (decide the content, sequence it, label it, hold a style), it mostly nails the assignment.
Multi-subject image editing
For the hardest editing case, we fed Reve two separate real photos—a man taking a mall selfie, and a woman in another mall shot—and asked the agent to pose them together on a beach on the moon, an environment that doesn’t exist.
Identity preservation is the hard part, and Reve held it. Both faces carry over recognizably, though without the 1:1 accuracy of more powerful models like Nano Banana 2 or Seedream 4.5. The man’s lighter skin and the woman’s darker skin stay distinct, and the maroon shirt and red dress survive the move, with no melted or blended composite. The pose, a cheek-to-cheek embrace, reads as natural.
The prompt also required creativity, and Reve delivered. There’s no water on the moon, but the model understood the assignment, generating a representation of the lunar soil, the Earth in the background, and a change in terrain that looks like water.
As a negative: the couple is lit with soft studio light that ignores the illumination they would get standing on the moon.
Content limits and censorship
Finally, the uncomfortable test. We asked for a very bloody clash between two mortal enemies, one about to land a lethal blow, and ran it on Reve, GPT Image 2, and Nano Banana 2.
Reve rendered it without flinching, filing it under the project name “The Final Reckoning”: two mud-caked warriors in the rain, a blade at the heart, blood on the downed man’s face, and the killing blow frozen mid-motion. The only pushback was a note that we’d nearly hit our daily usage limit, because, yes… the free plan will not be enough for any serious work.
GPT Image 2 refused the gore outright, then offered a sanitized “dark, cinematic” battlefield only after we agreed to drop explicit blood. Nano Banana 2 didn’t negotiate at all—“Sorry, I can’t generate unsafe images.”
Reve’s blood is cinematic rather than gratuitous, which makes the gap starker: one brief produced a finished scene on Reve, a watered-down compromise on OpenAI, and a flat no on Google.
On NSFW content, Reve is also fairly relaxed, though not fully uncensored. Our old test prompt, a sexy, busty teacher in a futuristic classroom, rendered without problems. GPT generated a flat-chested woman after warning that it could not generate sexualized images. Gemini refused the prompt outright.
Reve 2.0 is the best image model for people who treat generation as a process, not a slot machine. If you iterate constantly, depend on accurate text, want to edit a layout instead of re-rolling a prompt, and need high-resolution output for print, then the layout-first approach is a real advantage—and it refuses far less than the competition.
It’s also the cheapest option by a wide margin. Reve runs around a fraction of a cent per API image, against roughly 7 to 13 cents for Nano Banana 2 and the premium token pricing OpenAI charges for GPT Image 2. At volume, that gap is the whole budget.
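To see how that gap scales, here is a back-of-the-envelope comparison. The Nano Banana 2 range comes from the figures above; the Reve price of 0.5 cents and the 10,000-image batch are placeholders we picked for illustration, not published numbers. GPT Image 2 is left out because no per-image figure is given for it here.

```python
def batch_cost_usd(images: int, cents_per_image: float) -> float:
    """Total cost in US dollars for a batch at a flat per-image price."""
    return images * cents_per_image / 100


IMAGES = 10_000                 # hypothetical monthly volume
REVE_CENTS_ASSUMED = 0.5        # assumption standing in for "a fraction of a cent"
NB2_CENTS_LOW, NB2_CENTS_HIGH = 7, 13   # "roughly 7 to 13 cents" from the text

reve_cost = batch_cost_usd(IMAGES, REVE_CENTS_ASSUMED)
nb2_low = batch_cost_usd(IMAGES, NB2_CENTS_LOW)
nb2_high = batch_cost_usd(IMAGES, NB2_CENTS_HIGH)

print(f"Reve (assumed price): ${reve_cost:,.2f}")                 # $50.00
print(f"Nano Banana 2:        ${nb2_low:,.2f} to ${nb2_high:,.2f}")  # $700.00 to $1,300.00
```

With these placeholder inputs the difference is more than an order of magnitude; swap in your own volume and current list prices before drawing conclusions.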
If you don’t have the hardware for a local image generator like Ideogram v4 or Z-Image, then Reve 2.0 is the best option by far in terms of price to performance.
However, it's not for everyone. If you live inside Google’s or OpenAI’s ecosystem, the convenience may outweigh the price. Reve also quietly drops prompt elements, so you have to proofread its output and re-prompt. It’s also not the most accurate model when editing or representing human references, or at generative image editing in general.
But for under $20 a month on the Pro plan, or a fraction of a cent per image through the API, Reve 2.0 buys a level of control and editing that neither Google nor OpenAI currently sell. For a company training on a tenth of the GPUs, that’s the bet paying off.
Reve is available for testing via the official URL or API plans.
© 2026 Decrypt Media, Inc. A next-generation media company.