OpenBot
OpenBot Synth · Data synthesis

Generate the training data you can't collect.

The failures Bench finds and the long tail your real captures miss, synthesized into training data and then re-verified in Bench so every batch comes with a measured gain. Isaac Sim, MuJoCo, diffusion, and real2sim under the hood; a closed loop on top.

6 simulators · 6 output modalities · 4 generation backends
synth_close_gap.py
from openbot import Synth

# Rebuild the 142 failed kitchen-handover rollouts
synth = Synth.from_bench_failures(
    bench_run="run_8c91a4",
    sim="isaac_sim",
)

# Sweep the axes most likely to close the gap
synth = synth.randomize(
    lighting=["dim", "warm", "cold", "backlit"],
    friction=("uniform", 0.6, 1.4),
    object_mass=("uniform", 0.8, 1.5),
    samples_per_scene=8,
)

# Validate the gain by re-running Bench on the augmented set
gain = synth.validate(
    policy="openvla-7b",
    task="kitchen_handover"
)
print(gain.task_success_delta)   # +0.18
print(gain.handover_recall)      # 0.78  (was 0.60)
Capabilities

The long tail your real data will never cover.

Synthetic data only matters if it actually moves task success. Synth is wired into Bench so every batch comes with a measured gain.

  1. Text- and image-driven scenes

    Describe the kitchen or point at a reference photo. Get a scene that matches your robot's real workspace, with controlled variation you can dial up or down.

  2. Domain randomization that matters

    Lighting, materials, friction, mass, sensor noise, latency. Sweep one axis at a time and see exactly which one your policy is fragile on. No more guessing.

  3. Real2sim from failed episodes

    Bench flags 142 failed rollouts. Synth rebuilds those scenes in simulation, expands them with targeted randomization, and feeds them back into training.

  4. Closed-loop validation

    Every synth batch is paired with an A/B Bench run: baseline vs. retrained-with-synth. You get a measured gain report, not a vibe check.

  5. Multi-modal output

    RGB, depth, segmentation masks, normals, contact forces, language captions: whatever your VLA or diffusion policy was trained to expect, generated in one pass.

  6. Parameter sweeps as experiments

    Tracked, comparable, reproducible. Compare 4 randomization strategies on the same task in the same dashboard. Know which axis actually moves task success.
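To make "sweep one axis at a time" concrete, here is a minimal sketch in plain Python. It is not the Synth API: the helper names (`sample_axis`, `sweep_one_axis`) and the nominal values are hypothetical, and the axis specs simply mirror the `randomize()` call shown earlier. The idea is to vary a single axis while holding the others fixed, so any change in task success can be attributed to that axis.

```python
import random

# Hypothetical axis specs mirroring the randomize() call shown earlier:
# a list means "pick one category", a ("uniform", lo, hi) tuple means
# "draw a float uniformly from [lo, hi]".
AXES = {
    "lighting": ["dim", "warm", "cold", "backlit"],
    "friction": ("uniform", 0.6, 1.4),
    "object_mass": ("uniform", 0.8, 1.5),
}

# Nominal values are placeholders chosen for illustration only.
NOMINAL = {"lighting": "warm", "friction": 1.0, "object_mass": 1.0}


def sample_axis(spec, rng):
    """Draw one value from a categorical list or a ("uniform", lo, hi) tuple."""
    if isinstance(spec, list):
        return rng.choice(spec)
    kind, lo, hi = spec
    if kind != "uniform":
        raise ValueError(f"unsupported distribution: {kind}")
    return rng.uniform(lo, hi)


def sweep_one_axis(axes, axis, nominal, samples, seed=0):
    """Vary a single axis and hold every other axis at its nominal value."""
    rng = random.Random(seed)  # seeded so the sweep is reproducible
    variants = []
    for _ in range(samples):
        params = dict(nominal)
        params[axis] = sample_axis(axes[axis], rng)
        variants.append(params)
    return variants


# Eight friction variants per scene; lighting and mass stay at nominal.
friction_sweep = sweep_one_axis(AXES, "friction", NOMINAL, samples=8)
for params in friction_sweep:
    print(params)
```

Running the same sweep once per axis, and evaluating each resulting set separately, is one way to see which axis a policy is fragile on.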

Measured gain

Every batch comes with a number.

Not a feeling. Not a hunch. A real A/B Bench run comparing baseline vs. retrained-with-synth — so you know exactly what moved the needle.

                 Baseline (real capture only)    + Synth
Task success     73%                             91% (+18 pp)
open_drawer      98%                             99%
pick_mug         91%                             96%
pour             82%                             91%
handover         60%                             78%

Task success: 91% (+18 pp, was 73%)
Handover recall: 78% (+18 pp, was 60%)
Sim→Real gap: −14 pp (+15 pp, was −29 pp)
Scenes generated: 1,136
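The gains above are percentage-point differences, that is, the with-Synth figure minus the baseline figure. The short sketch below recomputes them from the numbers on this page. The last line rests on an assumption the page does not state: that each of the 142 failed rollouts becomes one rebuilt scene and is expanded with `samples_per_scene=8`, which happens to match the 1,136 scenes reported.

```python
# Per-task success rates copied from the panel above, in percent.
baseline = {"open_drawer": 98, "pick_mug": 91, "pour": 82, "handover": 60}
with_synth = {"open_drawer": 99, "pick_mug": 96, "pour": 91, "handover": 78}

# Percentage-point gain per task: with-Synth minus baseline.
per_task_gain = {task: with_synth[task] - baseline[task] for task in baseline}
print(per_task_gain)

# Headline figures, also in percentage points.
overall_gain = 91 - 73        # task success: 73% -> 91%
gap_change = -14 - (-29)      # sim-to-real gap: -29 pp -> -14 pp
print(overall_gain, gap_change)

# Assumption (not stated on the page): one rebuilt scene per failed
# rollout, each expanded into 8 randomized samples.
scenes = 142 * 8
print(scenes)
```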
Compatibility

Every sim. Every modality.

Simulators
Isaac Sim, Isaac Lab, MuJoCo, RoboCasa, LIBERO, Genesis
Output modalities
RGB, Depth, Segmentation, Normals, Contact forces, Language captions
Generation backends
Diffusion (SDXL+), 3D Gaussian Splatting, Procedural, Real2sim

Stop guessing what randomization to add.

Synth picks the axes Bench says you're fragile on, generates the data, and reports the measured gain. Closed loop.