Generate the training data you can't collect.
The failures Bench finds and the long tail your real captures miss, synthesized into training data and then re-verified in Bench, so every batch comes with a measured gain. Isaac Sim, MuJoCo, diffusion, and real2sim under the hood; a closed loop on top.
- 6 simulators
- 6 output modalities
- 4 generation backends
from openbot import Synth

# Rebuild the 142 failed kitchen-handover rollouts
synth = Synth.from_bench_failures(
    bench_run="run_8c91a4",
    sim="isaac_sim",
)

# Sweep the axis most likely to close the gap
synth = synth.randomize(
    lighting=["dim", "warm", "cold", "backlit"],
    friction=("uniform", 0.6, 1.4),
    object_mass=("uniform", 0.8, 1.5),
    samples_per_scene=8,
)

# Validate the gain by re-running Bench on the augmented set
gain = synth.validate(
    policy="openvla-7b",
    task="kitchen_handover"
)
print(gain.task_success_delta)  # +0.18
print(gain.handover_recall)  # 0.78 (was 0.60)

The long tail your real data will never cover.
Synthetic data only matters if it actually moves task success. Synth is wired into Bench so every batch comes with a measured gain.
- 01
Text- and image-driven scenes
Describe the kitchen or point at a reference photo. Get a scene that matches your robot's real workspace — with controlled variation you can dial up or down.
- 02
Domain randomization that matters
Lighting, materials, friction, mass, sensor noise, latency. Sweep one axis at a time and see exactly which one your policy is fragile on — no more guessing.
- 03
Real2sim from failed episodes
Say Bench flags 142 failed rollouts. Synth rebuilds those scenes in simulation, expands them with targeted randomization, and feeds them back into training.
- 04
Closed-loop validation
Every synth batch is paired with an A/B Bench run: baseline vs. retrained-with-synth. You get a measured gain report, not a vibe check.
- 05
Multi-modal output
RGB, depth, segmentation masks, normals, contact forces, language captions — whatever your VLA or diffusion policy was trained to expect, generated in one pass.
- 06
Parameter sweeps as experiments
Tracked, comparable, reproducible. Compare 4 randomization strategies on the same task in the same dashboard. Know which axis actually moves task success.
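To make the "sweep one axis at a time" idea concrete, here is a minimal, library-free sketch. It does not use the Synth or Bench APIs: `NOMINAL`, `SWEEPS`, `toy_success_probability`, `sweep_one_axis`, and `fragility` are all hypothetical names, and the toy policy is deliberately built to be fragile to friction only, so the values are illustrative rather than measured.

```python
import random

# Hypothetical nominal scene parameters and sweep values (illustrative only).
NOMINAL = {"friction": 1.0, "object_mass": 1.0, "sensor_noise": 0.0}
SWEEPS = {
    "friction": [0.6, 0.8, 1.0, 1.2, 1.4],
    "object_mass": [0.8, 1.0, 1.25, 1.5],
    "sensor_noise": [0.0, 0.01, 0.02, 0.05],
}


def toy_success_probability(params):
    """Stand-in for a real policy rollout: this toy policy only cares about friction."""
    penalty = 1.5 * abs(params["friction"] - 1.0)
    return max(0.0, 0.9 - penalty)


def sweep_one_axis(axis, values, rollouts=200, seed=0):
    """Vary a single axis while every other parameter stays at its nominal value."""
    rng = random.Random(seed)
    results = {}
    for value in values:
        params = dict(NOMINAL, **{axis: value})
        p = toy_success_probability(params)
        successes = sum(rng.random() < p for _ in range(rollouts))
        results[value] = successes / rollouts
    return results


def fragility(results):
    """Spread between the best and worst success rate along one axis."""
    return max(results.values()) - min(results.values())


report = {axis: fragility(sweep_one_axis(axis, values)) for axis, values in SWEEPS.items()}
most_fragile_axis = max(report, key=report.get)
print(most_fragile_axis, report)
```

Holding the other axes at nominal is what lets a drop in success rate be attributed to one parameter; a joint sweep would cover more of the space but blur which axis caused the failure.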
Every batch comes with a number.
Not a feeling. Not a hunch. A real A/B Bench run comparing baseline vs. retrained-with-synth — so you know exactly what moved the needle.
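As a rough illustration of what "a number" from an A/B run can look like, the sketch below computes a success-rate delta with a 95% confidence interval using a standard two-proportion normal approximation. This is an assumption about how such a report could be computed, not Bench's actual method; `measured_gain` is a hypothetical helper, and the rollout counts are made up for the example.

```python
import math


def measured_gain(base_successes, base_trials, synth_successes, synth_trials, z=1.96):
    """Success-rate delta between two runs, with a normal-approximation confidence interval."""
    p_base = base_successes / base_trials
    p_synth = synth_successes / synth_trials
    delta = p_synth - p_base
    # Standard error of the difference between two independent proportions.
    std_err = math.sqrt(
        p_base * (1 - p_base) / base_trials
        + p_synth * (1 - p_synth) / synth_trials
    )
    return {
        "baseline": p_base,
        "with_synth": p_synth,
        "delta": delta,
        "ci_low": delta - z * std_err,
        "ci_high": delta + z * std_err,
    }


# Illustrative counts only: 120/200 baseline successes vs. 156/200 after retraining.
gain_report = measured_gain(120, 200, 156, 200)
print(gain_report)
```

If the interval excludes zero, the gain is unlikely to be rollout noise; with few rollouts per arm the interval widens quickly, which is a reason to report it alongside the delta.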
Every sim. Every modality.
Stop guessing what randomization to add.
Synth picks the axes Bench says you're fragile on, generates the data, and reports the measured gain. Closed loop.
