Data Collection

The VLAI L1 is built for large-scale teleoperation data collection. Its one-click recording pipeline, dual-arm synchronization, and built-in VR teleop make it the fastest path from robot delivery to a training-ready dataset.

Recording Workflow

Bimanual VR Teleoperation Recording

1. Connect and verify all systems

rc connect --device l1 --host 192.168.1.45
rc status   # check: arms, base, cameras, battery all green
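If you prefer to script this pre-flight check, the same connection can be opened with the roboticscenter Python API used later in this workflow. The sketch below is minimal and only exercises the L1(host), connect(), and disconnect() calls that appear in this guide; rc status remains the authoritative check.

from roboticscenter import L1

robot = L1("192.168.1.45")
try:
    robot.connect()   # assumed to raise if the L1 is unreachable
    print("Connected to L1 at 192.168.1.45")
finally:
    robot.disconnect()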
2. Move L1 to recording position

Drive the L1 to the task workspace using WASD in the browser panel. Set the lift height for the task (e.g., 130 cm for table-top manipulation). Park it and lock the wheels.

rc teleop --device l1   # open browser panel
# Drive to position, then lock:
python -c "from roboticscenter import L1; r=L1('192.168.1.45'); r.connect(); r.base.lock_wheels(); r.disconnect()"
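The same move can be scripted end to end. In the sketch below, base.lock_wheels() is the documented call from the one-liner above, while lift.set_height() is a hypothetical accessor used only to illustrate setting the 130 cm lift height; check your SDK for the actual method name.

from roboticscenter import L1

r = L1("192.168.1.45")
r.connect()
r.lift.set_height(1.30)   # hypothetical accessor: 1.30 m lift for table-top manipulation
r.base.lock_wheels()      # documented call: park the base before recording
r.disconnect()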
3. Set up the task scene and cameras

Place task objects in consistent starting positions. Verify camera views in the browser panel — both wrist cameras (Developer Max) and any external cameras should cover the task workspace.
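To sanity-check camera coverage from a script instead of the browser panel, something like the sketch below can print one frame shape per stream. The cameras accessor and read() method are hypothetical names, not a documented API, so adapt them to whatever your SDK actually exposes.

from roboticscenter import L1

r = L1("192.168.1.45")
r.connect()
# Hypothetical camera accessor: iterate every stream and grab one frame each.
for name, cam in r.cameras.items():   # e.g. wrist_left, wrist_right, head, external
    frame = cam.read()                # hypothetical: returns an HxWx3 image array
    print(name, getattr(frame, "shape", None))
r.disconnect()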

4. Start recording session via CLI

rc record \
  --device l1 \
  --task "Pick up the bottle and pour into the glass" \
  --num_episodes 50 \
  --output ~/datasets/l1-pour-v1 \
  --teleop_mode vr   # or: browser, leader_arms
# Press ENTER in VR to start each episode, ENTER again to end
5. Review episodes

rc replay \
  --dataset ~/datasets/l1-pour-v1 \
  --episode 0

The viewer shows all camera streams and the joint-state time series synchronized on a shared timeline. Delete poor episodes before pushing.
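To step through every episode without retyping the command, you can wrap the documented rc replay CLI in a short loop. The sketch below assumes rc is on your PATH and reuses the 50-episode count from the recording step; close the viewer to advance to the next episode.

import os
import subprocess

dataset = os.path.expanduser("~/datasets/l1-pour-v1")

# Review each episode in turn with the documented `rc replay` command.
for ep in range(50):   # matches --num_episodes 50 from the recording step
    subprocess.run(
        ["rc", "replay", "--dataset", dataset, "--episode", str(ep)],
        check=True,
    )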

6. Push to HuggingFace Hub

huggingface-cli login
rc push_dataset \
  --dataset ~/datasets/l1-pour-v1 \
  --repo_id your-username/l1-pour-v1
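Once the push finishes, you can confirm the files actually landed on the Hub with the standard huggingface_hub client. This is only a verification step, not part of the rc pipeline; replace the repo_id with your own.

from huggingface_hub import HfApi

api = HfApi()
# List what was uploaded; expect episode Parquet files plus video and metadata.
for path in sorted(api.list_repo_files("your-username/l1-pour-v1", repo_type="dataset")):
    print(path)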
Dataset Format

L1 Dataset Schema

The L1 recording pipeline produces a multi-modal dataset covering both arms, the mobile base, all camera streams, and optional language annotations.

Fields in each episode Parquet file:

Field                          Type          Description
observation.left_arm_state    float32[8]    Left arm joint positions in radians (8 DOF)
observation.right_arm_state   float32[8]    Right arm joint positions in radians (8 DOF)
observation.base_state        float32[3]    Mobile base x, y, heading (meters, meters, radians)
observation.lift_height       float32       Torso lift height in meters
observation.images.*          video path    Wrist cameras (left, right), head camera, external workspace camera
action.left_arm               float32[8]    Target left arm joint positions from VR teleop
action.right_arm              float32[8]    Target right arm joint positions from VR teleop
language_instruction          string        Natural language task description for VLA conditioning
timestamp                     float64       Unix timestamp in seconds
next.done                     bool          True on the last frame of each episode
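To confirm a recorded episode matches this schema, you can open one episode file directly with pandas. The episode path below is hypothetical, since the exact directory layout written by rc record may differ between versions; adjust it to whatever your recording session produced.

import os
import pandas as pd

# Hypothetical episode path; adjust to the layout written by `rc record`.
path = os.path.expanduser("~/datasets/l1-pour-v1/data/episode_000000.parquet")
df = pd.read_parquet(path)

print(df.columns.tolist())                        # expect the fields listed above
print(len(df), "frames")
print(df["observation.left_arm_state"].iloc[0])   # 8 joint positions in radians
print(df["language_instruction"].iloc[0])         # task description for VLA conditioning
print(bool(df["next.done"].iloc[-1]))             # True on the last frame of the episode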
Quality Assurance

Quality Checklist

The L1's VR teleop can introduce unique data quality issues around latency and bimanual coordination. Run through this before pushing to the Hub.

  1. VR latency was below 50 ms during recording. Check the latency monitor in the browser panel during recording. Above 50 ms, the operator's hand movements lag the robot's actions, creating a causal mismatch in the dataset. Re-record on a lower-latency WiFi channel if needed.
  2. Both arms moved as intended (no single-arm episodes). For bimanual tasks, verify that both arms show significant motion in observation.left_arm_state and observation.right_arm_state. Single-arm-dominant episodes may indicate the operator favored one hand.
  3. Mobile base was stationary during arm manipulation. Unless you are recording mobile manipulation tasks, observation.base_state should be nearly constant within each episode. Base movement during manipulation causes the workspace to shift relative to the cameras. A scripted version of checks 2 and 3 appears after this checklist.
  4. All camera streams present for the full episode. The L1's WiFi bandwidth may drop frames under load. Run rc validate_dataset --dataset ~/datasets/l1-pour-v1 to check for missing frames across all camera streams.
  5. Language instruction matches what was demonstrated. The language instruction is set before recording starts. If the operator improvised a different approach (e.g., used one arm instead of two), update the instruction or delete the episode.
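Checks 2 and 3 are straightforward to script once the episodes are on disk. The sketch below is one way to flag single-arm-dominant episodes and base drift; it assumes the Parquet layout described above (the glob pattern is hypothetical), and the thresholds are starting points to tune for your task.

import glob
import os

import numpy as np
import pandas as pd

# Scripted version of checklist items 2 and 3. The glob pattern is hypothetical;
# point it at the episode Parquet files your recording session actually produced.
pattern = os.path.expanduser("~/datasets/l1-pour-v1/data/*.parquet")

for path in sorted(glob.glob(pattern)):
    df = pd.read_parquet(path)
    left = np.stack(df["observation.left_arm_state"].to_list())
    right = np.stack(df["observation.right_arm_state"].to_list())
    base = np.stack(df["observation.base_state"].to_list())

    # Check 2: both arms should show comparable motion (joint range summed over the episode).
    left_motion = float(np.ptp(left, axis=0).sum())
    right_motion = float(np.ptp(right, axis=0).sum())

    # Check 3: the base x, y position should stay nearly constant (heading column ignored here).
    base_drift = float(np.linalg.norm(np.ptp(base[:, :2], axis=0)))

    flags = []
    if min(left_motion, right_motion) < 0.1 * max(left_motion, right_motion, 1e-6):
        flags.append("single-arm dominant")
    if base_drift > 0.05:   # ~5 cm; tighten or relax per task
        flags.append("base moved")
    print(os.path.basename(path), flags or "ok")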
Next Step

Training a VLA from Your Dataset

Once your dataset is on the HuggingFace Hub, fine-tune a VLA with the L1 action space.

Fine-tune OpenVLA on L1 data

pip install roboticscenter[vla]

# --action_space l1_bimanual registers the 16-DOF bimanual action head
python -m roboticscenter.scripts.finetune_vla \
  --model openvla/openvla-7b \
  --dataset your-username/l1-pour-v1 \
  --action_space l1_bimanual \
  --epochs 50 \
  --output_dir outputs/openvla-l1-pour

Deploy fine-tuned VLA on-device (Developer Pro/Max)

rc deploy vla \
  --model outputs/openvla-l1-pour \
  --quantize int4 \
  --device l1 \
  --host 192.168.1.45

# Run the policy:
rc run policy \
  --task "Pick up the bottle and pour into the glass" \
  --max_steps 100

Dataset Ready? Start Training.

Push to HuggingFace Hub and fine-tune a VLA model on your bimanual manipulation data.