Building Neural Digital Twins for Real-World Objects
Real-world capture in the lab: RGB-D sensing and object-centric processing yield training data for keypoint-conditioned WorldString.
WorldString interactive visualization
Each row shows three panels, from left to right: input keypoints, the learned token-colored shape, and the error map. Drag to rotate and scroll to zoom; panning is disabled. The camera stays centered on the point cloud or keypoints. Use "Sync other panels to this view" in the keypoint panel to copy its camera, frame, and play/pause state to the other two panels in the same row.
Error map: green = TP (true positive), red = FP (false positive), blue = FN (false negative); red and blue are softened toward greener tones in the panel for readability. Keypoints are colored by joint index (HSV) and drawn as shaded spheres; they use the same frame index as the adjacent shape panels.
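The color conventions above can be sketched in a few lines. This is a minimal illustration, not the page's actual rendering code: it assumes the prediction and ground truth are available as per-cell occupancy flags (the page does not say how the error map is computed), and the function names `error_map_labels` and `joint_colors`, the even hue spacing, and the saturation/value defaults are all hypothetical.

```python
import colorsys


def error_map_labels(pred_occ, gt_occ):
    """Label each cell as 'TP', 'FP', 'FN', or '' (empty in both).

    Assumption: pred_occ and gt_occ are equal-length sequences of
    truthy/falsy occupancy flags for the same cells.
    """
    if len(pred_occ) != len(gt_occ):
        raise ValueError("prediction and ground truth must have the same length")
    labels = []
    for pred, gt in zip(pred_occ, gt_occ):
        if pred and gt:
            labels.append("TP")   # drawn green
        elif pred and not gt:
            labels.append("FP")   # drawn red
        elif gt and not pred:
            labels.append("FN")   # drawn blue
        else:
            labels.append("")     # empty in both; nothing drawn
    return labels


def joint_colors(num_joints, saturation=0.9, value=0.9):
    """Assign one RGB color per joint index by spacing hues evenly in HSV.

    Assumption: even spacing over the hue circle; the page only states
    that keypoints are colored by joint index in HSV.
    """
    return [
        colorsys.hsv_to_rgb(j / num_joints, saturation, value)
        for j in range(num_joints)
    ]


labels = error_map_labels([1, 1, 0, 0], [1, 0, 1, 0])
colors = joint_colors(3)
```

Here `labels` holds one tag per cell and `colors` holds one RGB triple (components in the range 0 to 1) per joint; a viewer would map the tags to green, red, and blue before applying any softening.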
Robot hand (Articulated objects)
Panels: Keypoints (input) | Learned token assignment | Error map vs. ground truth
SMPL football motion (Skinning objects)
Panels: Keypoints (input) | Learned token assignment | Error map vs. ground truth
Earphone (Soft objects)
Panels: Keypoints (input) | Learned token assignment | Error map vs. ground truth
Double stretch (Soft / deformable objects)
Panels: Keypoints (input) | Learned token assignment | Error map vs. ground truth
Unitree Go2
Panels: Keypoints (input) | Learned token assignment | Error map vs. ground truth
Training Process Visualization
Training process visualization for the Go2 and H1 models. The saved checkpoints are tested on the same keypoint states.