Video Results



Scene: MPH11


Input Video
CRISP (ours)

Scene: outdoor stairs up


Input Video
CRISP (ours)

Scene: outdoor walk

Input Video
CRISP (ours)

Scene: Basement Sitting Booth

Input Video
CRISP (ours)

Scene: indoor walk

Input Video
CRISP (ours)

Scene: Handstand

Input Video
CRISP (ours)

Scene: N3_OpenLibrary

Input Video
CRISP (ours)

Scene: N3OpenArea

Input Video
CRISP (ours)

Baseline Comparisons

Scene: indoor walk off mvs


Input Video
CRISP (ours)
VideoMimic

Scene: MPH8


Input Video
CRISP (ours)
VideoMimic

Scene: outdoor long walk


Input Video
CRISP (ours)
VideoMimic

Scene: outdoor stairs up down


Input Video
CRISP (ours)
VideoMimic

Contact/Non-Contact Comparisons

Scene: pkr_c1


Input
Contact
Non-Contact

Scene: pkr_c


Input
Contact
Non-Contact

Acknowledgements

We extend our deepest gratitude to Guanya Shi for invaluable feedback and insightful discussions on CRISP. We thank Luna Shi and Weiyu Li for helpful discussions, Qitao Zhao for writing suggestions, and Zhengyi Luo for early-stage discussions. This research project is funded by the Bosch Research Center for Artificial Intelligence.