Understanding Physical Dynamics with Counterfactual World Modeling

Bibliographic Details
Title: Understanding Physical Dynamics with Counterfactual World Modeling
Authors: Venkatesh, Rahul; Chen, Honglin; Feigelis, Kevin; Bear, Daniel M.; Jedoui, Khaled; Kotar, Klemen; Binder, Felix; Lee, Wanhee; Liu, Sherry; Smith, Kevin A.; Fan, Judith E.; Yamins, Daniel L. K.
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition
Description: The ability to understand physical dynamics is critical for agents to act in the world. Here, we use Counterfactual World Modeling (CWM) to extract visual structures for dynamics understanding. CWM uses a temporally-factored masking policy for masked prediction of video data without annotations. This policy enables highly effective "counterfactual prompting" of the predictor, allowing a spectrum of visual structures to be extracted from a single pre-trained predictor without finetuning on annotated datasets. We demonstrate that these structures are useful for physical dynamics understanding, allowing CWM to achieve state-of-the-art performance on the Physion benchmark.
Comment: ECCV 2024. Project page at: https://neuroailab.github.io/cwm-physics/
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2312.06721
Accession Number: edsarx.2312.06721
Database: arXiv