What is Embodied Question Answering?



Embodied Question Answering is a new AI task in which an agent is spawned at a random location in a 3D environment and asked a question ("What color is the car?"). To answer, the agent must intelligently navigate to explore the environment, gather information through first-person (egocentric) vision, and then respond ("orange").
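The task above can be pictured as a simple perceive-navigate-answer loop. The sketch below is purely illustrative: the `ToyHouse` environment, its `reset`/`step` API, and the one-dimensional hallway are hypothetical stand-ins, not the actual House3D or EmbodiedQA interfaces.

```python
class ToyHouse:
    """Toy 'house': the agent walks a 1-D hallway until it sees the car."""

    def __init__(self):
        self.agent_pos = 0          # a random spawn point would go here
        self.car_pos = 3            # the object the question asks about
        self.car_color = "orange"

    def reset(self, question):
        self.question = question
        return self._observe()

    def _observe(self):
        # Egocentric "vision": the agent only sees the cell it stands on.
        return "car" if self.agent_pos == self.car_pos else "wall"

    def step(self, action):
        if action == "forward":
            self.agent_pos += 1
        return self._observe()


def run_episode(env, question, max_steps=10):
    """Navigate, gather observations, then answer the question."""
    obs = env.reset(question)
    for _ in range(max_steps):
        if obs == "car":             # found the queried object
            return env.car_color     # answer from the gathered observation
        obs = env.step("forward")    # naive exploration policy
    return "unknown"


print(run_episode(ToyHouse(), "What color is the car?"))  # orange
```

In the real task the exploration policy and the answering module are learned models operating on rendered first-person frames, rather than the hard-coded rules used here.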

Apr 2019 — Two papers accepted at CVPR 2019!
Dec 2017 — Code for 3D environments is now available!

Embodied Question Answering in Photorealistic Environments with Point Cloud Perception

Erik Wijmans*, Samyak Datta*, Oleksandr Maksymets*, Abhishek Das, Georgia Gkioxari, Stefan Lee, Irfan Essa, Devi Parikh, Dhruv Batra
CVPR 2019 (Oral) [Bibtex] [PDF] [Slides]



Multi-Target Embodied Question Answering

Licheng Yu, Xinlei Chen, Georgia Gkioxari, Mohit Bansal, Tamara L. Berg, Dhruv Batra
CVPR 2019 [Bibtex] [PDF]



Neural Modular Control for Embodied Question Answering

Abhishek Das, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra
CoRL 2018 (Spotlight) [Bibtex] [PDF]



Embodied Question Answering

Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra
CVPR 2018 (Oral) [Bibtex] [PDF] [Code]




Acknowledgements

We are grateful to the developers of PyTorch for building an excellent framework. We thank Yuxin Wu for help with the House3D environment. This work was funded in part by NSF CAREER awards to DB and DP, ONR YIP awards to DP and DB, ONR Grant N00014-14-1-0679 to DB, ONR Grant N00014-16-1-2713 to DP, an Allen Distinguished Investigator award to DP from the Paul G. Allen Family Foundation, Google Faculty Research Awards to DP and DB, Amazon Academic Research Awards to DP and DB, AWS in Education Research grant to DB, and NVIDIA GPU donations to DB. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government, or any sponsor.