Despite encouraging progress in embodied learning over the past two decades, a large gap remains between the perception of embodied agents and that of humans. Humans are remarkably adept at combining their multisensory inputs. To close this gap, embodied agents should likewise be able to see, hear, touch, and interact with their surroundings in order to select appropriate actions. However, today's learning algorithms operate primarily on a single modality. For artificial intelligence to make progress in understanding the world around us, it must be able to interpret such multimodal signals jointly. The goal of this workshop is to share recent progress and discuss current challenges in embodied learning with multiple modalities.
Invited Speakers
Katherine Kuchenbecker (MPI-IS), Danica Kragic (KTH), Linda Smith (Indiana University), Felix Hill (DeepMind), Abhinav Gupta (CMU & FAIR), Sergey Levine (UC Berkeley & Google), Jitendra Malik (UC Berkeley & FAIR), Claudia Pérez D'Arpino (Stanford University)
Call for Papers
We invite submissions of 2-4 page extended abstracts on topics related to (but not limited to):
- audio-visual embodied learning
- touch sensing and embodied learning
- language and embodied learning
- speech and embodied learning
- self-supervised/semi-supervised learning with multiple modalities
- multimodal reinforcement learning
- meta-learning with multiple modalities
- novel multimodal datasets/simulators/tasks for embodied agents
- combining multisensory inputs for robot perception
- bio-inspired approaches for multimodal perception
A submission should take the form of an extended abstract (2-4 pages long, excluding references) in PDF format using the ICLR style. We will accept submissions of (1) papers that have not been previously published or accepted for publication in substantially similar form; (2) papers that have been published or accepted for publication in recent venues, including journals, conferences, workshops, and arXiv; and (3) research proposals for future work with a focus on well-defined concepts and ideas. All submissions will be reviewed under a single-blind policy. Accepted extended abstracts will not appear in the ICLR proceedings and hence will not affect future publication of the work. We will publish all accepted extended abstracts on the workshop webpage.
CMT submissions website: https://cmt3.research.microsoft.com/EML2021
Key Dates:
- Extended abstract submission deadline:
March 5th, 2021 (11:59 PM PST)
- Late submission deadline:
March 22nd, 2021 (11:59 PM PST)
- Notification to authors:
March 26th, 2021
- Workshop date: May 7th, 2021
Program Committee:
Unnat Jain (UIUC), Michelle Lee (Stanford), Paul Pu Liang (CMU), Senthil Purushwalkam (CMU), Santhosh Kumar Ramakrishnan (UT Austin), Mohit Shridhar (UW), Tianmin Shu (MIT), Shaoxiong Wang (MIT)
Schedule
Time (PDT) | Event | Speakers / Details
07:55 am - 08:00 am | Introduction and Opening Remarks |
08:00 am - 08:30 am | Invited Talk | Katherine Kuchenbecker (MPI-IS)
08:30 am - 09:00 am | Invited Talk | Danica Kragic (KTH)
09:00 am - 09:30 am | Paper Session A | A1 - A5
09:30 am - 09:40 am | Paper Session A Q&A |
09:40 am - 10:00 am | Break |
10:00 am - 10:30 am | Invited Talk | Linda Smith (Indiana University)
10:30 am - 11:00 am | Invited Talk | Felix Hill (DeepMind)
11:00 am - 12:00 pm | Panel Discussion | Kristen Grauman, Felix Hill, Katherine Kuchenbecker, Sergey Levine, Jitendra Malik, Linda Smith
12:00 pm - 12:30 pm | Break |
12:30 pm - 01:00 pm | Invited Talk | Abhinav Gupta (CMU & FAIR)
01:00 pm - 01:30 pm | Invited Talk | Sergey Levine (UC Berkeley & Google)
01:30 pm - 02:00 pm | Paper Session B | B1 - B4
02:00 pm - 02:10 pm | Paper Session B Q&A |
02:10 pm - 02:30 pm | Break |
02:30 pm - 03:00 pm | Invited Talk | Jitendra Malik (UC Berkeley & FAIR)
03:00 pm - 03:30 pm | Invited Talk | Claudia Pérez D'Arpino (Stanford University)
03:30 pm - 03:35 pm | Closing Remarks |
Accepted Papers
Title | Authors | Paper Session
ABC Problem: An Investigation of Offline RL for Vision-Based Dynamic Manipulation | Kamyar Ghasemipour, Igor Mordatch, Shixiang Shane Gu | A1
Language Acquisition is Embodied, Interactive, Emotive: a Research Proposal | Casey Kennington | A2 |
Ask & Explore: Grounded Question Answering for Curiosity-Driven Exploration | Jivat Neet Kaur, Yiding Jiang, Paul Pu Liang | A3 |
Towards Teaching Machines with Language: Interactive Learning From Only Language Descriptions of Activities | Khanh Nguyen, Dipendra Misra, Robert Schapire, Miroslav Dudik, Patrick Shafto | A4 |
YouRefIt: Embodied Reference Understanding with Language and Gesture | Yixin Chen, Qing Li, Deqian Kong, Yik Lun Kei, Tao Gao, Yixin Zhu, Song-Chun Zhu, Siyuan Huang | A5 |
Learning to Set Waypoints for Audio-Visual Navigation | Changan Chen, Sagnik Majumder, Ziad Al-Halah, Ruohan Gao, Santhosh K. Ramakrishnan, Kristen Grauman | B1 |
Semantic Audio-Visual Navigation | Changan Chen, Ziad Al-Halah, Kristen Grauman | B2 |
Attentive Feature Reuse for Multi Task Meta learning | Kiran Lekkala, Laurent Itti | B3 |
SeLaVi: self-labelling videos without any annotations from scratch | Yuki Asano, Mandela Patrick, Christian Rupprecht, Andrea Vedaldi | B4