Academic Journal

Leveraging explainability for understanding object descriptions in ambiguous 3D environments.

Bibliographic Details
Title: Leveraging explainability for understanding object descriptions in ambiguous 3D environments.
Authors: Doğan FI; Division of Robotics, Perception and Learning, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden., Melsión GI; Division of Robotics, Perception and Learning, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden., Leite I; Division of Robotics, Perception and Learning, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden.
Source: Frontiers in robotics and AI [Front Robot AI] 2023 Jan 04; Vol. 9, pp. 937772. Date of Electronic Publication: 2023 Jan 04 (Print Publication: 2022).
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Frontiers Media SA Country of Publication: Switzerland NLM ID: 101749350 Publication Model: eCollection Cited Medium: Internet ISSN: 2296-9144 (Electronic) Linking ISSN: 22969144 NLM ISO Abbreviation: Front Robot AI Subsets: PubMed not MEDLINE
Imprint Name(s): Original Publication: Lausanne, Switzerland : Frontiers Media SA, [2014]-
Abstract: For effective human-robot collaboration, it is crucial for robots to understand users' requests about the three-dimensional space they perceive and to ask reasonable follow-up questions when ambiguities arise. Existing studies on comprehending the object descriptions in such requests have focused on limited object categories that can be detected or localized with existing object detection and localization modules, and they have mostly relied on flat RGB images without considering the depth dimension. In the wild, however, it is impossible to limit the object categories that can be encountered during the interaction, and perception of three-dimensional space, including depth information, is fundamental to successful task completion. To understand described objects and resolve ambiguities in the wild, we propose, for the first time, a method that leverages explainability. Our method focuses on the active areas of an RGB scene to find the described objects without the previous constraints on object categories and natural language instructions. We further extend the method to identify the described objects using the depth dimension. We evaluate our method on varied real-world images and observe that the regions it suggests can help resolve ambiguities. Compared with a state-of-the-art baseline, our method performs better in scenes with ambiguous objects that existing object detectors cannot recognize. We also show that using depth features significantly improves performance both in scenes where depth data is critical to disambiguating the objects and across our full evaluation dataset, which contains objects that can be specified with and without the depth dimension.
Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
(Copyright © 2023 Doğan, Melsión and Leite.)
References: Nat Rev Neurosci. 2001 Mar;2(3):194-203. (PMID: 11256080)
IEEE Trans Pattern Anal Mach Intell. 2013 Jan;35(1):185-207. (PMID: 22487985)
BMJ. 2019 Mar 12;364:l886. (PMID: 30862612)
Sci Robot. 2019 Dec 18;4(37):. (PMID: 33137717)
Contributed Indexing: Keywords: depth; explainability; real-world environments; referring expression comprehension (REC); resolving ambiguities
Entry Dates: Date Created: 20230127 Latest Revision: 20230202
Update Code: 20240628
PubMed Central ID: PMC9872646
DOI: 10.3389/frobt.2022.937772
PMID: 36704241
Database: MEDLINE