Goto

Collaborating Authors

 Commonsense Reasoning






Automating Dataset Updates Towards Reliable and Timely Evaluation of Large Language Models

Neural Information Processing Systems

There are two updating strategies: 1) mimicking strategy to generate similar samples based on original data, preserving stylistic and contextual essence, and 2) extending strategy that further expands existing samples at varying cognitive levels by adapting Bloom's taxonomy of educational objectives.






Landmark-Guided Knowledge for Vision-and-Language Navigation

arXiv.org Artificial Intelligence

Vision-and-language navigation is one of the core tasks in embodied intelligence, requiring an agent to autonomously navigate in an unfamiliar environment based on natural language instructions. However, existing methods often fail to match instructions with environmental information in complex scenarios, one reason being the lack of common-sense reasoning ability. This paper proposes a vision-and-language navigation method called Landmark-Guided Knowledge (LGK), which introduces an external knowledge base to assist navigation, addressing the misjudgment issues caused by insufficient common sense in traditional methods. Specifically, we first construct a knowledge base containing 630,000 language descriptions and use knowledge Matching to align environmental subviews with the knowledge base, extracting relevant descriptive knowledge. Next, we design a Knowledge-Guided by Landmark (KGL) mechanism, which guides the agent to focus on the most relevant parts of the knowledge by leveraging landmark information in the instructions, thereby reducing the data bias that may arise from incorporating external knowledge. Finally, we propose Knowledge-Guided Dynamic Augmentation (KGDA), which effectively integrates language, knowledge, vision, and historical information. Experimental results demonstrate that the LGK method outperforms existing state-of-the-art methods on the R2R and REVERIE vision-and-language navigation datasets, particularly in terms of navigation error, success rate, and path efficiency.