
Implementing a Logical Inference System for Japanese Comparatives

Mikami, Yosuke, Matsuoka, Daiki, Yanaka, Hitomi

arXiv.org Artificial Intelligence

Natural Language Inference (NLI) involving comparatives is challenging because it requires understanding quantities and comparative relations expressed by sentences. While some approaches leverage Large Language Models (LLMs), we focus on logic-based approaches grounded in compositional semantics, which are promising for robust handling of numerical and logical expressions. Previous studies along these lines have proposed logical inference systems for English comparatives. However, it has been pointed out that there are several morphological and semantic differences between Japanese and English comparatives. These differences make it difficult to apply such systems directly to Japanese comparatives. To address this gap, this study proposes ccg-jcomp, a logical inference system for Japanese comparatives based on compositional semantics. We evaluate the proposed system on a Japanese NLI dataset containing comparative expressions. We demonstrate the effectiveness of our system by comparing its accuracy with that of existing LLMs.
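The system itself is built on CCG parsing and theorem proving, but the underlying idea of degree semantics — that comparatives assert ordering constraints between degrees — can be pictured with a minimal sketch. Entailments between difference constraints ("Taro is 3 cm taller than Jiro", "Jiro is taller than Hanako", therefore "Taro is taller than Hanako") reduce to computing the tightest derivable bound between two degrees. The names and the integer margin used for strict comparison below are illustrative assumptions, not the system's actual representation:

```python
from itertools import product

def entails(premises, hypothesis):
    """Check whether difference constraints entail the hypothesis.

    Each constraint is a triple (x, y, d): height(x) - height(y) >= d.
    Entailment holds if the tightest bound derivable for the hypothesis
    pair is at least the hypothesised margin.
    """
    names = {n for x, y, _ in premises for n in (x, y)} | set(hypothesis[:2])
    # best[a][b] = greatest lower bound derived so far for height(a) - height(b)
    best = {a: {b: (0 if a == b else float("-inf")) for b in names} for a in names}
    for x, y, d in premises:
        best[x][y] = max(best[x][y], d)
    # Floyd-Warshall-style closure: compose bounds through intermediates.
    for k, i, j in product(names, repeat=3):
        best[i][j] = max(best[i][j], best[i][k] + best[k][j])
    x, y, d = hypothesis
    return best[x][y] >= d

# Strict "taller than" is approximated here by a margin of 1.
premises = [("taro", "jiro", 3), ("jiro", "hanako", 1)]
print(entails(premises, ("taro", "hanako", 1)))   # True: bounds compose
print(entails(premises, ("hanako", "taro", 1)))   # False: not derivable
```

Real degree semantics also handles measure phrases, negation, and antonyms compositionally; this sketch covers only the transitive core.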


Can Large Language Models Robustly Perform Natural Language Inference for Japanese Comparatives?

Mikami, Yosuke, Matsuoka, Daiki, Yanaka, Hitomi

arXiv.org Artificial Intelligence

Large Language Models (LLMs) perform remarkably well in Natural Language Inference (NLI). However, NLI involving numerical and logical expressions remains challenging. Comparatives are a key linguistic phenomenon related to such inference, but the robustness of LLMs in handling them, especially in languages that are not dominant in the models' training data, such as Japanese, has not been sufficiently explored. To address this gap, we construct a Japanese NLI dataset that focuses on comparatives and evaluate various LLMs in zero-shot and few-shot settings. Our results show that the performance of the models is sensitive to the prompt formats in the zero-shot setting and influenced by the gold labels in the few-shot examples. The LLMs also struggle to handle linguistic phenomena unique to Japanese. Furthermore, we observe that prompts containing logical semantic representations help the models predict the correct labels for inference problems that they struggle to solve even with few-shot examples.


LLMs Struggle with NLI for Perfect Aspect: A Cross-Linguistic Study in Chinese and Japanese

Lu, Jie, Jin, Du, Yanaka, Hitomi

arXiv.org Artificial Intelligence

Unlike English, which uses distinct forms (e.g., had, has, will have) to mark the perfect aspect across tenses, Chinese and Japanese lack separate grammatical forms for tense within the perfect aspect, which complicates Natural Language Inference (NLI). Focusing on the perfect aspect in these languages, we construct a linguistically motivated, template-based NLI dataset (1,350 pairs per language). Experiments reveal that even advanced LLMs struggle with temporal inference, particularly in detecting subtle tense and reference-time shifts. These findings highlight model limitations and underscore the need for cross-linguistic evaluation in temporal semantics. Our dataset is available at https://github.com/Lujie2001/CrossNLI.
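The template-based construction described above can be pictured with a toy generator that crosses subject slots with premise/hypothesis patterns carrying fixed gold labels. The English templates, subjects, and labels below are invented placeholders for illustration, not items from the actual dataset:

```python
# Hypothetical templates; the real dataset uses linguistically motivated
# Chinese and Japanese templates targeting the perfect aspect.
SUBJECTS = ["Ken", "Yui"]
TEMPLATES = [
    # (premise pattern, hypothesis pattern, gold label)
    ("{s} has finished the report.", "{s} finished the report.", "entailment"),
    ("{s} has finished the report.", "{s} will finish the report tomorrow.", "contradiction"),
    ("{s} had finished the report by noon.", "{s} finished the report at noon.", "neutral"),
]

def generate_pairs():
    """Cross every subject with every template to yield labelled NLI pairs."""
    return [
        {"premise": p.format(s=s), "hypothesis": h.format(s=s), "label": label}
        for s in SUBJECTS
        for p, h, label in TEMPLATES
    ]

pairs = generate_pairs()
print(len(pairs))   # 6 pairs: 2 subjects x 3 templates
print(pairs[0])
```

Scaling the slot inventories is what lets a small set of templates produce the 1,350 pairs per language reported in the abstract.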


Exploring Reasoning Biases in Large Language Models Through Syllogism: Insights from the NeuBAROCO Dataset

Ozeki, Kentaro, Ando, Risako, Morishita, Takanobu, Abe, Hirohiko, Mineshima, Koji, Okada, Mitsuhiro

arXiv.org Artificial Intelligence

This paper explores the question of how accurately current large language models can perform logical reasoning in natural language, with an emphasis on whether these models exhibit reasoning biases similar to humans. Specifically, our study focuses on syllogistic reasoning, a form of deductive reasoning extensively studied in cognitive science as a natural form of human reasoning. We present a syllogism dataset called NeuBAROCO, which consists of syllogistic reasoning problems in English and Japanese. This dataset was originally designed for psychological experiments to assess human reasoning capabilities using various forms of syllogisms. Our experiments with leading large language models indicate that these models exhibit reasoning biases similar to humans, along with other error tendencies. Notably, there is significant room for improvement in reasoning problems where the relationship between premises and hypotheses is neither entailment nor contradiction. We also present experimental results and in-depth analysis using a new Chain-of-Thought prompting method, which asks LLMs to translate syllogisms into abstract logical expressions and then explain their reasoning process. Our analysis using this method suggests that the primary limitations of LLMs lie in the reasoning process itself rather than the interpretation of syllogisms.
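The prompting method described asks LLMs to translate syllogisms into abstract logical expressions before reasoning. The symbolic half of that idea — checking a translated syllogism mechanically — can be sketched with a standard brute-force countermodel search over small finite domains (a textbook technique, not the paper's code):

```python
from itertools import product

def holds(quant, xs, ys):
    """Interpret one categorical statement over set extensions."""
    if quant == "all":
        return xs <= ys          # "All X are Y": X is a subset of Y
    if quant == "some":
        return bool(xs & ys)     # "Some X are Y": the overlap is non-empty
    if quant == "no":
        return not (xs & ys)     # "No X are Y": extensions are disjoint
    raise ValueError(quant)

def valid(premises, conclusion, n=3):
    """A syllogism is valid iff no assignment of subsets of a small
    universe satisfies every premise while falsifying the conclusion.
    (Here "all" with an empty subject is vacuously true, i.e. this
    reading has no existential import.)"""
    statements = premises + [conclusion]
    terms = sorted({t for _, a, b in statements for t in (a, b)})
    powerset = [frozenset(i for i in range(n) if mask >> i & 1)
                for mask in range(1 << n)]
    for assignment in product(powerset, repeat=len(terms)):
        ext = dict(zip(terms, assignment))
        if all(holds(q, ext[a], ext[b]) for q, a, b in premises):
            q, a, b = conclusion
            if not holds(q, ext[a], ext[b]):
                return False     # found a countermodel
    return True

# Barbara (valid) vs. the undistributed-middle fallacy (invalid)
print(valid([("all", "A", "B"), ("all", "B", "C")], ("all", "A", "C")))  # True
print(valid([("all", "A", "B"), ("all", "C", "B")], ("all", "A", "C")))  # False
```

The contrast in the last two lines mirrors a classic human reasoning bias: the invalid second form is frequently judged valid, which is exactly the kind of error pattern the NeuBAROCO experiments probe in LLMs.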


Effective Targeted Attacks for Adversarial Self-Supervised Learning

Kim, Minseon, Ha, Hyeonjeong, Son, Sooel, Hwang, Sung Ju

arXiv.org Artificial Intelligence

Recently, unsupervised adversarial training (AT) has been highlighted as a means of achieving robustness in models without any label information. Previous studies in unsupervised AT have mostly focused on implementing self-supervised learning (SSL) frameworks, which maximize the instance-wise classification loss to generate adversarial examples. However, we observe that simply maximizing the self-supervised training loss with an untargeted adversarial attack often generates ineffective adversaries that may not help improve the robustness of the trained model, especially for non-contrastive SSL frameworks without negative examples. To tackle this problem, we propose a novel positive-mining scheme for targeted adversarial attacks that generates effective adversaries for adversarial SSL frameworks. Specifically, we introduce an algorithm that selects the most confusing yet similar target example for a given instance based on entropy and similarity, and subsequently perturbs the given instance towards the selected target. Our method demonstrates significant enhancements in robustness when applied to non-contrastive SSL frameworks, and smaller but consistent robustness improvements with contrastive SSL frameworks, on benchmark datasets.
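The select-then-perturb procedure can be caricatured in a few lines. This sketch swaps the paper's gradient-based attack on the SSL loss for a simple sign step toward the chosen target, so it illustrates only the control flow — entropy-plus-similarity target selection followed by a bounded perturbation — not the actual method:

```python
import numpy as np

def select_target(anchor, candidates, probs, alpha=1.0):
    """Pick the candidate that is most 'confusing yet similar':
    score = entropy of its predictive distribution plus cosine
    similarity to the anchor embedding (both terms are illustrative
    stand-ins for the paper's criteria)."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    sims = candidates @ anchor / (
        np.linalg.norm(candidates, axis=1) * np.linalg.norm(anchor) + 1e-12)
    return int(np.argmax(entropy + alpha * sims))

def perturb_towards(x, target, eps=0.1, steps=5):
    """Move x toward the target in small steps under an L-inf budget,
    mimicking a targeted attack without a real loss gradient."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        grad = np.sign(target - (x + delta))     # direction toward target
        delta = np.clip(delta + (eps / steps) * grad, -eps, eps)
    return x + delta
```

In the actual framework the perturbation direction would come from the gradient of the SSL objective with respect to the input, and the candidate pool from the current batch or memory bank.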


Interactive Task Planning with Language Models

Li, Boyi, Wu, Philipp, Abbeel, Pieter, Malik, Jitendra

arXiv.org Artificial Intelligence

An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution. However, most traditional methods require predefined module design, which makes it hard to generalize to different goals. Recent approaches based on large language models allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models. To tackle this, we propose a simple framework that achieves interactive task planning with language models. Our system incorporates both high-level planning and low-level function execution via language. We verify the robustness of our system in generating novel high-level instructions for unseen objectives, and its ease of adaptation to different tasks by merely substituting the task guidelines, without the need for additional complex prompt engineering. Furthermore, when the user sends a new request, our system can replan accordingly with precision based on the new request, the task guidelines, and previously executed steps. More details are available at https://wuphilipp.github.io/itp_site and https://youtu.be/TrKLuyv26_g.


Data Scientist at Charger Logistics Inc - Santiago de Querétaro, Querétaro, Mexico

#artificialintelligence

Charger Logistics is a world-class asset-based carrier. We specialize in delivering assets on time and on budget. With a diverse fleet of equipment, we can handle a range of freight, including dedicated loads, specialized hauls, temperature-controlled goods, and HAZMAT cargo. We invest our time in our employees and give them room to learn, grow their expertise, and work their way up. We are an entrepreneurial-minded organization that welcomes and supports individual ideas and strategies.


'The world is chaotic, not me' – Nier: Automata's Yoko Taro

The Guardian

Bereft of his signature mask – which he will not be photographed without – and perched awkwardly on a folding chair, video game director Yoko Taro has the air of a dishevelled monk. The famously camera-shy developer behind cult hit games like Drakengard 3 and last year's Nier: Automata listens attentively as questions and answers are rapidly translated. "To be honest, I think I am making normal games targeted towards normal people," he says. "But ultimately when I release those normal games, weird people find them to be weird games and enjoy them. Which probably means there's something wrong with me."


Nier: Automata – how a 'weird game for weird people' became a sleeper hit

The Guardian

In 2014, game designer Yoko Taro gave a talk about the creative process behind his cult PlayStation 3 title Nier: Replicant. He called the talk "Weird Games for Weird People". That is the best possible description of what he makes. Taro is famous for the eccentric persona he presents to the world. He rarely shows his face in public or interviews, preferring to talk from behind a sock puppet or the eerie wide grin of a mask.


NieR: Automata reviewed

#artificialintelligence

Here, in reality, we live in a period of unprecedented introspection with regard to robots and automation. Hastening developments in the fields of artificial intelligence and cybernetics are converging on the creation of machines that are independent from human oversight. A recent commission from the European Parliament demanded a set of regulations be drawn up to govern the creation, use, and even rights of robots. Luminaries such as Stephen Hawking and Elon Musk have warned that AI presents the greatest existential threat to mankind. You play as YoRHa No. 2 Model B (hereafter 2B for brevity and sanity's sake), a combat droid deployed by the human survivors who live on a spaceship circumnavigating Earth.