taro
Implementing a Logical Inference System for Japanese Comparatives
Mikami, Yosuke, Matsuoka, Daiki, Yanaka, Hitomi
Natural Language Inference (NLI) involving comparatives is challenging because it requires understanding quantities and comparative relations expressed by sentences. While some approaches leverage Large Language Models (LLMs), we focus on logic-based approaches grounded in compositional semantics, which are promising for robust handling of numerical and logical expressions. Previous studies along these lines have proposed logical inference systems for English comparatives. However, it has been pointed out that there are several morphological and semantic differences between Japanese and English comparatives. These differences make it difficult to apply such systems directly to Japanese comparatives. To address this gap, this study proposes ccg-jcomp, a logical inference system for Japanese comparatives based on compositional semantics. We evaluate the proposed system on a Japanese NLI dataset containing comparative expressions. We demonstrate the effectiveness of our system by comparing its accuracy with that of existing LLMs.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- North America > Canada > Ontario > Toronto (0.04)
Can Large Language Models Robustly Perform Natural Language Inference for Japanese Comparatives?
Mikami, Yosuke, Matsuoka, Daiki, Yanaka, Hitomi
Large Language Models (LLMs) perform remarkably well in Natural Language Inference (NLI). However, NLI involving numerical and logical expressions remains challenging. Comparatives are a key linguistic phenomenon related to such inference, but the robustness of LLMs in handling them, especially in languages that are not dominant in the models' training data, such as Japanese, has not been sufficiently explored. To address this gap, we construct a Japanese NLI dataset that focuses on comparatives and evaluate various LLMs in zero-shot and few-shot settings. Our results show that the performance of the models is sensitive to the prompt formats in the zero-shot setting and influenced by the gold labels in the few-shot examples. The LLMs also struggle to handle linguistic phenomena unique to Japanese. Furthermore, we observe that prompts containing logical semantic representations help the models predict the correct labels for inference problems that they struggle to solve even with few-shot examples.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > Dominican Republic (0.04)
LLMs Struggle with NLI for Perfect Aspect: A Cross-Linguistic Study in Chinese and Japanese
Lu, Jie, Jin, Du, Yanaka, Hitomi
Unlike English, which uses distinct forms (e.g., had, has, will have) to mark the perfect aspect across tenses, Chinese and Japanese lack separate grammatical forms for tense within the perfect aspect, which complicates Natural Language Inference (NLI). Focusing on the perfect aspect in these languages, we construct a linguistically motivated, template-based NLI dataset (1,350 pairs per language). Experiments reveal that even advanced LLMs struggle with temporal inference, particularly in detecting subtle tense and reference-time shifts. These findings highlight model limitations and underscore the need for cross-linguistic evaluation in temporal semantics. Our dataset is available at https://github.com/Lujie2001/CrossNLI.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
- North America > Dominican Republic (0.04)
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Exploring Reasoning Biases in Large Language Models Through Syllogism: Insights from the NeuBAROCO Dataset
Ozeki, Kentaro, Ando, Risako, Morishita, Takanobu, Abe, Hirohiko, Mineshima, Koji, Okada, Mitsuhiro
This paper explores the question of how accurately current large language models can perform logical reasoning in natural language, with an emphasis on whether these models exhibit reasoning biases similar to humans. Specifically, our study focuses on syllogistic reasoning, a form of deductive reasoning extensively studied in cognitive science as a natural form of human reasoning. We present a syllogism dataset called NeuBAROCO, which consists of syllogistic reasoning problems in English and Japanese. This dataset was originally designed for psychological experiments to assess human reasoning capabilities using various forms of syllogisms. Our experiments with leading large language models indicate that these models exhibit reasoning biases similar to humans, along with other error tendencies. Notably, there is significant room for improvement in reasoning problems where the relationship between premises and hypotheses is neither entailment nor contradiction. We also present experimental results and in-depth analysis using a new Chain-of-Thought prompting method, which asks LLMs to translate syllogisms into abstract logical expressions and then explain their reasoning process. Our analysis using this method suggests that the primary limitations of LLMs lie in the reasoning process itself rather than the interpretation of syllogisms.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Effective Targeted Attacks for Adversarial Self-Supervised Learning
Kim, Minseon, Ha, Hyeonjeong, Son, Sooel, Hwang, Sung Ju
Recently, unsupervised adversarial training (AT) has been highlighted as a means of achieving robustness in models without any label information. Previous studies in unsupervised AT have mostly focused on implementing self-supervised learning (SSL) frameworks, which maximize the instance-wise classification loss to generate adversarial examples. However, we observe that simply maximizing the self-supervised training loss with an untargeted adversarial attack often results in generating ineffective adversaries that may not help improve the robustness of the trained model, especially for non-contrastive SSL frameworks without negative examples. To tackle this problem, we propose a novel positive-mining method for targeted adversarial attacks to generate effective adversaries for adversarial SSL frameworks. Specifically, we introduce an algorithm that selects the most confusing yet similar target example for a given instance based on entropy and similarity, and subsequently perturbs the given instance towards the selected target. Our method demonstrates significant enhancements in robustness when applied to non-contrastive SSL frameworks, and smaller but consistent robustness improvements with contrastive SSL frameworks, on the benchmark datasets.
- Information Technology > Security & Privacy (0.87)
- Government > Military (0.70)
Interactive Task Planning with Language Models
Li, Boyi, Wu, Philipp, Abbeel, Pieter, Malik, Jitendra
An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution. However, most traditional methods require predefined module design, which makes it hard to generalize to different goals. Recent large language model based approaches can allow for more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models. To tackle this, we propose a simple framework that achieves interactive task planning with language models. Our system incorporates both high-level planning and low-level function execution via language. We verify the robustness of our system in generating novel high-level instructions for unseen objectives and its ease of adaptation to different tasks by merely substituting the task guidelines, without the need for additional complex prompt engineering. Furthermore, when the user sends a new request, our system is able to replan accordingly with precision based on the new request, task guidelines and previously executed steps. More details are available at https://wuphilipp.github.io/itp_site and https://youtu.be/TrKLuyv26_g.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Research Report (0.64)
- Workflow (0.51)
Data Scientist at Charger Logistics Inc - Santiago de Querétaro, Querétaro, Mexico
Charger Logistics is a world-class asset-based carrier. We specialize in delivering assets on time and on budget. With a diverse fleet of equipment, we can handle a range of freight, including dedicated loads, specialized hauls, temperature-controlled goods and HAZMAT cargo. We invest our time in our employees and support them, giving them room to learn, grow their expertise and work their way up. We are an entrepreneurial-minded organization that welcomes and supports individual ideas and strategies.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.40)
- North America > Mexico > Querétaro > Santiago de Querétaro (0.40)
- North America > Mexico > Querétaro > Querétaro (0.40)
- Information Technology > Data Science (0.84)
- Information Technology > Artificial Intelligence (0.55)
'The world is chaotic, not me' – Nier: Automata's Yoko Taro
Bereft of his signature mask – which he will not be photographed without – and perched awkwardly on a folding chair, video game director Yoko Taro has the air of a dishevelled monk. The famously camera-shy developer behind cult hit games like Drakengard 3 and last year's Nier: Automata listens attentively as questions and answers are rapidly translated. "To be honest, I think I am making normal games targeted towards normal people," he says. "But ultimately when I release those normal games, weird people find them to be weird games and enjoy them. Which probably means there's something wrong with me."
Nier: Automata – how a 'weird game for weird people' became a sleeper hit
In 2014, game designer Yoko Taro gave a talk about the creative process behind his cult PlayStation 3 title Nier: Replicant. He called the talk "Weird Games for Weird People". That is the best possible description of what he makes. Taro is famous for the eccentric persona he presents to the world. He rarely shows his face in public or interviews, preferring to talk from behind a sock puppet or the eerie wide grin of a mask.
NieR: Automata reviewed
Here, in reality, we live in a period of unprecedented introspection with regard to robots and automation. Hastening developments in the fields of artificial intelligence and cybernetics are converging on the creation of machines that are independent from human oversight. A recent commission from the European Parliament demanded a set of regulations be drawn up to govern the creation, use, and even rights of robots. Luminaries such as Stephen Hawking and Elon Musk have warned that AI presents the greatest existential threat to mankind. You play as YoRHa No. 2 Model B (hereafter 2B for brevity and sanity's sake), a combat droid deployed by the human survivors who live on a spaceship circumnavigating Earth.
- North America > United States > New York (0.05)
- Europe > United Kingdom > England (0.05)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.05)