AITopics

2503.06112

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Vietnam > Da Nang > Da Nang (0.04)
Asia > Vietnam > Lâm Đồng Province (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Jiang, Qile, Gao, Zhiwei, Karniadakis, George Em

DeepSeek vs. ChatGPT vs. Claude: A Comparative Study for Scientific Computing and Scientific Machine Learning Tasks

arXiv.org Artificial IntelligenceMar-8-2025

Large Language Models (LLMs) have emerged as powerful tools for tackling a wide range of problems, including those in scientific computing, particularly in solving partial differential equations (PDEs). However, different models exhibit distinct strengths and preferences, resulting in varying levels of performance. In this paper, we compare the capabilities of the most advanced LLMs--DeepSeek, ChatGPT, and Claude--along with their reasoning-optimized versions in addressing computational challenges. Specifically, we evaluate their proficiency in solving traditional numerical problems in scientific computing as well as leveraging scientific machine learning techniques for PDE-based problems. We designed all our experiments so that a non-trivial decision is required, e.g. defining the proper space of input functions for neural operator learning. Our findings show that reasoning and hybrid-reasoning models consistently and significantly outperform non-reasoning ones in solving challenging problems, with ChatGPT o3-mini-high generally offering the fastest reasoning speed.

claude 3, sonnet, zhang, (16 more...)

2502.17764

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The GuardianMar-7-2025, 16:21:58 GMT

Crypto giant Tether CEO on cooperating with Trump administration: 'We've never been shady'

Paolo Ardoino, CEO of the cryptocurrency company Tether, was flying over Switzerland last week as he contemplated the changing regulatory landscape. Tether used to be at war with the establishment. Now it is the establishment. The crypto giant – tether is the most traded cryptocurrency in the world – has had a strange trip. Four years ago, banks were dropping Tether as a client, and regulators in New York had the company against the wall over questions about commingled client and corporate funds.

artificial intelligence, large language model, natural language, (14 more...)

The Guardian

Country:

North America > United States > New York (0.25)
Europe > Switzerland (0.25)
South America > Argentina (0.05)
(6 more...)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)

Sharma, Krish, Barman, Niyar R, Chaturvedi, Akshay, Asher, Nicholas

DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module

We look at reasoning on GSM8k, a dataset of short texts presenting primary school, math problems. We find, with Mirzadeh et al. (2024), that current LLM progress on the data set may not be explained by better reasoning but by exposure to a broader pretraining data distribution. We then introduce a novel information source for helping models with less data or inferior training reason better: discourse structure. We show that discourse structure improves performance for models like Llama2 13b by up to 160%. Even for models that have most likely memorized the data set, adding discourse structural information to the model still improves predictions and dramatically improves large model performance on out of distribution examples.

discourse structure, extension, kiwis, (15 more...)

2503.04685

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Lima, João Alberto de Oliveira

Improving RAG Retrieval via Propositional Content Extraction: a Speech Act Theory Approach

When users formulate queries, they often include not only the information they seek, but also pragmatic markers such as interrogative phrasing or polite requests. Although these speech act indicators communicate the user\textquotesingle s intent -- whether it is asking a question, making a request, or stating a fact -- they do not necessarily add to the core informational content of the query itself. This paper investigates whether extracting the underlying propositional content from user utterances -- essentially stripping away the linguistic markers of intent -- can improve retrieval quality in Retrieval-Augmented Generation (RAG) systems. Drawing upon foundational insights from speech act theory, we propose a practical method for automatically transforming queries into their propositional equivalents before embedding. To assess the efficacy of this approach, we conducted an experimental study involving 63 user queries related to a Brazilian telecommunications news corpus with precomputed semantic embeddings. Results demonstrate clear improvements in semantic similarity between query embeddings and document embeddings at top ranks, confirming that queries stripped of speech act indicators more effectively retrieve relevant content.

propositional content, query, user query, (15 more...)

2503.10654

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > France (0.04)
Europe > Portugal > Coimbra > Coimbra (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Telecommunications (0.68)
Information Technology (0.49)
Law (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Research on Superalignment Should Advance Now with Parallel Optimization of Competence and Conformity

Kim, HyunJin, Yi, Xiaoyuan, Yao, Jing, Huang, Muhua, Bak, JinYeong, Evans, James, Xie, Xing

The recent leap in AI capabilities, driven by big generative models, has sparked the possibility of achieving Artificial General Intelligence (AGI) and further triggered discussions on Artificial Superintelligence (ASI), a system surpassing all humans across all domains. This gives rise to the critical research question of: If we realize ASI, how do we align it with human values, ensuring it benefits rather than harms human society, a.k.a., the Superalignment problem. Despite ASI being regarded by many as solely a hypothetical concept, in this paper, we argue that superalignment is achievable and research on it should advance immediately, through simultaneous and alternating optimization of task competence and value conformity. We posit that superalignment is not merely a safeguard for ASI but also necessary for its realization. To support this position, we first provide a formal definition of superalignment rooted in the gap between capability and capacity and elaborate on our argument. Then we review existing paradigms, explore their interconnections and limitations, and illustrate a potential path to superalignment centered on two fundamental principles. We hope this work sheds light on a practical approach for developing the value-aligned next-generation AI, garnering greater benefits and reducing potential harms for humanity.

arxiv preprint arxiv, language model, superalignment, (13 more...)

2503.0766

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
(11 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Education (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Rozner, Joshua, Weissweiler, Leonie, Mahowald, Kyle, Shain, Cory

Constructions are Revealed in Word Distributions

Construction grammar posits that constructions (form-meaning pairings) are acquired through experience with language (the distributional learning hypothesis). But how much information about constructions does this distribution actually contain? Corpus-based analyses provide some answers, but text alone cannot answer counterfactual questions about what caused a particular word to occur. For that, we need computable models of the distribution over strings -- namely, pretrained language models (PLMs). Here we treat a RoBERTa model as a proxy for this distribution and hypothesize that constructions will be revealed within it as patterns of statistical affinity. We support this hypothesis experimentally: many constructions are robustly distinguished, including (i) hard cases where semantically distinct constructions are superficially similar, as well as (ii) schematic constructions, whose "slots" can be filled by abstract word classes. Despite this success, we also provide qualitative evidence that statistical affinity alone may be insufficient to identify all constructions from text. Thus, statistical affinity is likely an important, but partial, signal available to learners.

affinity, construction, global affinity, (13 more...)

2503.06048

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(14 more...)

Genre: Research Report (1.00)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Sajadi, Amirali, Damevski, Kostadin, Chatterjee, Preetha

Psycholinguistic Analyses in Software Engineering Text: A Systematic Literature Review

Context: A deeper understanding of human factors in software engineering (SE) is essential for improving team collaboration, decision-making, and productivity. Communication channels like code reviews and chats provide insights into developers' psychological and emotional states. While large language models excel at text analysis, they often lack transparency and precision. Psycholinguistic tools like Linguistic Inquiry and Word Count (LIWC) offer clearer, interpretable insights into cognitive and emotional processes exhibited in text. Despite its wide use in SE research, no comprehensive review of LIWC's use has been conducted. Objective: We examine the importance of psycholinguistic tools, particularly LIWC, and provide a thorough analysis of its current and potential future applications in SE research. Methods: We conducted a systematic review of six prominent databases, identifying 43 SE-related papers using LIWC. Our analysis focuses on five research questions. Results: Our findings reveal a wide range of applications, including analyzing team communication to detect developer emotions and personality, developing ML models to predict deleted Stack Overflow posts, and more recently comparing AI-generated and human-written text. LIWC has been primarily used with data from project management platforms (e.g., GitHub) and Q&A forums (e.g., Stack Overflow). Key BSE concepts include Communication, Organizational Climate, and Positive Psychology. 26 of 43 papers did not formally evaluate LIWC. Concerns were raised about some limitations, including difficulty handling SE-specific vocabulary. Conclusion: We highlight the potential of psycholinguistic tools and their limitations, and present new use cases for advancing the research of human factors in SE (e.g., bias in human-LLM conversations).

developer, liwc, software engineering, (14 more...)

2503.05992

Country:

North America > United States > California > San Francisco County > San Francisco (0.28)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(27 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology (0.93)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.66)

Zhao, Eric, Awasthi, Pranjal, Haghtalab, Nika

From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning

Finetuning provides a scalable and cost-effective means of customizing language models for specific tasks or response styles, with greater reliability than prompting or in-context learning. In contrast, the conventional wisdom is that injecting knowledge via finetuning results in brittle performance and poor generalization. We argue that the dichotomy of "task customization" (e.g., instruction tuning) and "knowledge injection" (e.g., teaching new facts) is a distinction without a difference. We instead identify concrete factors that explain the heterogeneous effectiveness observed with finetuning. To this end, we conduct a large-scale experimental study of finetuning the frontier Gemini v1.5 model family on a spectrum of datasets that are artificially engineered to interpolate between the strengths and failure modes of finetuning. Our findings indicate that question-answer training data formats provide much stronger knowledge generalization than document/article-style training data, numerical information can be harder for finetuning to retain than categorical information, and models struggle to apply finetuned knowledge during multi-step reasoning even when trained on similar examples -- all factors that render "knowledge injection" to be especially difficult, even after controlling for considerations like data augmentation and information volume. On the other hand, our findings also indicate that it is not fundamentally more difficult to finetune information about a real-world event than information about what a model's writing style should be.

evaluation task, figure 4, information, (12 more...)

2503.05919

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > District of Columbia > Washington (0.04)
(27 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government (0.93)
Leisure & Entertainment > Sports > Soccer (0.93)
Law (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Koo, Hamin, Kim, Jaehyung

Extracting and Emulsifying Cultural Explanation to Improve Multilingual Capability of LLMs

Large Language Models (LLMs) have achieved remarkable success, but their English-centric training data limits performance in non-English languages, highlighting the need for enhancements in their multilingual capabilities. While some work on multilingual prompting methods handles non-English queries by utilizing English translations or restructuring them to more closely align with LLM reasoning patterns, these works often overlook the importance of cultural context, limiting their effectiveness. To address this limitation, we propose EMCEI, a simple yet effective approach that improves LLMs' multilingual capabilities by incorporating cultural context for more accurate and appropriate responses. Specifically, EMCEI follows a two-step process that first extracts relevant cultural context from the LLM's parametric knowledge via prompting. Then, EMCEI employs an LLM-as-Judge mechanism to select the most appropriate response by balancing cultural relevance and reasoning ability. Experiments on diverse multilingual benchmarks show that EMCEI outperforms existing baselines, demonstrating its effectiveness in handling multilingual queries with LLMs.

dataset, llm, query, (17 more...)

2503.05846

Country:

Asia > Vietnam (0.04)
South America > Peru (0.04)
South America > Brazil (0.04)
(35 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Sports (0.46)
Government (0.46)
Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)