AITopics

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.61)

Neural Information Processing SystemsFeb-9-2026, 14:56:17 GMT

832353270aacb6e3322f493a66aaf5b9-Paper.pdf

artificial intelligence, machine learning, natural language, (20 more...)

Country:

North America > United States > Illinois > Champaign County > Champaign (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Japan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsFeb-8-2026, 16:47:10 GMT

52c4608c2f126708211b9e0a60eaf050-Supplemental.pdf

There is no universally agreed upon definition of adversarial risk.

adversarial risk, artificial intelligence, machine learning, (18 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Almajed, Abdalwahab, Tabar, Maryam, Najafirad, Peyman

A Comprehensive Evaluation of the Sensitivity of Density-Ratio Estimation Based Fairness Measurement in Regression

arXiv.org Artificial IntelligenceAug-21-2025

The prevalence of algorithmic bias in Machine Learning (ML)-driven approaches has inspired growing research on measuring and mitigating bias in the ML domain. Accordingly, prior research studied how to measure fairness in regression which is a complex problem. In particular, recent research proposed to formulate it as a density-ratio estimation problem and relied on a Logistic Regression-driven probabilistic classifier-based approach to solve it. However, there are several other methods to estimate a density ratio, and to the best of our knowledge, prior work did not study the sensitivity of such fairness measurement methods to the choice of underlying density ratio estimation algorithm. To fill this gap, this paper develops a set of fairness measurement methods with various density-ratio estimation cores and thoroughly investigates how different cores would affect the achieved level of fairness. Our experimental results show that the choice of density-ratio estimation core could significantly affect the outcome of fairness measurement method, and even, generate inconsistent results with respect to the relative fairness of various algorithms. These observations suggest major issues with density-ratio estimation based fairness measurement in regression and a need for further research to enhance their reliability.

artificial intelligence, logistic regression, machine learning, (14 more...)

2508.14576

Country: North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Neural Information Processing SystemsAug-15-2025, 13:35:27 GMT

832353270aacb6e3322f493a66aaf5b9-Paper.pdf

artificial intelligence, machine learning, natural language, (20 more...)

Country:

North America > United States > Illinois > Champaign County > Champaign (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Japan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceMay-23-2025

NAN: A Training-Free Solution to Coefficient Estimation in Model Merging

Si, Chongjie, Lv, Kangtao, Jiang, Jingjing, Wang, Yadao, Wang, Yongwei, Yang, Xiaokang, Su, Wenbo, Zheng, Bo, Shen, Wei

Model merging offers a training-free alternative to multi-task learning by combining independently fine-tuned models into a unified one without access to raw data. However, existing approaches often rely on heuristics to determine the merging coefficients, limiting their scalability and generality. In this work, we revisit model merging through the lens of least-squares optimization and show that the optimal merging weights should scale with the amount of task-specific information encoded in each model. Based on this insight, we propose NAN, a simple yet effective method that estimates model merging coefficients via the inverse of parameter norm. NAN is training-free, plug-and-play, and applicable to a wide range of merging strategies. Extensive experiments on show that NAN consistently improves performance of baseline methods.

arxiv preprint arxiv, machine learning, natural language, (15 more...)

2505.16148

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

arXiv.org Artificial IntelligenceFeb-6-2025

Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis

Yuan, Lin, Xu, Jun, Gui, Honghao, Sun, Mengshu, Zhang, Zhiqiang, Liang, Lei, Zhou, Jun

High-quality, large-scale instructions are crucial for aligning large language models (LLMs), however, there is a severe shortage of instruction in the field of natural language understanding (NLU). Previous works on constructing NLU instructions mainly focus on information extraction (IE), neglecting tasks such as machine reading comprehension, question answering, and text classification. Furthermore, the lack of diversity in the data has led to a decreased generalization ability of trained LLMs in other NLU tasks and a noticeable decline in the fundamental model's general capabilities. To address this issue, we propose Hum, a large-scale, high-quality synthetic instruction corpus for NLU tasks, designed to enhance the NLU capabilities of LLMs. Specifically, Hum includes IE (either close IE or open IE), machine reading comprehension, text classification, and instruction generalist tasks, thereby enriching task diversity. Additionally, we introduce a human-LLMs collaborative mechanism to synthesize instructions, which enriches instruction diversity by incorporating guidelines, preference rules, and format variants. We conduct extensive experiments on 5 NLU tasks and 28 general capability evaluation datasets for LLMs. Experimental results show that Hum enhances the NLU capabilities of six LLMs by an average of 3.1\%, with no significant decline observed in other general capabilities.

large language model, machine learning, natural language, (19 more...)

2502.03843

Country:

Europe > France (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
(32 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Leisure & Entertainment (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceOct-16-2024

Light-Weight Fault Tolerant Attention for Large Language Model Training

Liang, Yuhang, Li, Xinyi, Ren, Jie, Li, Ang, Fang, Bo, Chen, Jieyang

Large Language Models (LLMs) have demonstrated remarkable performance in various natural language processing tasks. However, the training of these models is computationally intensive and susceptible to faults, particularly in the attention mechanism, which is a critical component of transformer-based LLMs. In this paper, we investigate the impact of faults on LLM training, focusing on INF, NaN, and near-INF values in the computation results with systematic fault injection experiments. We observe the propagation patterns of these errors, which can trigger non-trainable states in the model and disrupt training, forcing the procedure to load from checkpoints. To mitigate the impact of these faults, we propose ATTNChecker, the first Algorithm-Based Fault Tolerance (ABFT) technique tailored for the attention mechanism in LLMs. ATTNChecker is designed based on fault propagation patterns of LLM and incorporates performance optimization to adapt to both system reliability and model vulnerability while providing lightweight protection for fast LLM training. Evaluations on four LLMs show that ATTNChecker on average incurs on average 7% overhead on training while detecting and correcting all extreme errors. Compared with the state-of-the-art checkpoint/restore approach, ATTNChecker reduces recovery overhead by up to 49x.

checksum, large language model, machine learning, (19 more...)

2410.1172

Country:

North America > United States > Alabama (0.04)
North America > United States > Virginia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-26-2024

Is larger always better? Evaluating and prompting large language models for non-generative medical tasks

Zhu, Yinghao, Gao, Junyi, Wang, Zixiang, Liao, Weibin, Zheng, Xiaochen, Liang, Lifang, Wang, Yasha, Pan, Chengwei, Harrison, Ewen M., Ma, Liantao

The use of Large Language Models (LLMs) in medicine is growing, but their ability to handle both structured Electronic Health Record (EHR) data and unstructured clinical notes is not well-studied. This study benchmarks various models, including GPT-based LLMs, BERT-based models, and traditional clinical predictive models, for non-generative medical tasks utilizing renowned datasets. We assessed 14 language models (9 GPT-based and 5 BERT-based) and 7 traditional predictive models using the MIMIC dataset (ICU patient records) and the TJH dataset (early COVID-19 EHR data), focusing on tasks such as mortality and readmission prediction, disease hierarchy reconstruction, and biomedical sentence matching, comparing both zero-shot and finetuned performance. Results indicated that LLMs exhibited robust zero-shot predictive capabilities on structured EHR data when using well-designed prompting strategies, frequently surpassing traditional models. However, for unstructured medical texts, LLMs did not outperform finetuned BERT models, which excelled in both supervised and unsupervised tasks. Consequently, while LLMs are effective for zero-shot learning on structured data, finetuned BERT models are more suitable for unstructured texts, underscoring the importance of selecting models based on specific task requirements and data characteristics to optimize the application of NLP technology in healthcare.

nan, prediction, reference range, (15 more...)

2407.18525

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Beijing > Beijing (0.04)
(4 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Chandra, Kartik, Li, Tzu-Mao, Nigam, Rachit, Tenenbaum, Joshua, Ragan-Kelley, Jonathan

WatChat: Explaining perplexing programs by debugging mental models

arXiv.org Artificial IntelligenceMar-8-2024

Often, a good explanation for a program's unexpected behavior is a bug in the programmer's code. But sometimes, an even better explanation is a bug in the programmer's mental model of the language they are using. Instead of merely debugging our current code ("giving the programmer a fish"), what if our tools could directly debug our mental models ("teaching the programmer to fish")? In this paper, we apply ideas from computational cognitive science to do exactly that. Given a perplexing program, we use program synthesis techniques to automatically infer potential misconceptions that might cause the user to be surprised by the program's behavior. By analyzing these misconceptions, we provide succinct, useful explanations of the program's behavior. Our methods can even be inverted to synthesize pedagogical example programs for diagnosing and correcting misconceptions in students.

explanation, mental model, misconception, (16 more...)

2403.05334

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Slovenia > Drava > Municipality of Maribor > Maribor (0.04)

Genre: Research Report (0.40)

Industry: Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.34)