On the Variance, Admissibility, and Stability of Empirical Risk Minimization
It is well known that Empirical Risk Minimization (ERM) may attain minimax suboptimal rates in terms of the mean squared error (Birgé and Massart, 1993). In this paper, we prove that, under relatively mild assumptions, the suboptimality of ERM must be due to its bias. Namely, the variance error term of ERM (in terms of the bias and variance decomposition) enjoys the minimax rate. In the fixed design setting, we provide an elementary proof of this result using the probabilistic method. Then, we extend our proof to the random design setting for various models. In addition, we provide a simple proof of Chatterjee's admissibility theorem (Chatterjee, 2014, Theorem 1.4), which states that in the fixed design setting, ERM cannot be ruled out as an optimal method, and then we extend this result to the random design setting. We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes. Finally, we highlight the somewhat irregular nature of the loss landscape of ERM in the non-Donsker regime, by showing that functions can be close to ERM, in terms of $L_2$ distance, while still being far from almost-minimizers of the empirical loss.
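The bias and variance decomposition referenced above, written in standard notation for an estimator $\hat f$ of the regression function $f^*$ (the paper's claim is that the second term attains the minimax rate):

```latex
\mathbb{E}\,\|\hat f - f^*\|_{L_2}^2
  \;=\; \underbrace{\|\mathbb{E}\hat f - f^*\|_{L_2}^2}_{\text{bias}}
  \;+\; \underbrace{\mathbb{E}\,\|\hat f - \mathbb{E}\hat f\|_{L_2}^2}_{\text{variance}}
```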
Reinforcement Learning for Self-Healing Material Systems
Chatterjee, Maitreyi, Agarwal, Devansh, Chatterjee, Biplab
The transition to autonomous material systems necessitates adaptive control methodologies to maximize structural longevity. This study frames the self-healing process as a Reinforcement Learning (RL) problem within a Markov Decision Process (MDP), enabling agents to autonomously derive optimal policies that efficiently balance structural integrity maintenance against finite resource consumption. A comparative evaluation of discrete-action (Q-learning, DQN) and continuous-action (TD3) agents in a stochastic simulation environment revealed that RL controllers significantly outperform heuristic baselines, achieving near-complete material recovery. Crucially, the TD3 agent utilizing continuous dosage control demonstrated superior convergence speed and stability, underscoring the necessity of fine-grained, proportional actuation in dynamic self-healing applications.
- Asia > India > West Bengal > Kolkata (0.06)
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
A more efficient method for large-sample model-free feature screening via multi-armed bandits
Ouyang, Xiaxue, Kang, Xinlai, Li, Mengyu, Dou, Zhenxing, Yu, Jun, Meng, Cheng
We consider model-free feature screening in large-scale ultrahigh-dimensional data analysis. Existing feature screening methods often face substantial computational challenges when dealing with large sample sizes. To alleviate the computational burden, we propose a rank-based model-free sure independence screening method (CR-SIS) and its efficient variant, BanditCR-SIS. The CR-SIS method, based on Chatterjee's rank correlation, is as straightforward to implement as the sure independence screening (SIS) method based on Pearson correlation introduced by Fan and Lv (2008), but it is significantly more powerful in detecting nonlinear relationships between variables. Motivated by the multi-armed bandit (MAB) problem, we reformulate the feature screening procedure to significantly reduce the computational complexity of CR-SIS. For a predictor matrix of size $n \times p$, the computational cost of CR-SIS is $O(n\log(n)p)$, while BanditCR-SIS reduces this to $O(\sqrt{n}\log(n)p + n\log(n))$. Theoretically, we establish the sure screening property for both CR-SIS and BanditCR-SIS under mild regularity conditions. Furthermore, we demonstrate the effectiveness of our methods through extensive experimental studies on both synthetic and real-world datasets. The results highlight their superior performance compared to classical screening methods, at significantly lower computational cost.
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.48)
- Information Technology > Data Science > Data Mining > Big Data (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
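The screening statistic above is built on Chatterjee's rank correlation, which is simple enough to state in a few lines. A minimal sketch of the no-ties version of the statistic (the function name is ours, not from the paper):

```python
def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n (no-ties version).

    Sort the pairs by x, rank the y-values, and measure how much
    consecutive ranks jump: small jumps mean y is close to some
    (possibly nonlinear) function of x.
    """
    n = len(x)
    # y-values reordered so that x is increasing
    y_sorted = [yv for _, yv in sorted(zip(x, y))]
    # r[i] = number of j with y_sorted[j] <= y_sorted[i]
    r = [sum(1 for yj in y_sorted if yj <= yi) for yi in y_sorted]
    jumps = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1.0 - 3.0 * jumps / (n * n - 1)
```

Unlike Pearson correlation, the statistic stays large for any functional relationship, e.g. it is far from zero for `y = x**2` even though the Pearson correlation there is near zero. (This quadratic naive-ranking implementation is for clarity; with sorting, the statistic costs $O(n\log n)$ per feature, matching the $O(n\log(n)p)$ figure in the abstract.)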
Apple quietens Wall Street's fears of China struggles and slow AI progress
Apple has been under pressure this year. It's playing catch-up to its fellow tech giants on artificial intelligence, it's seen its stock fall by double digits since the year began, it closed a store in China for the first time ever this week, and looming US tariffs on Beijing threaten its supply chain. On Thursday, the company released its third-quarter earnings of the fiscal year as investors scrutinize how the iPhone maker might turn things around. Despite the gloomy outlook, the company is still worth more than $3tn, and it beat Wall Street's expectations for profit and revenue this quarter. Apple reported a massive 10% year-over-year increase in revenue to $94.04bn, and $1.57 per share in earnings.
- North America > United States > New York > New York County > New York City (0.63)
- Asia > China > Beijing > Beijing (0.26)
- Asia > India (0.07)
- Asia > Vietnam (0.06)
- Banking & Finance > Trading (0.77)
- Government > Regional Government > North America Government > United States Government (0.52)
AI takes backseat as Apple unveils software revamp and new apps
Apple's artificial intelligence features took a backseat on Monday at its latest annual Worldwide Developers Conference. The company announced a revamped software design called Liquid Glass, new phone and camera apps as well as new features on Apple Watch and Vision Pro. But in spite of pressure to compete with firms that have gone all-in on AI, Apple's AI announcements were limited to incremental features and upgrades. Users will have a few new Apple Intelligence-powered features to look forward to including live translation, a real-time language translation feature that will be integrated into messages, FaceTime and the Phone app. The Android operating system has offered a similar feature for several years.
XicorAttention: Time Series Transformer Using Attention with Nonlinear Correlation
Kimura, Daichi, Izumitani, Tomonori, Kashima, Hisashi
Various Transformer-based models have been proposed for time series forecasting. These models leverage the self-attention mechanism to capture long-term temporal or variate dependencies in sequences. Existing methods can be divided into two approaches: (1) reducing the computational cost of attention by making the calculations sparse, and (2) reshaping the input data to aggregate temporal features. However, existing attention mechanisms may not adequately capture inherent nonlinear dependencies present in time series data, leaving room for improvement. In this study, we propose a novel attention mechanism based on Chatterjee's rank correlation coefficient, which measures nonlinear dependencies between variables. Specifically, we replace the matrix multiplication in standard attention mechanisms with this rank coefficient to measure the query-key relationship. Since computing Chatterjee's correlation coefficient involves sorting and ranking operations, we introduce a differentiable approximation employing SoftSort and SoftRank. We integrate the proposed mechanism, "XicorAttention," into several state-of-the-art Transformer models. Experimental results on real-world datasets demonstrate that incorporating nonlinear correlation into the attention mechanism improves forecasting accuracy by up to approximately 9.1% compared to existing models.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Modeling & Simulation (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
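The core substitution described above (rank correlation in place of the query-key dot product) can be sketched without the SoftSort/SoftRank relaxation. The hard, non-differentiable version below is illustrative only; the function names are ours, and the paper's trainable variant replaces the sorting and ranking with their soft approximations:

```python
import math

def xi(x, y):
    # Chatterjee's rank correlation (no-ties version), used here as a
    # nonlinear query-key similarity in place of the dot product.
    n = len(x)
    ys = [yv for _, yv in sorted(zip(x, y))]
    r = [sum(1 for yj in ys if yj <= yi) for yi in ys]
    jumps = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1.0 - 3.0 * jumps / (n * n - 1)

def xicor_attention(Q, K, V):
    """Attention where score[i][j] = xi(Q[i], K[j]) instead of Q[i] . K[j]."""
    out = []
    for q in Q:
        scores = [xi(q, k) for k in K]
        # standard numerically-stable softmax over the key axis
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]
        # output row = attention-weighted sum of value rows
        out.append([sum(wj * V[j][d] for j, wj in enumerate(w))
                    for d in range(len(V[0]))])
    return out
```

One consequence of this choice is visible even in the sketch: xi scores a key equally whether it is an increasing or a decreasing function of the query, since it measures functional dependence rather than linear alignment.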
Value Iteration with Guessing for Markov Chains and Markov Decision Processes
Chatterjee, Krishnendu, JafariRaviz, Mahdi, Saona, Raimundo, Svoboda, Jakub
Two standard models for probabilistic systems are Markov chains (MCs) and Markov decision processes (MDPs). Classic objectives for such probabilistic models for control and planning problems are reachability and stochastic shortest path. The widely studied algorithmic approach for these problems is the Value Iteration (VI) algorithm, which iteratively applies local updates called Bellman updates. There are many practical approaches for VI in the literature, but they all require exponentially many Bellman updates for MCs in the worst case. Here, a preprocessing step is a discrete, graph-theoretic algorithm that requires linear space. An important open question is whether, after polynomial-time preprocessing, VI can be achieved with sub-exponentially many Bellman updates. In this work, we present a new approach for VI based on guessing values. Our theoretical contributions are twofold. First, for MCs, we present an almost-linear-time preprocessing algorithm after which, along with guessing values, VI requires only sub-exponentially many Bellman updates. Second, we present an improved analysis of the speed of convergence of VI for MDPs. Finally, we present a practical algorithm for MDPs based on our new approach. Experimental results show that our approach provides a considerable improvement over existing VI-based approaches on several benchmark examples from the literature.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- Europe > Austria (0.04)
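The Bellman updates at the heart of VI for reachability can be sketched in a few lines. This is plain VI on a toy Markov chain, not the paper's guessing-based algorithm, and the representation of the chain is our own choice:

```python
def value_iteration_reach(P, target, iters=1000):
    """Plain value iteration for reachability probabilities in a Markov chain.

    P maps each state to a list of (successor, probability) pairs;
    v[s] converges from below to the probability of reaching `target`.
    """
    v = {s: 0.0 for s in P}
    v[target] = 1.0
    for _ in range(iters):
        for s in P:
            if s == target:
                continue
            v[s] = sum(p * v[t] for t, p in P[s])  # Bellman update
    return v
```

On a small chain where s0 goes to s1 or an absorbing sink with probability 1/2 each, and s1 goes to the goal or back to s0 with probability 1/2 each, the values converge to the exact reachability probabilities (1/3 from s0, 2/3 from s1). The convergence speed of exactly such sweeps is what the abstract's worst-case exponential bound refers to.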
Early-Stopped Mirror Descent for Linear Regression over Convex Bodies
Wegel, Tobias, Kur, Gil, Rebeschini, Patrick
Early-stopped iterative optimization methods are widely used as alternatives to explicit regularization, and direct comparisons between early-stopping and explicit regularization have been established for many optimization geometries. However, most analyses depend heavily on the specific properties of the optimization geometry or strong convexity of the empirical objective, and it remains unclear whether early-stopping could ever be less statistically efficient than explicit regularization for some particular shape constraint, especially in the overparameterized regime. To address this question, we study the setting of high-dimensional linear regression under additive Gaussian noise when the ground truth is assumed to lie in a known convex body and the task is to minimize the in-sample mean squared error. Our main result shows that for any convex body and any design matrix, up to an absolute constant factor, the worst-case risk of unconstrained early-stopped mirror descent with an appropriate potential is at most that of the least squares estimator constrained to the convex body. We achieve this by constructing algorithmic regularizers based on the Minkowski functional of the convex body.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.27)
- North America (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
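With the Euclidean potential, mirror descent reduces to ordinary gradient descent, so the early-stopping phenomenon above can be illustrated on a toy least-squares problem. This sketch does not use the paper's Minkowski-functional potentials; the function name and constants are ours:

```python
def early_stopped_gd(X, y, step=0.01, iters=50):
    """Gradient descent on the least-squares objective, stopped after
    `iters` steps. Starting from the origin, early stopping keeps w
    small, acting like an implicit norm constraint (the Euclidean
    special case of mirror descent)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # residuals r = Xw - y
        r = [sum(X[i][k] * w[k] for k in range(d)) - y[i] for i in range(n)]
        # gradient of (1/(2n)) * ||Xw - y||^2 is (1/n) X^T r
        g = [sum(X[i][k] * r[i] for i in range(n)) / n for k in range(d)]
        w = [w[k] - step * g[k] for k in range(d)]
    return w
```

Stopping early yields an iterate with smaller norm than the fully converged least-squares solution; the paper's result replaces this implicit Euclidean ball with an arbitrary convex body via the choice of potential.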
Apple unveils souped up version of its cheapest iPhone
Apple has released a sleeker and more expensive version of its lowest-priced iPhone in an attempt to widen the audience for a bundle of artificial intelligence technology that the company has been hoping will revive demand for its most profitable product lineup. The iPhone 16e unveiled Wednesday is the fourth generation of a model that's sold at a dramatically lower price than the iPhone's standard and premium models. The previous bargain-bin models were called the iPhone SE, with the last version coming out in 2022. Like the higher-priced iPhone 16 lineup unveiled last September, the iPhone 16e includes the souped-up computer chip needed to process an array of AI features that automatically summarise text and audio and create on-the-fly emojis while smartening up the device's virtual assistant, Siri. It will also have a more powerful battery and camera.
- Asia > China (0.08)
- North America > United States > California > Santa Clara County > Cupertino (0.06)