Bilevel Learning via Inexact Stochastic Gradient Descent
Salehi, Mohammad Sadegh, Mukherjee, Subhadip, Roberts, Lindon, Ehrhardt, Matthias J.
Bilevel optimization is a central tool in machine learning for high-dimensional hyperparameter tuning. Its applications are vast; for instance, in imaging it can be used for learning data-adaptive regularizers and optimizing forward operators in variational regularization. These problems are large in many ways: a lot of data is usually available to train a large number of parameters, calling for stochastic gradient-based algorithms. However, exact gradients with respect to parameters (so-called hypergradients) are not available, and their precision is usually linearly related to computational cost. Hence, algorithms must solve the problem efficiently without unnecessary precision. The design of such methods is still not fully understood, especially regarding how accuracy requirements and step size schedules affect theoretical guarantees and practical performance. Existing approaches introduce stochasticity at both the upper level (e.g., in sampling or mini-batch estimates) and the lower level (e.g., in solving the inner problem) to improve generalization, but they typically fix the number of lower-level iterations, which conflicts with asymptotic convergence assumptions. In this work, we advance the theory of inexact stochastic bilevel optimization. We prove convergence and establish rates under decaying accuracy and step size schedules, showing that with optimal configurations convergence occurs at an $\mathcal{O}(k^{-1/4})$ rate in expectation. Experiments on image denoising and inpainting with convex ridge regularizers and input-convex networks confirm our analysis: decreasing step sizes improve stability, accuracy scheduling is more critical than step size strategy, and adaptive preconditioning (e.g., Adam) further boosts performance. These results bridge theory and practice, providing convergence guarantees and practical guidance for large-scale imaging problems.
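A minimal sketch of the scheme this abstract describes may help fix ideas: the lower-level problem is solved only up to a tolerance eps_k that decays with the iteration counter k, and the resulting inexact hypergradient drives a stochastic upper-level step with a decaying step size alpha_k. The quadratic toy problem, the function names, and the particular schedule exponents below are illustrative assumptions, not the paper's optimal configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
y_target = rng.normal(size=20)   # illustrative upper-level data

def solve_lower_level(theta, eps):
    """Inexactly minimize g(y; theta) = 0.5||y - theta||^2 by gradient
    descent, stopping once the gradient norm drops below eps."""
    y = np.zeros_like(theta)
    while np.linalg.norm(y - theta) > eps:   # gradient of g is y - theta
        y -= 0.5 * (y - theta)
    return y

theta = rng.normal(size=20)
for k in range(1, 2001):
    alpha_k = 0.5 * k ** -0.5                # decaying step-size schedule
    eps_k = k ** -1.0                        # decaying accuracy schedule
    y = solve_lower_level(theta, eps_k)      # inexact lower-level solve
    batch = rng.choice(20, size=5, replace=False)  # upper-level mini-batch
    # For this toy problem dy*/dtheta = I, so the hypergradient of the
    # upper-level loss 0.5||y*(theta) - y_target||^2 is simply y - y_target.
    theta[batch] -= alpha_k * (y[batch] - y_target[batch])

final_y = solve_lower_level(theta, 1e-8)
print("upper-level loss:", 0.5 * np.sum((final_y - y_target) ** 2))
```

In the paper's setting the hypergradient requires differentiating through the lower-level solution, which is where the accuracy/cost trade-off arises; the toy above sidesteps that only to keep the sketch short.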
Pivotal CLTs for Pseudolikelihood via Conditional Centering in Dependent Random Fields
Data from dependent random field models often exhibits significant deviations from classical Gaussian approximations. A natural class of statistics to analyze in such models are conditionally centered averages (see [30, 63, 52]), where one recenters each observation by its conditional mean given all other observations. Crucially, such conditionally centered CLTs are closely tied to maximum pseudolikelihood estimators (MPLEs) through the MPLE score (see [64, 60, 41]). This connection is practically important because in many graphical/Markov random field models (such as Ising models, exponential random graph models (ERGMs), etc.), computing the MLE is impeded by an intractable normalizing constant, whereas pseudolikelihood replaces the joint likelihood with a product of tractable conditional models, scales to large networks, and is widely used in practice. However, most existing theory for conditionally centered statistics and for MPLE focuses on local dependence -- e.g., bounded degree or sparse neighborhoods -- and does not cover realistic dense regimes in which every node may have many connections (scaling with the size of the network). This paper bridges that gap by developing a general limit theory for conditionally centered statistics under weak and verifiable assumptions. Our results accommodate both sparse and dense interactions, as well as regular and irregular network connections. In particular, we deliver valid studentized inference for pseudolikelihood in network/Markov random field settings. As examples, we obtain new CLTs for conditionally centered averages and pseudolikelihood estimators in Ising models (with pairwise and tensor interactions) and exponential random graph models, without imposing sparsity, regularity, or high-temperature restrictions.
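To make the central objects concrete, here is a small sketch for a dense (Curie-Weiss-type) Ising model: the conditional mean of a spin given all others is tanh(beta * s_i) with local field s_i = sum_j J_ij x_j, a conditionally centered average recenters each spin by that conditional mean, and the MPLE solves the pseudolikelihood score equation. The coupling matrix, sampler length, sqrt(n) scaling, and bisection bracket are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta_true = 500, 1.2                 # dense, low-temperature regime
J = np.ones((n, n)) / n                 # Curie-Weiss couplings: every node connected
np.fill_diagonal(J, 0.0)

# Crude Gibbs sampler for P(x) ~ exp(beta * sum_{i<j} J_ij x_i x_j), x_i in {-1,+1}
x = rng.choice([-1.0, 1.0], size=n)
for _ in range(50 * n):
    i = rng.integers(n)
    p_up = 1.0 / (1.0 + np.exp(-2.0 * beta_true * (J[i] @ x)))
    x[i] = 1.0 if rng.random() < p_up else -1.0

s = J @ x                               # local fields s_i = sum_j J_ij x_j

# Conditionally centered average: recenter each x_i by E[x_i | rest] = tanh(beta*s_i)
cc_avg = np.sum(x - np.tanh(beta_true * s)) / np.sqrt(n)

# MPLE score in beta; it is strictly decreasing, so bisection finds its root
def score(beta):
    return np.sum(s * (x - np.tanh(beta * s)))

lo, hi = 0.0, 3.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if score(mid) > 0 else (lo, mid)

print("conditionally centered average:", cc_avg)
print("MPLE estimate of beta:", 0.5 * (lo + hi), "(true:", beta_true, ")")
```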
ToxicTAGS: Decoding Toxic Memes with Rich Tag Annotations
Swain, Subhankar, Rizwan, Naquee, Deb, Nayandeep, Solanki, Vishwajeet Singh, S, Vishwa Gangadhar, Mukherjee, Animesh
The 2025 Global Risks Report identifies state-based armed conflict and societal polarisation among the most pressing global threats, with social media playing a central role in amplifying toxic discourse. Memes, as a widely used mode of online communication, often serve as vehicles for spreading harmful content. However, limitations in data accessibility and the high cost of dataset curation hinder the development of robust meme moderation systems. To address this challenge, we introduce a first-of-its-kind dataset of 6,300 real-world meme-based posts annotated in two stages: (i) binary classification into toxic and normal, and (ii) fine-grained labelling of toxic memes as hateful, dangerous, or offensive. A key feature of this dataset is that it is enriched with auxiliary metadata of socially relevant tags, enhancing the context of each meme. In addition, since in-the-wild memes often do not come with tags, we propose a tag generation module that produces socially grounded tags. Experimental results show that incorporating these tags substantially enhances the performance of state-of-the-art VLMs on meme detection tasks. Our contributions offer a novel and scalable foundation for improved content moderation in multimodal online environments.
- Government > Military (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
AI-enhanced semantic feature norms for 786 concepts
Suresh, Siddharth, Mukherjee, Kushin, Giallanza, Tyler, Yu, Xizheng, Patil, Mia, Cohen, Jonathan D., Rogers, Timothy T.
Semantic feature norms have been foundational in the study of human conceptual knowledge, yet traditional methods face trade-offs between concept/feature coverage and verifiability of quality due to the labor-intensive nature of norming studies. Here, we introduce a novel approach that augments a dataset of human-generated feature norms with responses from large language models (LLMs) while verifying the quality of norms against reliable human judgments. We find that our AI-enhanced feature norm dataset, NOVA: Norms Optimized Via AI, shows much higher feature density and overlap among concepts while outperforming a comparable human-only norm dataset and word-embedding models in predicting people's semantic similarity judgments. Taken together, we demonstrate that human conceptual knowledge is richer than captured in previous norm datasets and show that, with proper validation, LLMs can serve as powerful tools for cognitive science research.
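A minimal sketch of the validation loop the abstract describes, with everything below made up for illustration (cosine similarity over feature counts is one plausible predictor, not necessarily the authors' choice; NOVA itself covers 786 concepts and validates against reliable human judgments):

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical mini feature-norm matrix: rows = concepts, columns = features
# (has_fur, has_tail, has_wheels, is_pet, is_fast); entries = number of
# annotators (human or LLM) who listed that feature for the concept.
concepts = ["dog", "cat", "car"]
norms = np.array([
    [9.0, 8.0, 0.0, 7.0, 1.0],   # dog
    [8.0, 9.0, 0.0, 8.0, 0.0],   # cat
    [0.0, 0.0, 9.0, 0.0, 8.0],   # car
])

def cosine_similarities(m):
    """Pairwise cosine similarity between concept feature vectors."""
    unit = m / np.linalg.norm(m, axis=1, keepdims=True)
    return unit @ unit.T

pred = cosine_similarities(norms)
iu = np.triu_indices(len(concepts), k=1)        # unique concept pairs

# Hypothetical human similarity ratings for (dog,cat), (dog,car), (cat,car)
human = np.array([0.85, 0.10, 0.08])
rho, _ = spearmanr(pred[iu], human)
print("Spearman correlation with human judgments:", rho)
```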
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)
Jennifer Doudna on the Brave New World Being Ushered In by Gene Editing
In 2012, the biochemist Jennifer Doudna and her colleague Emmanuelle Charpentier developed a method for using RNA-guided proteins to edit specific sections of DNA. Their innovation--for which the two won the Nobel Prize in Chemistry, in 2020--is known as the CRISPR-Cas9 gene-editing system. CRISPR has since been used to alter plants (to, for instance, produce greater yields), insects (to prevent them from carrying certain diseases), and people (to treat sickle-cell disease). The technology's promise can sound like science fiction: it might help us adapt to a radically different climate, grow organs for those in need, or reprogram a cancer patient's own cells to target tumors. But there are also worries about its possible side effects, both biological and social.
- Health & Medicine > Therapeutic Area > Genetic Disease (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.70)
RISSOLE: Parameter-efficient Diffusion Models via Block-wise Generation and Retrieval-Guidance
Mukherjee, Avideep, Banerjee, Soumya, Rai, Piyush, Namboodiri, Vinay P.
Diffusion-based models demonstrate impressive generation capabilities. However, they also have a massive number of parameters, resulting in enormous model sizes that make them unsuitable for deployment on resource-constrained devices. Block-wise generation can be a promising alternative for designing compact (parameter-efficient) deep generative models, since the model can generate one block at a time instead of generating the whole image at once. However, block-wise generation is also considerably challenging because ensuring coherence across generated blocks is non-trivial. To this end, we design a retrieval-augmented generation (RAG) approach and leverage the corresponding blocks of the images retrieved by the RAG module to condition the training and generation stages of a block-wise denoising diffusion model. Our conditioning schemes ensure coherence across the different blocks during training and, consequently, during generation. While we showcase our approach using the latent diffusion model (LDM) as the base model, it can be used with other variants of denoising diffusion models. We validate that the proposed approach resolves the coherence problem through extensive experiments demonstrating its effectiveness in terms of compact model size and excellent generation quality.
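A schematic sketch of the block-wise RAG loop, under loudly stated assumptions: retrieval is a nearest-neighbor lookup keyed on already-generated blocks, and denoise_block is a toy stand-in for a conditional reverse diffusion process (a real implementation would run the LDM sampler conditioned on the retrieved block).

```python
import numpy as np

rng = np.random.default_rng(2)
B, D = 4, 16                            # image split into B blocks of D pixels
database = rng.random((100, B, D))      # retrieval database of training images

def retrieve(generated_blocks):
    """Find the database image whose corresponding blocks are closest to the
    blocks generated so far; its remaining blocks guide the next block."""
    if not generated_blocks:
        return database[rng.integers(len(database))]
    done = len(generated_blocks)
    g = np.stack(generated_blocks)                       # (done, D)
    d = np.linalg.norm(database[:, :done] - g, axis=(1, 2))
    return database[np.argmin(d)]

def denoise_block(x_t, guide, steps=50):
    """Toy stand-in for a conditional denoising diffusion sampler: an
    iterative update pulling noise toward the retrieved guide block."""
    for _ in range(steps):
        x_t = x_t + 0.1 * (guide - x_t) + 0.01 * rng.normal(size=x_t.shape)
    return x_t

blocks = []
for b in range(B):                      # generate one block at a time
    neighbor = retrieve(blocks)         # RAG step: condition on retrieved image
    x = rng.normal(size=D)              # start the block from pure noise
    blocks.append(denoise_block(x, neighbor[b]))
image = np.concatenate(blocks)
print("generated image shape:", image.shape)
```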
LADDER: Revisiting the Cosmic Distance Ladder with Deep Learning Approaches and Exploring its Applications
Shah, Rahul, Saha, Soumadeep, Mukherjee, Purba, Garain, Utpal, Pal, Supratik
ABSTRACT
We investigate the prospect of reconstructing the "cosmic distance ladder" of the Universe using a novel deep learning framework called LADDER - Learning Algorithm for Deep Distance Estimation and Reconstruction. LADDER is trained on the apparent magnitude data from the Pantheon Type Ia supernovae compilation, incorporating the full covariance information among data points, to produce predictions along with corresponding errors. After employing several validation tests with a number of deep learning models, we pick LADDER as the best performing one. We then demonstrate applications of our method in the cosmological context, including serving as a model-independent tool for consistency checks of other datasets such as baryon acoustic oscillations, calibration of high-redshift datasets such as gamma-ray bursts, and use as a model-independent mock catalog generator for future probes.

INTRODUCTION
Knowledge of accurate distances to astronomical entities at various redshifts is essential for deducing the expansion history of the Universe. Observationally, however, this task is not simple, since no single standardizable measure of distance exists at all scales of cosmological interest. Hence one has to resort to a progressive method of calibrating distances, called the "cosmic distance ladder" method, using overlapping regions of potentially different standardizable objects as "rungs of the ladder". The conventional distance ladder method (Riess & Breuval 2023) starts with direct geometric distance measurements and progresses to calibrating Cepheid variables (Freedman & Madore 2023) or Tip of the Red Giant Branch (TRGB) stars (Freedman et al. 2020), and finally Type Ia supernovae (SNIa).
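Two pieces of the pipeline lend themselves to a short sketch: the distance-modulus relation mu = m - M = 5 log10(d_L / 10 pc) linking apparent magnitudes to luminosity distances, and a Gaussian loss using the full covariance among data points, as the training described above does. The numbers below are invented for illustration; only M of roughly -19.3 mag for SNIa is a standard fiducial value.

```python
import numpy as np

def distance_modulus(d_L_mpc):
    """mu = m - M = 5 log10(d_L / 10 pc), with d_L given in megaparsecs."""
    return 5.0 * np.log10(d_L_mpc * 1e6 / 10.0)

def covariance_weighted_loss(m_obs, m_pred, cov):
    """Gaussian chi^2-style loss using the full covariance among data
    points, the kind of information LADDER's training incorporates."""
    r = m_obs - m_pred
    return 0.5 * r @ np.linalg.solve(cov, r)

# Toy check: three supernovae with correlated magnitude errors (made up)
m_obs = np.array([22.1, 23.0, 24.2])
m_pred = distance_modulus(np.array([2500.0, 3800.0, 6500.0])) - 19.3
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.05, 0.01],
                [0.00, 0.01, 0.04]])
print("covariance-weighted loss:", covariance_weighted_loss(m_obs, m_pred, cov))
```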
PINSAT: Parallelized Interleaving of Graph Search and Trajectory Optimization for Kinodynamic Motion Planning
Natarajan, Ramkumar, Mukherjee, Shohin, Choset, Howie, Likhachev, Maxim
Trajectory optimization is a widely used technique in robot motion planning that lets the dynamics and constraints of the system shape and synthesize complex behaviors. Several previous works have shown its benefits in high-dimensional continuous state spaces and under differential constraints. However, long time horizons and planning around obstacles in non-convex spaces pose challenges in guaranteeing convergence or finding optimal solutions. As a result, discrete graph search planners and sampling-based planners are preferred when facing obstacle-cluttered environments. A recently developed algorithm called INSAT effectively combines graph search in the low-dimensional subspace and trajectory optimization in the full-dimensional space for global kinodynamic planning over long horizons. Although INSAT successfully reasoned about and solved complex planning problems, the numerous expensive calls to an optimizer resulted in large planning times, thereby limiting its practical use. Inspired by recent work on edge-based parallel graph search, we present PINSAT, which introduces systematic parallelization in INSAT to achieve lower planning times and higher success rates, while maintaining significantly lower costs than relevant baselines. We demonstrate PINSAT by evaluating it on 6-DoF kinodynamic manipulation planning with obstacles.
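A sketch of the parallelization idea, with everything below an illustrative assumption rather than the PINSAT implementation (which parallelizes edge evaluations asynchronously across the whole search): a best-first search over a toy graph where the expensive per-edge trajectory optimizations of each expansion are dispatched to a thread pool instead of being evaluated one at a time.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Toy low-dimensional graph: node -> list of successor nodes
edges = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}

def optimize_trajectory(u, v):
    """Stand-in for the expensive full-dimensional trajectory optimization
    run per edge; returns (edge cost, feasible). Purely illustrative."""
    return abs(v - u) + 0.1 * v, True

def pinsat_like_search(start, goal, workers=4):
    """Best-first search that dispatches the per-edge optimizations of each
    expansion to a thread pool instead of evaluating them sequentially."""
    dist = {start: 0.0}
    frontier = [(0.0, start)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            d, u = heapq.heappop(frontier)
            if u == goal:
                return d
            if d > dist.get(u, float("inf")):
                continue                     # stale queue entry
            results = pool.map(lambda v, u=u: (v, optimize_trajectory(u, v)),
                               edges[u])
            for v, (cost, feasible) in results:
                if feasible and d + cost < dist.get(v, float("inf")):
                    dist[v] = d + cost
                    heapq.heappush(frontier, (d + cost, v))
    return float("inf")

print("best path cost:", pinsat_like_search(0, 5))
```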
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Towards More Faithful Natural Language Explanation Using Multi-Level Contrastive Learning in VQA
Lai, Chengen, Song, Shengli, Meng, Shiqi, Li, Jingyang, Yan, Sitong, Hu, Guangneng
Natural language explanation in visual question answering (VQA-NLE) aims to explain the decision-making process of models by generating natural language sentences, increasing users' trust in black-box systems. Existing post-hoc methods have achieved significant progress in producing plausible explanations. However, such post-hoc explanations are not always aligned with human logical inference, suffering from three issues: 1) deductive unsatisfiability, where the generated explanations do not logically lead to the answer; 2) factual inconsistency, where the model falsifies its counterfactual explanation for answers without considering the facts in images; and 3) semantic perturbation insensitivity, where the model cannot recognize the semantic changes caused by small perturbations. These problems reduce the faithfulness of the explanations that models generate. To address the above issues, we propose a novel self-supervised \textbf{M}ulti-level \textbf{C}ontrastive \textbf{L}earning based natural language \textbf{E}xplanation model (MCLE) for VQA with semantic-level, image-level, and instance-level factual and counterfactual samples. MCLE extracts discriminative features and aligns the feature spaces of explanations with the visual question and answer to generate more consistent explanations. We conduct extensive experiments, ablation analysis, and a case study to demonstrate the effectiveness of our method on two VQA-NLE benchmarks.
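The backbone of a multi-level contrastive objective like the one described can be sketched with a standard InfoNCE loss applied once per level (semantic, image, instance), treating factual samples as positives and counterfactual samples as negatives. The loss form, tensor shapes, and temperature below are assumptions for illustration, not the authors' exact objective.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.1):
    """One contrastive term: pull each anchor embedding toward its factual
    (positive) sample and away from counterfactual (negative) samples.
    Shapes: anchor, positive (B, D); negatives (B, N, D)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_logit = (anchor * positive).sum(-1, keepdim=True)        # (B, 1)
    neg_logits = torch.einsum("bd,bnd->bn", anchor, negatives)   # (B, N)
    logits = torch.cat([pos_logit, neg_logits], dim=1) / temperature
    labels = torch.zeros(len(anchor), dtype=torch.long)          # positive at index 0
    return F.cross_entropy(logits, labels)

# Toy usage: one loss per level (semantic, image, instance), then summed.
B, N, D = 8, 4, 64
levels = [(torch.randn(B, D), torch.randn(B, D), torch.randn(B, N, D))
          for _ in range(3)]
total = sum(info_nce(a, p, neg) for a, p, neg in levels)
print("total multi-level contrastive loss:", float(total))
```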
Provably Convergent Plug-and-Play Quasi-Newton Methods
Tan, Hong Ye, Mukherjee, Subhadip, Tang, Junqi, Schönlieb, Carola-Bibiane
Plug-and-Play (PnP) methods are a class of efficient iterative methods that aim to combine data fidelity terms and deep denoisers using classical optimization algorithms, such as ISTA or ADMM, with applications in inverse problems and imaging. Provable PnP methods are a subclass of PnP methods with convergence guarantees, such as fixed point convergence or convergence to critical points of some energy function. Many existing provable PnP methods impose heavy restrictions on the denoiser or fidelity function, such as non-expansiveness or strict convexity, respectively. In this work, we propose a novel algorithmic approach incorporating quasi-Newton steps into a provable PnP framework based on proximal denoisers, resulting in greatly accelerated convergence while retaining light assumptions on the denoiser. By characterizing the denoiser as the proximal operator of a weakly convex function, we show that the fixed points of the proposed quasi-Newton PnP algorithm are critical points of a weakly convex function. Numerical experiments on image deblurring and super-resolution demonstrate 2--8x faster convergence as compared to other provable PnP methods with similar reconstruction quality.
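A self-contained sketch of the PnP pattern described, with loud caveats: the denoiser below is plain soft-thresholding (the prox of an l1 term) standing in for the learned proximal denoiser, and the quasi-Newton ingredient is reduced to a diagonal secant-based scaling of the gradient step; the paper's actual update and convergence guarantees are more sophisticated.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
A = rng.normal(size=(n, n)) / np.sqrt(n)    # toy forward operator (stand-in for a blur)
x_true = np.zeros(n)
x_true[::7] = 1.0                           # sparse ground truth
b = A @ x_true + 0.01 * rng.normal(size=n)  # noisy measurements

def grad_f(x):
    """Gradient of the data-fidelity term f(x) = 0.5 * ||Ax - b||^2."""
    return A.T @ (A @ x - b)

def denoiser(x, tau=0.05):
    """Toy proximal denoiser: soft-thresholding, the prox of tau*||.||_1.
    The paper characterizes a learned denoiser as the prox of a weakly
    convex function; this stand-in keeps the sketch self-contained."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def pnp_quasi_newton(x, gamma=0.2, iters=300):
    """PnP iteration x <- D(x - gamma * H^{-1} grad f(x)), where H is a
    crude diagonal secant-based Hessian approximation."""
    h = np.ones_like(x)                      # diagonal Hessian approximation
    g_old = grad_f(x)
    for _ in range(iters):
        x_new = denoiser(x - gamma * g_old / h)
        g_new = grad_f(x_new)
        s, y = x_new - x, g_new - g_old      # secant pair
        h = np.clip(np.abs(y) / np.maximum(np.abs(s), 1e-8), 1.0, 10.0)
        x, g_old = x_new, g_new
    return x

x_rec = pnp_quasi_newton(np.zeros(n))
print("relative error:", np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true))
```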