Chowdhury, Agniva
A Provably Accurate Randomized Sampling Algorithm for Logistic Regression
Chowdhury, Agniva; Ramuhalli, Pradeep
In statistics and machine learning, logistic regression is a widely used supervised learning technique primarily employed for binary classification tasks. For the setting in which the number of observations greatly exceeds the number of predictor variables, we present a simple, randomized sampling-based algorithm for the logistic regression problem that guarantees high-quality approximations to both the estimated probabilities and the overall discrepancy of the model. Our analysis builds upon two simple structural conditions that boil down to randomized matrix multiplication, a fundamental and well-understood primitive of randomized numerical linear algebra. We analyze the properties of the estimated probabilities of logistic regression when leverage scores are used to sample observations, and we prove that accurate approximations can be achieved with a sample whose size is much smaller than the total number of observations. To further validate our theoretical findings, we conduct comprehensive empirical evaluations. Overall, our work sheds light on the potential of randomized sampling approaches for efficiently approximating the estimated probabilities in logistic regression, offering a practical and computationally efficient solution for large-scale datasets.
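To make the sampling scheme concrete, the sketch below draws observations with leverage-score probabilities and fits a weighted logistic regression on the sample. It is a minimal illustration under stated assumptions rather than the paper's exact algorithm: the sample size `s`, the use of scikit-learn's LogisticRegression, and the importance weights passed via `sample_weight` are all illustrative choices.

```python
# Minimal sketch: leverage-score row sampling for logistic regression (n >> d).
# Assumptions (not from the paper): sample size `s`, scikit-learn's solver,
# and importance weights supplied through `sample_weight`.
import numpy as np
from sklearn.linear_model import LogisticRegression

def leverage_scores(X):
    """Row leverage scores of X via a thin QR factorization."""
    Q, _ = np.linalg.qr(X)          # Q is n x d with orthonormal columns
    return np.sum(Q ** 2, axis=1)   # l_i = ||Q[i, :]||^2; sums to rank(X)

def sampled_logistic_fit(X, y, s, rng=np.random.default_rng(0)):
    """Fit logistic regression on s rows drawn with leverage-score probabilities."""
    levs = leverage_scores(X)
    probs = levs / levs.sum()                        # sampling distribution
    idx = rng.choice(X.shape[0], size=s, p=probs)    # sample with replacement
    w = 1.0 / (s * probs[idx])                       # importance weights for unbiasedness
    # Large C makes the fit effectively unregularized, matching plain logistic regression.
    return LogisticRegression(C=1e6, max_iter=1000).fit(X[idx], y[idx], sample_weight=w)

# Usage in the n >> d regime:
n, d, s = 100_000, 20, 2_000
rng = np.random.default_rng(1)
X = rng.standard_normal((n, d))
beta = rng.standard_normal(d)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta))).astype(int)
approx = sampled_logistic_fit(X, y, s)
print(approx.predict_proba(X[:5])[:, 1])   # approximate estimated probabilities
```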
Deep Learning with Physics Priors as Generalized Regularizers
Liu, Frank; Chowdhury, Agniva
In various scientific and engineering applications, an approximate model of the underlying complex system is typically available, even though it carries both aleatoric and epistemic uncertainties. In this paper, we present a principled method to incorporate these approximate models as physics priors during modeling, to prevent overfitting and enhance the generalization capabilities of the trained models. Utilizing the structural risk minimization (SRM) inductive principle pioneered by Vapnik, this approach structures the physics priors into generalized regularizers. The experimental results demonstrate that our method achieves up to two orders of magnitude of improvement in testing accuracy.
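The sketch below illustrates the general idea of a physics prior acting as a regularizer: the training loss combines a data-fit term with a penalty for disagreeing with a cheap approximate model at unlabeled collocation points. This is a minimal PyTorch illustration, not the paper's method; the toy prior `physics_prior`, the weight `lam`, the collocation points, and the small MLP are all invented for exposition.

```python
# Minimal sketch: an approximate model used as a generalized regularizer.
# Assumptions (illustrative only): the toy prior, `lam`, and the architecture.
import torch
import torch.nn as nn

def physics_prior(x):
    """A cheap approximate model of the system (carries its own error)."""
    return torch.sin(x)

net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
lam = 0.1   # strength of the physics regularizer

x_train = torch.rand(256, 1) * 6.0
y_train = torch.sin(x_train) + 0.1 * torch.randn_like(x_train)  # noisy labels
x_coll = torch.rand(1024, 1) * 6.0   # unlabeled collocation points

for step in range(2000):
    opt.zero_grad()
    data_loss = ((net(x_train) - y_train) ** 2).mean()
    # SRM-style regularizer: penalize disagreement with the approximate model.
    prior_loss = ((net(x_coll) - physics_prior(x_coll)) ** 2).mean()
    loss = data_loss + lam * prior_loss
    loss.backward()
    opt.step()

print(float(data_loss))   # final data-fit error
```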
Randomized Iterative Algorithms for Fisher Discriminant Analysis
Chowdhury, Agniva; Yang, Jiasen; Drineas, Petros
Fisher discriminant analysis (FDA) is a widely used method for classification and dimensionality reduction. When the number of predictor variables greatly exceeds the number of observations, a standard alternative to conventional FDA is regularized Fisher discriminant analysis (RFDA). In this paper, we present a simple, iterative, sketching-based algorithm for RFDA that comes with provable accuracy guarantees when compared to the conventional approach. Our analysis builds upon two simple structural results that boil down to randomized matrix multiplication, a fundamental and well-understood primitive of randomized linear algebra. We analyze the behavior of RFDA when ridge leverage scores or standard leverage scores are used to select predictor variables, and we prove that accurate approximations can be achieved by a sample whose size depends on the effective degrees of freedom of the RFDA problem. Our results yield significant improvements over existing approaches, and our empirical evaluations support our theoretical analyses.
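As a concrete illustration of the sampling primitive, the sketch below computes column ridge leverage scores, whose sum equals the effective degrees of freedom, and uses them to subselect and rescale predictor variables. It is a minimal sketch, not the paper's iterative algorithm; the ridge parameter `lam`, the sample size `s`, and the helper names are assumptions for illustration.

```python
# Minimal sketch: ridge leverage-score column sampling in the d >> n regime.
# Assumptions (illustrative only): `lam`, `s`, and that the sketched matrix is
# handed to a downstream RFDA solver.
import numpy as np

def ridge_leverage_scores(A, lam):
    """Column ridge leverage scores tau_j = a_j^T (A A^T + lam I)^{-1} a_j.
    Their sum equals the effective degrees of freedom, sum_i s_i^2/(s_i^2 + lam)."""
    n = A.shape[0]
    K = A @ A.T + lam * np.eye(n)                    # n x n; cheap when n << d
    return np.einsum('ij,ij->j', A, np.linalg.solve(K, A))

def sample_columns(A, lam, s, rng=np.random.default_rng(0)):
    """Return a column-sampled-and-rescaled sketch of A."""
    tau = ridge_leverage_scores(A, lam)
    p = tau / tau.sum()
    idx = rng.choice(A.shape[1], size=s, p=p)
    scale = 1.0 / np.sqrt(s * p[idx])                # rescaling preserves A A^T in expectation
    return A[:, idx] * scale, idx

# Usage with d >> n:
n, d, lam, s = 50, 5000, 1.0, 800
A = np.random.default_rng(1).standard_normal((n, d))
A_sk, idx = sample_columns(A, lam, s)
print(A_sk.shape)   # (50, 800)
```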
Coreset Construction via Randomized Matrix Multiplication
Yang, Jiasen; Chowdhury, Agniva; Drineas, Petros
Coresets are small sets of points that approximate the properties of a larger point set. For example, given a compact set $\mathcal{S} \subseteq \mathbb{R}^d$, a coreset could be defined as a (weighted) subset of $\mathcal{S}$ that approximates the sum of squared distances from $\mathcal{S}$ to every linear subspace of $\mathbb{R}^d$. As such, coresets can serve as a proxy for the full dataset and provide an important technique to speed up algorithms for problems such as principal component analysis and latent semantic indexing. In this paper, we provide a structural result that connects the construction of such coresets to approximating matrix products. This structural result implies a simple, randomized algorithm that constructs coresets whose sizes are independent of the number and dimensionality of the input points. The expected size of the resulting coresets yields an improvement over the state-of-the-art deterministic approach. Finally, we evaluate the proposed randomized algorithm on synthetic and real data, and demonstrate its effective performance relative to its deterministic counterpart.
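To illustrate the connection to randomized matrix multiplication, the sketch below samples points with probability proportional to their squared norms and weights them so that the coreset's Gram matrix is an unbiased estimate of $A^{\top} A$. This is the classical matrix-multiplication sampling primitive rather than the paper's specific construction, and the sample size `s` is an illustrative assumption.

```python
# Minimal sketch: a weighted coreset via squared-norm importance sampling.
# The weighted sample satisfies sum_i w_i a_i a_i^T ~ A^T A in expectation,
# which is the randomized matrix-multiplication primitive.
import numpy as np

def norm_sampled_coreset(A, s, rng=np.random.default_rng(0)):
    """Sample s rows of A with probability proportional to squared row norms."""
    sq = np.sum(A ** 2, axis=1)
    p = sq / sq.sum()
    idx = rng.choice(A.shape[0], size=s, p=p)
    w = 1.0 / (s * p[idx])          # weights make the Gram estimator unbiased
    return A[idx], w

# Sanity check: the weighted Gram matrix of the coreset approximates A^T A.
A = np.random.default_rng(1).standard_normal((20000, 10))
C, w = norm_sampled_coreset(A, 500)
G_full = A.T @ A
G_core = C.T @ (C * w[:, None])
print(np.linalg.norm(G_full - G_core) / np.linalg.norm(G_full))
```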