Gilani, Ammar
Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency
Hu, Jerry Yao-Chieh, Wang, Wei-Po, Gilani, Ammar, Li, Chenyang, Song, Zhao, Liu, Han
We investigate the statistical and computational limits of prompt tuning for transformer-based foundation models. Our key contributions are showing that prompt tuning on \textit{single-head} transformers with only a \textit{single} self-attention layer (i) is universal, and (ii) supports efficient (even almost-linear time) algorithms under the Strong Exponential Time Hypothesis (SETH). Statistically, we prove that prompt tuning on such simplest-possible transformers yields universal approximators for sequence-to-sequence Lipschitz functions. In addition, we provide an exponential-in-$dL$ and -in-$(1/\epsilon)$ lower bound on the number of soft-prompt tokens required for prompt tuning to memorize any dataset with 1-layer, 1-head transformers. Computationally, we identify a phase transition in the efficiency of prompt tuning, determined by the norm of the \textit{soft-prompt-induced} keys and queries, and provide an upper-bound criterion. Beyond this criterion, no sub-quadratic (efficient) algorithm for prompt tuning exists under SETH. Within this criterion, we showcase our theory by proving the existence of almost-linear-time prompt tuning inference algorithms. These fundamental limits provide important necessary conditions for designing expressive and efficient prompt tuning methods for practitioners.
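As a concrete, purely illustrative sketch of this setting (the names, dimensions, and weights below are hypothetical, not taken from the paper), prompt tuning on a 1-layer, 1-head transformer amounts to prepending trainable soft-prompt tokens to the input of a self-attention layer whose weights stay frozen:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def prompted_attention(X, P, Wq, Wk, Wv):
    """Single-head self-attention over [P; X].

    The projection weights Wq, Wk, Wv are frozen; only the
    soft-prompt tokens P would be tuned during training.
    """
    Z = np.concatenate([P, X], axis=0)          # (m + n, d): prompt then input
    Q, K, V = Z @ Wq, Z @ Wk, Z @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1])) # attention weights, rows sum to 1
    out = A @ V
    return out[P.shape[0]:]                     # keep outputs for the original tokens

rng = np.random.default_rng(0)
d, n, m = 8, 5, 3                               # embed dim, seq len, prompt len
X = rng.normal(size=(n, d))                     # input token embeddings
P = rng.normal(size=(m, d))                     # trainable soft-prompt tokens
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Y = prompted_attention(X, P, Wq, Wk, Wv)
print(Y.shape)                                  # (5, 8)
```

In this picture, `P @ Wq` and `P @ Wk` are the soft-prompt-induced queries and keys; the efficiency criterion above is stated in terms of the norm of exactly these quantities.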
BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model
Xu, Chenwei, Huang, Yu-Chao, Hu, Jerry Yao-Chieh, Li, Weijian, Gilani, Ammar, Goan, Hsi-Sheng, Liu, Han
The field of deep learning architectures for tabular data has recently experienced rapid advancements [Arik and Pfister, 2021, Gorishniy et al., 2021, Huang et al., 2020, Somepalli et al., 2021]. The primary driving force behind this trend is a limitation of the current dominant methods for tabular data, tree-based methods: while they excel in tabular learning, they lack the capability to integrate with deep learning architectures. Therefore, the pursuit of deep tabular learning is not just a matter of enhancing performance but is also crucial to bridging this gap. However, a recent tabular benchmark study [Grinsztajn et al., 2022] reveals that tree-based methods still surpass deep learning models, underscoring two main challenges for deep tabular learning, as highlighted by Grinsztajn et al. [2022, Sections 5.3 & 5.4]: (C1) Non-Rotationally Invariant Data Structure: the non-rotationally invariant structure of tabular data weakens the effectiveness of deep learning models with rotationally invariant learning procedures.
CoRMF: Criticality-Ordered Recurrent Mean Field Ising Solver
Pan, Zhenyu, Gilani, Ammar, Kuo, En-Jui, Liu, Zhuo
We propose an RNN-based efficient Ising model solver, the Criticality-ordered Recurrent Mean Field (CoRMF), for forward Ising problems. At its core, a criticality-ordered spin sequence of an N-spin Ising model is introduced by sorting mission-critical edges with a greedy algorithm, such that an autoregressive mean-field factorization can be utilized and optimized with Recurrent Neural Networks (RNNs). Our method has two notable characteristics: (i) by leveraging the approximated tree structure of the underlying Ising graph, the newly obtained criticality order enables the unification between variational mean-field and RNNs, allowing the generally intractable Ising model to be efficiently probed with probabilistic inference; (ii) it is well-modularized and model-independent while at the same time expressive enough, and hence fully applicable to any forward Ising inference problem with minimal effort. Computationally, by using a variance-reduced Monte Carlo gradient estimator, CoRMF solves Ising problems in a self-training fashion without data/evidence, and inference tasks can be executed by directly sampling from the RNN. Theoretically, we establish a provably tighter error bound than naive mean-field.

On one hand, the connection between NP problems and Ising models has resulted in strong physics intuitions [Kirkpatrick et al., 1983] that the hardness of these problems emerges through the lens of complex energy landscapes over discrete random variables with multiple local minima [Barahona, 1982, Chowdhury, 2014]. On the other hand, the computational difficulty on the Ising side resonates with the difficulties of numerous significant scientific problems, including many other combinatorial decision-making and optimization problems [Benati and Rizzi, 2007, Ngo et al., 1994, Garey and Johnson, 1979]. As the opposite of conventional inverse Ising problems [Nguyen et al., 2017, Reneau et al., 2023] that reconstruct graphical structure from data, we refer to these problems, which have pre-specified graphical structures, as forward Ising problems (combinatorial inference and optimization problems in Ising formulations [De las Cuevas and Cubitt, 2016, Lucas, 2014, Pan et al., 2023]), and any efficient computational method or hardware solver [Mohseni et al., 2022] for Ising models can potentially benefit them. To describe the Ising model, we first introduce some notation. We consider an Ising model of N spins as an exponential family model for binary N-spin data with up to quadratic sufficient statistics, taking the Boltzmann form.
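The Boltzmann form referenced here is, in standard Ising-model notation (couplings $J_{ij}$, external fields $h_i$, inverse temperature $\beta$; these symbols follow convention rather than any notation fixed by the paper):

```latex
p(\sigma) \;=\; \frac{1}{Z}\,
\exp\!\Big(\beta \sum_{i<j} J_{ij}\,\sigma_i \sigma_j \;+\; \beta \sum_{i} h_i\,\sigma_i\Big),
\qquad \sigma \in \{-1,+1\}^{N},
```

where the partition function $Z$ sums the same exponential over all $2^N$ spin configurations; computing $Z$ exactly is what makes the forward problem generally intractable.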
Feature Programming for Multivariate Time Series Prediction
Reneau, Alex, Hu, Jerry Yao-Chieh, Xu, Chenwei, Li, Weijian, Gilani, Ammar, Liu, Han
We introduce the concept of programmable feature engineering for time series modeling and propose a feature programming framework. This framework generates large amounts of predictive features for noisy multivariate time series while allowing users to incorporate their inductive bias with minimal effort. The key motivation of our framework is to view any multivariate time series as a cumulative sum of fine-grained trajectory increments, with each increment governed by a novel spin-gas dynamical Ising model.

Our key motivation comes from a novel dynamical Ising-like model, the spin-gas Glauber dynamics, originating from a newly debuted gas-like interaction that includes momentum and acceleration information. By using spin-gas Glauber dynamics as the fundamental model for time series generating processes at the smallest time scale, we explore the potential of treating time series as the path-sum of infinitesimal increments generated by a series of Markovian coin tosses following the spin-gas Glauber dynamics. From such a fine-grained perspective, a set of operators is motivated.
Unsupervised Hypergraph Feature Selection via a Novel Point-Weighting Framework and Low-Rank Representation
Gilani, Ammar, Amirmazlaghani, Maryam
Feature selection methods are widely used to alleviate the 'curse of dimensionality' problem. Many proposed feature selection frameworks treat all data points equally, neglecting their different representation power and importance. In this paper, we propose an unsupervised hypergraph feature selection method via a novel point-weighting framework and low-rank representation that captures the importance of different data points. We introduce a novel soft hypergraph with low complexity to model data. Then, we formulate feature selection as an optimization problem that preserves both the local relationships and the global structure of the data. Our approach to global structure preservation helps the framework overcome the unavailability of data labels in unsupervised learning. The proposed method treats data points differently according to their importance in defining the data structure and their representation power. Moreover, since robustness against noise and outliers is of great importance for feature selection, we adopt low-rank representation in our model. We also provide an efficient algorithm to solve the proposed optimization problem; its computational cost is lower than that of many state-of-the-art methods, which matters greatly in feature selection tasks. We conducted comprehensive experiments with various evaluation methods on different benchmark data sets, and these experiments indicate significant improvement over state-of-the-art feature selection methods.