AITopics | base feature

base feature

A State Representation for Diminishing Rewards

Neural Information Processing Systems | Feb-15-2026, 19:44:39 GMT

In such situations, the successor representation (SR) is a popular framework which supports rapid policy evaluation by decoupling a policy's

data mining, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Portugal > Porto > Porto (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Mining (0.92)

Attention Saturation and Gradient Suppression at Inflection Layers: Diagnosing and Mitigating Bottlenecks in Transformer Adaptation

Zixian, Wang

arXiv.org Artificial Intelligence | Nov-4-2025

Pre-trained Transformers often exhibit over-confidence in source patterns and difficulty in forming new target-domain patterns during fine-tuning. We formalize the mechanism of output saturation leading to gradient suppression through standard cross-entropy and softmax analysis, showing that gradient suppression at inflection layers confines adaptation to high-level recombination of existing features while preventing low-level reconstruction. We introduce a set of layer-wise diagnostic metrics -- attention entropy (saturation proxy), activation gradient norm, parameter gradient norm, and Delta-CKA under a shared PCA basis -- to identify inflection layers characterized by both low attention entropy and steep gradient decay. Building on these findings, we propose a diagnose-first, inject-light fine-tuning strategy: selectively inserting LoRA adapters at inflection layers to restore suppressed backward signals with minimal parameter overhead. Experiments on BERT-base transfer from SST-2 to Rotten Tomatoes under under-trained and over-trained source regimes reveal that over-trained initialization benefits from inflection-layer LoRA injection, while under-trained initialization suffers performance degradation. When base features are strong, unblocking inflection layers facilitates high-level compositional adaptation; when base features are weak, full-pathway unblocking is required for low-level reconstruction, as supported by joint analysis of layer-wise activation gradients and Delta-CKA dynamics.

artificial intelligence, inflection layer, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2511.00797

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
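
The attention-entropy diagnostic named in the abstract above can be illustrated with a minimal sketch. It computes the Shannon entropy of each softmax attention row and averages over rows; low values indicate a peaked (saturated) distribution. The function names and the plain averaging over rows are assumptions made for illustration, since the listing does not state how the paper aggregates over heads or tokens.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of attention logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attention_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution."""
    return -sum(p * math.log(p) for p in weights if p > 0.0)


def mean_attention_entropy(logit_rows):
    """Average row entropy; an assumed aggregation, not the paper's."""
    rows = [attention_entropy(softmax(r)) for r in logit_rows]
    return sum(rows) / len(rows)


# A uniform row has maximal entropy; a sharply peaked row is near zero.
uniform = attention_entropy(softmax([0.0, 0.0, 0.0, 0.0]))
peaked = attention_entropy(softmax([10.0, 0.0, 0.0, 0.0]))
```

A layer whose mean entropy sits close to zero would be a candidate for the "saturated" label under this proxy; the threshold for that decision is not given in the listing.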

Towards White Box Deep Learning

Satkiewicz, Maciej

arXiv.org Artificial Intelligence | Apr-17-2024

The main advantages of deep neural networks (DNNs) are their architectural simplicity and automatic feature learning. The latter is crucial for working with unstructured data, as developers don't need to design features by hand. However, giving up control over features leads to black box models - DNNs tend to learn hard-to-interpret "shortcut" correlations [17] that leak from train to test [20], hampering alignment and out-of-distribution performance. In particular, this gives rise to adversarial attacks [35] - semantically negligible perturbations of data that arbitrarily change the model's predictions. Adversarial vulnerability is a widespread phenomenon (vision [35], segmentation/detection [39], speech recognition [9], tabular data [10], RL [19], NLP [41]) and contributes substantially to the general lack of trust in DNNs, limiting their adoption in high-stakes applications such as healthcare, military, autonomous vehicles or cybersecurity. Conversely, the main advantage of hand-designed features is fine-grained control over the model's performance; however, such systems quickly become infeasibly complex. This paper aims to address those issues by reconciling Deep Learning with feature engineering - with the help of locality engineering. Specifically, semantic features are introduced as a general conceptual machinery for controlled dimensionality reduction inside a neural network layer. Figure 1 presents the core idea behind the notion and the rigorous definition is given in Section 4. Implementing a semantic feature predominantly involves encoding appropriate invariants (i.e.

arxiv, robustness, semantic feature, (14 more...)

arXiv.org Artificial Intelligence

2403.09863

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > Poland > Lesser Poland Province > Kraków (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (0.55)
Government > Military (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Automated Model Selection for Tabular Data

Amballa, Avinash, Mekala, Anmol, Akkinapalli, Gayathri, Madine, Manas, Yarrabolu, Naga Pavana Priya, Grabowicz, Przemyslaw A.

arXiv.org Artificial Intelligence | Jan-1-2024

Structured data in the form of tabular datasets contain features that are distinct and discrete, with varying individual and relative importances to the target. Combinations of one or more features may be more predictive and meaningful than simple individual feature contributions. R's mixed effect linear models library allows users to provide such interactive feature combinations in the model design. However, given many features and possible interactions to select from, model selection becomes an exponentially difficult task. We aim to automate the model selection process for predictions on tabular datasets incorporating feature interactions while keeping computational costs small. The framework includes two distinct approaches for feature selection: a Priority-based Random Grid Search and a Greedy Search method. The Priority-based approach efficiently explores feature combinations using prior probabilities to guide the search. The Greedy method builds the solution iteratively by adding or removing features based on their impact. Experiments on synthetic data demonstrate the ability to effectively capture predictive feature combinations.

feature interaction, interaction, selection, (16 more...)

arXiv.org Artificial Intelligence

2401.00961

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Experimental Study (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
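
The greedy search described in the abstract above can be sketched in a few lines. This is a generic forward-selection loop, assuming a caller-supplied `score` function (hypothetical here) that rates a list of features or interaction terms; it covers only the "adding" direction and is not the authors' implementation, whose scoring model and removal step are not described in this listing.

```python
def greedy_forward_selection(candidates, score, min_gain=1e-9):
    """Add the single best-scoring term until no term improves the score.

    candidates: feature or interaction names (illustrative strings).
    score: callable taking a list of names, returning a number to maximise.
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        # Score each candidate when added to the current selection.
        trials = [(score(selected + [c]), c) for c in remaining]
        trial_score, choice = max(trials, key=lambda t: t[0])
        if trial_score - best <= min_gain:
            break  # no remaining term helps enough
        selected.append(choice)
        remaining.remove(choice)
        best = trial_score
    return selected, best


# Toy additive score: each term contributes a fixed, made-up amount.
toy_weights = {"x1": 0.5, "x2": 0.3, "x1:x2": 0.2, "noise": -0.1}


def toy_score(terms):
    return sum(toy_weights[t] for t in terms)


chosen, final_score = greedy_forward_selection(toy_weights, toy_score)
```

Each pass costs one `score` call per remaining candidate, so the loop is quadratic in the number of candidates rather than exponential in the number of possible subsets.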

Towards model-free RL algorithms that scale well with unstructured data

Modayil, Joseph, Abbas, Zaheer

arXiv.org Artificial Intelligence | Nov-3-2023

Conventional reinforcement learning (RL) algorithms exhibit broad generality in their theoretical formulation and high performance on several challenging domains when combined with powerful function approximation. However, developing RL algorithms that perform well across problems with unstructured observations at scale remains challenging because most function approximation methods rely on externally provisioned knowledge about the structure of the input for good performance (e.g. convolutional networks, graph neural networks, tile-coding). A common practice in RL is to evaluate algorithms on a single problem, or on problems with limited variation in the observation scale. RL practitioners lack a systematic way to study how well a single RL algorithm performs when instantiated across a range of problem scales, and they lack function approximation techniques that scale well with unstructured observations. We address these limitations by providing environments and algorithms to study scaling for unstructured observation vectors and flat action spaces. We introduce a family of combinatorial RL problems with an exponentially large state space and high-dimensional dynamics but where linear computation is sufficient to learn a (nonlinear) value function estimate for performant control. We provide an algorithm that constructs reward-relevant general value function (GVF) questions to find and exploit predictive structure directly from the experience stream. In an empirical evaluation of the approach on synthetic problems, we observe a sample complexity that scales linearly with the observation size. The proposed algorithm reliably outperforms a conventional deep RL algorithm on these scaling problems, and it exhibits several desirable auxiliary properties. These results suggest new algorithmic mechanisms by which algorithms can learn at scale from unstructured data.

algorithm, base feature, gvf question, (16 more...)

arXiv.org Artificial Intelligence

2311.02215

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Education (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

A State Representation for Diminishing Rewards

Moskovitz, Ted, Hromadka, Samo, Touati, Ahmed, Borsa, Diana, Sahani, Maneesh

arXiv.org Artificial Intelligence | Sep-7-2023

A common setting in multitask reinforcement learning (RL) demands that an agent rapidly adapt to various stationary reward functions randomly sampled from a fixed distribution. In such situations, the successor representation (SR) is a popular framework which supports rapid policy evaluation by decoupling a policy's expected discounted, cumulative state occupancies from a specific reward function. However, in the natural world, sequential tasks are rarely independent, and instead reflect shifting priorities based on the availability and subjective perception of rewarding stimuli. Reflecting this disjunction, in this paper we study the phenomenon of diminishing marginal utility and introduce a novel state representation, the $\lambda$ representation ($\lambda$R) which, surprisingly, is required for policy evaluation in this setting and which generalizes the SR as well as several other state representations from the literature. We establish the $\lambda$R's formal properties and examine its normative advantages in the context of machine learning, as well as its usefulness for studying natural behaviors, particularly foraging.

agent, experiment, representation, (12 more...)

arXiv.org Artificial Intelligence

2309.0371

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Portugal > Porto > Porto (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Mining (0.92)
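
The decoupling that the abstract above attributes to the successor representation can be illustrated with a small tabular sketch. It uses the standard SR relations M = I + gamma * P * M and V = M r for a fixed policy, not the paper's λ representation; the two-state chain, the discount `gamma`, and the function names are illustrative assumptions.

```python
def successor_representation(P, gamma, iters=500):
    """Tabular SR for a fixed policy via fixed-point iteration.

    P[i][j] is the probability of moving from state i to state j under
    the policy. M[i][j] converges to the expected discounted number of
    visits to j when starting in i.
    """
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(iters):
        # One application of M <- I + gamma * P @ M.
        M = [[identity[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
              for j in range(n)] for i in range(n)]
    return M


def evaluate(M, r):
    """Policy evaluation for any reward vector r: V = M r."""
    return [sum(m * rj for m, rj in zip(row, r)) for row in M]


# Two-state chain: state 0 moves to state 1, state 1 is absorbing.
P = [[0.0, 1.0],
     [0.0, 1.0]]
M = successor_representation(P, gamma=0.5)

# The same M is reused for two different reward functions.
v_goal_at_1 = evaluate(M, [0.0, 1.0])
v_goal_at_0 = evaluate(M, [1.0, 0.0])
```

Once M is computed, evaluating the policy under a new reward vector is a single matrix-vector product, which is the "rapid policy evaluation" the abstract refers to.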

Verifiable Goal Recognition for Autonomous Driving with Occlusions

Brewitt, Cillian, Tamborski, Massimiliano, Wang, Cheng, Albrecht, Stefano V.

arXiv.org Artificial Intelligence | Aug-1-2023

Goal recognition (GR) involves inferring the goals of other vehicles, such as a certain junction exit, which can enable more accurate prediction of their future behaviour. In autonomous driving, vehicles can encounter many different scenarios and the environment may be partially observable due to occlusions. We present a novel GR method named Goal Recognition with Interpretable Trees under Occlusion (OGRIT). OGRIT uses decision trees learned from vehicle trajectory data to infer the probabilities of a set of generated goals. We demonstrate that OGRIT can handle missing data due to occlusions and make inferences across multiple scenarios using the same learned decision trees, while being computationally fast, accurate, interpretable and verifiable. We also release the inDO, rounDO and OpenDDO datasets of occluded regions used to evaluate OGRIT.

artificial intelligence, machine learning, vehicle, (21 more...)

arXiv.org Artificial Intelligence

2206.14163

Country: Europe > Germany (0.04)

Genre: Research Report (0.40)

Industry:

Transportation > Ground > Road (0.86)
Automobiles & Trucks (0.84)
Information Technology > Robotics & Automation (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

I Prefer not to Say: Protecting User Consent in Models with Optional Personal Data

Leemann, Tobias, Pawelczyk, Martin, Eberle, Christian Thomas, Kasneci, Gjergji

arXiv.org Artificial Intelligence | Jun-6-2023

We examine machine learning models in a setup where individuals have the choice to share optional personal information with a decision-making system, as seen in modern insurance pricing models. Some users consent to their data being used whereas others object and keep their data undisclosed. In this work, we show that the decision not to share data can be considered as information in itself that should be protected to respect users' privacy. This observation raises the overlooked problem of how to ensure that users who protect their personal data do not suffer any disadvantages as a result. To address this problem, we formalize protection requirements for models which only use the information for which active user consent was obtained. This excludes implicit information contained in the decision to share data or not. We offer the first solution to this problem by proposing the notion of Protected User Consent (PUC), which we prove to be loss-optimal under our protection requirement. To learn PUC-compliant models, we devise a model-agnostic data augmentation strategy with finite sample convergence guarantees. Finally, we analyze the implications of PUC on a variety of challenging real-world datasets, tasks, and models.

data mining, information, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2210.13954

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > California (0.04)
Oceania > Australia > New South Wales (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Health & Medicine > Health Care Providers & Services > Reimbursement (0.92)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

OpenFE: Automated Feature Generation with Expert-level Performance

Zhang, Tianping, Zhang, Zheyu, Fan, Zhiyuan, Luo, Haoyan, Liu, Fengyuan, Liu, Qian, Cao, Wei, Li, Jian

arXiv.org Artificial Intelligence | Jun-5-2023

The goal of automated feature generation is to liberate machine learning experts from the laborious task of manual feature generation, which is crucial for improving learning performance on tabular data. The major challenge in automated feature generation is to efficiently and accurately identify effective features from a vast pool of candidate features. In this paper, we present OpenFE, an automated feature generation tool that provides competitive results against machine learning experts. OpenFE achieves high efficiency and accuracy with two components: 1) a novel feature boosting method for accurately evaluating the incremental performance of candidate features and 2) a two-stage pruning algorithm that performs feature pruning in a coarse-to-fine manner. Extensive experiments on ten benchmark datasets show that OpenFE outperforms existing baseline methods by a large margin. We further evaluate OpenFE in two Kaggle competitions with thousands of data science teams participating. In the two competitions, features generated by OpenFE with a simple baseline model beat 99.3% and 99.6% of data science teams, respectively. In addition to the empirical results, we provide a theoretical perspective to show that feature generation can be beneficial in a simple yet representative setting. The code is available at https://github.com/ZhangTP1996/OpenFE.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2211.12507

Country:

North America > United States > California (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Beijing > Beijing (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
