AITopics | data transformation

data transformation

Measurements and Generalization Bounds

Neural Information Processing Systems | Apr-25-2026, 02:44:56 GMT

The proof has the following steps.

artificial intelligence, machine learning, transformation, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

2287c6b8641dd2d21ab050eb9ff795f3-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 02:44:53 GMT

artificial intelligence, machine learning, transformation, (14 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report (0.46)

Industry: Government > Military (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Real-time Core-Periphery Guided ViT with Smart Data Layout Selection on Mobile Devices

Neural Information Processing Systems | Mar-22-2026, 01:37:30 GMT

Mobile devices have become essential enablers for AI applications, particularly in scenarios that require real-time performance. Vision Transformer (ViT) has become a fundamental cornerstone in this regard due to its high accuracy. Recent efforts have been dedicated to developing various transformer architectures that offer improved accuracy while reducing the computational requirements. However, existing research primarily focuses on reducing the theoretical computational complexity through methods such as local attention and model pruning, rather than considering realistic performance on mobile hardware. Although these optimizations reduce computational demands, they either introduce additional overheads related to data transformation (e.g., Reshape and Transpose) or irregular computation/data-access patterns.
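To make the data-transformation overhead concrete, here is a minimal sketch, assuming a plain row-major buffer rather than any particular mobile GPU layout. The helper names `reshape_row_major` and `transpose_row_major` are hypothetical and are not taken from the paper. Under this assumption, a Reshape only changes shape metadata, while a Transpose has to move every element.

```python
from typing import List, Tuple


def reshape_row_major(buf: List[float], rows: int, cols: int) -> Tuple[List[float], Tuple[int, int]]:
    """Reshape a contiguous row-major buffer: only the shape metadata changes."""
    if rows * cols != len(buf):
        raise ValueError("new shape must keep the number of elements")
    return buf, (rows, cols)  # same buffer object, no element is copied


def transpose_row_major(buf: List[float], rows: int, cols: int) -> Tuple[List[float], Tuple[int, int]]:
    """Transpose a rows x cols row-major buffer: every element is moved."""
    if rows * cols != len(buf):
        raise ValueError("shape does not match the buffer length")
    out = [0.0] * len(buf)
    for r in range(rows):
        for c in range(cols):
            # element (r, c) of the input becomes element (c, r) of the output
            out[c * rows + r] = buf[r * cols + c]
    return out, (cols, rows)


buf = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]           # a 2 x 3 matrix stored row by row
same, new_shape = reshape_row_major(buf, 3, 2)
moved, t_shape = transpose_row_major(buf, 2, 3)
print(same is buf, new_shape)                   # True (3, 2)
print(moved, t_shape)                           # [0.0, 3.0, 1.0, 4.0, 2.0, 5.0] (3, 2)
```

On real mobile hardware the cost depends on the memory layout in use, so this only illustrates why layout changes are not free in general.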

artificial intelligence, machine learning, real time system, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Architecture > Real Time Systems (0.64)
Information Technology > Communications > Mobile (0.44)
Information Technology > Artificial Intelligence > Machine Learning (0.39)

adb77ecc8ba1c2d3135c86a46b8f2496-Paper-Conference.pdf

Neural Information Processing Systems | Feb-17-2026, 10:09:23 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

2287c6b8641dd2d21ab050eb9ff795f3-Supplemental.pdf

Neural Information Processing Systems | Feb-7-2026, 20:44:16 GMT

The proof has the following steps. (I) Any

artificial intelligence, machine learning, transformation, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

2287c6b8641dd2d21ab050eb9ff795f3-Paper.pdf

Neural Information Processing Systems | Feb-7-2026, 20:44:14 GMT

data transformation, generalization, transformation, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Industry: Government > Military (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Understanding the Generalization Benefit of Model Invariance from a Data Perspective

Neural Information Processing Systems | Dec-23-2025, 21:17:53 GMT

Machine learning models that are developed to be invariant under certain types of data transformations have shown improved generalization in practice. However, a principled understanding of why invariance benefits generalization is limited. Given a dataset, there is often no principled way to select suitable data transformations under which model invariance guarantees better generalization. This paper studies the generalization benefit of model invariance by introducing the sample cover induced by transformations, i.e., a representative subset of a dataset that can approximately recover the whole dataset using transformations. For any data transformations, we provide refined generalization bounds for invariant models based on the sample cover. We also characterize the suitability of a set of data transformations by the sample covering number induced by transformations, i.e., the smallest size of its induced sample covers. We show that we may tighten the generalization bounds for suitable transformations that have a small sample covering number. In addition, our proposed sample covering number can be empirically evaluated and thus provides guidance for selecting transformations to develop model invariance for better generalization. In experiments on multiple datasets, we evaluate sample covering numbers for some commonly used transformations and show that a smaller sample covering number for a set of transformations (e.g., the 3D-view transformation) indicates a smaller gap between the test and training error for invariant models, which verifies our propositions.
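The idea of a sample cover can be illustrated with a small sketch. It assumes a max-norm distance, a tolerance `eps`, and a greedy set-cover heuristic; none of these choices is stated in the abstract, and the function name `greedy_sample_cover` is hypothetical. The greedy result only gives an upper bound on the sample covering number, not its exact value.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Transform = Callable[[Point], Point]


def identity(p: Point) -> Point:
    return p


def max_abs_diff(a: Point, b: Point) -> float:
    """Max-norm distance (an assumed metric, chosen for simplicity)."""
    return max(abs(x - y) for x, y in zip(a, b))


def covers(center: Point, target: Point, transforms: Sequence[Transform], eps: float) -> bool:
    """True if some transformed copy of `center` lies within eps of `target`."""
    return any(max_abs_diff(t(center), target) <= eps for t in transforms)


def greedy_sample_cover(data: Sequence[Point], transforms: Sequence[Transform], eps: float) -> List[int]:
    """Greedily pick indices whose transformed copies approximately recover all of `data`."""
    all_transforms = [identity] + list(transforms)  # every point covers itself
    uncovered = set(range(len(data)))
    cover: List[int] = []
    while uncovered:
        # choose the point that covers the most still-uncovered points
        best = max(
            range(len(data)),
            key=lambda i: sum(covers(data[i], data[j], all_transforms, eps) for j in uncovered),
        )
        cover.append(best)
        uncovered -= {j for j in uncovered if covers(data[best], data[j], all_transforms, eps)}
    return cover


def rot90(p: Point) -> Point:
    x, y = p
    return (-y, x)


rotations = [rot90, lambda p: rot90(rot90(p)), lambda p: rot90(rot90(rot90(p)))]
data = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0), (3.0, 3.0)]
print(greedy_sample_cover(data, rotations, eps=1e-9))  # [0, 4]
print(greedy_sample_cover(data, [], eps=1e-9))         # [0, 1, 2, 3, 4]
```

In this toy dataset, 90-degree rotations shrink the cover from five points to two, which is the kind of reduction the abstract associates with transformations that suit the data.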

generalization, generalization benefit, transformation, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.76)

Dataforge: A Data Agent Platform for Autonomous Data Engineering

Wang, Xinyuan, Fu, Yanjie

arXiv.org Artificial Intelligence | Nov-11-2025

B. Hierarchical Routing. After data cleaning, to enable efficient and reliable decision-making, we adopt a hierarchical routing architecture with task-level and action-level reasoning. At the task level, a rule-based router quickly identifies the task type (classification, regression, or unsupervised learning) from table schema metadata such as data types, label structures, and feature distributions. This lightweight router relies on deterministic heuristics instead of large language models, which enables fast and reliable responses across diverse datasets. At the action level, a compact LLM-based planner refines the decision by selecting and planning the most suitable feature-level actions, such as differently ordered combinations of feature selection, transformation, or generation, for the identified task (e.g., a classification dataset). Since each router operates within a smaller, well-defined action space, this hierarchical routing approach not only accelerates processing but also avoids invalid or high-risk operations.

C. Dual Feedback Loops. We develop two collaborative feedback loops that transform the static workflow into an adaptive, self-correcting process, in order to achieve autonomy and continual refinement.

1) Action Validation Loop for Safety: This feedback loop grounds actions to ensure operational safety before execution. Each planned action is first grounded through schema alignment, type checking, and logical consistency tests, such as detecting divisions by zero or invalid type conversions. Only actions that pass validation proceed to execution, which prevents runtime errors and maintains workflow integrity.
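The rule-based task router and the pre-execution validation step might look roughly like the sketch below. All names (`TableSchema`, `Action`, `route_task`, `validate_action`), the `max_classes` threshold, and the `"divide"` operation are hypothetical assumptions for illustration; the excerpt does not give the platform's actual rules, and the LLM-based action planner is not modelled here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class TableSchema:
    column_types: Dict[str, str]          # column name -> "numeric" or "categorical"
    label_column: Optional[str] = None    # None when the table has no label
    label_cardinality: int = 0            # number of distinct label values


@dataclass
class Action:
    op: str               # hypothetical operation name, e.g. "divide" or "add"
    inputs: List[str]     # column names the operation reads, in order


def route_task(schema: TableSchema, max_classes: int = 20) -> str:
    """Deterministic task-level router; the cardinality threshold is an assumption."""
    if schema.label_column is None:
        return "unsupervised"
    label_type = schema.column_types[schema.label_column]
    if label_type == "categorical" or schema.label_cardinality <= max_classes:
        return "classification"
    return "regression"


def validate_action(action: Action, schema: TableSchema,
                    columns_with_zeros: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the action may execute."""
    problems: List[str] = []
    for col in action.inputs:
        if col not in schema.column_types:            # schema alignment
            problems.append(f"schema: unknown column '{col}'")
        elif schema.column_types[col] != "numeric":   # type checking
            problems.append(f"type: column '{col}' is not numeric")
    if action.op == "divide" and len(action.inputs) == 2:
        denominator = action.inputs[1]
        if denominator in columns_with_zeros:         # logical consistency
            problems.append(f"logic: division by zero possible in '{denominator}'")
    return problems


schema = TableSchema(
    column_types={"age": "numeric", "visits": "numeric",
                  "city": "categorical", "churn": "categorical"},
    label_column="churn",
    label_cardinality=2,
)
print(route_task(schema))                                                   # classification
print(validate_action(Action("divide", ["age", "visits"]), schema, {"visits"}))
print(validate_action(Action("add", ["age", "visits"]), schema, {"visits"}))  # []
```

Returning a list of problems rather than raising on the first one lets a caller log every failed check before deciding whether to replan the action.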

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2511.06185

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.75)

Identifying Super Spreaders in Multilayer Networks

Czuba, Michał, Stolarski, Mateusz, Piróg, Adam, Bielak, Piotr, Bródka, Piotr

arXiv.org Artificial Intelligence | Oct-27-2025

Identifying super-spreaders can be framed as a subtask of the influence maximisation problem. It seeks to pinpoint agents within a network that, if selected as single diffusion seeds, disseminate information most effectively. Multilayer networks, a specific class of heterogeneous graphs, can capture diverse types of interactions (e.g., physical-virtual or professional-social), and thus offer a more accurate representation of complex relational structures. In this work, we introduce a novel approach to identifying super-spreaders in such networks by leveraging graph neural networks. To this end, we construct a dataset by simulating information diffusion across hundreds of networks - to the best of our knowledge, the first of its kind tailored specifically to multilayer networks. We further formulate the task as a variation of the ranking prediction problem based on a four-dimensional vector that quantifies each agent's spreading potential: (i) the number of activations; (ii) the duration of the diffusion process; (iii) the peak number of activations; and (iv) the simulation step at which this peak occurs. Our model, TopSpreadersNetwork, comprises a relationship-agnostic encoder and a custom aggregation layer. This design enables generalisation to previously unseen data and adapts to varying graph sizes. In an extensive evaluation, we compare our model against classic centrality-based heuristics and competitive deep learning methods. The results, obtained across a broad spectrum of real-world and synthetic multilayer networks, demonstrate that TopSpreadersNetwork achieves superior performance in identifying high-impact nodes, while also offering improved interpretability through its structured output.
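The four-dimensional spreading-potential vector can be sketched for a single seed as follows. The abstract does not name the diffusion model, so this example assumes a simplified deterministic cascade in which every newly active agent activates all of its neighbours on any layer in the next step. The function name `spreading_vector` and the toy network are hypothetical.

```python
from typing import Dict, List, Tuple

# A multilayer network as: layer name -> adjacency list (agent -> neighbours).
Layers = Dict[str, Dict[str, List[str]]]


def spreading_vector(layers: Layers, seed: str) -> Tuple[int, int, int, int]:
    """Return (activations, duration, peak activations, step of the peak) for one seed."""
    active = {seed}
    frontier = {seed}
    new_per_step: List[int] = []          # newly activated agents at each step
    while frontier:
        reached = set()
        for agent in frontier:
            for adjacency in layers.values():
                for neighbour in adjacency.get(agent, ()):
                    if neighbour not in active:
                        reached.add(neighbour)
        if not reached:
            break                          # diffusion has died out
        active |= reached
        new_per_step.append(len(reached))
        frontier = reached
    activations = len(active) - 1          # the seed itself is not counted
    duration = len(new_per_step)
    peak = max(new_per_step, default=0)
    peak_step = new_per_step.index(peak) + 1 if new_per_step else 0
    return activations, duration, peak, peak_step


layers: Layers = {
    "social": {"a": ["b", "c"], "b": ["a"], "c": ["a"]},
    "work": {"c": ["d"], "d": ["c", "e"], "e": ["d"]},
}
for seed in ("a", "c", "e"):
    print(seed, spreading_vector(layers, seed))
# a (4, 3, 2, 1)
# c (4, 2, 2, 1)
# e (4, 4, 1, 1)
```

All three seeds eventually reach the same four agents, but they differ in duration and peak, which is why a single activation count would not separate them.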

actor, artificial intelligence, machine learning, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.3233/FAIA251060

2505.2098

Country: Europe > Poland (0.28)

Genre: