The Splendors and Miseries of Heavisidisation
Machine Learning (ML) is applicable to scientific problems, i.e. to those which have a well-defined answer, only if this answer can be brought to a peculiar form ${\cal G}: X\longrightarrow Z$ with ${\cal G}(\vec x)$ expressed as a combination of iterated Heaviside functions. At present it is far from obvious if and when such representations exist, what the obstacles are and, if they are absent, how the known formulas can be converted into this form. This gives rise to a program of reformulating ordinary science in such terms, which sounds like a strong enhancement of the constructive mathematics approach, only this time it concerns all natural sciences. We describe the first steps on this long way.
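To make the target representation concrete, here is a minimal sketch (ours, not the paper's) of how an ordinary function such as $\sin(x)$ can be approximated on an interval by a finite combination of Heaviside functions: a staircase of weighted, shifted step functions. The grid size and target function are illustrative choices.

```python
import numpy as np

# Staircase approximation of f on [a, b] as a finite combination of
# Heaviside functions: f(x) ≈ f(x_0) + sum_k (f(x_k) - f(x_{k-1})) H(x - x_k).
def heaviside_approx(f, a, b, n_steps):
    xs = np.linspace(a, b, n_steps + 1)
    vals = f(xs)
    jumps = np.diff(vals)                        # height of each step
    def g(x):
        x = np.asarray(x, dtype=float)
        # np.heaviside(t, 1.0) is 0 for t < 0 and 1 for t >= 0
        steps = np.heaviside(x[..., None] - xs[1:], 1.0)
        return vals[0] + steps @ jumps
    return g

g = heaviside_approx(np.sin, 0.0, np.pi, n_steps=100)
x = np.linspace(0.0, np.pi, 7)
print(np.max(np.abs(g(x) - np.sin(x))))          # error shrinks as n_steps grows
```

Such staircases converge as the grid is refined; the question the abstract raises is the harder one of when exact or efficient Heaviside forms exist for known scientific formulas.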
SCAR: A Characterization Scheme for Multi-Modal Dataset
Su, Ri, Chen, Zhao, Cao, Caleb Chen, Tang, Nan, Chen, Lei
Foundation models exhibit remarkable generalization across diverse tasks, largely driven by the characteristics of their training data. Recent data-centric methods like pruning and compression aim to optimize training but offer limited theoretical insight into how data properties affect generalization, especially how data characteristics behave under sample scaling. Traditional perspectives further constrain progress by focusing predominantly on data quantity and training efficiency, often overlooking structural aspects of data quality. In this study, we introduce SCAR, a principled scheme for characterizing the intrinsic structural properties of datasets across four key measures: Scale, Coverage, Authenticity, and Richness. Unlike prior data-centric measures, SCAR captures stable characteristics that remain invariant under dataset scaling, providing a robust and general foundation for data understanding. Leveraging these structural properties, we introduce Foundation Data: a minimal subset that preserves the generalization behavior of the full dataset without requiring model-specific retraining. We model single-modality tasks as step functions and estimate the distribution of the foundation data size to capture step-wise generalization bias across modalities in the target multi-modal dataset. Finally, we develop a SCAR-guided data completion strategy based on this generalization bias, which enables efficient, modality-aware expansion of modality-specific characteristics in multi-modal datasets. Experiments across diverse multi-modal datasets and model architectures validate the effectiveness of SCAR in predicting data utility and guiding data acquisition. Code is available at https://github.com/McAloma/SCAR.
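As a toy illustration of the step-function view of generalization versus data size (our sketch, not the paper's algorithm; the function name `foundation_data_size` and the tolerance parameter are hypothetical), one can estimate a foundation-data size as the smallest sample size whose accuracy already reaches the full-dataset plateau:

```python
import numpy as np

# Hypothetical illustration: model accuracy as a step function of
# training-set size and take the "foundation data size" to be the smallest
# size whose accuracy is within `tol` of the full-dataset plateau.
def foundation_data_size(sizes, accs, tol=0.01):
    sizes, accs = np.asarray(sizes), np.asarray(accs)
    order = np.argsort(sizes)
    sizes, accs = sizes[order], accs[order]
    plateau = accs[-1]                      # accuracy of the full dataset
    ok = accs >= plateau - tol              # sizes that already generalize
    return sizes[np.argmax(ok)]             # first size reaching the plateau

# Toy accuracy-vs-scale curve with a step-wise rise.
sizes = [1_000, 5_000, 10_000, 50_000, 100_000]
accs  = [0.55,  0.71,  0.80,   0.81,   0.81]
print(foundation_data_size(sizes, accs))    # -> 10000
```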
Three Things to Know about Deep Metric Learning
Patel, Yash, Tolias, Giorgos, Matas, Jiri
This paper addresses supervised deep metric learning for open-set image retrieval, focusing on three key aspects: the loss function, mixup regularization, and model initialization. In deep metric learning, optimizing the retrieval evaluation metric, recall@k, via gradient descent is desirable but challenging due to its non-differentiable nature. To overcome this, we propose a differentiable surrogate loss that is computed on large batches, nearly equivalent to the entire training set. This computationally intensive process is made feasible through an implementation that bypasses GPU memory limitations. Additionally, we introduce an efficient mixup regularization technique that operates on pairwise scalar similarities, effectively increasing the batch size even further. The training process is further enhanced by initializing the vision encoder with foundation models pre-trained on large-scale datasets. Through a systematic study of these components, we demonstrate that their synergy enables large models to nearly solve popular benchmarks.
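As a rough sketch of the general idea behind a differentiable recall@k surrogate (a sigmoid relaxation of the hard ranking indicator; the paper's exact formulation, temperatures, and batching scheme may differ):

```python
import torch

# Minimal sketch: replace the hard ranking step with a sigmoid so that
# gradients flow through the rank computation.
def smooth_recall_at_k(sim, pos_mask, k, temp=0.1):
    # diff[b, i, j] = sim[b, i] - sim[b, j]
    diff = sim.unsqueeze(2) - sim.unsqueeze(1)
    # Soft rank of item i: soft count of items j scoring at least as high.
    rank = torch.sigmoid(-diff / temp).sum(dim=2)
    in_top_k = torch.sigmoid((k - rank) / temp)      # soft [rank <= k]
    # Recall@k: fraction of queries with >= 1 positive among the top k.
    hit = (in_top_k * pos_mask).amax(dim=1)
    return hit.mean()

sim = torch.randn(8, 100, requires_grad=True)        # 8 queries x 100 items
pos_mask = (torch.rand(8, 100) > 0.9).float()        # toy relevance labels
loss = 1.0 - smooth_recall_at_k(sim, pos_mask, k=10)
loss.backward()                                      # gradients reach `sim`
```

When the batch approaches the full training set, the O(N^2) similarity matrix no longer fits in memory at once, so a practical implementation would compute it in chunks; that is the kind of memory workaround the abstract alludes to.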
Bayesian Dyadic Trees and Histograms for Regression
Stéphanie van der Pas, Veronika Rockova
Many machine learning tools for regression are based on recursive partitioning of the covariate space into smaller regions, where the regression function can be estimated locally. Among these, regression trees and their ensembles have demonstrated impressive empirical performance. In this work, we shed light on the machinery behind Bayesian variants of these methods. In particular, we study Bayesian regression histograms, such as Bayesian dyadic trees, in the simple regression case with just one predictor. We focus on the reconstruction of regression surfaces that are piecewise constant, where the number of jumps is unknown.
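To convey the flavor of such a Bayesian regression histogram in the one-predictor case, here is an illustrative sketch (ours, not the paper's construction): a dyadic histogram on $[0,1)$ with a conjugate normal model per bin and a prior on the partition depth, whose posterior adapts to the unknown number of jumps.

```python
import numpy as np

# Illustrative sketch: depth d gives 2^d equal bins, each bin mean has a
# N(0, tau^2) prior, the noise sd `sigma` is known, and the depth prior is
# p(d) ∝ 2^(-d). The posterior over d adapts to the number of jumps.
def log_marginal(x, y, d, sigma=0.1, tau=1.0):
    logm = 0.0
    for k in range(2 ** d):
        yk = y[(x >= k / 2**d) & (x < (k + 1) / 2**d)]
        n = len(yk)
        if n == 0:
            continue
        s, q = yk.sum(), (yk ** 2).sum()
        # log N(y_k; 0, sigma^2 I + tau^2 11^T) in closed form
        quad = q / sigma**2 - tau**2 * s**2 / (sigma**2 * (sigma**2 + n * tau**2))
        logdet = (n - 1) * np.log(sigma**2) + np.log(sigma**2 + n * tau**2)
        logm += -0.5 * (n * np.log(2 * np.pi) + logdet + quad)
    return logm

rng = np.random.default_rng(0)
x = rng.uniform(size=500)
# Piecewise-constant truth with jumps at 0.25 and 0.5 (dyadic at depth 2).
y = np.where(x < 0.25, 1.0, np.where(x < 0.5, -1.0, 0.5)) + 0.1 * rng.normal(size=500)

depths = np.arange(6)
scores = np.array([log_marginal(x, y, d) for d in depths]) - depths * np.log(2)
post = np.exp(scores - scores.max()); post /= post.sum()
print(dict(zip(depths.tolist(), post.round(3))))   # mass concentrates at d = 2
```

The marginal likelihood automatically penalizes partitions finer than the truth, which is the adaptivity phenomenon the abstract studies.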
Bayes' Power for Explaining In-Context Learning Generalizations
Müller, Samuel, Hollmann, Noah, Hutter, Frank
Traditionally, neural network training has been primarily viewed as an approximation of maximum likelihood estimation (MLE). This interpretation originated in a time when training for multiple epochs on small datasets was common and performance was data-bound; but it falls short in the era of large-scale single-epoch training ushered in by large self-supervised setups, like language models. In this new setup, performance is compute-bound, but data is readily available. As models became more powerful, in-context learning (ICL), i.e., learning in a single forward pass based on the context, emerged as one of the dominant paradigms. In this paper, we argue that a more useful interpretation of neural network behavior in this era is as an approximation of the true posterior, as defined by the data-generating process. We demonstrate this interpretation's power for ICL and its usefulness in predicting generalization to previously unseen tasks. We show how models become robust in-context learners by effectively composing knowledge from their training data. We illustrate this with experiments that reveal surprising generalizations, all explicable through the exact posterior. Finally, we show the inherent constraints of the generalization capabilities of posteriors and the limitations of neural networks in approximating these posteriors.
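A toy example makes the posterior view concrete (our illustration, not the paper's experimental setup): for a simple data-generating process the exact posterior predictive can be computed in closed form, and it is this quantity that an in-context learner trained on samples from the process should approximate in a single forward pass.

```python
import numpy as np

# Toy data-generating process: a latent task theta ~ Uniform{0.2, 0.5, 0.8}
# generates i.i.d. coin flips. The exact posterior predictive below is the
# target an in-context learner trained on this process should approximate.
thetas = np.array([0.2, 0.5, 0.8])
prior = np.full(3, 1 / 3)

def posterior_predictive(context):
    """P(next flip = 1 | context), computed exactly via Bayes' rule."""
    k, n = sum(context), len(context)
    lik = thetas**k * (1 - thetas) ** (n - k)   # likelihood of the context
    post = prior * lik
    post /= post.sum()                          # posterior over latent tasks
    return float(post @ thetas)                 # marginalize the next flip

print(posterior_predictive([1, 1, 1, 0, 1]))    # -> ~0.71, pulled toward 0.8
```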