AITopics | duplicate

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality

Neural Information Processing Systems, Jun-15-2026, 21:09:28 GMT

Data filtering has become a powerful tool for improving model performance while reducing computational cost. However, as large language model compute budgets continue to grow, the limited data volume provided by heavily filtered and deduplicated datasets will become a practical constraint. In efforts to better understand how to proceed, we study model performance at various compute budgets and across multiple pre-training datasets created through data filtering and deduplication. We find that, given appropriate modifications to the training recipe, repeating existing aggressively filtered datasets for up to ten epochs can outperform training on the ten times larger superset for a single epoch across multiple compute budget orders of magnitude. While this finding relies on repeating the dataset for many epochs, we also investigate repeats within these datasets at the document level. We find that not all documents within a dataset are equal, and we can create better datasets relative to a token budget by explicitly manipulating the counts of individual documents. We conclude by arguing that even as large language models scale, data filtering remains an important direction of research.
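As a rough illustration of what "manipulating the counts of individual documents" could look like, the sketch below assigns each document a repeat count under a fixed token budget. The `Doc` type, the quality scores in [0, 1], the `max_repeats` cap, and the greedy allocation rule are all assumptions made for illustration; the abstract does not describe the procedure actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    n_tokens: int   # assumed to be positive
    quality: float  # hypothetical quality score in [0, 1], higher is better


def allocate_counts(docs, token_budget, max_repeats=10):
    """Assign a repeat count to each document under a token budget.

    Documents are visited in descending quality order. Each one asks for a
    number of copies proportional to its score (capped at max_repeats) and
    receives as many of those copies as the remaining budget can pay for.
    """
    counts = {}
    remaining = token_budget
    for d in sorted(docs, key=lambda d: d.quality, reverse=True):
        wanted = max(0, min(max_repeats, round(d.quality * max_repeats)))
        affordable = remaining // d.n_tokens
        counts[d.doc_id] = min(wanted, affordable)
        remaining -= counts[d.doc_id] * d.n_tokens
    return counts


docs = [Doc("a", 100, 0.9), Doc("b", 100, 0.5), Doc("c", 100, 0.1)]
counts = allocate_counts(docs, token_budget=1200)
```

With these toy inputs the highest-scored document takes most of the budget and the lowest-scored one is dropped, which is the kind of per-document reweighting the abstract alludes to.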

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data Only (The Falcon LLM Team)

Neural Information Processing Systems, Apr-30-2026, 09:16:27 GMT

This curation process is believed to be necessary to produce performant models with broad zero-shot generalization abilities. However, as larger models requiring pretraining on trillions of tokens are considered, it is unclear how scalable curation is, and whether we will run out of unique high-quality data soon.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia (0.28)
North America > United States (0.28)

Genre:

Research Report (0.68)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Supplementary A: ViT 3B model

Neural Information Processing Systems, Apr-25-2026, 06:33:50 GMT

The ViT model we use in this work is based on a standard Vision Transformer [7] model scaled to nearly 3 billion parameters, using a patch size of 14, 16 heads, 64 blocks, an MLP dimension of 8192, and a hidden dimension of 2048. The model is defined and trained in Lingvo [32]; we additionally employ GSPMD [41] for training. The model is pre-trained on JFT-3B [35] using training settings that optimize for performance on JFT-3B rather than for fine-tuning on ImageNet; notably, we do not use the training recipe that helps few-shot transfer performance [44]. B Review tools: We include screenshots of the reviewing tools we built to analyze model mistakes. Figure 3 shows the UI for reviewing model predictions and Figure 4 shows the UI that displays the labeling guide and slide bar to browse images for a particular class.

artificial intelligence, machine learning, pred, (12 more...)

Neural Information Processing Systems

Country: North America (0.14)

Industry: Transportation > Ground (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

1abed6ee581b9ceb4e2ddf37822c7fcb-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 22:03:07 GMT

A.1 Graph-building strategies: The graphs were built using the IsayevNN class from the pymatgen [48] package. It implements the commonly used Voronoi tessellation to define neighbors. Two atoms are considered bonded if they share a face in the Voronoi tessellation of the supercell and their distance is less than the sum of the atomic Cordero radii (a measure of the atomic radius) plus a cutoff of 0.5 Å. This value of the cutoff was increased compared to [32] to reduce the number of disconnected graphs. We provide statistics for the graphs obtained by the method described in Section 5. A hard cutoff of 6 Å is also imposed on atomic distances. Figure 5: Histogram of the number of primitive cell sites per material in the processed Materials Project dataset.
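The bonding rule described above can be written as a small predicate. This is a minimal sketch of the stated criterion only, not the pymatgen implementation: the function name `is_bonded` and its parameters are hypothetical, the Voronoi face test is assumed to be computed elsewhere and passed in as a boolean, and treating the 6 Å limit as an upper bound that distances may equal is an assumption.

```python
def is_bonded(shares_voronoi_face, distance, radius_a, radius_b,
              tolerance=0.5, hard_cutoff=6.0):
    """Return True if two atoms count as bonded under the stated rule.

    All lengths are in angstroms. shares_voronoi_face must be computed
    elsewhere, for example from a Voronoi tessellation of the supercell.
    radius_a and radius_b stand in for the atomic (Cordero) radii.
    """
    if not shares_voronoi_face:
        return False
    if distance > hard_cutoff:
        return False
    # Bonded only if closer than the sum of radii plus the tolerance.
    return distance < radius_a + radius_b + tolerance


# Example radii chosen for illustration only.
close_pair = is_bonded(True, 1.5, 0.76, 0.66)   # 1.5 < 0.76 + 0.66 + 0.5
far_pair = is_bonded(True, 2.0, 0.76, 0.66)     # 2.0 is not below 1.92
```

Raising `tolerance` admits longer contacts as bonds, which is the lever the text describes for reducing the number of disconnected graphs.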

artificial intelligence, dataset, machine learning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.72)

Appendix

Neural Information Processing Systems, Apr-24-2026, 16:50:13 GMT

In this section we motivate the design choices and inductive biases that we encode into our neural encoder network e, which is the network used to model the relative accuracies of the weak supervision sources $\lambda$. Recall that we model the probability of a particular sample $x \in \mathcal{X}$ having the class label $y \in \mathcal{Y} = \{1, \dots, C\}$ as $P_\theta(y \mid \lambda) = \mathrm{softmax}(s)_y \, P(y)$ (4), with $s = \theta(\lambda, x)^\top \lambda \in \mathbb{R}^C$. Connection to prior PGM models: We now motivate this choice by deriving a less expressive variant of it from the standard Markov Random Field (MRF) used in the related work. If we view the attention scores $\theta(\lambda, x) \in \mathbb{R}^m$, which assign sample-dependent accuracies to each labeling function, as sample-independent parameters $\theta_1$ and thereby drop the features from the equation, as is done in the related work [30, 32, 19, 11], we can rewrite Eq. 4 as $\exp\big(\theta_1^\top \mathbb{1}\{\lambda = y\}\big)\, P$ … We can recognize $P_\theta$ as a distribution from the exponential family, and more specifically as a pairwise MRF, or factor graph, with canonical parameters $\theta = (\theta_1, \theta_2)$ and corresponding sufficient statistics, or factors, $\phi(\lambda, y) = (\phi_1(\lambda, y), \phi_2(\lambda))$, as well as the log partition function $Z_\theta$. The accuracy factors and parameters $\phi_1, \theta_1$ are the core component of this model and sometimes take the form $\phi_1(\lambda, y) = \lambda y$ in binary models, as in [30, 19, 11]. The label-independent factors $\phi_2(\lambda)$ have, as can be seen from the derivation above, no direct influence on the latent label posterior, but are often used to model labeling propensities $\mathbb{1}\{\lambda \neq 0\}$ and correlation dependencies $\mathbb{1}\{\lambda_i = \lambda_j\}$, which can be important for PGM parameter learning but are susceptible to misspecification [39, 11, 8].
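To make the sample-independent variant concrete, the sketch below computes a class posterior from labeling-function votes using fixed accuracy parameters. It is an illustration under assumptions rather than the authors' code: the function name `label_posterior` is hypothetical, abstentions are encoded as `None`, class indices are 0-based, and the prior-weighted softmax is renormalized so the output sums to one.

```python
import math


def label_posterior(votes, accuracies, prior):
    """Posterior over classes from weak-label votes.

    votes[j] is the class index (0..C-1) chosen by labeling function j, or
    None if it abstains. accuracies[j] plays the role of the accuracy
    parameter for labeling function j. prior[c] is the class prior P(y = c).
    """
    num_classes = len(prior)
    # Score for class c: sum of accuracies of the functions that voted c.
    scores = [0.0] * num_classes
    for vote, theta in zip(votes, accuracies):
        if vote is not None:
            scores[vote] += theta
    # Numerically stable softmax, weighted by the prior and renormalized.
    top = max(scores)
    weights = [math.exp(s - top) * p for s, p in zip(scores, prior)]
    total = sum(weights)
    return [w / total for w in weights]


posterior = label_posterior([0, 0, 1], [1.0, 1.0, 1.0], [0.5, 0.5])
```

Replacing the fixed `accuracies` with the output of a network that sees both the votes and the sample features recovers the sample-dependent form that the section motivates.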

artificial intelligence, experiment, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.66)

License of the assets

Neural Information Processing Systems, Apr-24-2026, 12:31:04 GMT

License for the code: We use the code for MS-TCN [13], ASRF [24], and LAS [9], all of which are under the MIT License according to https://opensource.org/licenses/MIT. For the Jigsaws [18] dataset, we follow the data use agreement according to https://cs.jhu. Action classification: Action classification is the task of identifying a single action, as opposed to a sequence of actions. Several methods use 2D CNNs to extract frame-wise features from an input video, which are then combined to predict a coarse action taking place in the video [56, 39, 59]. There also exist several works that perform action classification from kinematic data [2, 12]. Action segmentation: Action segmentation is the problem of segmenting an input stream of data, labeling each frame according to the action that is being carried out. Earlier methods for action segmentation employed hidden Markov models [33, 22]. More recently, convolutional neural networks [58, 26] and recurrent neural networks [50] have been applied to this problem. Inspired by the success of temporal convolutional networks (TCNs) in speech synthesis, [37] adapted these models to action segmentation. MS-TCN [13], which uses a multi-stage TCN architecture, has become one of the most widely used architectures for action segmentation. Although these methods achieve high frame-wise accuracy, they still produce a significant number of over-segmentation errors. To address this, several boundary-aware methods have been developed which perform temporal smoothing of the frame-wise predictions [57, 24]. These methods use ground-truth boundary information to train a binary classification network to perform boundary detection. The boundary estimates are then used to aggregate the frame-wise predictions either in a soft manner (boundary-aware pooling) or by setting a hard threshold. However, elemental actions, such as the functional primitives in the StrokeRehab dataset, have a very short duration. As a result, the boundaries between actions can be hard to detect or even hard to define (see Figure 4). Sequence-to-sequence models: Our proposed method is based on sequence-to-sequence (seq2seq) models. These models allow us to learn a mapping of a variable-length input sequence to a variable-length output sequence [53].

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Government (1.00)
Information Technology (0.93)
Law > Intellectual Property & Technology Law (0.46)
Health & Medicine > Therapeutic Area (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Intrinsic Self-Supervision for Data Quality Audits

Neural Information Processing Systems, Mar-21-2026, 23:27:06 GMT

Requests for name changes in the electronic proceedings will be accepted with no questions asked. However, name changes may cause bibliographic tracking issues. Authors are asked to consider this carefully and discuss it with their co-authors prior to requesting a name change in the electronic proceedings. Use the Report an Issue link to request a name change.

artificial intelligence, data quality, proceedings, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Quality (0.37)
Information Technology > Artificial Intelligence (0.37)

Appendix Contents 1 Introduction 1 2 Related Work 3 3 The D

Neural Information Processing Systems, Feb-19-2026, 12:00:59 GMT

Participants are not allowed to modify the training procedure.

artificial intelligence, machine learning, natural language, (23 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Europe > Austria > Styria > Graz (0.04)
(4 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Law (1.00)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
(7 more...)

Copycats

Neural Information Processing Systems, Feb-18-2026, 05:02:34 GMT

In the past, MI datasets were frequently proprietary, confined to particular institutions, and stored in private repositories. In this particular setting, there is a pressing need for alternative models of data sharing, documentation, and governance. Within this context, the emergence of Community-Contributed Platforms (CCPs) presented a potential for the public sharing of medical datasets.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: