AITopics

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

8cf04c64d1734e5f7e63418a2a4d49de-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems, Mar-27-2025, 11:28:51 GMT

artificial intelligence, machine learning, programming language, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre:

Instructional Material (0.69)
Research Report (0.46)

Industry:

Information Technology (0.67)
Education > Educational Technology (0.48)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Neural Information Processing Systems, Mar-27-2025, 11:28:43 GMT

Recent advancements in generation models have showcased remarkable capabilities in generating fantastic content. However, most of them are trained on proprietary high-quality data, and some models withhold their parameters and only provide accessible application programming interfaces (APIs), limiting their benefits for downstream tasks. To explore the feasibility of training a text-to-image generation model comparable to advanced models using publicly available resources, we introduce EvolveDirector. This framework interacts with advanced models through their public APIs to obtain text-image data pairs to train a base model. Our experiments with extensive data indicate that the model trained on generated data of the advanced model can approximate its generation capability.

artificial intelligence, arxiv preprint arxiv, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > Germany (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Media > Photography (0.92)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

aac02401755a65904cf977a33136af4a-Supplemental-Conference.pdf

Neural Information Processing Systems, Mar-27-2025, 11:28:35 GMT

Figure 7: Training loss, Adam variance norm/max element, and correlations between loss spikes and variance norm/max during GPT-2 pre-training (without the proposed method) under different model sizes, batch sizes (and LR), and sequence lengths.

A.1 Zoom in of Figure 1

Figure 7 zooms in on the first 30B tokens of Figure 1 in the main paper, where training is most unstable.

A.2 Learning rate decay for the proposed approach

As discussed in Section 5.1 (GPT-2 experiments) of the main paper, the proposed approach needs more training steps than the baseline to reach the same 157B training tokens. This makes it necessary to modify the learning rate decay schedule for the proposed approach. We first tried to increase the number of learning rate decay steps by half of the proposed approach's pacing function duration T (since the proposed approach needs roughly T/2 additional steps to reach 157B tokens).
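The T/2 figure quoted above can be checked with a little arithmetic. The sketch below assumes a linear pacing function and a fixed number of sequences per step; the function names and the example values (T = 1000, sequence lengths from 8 to 1024) are hypothetical and are not taken from the paper.

```python
def seq_len_at(step, duration_T, start_len, full_len):
    """Hypothetical linear pacing function: sequence length used at a given step."""
    if step >= duration_T:
        return float(full_len)
    return start_len + (full_len - start_len) * step / duration_T


def extra_steps_needed(duration_T, start_len, full_len):
    """Full-length steps needed to make up the tokens skipped during warmup.

    Assumes the number of sequences per step is constant, so the tokens seen
    in a step are proportional to the sequence length used in that step.
    """
    deficit = sum(
        full_len - seq_len_at(s, duration_T, start_len, full_len)
        for s in range(duration_T)
    )
    return deficit / full_len


T = 1000  # illustrative pacing duration, in steps
extra = extra_steps_needed(T, start_len=8, full_len=1024)
print(f"extra steps: {extra:.1f} (T/2 = {T / 2:.0f})")
```

With these illustrative values the estimate lands close to T/2. A different pacing shape or starting length would shift it, which is consistent with the text saying "roughly".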

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)

The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models

Neural Information Processing Systems, Mar-27-2025, 11:28:31 GMT

Recent works have demonstrated great success in pre-training large-scale autoregressive language models (e.g., GPT-3) on large numbers of GPUs. To reduce the wall-clock training time, a common practice is to increase the batch size and learning rate. However, this practice is often brittle and leads to a so-called stability-efficiency dilemma: increasing the batch size and learning rate improves training efficiency but can also cause training instability, leading to poor generalization accuracy or failed runs. To better understand this phenomenon, we conduct an in-depth analysis of large-scale pre-training experiments replicating the GPT-2 model using a public dataset. We find that there is a strong correlation between training instability and extreme values of gradient variance.
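One way to monitor the kind of signal described above is to track Adam's second-moment (variance) state during training and watch for extreme values. The sketch below is an illustration under stated assumptions, not the authors' code: `adam_variance_stats` is a hypothetical helper, the l1 norm is just one possible choice of norm, bias correction is omitted, and the toy gradients are made up.

```python
import numpy as np


def adam_variance_stats(grads, beta2=0.999):
    """Return (l1 norm, max element) of Adam's second-moment estimate per step.

    Uses the standard Adam update v = beta2 * v + (1 - beta2) * g**2,
    without bias correction, for a flat parameter vector.
    """
    v = np.zeros_like(grads[0], dtype=float)
    stats = []
    for g in grads:
        v = beta2 * v + (1.0 - beta2) * g ** 2
        stats.append((float(np.linalg.norm(v, 1)), float(v.max())))
    return stats


# Toy gradient stream: five quiet steps followed by one spiky step.
grads = [np.full(4, 0.1) for _ in range(5)]
grads.append(np.array([5.0, 0.1, 0.1, 0.1]))

stats = adam_variance_stats(grads)
for step, (norm, max_elem) in enumerate(stats):
    print(f"step {step}: norm={norm:.6f} max={max_elem:.6f}")
```

In this toy stream the max element jumps by orders of magnitude at the spiky step, which is the sort of extreme value one would want to log alongside the training loss.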

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Robot Policy Learning with Temporal Optimal Transport Reward

Neural Information Processing Systems, Mar-27-2025, 11:28:13 GMT

Reward specification is one of the trickiest problems in Reinforcement Learning, and it usually requires tedious hand engineering in practice. One promising approach to tackle this challenge is to adopt existing expert video demonstrations for policy learning. Some recent work investigates how to learn robot policies from only a single or a few expert video demonstrations. For example, reward labeling via Optimal Transport (OT) has been shown to be an effective strategy to generate a proxy reward by measuring the alignment between the robot trajectory and the expert demonstrations. However, previous work mostly overlooks that the OT reward is invariant to temporal order information, which could bring extra noise to the reward signal. To address this issue, in this paper, we introduce the Temporal Optimal Transport (TemporalOT) reward to incorporate temporal order information for learning a more accurate OT-based proxy reward. Extensive experiments on the Meta-world benchmark tasks validate the efficacy of the proposed method.
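The order-invariance issue noted above can be reproduced in a few lines. This is a minimal sketch of a generic entropic OT proxy reward using plain Sinkhorn iterations in NumPy, with cosine distance and random vectors standing in for frame features. It is not the TemporalOT method, and all function names and parameter values are hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, eps=0.05, n_iters=500):
    """Entropic OT plan between two uniform distributions (Sinkhorn iterations)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform mass over robot steps
    b = np.full(m, 1.0 / m)  # uniform mass over expert steps
    K = np.exp(-cost / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]


def ot_rewards(robot_feats, expert_feats):
    """Per-step proxy reward: negative transport cost charged to each robot step."""
    r = robot_feats / np.linalg.norm(robot_feats, axis=1, keepdims=True)
    e = expert_feats / np.linalg.norm(expert_feats, axis=1, keepdims=True)
    cost = 1.0 - r @ e.T  # cosine distance between every robot/expert pair
    plan = sinkhorn_plan(cost)
    return -(plan * cost).sum(axis=1)


rng = np.random.default_rng(0)
expert = rng.normal(size=(8, 4))  # stand-in expert frame features
robot = rng.normal(size=(8, 4))   # stand-in robot frame features
perm = rng.permutation(8)         # scramble the robot's temporal order

rewards = ot_rewards(robot, expert)
shuffled = ot_rewards(robot[perm], expert)
print(np.isclose(rewards.sum(), shuffled.sum()))
```

Shuffling the robot frames only permutes the per-step rewards and leaves the total unchanged, which is the insensitivity to temporal order that the abstract describes.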

demonstration, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Experiment on zero-shot classification

Neural Information Processing Systems, Mar-27-2025, 11:28:07 GMT

The top two rows show easy cases, while the bottom three rows present hard cases, including crowdedness, complex backgrounds, and tiny objects.

artificial intelligence, large language model, natural language, (18 more...)

Neural Information Processing Systems

Industry: Information Technology (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.57)

Species196: A One-Million Semi-supervised Dataset for Fine-grained Species Recognition

Wei He, Kai Han

Neural Information Processing Systems, Mar-27-2025, 11:28:05 GMT

The development of foundation vision models has pushed general visual recognition to a high level, but these models cannot adequately address fine-grained recognition in specialized domains such as invasive species classification. Identifying and managing invasive species has strong social and ecological value. Currently, most invasive species datasets are limited in scale and cover a narrow range of species, which restricts the development of deep-learning-based invasion biometrics systems. To fill this gap, we introduce Species196, a large-scale semi-supervised dataset of 196-category invasive species. It collects over 19K images with expert-level accurate annotations (Species196-L) and 1.2M unlabeled images of invasive species (Species196-U). The dataset provides four experimental settings for benchmarking existing models and algorithms, namely supervised learning, semi-supervised learning, self-supervised pretraining, and the zero-shot inference ability of large multimodal models. To facilitate future research on these four learning paradigms, we conduct an empirical study of representative methods on the introduced dataset. The dataset is publicly available at https://species-dataset.github.io/.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Switzerland (0.28)

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection

Neural Information Processing Systems, Mar-27-2025, 11:27:56 GMT

Large numbers of synthesized videos from diffusion models pose threats to information security and authenticity, leading to an increasing demand for generated content detection. However, existing video-level detection algorithms primarily focus on detecting facial forgeries and often fail to identify diffusion-generated content with a diverse range of semantics. To advance the field of video forensics, we propose an innovative algorithm named Multi-Modal Detection (MM-Det) for detecting diffusion-generated videos. MM-Det utilizes the profound perceptual and comprehensive abilities of Large Multi-modal Models (LMMs) by generating a Multi-Modal Forgery Representation (MMFR) from the LMM's multi-modal space, enhancing its ability to detect unseen forgery content.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Asia > China (0.14)
North America > Canada (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(5 more...)

Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution

Neural Information Processing Systems, Mar-27-2025, 11:27:47 GMT

Single-cell transcriptomics enabled the study of cellular heterogeneity in response to perturbations at the resolution of individual cells. However, scaling high-throughput screens (HTSs) to measure cellular responses for many drugs remains a challenge due to technical limitations and, more importantly, the cost of such multiplexed experiments. Thus, transferring information from routinely performed bulk RNA HTS is required to enrich single-cell data meaningfully. We introduce chemCPA, a new encoder-decoder architecture to study the perturbational effects of unseen drugs. We combine the model with an architecture surgery for transfer learning and demonstrate how training on existing bulk RNA HTS datasets can improve generalisation performance. Better generalisation reduces the need for extensive and costly screens at single-cell resolution. We envision that our proposed method will facilitate more efficient experiment designs through its ability to generate in-silico hypotheses, ultimately accelerating drug discovery.

artificial intelligence, machine learning, novel drug perturbation, (2 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.45)

8cf04c64d1734e5f7e63418a2a4d49de-Supplemental-Datasets_and_Benchmarks.pdf