AITopics

OpenAI releases impressive 4o image generator for free and paid users

PCWorld, Mar-27-2025, 14:43:07 GMT

Earlier this week, OpenAI released its "most advanced image generator yet" and made it available through ChatGPT using the GPT-4o model. ChatGPT previously relied on DALL-E to generate images. According to OpenAI, the improved 4o model can produce precise, accurate, and photorealistic results. The company claims that it is also particularly good at rendering text, following instructions precisely, and even understanding the context of a chat. These capabilities extend to transforming uploaded images or using them as visual inspiration.

large language model, machine learning, natural language, (8 more...)

PCWorld

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)


Off-policy estimation with adaptively collected data: the power of online learning

Neural Information Processing Systems, Mar-27-2025, 14:43:00 GMT

We consider estimation of a linear functional of the treatment effect from adaptively collected data. This problem finds a variety of applications including off-policy evaluation in contextual bandits, and estimation of the average treatment effect in causal inference. While a certain class of augmented inverse propensity weighting (AIPW) estimators enjoys desirable asymptotic properties including the semiparametric efficiency, much less is known about their non-asymptotic theory with adaptively collected data. To fill in the gap, we first present generic upper bounds on the mean-squared error of the class of AIPW estimators that crucially depends on a sequentially weighted error between the treatment effect and its estimates. Motivated by this, we propose a general reduction scheme that allows one to produce a sequence of estimates for the treatment effect via online learning to minimize the sequentially weighted estimation error. To illustrate this, we provide three concrete instantiations in (1) the tabular case; (2) the case of linear function approximation; and (3) the case of general function approximation for the outcome model. We then provide a local minimax lower bound to show the instance-dependent optimality of the AIPW estimator using no-regret online learning algorithms.
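As a rough illustration of the idea in this abstract, the sketch below computes an AIPW estimate of a target policy's value from adaptively collected bandit logs in the tabular case. It is not the paper's algorithm: the running-mean outcome model stands in for a generic online learner, and the names `aipw_value`, `logs`, `target_policy`, and `actions` are hypothetical. The logged propensities are assumed to be known and strictly positive.

```python
from collections import defaultdict


def aipw_value(logs, target_policy, actions):
    """AIPW estimate of a target policy's value (tabular sketch).

    logs: sequence of (context, action, reward, propensity) tuples in the
          order they were collected; propensity is the probability with
          which the logging policy chose the action at that round.
    target_policy: function (context, action) -> probability.
    actions: the finite set of actions.
    """
    reward_sum = defaultdict(float)  # running sum of rewards per (context, action)
    count = defaultdict(int)         # number of observations per (context, action)

    def mu_hat(ctx, act):
        # Outcome estimate built only from rounds before the current one
        # (assumption: running mean, 0.0 when the cell is still empty).
        c = count.get((ctx, act), 0)
        return reward_sum[(ctx, act)] / c if c else 0.0

    total = 0.0
    n = 0
    for x, a, r, e in logs:
        # Direct-method term: model-based value of the target policy at x.
        direct = sum(target_policy(x, b) * mu_hat(x, b) for b in actions)
        # Importance-weighted correction for the residual of the model.
        correction = target_policy(x, a) / e * (r - mu_hat(x, a))
        total += direct + correction
        n += 1
        # Update the outcome model only after it has been used for this round.
        reward_sum[(x, a)] += r
        count[(x, a)] += 1

    if n == 0:
        raise ValueError("logs must contain at least one round")
    return total / n


# Usage: two rounds in one context, target policy always plays action 0.
logs = [("x", 0, 1.0, 0.5), ("x", 1, 0.0, 0.5)]
always_zero = lambda ctx, act: 1.0 if act == 0 else 0.0
estimate = aipw_value(logs, always_zero, actions=[0, 1])
```

Updating the outcome model only after each round's term is computed keeps the estimate for round t dependent on past data alone, which is the property that lets the correction term remain unbiased under adaptive collection.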

artificial intelligence, estimator, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.93)

Industry: Education > Educational Setting > Online (0.90)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)


Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization

Neural Information Processing Systems, Mar-27-2025, 14:42:55 GMT

This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. Diffusion models have gained prominence for their effectiveness in high-fidelity image generation. While conventional approaches rely on convolutional U-Net architectures, recent Transformer-based designs have demonstrated superior performance and scalability.

artificial intelligence, diffusion model, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)


9adc8ada9183f4b9a007a02773fd8114-Paper-Conference.pdf

Neural Information Processing Systems, Mar-27-2025, 14:42:54 GMT

artificial intelligence, excess risk, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)


DiffuPac: Contextual Mimicry in Adversarial Packets Generation via Diffusion Model

Neural Information Processing Systems, Mar-27-2025, 14:42:40 GMT

In the domain of cybersecurity, recent advancements in Machine Learning (ML) and Deep Learning (DL) have significantly enhanced Network Intrusion Detection Systems (NIDS), improving the effectiveness of cybersecurity operations. However, attackers have also leveraged ML/DL to develop sophisticated models that generate adversarial packets capable of evading NIDS detection. Consequently, defenders must study and analyze these models to prepare for the evasion attacks that exploit NIDS detection mechanisms. Unfortunately, conventional generation models often rely on unrealistic assumptions about attackers' knowledge of NIDS components, making them impractical for real-world scenarios. To address this issue, we present DiffuPac, a first-of-its-kind generation model designed to generate adversarial packets that evade detection without relying on specific NIDS components. DiffuPac integrates a pre-trained Bidirectional Encoder Representations from Transformers (BERT) with a diffusion model, which, through its capability for conditional denoising and classifier-free guidance, effectively addresses the real-world constraint of limited attacker knowledge. By concatenating malicious packets with contextually relevant normal packets and applying targeted noising only to the malicious packets, DiffuPac seamlessly blends adversarial packets into genuine network traffic. Through evaluations on real-world datasets, we demonstrate that DiffuPac achieves strong evasion capabilities against sophisticated NIDS, outperforming conventional methods by an average of 6.69 percentage points, while preserving the functionality and practicality of the generated adversarial packets.

artificial intelligence, machine learning, packet, (21 more...)

Neural Information Processing Systems

Country:

Asia > Japan (0.14)
North America > United States (0.14)
Europe > France (0.14)

Genre:

Research Report > Experimental Study (0.93)
Workflow (0.67)
Research Report > Promising Solution (0.67)
Overview (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


9ab8da29b1eb3bec912a06e0879065cd-Paper-Conference.pdf

Neural Information Processing Systems, Mar-27-2025, 14:42:25 GMT

artificial intelligence, interaction, machine learning, (15 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)


A Broader Impact, Limitations and Future Work

Neural Information Processing Systems, Mar-27-2025, 14:42:19 GMT

Figure 8: Visualisation of programmatically generated captions for Shapes3D [19] (right) and DSprites [115] (left, black and white). Chosen at random, some captions are complete with exact details, while some only have more generic descriptors. Caption style leverages templates generated by GPT-4. The default resolution of these images is 64 × 64, hence the low-resolution appearance.

caption, large language model, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)


A Practitioner's Guide to Continual Multimodal Pretraining (Karsten Roth, Sebastian Dziadzio)

Neural Information Processing Systems, Mar-27-2025, 14:42:16 GMT

Multimodal foundation models serve numerous applications at the intersection of vision and language. Still, despite being pretrained on extensive data, they become outdated over time. To keep models updated, research into continual pretraining mainly explores scenarios with either (1) infrequent, indiscriminate updates on large-scale new data, or (2) frequent, sample-level updates. However, practical model deployment often operates in the gap between these two limit cases, as real-world applications demand adaptation to specific subdomains, tasks or concepts -- spread over the entire, varying life cycle of a model. In this work, we complement current perspectives on continual pretraining through a research test bed and offer comprehensive guidance for effective continual model updates in such scenarios. We first introduce FoMo-in-Flux, a continual multimodal pretraining benchmark with realistic compute constraints and practical deployment requirements, constructed over 63 datasets with diverse visual and semantic coverage. Using FoMo-in-Flux, we explore the complex landscape of practical continual pretraining through multiple perspectives: (1) data mixtures and stream orderings that emulate real-world deployment settings, (2) methods ranging from simple fine-tuning and traditional continual learning strategies to parameter-efficient updates and model merging, (3) meta-learning-rate schedules and mechanistic design choices, and (4) model and compute scaling. Together, our insights provide a practitioner's guide to continual multimodal pretraining for real-world deployment.

caption, large language model, machine learning, (16 more...)

Neural Information Processing Systems

Country: