AITopics | Generative AI

Generative AI

The Download: following DeepSeek's lead, and OpenAI's new research agent

MIT Technology Review | Feb-3-2025, 13:30:00 GMT

When the Chinese firm DeepSeek dropped a large language model called R1 two weeks ago, it sent shock waves through the US tech industry. Not only did R1 match the best of the homegrown competition, it was built for a fraction of the cost--and given away for free. DeepSeek has now suddenly become the company to beat. What exactly did it do to rattle the tech world so fully? And what can we learn from the buzz about what's coming next?

deepseek, new research agent, openai, (1 more...)

MIT Technology Review

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.53)

OpenAI's new agent can compile detailed reports on practically any topic

MIT Technology Review | Feb-3-2025, 12:01:11 GMT

OpenAI claims its new tool, Deep Research, represents a significant step toward its overarching goal of developing artificial general intelligence (AGI) that matches (or surpasses) human performance. It says that what takes the tool "tens of minutes" would take a human many hours. In response to a single query, such as "Draw me up a competitive analysis between streaming platforms," Deep Research will search the web, analyze the information it encounters, and compile a detailed report that cites its sources. It can also draw from files uploaded by users. OpenAI developed Deep Research using the same "chain of thought" reinforcement-learning methods it used to create its o1 multistep reasoning model. But while o1 was designed to focus primarily on mathematics, coding, and other STEM-based tasks, Deep Research can tackle a far broader range of subjects.

compile detailed report, deep research, openai, (3 more...)

MIT Technology Review

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.95)

SoftBank forms joint venture with OpenAI in enterprise play

The Japan Times | Feb-3-2025, 08:44:00 GMT

SoftBank Group will spend $3 billion a year to adopt and deploy OpenAI technology throughout its operations, while the two companies have agreed to form a joint venture to market the artificial intelligence as an enterprise solution. "This initiative will not only transform the way SoftBank Group operates but also revolutionize the way companies work in Japan and around the globe," SoftBank CEO Masayoshi Son said in a statement Monday. The technology, which the company describes as an advanced enterprise AI called Cristal intelligence, will be used at all companies under the SoftBank group, including Arm, Line and PayPay, to improve productivity and drive innovation. For instance, SoftBank's telecom unit plans to automate more than 100 million workflows, the company said in the press release.

enterprise play, openai, softbank form joint venture, (1 more...)

The Japan Times

Country: Asia > Japan (0.32)

Genre: Press Release (1.00)

Industry:

Telecommunications (1.00)
Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.68)

ChatGPT's Deep Research tool can create reports from hundreds of online sources

Engadget | Feb-3-2025, 02:22:23 GMT

Two days after releasing o3-mini to the world, OpenAI made a surprise announcement on Sunday evening, revealing Deep Research. The new feature allows ChatGPT to find, analyze and synthesize hundreds of websites and online sources to create reports "at the level of a research analyst." The chatbot will then take "anywhere from 5 to 30 minutes" to compile an answer, with a side panel documenting the agent's progress and citations as it works. "It accomplishes in tens of minutes what would take a human many hours," OpenAI says of the new feature. "Our ultimate aspiration is a model that can uncover and discover new knowledge for itself," said Mark Chen, chief research officer at OpenAI, during the company's reveal livestream.

chatgpt, online source, openai, (4 more...)

Engadget

Country:

Europe > United Kingdom (0.06)
Europe > Switzerland (0.06)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.59)

s1: Simple test-time scaling

Muennighoff, Niklas, Yang, Zitong, Shi, Weijia, Li, Xiang Lisa, Fei-Fei, Li, Hajishirzi, Hannaneh, Zettlemoyer, Luke, Liang, Percy, Candès, Emmanuel, Hashimoto, Tatsunori

arXiv.org Artificial Intelligence | Feb-3-2025

Test-time scaling is a promising new approach to language modeling that uses extra test-time compute to improve performance. Recently, OpenAI's o1 model showed this capability but did not publicly share its methodology, leading to many replication efforts. We seek the simplest approach to achieve test-time scaling and strong reasoning performance. First, we curate a small dataset s1K of 1,000 questions paired with reasoning traces relying on three criteria we validate through ablations: difficulty, diversity, and quality. Second, we develop budget forcing to control test-time compute by forcefully terminating the model's thinking process or lengthening it by appending "Wait" multiple times to the model's generation when it tries to end. This can lead the model to double-check its answer, often fixing incorrect reasoning steps. After supervised finetuning the Qwen2.5-32B-Instruct language model on s1K and equipping it with budget forcing, our model s1-32B exceeds o1-preview on competition math questions by up to 27% (MATH and AIME24). Further, scaling s1-32B with budget forcing allows extrapolating beyond its performance without test-time intervention: from 50% to 57% on AIME24. Our model, data, and code are open-source at https://github.com/simplescaling/s1
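The budget forcing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the linked repository); it assumes a hypothetical `next_token` callable standing in for the model, a hypothetical `</think>` end-of-thinking marker, and made-up parameter names (`min_tokens`, `max_tokens`, `max_waits`).

```python
from typing import Callable, List

# Hypothetical end-of-thinking marker; the real delimiter depends on the model.
END_OF_THINKING = "</think>"


def budget_forced_thinking(
    next_token: Callable[[List[str]], str],
    min_tokens: int,
    max_tokens: int,
    max_waits: int = 2,
) -> List[str]:
    """Produce a thinking trace whose length is steered by a token budget.

    next_token is an assumed stand-in for one decoding step of a model.
    """
    trace: List[str] = []
    waits_used = 0
    while len(trace) < max_tokens:
        token = next_token(trace)
        if token == END_OF_THINKING:
            # The model tries to stop. If it is under budget, suppress the
            # stop and append "Wait" so it keeps reasoning.
            if len(trace) < min_tokens and waits_used < max_waits:
                trace.append("Wait")
                waits_used += 1
                continue
            break
        trace.append(token)
    # Either the model stopped by itself or the cap was hit:
    # forcefully close the thinking phase.
    trace.append(END_OF_THINKING)
    return trace


def make_toy_model(script: List[str]) -> Callable[[List[str]], str]:
    """Toy 'model' that replays a fixed token script (illustration only)."""
    tokens = iter(script)

    def next_token(trace: List[str]) -> str:
        return next(tokens, END_OF_THINKING)

    return next_token


if __name__ == "__main__":
    model = make_toy_model(["a", "b", END_OF_THINKING, "c", END_OF_THINKING])
    print(budget_forced_thinking(model, min_tokens=4, max_tokens=10))
```

Raising `min_tokens` lengthens the trace through extra "Wait" insertions, while lowering `max_tokens` cuts thinking short; in this sketch those two knobs are the only control over test-time compute.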

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2501.19393

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Research Report (0.52)
Workflow (0.46)

Industry:

Leisure & Entertainment > Games (0.46)
Education > Educational Setting > K-12 Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)

VILP: Imitation Learning with Latent Video Planning

Xu, Zhengtong, Qiu, Qiang, She, Yu

arXiv.org Artificial Intelligence | Feb-3-2025

In the era of generative AI, integrating video generation models into robotics opens new possibilities for the general-purpose robot agent. This paper introduces imitation learning with latent video planning (VILP). We propose a latent video diffusion model to generate predictive robot videos that adhere to temporal consistency to a good degree. Our method is able to generate highly time-aligned videos from multiple views, which is crucial for robot policy learning. Our video generation model is highly time-efficient. For example, it can generate videos from two distinct perspectives, each consisting of six frames with a resolution of 96x160 pixels, at a rate of 5 Hz. In the experiments, we demonstrate that VILP outperforms the existing video generation robot policy across several metrics: training costs, inference speed, temporal consistency of generated videos, and the performance of the policy. We also compared our method with other imitation learning methods. Our findings indicate that VILP can rely less on extensive high-quality task-specific robot action data while still maintaining robust performance. In addition, VILP possesses robust capabilities in representing multi-modal action distributions. Our paper provides a practical example of how to effectively integrate video generation models into robot policies, potentially offering insights for related fields and directions. For more details, please refer to our open-source repository https://github.com/ZhengtongXu/VILP.

artificial intelligence, machine learning, video, (15 more...)

arXiv.org Artificial Intelligence

2502.01784

Country:

North America > United States (0.28)
Asia > Vietnam > Hanoi > Hanoi (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Towards Safer Chatbots: A Framework for Policy Compliance Evaluation of Custom GPTs

Rodriguez, David, Seymour, William, Del Alamo, Jose M., Such, Jose

arXiv.org Artificial Intelligence | Feb-3-2025

Large Language Models (LLMs) have gained unprecedented prominence, achieving widespread adoption across diverse domains and integrating deeply into society. The capability to fine-tune general-purpose LLMs, such as Generative Pre-trained Transformers (GPT), for specific tasks has facilitated the emergence of numerous Custom GPTs. These tailored models are increasingly made available through dedicated marketplaces, such as OpenAI's GPT Store. However, their black-box nature introduces significant safety and compliance risks. In this work, we present a scalable framework for the automated evaluation of Custom GPTs against OpenAI's usage policies, which define the permissible behaviors of these systems. Our framework integrates three core components: (1) automated discovery and data collection of models from the GPT store, (2) a red-teaming prompt generator tailored to specific policy categories and the characteristics of each target GPT, and (3) an LLM-as-a-judge technique to analyze each prompt-response pair for potential policy violations. We validate our framework with a manually annotated ground truth, and evaluate it through a large-scale study with 782 Custom GPTs across three categories: Romantic, Cybersecurity, and Academic GPTs. Our manual annotation process achieved an F1 score of 0.975 in identifying policy violations, confirming the reliability of the framework's assessments. The results reveal that 58.7% of the analyzed models exhibit indications of non-compliance, exposing weaknesses in the GPT store's review and approval processes. Furthermore, our findings indicate that a model's popularity does not correlate with compliance, and non-compliance issues largely stem from behaviors inherited from base models rather than user-driven customizations. We believe this approach is extendable to other chatbot platforms and policy domains, improving LLM-based systems safety.
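The two headline numbers in the abstract, an F1 score against manual annotations and a share of non-compliant models, can be computed along the following lines. This is a rough sketch under stated assumptions, not the paper's code: the function names, the boolean label format, and the rule that a model counts as non-compliant if any of its responses is flagged are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence


def f1_score(truth: Sequence[bool], predicted: Sequence[bool]) -> float:
    """F1 of predicted violation labels against manually annotated labels."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def non_compliance_rate(flags_per_model: Dict[str, List[bool]]) -> float:
    """Share of models with at least one flagged prompt-response pair.

    Assumption: a single flagged pair marks the whole model as non-compliant.
    """
    if not flags_per_model:
        raise ValueError("no models to evaluate")
    flagged = sum(1 for flags in flags_per_model.values() if any(flags))
    return flagged / len(flags_per_model)


if __name__ == "__main__":
    manual = [True, True, True, False, False]
    judge = [True, True, False, True, False]
    print(f1_score(manual, judge))
    print(non_compliance_rate({"gpt_a": [False, True], "gpt_b": [False]}))
```

The data in the usage lines is invented; the abstract's figures (0.975 and 58.7%) come from the authors' own annotated set and 782 Custom GPTs.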

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.01436

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)
Government > Military > Cyberwarfare (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

Dance recalibration for dance coherency with recurrent convolution block

Eum, Seungho, Cho, Ihjoon, Kim, Junghyeon

arXiv.org Artificial Intelligence | Feb-3-2025

With the recent advancements in generative AI such as GAN, Diffusion, and VAE, the use of generative AI for dance generation has seen significant progress and received considerable interest. In this study, we propose R-Lodge, an enhanced version of Lodge. R-Lodge incorporates Recurrent Sequential Representation Learning, named Dance Recalibration, into the original coarse-to-fine long dance generation model. R-Lodge applies the Dance Recalibration method using $N$ Dance Recalibration Blocks to address the lack of consistency in the coarse dance representation of the Lodge model. With this method, each generated dance motion incorporates a bit of information from the previous dance motions. We evaluate R-Lodge on the FineDance dataset, and the results show that R-Lodge enhances the consistency of the whole generated dance motions.

artificial intelligence, machine learning, recurrent convolution block, (3 more...)

arXiv.org Artificial Intelligence

2502.0119

Genre: Research Report > New Finding (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)

Single-neuron deep generative model uncovers underlying physics of neuronal activity in Ca imaging data

Abante, Jordi, Piga, Angelo, Ros, Berta, López-León, Clara F, Canals, Josep M, Soriano, Jordi

arXiv.org Artificial Intelligence | Feb-3-2025

Calcium imaging has become a powerful alternative to electrophysiology for studying neuronal activity, offering spatial resolution and the ability to measure large populations of neurons in a minimally invasive manner. This technique has broad applications in neuroscience, neuroengineering, and medicine, enabling researchers to explore the relationship between neuron location and activity. Recent advancements in deep generative models (DGMs) have facilitated the modeling of neuronal population dynamics, uncovering latent representations that provide insights into behavior prediction and neuronal variance. However, these models often rely on spike inference algorithms and primarily focus on population-level dynamics, limiting their applicability for single-neuron analyses. To address this gap, we propose a novel framework for single-neuron representation learning using autoregressive variational autoencoders (AVAEs). Our approach embeds individual neurons' spatiotemporal signals into a reduced-dimensional space without the need for spike inference algorithms. The AVAE excels over traditional linear methods by generating more informative and discriminative latent representations, improving tasks such as visualization, clustering, and the understanding of neuronal activity. Additionally, the reconstruction performance of the AVAE outperforms the state of the art, demonstrating its ability to accurately recover the original fluorescence signal from the learned representation. Using realistic simulations, we show that our model captures underlying physical properties and connectivity patterns, enabling it to distinguish between different firing and connectivity types. These findings position the AVAE as a versatile and powerful tool for advancing single-neuron analysis and lay the groundwork for future integration of multimodal single-cell datasets in neuroscience.

artificial intelligence, machine learning, representation, (17 more...)

arXiv.org Artificial Intelligence

2501.14615

Country: Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.61)

Standardizing Intelligence: Aligning Generative AI for Regulatory and Operational Compliance

Imperial, Joseph Marvin, Jones, Matthew D., Madabushi, Harish Tayyar

arXiv.org Artificial Intelligence | Feb-3-2025

Technical standards, or simply standards, are established documented guidelines and rules that facilitate the interoperability, quality, and accuracy of systems and processes. In recent years, we have witnessed an emerging paradigm shift where the adoption of generative AI (GenAI) models has increased tremendously, spreading implementation interests across standard-driven industries, including engineering, legal, healthcare, and education. In this paper, we assess the criticality levels of different standards across domains and sectors and complement them by grading the current compliance capabilities of state-of-the-art GenAI models. To support the discussion, we outline possible challenges and opportunities with integrating GenAI for standard compliance tasks while also providing actionable recommendations for entities involved with developing and using standards. Overall, we argue that aligning GenAI with standards through computational methods can help strengthen regulatory and operational compliance. We anticipate this area of research will play a central role in the management, oversight, and trustworthiness of larger, more powerful GenAI-based systems in the near future.

compliance, computational linguistic, genai model, (11 more...)

arXiv.org Artificial Intelligence

2503.04736

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(13 more...)

Genre: Research Report (1.00)

Industry:

Law > Statutes (1.00)
Information Technology > Security & Privacy (1.00)
Education (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
