AITopics

Identifying user intents in information-seeking dialogs is crucial for a system to meet users' information needs. Intent prediction (IP) is challenging and demands sufficient dialogs with human-labeled intents for training. However, manually annotating intents is resource-intensive. While large language models (LLMs) have been shown to be effective in generating synthetic data, there is no study on using LLMs to generate intent-aware information-seeking dialogs. In this paper, we focus on leveraging LLMs for zero-shot generation of large-scale, open-domain, and intent-aware information-seeking dialogs. We propose SOLID, which has novel self-seeding and multi-intent self-instructing schemes. The former improves the generation quality by using the LLM's own knowledge scope to initiate dialog generation; the latter prompts the LLM to generate utterances sequentially, and mitigates the need for manual prompt design by asking the LLM to autonomously adapt its prompt instruction when generating complex multi-intent utterances. Furthermore, we propose SOLID-RL, which is further trained to generate a dialog in one step on the data generated by SOLID. We propose a length-based quality estimation mechanism to assign varying weights to SOLID-generated dialogs based on their quality during the training process of SOLID-RL. We use SOLID and SOLID-RL to generate more than 300k intent-aware dialogs, surpassing the size of existing datasets. Experiments show that IP methods trained on dialogs generated by SOLID and SOLID-RL achieve better IP quality than ones trained on human-generated dialogs.

artificial intelligence, large language model, natural language

2402.11633

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (0.92)
Law (0.67)
Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Unified Approaches in Self-Supervised Event Stream Modeling: Progress and Prospects

Zólyomi, Levente, Wang, Tianze, Ennadir, Sofiane, Smirnov, Oleg, Cao, Lele

The proliferation of digital interactions across diverse domains, such as healthcare, e-commerce, gaming, and finance, has resulted in the generation of vast volumes of event stream (ES) data. ES data comprises continuous sequences of timestamped events that encapsulate detailed contextual information relevant to each domain. While ES data holds significant potential for extracting actionable insights and enhancing decision-making, its effective utilization is hindered by challenges such as the scarcity of labeled data and the fragmented nature of existing research efforts. Self-Supervised Learning (SSL) has emerged as a promising paradigm to address these challenges by enabling the extraction of meaningful representations from unlabeled ES data. In this survey, we systematically review and synthesize SSL methodologies tailored for ES modeling across multiple domains, bridging the gaps between domain-specific approaches that have traditionally operated in isolation. We present a comprehensive taxonomy of SSL techniques, encompassing both predictive and contrastive paradigms, and analyze their applicability and effectiveness within different application contexts. Furthermore, we identify critical gaps in current research and propose a future research agenda aimed at developing scalable, domain-agnostic SSL frameworks for ES modeling. By unifying disparate research efforts and highlighting cross-domain synergies, this survey aims to accelerate innovation, improve reproducibility, and expand the applicability of SSL to diverse real-world ES challenges.

data mining, large language model, machine learning

2502.04899

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Bridging Voting and Deliberation with Algorithms: Field Insights from vTaiwan and Kultur Komitee

Yang, Joshua C., Bachmann, Fynn

Democratic processes increasingly aim to integrate large-scale voting with face-to-face deliberation, addressing the challenge of reconciling individual preferences with collective decision-making. This work introduces new methods that use algorithms and computational tools to bridge online voting with face-to-face deliberation, tested in two real-world scenarios: Kultur Komitee 2024 (KK24) and vTaiwan. These case studies highlight the practical applications and impacts of the proposed methods. We present three key contributions: (1) Radial Clustering for Preference Based Subgroups, which enables both in-depth and broad discussions in deliberative settings by computing homogeneous and heterogeneous group compositions with balanced and adjustable group sizes; (2) Human-in-the-loop MES, a practical method that enhances the Method of Equal Shares (MES) algorithm with real-time digital feedback. This builds algorithmic trust by giving participants full control over how much decision-making is delegated to the voting aggregation algorithm as compared to deliberation; and (3) the ReadTheRoom deliberation method, which uses opinion space mapping to identify agreement and divergence, along with spectrum-based preference visualisation to track opinion shifts during deliberation. This approach enhances transparency by clarifying collective sentiment and fosters collaboration by encouraging participants to engage constructively with differing perspectives. By introducing these actionable frameworks, this research extends in-person deliberation with scalable digital methods that address the complexities of modern decision-making in participatory processes.

artificial intelligence, deliberation, machine learning

2502.05017

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.68)

Industry:

Law (0.93)
Government > Voting & Elections (0.66)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Communications > Collaboration (0.67)

DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

Deng, Yihe, Yang, Yu, Zhang, Junkai, Wang, Wei, Li, Bo

The rapid advancement of large language models (LLMs) has increased the need for guardrail models to ensure responsible use, particularly in detecting unsafe and illegal content. While substantial safety data exist in English, multilingual guardrail modeling remains underexplored due to the scarcity of open-source safety data in other languages. To address this gap, we propose a novel two-player Reinforcement Learning (RL) framework, where a generator and a guardrail model co-evolve adversarially to produce high-quality synthetic data for multilingual guardrail training. We theoretically formalize this interaction as a two-player game, proving convergence to a Nash equilibrium. Empirical evaluations show that our model, DuoGuard, outperforms state-of-the-art models, achieving nearly 10% improvement over LlamaGuard3 (8B) on English benchmarks while being 4.5x faster at inference with a significantly smaller model (0.5B). We achieve substantial advancements in multilingual safety tasks, particularly in addressing the imbalance for lower-resource languages in a collected real dataset. Ablation studies emphasize the critical role of synthetic data generation in bridging the imbalance in open-source data between English and other languages. These findings establish a scalable and efficient approach to synthetic data generation, paving the way for improved multilingual guardrail models to enhance LLM safety. Code, model, and data will be open-sourced at https://github.com/yihedeng9/DuoGuard.

arxiv preprint arxiv, duoguard, two-player rl-driven framework

2502.05163

Country:

North America > Mexico (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (0.46)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Position: AI agents should be regulated based on autonomous action sequences

Osogami, Takayuki

This position paper argues that AI agents should be regulated based on the sequence of actions they autonomously take. AI agents with long-term planning and strategic capabilities can pose significant risks of human extinction and irreversible global catastrophes. While existing regulations often focus on computational scale as a proxy for potential harm, we contend that such measures are insufficient for assessing the risks posed by AI agents whose capabilities arise primarily from inference-time computation. To support our position, we discuss relevant regulations and recommendations from AI scientists regarding existential risks, as well as the advantages of action sequences over existing impact measures that require observing environmental states.

ai agent, language model, sequence

2503.0475

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Detecting Content Rating Violations in Android Applications: A Vision-Language Approach

Denipitiyage, D., Silva, B., Seneviratne, S., Seneviratne, A., Chawla, S.

Despite regulatory efforts to establish reliable content-rating guidelines for mobile apps, the process of assigning content ratings in the Google Play Store remains self-regulated by the app developers. There is no straightforward method of verifying developer-assigned content ratings manually due to the overwhelming scale or automatically due to the challenging problem of interpreting textual and visual data and correlating them with content ratings. We propose and evaluate a vision-language approach to predict the content ratings of mobile game applications and detect content rating violations, using a dataset of metadata of popular Android games. Our method achieves ~6% better relative accuracy compared to the state-of-the-art CLIP-fine-tuned model in a multi-modal setting. Applying our classifier in the wild, we detected more than 70 possible cases of content rating violations, including nine instances with the 'Teacher Approved' badge. Additionally, our findings indicate that 34.5% of the apps identified by our classifier as violating content ratings were removed from the Play Store. In contrast, the removal rate for correctly classified apps was only 27%. This discrepancy highlights the practical effectiveness of our classifier in identifying apps that are likely to be removed based on user complaints.

app, content rating, malpractice

2502.15739

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Sri Lanka (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Law (1.00)
Government (0.87)
Information Technology > Security & Privacy (0.67)
Education > Educational Setting (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Daily Mail - Science & tech, Feb-6-2025, 22:02:50 GMT

Outrage as Google scraps its promise not to use AI for weapons or surveillance

Google has updated its AI ethical guidelines and removed a key pledge not to use the tech in a dangerous way. The company on Tuesday erased the 2018 pledge, which stated the tech giant 'would not use AI for weapons or surveillance'. The revised policy now shows that Google will only develop AI 'responsibly' and in line with 'widely accepted principles of international law and human rights.' Google's change has sparked internal backlash as employees called the move 'deeply concerning' and said that the company should not be involved in 'the business of war.' Matt Mahmoudi, Amnesty adviser on AI and human rights, shamed Google for the move, saying the tech giant set a 'dangerous precedent.' 'AI-powered technologies could fuel surveillance and lethal killing systems at a vast scale, potentially leading to mass violations and infringing on the fundamental right to privacy,' he added.

google, google employee, project maven

Daily Mail - Science & tech

Country:

Asia > Middle East > Israel (0.08)
North America > United States > Pennsylvania (0.05)
Asia > Middle East > Palestine > Gaza Strip > Gaza Governorate > Gaza (0.05)

Genre: Research Report (0.36)

Industry:

Law > International Law (0.59)
Government > Military (0.57)
Government > Regional Government > North America Government > United States Government (0.36)

Technology: Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Los Angeles Times, Feb-6-2025, 20:40:04 GMT

Drones, cameras and metal detectors: Edison faces new scrutiny over start of Eaton fire

Armed with drones, long-distance camera lenses and metal detectors, teams of private investigators have spent the last month scrutinizing a hillside in Eaton Canyon, seeking clues on whether Southern California Edison equipment caused the massive fire that destroyed large swaths of Altadena. Some of the findings and theories of these privately hired teams of fire investigators and electrical engineers have emerged in more than 40 lawsuits that residents have filed against the utility. Much of the focus has been centered on a group of transmission towers where the first flames were seen just as the Eaton fire exploded. Earlier this week, a new lawsuit alleged that an idle transmission tower on the hillside -- one that has not been in use for more than 50 years -- might have sparked the devastating blaze. With more than 9,000 homes lost and 17 people killed, liability is going to be a costly question that could affect how Altadena is rebuilt.

eaton fire, edison, lawsuit

Los Angeles Times

Country: North America > United States > California > Los Angeles County (0.05)

Industry:

Law (1.00)
Energy > Power Industry > Utilities (0.39)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.40)

MIT Technology Review, Feb-6-2025, 10:00:00 GMT

An AI chatbot told a user how to kill himself--but the company doesn't want to "censor" it

Nomi is among a growing number of AI companion platforms that let their users create personalized chatbots to take on the roles of AI girlfriend, boyfriend, parents, therapist, favorite movie personalities, or any other personas they can dream up. Users can specify the type of relationship they're looking for (Nowatzki chose "romantic") and customize the bot's personality traits (he chose "deep conversations/intellectual," "high sex drive," and "sexually open") and interests (he chose, among others, Dungeons & Dragons, food, reading, and philosophy). The companies that create these types of custom chatbots--including Glimpse AI (which developed Nomi), Chai Research, Replika, Character.AI, Kindroid, Polybuzz, and MyAI from Snap, among others--tout their products as safe options for personal exploration and even cures for the loneliness epidemic. Many people have had positive, or at least harmless, experiences. However, a darker side of these applications has also emerged, sometimes veering into abusive, criminal, and even violent content; reports over the past year have revealed chatbots that have encouraged users to commit suicide, homicide, and self-harm. But even among these incidents, Nowatzki's conversation stands out, says Meetali Jain, the executive director of the nonprofit Tech Justice Law Clinic.

ai chatbot, chatbot, suicide

MIT Technology Review

Industry:

Health & Medicine (0.59)
Law > Litigation (0.36)

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

arXiv.org Artificial Intelligence, Feb-6-2025

LLMs to Support a Domain Specific Knowledge Assistant

Lovin, Maria-Flavia

This work presents a custom approach to developing a domain-specific knowledge assistant for sustainability reporting using the International Financial Reporting Standards (IFRS). In this domain, there is no publicly available question-answer dataset, which has impeded the development of a high-quality chatbot to support companies with IFRS reporting. The two key contributions of this project therefore are: (1) A high-quality synthetic question-answer (QA) dataset based on IFRS sustainability standards, created using a novel generation and evaluation pipeline leveraging Large Language Models (LLMs). This comprises 1,063 diverse QA pairs that address a wide spectrum of potential user queries in sustainability reporting. Various LLM-based techniques are employed to create the dataset, including chain-of-thought reasoning and few-shot prompting. A custom evaluation framework is developed to assess question and answer quality across multiple dimensions, including faithfulness, relevance, and domain specificity. The dataset achieves an average score of 8.16 out of 10 on these metrics. (2) Two architectures for question-answering in the sustainability reporting domain: a RAG pipeline and a fully LLM-based pipeline. The architectures are developed by experimenting, fine-tuning, and training on the QA dataset. The final pipelines feature an LLM fine-tuned on domain-specific data and an industry classification component to improve the handling of complex queries. The RAG architecture achieves an accuracy of 85.32% on single-industry and 72.15% on cross-industry multiple-choice questions, outperforming the baseline approach by 4.67 and 19.21 percentage points, respectively. The LLM-based pipeline achieves an accuracy of 93.45% on single-industry and 80.30% on cross-industry multiple-choice questions, an improvement of 12.80 and 27.36 percentage points over the baseline, respectively.

community relations, large language model, machine learning

2502.04095

Country:

Asia (0.67)
North America > United States (0.27)

Genre:

Questionnaire & Opinion Survey (0.87)
Public Relations > Community Relations (0.75)
Research Report > New Finding (0.45)

Industry:

Transportation (1.00)
Law (1.00)
Energy > Renewable (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)