AITopics | Personal

Personal

Google teases new camera-powered AI feature one day ahead of I/O

Engadget | May-13-2024, 17:54:52 GMT

Google is teasing an intriguing new AI feature one day ahead of its I/O developer conference. The company shared a brief video on X that appears to show a new camera-powered AI feature that's able to recognize what's in the frame in real time. The video, which is labeled as a "prototype," shows what appears to be a Pixel device with the camera open viewing the keynote stage at I/O. The person holding the camera asks, "hey, what do you think is happening here?" A voice replies that "it looks like people are setting up for a large event, perhaps a conference or presentation."

artificial intelligence, google, google tease, (3 more...)

Engadget

Genre: Personal > Interview (0.59)

Technology: Information Technology > Artificial Intelligence (0.93)

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Vidgen, Bertie, Agrawal, Adarsh, Ahmed, Ahmed M., Akinwande, Victor, Al-Nuaimi, Namir, Alfaraj, Najla, Alhajjar, Elie, Aroyo, Lora, Bavalatti, Trupti, Bartolo, Max, Blili-Hamelin, Borhane, Bollacker, Kurt, Bommasani, Rishi, Boston, Marisa Ferrara, Campos, Siméon, Chakra, Kal, Chen, Canyu, Coleman, Cody, Coudert, Zacharie Delpierre, Derczynski, Leon, Dutta, Debojyoti, Eisenberg, Ian, Ezick, James, Frase, Heather, Fuller, Brian, Gandikota, Ram, Gangavarapu, Agasthya, Gangavarapu, Ananya, Gealy, James, Ghosh, Rajat, Goel, James, Gohar, Usman, Goswami, Sujata, Hale, Scott A., Hutiri, Wiebke, Imperial, Joseph Marvin, Jandial, Surgan, Judd, Nick, Juefei-Xu, Felix, Khomh, Foutse, Kailkhura, Bhavya, Kirk, Hannah Rose, Klyman, Kevin, Knotz, Chris, Kuchnik, Michael, Kumar, Shachi H., Kumar, Srijan, Lengerich, Chris, Li, Bo, Liao, Zeyi, Long, Eileen Peters, Lu, Victor, Luger, Sarah, Mai, Yifan, Mammen, Priyanka Mary, Manyeki, Kelvin, McGregor, Sean, Mehta, Virendra, Mohammed, Shafee, Moss, Emanuel, Nachman, Lama, Naganna, Dinesh Jinenhally, Nikanjam, Amin, Nushi, Besmira, Oala, Luis, Orr, Iftach, Parrish, Alicia, Patlak, Cigdem, Pietri, William, Poursabzi-Sangdeh, Forough, Presani, Eleonora, Puletti, Fabrizio, Röttger, Paul, Sahay, Saurav, Santos, Tim, Scherrer, Nino, Sebag, Alice Schoenauer, Schramowski, Patrick, Shahbazi, Abolfazl, Sharma, Vin, Shen, Xudong, Sistla, Vamsi, Tang, Leonard, Testuggine, Davide, Thangarasa, Vithursan, Watkins, Elizabeth Anne, Weiss, Rebecca, Welty, Chris, Wilbers, Tyler, Williams, Adina, Wu, Carole-Jean, Yadav, Poonam, Yang, Xianjun, Zeng, Yi, Zhang, Wenhui, Zhdanov, Fedor, Zhu, Jiacheng, Liang, Percy, Mattson, Peter, Vanschoren, Joaquin

arXiv.org Artificial Intelligence | May-13-2024

This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-purpose assistant in English), and a limited set of personas (i.e., typical users, malicious users, and vulnerable users). We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark. We plan to release version 1.0 of the AI Safety Benchmark by the end of 2024. The v1.0 benchmark will provide meaningful insights into the safety of AI systems. However, the v0.5 benchmark should not be used to assess the safety of AI systems. We have sought to fully document the limitations, flaws, and challenges of v0.5. This release of v0.5 of the AI Safety Benchmark includes (1) a principled approach to specifying and constructing the benchmark, which comprises use cases, types of systems under test (SUTs), language and context, personas, tests, and test items; (2) a taxonomy of 13 hazard categories with definitions and subcategories; (3) tests for seven of the hazard categories, each comprising a unique set of test items, i.e., prompts. There are 43,090 test items in total, which we created with templates; (4) a grading system for AI systems against the benchmark; (5) an openly available platform, and downloadable tool, called ModelBench that can be used to evaluate the safety of AI systems on the benchmark; (6) an example evaluation report which benchmarks the performance of over a dozen openly available chat-tuned language models; (7) a test specification for the benchmark.

benchmark, hazard category, taxonomy, (12 more...)

arXiv.org Artificial Intelligence

2404.12241

Country:

Europe > Western Europe (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Virginia (0.04)
(22 more...)

Genre:

Personal > Interview (0.46)
Research Report > New Finding (0.45)

Industry:

Media (1.00)
Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Terrorism (1.00)
(10 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

On-Demand Model and Client Deployment in Federated Learning with Deep Reinforcement Learning

Chahoud, Mario, Sami, Hani, Mourad, Azzam, Otrok, Hadi, Bentahar, Jamal, Guizani, Mohsen

arXiv.org Artificial Intelligence | May-12-2024

In Federated Learning (FL), the limited accessibility of data from diverse locations and user types poses a significant challenge due to restricted user participation. Expanding client access and diversifying data enhance models by incorporating diverse perspectives, thereby enhancing adaptability. However, challenges arise in dynamic and mobile environments where certain devices may become inaccessible as FL clients, impacting data availability and client selection methods. To address this, we propose an On-Demand solution, deploying new clients using Docker Containers on-the-fly. It employs an autonomous end-to-end solution for handling model deployment and client selection. Simulated tests show that our architecture can easily adjust to changes in the environment and respond to On-Demand requests. FL can enhance traffic prediction models using real-time data from vehicles moving on the road. [...] Regulation in the European Union, aim to protect data privacy [1]. However, the stringency of these regulations varies globally. A study [2] revealed a notable increase in privacy requests from 2021 to 2022, indicating growing concerns about [...] Access and Deletion requests saw a substantial peak, with a 72% year-over-year increase in [...] One of the main limitations in existing FL frameworks is in accessing the full potential of available data due to reliance on static clients, leading to incomplete or biased dataset representations and affecting model performance. [...] today's digital landscape, acquiring more clients is about data efficiency.

accuracy, deployment, learning, (15 more...)

arXiv.org Artificial Intelligence

2405.07175

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Lebanon > Beirut Governorate > Beirut (0.04)
North America > United States > New York > Onondaga County > Syracuse (0.04)

Genre:

Research Report (0.82)
Personal (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Demystifying the Hypercomplex: Inductive Biases in Hypercomplex Deep Learning

Comminiello, Danilo, Grassucci, Eleonora, Mandic, Danilo P., Uncini, Aurelio

arXiv.org Artificial Intelligence | May-11-2024

Hypercomplex algebras have recently been gaining prominence in the field of deep learning owing to the advantages of their division algebras over real vector spaces and their superior results when dealing with multidimensional signals in real-world 3D and 4D paradigms. This paper provides a foundational framework that serves as a roadmap for understanding why hypercomplex deep learning methods are so successful and how their potential can be exploited. Such a theoretical framework is described in terms of inductive bias, i.e., a collection of assumptions, properties, and constraints that are built into training algorithms to guide their learning process toward more efficient and accurate solutions. We show that it is possible to derive specific inductive biases in the hypercomplex domains, which extend complex numbers to encompass diverse numbers and data structures. These biases prove effective in managing the distinctive properties of these domains, as well as the complex structures of multidimensional and multimodal signals. This novel perspective for hypercomplex deep learning promises to both demystify this class of methods and clarify their potential, under a unifying framework, and in this way promotes hypercomplex models as viable alternatives to traditional real-valued deep learning for multidimensional signal processing.

hypercomplex domain, inductive bias, neural network, (15 more...)

arXiv.org Artificial Intelligence

2405.07024

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Greater London > London (0.14)
Europe > Italy > Lazio > Rome (0.05)
(5 more...)

Genre:

Research Report (0.64)
Personal (0.46)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

BLIP: Facilitating the Exploration of Undesirable Consequences of Digital Technologies

Pang, Rock Yuren, Santy, Sebastin, Just, René, Reinecke, Katharina

arXiv.org Artificial Intelligence | May-10-2024

Digital technologies have positively transformed society, but they have also led to undesirable consequences not anticipated at the time of design or development. We posit that insights into past undesirable consequences can help researchers and practitioners gain awareness and anticipate potential adverse effects. To test this assumption, we introduce BLIP, a system that extracts real-world undesirable consequences of technology from online articles, summarizes and categorizes them, and presents them in an interactive, web-based interface. In two user studies with 15 researchers in various computer science disciplines, we found that BLIP substantially increased the number and diversity of undesirable consequences they could list in comparison to relying on prior knowledge or searching online. Moreover, BLIP helped them identify undesirable consequences relevant to their ongoing projects, made them aware of undesirable consequences they "had never considered," and inspired them to reflect on their own experiences with technology.

blip, consequence, undesirable consequence, (12 more...)

arXiv.org Artificial Intelligence

2405.06783

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(26 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Personal > Interview (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.93)
Health & Medicine > Therapeutic Area (0.92)
Media > News (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Information Management (1.00)
Information Technology > Communications > Social Media (1.00)
(8 more...)

Exploring the Potential of Human-LLM Synergy in Advancing Qualitative Analysis: A Case Study on Mental-Illness Stigma

Meng, Han, Yang, Yitian, Li, Yunan, Lee, Jungup, Lee, Yi-Chieh

arXiv.org Artificial Intelligence | May-9-2024

Qualitative analysis is a challenging, yet crucial aspect of advancing research in the field of Human-Computer Interaction (HCI). Recent studies show that large language models (LLMs) can perform qualitative coding within existing schemes, but their potential for collaborative human-LLM discovery and new insight generation in qualitative analysis is still underexplored. To bridge this gap and advance qualitative analysis by harnessing the power of LLMs, we propose CHALET, a novel methodology that leverages the human-LLM collaboration paradigm to facilitate conceptualization and empower qualitative research. The CHALET approach involves LLM-supported data collection, performing both human and LLM deductive coding to identify disagreements, and performing collaborative inductive coding on these disagreement cases to derive new conceptual insights. We validated the effectiveness of CHALET through its application to the attribution model of mental-illness stigma, uncovering implicit stigmatization themes on cognitive, emotional and behavioral dimensions. We discuss the implications for future research, methodology, and the transdisciplinary opportunities CHALET presents for the HCI community and beyond.

llm, mental illness, participant, (13 more...)

arXiv.org Artificial Intelligence

2405.05758

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > Singapore > Central Region > Singapore (0.04)
(20 more...)

Genre:

Research Report > New Finding (1.00)
Personal > Interview (1.00)
Research Report > Experimental Study (0.92)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Congratulations to the #ICLR2024 test of time and outstanding paper award winners

AIHub | May-8-2024, 09:15:45 GMT

The Twelfth International Conference on Learning Representations (ICLR) is taking place this week in Vienna, Austria. During the opening of the conference, the outstanding paper award winners, and honourable mentions, were announced. The conference organisers also introduced a new award for this year: the test of time award. This award honours a paper from 2013/2014 that the programme chairs judge to have had a lasting impact. Abstract: How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets?

artificial intelligence, machine learning, natural language, (17 more...)

AIHub

Country: Europe > Austria > Vienna (0.55)

Genre: Personal > Honors > Award (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution

Ding, Bowen, Min, Qingkai, Ma, Shengkun, Li, Yingjie, Yang, Linyi, Zhang, Yue

arXiv.org Artificial Intelligence | May-8-2024

Based on Pre-trained Language Models (PLMs), event coreference resolution (ECR) systems have demonstrated outstanding performance in clustering coreferential events across documents. However, the state-of-the-art system exhibits an excessive reliance on the 'triggers lexical matching' spurious pattern in the input mention pair text. We formalize the decision-making process of the baseline ECR system using a Structural Causal Model (SCM), aiming to identify spurious and causal associations (i.e., rationales) within the ECR task. Leveraging the debiasing capability of counterfactual data augmentation, we develop a rationale-centric counterfactual data augmentation method with LLM-in-the-loop. This method is specialized for pairwise input in the ECR system, where we conduct direct interventions on triggers and context to mitigate the spurious association while emphasizing the causation. Figure 1: The distribution of 'triggers lexical matching' in mention pairs from ECB+ training set, along with a false negative example from Held et al.'s system which [...]

internet explorer, participant, security update, (15 more...)

arXiv.org Artificial Intelligence

2404.01921

Country:

North America > United States > Missouri > Jackson County > Kansas City (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Indiana > Marion County > Indianapolis (0.04)
(28 more...)

Genre:

Research Report (1.00)
Personal > Obituary (1.00)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Sports > Soccer (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Uncovering implementable dormant pruning decisions from three different stakeholder perspectives

Flynn, Deanna, Jain, Abhinav, Knight, Heather, Wilson, Cristina G., Grimm, Cindy

arXiv.org Artificial Intelligence | May-7-2024

Dormant pruning, or the removal of unproductive portions of a tree while a tree is not actively growing, is an important orchard task to help maintain yield, requiring years to build expertise. Because of long training periods and an increasing labor shortage in agricultural jobs, pruning could benefit from robotic automation. However, to program robots to prune branches, we first need to understand how pruning decisions are made, and what variables in the environment (e.g., branch size and thickness) we need to capture. Working directly with three pruning stakeholders -- horticulturists, growers, and pruners -- we find that each group of human experts approaches pruning decision-making differently. To capture this knowledge, we present three studies and two extracted pruning protocols from field work conducted in Prosser, Washington in January 2022 and 2023. We interviewed six stakeholders (two in each group) and observed pruning across three cultivars -- Bing Cherries, Envy Apples, and Jazz Apples -- and two tree architectures -- Upright Fruiting Offshoot and V-Trellis. Leveraging participant interviews and video data, this analysis uses grounded coding to extract pruning terminology, discover horticultural contexts that influence pruning decisions, and find implementable pruning heuristics for autonomous systems. The results include a validated terminology set, which we offer for use by both pruning stakeholders and roboticists, to communicate general pruning concepts and heuristics. The results also highlight seven pruning heuristics utilizing this terminology set that would be relevant for use by future autonomous robot pruning systems, and characterize three discovered horticultural contexts (i.e., environmental management, crop-load management, and replacement wood) across all three cultivars.

architecture, pruning, pruning decision, (15 more...)

arXiv.org Artificial Intelligence

2405.0403

Country:

North America > United States > Oregon (0.04)
Oceania > New Zealand (0.04)
North America > United States > Washington (0.04)
(5 more...)

Genre:

Research Report (1.00)
Personal > Interview (1.00)

Industry: Food & Agriculture > Agriculture (1.00)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness

Zheng, Danna, Liu, Danyang, Lapata, Mirella, Pan, Jeff Z.

arXiv.org Artificial Intelligence | May-6-2024

Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, prompting a surge in their practical applications. However, concerns have arisen regarding the trustworthiness of LLMs' outputs, particularly in closed-book question-answering tasks, where non-experts may struggle to identify inaccuracies due to the absence of contextual or ground truth information. This paper introduces TrustScore, a framework based on the concept of Behavioral Consistency, which evaluates whether an LLM's response aligns with its intrinsic knowledge. Additionally, TrustScore can seamlessly integrate with fact-checking methods, which assess alignment with external knowledge sources. The experimental results show that TrustScore achieves strong correlations with human judgments, surpassing existing reference-free metrics, and achieving results on par with reference-based metrics. Large-scale language models (LLMs) have recently been in the spotlight due to their impressive performance in various NLP tasks, sparking enthusiasm for potential applications (Kaddour et al., 2023; Bubeck et al., 2023). However, a notable concern has emerged regarding the ability of LLMs to generate plausible yet incorrect responses (Tam et al., 2022; Liu et al., 2023; Devaraj et al., 2022), which is particularly challenging for users without specialized expertise. Consequently, users are often advised to employ LLMs in scenarios where they can confidently assess the information provided.

arxiv preprint arxiv, secure and trustworthy, trustscore, (15 more...)

arXiv.org Artificial Intelligence

2402.12545

Country:

Europe > Spain (0.05)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.05)
North America > United States > West Virginia (0.04)
(10 more...)

Genre:

Research Report > New Finding (0.48)
Personal > Honors (0.30)

Industry: Government > Regional Government > North America Government > United States Government (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
