 ai performance


AI performances and screenplays won't be eligible for Oscars

Engadget

Sorry, Val Kilmer fans, but the late actor's Oscar ship has officially sailed. On Friday, it was reported that AI-generated acting and writing won't be eligible for Academy Awards. The new rules from the Academy of Motion Picture Arts and Sciences will take effect beginning with next year's presentation, scheduled for March 2027. The Academy's updated rules state that while filmmakers can use AI tools, synthetic performers can't win any awards. The same goes for screenplays, which must be human-authored.


When concept-based XAI is imprecise: Do people distinguish between generalisations and misrepresentations?

arXiv.org Artificial Intelligence

Concept-based explainable artificial intelligence (C-XAI) can let people see which representations an AI model has learned. This is particularly important when high-level semantic information (e.g., actions and relations) is used to make decisions about abstract categories (e.g., danger). In such tasks, AI models need to generalise beyond situation-specific details, and this ability can be reflected in C-XAI outputs that randomise over irrelevant features. However, it is unclear whether people appreciate such generalisation and can distinguish it from other, less desirable forms of imprecision in C-XAI outputs. Therefore, the present study investigated how the generality and relevance of C-XAI outputs affect people's evaluation of AI. In an experimental railway safety evaluation scenario, participants rated the performance of a simulated AI that classified traffic scenes involving people as dangerous or not. These classification decisions were explained via concepts in the form of similar image snippets. The latter differed in their match with the classified image, either regarding a highly relevant feature (i.e., people's relation to tracks) or a less relevant feature (i.e., people's action). Contrary to the hypotheses, concepts that generalised over less relevant features were rated lower than concepts that matched the classified image precisely. Moreover, their ratings were no better than those for systematic misrepresentations of the less relevant feature. Conversely, participants were highly sensitive to imprecisions in relevant features. These findings cast doubts on the assumption that people can easily infer from C-XAI outputs whether AI models have gained a deeper understanding of complex situations.


Apple M5 unveiled: 10 CPU cores, 10 GPU cores, and the 'next big leap' in AI

PCWorld

New iPad Pro, MacBook Pro, and Vision Pro all benefit from upgraded Apple silicon. Apple on Wednesday announced the launch of its M5 processor, saying the chip "ushers in the next big leap in AI performance for Apple silicon." The M5 appears in new editions of the iPad Pro, MacBook Pro, and Vision Pro, all of which are available for U.S. and U.K. customers to pre-order as of today. The M5, as you would expect, is a higher-performance chip than its M4 predecessor.


An Approach to Grounding AI Model Evaluations in Human-derived Criteria

arXiv.org Artificial Intelligence

In the rapidly evolving field of artificial intelligence (AI), traditional benchmarks can fall short in attempting to capture the nuanced capabilities of AI models. We focus on the case of physical world modeling and propose a novel approach to augment existing benchmarks with human-derived evaluation criteria, aiming to enhance the interpretability and applicability of model behaviors. Grounding our study in the Perception Test and OpenEQA benchmarks, we conducted in-depth interviews and large-scale surveys to identify key cognitive skills, such as Prioritization, Memorizing, Discerning, and Contextualizing, that are critical for both AI and human reasoning. Our findings reveal that participants perceive AI as lacking in interpretive and empathetic skills yet hold high expectations for AI performance. By integrating insights from our findings into benchmark design, we offer a framework for developing more human-aligned means of defining and measuring progress. This work underscores the importance of user-centered evaluation in AI development, providing actionable guidelines for researchers and practitioners aiming to align AI capabilities with human cognitive processes. Our approach both enhances current benchmarking practices and sets the stage for future advancements in AI model evaluation.


To Err Is AI! Debugging as an Intervention to Facilitate Appropriate Reliance on AI Systems

arXiv.org Artificial Intelligence

Powerful predictive AI systems have demonstrated great potential in augmenting human decision making. Recent empirical work has argued that the vision for optimal human-AI collaboration requires 'appropriate reliance' of humans on AI systems. However, accurately estimating the trustworthiness of AI advice at the instance level is quite challenging, especially in the absence of performance feedback pertaining to the AI system. In practice, the performance disparity of machine learning models on out-of-distribution data makes the dataset-specific performance feedback unreliable in human-AI collaboration. Inspired by existing literature on critical thinking and a critical mindset, we propose the use of debugging an AI system as an intervention to foster appropriate reliance. In this paper, we explore whether a critical evaluation of AI performance within a debugging setting can better calibrate users' assessment of an AI system and lead to more appropriate reliance. Through a quantitative empirical study (N = 234), we found that our proposed debugging intervention does not work as expected in facilitating appropriate reliance. Instead, we observe a decrease in reliance on the AI system after the intervention -- potentially resulting from an early exposure to the AI system's weakness. We explore the dynamics of user confidence and user estimation of AI trustworthiness across groups with different performance levels to help explain how inappropriate reliance patterns occur. Our findings have important implications for designing effective interventions to facilitate appropriate reliance and better human-AI collaboration.


NVIDIA's RTX 500 and 1000 Ada GPUs bring more AI smarts to thin and light workstations

Engadget

Just ahead of Mobile World Congress, NVIDIA unveiled its latest laptop GPUs and, what a surprise, they're designed largely to assist with AI processing. The RTX 500 and 1000 Ada Generation graphics cards are primarily for thin and light laptops. While they won't offer as much TOPS AI performance as current higher-end mobile GPUs, they could be a handy option for on-the-go AI processing for the likes of researchers, content creators and video editors. It's worth noting they're workstation GPUs rather than ones designed for gaming. NVIDIA says the GPUs, which are based on the Ada Lovelace architecture, offer up to twice the ray-tracing performance of previous-gen GPUs (they employ third-gen ray-tracing cores).


How to Build the Best AI PC

PCWorld

Artificial Intelligence is rapidly transforming the computing world, and anyone looking to ride this accelerating wave of innovation will need to be properly equipped. AI and deep learning call for a PC with the specs to handle their tremendous calculation loads. Here's what you need to know to configure the ultimate AI powerhouse PC. Because it's a relatively new capability in personal desktop computing, even most tech enthusiasts underestimate the degree to which AI computing differs from other kinds of advanced technologies. This is especially true of models like Stable Diffusion that generate rich graphics from text-based input – an increasingly common feature in today's PC games.


AMD confirms Zen 5 is due soon with 'Strix Point' Ryzen laptop chip

PCWorld

AMD may have just launched the Ryzen 8000 series of mobile processors, but company executives confirmed that its next mobile processor, Strix Point, will make the leap to the next-generation Zen 5 architecture. The disclosure was made by Dr. Lisa Su, AMD's chief executive officer, during the company's fourth-quarter 2023 earnings report. AMD reported net income of $667 million, up 3,076 percent from $21 million a year ago. Revenue climbed 10 percent to $6.168 billion. AMD launched its Ryzen 8000 mobile lineup in December and Su said Tuesday that the chips will begin shipping in notebooks in February.


AI reality check: New NPUs don't matter as much as you'd think

PCWorld

You probably already own an AI PC. In the past few months, Intel and PC makers have beaten the drum of the AI PC loudly and in concert with AMD, Intel, and Qualcomm. It's no secret that "AI" is the new "metaverse" -- you know, that thing that everyone was talking up a few years ago -- and executives and investors alike want to use AI to boost sales and stock prices. And it's true that AI does depend on the NPUs found in chips like Intel's Core Ultra, the brand that Intel is positioning as synonymous with on-chip AI. The same goes for AMD's Ryzen 8000 series -- which beat Intel to the desktop with an NPU -- as well as Qualcomm's Snapdragon X Elite.


AI and Jobs: Has the Inflection Point Arrived? Evidence from an Online Labor Platform

arXiv.org Artificial Intelligence

Artificial intelligence (AI) refers to the ability of machines or software to mimic or even surpass human intelligence in a given cognitive task. While humans learn by both induction and deduction, the success of current AI is rooted in induction, relying on its ability to detect statistical regularities in task input -- an ability learnt from a vast amount of training data using enormous computation resources. We examine the performance of such a statistical AI in a human task through the lens of four factors, including task learnability, statistical resource, computation resource, and learning techniques, and then propose a three-phase visual framework to understand the evolving relation between AI and jobs. Based on this conceptual framework, we develop a simple economic model of competition to show the existence of an inflection point for each occupation. Before AI performance crosses the inflection point, human workers always benefit from an improvement in AI performance, but after the inflection point, human workers become worse off whenever such an improvement occurs. To offer empirical evidence, we first argue that AI performance has passed the inflection point for the occupation of translation but not for the occupation of web development. We then study how the launch of ChatGPT, which led to significant improvement of AI performance on many tasks, has affected workers in these two occupations on a large online labor platform. Consistent with the inflection point conjecture, we find that translators are negatively affected by the shock both in terms of the number of accepted jobs and the earnings from those jobs, while web developers are positively affected by the very same shock. Given the potentially large disruption of AI on employment, more studies on more occupations using data from different platforms are urgently needed.