Personal
Viktor Antonov, art director for Half-Life 2 and Dishonored, has died, according to colleagues
Viktor Antonov, best known for his work as art lead on Half-Life 2 and Dishonored, has reportedly died at age 52. Half-Life writer Marc Laidlaw broke the news in an Instagram Story, and other colleagues have since taken to social media to pay tribute as well. "I didn't want to say much till I felt it was confirmed, but I learned today that Viktor Antonov, our visionary art lead on HL2, has died," Laidlaw wrote in the now-expired post, which was reshared by LambdaGeneration on Saturday night. Antonov got his start in video games working on Redneck Rampage, and in addition to serving as art director for Half-Life 2 and Dishonored, he went on to consult on titles including Doom (2016) and Fallout 4. The Bulgarian artist recently appeared in a documentary celebrating the 20th anniversary of Half-Life 2 this past November. "I wish I told you how much admiration I had for you but we get caught in our lives until a surprise like this hits us," Raphael Colantonio, founder of Arkane Studios and Wolfeye Studios, wrote on Bluesky. "You were instrumental to the success of Arkane Studios and an inspiration to many of us, also a friend with whom I have many fond memories." In another post, game designer Harvey Smith added, "All this about his impact and talent is true, but I will also always remember how much he made me laugh, with his dry, devastating wit."
Why Amazon Web Services CEO Matt Garman Is Playing the Long Game on AI
Matt Garman took the helm at Amazon Web Services (AWS), the cloud computing arm of the U.S. tech giant, in June, but he joined the business around 19 years ago as an intern. He went on to become AWS's first product manager and helped to build and launch many of its core services, before eventually becoming the CEO last year. Like many other tech companies, AWS, which is Amazon's most profitable unit, is betting big on AI. In April 2023, the company launched Amazon Bedrock, which gives cloud customers access to foundation models built by AI companies including Anthropic and Mistral. At its re:Invent conference in Las Vegas in December, AWS made a series of announcements, including a new generation of foundation AI models, called Nova. It also said that it's building one of the world's most powerful AI supercomputers with Anthropic, with which it has a strategic partnership, using a giant cluster of AWS's Trainium 2 training chips. TIME spoke with Garman a few days after the re:Invent conference about his AI ambitions, how he's thinking about ensuring the technology is safe, and how the company is balancing its energy needs with its emissions targets.
Crash victims honoured at basketball matches
Four students killed in a car crash were honoured at a university as basketball matches resumed for the first time since the incident. Makyle Bayley, 22, Eva Darold-Tchikaya, 21, Anthony "TJ" Hibbert, 24 and Daljang Wol, 22, died when a car crashed into a building on Magdalen Street, Colchester on 1 February. Mr Hibbert and Mr Wol played for the Essex Rebels, who dedicated Saturday's fixtures to the victims and held a round of applause in their memory. University of Essex director of sport Dave Parry said: "We've lost four really loved members of our university and sporting community, who gave so much to their friends and others." Mr Bayley was a member of the British Universities and Colleges Sport (BUCS) basketball team, while Ms Darold-Tchikaya was a member of the Essex Blades dance club and other societies. Last week, more than 1,000 people including students, staff and relatives of the victims attended a gathering.
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning
Rao, Rajath, Ganesan, Adithya, Kjell, Oscar, Luby, Jonah, Raghavan, Akshay, Feltman, Scott, Ringwald, Whitney, Boyd, Ryan L., Luft, Benjamin, Ruggero, Camilo, Ryant, Neville, Kotov, Roman, Schwartz, H. Andrew
Current speech encoding pipelines often rely on an additional text-based LM to get robust representations of human communication, even though SotA speech-to-text models often have an LM within. This work proposes an approach to improve the LM within an audio model such that the subsequent text-LM is unnecessary. We introduce WhiSPA (Whisper with Semantic and Psychological Alignment), which leverages a novel audio training objective: contrastive loss with a language model embedding as a teacher. Using over 500k speech segments from mental health audio interviews, we evaluate the utility of aligning Whisper's latent space with semantic representations from a text autoencoder (SBERT) and lexically derived embeddings of basic psychological dimensions: emotion and personality. Over self-supervised affective tasks and downstream psychological tasks, WhiSPA surpasses current speech encoders, achieving an average error reduction of 73.4% and 83.8%, respectively. WhiSPA demonstrates that it is not always necessary to run a subsequent text LM on speech-to-text output in order to get a rich psychological representation of human communication.
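The abstract does not give the exact form of the contrastive objective, so the following is only a minimal sketch of a generic InfoNCE-style loss, in which each student (audio) embedding is pulled toward the teacher (text) embedding of the same segment and pushed away from the teacher embeddings of other segments in the batch. The function names, the cosine similarity, the temperature value, and the toy vectors are all assumptions for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(student, teacher, temperature=0.1):
    """InfoNCE-style loss (illustrative, not the paper's exact objective).

    student[i] and teacher[i] embed the same speech segment; every other
    teacher[j] in the batch acts as a negative for student[i].
    """
    total = 0.0
    for i, s in enumerate(student):
        logits = [cosine(s, t) / temperature for t in teacher]
        # Numerically stable log-sum-exp over all teacher candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching teacher as the correct class.
        total += log_denominator - logits[i]
    return total / len(student)


# Toy batch: two segments with 2-dimensional embeddings (hypothetical values).
teacher = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # student close to its own teacher
swapped = [[0.1, 0.9], [0.9, 0.1]]   # student close to the wrong teacher

loss_aligned = contrastive_loss(aligned, teacher)
loss_swapped = contrastive_loss(swapped, teacher)
```

In this toy batch the aligned student embeddings give a much lower loss than the swapped ones, which is the behaviour a teacher-guided contrastive objective is meant to reward.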
RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation
Jiang, Pengcheng, Cao, Lang, Zhu, Ruike, Jiang, Minhao, Zhang, Yunyi, Sun, Jimeng, Han, Jiawei
Retrieval-augmented language models often struggle with knowledge-intensive tasks due to inefficient retrieval, unstructured knowledge integration, and single-pass architectures. We present Retrieval-And-Structuring (RAS), a novel framework that dynamically constructs and reasons over query-specific knowledge graphs through iterative retrieval and structuring. RAS introduces four key technical innovations: (1) a theme-scoped retrieval mechanism that efficiently narrows the search space while maintaining retrieval quality, (2) an action planning module that determines knowledge needs and generates focused sub-queries, (3) a dynamic knowledge structuring approach that converts retrieved text into an evolving knowledge graph, and (4) a graph-augmented answering component that leverages the accumulated structured information. Our framework achieves state-of-the-art performance, surpassing leading baselines by 6.4% with open-source language models and 7.0% with proprietary models on seven knowledge-intensive generation datasets across all evaluation metrics. Detailed ablation studies verify the contribution of each technical component to the overall system performance.
OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
Lu, Pan, Chen, Bowen, Liu, Sheng, Thapa, Rahul, Boen, Joseph, Zou, James
Solving complex reasoning tasks may involve visual understanding, domain knowledge retrieval, numerical calculation, and multi-step reasoning. Existing methods augment large language models (LLMs) with external tools but are restricted to specialized domains, limited tool types, or require additional training data. In this paper, we introduce OctoTools, a training-free, user-friendly, and easily extensible open-source agentic framework designed to tackle complex reasoning across diverse domains. OctoTools introduces standardized tool cards to encapsulate tool functionality, a planner for both high-level and low-level planning, and an executor to carry out tool usage. We validate OctoTools' generality across 16 diverse tasks (including MathVista, MMLU-Pro, MedQA, and GAIA-Text), achieving substantial average accuracy gains of 9.3% over GPT-4o. Furthermore, OctoTools outperforms AutoGen, GPT-Functions and LangChain by up to 10.6% when given the same set of tools. Through comprehensive analysis and ablations, OctoTools demonstrates advantages in task planning, effective tool usage, and multi-step problem solving.
FairFare: A Tool for Crowdsourcing Rideshare Data to Empower Labor Organizers
Calacci, Dana, Rao, Varun Nagaraj, Dalal, Samantha, Di, Catherine, Pua, Kok-Wei, Schwartz, Andrew, Spitzberg, Danny, Monroy-Hernández, Andrés
In recent years, labor organizers representing rideshare and delivery workers have advocated for regulations to improve working conditions in the rideshare industry that set wage floors and job loss protections [67]. To call for these improvements, organizers need to understand workers' existing conditions [37], a significant data access and social computing challenge in the rideshare industry. Labor organizers representing rideshare workers typically rely on a collage of qualitative anecdotes and screenshots to provide data about existing working conditions [24]. While these qualitative data provide rich, "thick descriptions" [30] of workers' experience, they are often dismissed by platforms as non-representative, cherry-picked examples. Rideshare platforms, on the other hand, have exclusive access to large-scale, comprehensive quantitative datasets of driver, trip, and pay data that they can draw upon to create authoritative narratives about working conditions in their industry [72]. Labor organizers need comprehensive access to large-scale quantitative data describing working conditions to conduct rigorous, independent investigations and contest platform-driven narratives. There are tools and legal frameworks that empower individual rideshare workers to independently access quantitative work data (e.g., Gridwise and Data Subject Access Requests). However, these tools and frameworks do not provide an intuitive way to aggregate individual worker data into a dataset that provides collective insight into overarching working conditions. Algorithmic auditing scholarship provides methods, like crowdsourcing data, to independently investigate black-boxed systems [66].
Interview with Kayla Boggess: Explainable AI for more accessible and understandable technologies
In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In the second of our interviews with the 2025 cohort, we hear from Kayla Boggess, a PhD student at the University of Virginia, and find out more about her research on explainable AI. I'm a member of the University of Virginia Link Lab, which is a multi-disciplinary lab focused on cyber-physical systems. There's individuals from departments across the University of Virginia that all work in the lab, so I've had the opportunity to work with other researchers in computer science, system engineering, psychology, and even law during my time there.
Tradeoffs in Processing Queries and Supporting Updates over an ML-Enhanced R-tree
Al-Mamun, Abdullah, Haider, Ch. Md. Rakin, Wang, Jianguo, Aref, Walid G.
Machine Learning (ML) techniques have been successfully applied to design various learned database index structures for both the one- and multi-dimensional spaces. Particularly, a class of traditional multi-dimensional indexes has been augmented with ML models to design ML-enhanced variants of their traditional counterparts. This paper focuses on the R-tree multi-dimensional index structure as it is widely used for indexing multi-dimensional data. The R-tree has been augmented with machine learning models to enhance the R-tree performance. The AI+R-tree is an ML-enhanced R-tree index structure that augments a traditional disk-based R-tree with an ML model to enhance the R-tree's query processing performance, mainly, to avoid navigating the overlapping branches of the R-tree that do not yield query results, e.g., in the presence of high-overlap among the rectangles of the R-tree nodes. We investigate the empirical tradeoffs in processing dynamic query workloads and in supporting updates over the AI+R-tree. Particularly, we investigate the impact of the choice of ML models over the AI+R-tree query processing performance. Moreover, we present a case study of designing a custom loss function for a neural network model tailored to the query processing requirements of the AI+R-tree. Furthermore, we present the design tradeoffs for adopting various strategies for supporting dynamic inserts, updates, and deletes with the vision of realizing a mutable AI+R-tree. Experiments on real datasets demonstrate that the AI+R-tree can enhance the query processing performance of a traditional R-tree for high-overlap range queries by up to 5.4X while achieving up to 99% average query recall.
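To make the recall figure concrete, here is a small sketch of how query recall could be measured when a model chooses which leaf nodes to read instead of traversing every overlapping branch. It assumes the model outputs a set of leaf identifiers per range query; the leaf layout, the query, and the "predicted" leaf list are hypothetical stand-ins, and no real R-tree or trained model is involved.

```python
def overlaps(a, b):
    """True if axis-aligned rectangles (xmin, ymin, xmax, ymax) intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def inside(point, rect):
    """True if a 2-D point lies within the rectangle (borders included)."""
    return rect[0] <= point[0] <= rect[2] and rect[1] <= point[1] <= rect[3]


def bounding_box(points):
    """Minimum bounding rectangle of a list of 2-D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def search(leaves, leaf_ids, query):
    """Collect the points inside the query from the given leaves only."""
    return {p for leaf_id in leaf_ids for p in leaves[leaf_id] if inside(p, query)}


# Hypothetical leaves; the bounding boxes of A and B overlap each other.
leaves = {
    "A": [(1, 1), (4, 4)],
    "B": [(3, 3), (6, 6)],
    "C": [(5, 1), (8, 2)],
}
query = (2.5, 2.5, 4.5, 4.5)

# Classic traversal: visit every leaf whose bounding box overlaps the query.
candidates = [i for i, pts in leaves.items() if overlaps(bounding_box(pts), query)]
truth = search(leaves, candidates, query)

# Stand-in for a model prediction that misses one useful leaf.
predicted = ["A"]
found = search(leaves, predicted, query)

recall = len(found) / len(truth)
```

Here the traversal reads leaves A and B, while the stand-in prediction reads only A and so returns one of the two matching points, giving a recall of 0.5. The trade-off sketched is fewer leaf accesses in exchange for possibly incomplete results.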
'Not on the Best Path'
In an age of breathless predictions and sky-high valuations, cognitive scientist Gary Marcus has emerged as one of the best-known skeptics of generative artificial intelligence (AI). In fact, he recently wrote a book about his concerns, Taming Silicon Valley, in which he made the case that "we are not on the best path right now, either technically or morally." Marcus--who has spent his career examining both natural and artificial intelligence--explained his reasoning in a recent conversation with Leah Hoffmann. You've written about neural networks in everything from your 1992 monograph on language acquisition to, most recently, your book Taming Silicon Valley. Your thoughts about how AI companies and policies fall short have been well covered in your U.S. Senate testimony and other outlets (including your own Substack).