Goto

Collaborating Authors

 calendar




1cc70be9fb6a83bc46cf4ac21a91e0b0-Supplemental-Conference.pdf

Neural Information Processing Systems

In this section, we provide the class assignment of all datasets under different missing rates. The proposed setting is anew multi-task learning scenario. Its practical applications could not be limited by the mentioned assumption in the testing space. Table B.2: The observed classes of each task onOffice-Caltech with different missing rates. Office-Home [9] contains images from four domains/tasks: Artistic, Clipart, Product and Realworld. Skin-Lesion contains three skin lesion classification tasks: HAM10000 [8], Dermofit [2] and Derm7pt[5].


Apple's Most Overlooked App Just Got a Lot Better

WIRED

Apple Shortcuts, which lets users write custom automations, recently earned some new capabilities thanks to Apple Intelligence. Here's how to make the most of this upgrade. As sentences go, "Apple Intelligence now works in Apple Shortcuts" isn't the most likely to inspire a lot of people to click a link. And that's too bad: This change, one of the more overlooked new features in macOS 26, means you can use Apple's on-board AI to do all kinds of things while designing shortcuts. Look, I get it: Apple Intelligence makes AI a feature, not a product, and features are generally less interesting to read about than full-blown products.


Can Language Models Handle a Non-Gregorian Calendar? The Case of the Japanese wareki

Sasaki, Mutsumi, Kamoda, Go, Takahashi, Ryosuke, Sato, Kosuke, Inui, Kentaro, Sakaguchi, Keisuke, Heinzerling, Benjamin

arXiv.org Artificial Intelligence

Temporal reasoning and knowledge are essential capabilities for language models (LMs). While much prior work has analyzed and improved temporal reasoning in LMs, most studies have focused solely on the Gregorian calendar. However, many non-Gregorian systems, such as the Japanese, Hijri, and Hebrew calendars, are in active use and reflect culturally grounded conceptions of time. If and how well current LMs can accurately handle such non-Gregorian calendars has not been evaluated so far. Here, we present a systematic evaluation of how well language models handle one such non-Gregorian system: the Japanese wareki. We create datasets that require temporal knowledge and reasoning in using wareki dates. Evaluating open and closed LMs, we find that some models can perform calendar conversions, but GPT-4o, Deepseek V3, and even Japanese-centric models struggle with Japanese calendar arithmetic and knowledge involving wareki dates. Error analysis suggests corpus frequency of Japanese calendar expressions and a Gregorian bias in the model's knowledge as possible explanations. Our results show the importance of developing LMs that are better equipped for culture-specific tasks such as calendar understanding.


A API Details

Neural Information Processing Systems

API calls for each position identified in a piece of text. Question Answering We use the Atlas model of Izacard et al. (2022) finetuned on Natural Questions Calculator Our calculator is based on a simple Python script and only supports the operators " It does not return any result for syntactically invalid equations. "=", "equals", "equal to", "total of", "average of" followed by a number, or (iii) contain at least three English text before generating API calls. Below, we list the prompts used to sample API calls for each tool considered. Your task is to add calls to a Question Answering API to a piece of text. Input: Joe Biden was born in Scranton, Pennsylvania. Output: Joe Biden was born in [QA("Where was Joe Biden born?")] Scranton, [QA("In Output: Coca-Cola, or [QA("What other name is Coca-Cola known by?")] Coke, is Your task is to add calls to a Calculator API to a piece of text.



Better Privilege Separation for Agents by Restricting Data Types

Jacob, Dennis, Alghamdi, Emad, Hu, Zhanhao, Alomair, Basel, Wagner, David

arXiv.org Artificial Intelligence

Large language models (LLMs) have become increasingly popular due to their ability to interact with unstructured content. As such, LLMs are now a key driver behind the automation of language processing systems, such as AI agents. Unfortunately, these advantages have come with a vulnerability to prompt injections, an attack where an adversary subverts the LLM's intended functionality with an injected task. Past approaches have proposed detectors and finetuning to provide robustness, but these techniques are vulnerable to adaptive attacks or cannot be used with state-of-the-art models. To this end we propose type-directed privilege separation for LLMs, a method that systematically prevents prompt injections. We restrict the ability of an LLM to interact with third-party data by converting untrusted content to a curated set of data types; unlike raw strings, each data type is limited in scope and content, eliminating the possibility for prompt injections. We evaluate our method across several case studies and find that designs leveraging our principles can systematically prevent prompt injection attacks while maintaining high utility.


OpenID Connect for Agents (OIDC-A) 1.0: A Standard Extension for LLM-Based Agent Identity and Authorization

Nagabhushanaradhya, Subramanya

arXiv.org Artificial Intelligence

OpenID Connect for Agents (OIDC-A) 1.0 is an extension to OpenID Connect Core 1.0 that provides a comprehensive framework for representing, authenticating, and authorizing LLM-based agents within the OAuth 2.0 ecosystem. As autonomous AI agents become increasingly prevalent in digital systems, there is a critical need for standardized protocols to establish agent identity, verify agent attestation, represent delegation chains, and enable fine-grained authorization based on agent attributes. This specification defines standard claims, endpoints, and protocols that address these requirements while maintaining compatibility with existing OAuth 2.0 and OpenID Connect infrastructure. The proposed framework introduces mechanisms for agent identity representation, delegation chain validation, attestation verification, and capability-based authorization, providing a foundation for secure and trustworthy agent-to-service interactions in modern distributed systems.


Can You Really Live One Day at a Time?

The New Yorker

Productivity culture encourages us to live inside our tasks and projects. But nature offers its own organizational system. This summer, I reread the novel " Aurora," by Kim Stanley Robinson, a science-fiction writer whom I profiled a few years ago. Robinson has an ecological orientation, and "Aurora" is basically a book about how we fit into nature. It ends on a beach, with an extended description of swimming in big waves. It's early morning, and the waves, as they rise, "turn a deep translucent green."