Beyond Connectivity: An Open Architecture for AI-RAN Convergence in 6G

arXiv.org Artificial Intelligence

Abstract: Data-intensive Artificial Intelligence (AI) applications at the network edge demand a fundamental shift in Radio Access Network (RAN) design, from merely consuming AI for network optimization to actively enabling distributed AI workloads. This presents a significant opportunity for network operators to monetize AI while leveraging existing infrastructure. To realize this vision, this article presents a novel converged O-RAN and AI-RAN architecture for unified orchestration and management of telecommunications and AI workloads on shared infrastructure. The proposed architecture extends the Open RAN principles of modularity, disaggregation, and cloud-nativeness to support heterogeneous AI deployments. We introduce two key architectural innovations: (i) the AI-RAN Orchestrator, which extends the O-RAN Service Management and Orchestration (SMO) to enable integrated resource allocation across RAN and AI workloads; and (ii) AI-RAN sites that provide distributed edge AI platforms with real-time processing capabilities. The proposed architecture enables flexible orchestration, meeting requirements for managing heterogeneous workloads at different time scales while maintaining open, standardized interfaces and multi-vendor interoperability. This paper has been submitted to IEEE for publication. M. Polese, L. Bonati, and T. Melodia are with the Institute for the Wireless Internet of Things, Northeastern University, Boston, MA, USA. This article is based upon work partially supported by the NTIA PWSCIF under Award No. 25-60-IF054, the U.S. NSF under award CNS-2112471, and by OUSD(R&E) through Army Research Laboratory Cooperative Agreement Number W911NF-24-2-0065.


HeavyWater and SimplexWater: Distortion-Free LLM Watermarks for Low-Entropy Next-Token Predictions

arXiv.org Artificial Intelligence

Large language model (LLM) watermarks enable authentication of text provenance, curb misuse of machine-generated text, and promote trust in AI systems. Current watermarks operate by changing the next-token predictions output by an LLM. The updated (i.e., watermarked) predictions depend on random side information produced, for example, by hashing previously generated tokens. LLM watermarking is particularly challenging in low-entropy generation tasks, such as coding, where next-token predictions are near-deterministic. In this paper, we propose an optimization framework for watermark design. Our goal is to understand how to most effectively use random side information in order to maximize the likelihood of watermark detection and minimize the distortion of generated text. Our analysis informs the design of two new watermarks: HeavyWater and SimplexWater. Both watermarks are tunable, gracefully trading off detection accuracy against text distortion. They can also be applied to any LLM and are agnostic to side information generation. We examine the performance of HeavyWater and SimplexWater through several benchmarks, demonstrating that they can achieve high watermark detection accuracy with minimal compromise of text generation quality, particularly in the low-entropy regime. Our theoretical analysis also reveals surprising new connections between LLM watermarking and coding theory. The code implementation can be found at https://github.com/DorTsur/HeavyWater_SimplexWater
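
The two ideas the abstract leans on, side information obtained by hashing earlier tokens and the low-entropy regime, can be illustrated with a minimal sketch. This is a generic illustration and not the HeavyWater or SimplexWater construction: the function names, the secret key, the window size of four tokens, and the example distributions are all assumptions made for the example.

```python
import hashlib
import math


def side_info_seed(prev_tokens, key, window=4):
    """Derive a pseudorandom seed from the last `window` token ids and a key.

    Hypothetical helper: a detector that knows `key` can recompute the same
    seed from the text alone, which is what makes the side information shared.
    """
    context = ",".join(str(t) for t in prev_tokens[-window:])
    digest = hashlib.sha256(f"{key}|{context}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def entropy_bits(probs):
    """Shannon entropy, in bits, of a next-token distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# The generator and the detector derive identical side information.
generated = [101, 7, 42, 9, 311]
assert side_info_seed(generated, "demo-key") == side_info_seed(generated, "demo-key")

# A near-deterministic prediction (typical of code) carries little entropy,
# so there is little room to alter it without distorting the text.
near_deterministic = [0.97, 0.01, 0.01, 0.01]
uniform = [0.25, 0.25, 0.25, 0.25]
print(entropy_bits(near_deterministic))
print(entropy_bits(uniform))
```

The uniform distribution over four tokens has exactly 2 bits of entropy, while the near-deterministic one has well under half a bit, which is the regime the abstract describes as hard to watermark.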


BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

arXiv.org Artificial Intelligence

AI agents have the potential to significantly alter the cybersecurity landscape. Here, we introduce the first framework to capture offensive and defensive cyber-capabilities in evolving real-world systems. Instantiating this framework with BountyBench, we set up 25 systems with complex, real-world codebases. To capture the vulnerability lifecycle, we define three task types: Detect (detecting a new vulnerability), Exploit (exploiting a given vulnerability), and Patch (patching a given vulnerability). For Detect, we construct a new success indicator, which is general across vulnerability types and provides localized evaluation. We manually set up the environment for each system, including installing packages, setting up server(s), and hydrating database(s). We add 40 bug bounties, which are vulnerabilities with monetary awards from $10 to $30,485, covering 9 of the OWASP Top 10 Risks. To modulate task difficulty, we devise a new strategy based on information to guide detection, interpolating from identifying a zero day to exploiting a given vulnerability. We evaluate 10 agents: Claude Code, OpenAI Codex CLI with o3-high and o4-mini, and custom agents with o3-high, GPT-4.1, Gemini 2.5 Pro Preview, Claude 3.7 Sonnet Thinking, Qwen3 235B A22B, Llama 4 Maverick, and DeepSeek-R1. Given up to three attempts, the top-performing agents are Codex CLI: o3-high (12.5% on Detect, mapping to $3,720; 90% on Patch, mapping to $14,152), Custom Agent: Claude 3.7 Sonnet Thinking (67.5% on Exploit), and Codex CLI: o4-mini (90% on Patch, mapping to $14,422). Codex CLI: o3-high, Codex CLI: o4-mini, and Claude Code are more capable at defense, achieving higher Patch scores of 90%, 90%, and 87.5%, compared to Exploit scores of 47.5%, 32.5%, and 57.5% respectively; while the custom agents are relatively balanced between offense and defense, achieving Exploit scores of 17.5-67.5% and Patch scores of 25-60%.
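
The reported success rates can be read as task counts with a short calculation. The sketch below assumes every rate is computed over all 40 bounties, which the abstract suggests (each rate is a multiple of 2.5%) but does not state outright; the function and variable names are illustrative.

```python
TOTAL_BOUNTIES = 40  # from the abstract; assumed denominator for every rate


def solved_count(rate_percent, total=TOTAL_BOUNTIES):
    """Convert a success rate in percent into a whole number of solved tasks."""
    count = rate_percent * total / 100
    if count != round(count):
        raise ValueError(f"{rate_percent}% of {total} is not a whole number")
    return int(round(count))


# Rates quoted in the abstract for the top-performing agents.
reported = {
    "Codex CLI o3-high, Detect": 12.5,
    "Codex CLI o3-high, Patch": 90,
    "Custom Claude 3.7 Sonnet Thinking, Exploit": 67.5,
    "Claude Code, Patch": 87.5,
    "Claude Code, Exploit": 57.5,
}

for label, rate in reported.items():
    print(f"{label}: {solved_count(rate)} of {TOTAL_BOUNTIES}")
```

Under that assumption, 12.5% on Detect corresponds to 5 of 40 bounties, 90% on Patch to 36, and 67.5% on Exploit to 27.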


Towards responsible AI for education: Hybrid human-AI to confront the Elephant in the room

arXiv.org Artificial Intelligence

Despite significant advancements in AI-driven educational systems and ongoing calls for responsible AI for education, several critical issues remain unresolved -- acting as the elephant in the room within AI in education, learning analytics, educational data mining, learning sciences, and educational psychology communities. This critical analysis identifies and examines nine persistent challenges that continue to undermine the fairness, transparency, and effectiveness of current AI methods and applications in education. These include: (1) the lack of clarity around what AI for education truly means -- often ignoring the distinct purposes, strengths, and limitations of different AI families -- and the trend of equating it with domain-agnostic, company-driven large language models; (2) the widespread neglect of essential learning processes such as motivation, emotion, and (meta)cognition in AI-driven learner modelling and their contextual nature; (3) limited integration of domain knowledge and lack of stakeholder involvement in AI design and development; (4) continued use of non-sequential machine learning models on temporal educational data; (5) misuse of non-sequential metrics to evaluate sequential models; (6) use of unreliable explainable AI methods to provide explanations for black-box models; (7) ignoring ethical guidelines in addressing data inconsistencies during model training; (8) use of mainstream AI methods for pattern discovery and learning analytics without systematic benchmarking; and (9) overemphasis on global prescriptions while overlooking localised, student-specific recommendations. Supported by theoretical and empirical research, we demonstrate how hybrid AI methods -- specifically neural-symbolic AI -- can address the elephant in the room and serve as the foundation for responsible, trustworthy AI systems in education.


On the Temporal Question-Answering Capabilities of Large Language Models Over Anonymized Data

arXiv.org Artificial Intelligence

The applicability of Large Language Models (LLMs) to temporal reasoning tasks over data not seen during training remains largely unexplored. In this paper we address this topic, focusing on structured and semi-structured anonymized data. We not only develop a direct LLM pipeline, but also compare various methodologies and conduct an in-depth analysis. We identified and examined seventeen common temporal reasoning tasks in natural language, focusing on their algorithmic components. To assess LLM performance, we created the Reasoning and Answering Temporal Ability dataset (RATA), featuring semi-structured anonymized data to ensure reliance on reasoning rather than on prior knowledge. We compared several methodologies, involving SoTA techniques such as Tree-of-Thought, self-reflexion and code execution, tuned specifically for this scenario. Our results suggest that achieving scalable and reliable solutions requires more than just standalone LLMs, highlighting the need for integrated approaches.


A long lost silver dollar may be worth 5 million

Popular Science

The 'King of American Coins' remained hidden in a late collector's archive for decades. One of the country's rarest coins is rarer than even expert coin collectors believed. After the surprise discovery of a long-lost 1804 dollar (aka the "King of American Coins"), the rarity's total known count now stands at 16. Regardless of its ranking, the silver coin is expected to fetch significantly more than its original worth when it hits the auction block on December 9. According to auctioneers at Stack's Bowers Galleries, the story begins with former President Andrew Jackson.


First lady Melania Trump rolls out AI audiobook of first memoir in Spanish: 'Amazing journey'

FOX News

First Lady Melania Trump is launching a Spanish-language edition of the audiobook of her memoir using artificial intelligence (AI) audio technology to tell her story.


OECD warns tariffs, AI will test resilience of the global economy

Al Jazeera

Global growth is holding up better than expected as an artificial intelligence (AI) investment boom helps offset some of the shock from United States tariff hikes, according to the Organisation for Economic Co-operation and Development (OECD). The Paris-based organisation, however, warned on Tuesday that global growth was vulnerable to any new outbreak of trade tensions, while investor optimism about AI could trigger a stock market correction if expectations are not met. It predicted a rebound to 3.1 percent in 2027. OECD head Mathias Cormann said the trade shocks triggered by US President Donald Trump's tariff hikes had so far proved relatively mild, but added their costs were likely to rise. "The full effects of those higher tariffs since the start of the year will become clearer as firms run down the inventories that they built up," he told a press conference.


How to tell time on Mars

Popular Science

Physicists finally know how much faster time moves on the Red Planet. Tracking the first astronauts' visit to Mars won't be as simple as watching a clock or marking days off of a calendar. Thanks to relativity, time actually moves faster on the Red Planet than it does here on Earth. For years, scientists have wondered about the exact temporal difference between planets, but physicists at the National Institute of Standards and Technology (NIST) finally have an answer.


Your Data Might Determine How Much You Pay for Eggs

WIRED

A newly enacted New York law requires retailers to say whether your data influences the price of basic goods like a dozen eggs or toilet paper, but not how. If you're near Rochester, New York, the price for a carton of Target's Good & Gather eggs is listed as $1.99 on its website. It's unclear why the prices differ, but a new notice on Target's website offers a potential hint: "This price was set by an algorithm using your personal data." A recently enacted New York State law requires businesses that algorithmically set prices using customers' personal data to disclose that. According to the law, personal data includes any data that can be "linked or reasonably linked, directly or indirectly, with a specific consumer or device." The law doesn't require businesses to explicitly state what information about a person or device is being used or how each piece of information affects the final price a customer sees.