Redmond
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.05)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.05)
- Asia > Nepal (0.04)
- (5 more...)
Microsoft stock plunges as Wall Street questions AI investments
Microsoft stock has slumped 12 percent as part of a software industry sell-off, stoking fears of whether hefty investments in artificial intelligence will pay off across the sector. The Redmond, Washington-based tech giant is on track Thursday to finish at its worst day since March 2020 and has seen approximately $400bn in valuation wiped out. Capital expenditures grew by 66 percent in the second quarter compared with the same period the year before, reaching a record $37.5bn for the quarter. Meanwhile, Microsoft predicted Azure growth to stay stable in the period from January to March at 37 percent to 38 percent, after slowing in the last three months of 2025, partially due to AI chip capacity constraints. "[Wall Street] wanted to see less cap-ex spending and faster cloud/AI monetisation and coming out of the gates, it's the opposite. We have said this is a multi-year journey, and Redmond needs to focus on its data center buildout with more customers heading down the AI path. It's a balancing act with 2026 the inflection year for AI and MSFT [Microsoft]," Dan Ives, analyst at Wedbush Securities, said in a note provided to Al Jazeera.
- North America > United States > New York > New York County > New York City (0.63)
- North America > Central America (0.41)
- North America > Canada (0.41)
- (11 more...)
- Information Technology (1.00)
- Banking & Finance > Trading (0.71)
- Government > Regional Government > North America Government > United States Government (0.31)
SmartAlert: Implementing Machine Learning-Driven Clinical Decision Support for Inpatient Lab Utilization Reduction
Liang, April S., Amrollahi, Fatemeh, Jiang, Yixing, Corbin, Conor K., Kim, Grace Y. E., Mui, David, Crowell, Trevor, Acharya, Aakash, Mony, Sreedevi, Punnathanam, Soumya, McKeown, Jack, Smith, Margaret, Lin, Steven, Milstein, Arnold, Schulman, Kevin, Hom, Jason, Pfeffer, Michael A., Pham, Tho D., Svec, David, Chu, Weihan, Shieh, Lisa, Sharp, Christopher, Ma, Stephen P., Chen, Jonathan H.
Repetitive laboratory testing unlikely to yield clinically useful information is a common practice that burdens patients and increases healthcare costs. Education and feedback interventions have limited success, while general test ordering restrictions and electronic alerts impede appropriate clinical care. We introduce and evaluate SmartAlert, a machine learning (ML)-driven clinical decision support (CDS) system integrated into the electronic health record that predicts stable laboratory results to reduce unnecessary repeat testing. This case study describes the implementation process, challenges, and lessons learned from deploying SmartAlert targeting complete blood count (CBC) utilization in a randomized controlled pilot across 9270 admissions in eight acute care units across two hospitals between August 15, 2024, and March 15, 2025. Results show significant decrease in number of CBC results within 52 hours of SmartAlert display (1.54 vs 1.82, p <0.01) without adverse effect on secondary safety outcomes, representing a 15% relative reduction in repetitive testing. Implementation lessons learned include interpretation of probabilistic model predictions in clinical contexts, stakeholder engagement to define acceptable model behavior, governance processes for deploying a complex model in a clinical environment, user interface design considerations, alignment with clinical operational priorities, and the value of qualitative feedback from end users. In conclusion, a machine learning-driven CDS system backed by a deliberate implementation and governance process can provide precision guidance on inpatient laboratory testing to safely reduce unnecessary repetitive testing.
- North America > United States > California > Santa Clara County > Palo Alto (0.15)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Personal > Interview (0.96)
- Health & Medicine > Health Care Providers & Services (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
- Health & Medicine > Health Care Technology > Medical Record (0.87)
- Health & Medicine > Therapeutic Area > Hematology (0.70)
PerfBench: Can Agents Resolve Real-World Performance Bugs?
Garg, Spandan, Moghaddam, Roshanak Zilouchian, Sundaresan, Neel
Performance bugs are inefficiencies in software that waste computational resources without causing functional failures, making them particularly challenging to detect and fix. While recent advances in Software Engineering agents have shown promise in automated bug fixing, existing benchmarks primarily focus on functional correctness and fail to evaluate agents' abilities to identify and resolve non-functional issues like performance bugs. We introduce PerfBench, a benchmark comprising 81 real-world performance bug-fixing tasks from popular .NET repositories on GitHub. Unlike existing benchmarks that rely on pre-existing test suites, PerfBench features a novel evaluation harness that allows agents to generate their own performance benchmarks and validates fixes by comparing execution metrics collected for developer fix and agent fix. Each task in PerfBench is derived from actual developer fixes linked to performance-related issues, which are then verified by human experts, ensuring real-world relevance. Our evaluation reveals that current state-of-the-art coding agents struggle with performance optimization tasks, with baseline OpenHands agent achieving only a ~3% success rate on our benchmark. We develop OpenHands-Perf-Agent, which incorporates performance-aware tooling and instructions and achieves a ~20% success rate on the benchmark. We show that by ensuring the agent has proper instructions to benchmark its changes and tooling for benchmark output processing, we can improve the agent performance significantly, but room for improvement still remains. PerfBench provides a challenging test set for furthering the capabilities of agents in fixing performance issues.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Washington > King County > Redmond (0.04)
- (3 more...)
Failure Modes in LLM Systems: A System-Level Taxonomy for Reliable AI Applications
Large language models (LLMs) are being rapidly integrated into decision-support tools, automation workflows, and AI-enabled software systems. However, their behavior in production environments remains poorly understood, and their failure patterns differ fundamentally from those of traditional machine learning models. This paper presents a system-level taxonomy of fifteen hidden failure modes that arise in real-world LLM applications, including multi-step reasoning drift, latent inconsistency, context-boundary degradation, incorrect tool invocation, version drift, and cost-driven performance collapse. Using this taxonomy, we analyze the growing gap in evaluation and monitoring practices: existing benchmarks measure knowledge or reasoning but provide little insight into stability, reproducibility, drift, or workflow integration. We further examine the production challenges associated with deploying LLMs - including observability limitations, cost constraints, and update-induced regressions - and outline high-level design principles for building reliable, maintainable, and cost-aware LLM systems. Finally, we outline high-level design principles for building reliable, maintainable, and cost-aware LLM-based systems. By framing LLM reliability as a system-engineering problem rather than a purely model-centric one, this work provides an analytical foundation for future research on evaluation methodology, AI system robustness, and dependable LLM deployment.
- Europe > Austria > Vienna (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (4 more...)
- Research Report > New Finding (0.46)
- Overview > Growing Problem (0.34)
Liberating Logic in the Age of AI: Going Beyond Programming with Computational Thinking
Schmidt, Douglas C., Runfola, Dan
Mastering one or more programming languages has historically been the gateway to implementing ideas on a computer. Today, that gateway is widening with advances in large language models (LLMs) and artificial intelligence (AI)-powered coding assistants. What matters is no longer just fluency in traditional programming languages but the ability to think computationally by translating problems into forms that can be solved with computing tools. The capabilities enabled by these AI-augmented tools are rapidly leading to the commoditization of computational thinking, such that anyone who can articulate a problem in natural language can potentially harness computing power via AI. This shift is poised to radically influence how we teach computer science and data science in the United States and around the world. Educators and industry leaders are grappling with how to adapt: What should students learn when the hottest new programming language is English? How do we prepare a generation of computational thinkers who need not code every algorithm manually, but must still think critically, design solutions, and verify AI-augmented results? This paper explores these questions, examining the impact of natural language programming on software development, the emerging distinction between programmers and prompt-crafting problem solvers, the reforms needed in computer science and data science curricula, and the importance of maintaining our fundamental computational science principles in an AI-augmented future. Along the way, we compare approaches and share best practices for embracing this new paradigm in computing education.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Washington > King County > Redmond (0.04)
- (5 more...)
- Education > Curriculum > Subject-Specific Education (1.00)
- Education > Educational Setting > Higher Education (0.68)
Revisiting Pre-trained Language Models for Vulnerability Detection
Li, Youpeng, Qi, Weiliang, Wang, Xuyu, Yu, Fuxun, Wang, Xinda
The rapid advancement of pre-trained language models (PLMs) has demonstrated promising results for various code-related tasks. However, their effectiveness in detecting real-world vulnerabilities remains a critical challenge. While existing empirical studies evaluate PLMs for vulnerability detection (VD), they suffer from data leakage, limited scope, and superficial analysis, hindering the accuracy and comprehensiveness of evaluations. This paper begins by revisiting the common issues in existing research on PLMs for VD through the evaluation pipeline. It then proceeds with an accurate and extensive evaluation of 18 PLMs on high-quality datasets that feature accurate labeling, diverse vulnerability types, and various projects. Specifically, we compare the performance of PLMs under both fine-tuning and prompt engineering, assess their effectiveness and generalizability across various training and testing settings, and analyze their robustness to a series of perturbations. Our findings reveal that PLMs incorporating pre-training tasks designed to capture the syntactic and semantic patterns of code outperform both general-purpose PLMs and those solely pre-trained or fine-tuned on large code corpora. However, these models face notable challenges in real-world scenarios, such as difficulties in detecting vulnerabilities with complex dependencies, handling perturbations introduced by code normalization and abstraction, and identifying semantic-preserving vulnerable code transformations. Also, the truncation caused by the limited context windows of PLMs can lead to a non-negligible number of labeling errors, which is overlooked by previous work. This study underscores the importance of thorough evaluations of model performance in practical scenarios and outlines future directions to help enhance the effectiveness of PLMs for realistic VD applications.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > India > Karnataka > Bengaluru (0.05)
- (28 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- (3 more...)
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (2 more...)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Florida > Alachua County > Gainesville (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)