AI needs a strong data fabric to deliver business value

MIT Technology Review

A modern data fabric makes it possible to turn existing enterprise knowledge into a trusted foundation for AI. Artificial intelligence is moving quickly in the enterprise, from experimentation to everyday use. Organizations are deploying copilots, agents, and predictive systems across finance, supply chains, human resources, and customer operations. By the end of 2025, half of companies used AI in at least three business functions, according to a recent survey. But as AI becomes embedded in core workflows, business leaders are discovering that the biggest obstacle is not model performance or computing power but the quality and the context of the data on which those systems rely. AI essentially introduces a new requirement: Systems must not only access data -- they must understand the business context behind it.


Spurious Predictability in Financial Machine Learning

Nikolopoulos, Sotirios D.

arXiv.org Machine Learning

Adaptive specification search generates statistically significant backtests even under martingale-difference nulls. We introduce a falsification audit testing complete predictive workflows against synthetic reference classes, including zero-predictability environments and microstructure placebos. Workflows generating significant walk-forward evidence in these environments are falsified. For passing workflows, we quantify selection-induced performance inflation using an absolute magnitude gap linking optimized in-sample evidence to disjoint walk-forward realizations, adjusted for effective multiplicity. Simulations validate extreme-value scaling under correlated searches and demonstrate detection power under genuine structure. Empirical case studies confirm that many apparent findings represent methodological artifacts rather than genuine predictability.
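The failure mode described here is straightforward to reproduce in miniature. The Python sketch below is our illustration rather than the paper's audit code: the moving-average rule family, grid size, sample lengths, and Sharpe annualization are all assumptions. It runs an adaptive specification search on i.i.d. Gaussian returns, a zero-predictability environment by construction, and then reports the gap between the best in-sample result and its disjoint walk-forward realization.

```python
# Toy demonstration: adaptive specification search under a martingale-difference
# null. Returns are i.i.d. Gaussian, so no rule has genuine predictive power,
# yet selecting the best of many moving-average specifications produces an
# inflated in-sample Sharpe ratio that typically shrinks toward zero on a
# disjoint walk-forward segment.
import numpy as np

rng = np.random.default_rng(0)

def ma_signal(prices, fast, slow):
    """+1 when the fast moving average is above the slow one, else -1."""
    fast_ma = np.convolve(prices, np.ones(fast) / fast, mode="valid")
    slow_ma = np.convolve(prices, np.ones(slow) / slow, mode="valid")
    n = min(len(fast_ma), len(slow_ma))          # align both series to the end
    return np.where(fast_ma[-n:] > slow_ma[-n:], 1.0, -1.0)

def strategy_sharpe(returns, fast, slow):
    prices = np.cumsum(returns)                  # log-price path
    sig = ma_signal(prices, fast, slow)
    strat = sig[:-1] * returns[-len(sig) + 1:]   # signal trades the next return
    return np.sqrt(252) * strat.mean() / (strat.std() + 1e-12)

# Zero-predictability environment: i.i.d. returns (a martingale-difference null).
T = 2000
ret = rng.normal(0.0, 0.01, size=T)
train, test = ret[:1500], ret[1500:]

# Adaptive specification search over many (fast, slow) pairs on the train span.
grid = [(f, s) for f in range(2, 30) for s in range(f + 5, 120, 5)]
in_sample = [(strategy_sharpe(train, f, s), f, s) for f, s in grid]
best_is, f_star, s_star = max(in_sample)
oos = strategy_sharpe(test, f_star, s_star)      # disjoint walk-forward segment

print(f"searched {len(grid)} specifications")
print(f"best in-sample Sharpe : {best_is:+.2f}")
print(f"walk-forward Sharpe   : {oos:+.2f}  (selection gap = {best_is - oos:+.2f})")
```

Because every specification is scored on the same null series, the maximum in-sample Sharpe grows with the size of the search grid even though nothing in the data is predictable, which is exactly the kind of workflow the falsification audit is designed to catch.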


bioLeak: Leakage-Aware Modeling and Diagnostics for Machine Learning in R

Korkmaz, Selçuk

arXiv.org Machine Learning

Data leakage remains a recurrent source of optimistic bias in biomedical machine learning studies. Standard row-wise cross-validation and globally estimated preprocessing steps are often inappropriate for data with repeated measurements, study-level heterogeneity, batch effects, or temporal dependencies. This paper describes bioLeak, an R package for constructing leakage-aware resampling workflows and for auditing fitted models for common leakage mechanisms. The package provides leakage-aware split construction, train-fold-only preprocessing, cross-validated model fitting, nested hyperparameter tuning, post hoc leakage audits, and HTML reporting. The implementation supports binary classification, multiclass classification, regression, and survival analysis, with task-specific metrics and S4 containers for splits, fits, audits, and inflation summaries. The simulation artifacts show how apparent performance changes under controlled leakage mechanisms, and the case study illustrates how guarded and leaky pipelines can yield materially different conclusions on multi-study transcriptomic data. The emphasis throughout is on software design, reproducible workflows, and interpretation of diagnostic output.
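bioLeak itself is an R package and none of its interface is reproduced here, but the two leakage mechanisms named above, row-wise splits over repeated measurements and preprocessing estimated outside the training fold, can be sketched in a few lines of Python with scikit-learn. The subject counts, feature dimension, and noise scale below are illustrative assumptions.

```python
# Conceptual sketch of repeated-measurement leakage: the label is a
# subject-level property unrelated to the features, so an honest estimate
# should sit near chance. Row-wise folds let rows from the same subject land
# in both the training and assessment sets, which lets the model recognize the
# subject rather than the outcome; grouped folds hold whole subjects out.
import numpy as np
from sklearn.model_selection import KFold, GroupKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# 60 subjects x 5 repeated measurements, 50 features per row.
n_subj, n_rep, n_feat = 60, 5, 50
subject_effect = rng.normal(size=(n_subj, n_feat))   # stable per-subject signature
y_subj = rng.integers(0, 2, size=n_subj)             # label independent of features
groups = np.repeat(np.arange(n_subj), n_rep)
X = subject_effect[groups] + rng.normal(scale=1.0, size=(n_subj * n_rep, n_feat))
y = y_subj[groups]

# Preprocessing lives inside the pipeline, so it is re-estimated per training fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Leaky design: row-wise folds ignore the repeated-measurement structure.
leaky = cross_val_score(model, X, y, scoring="roc_auc",
                        cv=KFold(5, shuffle=True, random_state=0))

# Guarded design: whole subjects are held out together.
guarded = cross_val_score(model, X, y, groups=groups, scoring="roc_auc",
                          cv=GroupKFold(5))

print(f"row-wise CV AUC : {leaky.mean():.3f}")   # typically well above 0.5
print(f"grouped  CV AUC : {guarded.mean():.3f}") # typically near chance
```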


fastml: Guarded Resampling Workflows for Safer Automated Machine Learning in R

Korkmaz, Selcuk, Goksuluk, Dincer, Karaismailoglu, Eda

arXiv.org Machine Learning

Preprocessing leakage arises when scaling, imputation, or other data-dependent transformations are estimated before resampling, inflating apparent performance while remaining hard to detect. We present fastml, an R package that provides a single-call interface for leakage-aware machine learning through guarded resampling, where preprocessing is re-estimated inside each resample and applied to the corresponding assessment data. The package supports grouped and time-ordered resampling, blocks high-risk configurations, audits recipes for external dependencies, and includes sandboxed execution and integrated model explanation. We evaluate fastml with a Monte Carlo simulation contrasting global and fold-local normalization, a usability comparison with tidymodels under matched specifications, and survival benchmarks across datasets of different sizes. The simulation demonstrates that global preprocessing substantially inflates apparent performance relative to guarded resampling. fastml matched held-out performance obtained with tidymodels while reducing workflow orchestration, and it supported consistent benchmarking of multiple survival model classes through a unified interface.
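As a rough illustration of the inflation that guarded resampling is meant to prevent, the Python sketch below contrasts a leaky workflow, where a data-dependent transformation is estimated once on all rows, with a guarded one, where it is re-estimated inside each training fold. fastml's own R interface is not shown, and to make the effect visible in a short example the data-dependent step here is supervised feature screening rather than the normalization studied in the paper's Monte Carlo simulation; the dataset sizes are assumptions.

```python
# Pure-noise data: honest accuracy should be near 0.5. Screening features on
# the full dataset before cross-validation leaks assessment labels into the
# preprocessing step and inflates the estimate; moving the screening inside
# the resampling pipeline removes the leak.
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
X = rng.normal(size=(100, 5000))        # no real signal anywhere
y = rng.integers(0, 2, size=100)
cv = StratifiedKFold(5, shuffle=True, random_state=0)

# Leaky workflow: screening estimated once on ALL rows (train and assessment).
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
leaky = cross_val_score(make_pipeline(StandardScaler(),
                                      LogisticRegression(max_iter=1000)),
                        X_leaky, y, cv=cv)

# Guarded workflow: screening and scaling re-estimated inside each training fold.
guarded_pipe = make_pipeline(SelectKBest(f_classif, k=20),
                             StandardScaler(),
                             LogisticRegression(max_iter=1000))
guarded = cross_val_score(guarded_pipe, X, y, cv=cv)

print(f"leaky accuracy   : {leaky.mean():.2f}")   # typically far above chance
print(f"guarded accuracy : {guarded.mean():.2f}") # typically near 0.5
```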


On the Reliability Limits of LLM-Based Multi-Agent Planning

Ao, Ruicheng, Gao, Siyang, Simchi-Levi, David

arXiv.org Machine Learning

This technical note studies the reliability limits of LLM-based multi-agent planning as a delegated decision problem. We model the LLM-based multi-agent architecture as a finite acyclic decision network in which multiple stages process shared model-context information, communicate through language interfaces with limited capacity, and may invoke human review. We show that, without new exogenous signals, any delegated network is decision-theoretically dominated by a centralized Bayes decision maker with access to the same information. In the common-evidence regime, this implies that optimizing over multi-agent directed acyclic graphs under a finite communication budget can be recast as choosing a budget-constrained stochastic experiment on the shared signal. We also characterize the loss induced by communication and information compression. Under proper scoring rules, the gap between the centralized Bayes value and the value after communication admits an expected posterior divergence representation, which reduces to conditional mutual information under logarithmic loss and to expected squared posterior error under the Brier score. These results characterize the fundamental reliability limits of delegated LLM planning. Experiments with LLMs on a controlled problem set further demonstrate these characterizations.
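The divergence representation mentioned above can be stated schematically as follows. The notation is ours rather than the paper's: $Y$ is the decision-relevant state, $S$ the shared model-context signal, and $M$ the message emitted over the capacity-limited language interface, so that $Y \to S \to M$ forms a Markov chain (no new exogenous signal enters downstream).

```latex
% Schematic sketch of the value lost to communication under a proper scoring
% rule; V_Bayes(S) is the centralized Bayes value with full access to S, and
% V(M) the value achievable from the transmitted message M alone.
\[
  V_{\mathrm{Bayes}}(S) - V(M)
  \;=\;
  \mathbb{E}\!\left[\, D\bigl(P(Y \mid S) \,\|\, P(Y \mid M)\bigr) \right],
\]
where $D$ is the divergence generated by the scoring rule. Under logarithmic
loss, $D$ is the Kullback--Leibler divergence and the gap becomes conditional
mutual information,
\[
  V_{\mathrm{Bayes}}(S) - V(M) \;=\; I(Y;\, S \mid M),
\]
while under the Brier score it becomes the expected squared posterior error
$\mathbb{E}\bigl[\lVert P(Y \mid S) - P(Y \mid M) \rVert_2^2\bigr]$.
```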


Bridging the Gap Between Climate Science and Machine Learning in Climate Model Emulation

Schmidt, Luca, Effenberger, Nina

arXiv.org Machine Learning

While climate models provide insights for climate decision-making, their use is constrained by significant computational and technical demands. Although machine learning (ML) emulators offer a way to bypass the high computational costs, their effective use remains challenging. The hurdles are diverse, ranging from limited accessibility and a lack of specialized knowledge to a general mistrust of ML methods that are perceived as insufficiently physical. Here, we introduce a framework to overcome these barriers by integrating both climate science and machine learning perspectives. We find that designing easy-to-adopt emulators that address a clearly defined task and demonstrating their reliability offers a promising path for bridging the gap between our two fields.


Nurturing agentic AI beyond the toddler stage

MIT Technology Review

The promise of autonomous agentic AI requires significant changes in the governance landscape. Parents of young children face a lot of fears about developmental milestones, from infancy through adulthood. The number of months it takes a baby to learn to talk or walk is often used as a benchmark for wellness, or an indicator of additional tests needed to properly diagnose a potential health condition. A parent rejoices over the child's first steps and then realizes how much has changed when the child can quickly walk outside, instead of slowly crawling in a safe area inside. Suddenly safety, including childproofing, takes a completely different lens and approach. Generative AI hit toddlerhood between December 2025 and January 2026 with the introduction of no-code tools from multiple vendors and the debut of OpenClaw, an open-source personal agent posted on GitHub.


Why physical AI is becoming manufacturing's next advantage

MIT Technology Review

From simulation-driven development to real-world execution, Microsoft and NVIDIA are helping manufacturers leverage AI to cross the industrial frontier with confidence. For decades, manufacturers have pursued automation to drive efficiency, reduce costs, and stabilize operations. That approach delivered meaningful gains, but it is no longer enough. Today's manufacturing leaders face a different challenge: how to grow amid labor constraints, rising complexity, and increasing pressure to innovate faster without sacrificing safety, quality, or trust. The next phase of transformation will not be defined by isolated AI tools or individual robots, but by intelligence that can operate reliably in the physical world. This is where physical AI--intelligence that can sense, reason, and act in the real world--marks a decisive shift.


Apple MacBook Pro Review (M5 Max, 16-inch): The Fastest MacBook Yet

WIRED

A more exciting MacBook Pro is waiting in the wings, but the M5 Max shows the continued success of Apple Silicon. The M5 Max is a monster performer. Gaming is surprisingly smooth, and on-device AI speeds up. The display, keyboard, ports, and speakers remain top-of-class. The MacBook Pro is in its awkward era.


10 things to know about Apple's new M5 Pro and M5 Max MacBook Pros

Popular Science

The latest versions of Apple's MacBook Pro laptops include M5 chips with revamped architecture to bring performance upgrades across the board. The new computers look similar on the outside, but the internals have been overhauled. Apple's latest MacBook Pro refresh landed today with two new processors, the M5 Pro and M5 Max, built on what the company calls its Fusion Architecture. We have already been using the vanilla M5 chip in the latest version of the Apple Vision Pro headset, but these new MBP models crank up the power level even more.