Former Giants manager Abe referred to prosecutors over alleged assault
The Metropolitan Police Department on Tuesday sent papers to prosecutors on Shinnosuke Abe, former manager of the Yomiuri Giants, on suspicion of assaulting his eldest daughter. The MPD is believed to have attached a recommendation for leniency. Abe, 47, has admitted to the allegations, according to investigative sources. In the case referred to prosecutors, Abe is suspected of grabbing his 18-year-old daughter by the collar and pushing her down at his home in Shibuya Ward in the capital at around 7 p.m. on May 25. Police arrested Abe at the scene but released him shortly afterward.
'Birdwatching saved me from my gaming addiction'
When things were at their worst, Edward Bartlett was playing computer games twenty hours a day, sometimes only pausing to eat and sleep. "I was addicted to video games," the 28-year-old said. "I lived with friends, but for two years I never really saw them in real life. We talked through a gaming microphone." Now the University of Sheffield zoology student has swapped his gaming headset for a pair of binoculars as he has embraced a new passion: birdwatching.
Japan to launch language support project for foreign children
The education ministry plans to launch a model project in fiscal 2027 to provide basic Japanese-language instruction for school life and classes to children of foreign nationals living in Japan. In response to an increase in the number of such children, the ministry aims to establish guidelines for effective language lessons through the project. The number of public school students requiring special Japanese-language instruction, including those who are unable to communicate adequately in daily Japanese conversation, reached a record high of 84,759 in fiscal 2025, which ended in March this year. The number doubled over the past nine years, according to the ministry. Of those students, about 10% were not given sufficient instruction at their schools due to staff shortages and other reasons.
OpenAI makes move to go public one week after rival Anthropic
SAN FRANCISCO, UNITED STATES - ChatGPT-maker OpenAI on Monday took the first step toward going public, one week after archrival Anthropic announced its own filing, as both companies look to raise the massive sums needed to expand. In a social media post, the Sam Altman-led company said it had confidentially submitted an S-1 registration statement to U.S. securities regulators but had "not decided on timing yet" for any potential debut. OpenAI's move follows a confidential filing by Anthropic, the maker of the Claude chatbot, which announced last Monday that it had taken the same step. OpenAI, founded in San Francisco in 2015 as a nonprofit research lab, burst into the mainstream with the launch of ChatGPT in November 2022. It has since restructured as a for-profit corporation.
When Are Neural Interaction Discoveries Real? Identifiability, Recoverability, and a Pre-Fit Diagnostic
Kuskova, Valentina, Zaytsev, Dmitry, Coppedge, Michael
When a neural time-series model reports that one variable modulates another's effect on a target, is the discovered interaction a property of the data or an artifact of model flexibility? We argue that this is fundamentally a question of identifiability, governed by the geometry of the observed input support rather than by the specific neural architecture. We study the problem in a multiplicative-gating extension of neural additive vector autoregression (GNAVAR), in which source contributions are modulated by other lagged variables. We show that representational capacity is not identifiability: dependent inputs induce leakage between edge-specific interaction terms, and low-dimensional support permits distinct interaction decompositions that agree on the observed data while differing elsewhere. We then prove a population identifiability theorem for normalized minimal GNAVAR decompositions under explicit support conditions, including settings with shared modulators. The theory yields a simple practitioner-facing diagnostic: the effective rank of the joint lag-block covariance predicts, before fitting, whether interaction recovery is feasible for a given candidate set. When the candidate set is unknown, a two-seed stability check provides a practical operational test. The same support condition organizes empirical outcomes into the three states predicted by the theory. Our results show that interaction recoverability depends on support geometry, that effective rank provides a practical pre-fit diagnostic, and that instability across independent fits is a characteristic signature of non-identifiable interaction discovery. The identifiability phenomenon, the support condition, and the instability signature are model-agnostic; GNAVAR is the vehicle that makes them provable.
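The pre-fit diagnostic described above can be illustrated with a minimal sketch. The abstract does not say which definition of effective rank the authors use, so this example assumes the participation ratio tr(C)^2 / tr(C^2); the helper names `lag_block`, `covariance`, and `effective_rank` are hypothetical and are not taken from the paper.

```python
import random


def lag_block(series, lags):
    """Stack lagged values into rows: [x_{t-1}, ..., x_{t-lags}] for each t."""
    rows = []
    for t in range(lags, len(series)):
        row = []
        for k in range(1, lags + 1):
            row.extend(series[t - k])
        rows.append(row)
    return rows


def covariance(rows):
    """Sample covariance matrix of the rows (unbiased, divides by n - 1)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def effective_rank(cov):
    """Participation ratio tr(C)^2 / tr(C^2) (an assumed definition).

    For a symmetric matrix this equals (sum of eigenvalues)^2 divided by the
    sum of squared eigenvalues, so no eigendecomposition is needed.
    """
    trace = sum(cov[i][i] for i in range(len(cov)))
    frobenius_sq = sum(v * v for row in cov for v in row)
    return trace * trace / frobenius_sq


if __name__ == "__main__":
    rng = random.Random(0)
    # Three independent variables versus three copies of one variable.
    independent = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(2000)]
    copies = [[v, v, v] for v in (rng.gauss(0, 1) for _ in range(2000))]
    for name, data in [("independent", independent), ("copies", copies)]:
        print(name, effective_rank(covariance(lag_block(data, lags=2))))
```

With two lags of three variables the lag block has six columns. Independent inputs should give an effective rank close to six, while fully dependent inputs collapse toward two, which is the kind of low-dimensional support the abstract associates with non-identifiable interactions.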
Causal Longitudinal Prior-Fitted Networks for Counterfactual Outcome Prediction
Zare, Amirhossein, Zare, Amirhessam, Rahimi, Herlock, Salarikia, Reza, Kashkooli, Mohammad
Longitudinal treatment decisions from multivariate time-series data require predicting potential outcomes under future treatment sequences in the presence of time-varying confounding, heterogeneous patient dynamics, and limited domain-specific data. Existing longitudinal causal estimators typically address this problem by training a new model for each cohort or simulator. We introduce Causal Longitudinal Prior-Fitted Networks (CAUSALLONGPFN), a prior-fitted network for time-series causal inference in longitudinal treatment-response data and zero-shot in-context counterfactual outcome prediction. To our knowledge, CAUSALLONGPFN is the first PFN-style model for history-conditional potential-outcome prediction under planned longitudinal treatment sequences, with systematic comparison against established longitudinal causal baselines on branchable counterfactual treatment-response benchmarks and factual real-world clinical data. The model is pretrained entirely on synthetic episodes sampled from a broad prior over temporal structural causal models, exposing it to treatment-confounder feedback, latent heterogeneity, nonlinear state evolution, delayed effects, and cumulative treatment responses. At test time, CAUSALLONGPFN remains frozen and is used zero-shot: it conditions on support trajectories, a query history, and a planned future treatment sequence, and returns a predictive distribution over future outcomes without gradient updates or propensity-model fitting. Multi-step predictions are obtained by recursively applying the one-step predictor under the specified treatment sequence. We evaluate the model on branchable cancer, HIV, and warfarin benchmarks with ground-truth counterfactual labels, and on factual-only rolling-origin prediction in MIMIC-III ICU trajectories. CAUSALLONGPFN is competitive with domain-trained longitudinal baselines on counterfactual benchmarks and performs strongly on factual MIMIC-III prediction, suggesting that broad synthetic causal pretraining can provide a frozen, amortized alternative for zero-shot longitudinal treatment-response prediction when repeated domain-specific training is costly or impractical.
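The recursive multi-step scheme mentioned in the abstract can be sketched as follows. This is an illustration only: the function `rollout`, the `(outcome, treatment)` history format, and the toy one-step predictor are assumptions, and the sketch feeds back a point prediction, whereas the model described above returns a predictive distribution.

```python
def rollout(one_step, history, planned_treatments):
    """Apply a one-step predictor recursively under a planned treatment sequence.

    one_step(history, treatment) -> predicted next outcome (a point estimate
    here, which is a simplification of a full predictive distribution).
    history is a list of (outcome, treatment) pairs and is not modified.
    """
    extended = list(history)  # copy so the caller's history stays intact
    predictions = []
    for treatment in planned_treatments:
        outcome = one_step(extended, treatment)
        predictions.append(outcome)
        # Feed the prediction back in as if it had been observed.
        extended.append((outcome, treatment))
    return predictions


def toy_one_step(history, treatment):
    """Hypothetical stand-in for a frozen predictor: decay plus treatment effect."""
    last_outcome = history[-1][0]
    return 0.5 * last_outcome + treatment


if __name__ == "__main__":
    print(rollout(toy_one_step, [(1.0, 0.0)], [1.0, 0.0, 1.0]))
```

Because each step conditions on earlier predictions rather than observations, errors can compound over the horizon; that is a general property of recursive rollouts, not a finding reported in the abstract.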
Disentangling Latent Risk Pathways via Bayesian Hypergraph Inference
Ding, Shengxian, Gao, Haonan, Liu, Pangpang, Tian, Xinyuan, Zhao, Yize
Electronic health records (EHR) pose large-scale multi-disease modeling problems in which many outcomes are rare and strongly influenced by shared risk factors. While modern approaches achieve strong predictive performance, they often treat diseases independently or rely on black-box architectures, offering limited insight into how risk factors organize disease risk and little principled uncertainty quantification. We introduce a Bayesian hypergraph inference framework that reframes multi-disease modeling around latent, risk-factor-modulated disease pathways. Risk factors act on hyperedges, latent disease subsets with shared risk patterns, allowing diseases to participate in multiple distinct pathways and enabling interpretable, higher-order structure beyond pairwise associations. A repulsion prior encourages parsimonious and identifiable structure, while posterior inference provides calibrated uncertainty over both disease groupings and risk-factor influence. To enable scalable inference on large EHR datasets, we develop a structured variational inference algorithm that preserves logical dependencies among hyperedge existence, disease membership, and pathway-level effects. Experiments on simulated data and UK Biobank demonstrate stable and interpretable disease pathway structure, well-calibrated uncertainty, improved estimation for rare diseases, and competitive predictive performance.
Inference for High-Dimensional Sparse Spectral Precision Matrices
Deb, Navonil, Kim, Younghoon, Basu, Sumanta
Gaussian graphical models in the spectral domain offer a principled approach for recovering conditional dependence structures in stationary high-dimensional time series. Inference on the spectral precision matrix at a fixed frequency enables tests of frequency-specific conditional associations among time series components. The problem is challenging because finite-sample discrete Fourier transforms induce truncation and smoothing biases, while the complex-valued nature of the spectral precision matrix complicates high-dimensional variance estimation, rendering methods for i.i.d. samples not directly applicable. Existing approaches do not provide full likelihood-based inference for the discrete Fourier transforms. We propose a high-dimensional inference framework for sparse spectral precision matrices using the full likelihood of neighboring discrete Fourier transforms. We construct a debiased complex graphical lasso estimator at any fixed frequency. Using asymptotic theory for quadratic forms of multivariate time series, we establish its asymptotic normality and construct entry-wise consistent covariance estimators by aggregating information across neighboring frequencies. The key theoretical contribution is the simultaneous control of regularization, finite-sample truncation, and smoothing biases, enabling valid inference. Simulation studies show reliable coverage away from zero frequency and improved detection power over the benchmark, with false discovery rates near the desired level.
ReSkill: Reconciling Skill Creation with Policy Optimization in Agentic RL
He, Zelin, Lin, Haotian, Han, Boran, Zhu, Wei, Fang, Haoyang, Wang, Bernie, Zhu, Xuan, Li, Runze, Reimherr, Matthew
Agentic reinforcement learning (RL) enables LLM agents to improve continuously from environment rewards, yet the resulting policies do not systematically accumulate reusable strategies that generalize across tasks. Modular skills can provide such reusable strategies, yet existing skill-augmented RL methods decouple skill creation from policy optimization, risking adopting skills that conflict with the evolving policy. Inspired by Anthropic's Skill Creator, we introduce RESKILL, an RL-in-the-loop skill creation framework that reconciles skill evolution with policy learning. RESKILL exploits the group-wise structure of GRPO to naturally embed three mechanisms with only marginal additional overhead: (1) an assertion-driven skill creator that diagnoses failures from past experience and proposes conditional, trigger-based skill revisions; (2) within-group rollout sampling that enables controlled comparison of skill versions, capturing which version best supports the policy's ongoing learning; and (3) Thompson Sampling with adaptive discounting to balance exploration and exploitation in skill version selection as the policy evolves. Across several domains, RESKILL consistently outperforms existing memory and skill-based RL methods, with the largest gains on unseen tasks. Analysis of the skill lifecycle shows skills being automatically created, tested, refined, and pruned as the policy improves, demonstrating reconciled skill-policy co-evolution.
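To make the third mechanism concrete, here is a minimal sketch of Thompson Sampling with discounting over skill versions. It assumes a Beta-Bernoulli model with a fixed discount factor; the abstract does not specify how RESKILL adapts the discount, and the class `DiscountedThompson` and its parameters are hypothetical.

```python
import random


class DiscountedThompson:
    """Beta-Bernoulli Thompson Sampling with exponentially discounted counts.

    Discounting lets old evidence fade, so the choice can track a reward
    distribution that shifts over time. The fixed discount is an assumption
    made for this sketch.
    """

    def __init__(self, n_versions, discount=0.95):
        self.discount = discount
        self.successes = [0.0] * n_versions
        self.failures = [0.0] * n_versions

    def select(self, rng=random):
        """Draw one sample per version from its Beta posterior; pick the largest."""
        draws = [
            rng.betavariate(1.0 + s, 1.0 + f)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, version, reward):
        """Decay all counts, then credit the chosen version with reward in [0, 1]."""
        self.successes = [self.discount * s for s in self.successes]
        self.failures = [self.discount * f for f in self.failures]
        self.successes[version] += reward
        self.failures[version] += 1.0 - reward


if __name__ == "__main__":
    rng = random.Random(0)
    sampler = DiscountedThompson(n_versions=2)
    success_rates = [0.2, 0.8]  # hypothetical per-version success rates
    picks = []
    for _ in range(500):
        version = sampler.select(rng)
        reward = 1.0 if rng.random() < success_rates[version] else 0.0
        sampler.update(version, reward)
        picks.append(version)
    print("share of late picks going to version 1:", sum(picks[-200:]) / 200)
```

A discount below 1 keeps some exploration alive indefinitely, because the counts of a neglected version decay back toward the uniform prior.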
Backward Coherence and Hidden-State Stability in Recurrent Neural Networks: A Quasi-Reverse-Martingale Theory
Recurrent neural networks maintain a hidden state $h_t$, but its probabilistic meaning is often unclear. We study hidden-state stability through \emph{backward coherence}: the extent to which $h_t$ can be reconstructed from $h_{t+1}$ by a learned backward projector $g_\phi$. Under contraction and summable backward drift, the hidden-state sequence forms a quasi-reverse-martingale. This yields almost-sure convergence, rates under mixing, an interpretable limiting representation, finite pathwise stopping times, and a theoretical framework for time-uniform confidence sequences. Simulations support the theory. Backward-coherence regularisation reduces the empirical quasi-martingale total $\hat Q$ by $43$--$58\%$, reaches stability $28$--$44\%$ earlier than an unregularised RNN, and gives tracking-error recovery consistent with geometric bounds. Additional tests confirm echo-state forgetting rates bounded by $\rho$ and verify the increment-sum tube $R_t$ with $100\%$ simultaneous coverage, although $R_t$ is conservative; in practice, the defect-tail proxy $\hat Q_t$ is the more useful monitor. The backward-coherence loss is also equivalent to minimising a Kullback--Leibler divergence in a Gaussian backward model, linking the method to variational inference. Extensions cover $\phi$-mixing inputs, change-point tracking, and finite-sample concentration. Three real-data studies further validate the approach. On PhysioNet 2012 ICU data, the Reverse Martingale RNN (RMRNN) matches RNN mortality-prediction AUC while reaching stable representations 13 hours earlier. On FRED-MD, it reduces one-month-ahead forecast error by about fourfold under concept drift. On UCI Human Activity Recognition, it maintains lower post-transition tracking error with geometric decay. The guarantees apply under the stated assumptions; universality is not claimed.
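A minimal sketch of a backward-coherence penalty, assuming it takes the form of a mean squared reconstruction error between each hidden state and the backward projector applied to its successor. The abstract does not give the exact loss, so the function name `backward_coherence_loss` and this squared-error form are assumptions; squared error is the natural choice under a Gaussian backward model with fixed isotropic variance.

```python
def backward_coherence_loss(hidden_states, backward_projector):
    """Mean squared error between h_t and g(h_{t+1}) over a trajectory.

    hidden_states: list of equal-length lists of floats, ordered in time.
    backward_projector: callable mapping h_{t+1} to a reconstruction of h_t
    (a hypothetical stand-in for a learned projector).
    """
    if len(hidden_states) < 2:
        raise ValueError("need at least two hidden states")
    total = 0.0
    pairs = 0
    for h_t, h_next in zip(hidden_states, hidden_states[1:]):
        reconstruction = backward_projector(h_next)
        total += sum((a - b) ** 2 for a, b in zip(h_t, reconstruction))
        pairs += 1
    return total / pairs


if __name__ == "__main__":
    # Toy contracting trajectory: each state is half the previous one.
    states = [[1.0, 2.0], [0.5, 1.0], [0.25, 0.5]]
    exact_inverse = lambda h: [2.0 * x for x in h]
    identity = lambda h: list(h)
    print(backward_coherence_loss(states, exact_inverse))
    print(backward_coherence_loss(states, identity))
```

In a training loop this quantity would be added to the task loss with a weighting coefficient, and the projector would be learned jointly with the recurrent network; both of those details are assumptions rather than statements from the abstract.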