Plotting

Weak instrumental variables due to nonlinearities in panel data: A Super Learner Control Function estimator

arXiv.org Machine Learning

A triangular structural panel data model with additive separable individual-specific effects is used to model the causal effect of a covariate on an outcome variable when there are unobservable confounders with some of them time-invariant. In this setup, a linear reduced-form equation might be problematic when the conditional mean of the endogenous covariate and the instrumental variables is nonlinear. The reason is that ignoring the nonlinearity could lead to weak instruments As a solution, we propose a triangular simultaneous equation model for panel data with additive separable individual-specific fixed effects composed of a linear structural equation with a nonlinear reduced form equation. The parameter of interest is the structural parameter of the endogenous variable. The identification of this parameter is obtained under the assumption of available exclusion restrictions and using a control function approach. Estimating the parameter of interest is done using an estimator that we call Super Learner Control Function estimator (SLCFE). The estimation procedure is composed of two main steps and sample splitting. We estimate the control function using a super learner using sample splitting. In the following step, we use the estimated control function to control for endogeneity in the structural equation. Sample splitting is done across the individual dimension. We perform a Monte Carlo simulation to test the performance of the estimators proposed. We conclude that the Super Learner Control Function Estimators significantly outperform Within 2SLS estimators.


Multi-agent Application System in Office Collaboration Scenarios

arXiv.org Artificial Intelligence

This paper introduces a multi-agent application system designed to enhance office collaboration efficiency and work quality. The system integrates artificial intelligence, machine learning, and natural language processing technologies, achieving functionalities such as task allocation, progress monitoring, and information sharing. The agents within the system are capable of providing personalized collaboration support based on team members' needs and incorporate data analysis tools to improve decision-making quality. The paper also proposes an intelligent agent architecture that separates Plan and Solver, and through techniques such as multi-turn query rewriting and business tool retrieval, it enhances the agent's multi-intent and multi-turn dialogue capabilities. Furthermore, the paper details the design of tools and multi-turn dialogue in the context of office collaboration scenarios, and validates the system's effectiveness through experiments and evaluations. Ultimately, the system has demonstrated outstanding performance in real business applications, particularly in query understanding, task planning, and tool calling. Looking forward, the system is expected to play a more significant role in addressing complex interaction issues within dynamic environments and large-scale multi-agent systems.


Attentional Graph Meta-Learning for Indoor Localization Using Extremely Sparse Fingerprints

arXiv.org Machine Learning

Fingerprint-based indoor localization is often labor-intensive due to the need for dense grids and repeated measurements across time and space. Maintaining high localization accuracy with extremely sparse fingerprints remains a persistent challenge. Existing benchmark methods primarily rely on the measured fingerprints, while neglecting valuable spatial and environmental characteristics. In this paper, we propose a systematic integration of an Attentional Graph Neural Network (AGNN) model, capable of learning spatial adjacency relationships and aggregating information from neighboring fingerprints, and a meta-learning framework that utilizes datasets with similar environmental characteristics to enhance model training. To minimize the labor required for fingerprint collection, we introduce two novel data augmentation strategies: 1) unlabeled fingerprint augmentation using moving platforms, which enables the semi-supervised AGNN model to incorporate information from unlabeled fingerprints, and 2) synthetic labeled fingerprint augmentation through environmental digital twins, which enhances the meta-learning framework through a practical distribution alignment, which can minimize the feature discrepancy between synthetic and real-world fingerprints effectively. By integrating these novel modules, we propose the Attentional Graph Meta-Learning (AGML) model. This novel model combines the strengths of the AGNN model and the meta-learning framework to address the challenges posed by extremely sparse fingerprints. To validate our approach, we collected multiple datasets from both consumer-grade WiFi devices and professional equipment across diverse environments. Extensive experiments conducted on both synthetic and real-world datasets demonstrate that the AGML model-based localization method consistently outperforms all baseline methods using sparse fingerprints across all evaluated metrics.


PHEONA: An Evaluation Framework for Large Language Model-based Approaches to Computational Phenotyping

arXiv.org Artificial Intelligence

Computational phenotyping is essential for biomedical research but often requires significant time and resources, especially since traditional methods typically involve extensive manual data review. While machine learning and natural language processing advancements have helped, further improvements are needed. Few studies have explored using Large Language Models (LLMs) for these tasks despite known advantages of LLMs for text-based tasks. T o facilitate further research in this area, we developed an evaluation framework, Evaluation of PHEnotyping for Observational Health Data (PHEONA), that outlines context-specific considerations. W e applied and demonstrated PHEONA on concept classification, a specific task within a broader phenotyping process for Acute Respiratory Failure (ARF) respiratory support therapies. From the sample concepts tested, we achieved high classification accuracy, suggesting the potential for LLM-based methods to improve computational phenotyping processes.


Topological Schr\"odinger Bridge Matching

arXiv.org Machine Learning

Given two boundary distributions, the Schr\"odinger Bridge (SB) problem seeks the ``most likely`` random evolution between them with respect to a reference process. It has revealed rich connections to recent machine learning methods for generative modeling and distribution matching. While these methods perform well in Euclidean domains, they are not directly applicable to topological domains such as graphs and simplicial complexes, which are crucial for data defined over network entities, such as node signals and edge flows. In this work, we propose the Topological Schr\"odinger Bridge problem (TSBP) for matching signal distributions on a topological domain. We set the reference process to follow some linear tractable topology-aware stochastic dynamics such as topological heat diffusion. For the case of Gaussian boundary distributions, we derive a closed-form topological SB (TSB) in terms of its time-marginal and stochastic differential. In the general case, leveraging the well-known result, we show that the optimal process follows the forward-backward topological dynamics governed by some unknowns. Building on these results, we develop TSB-based models for matching topological signals by parameterizing the unknowns in the optimal process as (topological) neural networks and learning them through likelihood training. We validate the theoretical results and demonstrate the practical applications of TSB-based models on both synthetic and real-world networks, emphasizing the role of topology. Additionally, we discuss the connections of TSB-based models to other emerging models, and outline future directions for topological signal matching.


Real-Time Evaluation Models for RAG: Who Detects Hallucinations Best?

arXiv.org Artificial Intelligence

This article surveys Evaluation models to automatically detect hallucinations in Retrieval-Augmented Generation (RAG), and presents a comprehensive benchmark of their performance across six RAG applications. Methods included in our study include: LLM-as-a-Judge, Prometheus, Lynx, the Hughes Hallucination Evaluation Model (HHEM), and the Trustworthy Language Model (TLM). These approaches are all reference-free, requiring no ground-truth answers/labels to catch incorrect LLM responses. Our study reveals that, across diverse RAG applications, some of these approaches consistently detect incorrect RAG responses with high precision/recall.


Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds

arXiv.org Machine Learning

This paper studies constrained Markov decision processes (CMDPs) with constraints against stochastic thresholds, aiming at safety of reinforcement learning in unknown and uncertain environments. We leverage a Growing-Window estimator sampling from interactions with the uncertain and dynamic environment to estimate the thresholds, based on which we design Stochastic Pessimistic-Optimistic Thresholding (SPOT), a novel model-based primal-dual algorithm for multiple constraints against stochastic thresholds. SPOT enables reinforcement learning under both pessimistic and optimistic threshold settings. We prove that our algorithm achieves sublinear regret and constraint violation; i.e., a reward regret of $\tilde{\mathcal{O}}(\sqrt{T})$ while allowing an $\tilde{\mathcal{O}}(\sqrt{T})$ constraint violation over $T$ episodes. The theoretical guarantees show that our algorithm achieves performance comparable to that of an approach relying on fixed and clear thresholds. To the best of our knowledge, SPOT is the first reinforcement learning algorithm that realises theoretical guaranteed performance in an uncertain environment where even thresholds are unknown.


A Minecraft Movie just set a new record with the biggest opening ever for a video game adaptation in the US

Engadget

A Minecraft Movie has reportedly surpassed the record previously set by 2023's The Super Mario Bros. Movie for the biggest ever domestic box office opening of a video game adaptation. The new movie, which was released in theaters on Friday, raked in 157 million in the US in its opening weekend, according to The Hollywood Reporter. A Minecraft Movie is doing well internationally, too; THR reports that it's earned 301M altogether in its global debut. The Super Mario Bros. Movie pulled in 146 million in its domestic opening and 377 million globally. A Minecraft Movie stars Jack Black, Sebastian Hansen, Emma Myers, Jason Momoa, Danielle Brooks and Jennifer Coolidge.


You have until 27 April to score a A 24 Windows 11 Pro license

Mashable

TL;DR: Give your old PC a new lease on life with a Microsoft Windows 11 Pro license, now only A 24 through 27 April. Need a new laptop but don't have the budget to buy one? We've found the next best thing -- updating your operating system. If you've got an old PC that could use an upgrade, Microsoft Windows 11 Pro is now just A 24, A 301 off the usual price. But you'll want to act fast because this deal ends 27 April.


A Minecraft Movie storms box office despite lukewarm reviews

BBC News

Around half of the film's global takings came from North America, according to EntTelligence. The box office numbers come despite reviews for the film being mostly underwhelming. The Telegraph awarded it two stars, saying the charm of the video game was "nowhere to be found", while the Guardian gave it just one star, saying it has "a cobbled-together feel". It does not appear to have stopped families showing up in force to see it. "It has definitely overperformed all industry projections," said Steve Buck, chief strategy officer at EntTelligence, which said the film had enjoyed a late ticket surge.