Oceania
Sequential Treatment Effect Estimation with Unmeasured Confounders
Wang, Yingrong, Wu, Anpeng, Li, Baohong, Xiao, Ziyang, Xiong, Ruoxuan, Han, Qing, Kuang, Kun
This paper studies the cumulative causal effects of sequential treatments in the presence of unmeasured confounders. This is a critical issue in sequential decision-making scenarios where treatment decisions and outcomes evolve dynamically over time. Advanced causal methods use a transformer as a backbone to model such time sequences, which is effective at capturing long-range temporal dependence and periodic patterns via the attention mechanism. However, even though they control for observed confounding, these estimators still suffer from unmeasured confounders, which influence both treatment assignments and outcomes. How to adjust for the latent confounding bias in sequential treatment effect estimation remains an open challenge. Therefore, we propose a novel Decomposing Sequential Instrumental Variable framework for CounterFactual Regression (DSIV-CFR), relying on a common negative control assumption. Specifically, an instrumental variable (IV) is a special negative control exposure, while the previous outcome serves as a negative control outcome. This allows us to recover the IVs latent in the observed variables and estimate sequential treatment effects via a generalized moment condition. We conducted experiments on 4 datasets and achieved strong performance in one- and multi-step prediction, which in turn lets us identify optimal treatments for dynamic systems.
Accelerating Machine Learning Systems via Category Theory: Applications to Spherical Attention for Gene Regulatory Networks
Abbott, Vincent, Kamiya, Kotaro, Glowacki, Gerard, Atsumi, Yu, Zardini, Gioele, Maruyama, Yoshihiro
How do we enable artificial intelligence models to improve themselves? This is central to exponentially improving generalized artificial intelligence models, which can improve their own architecture to handle new problem domains in an efficient manner that leverages the latest hardware. However, current automated compilation methods are poor, and efficient algorithms require years of human development. In this paper, we use neural circuit diagrams, based in category theory, to prove a general theorem related to deep learning algorithms, guide the development of a novel attention algorithm catered to the domain of gene regulatory networks, and produce a corresponding efficient kernel. The algorithm we propose, spherical attention, shows that neural circuit diagrams enable a principled and systematic method for reasoning about deep learning architectures and providing high-performance code. By replacing SoftMax with an $L^2$ norm as suggested by diagrams, it overcomes the special function unit bottleneck of standard attention while retaining the streaming property essential to high performance. Our diagrammatically derived FlashSign kernel achieves comparable performance to the state-of-the-art, fine-tuned FlashAttention algorithm on an A100, and $3.6\times$ the performance of PyTorch. Overall, this investigation shows neural circuit diagrams' suitability as a high-level framework for the automated development of efficient, novel artificial intelligence architectures.
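As a rough illustration of what swapping SoftMax for an $L^2$ norm could look like, the sketch below normalizes raw query-key scores by their Euclidean norm and accumulates the result in a single pass over the keys. The function name `l2_attention_streaming`, the `eps` parameter, and the exact normalization are assumptions made for this example; the abstract does not specify the kernel, so this should not be read as the authors' FlashSign implementation.

```python
import math

def l2_attention_streaming(q, keys, values, eps=1e-12):
    """Single-query attention with L2-normalized scores (illustrative only).

    q: query vector; keys/values: lists of vectors of equal count.
    eps: hypothetical small constant guarding against division by zero.
    """
    sq_sum = 0.0                    # running sum of squared scores
    acc = [0.0] * len(values[0])    # running unnormalized output
    for k, v in zip(keys, values):
        s = sum(qi * ki for qi, ki in zip(q, k))   # raw dot-product score
        sq_sum += s * s
        acc = [a + s * vi for a, vi in zip(acc, v)]
    # One square root at the end; no exponential is evaluated per score.
    norm = math.sqrt(sq_sum) + eps
    return [a / norm for a in acc]

out = l2_attention_streaming([1.0, 0.0],
                             [[3.0, 0.0], [4.0, 0.0]],
                             [[1.0, 0.0], [0.0, 1.0]])
```

In this sketch the keys are visited once and only two running quantities are kept, which is one way to read the "streaming property" mentioned in the abstract.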
Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training
Chen, Yangyi, Peng, Hao, Zhang, Tong, Ji, Heng
In standard large vision-language model (LVLM) pre-training, the model typically maximizes the joint probability of the caption conditioned on the image via next-token prediction (NTP); however, since only a small subset of caption tokens directly relates to the visual content, this naive NTP unintentionally fits the model to noise and increases the risk of hallucination. We present PRIOR, a simple vision-language pre-training approach that addresses this issue by prioritizing image-related tokens through differential weighting in the NTP loss, drawing from the importance sampling framework. PRIOR introduces a reference model, a text-only large language model (LLM) trained on the captions without image inputs, to weight each token based on its probability during LVLM training. Intuitively, tokens that are directly related to the visual inputs are harder to predict without the image and thus receive lower probabilities from the text-only reference LLM. During training, we implement a token-specific re-weighting term based on the importance scores to adjust each token's loss. We implement PRIOR in two distinct settings: LVLMs with visual encoders and LVLMs without visual encoders. We observe 19% and 8% average relative improvement, respectively, on several vision-language benchmarks compared to NTP. In addition, PRIOR exhibits superior scaling properties, as demonstrated by significantly higher scaling coefficients, indicating greater potential for performance gains compared to NTP given increasing compute and data.
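The token re-weighting idea can be sketched in a few lines. The abstract does not give the exact weighting function, so the form used here, `(1 - p_ref) ** alpha` normalized to an average of 1, is an assumption chosen only to show the direction of the effect; `reweighted_ntp_loss` and `alpha` are hypothetical names, not part of PRIOR.

```python
import math

def reweighted_ntp_loss(model_logprobs, ref_probs, alpha=1.0):
    """Weighted negative log-likelihood over caption tokens (illustrative).

    model_logprobs: log p(token | image, prefix) from the model being trained.
    ref_probs: p(token | prefix) from a text-only reference LLM.
    alpha: hypothetical exponent controlling how sharply weights vary.
    """
    # Assumed weighting: tokens the text-only model finds unlikely get more weight.
    raw = [(1.0 - p) ** alpha for p in ref_probs]
    mean_raw = sum(raw) / len(raw)
    if mean_raw == 0.0:
        weights = [1.0] * len(raw)           # fall back to plain NTP
    else:
        weights = [w / mean_raw for w in raw]  # weights average to 1
    loss = -sum(w * lp for w, lp in zip(weights, model_logprobs)) / len(weights)
    return loss, weights

# Token 0 is easy to guess from text alone; token 1 likely depends on the image.
loss, weights = reweighted_ntp_loss([math.log(0.5), math.log(0.5)], [0.9, 0.1])
```

Here the second token, which the text-only reference model predicts poorly, receives the larger weight and so contributes more to the loss.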
Ornithologist: Towards Trustworthy "Reasoning" about Central Bank Communications
I develop Ornithologist, a weakly-supervised textual classification system, and use it to measure the hawkishness and dovishness of central bank text. Ornithologist uses "taxonomy-guided reasoning", guiding a large language model with human-authored decision trees. This increases the transparency and explainability of the system and makes it accessible to non-experts. It also reduces hallucination risk. Since it requires less supervision than traditional classification systems, it can more easily be applied to other problems or sources of text (e.g. news) without much modification. Ornithologist measurements of hawkishness and dovishness of RBA communication carry information about the future of the cash rate path and of market expectations.
Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer
Nguyen, Minh Hoang, Van, Linh Le Pham, Karimpanal, Thommen George, Gupta, Sunil, Le, Hung
Decision Transformers (DT) play a crucial role in modern reinforcement learning, leveraging offline datasets to achieve impressive results across various domains. However, DT requires high-quality, comprehensive data to perform optimally. In real-world applications, the lack of training data and the scarcity of optimal behaviours make training on offline datasets challenging, as suboptimal data can hinder performance. To address this, we propose the Counterfactual Reasoning Decision Transformer (CRDT), a novel framework inspired by counterfactual reasoning. CRDT enhances DT's ability to reason beyond known data by generating and utilizing counterfactual experiences, enabling improved decision-making in unseen scenarios. Experiments across Atari and D4RL benchmarks, including scenarios with limited data and altered dynamics, demonstrate that CRDT outperforms conventional DT approaches. Additionally, reasoning counterfactually allows the DT agent to obtain stitching abilities, combining suboptimal trajectories, without architectural modifications. These results highlight the potential of counterfactual reasoning to enhance reinforcement learning agents' performance and generalization capabilities.
SafeNav: Safe Path Navigation using Landmark Based Localization in a GPS-denied Environment
Sapkota, Ganesh, Madria, Sanjay
In battlefield environments, adversaries frequently disrupt GPS signals, requiring alternative localization and navigation methods. Traditional vision-based approaches like Simultaneous Localization and Mapping (SLAM) and Visual Odometry (VO) involve complex sensor fusion and high computational demand, whereas range-free methods like DV-HOP face accuracy and stability challenges in sparse, dynamic networks. This paper proposes LanBLoc-BMM, a navigation approach using landmark-based localization (LanBLoc) combined with a battlefield-specific motion model (BMM) and Extended Kalman Filter (EKF). Its performance is benchmarked against three state-of-the-art visual localization algorithms integrated with BMM and Bayesian filters, evaluated on synthetic and real-imitated trajectory datasets using metrics including Average Displacement Error (ADE), Final Displacement Error (FDE), and a newly introduced Average Weighted Risk Score (AWRS). LanBLoc-BMM (with EKF) demonstrates superior performance in ADE, FDE, and AWRS on real-imitated datasets. Additionally, two safe navigation methods, SafeNav-CHull and SafeNav-Centroid, are introduced by integrating LanBLoc-BMM(EKF) with a novel Risk-Aware RRT* (RAw-RRT*) algorithm for obstacle avoidance and risk exposure minimization. Simulation results in battlefield scenarios indicate SafeNav-Centroid excels in accuracy, risk exposure, and trajectory efficiency, while SafeNav-CHull provides superior computational speed.
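For readers unfamiliar with the two standard trajectory metrics named above, the following sketch computes ADE and FDE under their usual definitions: the mean Euclidean error over all time steps, and the Euclidean error at the final step. The function name `displacement_errors` is hypothetical, and the newly introduced AWRS is left out because the abstract does not define it.

```python
import math

def displacement_errors(predicted, actual):
    """Return (ADE, FDE) for two equal-length trajectories of points."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between predicted and true position at each step.
    dists = [math.dist(p, a) for p, a in zip(predicted, actual)]
    ade = sum(dists) / len(dists)   # average over all steps
    fde = dists[-1]                 # error at the final step only
    return ade, fde

ade, fde = displacement_errors([(0, 0), (1, 0), (2, 0)],
                               [(0, 0), (1, 1), (2, 2)])
```

In this toy case the per-step errors are 0, 1 and 2, so the two metrics differ: ADE summarizes the whole path while FDE reflects only where the trajectory ends.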
Scientists confirm woke change made to Barbie over the course of 35 years - so did you notice it?
Barbie is one of the most successful children's toys in history, spawning a multimedia franchise that includes merchandise, video games and a live-action film. Since US toy giant Mattel launched the original Barbie in 1959, more than 1 billion of the dolls have been sold worldwide. Certainly, Barbie's looks have been tweaked over the years to reflect changing beauty ideals and societal shifts. But according to a new study, one subtle change to Barbie has gone largely unnoticed – until now. Scientists in Australia have found that Barbies today have flatter feet than they did in past decades.
Who needs Eurovision when we have the Dance Your PhD contest?
Feedback is New Scientist's popular sideways look at the latest science and technology news. You can submit items you believe may amuse readers to Feedback by emailing feedback@newscientist.com. Saturday 17 May will see the final of this year's Eurovision Song Contest, which will be the most over-the-top evening of television since, well, the previous Eurovision. Feedback is deeply relieved that Feedback Jr appears not to be interested this year, so we might escape having to sit up and watch the entire thing. While we are deeply supportive of the contest's kind and welcoming vibe, most of the songs make our ears bleed.
DoorDash dives into delicious drone deliveries
And in so many ways, it kinda sucks. A new graphics card costs more than a mortgage payment because billionaires are sucking up all the GPUs to boil the planet and make Hayao Miyazaki cry at the same time, and I still don't have a Marty McFly hoverboard. But at least I can order fast food that literally flies to my door. In fact, I could order a flying curry delivery if I lived in Charlotte, North Carolina--specifically, within four miles of the Arboretum Shopping Center--where DoorDash is now offering food deliveries via drone. You can choose from a limited selection of local eateries, including Panera Bread, Matcha Cafe Maiko, and Joa Korean.
Far-right extremists guilty of planning attacks
Three far-right extremists who amassed hundreds of weapons and planned to carry out attacks on targets including a mosque have been convicted of terrorism offences. Brogan Stewart, 25, from West Yorkshire, Christopher Ringrose, 34, from Staffordshire, and Marco Pitzettu, 25, from Derbyshire, were part of an online group who "idolised the Nazi regime". Sheffield Crown Court was told how Stewart had detailed torturing a Muslim leader using an "information extraction kit". All three were found guilty of terrorism offences at the same court on Wednesday and are due to be sentenced on 17 July. During the nine-week trial, the court heard more than 200 weapons including machetes, hunting knives, swords and crossbows were found at their homes. Ringrose had also begun to build a 3D-printed semi-automatic firearm, which counter-terror police said would have been a "lethal weapon".