Biden Proposes New Export Curbs on AI Chips, Provoking an Industry Pushback
The Biden administration is proposing a new framework for exporting the advanced computer chips used to develop artificial intelligence, an attempt to balance national security concerns about the technology with the economic interests of producers and other countries. But the framework proposed Monday also raised concerns among chip industry executives, who say the rules would limit access to existing chips used for video games and restrict the chips used for data centers and AI products in 120 countries. Mexico, Portugal, Israel and Switzerland are among the nations that could have limited access. Commerce Secretary Gina Raimondo said on a call with reporters previewing the framework that it's "critical" to preserve America's leadership in AI and the development of AI-related computer chips. The fast-evolving AI technology enables computers to produce novels, make scientific research breakthroughs, automate driving and foster a range of other transformations that could reshape economies and warfare.
New US Rule Aims to Block China's Access to AI Chips and Models by Restricting the World
The Biden administration announced a bold and controversial new export control scheme today, designed to prevent advanced chips and artificial intelligence models themselves from ending up in the hands of adversaries such as China. The administration's new "AI Diffusion rule" divides the world into nations that are allowed relatively unfettered access to America's most advanced AI silicon and algorithms, and those that will require special licenses to access the technology. The rule, which will be enforced by the Commerce Department's Bureau of Industry and Security, also seeks to restrict the movement of the most powerful AI models for the first time. "The US leads the world in AI now, both AI development and AI chip design, and it's critical that we keep it that way," the US Commerce Secretary Gina Raimondo said ahead of today's announcement. The trusted nations are the UK, Canada, Australia, Japan, France, Germany, Belgium, Denmark, Finland, Ireland, Italy, the Netherlands, New Zealand, Norway, the Republic of Korea, Spain, Sweden and Taiwan.
Skip Mamba Diffusion for Monocular 3D Semantic Scene Completion
Liang, Li, Akhtar, Naveed, Vice, Jordan, Kong, Xiangrui, Mian, Ajmal Saeed
3D semantic scene completion is critical for multiple downstream tasks in autonomous systems. It estimates missing geometric and semantic information in the acquired scene data. Due to challenging real-world conditions, this task usually demands complex models that process multi-modal data to achieve acceptable performance. We propose a unique neural model, leveraging advances from state space and diffusion generative modeling to achieve remarkable 3D semantic scene completion performance with monocular image input. Our technique processes the data in the conditioned latent space of a variational autoencoder where diffusion modeling is carried out with an innovative state space technique. A key component of our neural network is the proposed Skimba (Skip Mamba) denoiser, which is adept at efficiently processing long-sequence data. The Skimba diffusion model is integral to our 3D scene completion network, incorporating a triple Mamba structure, dimensional decomposition residuals and varying dilations along three directions. We also adopt a variant of this network for the subsequent semantic segmentation stage of our method. Extensive evaluation on the standard SemanticKITTI and SSCBench-KITTI360 datasets shows that our approach not only outperforms other monocular techniques by a large margin but also achieves competitive performance against stereo methods. The code is available at https://github.com/xrkong/skimba
A monthly sub-national Harmonized Food Insecurity Dataset for comprehensive analysis and predictive modeling
Machefer, Mélissande, Ronco, Michele, Thomas, Anne-Claire, Assouline, Michael, Rabier, Melanie, Corbane, Christina, Rembold, Felix
Food security is a complex, multidimensional concept that is challenging to measure comprehensively. Effective anticipation, monitoring, and mitigation of food crises require timely and comprehensive global data. This paper introduces the Harmonized Food Insecurity Dataset (HFID), an open-source resource consolidating four key data sources: the Integrated Food Security Phase Classification (IPC)/Cadre Harmonisé (CH) phases, the Famine Early Warning Systems Network (FEWS NET) IPC-compatible phases, and the World Food Program's (WFP) Food Consumption Score (FCS) and reduced Coping Strategy Index (rCSI). Updated monthly and using a common reference system for administrative units, the HFID offers extensive spatial and temporal coverage. It serves as a vital tool for food security experts and humanitarian agencies, providing a unified resource for analyzing food security conditions and highlighting global data disparities. The scientific community can also leverage the HFID to develop data-driven predictive models, enhancing the capacity to forecast and prevent future food crises.
Enhancing Team Diversity with Generative AI: A Novel Project Management Framework
This research-in-progress paper presents a new project management framework that utilises GenAI technology. The framework is designed to address the common challenge of uniform team compositions in academic and research project teams, particularly in universities and research institutions. It does so by integrating sociologically identified patterns of successful team member personalities and roles, using GenAI agents to fill gaps in team dynamics. This approach adds a layer of analysis to conventional project management processes by evaluating team members' personalities and roles and employing GenAI agents, fine-tuned on personality datasets, to fill specific team roles. Our initial experiments have shown improvements in the model's ability to understand and process personality traits, suggesting the potential effectiveness of GenAI teammates in real-world project settings. This paper aims to explore the practical application of AI in enhancing team diversity and project management.
VaeDiff-DocRE: End-to-end Data Augmentation Framework for Document-level Relation Extraction
Tran, Khai Phan, Hua, Wen, Li, Xue
Document-level Relation Extraction (DocRE) aims to identify relationships between entity pairs within a document. However, most existing methods assume a uniform label distribution, resulting in suboptimal performance on real-world, imbalanced datasets. To tackle this challenge, we propose a novel data augmentation approach using generative models to enhance data from the embedding space. Our method leverages the Variational Autoencoder (VAE) architecture to capture all relation-wise distributions formed by entity pair representations and augment data for underrepresented relations. To better capture the multi-label nature of DocRE, we parameterize the VAE's latent space with a Diffusion Model. Additionally, we introduce a hierarchical training framework to integrate the proposed VAE-based augmentation module into DocRE systems. Experiments on two benchmark datasets demonstrate that our method outperforms state-of-the-art models, effectively addressing the long-tail distribution problem in DocRE.
TSEML: A task-specific embedding-based method for few-shot classification of cancer molecular subtypes
Su, Ran, Shi, Rui, Cui, Hui, Xuan, Ping, Fang, Chengyan, Feng, Xikang, Jin, Qiangguo
Molecular subtyping of cancer is recognized as a critical and challenging upstream task for personalized therapy. Existing deep learning methods have achieved significant performance in this domain when abundant data samples are available. However, the acquisition of densely labeled samples for cancer molecular subtypes remains a significant challenge for conventional data-intensive deep learning approaches. In this work, we focus on the few-shot molecular subtype prediction problem in heterogeneous and small cancer datasets, aiming to enhance precise diagnosis and personalized treatment. We first construct a new few-shot dataset for cancer molecular subtype classification and auxiliary cancer classification, named TCGA Few-Shot, from existing publicly available datasets. To effectively leverage the relevant knowledge from both tasks, we introduce a task-specific embedding-based meta-learning framework (TSEML). TSEML leverages the synergistic strengths of a model-agnostic meta-learning (MAML) approach and a prototypical network (ProtoNet) to capture diverse and fine-grained features. Comparative experiments conducted on the TCGA Few-Shot dataset demonstrate that our TSEML framework achieves superior performance in addressing the problem of few-shot molecular subtype classification.
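The prototypical-network component mentioned above can be illustrated with a minimal sketch. It shows only the generic ProtoNet classification rule (class prototypes as mean embeddings, nearest-prototype assignment), not the TSEML framework or its MAML component. The embeddings are assumed to be precomputed, and the function names `class_prototypes` and `nearest_prototype` are hypothetical.

```python
import numpy as np

def class_prototypes(support_x, support_y):
    """Return the class labels and the mean embedding (prototype) of each class."""
    support_x = np.asarray(support_x, dtype=float)
    support_y = np.asarray(support_y)
    classes = np.unique(support_y)
    protos = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    return classes, protos

def nearest_prototype(query_x, classes, protos):
    """Assign each query embedding to the class with the closest prototype."""
    query_x = np.asarray(query_x, dtype=float)
    # Squared Euclidean distance between every query and every prototype.
    dists = ((query_x[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[dists.argmin(axis=1)]

# Toy 2-way, 2-shot episode with made-up 2-D embeddings (illustrative only).
support_x = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
support_y = [0, 0, 1, 1]
classes, protos = class_prototypes(support_x, support_y)
predictions = nearest_prototype([[1.0, 1.0], [9.0, 9.0]], classes, protos)
```

In a real few-shot pipeline the embeddings would come from a learned encoder, and the distances would typically be turned into class probabilities with a softmax during training.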
Mitigating Out-of-Entity Errors in Named Entity Recognition: A Sentence-Level Strategy
Jiang, Guochao, Luo, Ziqin, Hu, Chengwei, Ding, Zepeng, Yang, Deqing
Many previous models of named entity recognition (NER) suffer from the Out-of-Entity (OOE) problem, i.e., the tokens in the entity mentions of the test samples have not appeared in the training samples, which hinders the achievement of satisfactory performance. To improve OOE-NER performance, in this paper, we propose a new framework, namely S+NER, which fully leverages sentence-level information. Our S+NER achieves better OOE-NER performance mainly due to the following two designs. 1) It first exploits the pre-trained language model's capability of understanding the target entity's sentence-level context with a template set. 2) Then, it refines the sentence-level representation based on the positive and negative templates, through a contrastive learning strategy and template pooling method, to obtain better NER results. Our extensive experiments on five benchmark datasets demonstrate that S+NER outperforms some state-of-the-art OOE-NER models.
Quantifying Aleatoric Uncertainty of the Treatment Effect: A Novel Orthogonal Learner
Melnychuk, Valentyn, Feuerriegel, Stefan, van der Schaar, Mihaela
Estimating causal quantities from observational data is crucial for understanding the safety and effectiveness of medical treatments. However, to make reliable inferences, medical practitioners need not only to estimate averaged causal quantities, such as the conditional average treatment effect, but also to understand the randomness of the treatment effect as a random variable. This randomness is referred to as aleatoric uncertainty and is necessary for understanding the probability of benefit from treatment or quantiles of the treatment effect. Yet, the aleatoric uncertainty of the treatment effect has received surprisingly little attention in the causal machine learning community. To fill this gap, we aim to quantify the aleatoric uncertainty of the treatment effect at the covariate-conditional level, namely, the conditional distribution of the treatment effect (CDTE). Unlike average causal quantities, the CDTE is not point identifiable without strong additional assumptions. As a remedy, we employ partial identification to obtain sharp bounds on the CDTE and thereby quantify the aleatoric uncertainty of the treatment effect. We then develop a novel, orthogonal learner for the bounds on the CDTE, which we call AU-learner. We further show that our AU-learner has several strengths in that it satisfies Neyman-orthogonality and, thus, quasi-oracle efficiency. Finally, we propose a fully-parametric deep learning instantiation of our AU-learner.
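The abstract does not state the form of the bounds. As a minimal sketch of what partial identification of a treatment-effect distribution can look like, the code below assumes the classical Makarov bounds on the CDF of a difference of two variables whose marginals are known but whose joint distribution is not. This choice is an assumption made for illustration; the covariate conditioning and the orthogonal learner are omitted, and the names `ecdf` and `makarov_bounds` are hypothetical.

```python
import numpy as np

def ecdf(samples):
    """Return the empirical CDF of a 1-D sample as a vectorized function."""
    s = np.sort(np.asarray(samples, dtype=float))
    def cdf(y):
        return np.searchsorted(s, y, side="right") / s.size
    return cdf

def makarov_bounds(y1, y0, delta, grid):
    """Bounds on P(Y1 - Y0 <= delta) using only the two marginal samples.

    lower = sup_y max(F1(y) - F0(y - delta), 0)
    upper = 1 + inf_y min(F1(y) - F0(y - delta), 0)
    The sup and inf are approximated on the supplied grid of y values.
    """
    grid = np.asarray(grid, dtype=float)
    f1, f0 = ecdf(y1), ecdf(y0)
    diff = f1(grid) - f0(grid - delta)
    lower = max(float(diff.max()), 0.0)
    upper = 1.0 + min(float(diff.min()), 0.0)
    return lower, upper

# Degenerate check: treated outcome is always 2, control outcome is always 0,
# so the effect is exactly 2 and both bounds should collapse onto the truth.
grid = np.linspace(-5.0, 5.0, 101)
bounds_above = makarov_bounds([2.0], [0.0], delta=3.0, grid=grid)
bounds_below = makarov_bounds([2.0], [0.0], delta=1.0, grid=grid)
```

With non-degenerate outcome distributions the two bounds generally differ, and the gap between them reflects what cannot be learned about the effect distribution from the marginals alone.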
Prediction Interval Construction Method for Electricity Prices
Accurate prediction of electricity prices plays an essential role in the electricity market. To reflect the uncertainty of electricity prices, price intervals are predicted. This paper proposes a novel prediction interval construction method. A conditional generative adversarial network is first presented to generate electricity price scenarios, with which the prediction intervals can be constructed. Then, different generated scenarios are stacked to obtain the probability densities, which can be applied to accurately reflect the uncertainty of electricity prices. Furthermore, a reinforced prediction mechanism based on the volatility level of weather factors is introduced to address price spikes and volatile prices. A case study is conducted to verify the effectiveness of the proposed prediction interval construction method. The method can also provide the probability density of each price scenario within the prediction interval and, with its reinforced prediction mechanism, is better able to handle volatile prices and price spikes.
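The step from generated scenarios to intervals and densities can be sketched as follows. The sketch assumes the scenarios are already available as an array of shape (number of scenarios, number of time steps); random numbers stand in for the generator's output here. Empirical quantiles and a histogram are one plausible reading of "stacking" scenarios, not necessarily the authors' exact procedure, and the names `prediction_interval` and `scenario_density` are hypothetical.

```python
import numpy as np

def prediction_interval(scenarios, coverage=0.9):
    """Per-time-step central interval from stacked price scenarios."""
    scenarios = np.asarray(scenarios, dtype=float)
    alpha = 1.0 - coverage
    lower = np.quantile(scenarios, alpha / 2.0, axis=0)
    upper = np.quantile(scenarios, 1.0 - alpha / 2.0, axis=0)
    return lower, upper

def scenario_density(scenarios, step, bins=20):
    """Histogram estimate of the price density at one time step."""
    scenarios = np.asarray(scenarios, dtype=float)
    density, edges = np.histogram(scenarios[:, step], bins=bins, density=True)
    return density, edges

# Stand-in for generator output: 500 scenarios over 24 hourly prices.
rng = np.random.default_rng(42)
scenarios = 50.0 + 10.0 * rng.standard_normal((500, 24))
lower, upper = prediction_interval(scenarios, coverage=0.9)
density, edges = scenario_density(scenarios, step=0)
```

Each entry of `lower` and `upper` bounds the price at one hour, and `density` integrates to one over the bins given by `edges`.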