 Pacific Ocean


Online Training of Large Language Models: Learn while chatting

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have dramatically revolutionized the field of Natural Language Processing (NLP), offering remarkable capabilities that have garnered widespread usage. However, existing interaction paradigms between LLMs and users are constrained by inflexibility, limited customization, or a lack of persistent learning. This inflexibility is particularly evident as users, especially those without programming skills, have restricted avenues to enhance or personalize the model. Existing frameworks further complicate the model training and deployment process due to their computational inefficiencies and lack of user-friendly interfaces. To overcome these challenges, this paper introduces a novel interaction paradigm, 'Online Training using External Interactions', that merges the benefits of persistent, real-time model updates with the flexibility for individual customization through external interactions such as AI agents or online/offline knowledge bases.


DNNLasso: Scalable Graph Learning for Matrix-Variate Data

arXiv.org Artificial Intelligence

We consider the problem of jointly learning row-wise and column-wise dependencies of matrix-variate observations, which are modelled separately by two precision matrices. Due to the complicated structure of Kronecker-product precision matrices in the commonly used matrix-variate Gaussian graphical models, a sparser Kronecker-sum structure was proposed recently based on the Cartesian product of graphs. However, existing methods for estimating Kronecker-sum structured precision matrices do not scale well to large scale datasets. In this paper, we introduce DNNLasso, a diagonally non-negative graphical lasso model for estimating the Kronecker-sum structured precision matrix, which outperforms the state-of-the-art methods by a large margin in both accuracy and computational time. Our code is available at https://github.com/YangjingZhang/DNNLasso.
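As a rough illustration of why a Kronecker-sum structure is sparser than a Kronecker-product one, the sketch below builds both from two small matrices using the standard definition A ⊕ B = A ⊗ I_q + I_p ⊗ B. The toy matrices `A` and `B` and all helper names are assumptions made for this example; this is not the DNNLasso estimator itself, only the structure it targets.

```python
def kron(A, B):
    """Kronecker product of a p x p matrix A and a q x q matrix B (lists of lists)."""
    p, q = len(A), len(B)
    return [[A[i // q][j // q] * B[i % q][j % q] for j in range(p * q)]
            for i in range(p * q)]


def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kron_sum(A, B):
    """Kronecker sum: A (+) B = A (x) I_q + I_p (x) B."""
    p, q = len(A), len(B)
    left = kron(A, identity(q))
    right = kron(identity(p), B)
    return [[x + y for x, y in zip(row_l, row_r)]
            for row_l, row_r in zip(left, right)]


def count_nonzeros(M):
    """Number of non-zero entries in a matrix."""
    return sum(1 for row in M for x in row if x != 0)


# Hypothetical toy matrices standing in for the row-wise and
# column-wise precision matrices (both fully dense here).
A = [[2, -1],
     [-1, 2]]
B = [[3, -1, -1],
     [-1, 3, -1],
     [-1, -1, 3]]

print(count_nonzeros(kron(A, B)))      # 36: the 6 x 6 product is fully dense
print(count_nonzeros(kron_sum(A, B)))  # 24: the sum leaves 12 entries at zero
```

Even with dense factors, the Kronecker sum leaves zero every entry whose row and column indices differ in both the row-factor and the column-factor position, which corresponds to the edge set of the Cartesian product of the two graphs.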


ConvTimeNet: A Deep Hierarchical Fully Convolutional Model for Multivariate Time Series Analysis

arXiv.org Artificial Intelligence

This paper introduces ConvTimeNet, a novel deep hierarchical fully convolutional network designed to serve as a general-purpose model for time series analysis. The key design of this network is twofold, designed to overcome the limitations of traditional convolutional networks. Firstly, we propose an adaptive segmentation of time series into sub-series level patches, treating these as fundamental modeling units. This setting avoids the sparsity semantics associated with raw point-level time steps. Secondly, we design a fully convolutional block by skillfully integrating deepwise and pointwise convolution operations, following the advanced building block style employed in Transformer encoders.

Over a significant period in the past, the convolutional network [He et al., 2016; Zheng et al., 2014; Middlehurst et al., 2023] has played a crucial role in time series analysis, largely due to its inherent properties that strike an excellent balance between computational efficiency and representation quality. Dating back over the past years, many representative works [Bagnall et al., 2017] of time series analysis typically employ convolutional networks as the backbone. For instance, temporal convolutional network (TCN [Bai et al., 2018]) and its variants are widely used in modeling temporal variation dependence for the time series forecasting task. Furthermore, a large number of works (such as InceptionTime [Ismail Fawaz et al., 2020], MiniRocket [Dempster et al., 2021], and MCNN [Cui et al., 2016]) are also proposed by employing convolutional networks to identify informative patterns


CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables

arXiv.org Machine Learning

For Multivariate Time Series Forecasting (MTSF), recent deep learning applications show that univariate models frequently outperform multivariate ones. To address the deficiency in multivariate models, we introduce a method to Construct Auxiliary Time Series (CATS) that functions like a 2D temporal-contextual attention mechanism, which generates Auxiliary Time Series (ATS) from Original Time Series (OTS) to effectively represent and incorporate inter-series relationships for forecasting. Key principles of ATS (continuity, sparsity, and variability) are identified and implemented through different modules. Even with a basic 2-layer MLP as the core predictor, CATS achieves state-of-the-art performance, significantly reducing complexity and parameters compared to previous multivariate models, making it an efficient and transferable MTSF solution.


Waymo Will Bring Autonomous Taxis to Los Angeles--Its Biggest Challenge Yet

WIRED

Paid autonomous vehicle service is coming to Los Angeles, thanks to a decision by California regulators today to allow Alphabet subsidiary Waymo to operate in the city. Under the new ruling, Waymo is also permitted to launch service in a large section of the San Francisco Peninsula. The decision by the California Public Utilities Commission will likely prove controversial. It comes over the protest of local governments and agencies, including the Los Angeles Department of Transportation, the San Francisco County Transportation Authority, the city of South San Francisco, and the County of San Mateo. All argued that local government and citizens should have more input and oversight over the expanded autonomous taxi service.


TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables

arXiv.org Artificial Intelligence

Recent studies have demonstrated remarkable performance in time series forecasting. However, due to the partially-observed nature of real-world applications, solely focusing on the target of interest, so-called endogenous variables, is usually insufficient to guarantee accurate forecasting. Notably, a system is often recorded into multiple variables, where the exogenous series can provide valuable external information for endogenous variables. Thus, unlike prior well-established multivariate or univariate forecasting that either treats all the variables equally or overlooks exogenous information, this paper focuses on a practical setting, which is time series forecasting with exogenous variables. We propose a novel framework, TimeXer, to utilize external information to enhance the forecasting of endogenous variables. With a deftly designed embedding layer, TimeXer empowers the canonical Transformer architecture with the ability to reconcile endogenous and exogenous information, where patch-wise self-attention and variate-wise cross-attention are employed. Moreover, a global endogenous variate token is adopted to effectively bridge the exogenous series into endogenous temporal patches. Experimentally, TimeXer significantly improves time series forecasting with exogenous variables and achieves consistent state-of-the-art performance in twelve real-world forecasting benchmarks.


Enhancing Multivariate Time Series Forecasting with Mutual Information-driven Cross-Variable and Temporal Modeling

arXiv.org Machine Learning

Recent advancements have underscored the impact of deep learning techniques on multivariate time series forecasting (MTSF). Generally, these techniques are bifurcated into two categories: Channel-independence and Channel-mixing approaches. Although Channel-independence methods typically yield better results, Channel-mixing could theoretically offer improvements by leveraging inter-variable correlations. Nonetheless, we argue that the integration of uncorrelated information in channel-mixing methods could curtail the potential enhancement in MTSF model performance. To substantiate this claim, we introduce the Cross-variable Decorrelation Aware feature Modeling (CDAM) for Channel-mixing approaches, aiming to refine Channel-mixing by minimizing redundant information between channels while enhancing relevant mutual information. Furthermore, we introduce the Temporal correlation Aware Modeling (TAM) to exploit temporal correlations, a step beyond conventional single-step forecasting methods. This strategy maximizes the mutual information between adjacent sub-sequences of both the forecasted and target series. Combining CDAM and TAM, our novel framework significantly surpasses existing models, including those previously considered state-of-the-art, in comprehensive tests.


Thousands of humpback whales starved to death after marine heatwave

New Scientist

The number of humpback whales in the North Pacific Ocean fell by 20 per cent between 2012 and 2021, according to a study that used artificial intelligence to identify individual whales from photos of their tails. The decline coincided with a massive marine heatwave sometimes called the blob, which began in 2013 and lasted until 2016. The unprecedented intensity of the blob was almost certainly the result of global warming. The findings suggest that around 7000 whales starved to death because of the marine heatwave, says Ted Cheeseman at Southern Cross University in Australia. The blob is known to have caused mass die-offs of many other animals, such as seabirds.


Towards Generalist Prompting for Large Language Models by Mental Models

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated impressive performance on many tasks. However, to achieve optimal performance, specially designed prompting methods are still needed. These methods either rely on task-specific few-shot examples that require a certain level of domain knowledge, or are designed to be simple but only perform well on a few types of tasks. In this work, we introduce the concept of generalist prompting, which operates on the design principle of achieving optimal or near-optimal performance on a wide range of tasks while eliminating the need for manual selection and customization of prompts tailored to specific problems. Furthermore, we propose MeMo (Mental Models), an innovative prompting method that is simple in design yet effectively fulfills the criteria of generalist prompting. MeMo distills the cores of various prompting methods into individual mental models and allows LLMs to autonomously select the most suitable mental models for the problem, achieving or approaching state-of-the-art results on diverse tasks such as STEM, logical reasoning, and commonsense reasoning in zero-shot settings. We hope that the insights presented herein will stimulate further exploration of generalist prompting methods for LLMs.


AI technology could help US, allies monitor China's Taiwan invasion intentions

FOX News

China has stepped up its diplomatic and military pressure against Taiwan, alarming U.S. officials and allies in the region that Beijing is looking to take back the island by force. If projections of a Chinese military invasion to retake Taiwan are accurate, the U.S. can utilize artificial intelligence (AI) and other technology that will indicate to forces in the region that China isn't engaging in yet another provocative military exercise but is launching the invasion so many predict. According to experts, AI and machine learning (ML) can help the U.S. and its allies in the region improve the speed and efficiency of war plan development, intelligence assessments and targeting effectiveness.