
ARIES: Relation Assessment and Model Recommendation for Deep Time Series Forecasting

Wang, Fei, Li, Yujie, Shao, Zezhi, Yu, Chengqing, Fu, Yisong, An, Zhulin, Xu, Yongjun, Cheng, Xueqi

arXiv.org Artificial Intelligence

Recent advancements in deep learning models for time series forecasting have been significant. These models often leverage fundamental time series properties such as seasonality and non-stationarity, which may suggest an intrinsic link between model performance and data properties. However, existing benchmark datasets fail to offer diverse and well-defined temporal patterns, restricting the systematic evaluation of such connections. Additionally, there is no effective model recommendation approach, leading to high time and cost expenditures when testing different architectures across different downstream applications. For these reasons, we propose ARIES, a framework for assessing the relations between time series properties and modeling strategies, and for recommending deep forecasting models for realistic time series. First, we construct a synthetic dataset with multiple distinct patterns, and design a comprehensive system to compute the properties of time series. Next, we conduct an extensive benchmark of over 50 forecasting models and establish the relationship between time series properties and modeling strategies. Our experimental results reveal a clear correlation. Based on these findings, we propose the first deep forecasting model recommender, capable of providing interpretable suggestions for real-world time series. In summary, ARIES is the first study to establish the relations between the properties of time series data and modeling strategies, while also implementing a model recommendation system. The code is available at: https://github.com/blisky-li/ARIES.
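As a minimal illustration of the kind of time series property the abstract refers to, the sketch below computes a crude seasonality indicator from the lag autocorrelation. The metric and function names are our own illustration, not taken from the ARIES paper, which uses its own comprehensive property-computation system.

```python
import numpy as np

def lag_autocorrelation(x, lag):
    """Autocorrelation of a mean-centered 1-D series at a given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    if denom == 0:
        return 0.0
    return float(np.dot(x[:-lag], x[lag:]) / denom)

def seasonality_strength(x, period):
    """Crude seasonality indicator: autocorrelation at the seasonal lag.
    Values near 1 suggest a strong periodic component; near 0, none."""
    return lag_autocorrelation(x, period)

# A sine wave with period 12 scores near 1; white noise scores near 0.
t = np.arange(240)
seasonal = np.sin(2 * np.pi * t / 12)
noise = np.random.default_rng(0).standard_normal(240)
```

A property system like the one the paper describes would combine many such indicators (trend, stationarity, noise level) into a profile per series.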


ARIES: Autonomous Reasoning with LLMs on Interactive Thought Graph Environments

Gimenes, Pedro, Cao, Zeyu, Wong, Jeffrey, Zhao, Yiren

arXiv.org Artificial Intelligence

Recent research has shown that LLM performance on reasoning tasks can be enhanced by scaling test-time compute. One promising approach, particularly for decomposable problems, involves arranging intermediate solutions as a graph on which transformations are performed to explore the solution space. However, prior works rely on pre-determined, task-specific transformation schedules which are subject to a set of searched hyperparameters. In this work, we view thought graph transformations as actions in a Markov decision process, and implement policy agents to drive effective action policies for the underlying reasoning LLM agent. In particular, we investigate the ability of another LLM to act as a policy agent on thought graph environments and introduce ARIES, a multi-agent architecture for reasoning with LLMs. In ARIES, reasoning LLM agents solve decomposed subproblems, while policy LLM agents maintain visibility of the thought graph states and dynamically adapt the problem-solving strategy. Through extensive experiments, we observe that using off-the-shelf LLMs as policy agents with no supervised fine-tuning (SFT) can yield up to 29% higher accuracy on HumanEval relative to static transformation schedules, while reducing inference costs by 35% and avoiding any search requirements. We also conduct a thorough analysis of observed failure modes, highlighting that limitations on LLM sizes and the depth of problem decomposition can be seen as challenges to scaling LLM-guided reasoning.

Prior works have shown that Large Language Models (LLMs) are subject to the emergence of abilities as their parameter count grows (Wei et al., 2022), which spurred significant interest in training increasingly larger models. However, recent work showed that under a fixed compute budget for training and inference, LLM performance on reasoning tasks can be enhanced by allocating a higher proportion of compute to inference rather than training (Snell et al., 2024).
This shift towards inference-time compute scaling can be intuitively understood through the Dual Process Theory, which postulates the existence of two distinct modes of reasoning in humans - (1) a fast, intuitive mode and (2) a slow, deliberate mode (Evans & Frankish, 2009).
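To make the MDP framing concrete, here is a minimal sketch of a thought-graph environment whose transitions are transformation actions. The action set and data structure are hypothetical stand-ins for illustration; the paper's actual environment and action space may differ.

```python
from dataclasses import dataclass, field

@dataclass
class ThoughtGraph:
    """State of the MDP: nodes hold partial solutions, edges record
    which thoughts were derived from which."""
    nodes: dict = field(default_factory=dict)   # node_id -> solution text
    edges: list = field(default_factory=list)   # (parent_id, child_id)

    def add(self, node_id, text, parent=None):
        self.nodes[node_id] = text
        if parent is not None:
            self.edges.append((parent, node_id))

# Hypothetical action set a policy agent could choose from.
ACTIONS = ("decompose", "refine", "aggregate")

def step(graph, action, node_id, new_text):
    """One MDP transition: apply a graph transformation chosen by the
    policy agent; new_text stands in for a reasoning LLM's output."""
    if action == "decompose":
        graph.add(f"{node_id}.sub", new_text, parent=node_id)
    elif action == "refine":
        graph.nodes[node_id] = new_text
    elif action == "aggregate":
        graph.add("agg", new_text)
        for nid in list(graph.nodes):
            if nid != "agg":
                graph.edges.append((nid, "agg"))
    return graph
```

In this framing, a static transformation schedule is a fixed action sequence, while ARIES lets a policy LLM pick the next action after observing the current graph state.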


ARIES: Stimulating Self-Refinement of Large Language Models by Iterative Preference Optimization

Zeng, Yongcheng, Cui, Xinyu, Jin, Xuanfa, Liu, Guoqing, Sun, Zexu, He, Quan, Li, Dong, Yang, Ning, Hao, Jianye, Zhang, Haifeng, Wang, Jun

arXiv.org Artificial Intelligence

A truly intelligent Large Language Model (LLM) should be capable of correcting errors in its responses through external interactions. However, even the most advanced models often face challenges in improving their outputs. In this paper, we explore how to cultivate LLMs with the self-refinement capability through iterative preference training, and how this ability can be leveraged to improve model performance during inference. To this end, we introduce a novel post-training and inference framework, called ARIES: Adaptive Refinement and Iterative Enhancement Structure. This method iteratively performs preference training and self-refinement-based data collection. During training, ARIES strengthens the model's direct question-answering capability while simultaneously unlocking its self-refinement potential. During inference, ARIES harnesses this self-refinement capability to generate a series of progressively refined responses, which are then filtered using either Reward Model Scoring or a simple yet effective Rule-Based Selection mechanism, specifically tailored to our approach, to construct a dataset for the next round of preference training. Experimental results demonstrate the remarkable performance of ARIES. When applied to the Llama-3.1-8B model under the self-refinement setting, ARIES surpasses powerful models such as GPT-4o, achieving a 62.3% length-controlled (LC) win rate and a 63.3% raw win rate on AlpacaEval 2, outperforming Iterative DPO by 27.8% and 35.5% respectively, as well as a 50.3% win rate on Arena-Hard, surpassing Iterative DPO by 26.6%. Furthermore, ARIES consistently enhances performance on mathematical reasoning tasks like GSM8K and MATH.
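The inference-time loop described above (generate, refine repeatedly, then select) can be sketched as follows. The function names `generate`, `refine`, and `score` are caller-supplied stand-ins for the LLM and the reward model or rule-based selector; this is an illustrative skeleton, not the paper's implementation.

```python
def self_refine_loop(prompt, generate, refine, score, rounds=3):
    """ARIES-style inference sketch: produce an initial response, ask the
    model to refine it several times, and keep the best response under a
    scoring rule (e.g. a reward model or a rule-based selector)."""
    responses = [generate(prompt)]
    for _ in range(rounds):
        # Each round conditions the refiner on the previous response.
        responses.append(refine(prompt, responses[-1]))
    # Filter the candidate pool down to the single best response.
    return max(responses, key=score)
```

In the full framework, the selected responses would also be paired against weaker ones to build the preference dataset for the next training round.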


Aries: Efficient Testing of Deep Neural Networks via Labeling-Free Accuracy Estimation

Hu, Qiang, Guo, Yuejun, Xie, Xiaofei, Cordy, Maxime, Ma, Lei, Papadakis, Mike, Traon, Yves Le

arXiv.org Artificial Intelligence

Deep learning (DL) plays an increasingly important role in our daily life due to its competitive performance in industrial application domains. As the core of DL-enabled systems, deep neural networks (DNNs) need to be carefully evaluated to ensure the produced models match the expected requirements. In practice, the de facto standard for assessing the quality of DNNs in industry is to check their performance (accuracy) on a collected set of labeled test data. However, preparing such labeled data is often not easy, partly because of the huge labeling effort: data labeling is labor-intensive, especially with the massive amount of new unlabeled data arriving every day. Recent studies show that test selection for DNNs is a promising direction that tackles this issue by selecting minimal representative data to label and using these data to assess the model. However, it still requires human effort and cannot be fully automated. In this paper, we propose a novel technique, named Aries, that can estimate the performance of DNNs on new unlabeled data using only the information obtained from the original test data. The key insight behind our technique is that a model should have similar prediction accuracy on data that have similar distances to the decision boundary. We performed a large-scale evaluation of our technique on two well-known datasets, CIFAR-10 and Tiny-ImageNet, four widely studied DNN models including ResNet101 and DenseNet121, and 13 types of data transformation methods. Results show that the accuracy estimated by Aries is only 0.03% to 2.60% off the true accuracy. Besides, Aries also outperforms the state-of-the-art labeling-free methods in 50 out of 52 cases and selection-labeling-based methods in 96 out of 128 cases.
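The key insight (similar distance to the decision boundary implies similar accuracy) can be sketched as follows, using the softmax margin between the top two classes as a simple distance proxy. The binning scheme and the margin proxy are our own illustrative choices, not necessarily those used by Aries.

```python
import numpy as np

def estimate_accuracy(test_probs, test_labels, new_probs, bins=10):
    """Labeling-free accuracy estimation sketch: bucket the labeled test
    set by decision-boundary distance (proxied by the top-2 softmax
    margin), then reweight each bucket's accuracy by how often that
    margin occurs in the new, unlabeled data."""
    def margin(p):
        top2 = np.sort(p, axis=1)[:, -2:]
        return top2[:, 1] - top2[:, 0]

    edges = np.linspace(0.0, 1.0, bins + 1)
    test_bucket = np.clip(np.digitize(margin(test_probs), edges) - 1, 0, bins - 1)
    new_bucket = np.clip(np.digitize(margin(new_probs), edges) - 1, 0, bins - 1)
    correct = test_probs.argmax(axis=1) == test_labels

    estimate = 0.0
    for b in range(bins):
        weight = np.mean(new_bucket == b)   # bucket frequency in unlabeled data
        mask = test_bucket == b
        if weight > 0 and mask.any():
            # Assume unlabeled samples in this bucket share the labeled
            # test accuracy observed at the same boundary distance.
            estimate += weight * correct[mask].mean()
    return float(estimate)
```

If the new data's margin distribution matches the test set's, the estimate reduces to the test accuracy; distribution shift shows up as reweighted bucket frequencies.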


Artificial intelligence in surgery: the emergency surgeon's perspective (the ARIES project) - Discover Health Systems

#artificialintelligence

Artificial Intelligence (AI) has been developed and implemented worldwide in many fields. In healthcare, AI has the valuable potential to reduce health, social, and economic inequities, help actualize universal health coverage, and improve health outcomes on a global scale. The COVID-19 pandemic was characterized in its early period by a tremendous number of patients needing hospital admission, leading to collapsing healthcare systems globally. Healthcare system strain was exacerbated by limited hospital resources and the low availability of COVID tests and personal protective equipment (PPE). These limitations increased the interest of governments, private companies, and public healthcare systems in developing AI systems to improve the management of patients.


Postdoc in Machine Learning and Environmental Modeling

#artificialintelligence

During the past decade, the RL has envisioned and built the ARIES (ARtificial Intelligence for Environment and Sustainability) platform, a technology that integrates network-available data and model components through semantics and machine reasoning. Its underlying open-source software (k.LAB) handles the full end-to-end process of integrating data with multiple model integration types to predict complex change. It also supports selection of the most appropriate data and models using cloud technology and following an open data paradigm: the resulting insight remains open and available to society at large, and becomes a base for further computations, contributing to an ever-increasing knowledge base. For the first time, it is possible to consistently characterize and publish data and models for their integration in predictive models, building and field-testing technologies that have eluded researchers to date. We are looking for an individual who can support strategic activities related to integrated data science and collaborative, integrated modelling on the semantic web (semantic meta-modelling).


WWE Extreme Rules 2017: Predictions, Match Card For 'Monday Night Raw' PPV

International Business Times

WWE's next pay-per-view is set for Sunday night in Baltimore with Extreme Rules 2017. It will exclusively feature wrestlers from the "Monday Night Raw" roster. Six matches are on the Extreme Rules card, and the main event will determine the No. 1 contender for Brock Lesnar's WWE Universal Championship. Below are predictions for every match at the event. Reigns is really the only superstar that has no chance in this match. The winner is expected to face Lesnar at the Great Balls of Fire PPV on July 9, and Reigns vs. Lesnar would only happen at SummerSlam or WrestleMania.