
Collaborating Authors

 Yang, Yuanyuan


A Combination Model Based on Sequential General Variational Mode Decomposition Method for Time Series Prediction

arXiv.org Artificial Intelligence

Examples include combining ARIMA with decomposition algorithms such as Empirical Mode Decomposition (EMD) and Variational Mode Decomposition (VMD) to predict complex time series, or using an improved ARMA model for stock market forecasting. However, such models must be built on stationary sequence data and usually require testing and preprocessing of the original data, which may discard hidden information; this drawback is easily magnified on large data samples. With the development of computer technology, intelligent models represented by artificial neural networks (ANNs) have gradually emerged. These models handle incomplete, fuzzy, uncertain, or irregular data well and fit nonlinear relationships effectively. Shallow neural networks, represented by backpropagation neural networks (BPNN), and shallow machine learning methods, represented by support vector machines (SVM), are also widely used in financial market prediction. However, shallow neural networks do not account for the temporal nature of data, while financial time series often exhibit long-term dependencies. Recurrent neural networks (RNNs), which have a memory function, have therefore become the preferred choice: the output of an RNN at one time step is fed back as input to the neurons at the next step, and this recurrent structure is well suited to time series data because it preserves the dependencies within the sequence.
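
A minimal, illustrative Python sketch (not the paper's combination model) of the recurrence described above: the hidden state produced at one time step is fed back as input at the next step, which is what lets an RNN preserve dependencies in a time series. All sizes and the random weights below are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, inputs = 8, 1
W_x = rng.normal(scale=0.3, size=(hidden, inputs))   # input-to-hidden weights
W_h = rng.normal(scale=0.3, size=(hidden, hidden))   # hidden-to-hidden feedback weights
W_o = rng.normal(scale=0.3, size=(1, hidden))        # hidden-to-output weights

series = np.sin(np.linspace(0, 6, 30))               # a toy time series
h = np.zeros(hidden)
for x_t in series:
    # the hidden state h carries past information forward to the next step
    h = np.tanh(W_x @ np.array([x_t]) + W_h @ h)
y_next = float(W_o @ h)                              # one-step-ahead prediction
print("next-value prediction:", y_next)
```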


Learning Galaxy Intrinsic Alignment Correlations

arXiv.org Artificial Intelligence

The intrinsic alignments (IA) of galaxies, regarded as a contaminant in weak lensing analyses, represent the correlation of galaxy shapes due to gravitational tidal interactions and galaxy formation processes. As such, understanding IA is paramount for accurate cosmological inferences from weak lensing surveys; however, one limitation to our understanding and mitigation of IA is the expense of simulation-based modeling. In this work, we present a deep learning approach to emulate galaxy position-position ($\xi$), position-orientation ($\omega$), and orientation-orientation ($\eta$) correlation function measurements and uncertainties from halo occupation distribution-based mock galaxy catalogs. We find strong Pearson correlation values with the model across all three correlation functions and further predict aleatoric uncertainties through a mean-variance estimation training procedure. $\xi(r)$ predictions are generally accurate to $\leq10\%$. Our model also successfully captures the underlying signal of the noisier correlations $\omega(r)$ and $\eta(r)$, although with a lower average accuracy. We find that the model performance is inhibited by the stochasticity of the data, and will benefit from correlations averaged over multiple data realizations. Our code will be made open source upon journal publication.
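
A minimal, hypothetical sketch of the mean-variance estimation idea mentioned above: the network predicts a mean and a (log) variance for each point and is trained with a Gaussian negative log-likelihood, so the variance head learns the aleatoric uncertainty. The toy arrays below stand in for correlation-function targets and are assumptions for illustration only.

```python
import numpy as np

def gaussian_nll(y, mean, log_var):
    """Per-point negative log-likelihood of y under N(mean, exp(log_var))."""
    return 0.5 * (log_var + (y - mean) ** 2 / np.exp(log_var))

y_true = np.array([1.0, 0.5, -0.2])         # e.g. measured correlation values
pred_mean = np.array([0.9, 0.6, -0.1])      # network's mean head (illustrative)
pred_log_var = np.array([-2.0, -1.0, 0.0])  # network's log-variance head (illustrative)

loss = gaussian_nll(y_true, pred_mean, pred_log_var).mean()
print("mean-variance estimation loss:", loss)
```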


Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training

arXiv.org Artificial Intelligence

In this paper, we consider a heavy inner product identification problem, which generalizes the Light Bulb problem~(\cite{prr89}): given two sets $A \subset \{-1,+1\}^d$ and $B \subset \{-1,+1\}^d$ with $|A|=|B| = n$, if there are exactly $k$ pairs whose inner product exceeds a certain threshold, i.e., $\{(a_1, b_1), \cdots, (a_k, b_k)\} \subset A \times B$ such that $\forall i \in [k], \langle a_i,b_i \rangle \geq \rho \cdot d$ for a threshold $\rho \in (0,1)$, the goal is to identify those $k$ heavy inner products. We provide an algorithm that runs in $O(n^{2 \omega / 3+ o(1)})$ time and finds the $k$ inner product pairs that surpass the $\rho \cdot d$ threshold with high probability, where $\omega$ is the current matrix multiplication exponent. By solving this problem, our method speeds up the training of neural networks with the ReLU activation function.
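
To make the problem statement concrete, the brute-force baseline below finds all pairs in $A \times B$ with inner product at least $\rho \cdot d$ in quadratic time, in contrast to the paper's $O(n^{2\omega/3+o(1)})$-time algorithm. The sizes and the planted heavy pair are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, rho = 64, 128, 0.6
A = rng.choice([-1, 1], size=(n, d))
B = rng.choice([-1, 1], size=(n, d))
B[7] = A[3]                              # plant one heavy pair: <A[3], B[7]> = d

inner = A @ B.T                          # all n^2 inner products at once
heavy = np.argwhere(inner >= rho * d)    # index pairs (i, j) of heavy inner products
print("heavy inner product pairs:", heavy.tolist())
```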


A Survey of Robustness and Safety of 2D and 3D Deep Learning Models Against Adversarial Attacks

arXiv.org Artificial Intelligence

Benefiting from the rapid development of deep learning, 2D and 3D computer vision applications are deployed in many safety-critical systems, such as autopilot and identity authentication. However, deep learning models are not trustworthy enough because of their limited robustness against adversarial attacks. Physically realizable adversarial attacks further pose fatal threats to applications and human safety. Many papers have investigated the robustness and safety of deep learning models against adversarial attacks. To advance trustworthy AI, we first construct a general threat model from different perspectives and then comprehensively review the latest progress of both 2D and 3D adversarial attacks. We extend the concept of adversarial examples beyond imperceptible perturbations and collate over 170 papers to give an overview of deep learning model robustness against various adversarial attacks. To the best of our knowledge, we are the first to systematically investigate adversarial attacks for 3D models, a flourishing field applied to many real-world applications. In addition, we examine physical adversarial attacks that lead to safety violations. Last but not least, we summarize present popular topics, give insights on challenges, and shed light on future research on trustworthy AI.
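
As a concrete illustration of an adversarial example (not taken from the survey), the sketch below applies a fast-gradient-sign-style perturbation to a toy logistic-regression model: a small, bounded change to the input can shift the model's predicted probability substantially. The model, weights, input, and perturbation budget are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=16), 0.1          # toy "trained" weights of a linear classifier
x = rng.normal(size=16)                  # a clean input
y = 1.0                                  # its assumed true label

def predict(x):
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))   # P(label = 1 | x)

grad_x = (predict(x) - y) * w            # gradient of cross-entropy loss w.r.t. the input
eps = 0.25                               # L_inf perturbation budget
x_adv = x + eps * np.sign(grad_x)        # fast gradient sign method step

print("clean prob:", predict(x), "adversarial prob:", predict(x_adv))
```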


Summary of ChatGPT-Related Research and Perspective Towards the Future of Large Language Models

arXiv.org Artificial Intelligence

This paper presents a comprehensive survey of ChatGPT-related (GPT-3.5 and GPT-4) research, the state-of-the-art large language models (LLMs) from the GPT series, and their prospective applications across diverse domains. Indeed, key innovations such as large-scale pre-training that captures knowledge across the entire world wide web, instruction fine-tuning, and Reinforcement Learning from Human Feedback (RLHF) have played significant roles in enhancing LLMs' adaptability and performance. We performed an in-depth analysis of 194 relevant papers on arXiv, encompassing trend analysis, word cloud representation, and distribution analysis across various application domains. The findings reveal a significant and increasing interest in ChatGPT-related research, predominantly centered on direct natural language processing applications, while also demonstrating considerable potential in areas ranging from education and history to mathematics, medicine, and physics. This study endeavors to furnish insights into ChatGPT's capabilities, potential implications, and ethical concerns, and to offer direction for future advancements in this field.


Joint Microseismic Event Detection and Location with a Detection Transformer

arXiv.org Artificial Intelligence

During the processes of reservoir stimulation, fluids are injected into a specific area underground. The high-pressure condition created by the fluid injection causes rocks to crack to release the built-up stress, resulting in small earthquakes called microseismic events. Detecting these events in seismic recordings and locating them back to their subsurface locations are important for understanding the subsurface conditions such as fracture networks and fluid flow pathways. This knowledge is critical for applications like carbon storage, geothermal energy extraction, and oil/gas production. Traditional approaches for microseismic event detection and location often suffer from manual intervention and/or heavy computation, while current machine learning-assisted approaches typically address detection and location separately. These limitations prevent the potential for real-time microseismic monitoring, which is crucial for scientists and engineers to make instant, informed decisions, like optimization of injection strategies. Here, we propose a machine learning-based procedure for simultaneously detecting and locating microseismic events within a single framework, using a conventional Convolutional Neural Network and an encoder-decoder Transformer. Tests on synthetically generated and field-collected passive seismic data illustrate the accuracy, efficiency, and potential of the proposed method, which could pave the way for real-time monitoring of microseismic events in the future.
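
A minimal, hypothetical PyTorch sketch of the joint detection-and-location idea described above: a small 1-D CNN backbone encodes passive seismic traces, a Transformer processes the resulting sequence (an encoder-only variant is used here for brevity, whereas the paper describes an encoder-decoder Transformer), and two heads predict event presence and a 3-D source location. Layer sizes and the input shape are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class JointDetectLocate(nn.Module):
    def __init__(self, n_receivers=32, n_samples=512, d_model=64):
        super().__init__()
        # CNN backbone over receiver channels; downsamples the time axis
        self.backbone = nn.Sequential(
            nn.Conv1d(n_receivers, d_model, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=7, stride=4, padding=3),
            nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.detect_head = nn.Linear(d_model, 1)   # event / no event logit
        self.locate_head = nn.Linear(d_model, 3)   # (x, y, z) source coordinates

    def forward(self, x):                          # x: (batch, receivers, samples)
        feats = self.backbone(x).transpose(1, 2)   # (batch, tokens, d_model)
        feats = self.encoder(feats).mean(dim=1)    # pooled sequence summary
        return self.detect_head(feats), self.locate_head(feats)

model = JointDetectLocate()
traces = torch.randn(2, 32, 512)                   # two toy recordings
logit, xyz = model(traces)
print(logit.shape, xyz.shape)                      # (2, 1) and (2, 3)
```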


Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification

arXiv.org Artificial Intelligence

Deep learning has been widely used in many fields, but the model training process usually consumes massive computational resources and time. Therefore, designing an efficient neural network training method with a provable convergence guarantee is a fundamental and important research question. In this paper, we present a static half-space reporting data structure for a fully connected two-layer neural network with shifted ReLU activation, enabling activated neuron identification in sublinear time via geometric search. We also prove that our algorithm converges in $O(M^2/\epsilon^2)$ time with network size quadratic in the coefficient norm upper bound $M$ and the error term $\epsilon$.
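
The sketch below spells out what "activated neuron identification" means for a two-layer network with shifted ReLU $\sigma(t) = \max(t - b, 0)$: for a query input $x$, report the hidden neurons with $\langle w_i, x \rangle > b$. The linear scan shown is the naive baseline; the paper's half-space reporting structure answers the same query in sublinear time. Sizes and scalings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, b = 1024, 32, 0.4                  # hidden width, input dim, ReLU shift
W = rng.normal(size=(m, d)) / np.sqrt(d) # first-layer weights, one row per neuron
x = rng.normal(size=d) / np.sqrt(d)      # a unit-scale query input

pre_act = W @ x                          # all pre-activations (naive O(md) scan)
active = np.flatnonzero(pre_act > b)     # the activated neurons to be identified
print(f"{active.size} of {m} neurons active for this input")
```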


The SLAM Hive Benchmarking Suite

arXiv.org Artificial Intelligence

Benchmarking Simultaneous Localization and Mapping (SLAM) algorithms is important to scientists and users of robotic systems alike. But through their many configuration options in hardware and software, SLAM systems feature a vast parameter space that scientists have so far been unable to explore. The proposed SLAM Hive Benchmarking Suite is able to analyze SLAM algorithms in thousands of mapping runs, through its use of container technology and deployment in a cluster. This paper presents the architecture and open source implementation of SLAM Hive and compares it to existing efforts on SLAM evaluation. Furthermore, we highlight the function of SLAM Hive by exploring some open source algorithms on public datasets in terms of accuracy. We compare the algorithms against each other and evaluate how parameters affect not only accuracy but also CPU and memory usage. Through this we show that SLAM Hive can become an essential tool for proper comparisons and evaluations of SLAM algorithms and thus drive scientific development in SLAM research.
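
A hypothetical, minimal sketch of the kind of parameter-space sweep such a suite automates: enumerating algorithm and parameter combinations into per-run configurations, each of which could be dispatched as one containerized mapping run. The parameter names and values are illustrative assumptions, not SLAM Hive's actual configuration schema.

```python
import itertools

param_space = {
    "algorithm": ["orb_slam3", "cartographer"],
    "dataset": ["kitti_00", "euroc_mh_01"],
    "feature_count": [800, 1600],
    "loop_closure": [True, False],
}

# Cartesian product of all options -> one configuration dict per mapping run
runs = [dict(zip(param_space, values))
        for values in itertools.product(*param_space.values())]

print(len(runs), "mapping runs, e.g.:", runs[0])
```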


Just Least Squares: Binary Compressive Sampling with Low Generative Intrinsic Dimension

arXiv.org Machine Learning

In this paper, we consider recovering $n$-dimensional signals from $m$ binary measurements corrupted by noise and sign flips, under the assumption that the target signals have low generative intrinsic dimension, i.e., the target signals can be approximately generated via an $L$-Lipschitz generator $G: \mathbb{R}^k\rightarrow\mathbb{R}^{n}, k\ll n$. Although the binary measurement model is highly nonlinear, we propose a least squares decoder and prove that, up to a constant $c$, with high probability, it achieves a sharp estimation error $\mathcal{O} (\sqrt{\frac{k\log (Ln)}{m}})$ as long as $m\geq \mathcal{O}( k\log (Ln))$. Extensive numerical simulations and comparisons with state-of-the-art methods demonstrate that the least squares decoder is robust to noise and sign flips, as predicted by our theory. By constructing a ReLU network with properly chosen depth and width, we verify the (approximate) deep generative prior, which is of independent interest.
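
A minimal, hypothetical PyTorch sketch of the least squares decoder described above: given 1-bit measurements $y = \mathrm{sign}(Ax)$ of a signal $x$ near the range of a generator $G$, recover it by minimizing $\|y - AG(z)\|^2$ over the latent code $z$. The random ReLU generator, problem sizes, and optimizer settings are assumptions for illustration; 1-bit measurements determine the signal only up to scale, so the error is reported on normalized directions.

```python
import torch

torch.manual_seed(0)
n, m, k = 100, 400, 5                           # signal dim, measurements, latent dim
G = torch.nn.Sequential(torch.nn.Linear(k, 64), torch.nn.ReLU(),
                        torch.nn.Linear(64, n)) # fixed random ReLU generator
for p in G.parameters():
    p.requires_grad_(False)

z_true = torch.randn(k)
x_true = G(z_true)                              # target signal in the generator's range
A = torch.randn(m, n) / m ** 0.5
y = torch.sign(A @ x_true)                      # binary (1-bit) measurements

z = torch.zeros(k, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    loss = ((y - A @ G(z)) ** 2).mean()         # "just least squares" objective
    loss.backward()
    opt.step()

x_hat = G(z).detach()
err = torch.norm(x_hat / x_hat.norm() - x_true / x_true.norm())
print("relative direction error:", float(err))
```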


Towards Efficient Scheduling of Federated Mobile Devices under Computational and Statistical Heterogeneity

arXiv.org Machine Learning

Originating from distributed learning, federated learning enables privacy-preserving collaboration on a new abstraction level by sharing only the model parameters. While current research mainly focuses on optimizing learning algorithms and minimizing the communication overhead left by distributed learning, there is still a considerable gap when it comes to real implementations on mobile devices. In this paper, we start with an empirical experiment demonstrating that computational heterogeneity is a more pronounced bottleneck than communication on the current generation of battery-powered mobile devices, and that existing methods are hampered by mobile stragglers. Further, non-identically distributed data across mobile users makes the selection of participants critical to accuracy and convergence. To tackle the computational and statistical heterogeneity, we use data as a tuning knob and propose two efficient polynomial-time algorithms to schedule different workloads on various mobile devices, for identically and non-identically distributed data. For identically distributed data, we combine partitioning and linear bottleneck assignment to achieve near-optimal training time without accuracy loss. For non-identically distributed data, we convert it into an average cost minimization problem and propose a greedy algorithm to find a reasonable balance between computation time and accuracy. We also build an offline profiler to quantify the runtime behavior of different devices, which serves as input to the scheduling algorithms. We conduct extensive experiments on a mobile testbed with two datasets and up to 20 devices. Compared with common benchmarks, the proposed algorithms achieve a 2-100x epoch-wise speedup, a 2-7% accuracy gain, and a convergence-rate improvement of more than 100% on CIFAR10.
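
A minimal, hypothetical sketch of the workload-balancing idea for the identically distributed case: assign each device a data share inversely proportional to its profiled per-sample time, so all devices finish an epoch at roughly the same moment and no straggler dominates. The profiled speeds and totals below are assumptions, not the paper's measurements or its exact bottleneck-assignment algorithm.

```python
# Per-sample training time (ms) for each device, as an offline profiler might report it
per_sample_ms = {"phone_a": 1.2, "phone_b": 3.5, "tablet_c": 0.8}
total_samples = 10_000

# Share of data proportional to device speed (inverse of per-sample time)
inv_speed = {dev: 1.0 / t for dev, t in per_sample_ms.items()}
norm = sum(inv_speed.values())
shares = {dev: round(total_samples * s / norm) for dev, s in inv_speed.items()}

epoch_time = {dev: shares[dev] * per_sample_ms[dev] for dev in shares}
print("per-device workload:", shares)
print("per-device epoch time (ms):", epoch_time)   # roughly balanced across devices
```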