Chen, Yutian
AirIO: Learning Inertial Odometry with Enhanced IMU Feature Observability
Qiu, Yuheng, Xu, Can, Chen, Yutian, Zhao, Shibo, Geng, Junyi, Scherer, Sebastian
Inertial odometry (IO) using only Inertial Measurement Units (IMUs) offers a lightweight and cost-effective solution for Unmanned Aerial Vehicle (UAV) applications, yet existing learning-based IO models often fail to generalize to UAVs due to the highly dynamic and non-linear flight patterns that differ from pedestrian motion. In this work, we identify that the conventional practice of transforming raw IMU data to global coordinates undermines the observability of critical kinematic information in UAVs. By preserving the body-frame representation, our method achieves substantial performance improvements, with a 66.7% average increase in accuracy across three datasets. Furthermore, explicitly encoding attitude information into the motion network results in an additional 23.8% improvement over prior results. Combined with a data-driven IMU correction model (AirIMU) and an uncertainty-aware Extended Kalman Filter (EKF), our approach ensures robust state estimation under aggressive UAV maneuvers without relying on external sensors or control inputs. Notably, our method also demonstrates strong generalizability to unseen data not included in the training set, underscoring its potential for real-world UAV applications.
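To make the frame choice concrete, here is a minimal NumPy sketch (hypothetical function names, not the AirIO code) contrasting the conventional world-frame transform of IMU samples with the body-frame representation plus explicit attitude features described above.

```python
import numpy as np

def world_frame_features(acc_body, gyro_body, R_wb):
    """Conventional preprocessing: rotate body-frame IMU samples into the
    world frame before feeding them to the motion network."""
    acc_world = R_wb @ acc_body      # (3,) accelerometer in world coordinates
    gyro_world = R_wb @ gyro_body    # (3,) gyroscope in world coordinates
    return np.concatenate([acc_world, gyro_world])

def body_frame_features(acc_body, gyro_body, R_wb):
    """Body-frame alternative: keep raw IMU samples in the body frame and
    append the attitude (here a flattened rotation matrix) as an explicit input."""
    return np.concatenate([acc_body, gyro_body, R_wb.reshape(-1)])

# Toy example: a UAV pitched 30 degrees about the body y-axis.
theta = np.deg2rad(30.0)
R_wb = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                 [0.0, 1.0, 0.0],
                 [-np.sin(theta), 0.0, np.cos(theta)]])
acc = np.array([0.1, 0.0, 9.81])   # specific force measured in the body frame
gyro = np.array([0.0, 0.2, 0.0])   # angular rate measured in the body frame

print(world_frame_features(acc, gyro, R_wb).shape)  # (6,)
print(body_frame_features(acc, gyro, R_wb).shape)   # (15,)
```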
TRecViT: A Recurrent Video Transformer
Pătrăucean, Viorica, He, Xu Owen, Heyward, Joseph, Zhang, Chuhan, Sajjadi, Mehdi S. M., Muraru, George-Cristian, Zholus, Artem, Karami, Mahdi, Goroshin, Ross, Chen, Yutian, Osindero, Simon, Carreira, João, Pascanu, Razvan
We propose a novel block for video modelling. It relies on a time-space-channel factorisation with dedicated blocks for each dimension: gated linear recurrent units (LRUs) perform information mixing over time, self-attention layers perform mixing over space, and MLPs over channels. The resulting architecture TRecViT performs well on sparse and dense tasks, trained in supervised or self-supervised regimes. Notably, our model is causal and outperforms or is on par with a pure attention model ViViT-L on large scale video datasets (SSv2, Kinetics400), while having $3\times$ fewer parameters, a $12\times$ smaller memory footprint, and a $5\times$ lower FLOP count. Code and checkpoints will be made available online at https://github.com/google-deepmind/trecvit.
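As a rough illustration of the factorisation, the PyTorch sketch below (not the released TRecViT code) mixes over time, space, and channels in separate sub-blocks, using nn.GRU as a stand-in for the paper's gated LRU.

```python
import torch
import torch.nn as nn

class FactorisedVideoBlock(nn.Module):
    """Sketch of a time-space-channel factorised block: a recurrent unit mixes
    over time, self-attention mixes over space, an MLP mixes over channels.
    nn.GRU stands in for the paper's gated linear recurrent unit (LRU)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.temporal = nn.GRU(dim, dim, batch_first=True)                  # time mixing (causal)
        self.spatial = nn.MultiheadAttention(dim, heads, batch_first=True)  # space mixing
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))                   # channel mixing
        self.n1, self.n2, self.n3 = nn.LayerNorm(dim), nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x):                     # x: (B, T, N, C) video tokens
        B, T, N, C = x.shape
        t = x.permute(0, 2, 1, 3).reshape(B * N, T, C)     # one temporal sequence per token
        t, _ = self.temporal(self.n1(t))
        x = x + t.reshape(B, N, T, C).permute(0, 2, 1, 3)
        s = self.n2(x.reshape(B * T, N, C))                # one frame per attention call
        s, _ = self.spatial(s, s, s, need_weights=False)
        x = x + s.reshape(B, T, N, C)
        return x + self.mlp(self.n3(x))

tokens = torch.randn(2, 8, 16, 64)             # (batch, frames, tokens per frame, channels)
print(FactorisedVideoBlock(64)(tokens).shape)  # torch.Size([2, 8, 16, 64])
```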
Predicting from Strings: Language Model Embeddings for Bayesian Optimization
Nguyen, Tung, Zhang, Qiuyi, Yang, Bangding, Lee, Chansoo, Bornschein, Jorg, Miao, Yingjie, Perel, Sagi, Chen, Yutian, Song, Xingyou
Bayesian Optimization is ubiquitous in the field of experimental design and blackbox optimization for improving search efficiency, but has been traditionally restricted to regression models which are only applicable to fixed search spaces and tabular input features. We propose Embed-then-Regress, a paradigm for applying in-context regression over string inputs, through the use of string embedding capabilities of pretrained language models. By expressing all inputs as strings, we are able to perform general-purpose regression for Bayesian Optimization over various domains including synthetic, combinatorial, and hyperparameter optimization, obtaining comparable results to state-of-the-art Gaussian Process-based algorithms. Code can be found at https://github.com/google-research/optformer/tree/main/optformer/embed_then_regress.
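A minimal embed-then-regress loop is sketched below; it is not the paper's in-context Transformer regressor, and it assumes the sentence-transformers and scikit-learn packages, with a generic sentence encoder and a Gaussian Process standing in for the learned surrogate.

```python
# Minimal embed-then-regress sketch (not the paper's in-context regressor):
# embed string-encoded trial parameters with a pretrained sentence encoder,
# then fit a Gaussian Process surrogate over the embeddings.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.gaussian_process import GaussianProcessRegressor

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # any pretrained text encoder works

# String-encoded evaluations, e.g. hyperparameter configurations and their scores.
trials = ["lr=0.1, layers=2", "lr=0.01, layers=4", "lr=0.001, layers=8"]
scores = np.array([0.62, 0.74, 0.71])

X = encoder.encode(trials)                          # (n_trials, embed_dim)
surrogate = GaussianProcessRegressor().fit(X, scores)

# Score unseen candidates by predicted mean and uncertainty (e.g. for UCB acquisition).
candidates = ["lr=0.005, layers=6", "lr=0.2, layers=1"]
mean, std = surrogate.predict(encoder.encode(candidates), return_std=True)
print(mean + 1.0 * std)                             # simple UCB-style acquisition values
```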
Position: Leverage Foundational Models for Black-Box Optimization
Song, Xingyou, Tian, Yingtao, Lange, Robert Tjarko, Lee, Chansoo, Tang, Yujin, Chen, Yutian
Undeniably, Large Language Models (LLMs) have stirred an extraordinary wave of innovation in the machine learning research domain, resulting in substantial impact across diverse fields such as reinforcement learning, robotics, and computer vision. Their incorporation has been rapid and transformative, marking a significant paradigm shift in the field of machine learning research. However, the field of experimental design, grounded in black-box optimization, has been much less affected by such a paradigm shift, even though integrating LLMs with optimization presents a unique landscape ripe for exploration. In this position paper, we frame the field of black-box optimization around sequence-based foundation models and organize their relationship with previous literature. We discuss the most promising ways foundational language models can revolutionize optimization, which include harnessing the vast wealth of information encapsulated in free-form text to enrich task comprehension, utilizing highly flexible sequence models such as Transformers to engineer superior optimization strategies, and enhancing performance prediction over previously unseen search spaces.
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Botev, Aleksandar, De, Soham, Smith, Samuel L, Fernando, Anushan, Muraru, George-Cristian, Haroun, Ruba, Berrada, Leonard, Pascanu, Razvan, Sessa, Pier Giuseppe, Dadashi, Robert, Hussenot, Léonard, Ferret, Johan, Girgin, Sertan, Bachem, Olivier, Andreev, Alek, Kenealy, Kathleen, Mesnard, Thomas, Hardin, Cassidy, Bhupatiraju, Surya, Pathak, Shreya, Sifre, Laurent, Rivière, Morgane, Kale, Mihir Sanjay, Love, Juliette, Tafti, Pouya, Joulin, Armand, Fiedel, Noah, Senter, Evan, Chen, Yutian, Srinivasan, Srivatsan, Desjardins, Guillaume, Budden, David, Doucet, Arnaud, Vikram, Sharad, Paszke, Adam, Gale, Trevor, Borgeaud, Sebastian, Chen, Charlie, Brock, Andy, Paterson, Antonia, Brennan, Jenny, Risdal, Meg, Gundluru, Raj, Devanathan, Nesh, Mooney, Paul, Chauhan, Nilay, Culliton, Phil, Martins, Luiz Gustavo, Bandy, Elisa, Huntsperger, David, Cameron, Glenn, Zucker, Arthur, Warkentin, Tris, Peran, Ludovic, Giang, Minh, Ghahramani, Zoubin, Farabet, Clément, Kavukcuoglu, Koray, Hassabis, Demis, Hadsell, Raia, Teh, Yee Whye, de Freitas, Nando
We introduce RecurrentGemma, an open language model which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state, which reduces memory use and enables efficient inference on long sequences. We provide a pre-trained model with 2B non-embedding parameters, and an instruction tuned variant. Both models achieve comparable performance to Gemma-2B despite being trained on fewer tokens.
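A usage sketch, assuming a recent Hugging Face transformers release with RecurrentGemma support and the google/recurrentgemma-2b-it checkpoint id (neither is stated in the abstract itself):

```python
# Sketch: running the instruction-tuned RecurrentGemma variant via transformers.
# Assumes a transformers version with RecurrentGemma support and access to the
# google/recurrentgemma-2b-it checkpoint on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/recurrentgemma-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Explain linear recurrences in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```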
OmniPred: Language Models as Universal Regressors
Song, Xingyou, Li, Oscar, Lee, Chansoo, Yang, Bangding, Peng, Daiyi, Perel, Sagi, Chen, Yutian
Over the broad landscape of experimental design, regression has been a powerful tool to accurately predict the outcome metrics of a system or model given a set of parameters, but has been traditionally restricted to methods which are only applicable to a specific task. In this paper, we propose OmniPred, a framework for training language models as universal end-to-end regressors over $(x,y)$ evaluation data from diverse real world experiments. Using data sourced from Google Vizier, one of the largest blackbox optimization databases in the world, our extensive experiments demonstrate that through only textual representations of mathematical parameters and values, language models are capable of very precise numerical regression, and if given the opportunity to train over multiple tasks, can significantly outperform traditional regression models.
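The text-to-text framing can be illustrated with a small serialization sketch; the format below is hypothetical and not the actual Vizier/OmniPred encoding.

```python
# Sketch of the text-to-text regression framing: serialize parameters as the
# input string and the metric value as the target string, then fine-tune any
# seq2seq language model on these pairs. The serialization below is a
# hypothetical format, not the actual Vizier/OmniPred encoding.
def serialize_trial(params: dict, metric: float) -> tuple[str, str]:
    source = ", ".join(f"{k}:{v}" for k, v in sorted(params.items()))
    target = f"{metric:.4e}"          # metric rendered as plain text tokens
    return source, target

src, tgt = serialize_trial({"learning_rate": 3e-4, "batch_size": 128}, 0.8312)
print(src)   # batch_size:128, learning_rate:0.0003
print(tgt)   # 8.3120e-01

def parse_prediction(text: str) -> float:
    """Decode the model's generated string back into a number."""
    try:
        return float(text.strip())
    except ValueError:
        return float("nan")           # malformed generations map to NaN
```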
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
De, Soham, Smith, Samuel L., Fernando, Anushan, Botev, Aleksandar, Cristian-Muraru, George, Gu, Albert, Haroun, Ruba, Berrada, Leonard, Chen, Yutian, Srinivasan, Srivatsan, Desjardins, Guillaume, Doucet, Arnaud, Budden, David, Teh, Yee Whye, Pascanu, Razvan, De Freitas, Nando, Gulcehre, Caglar
Recurrent neural networks (RNNs) have fast inference and scale efficiently on long sequences, but they are difficult to train and hard to scale. We propose Hawk, an RNN with gated linear recurrences, and Griffin, a hybrid model that mixes gated linear recurrences with local attention. Hawk exceeds the reported performance of Mamba on downstream tasks, while Griffin matches the performance of Llama-2 despite being trained on over 6 times fewer tokens. We also show that Griffin can extrapolate on sequences significantly longer than those seen during training. Our models match the hardware efficiency of Transformers during training, and during inference they have lower latency and significantly higher throughput. We scale Griffin up to 14B parameters, and explain how to shard our models for efficient distributed training.
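The temporal primitive can be illustrated with a simplified diagonal gated linear recurrence in PyTorch; this is only loosely in the spirit of the recurrence used by Hawk and Griffin, not the released implementation.

```python
import torch
import torch.nn as nn

class GatedLinearRecurrence(nn.Module):
    """Simplified diagonal gated linear recurrence (not the released Hawk/Griffin
    code): each channel keeps a scalar state updated by an input-dependent gate."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(dim, dim)     # produces per-channel forget gates
        self.inp = nn.Linear(dim, dim)      # produces the candidate input

    def forward(self, x):                   # x: (batch, time, dim)
        B, T, D = x.shape
        h = x.new_zeros(B, D)
        outputs = []
        for t in range(T):                  # sequential scan; a real kernel fuses this
            a = torch.sigmoid(self.gate(x[:, t]))          # forget gate in (0, 1)
            h = a * h + (1.0 - a) * self.inp(x[:, t])      # linear (non-saturating) state update
            outputs.append(h)
        return torch.stack(outputs, dim=1)  # (batch, time, dim)

seq = torch.randn(2, 16, 32)
print(GatedLinearRecurrence(32)(seq).shape)  # torch.Size([2, 16, 32])
```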
GATS: Gather-Attend-Scatter
Zolna, Konrad, Cabi, Serkan, Chen, Yutian, Lau, Eric, Fantacci, Claudio, Pasukonis, Jurgis, Springenberg, Jost Tobias, Colmenarejo, Sergio Gomez
As the AI community increasingly adopts large-scale models, it is crucial to develop general and flexible tools to integrate them. We introduce Gather-Attend-Scatter (GATS), a novel module that enables seamless combination of pretrained foundation models, both trainable and frozen, into larger multimodal networks. GATS empowers AI systems to process and generate information across multiple modalities at different rates. In contrast to traditional fine-tuning, GATS allows for the original component models to remain frozen, avoiding the risk of them losing important knowledge acquired during the pretraining phase. We demonstrate the utility and versatility of GATS with a few experiments across games, robotics, and multimodal input-output systems.
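The abstract does not spell out the mechanism, so the sketch below is only a literal reading of the module's name (gather token streams from frozen backbones, attend across them, scatter the result back into one stream) and may differ from the actual GATS design.

```python
import torch
import torch.nn as nn

class GatherAttendScatterSketch(nn.Module):
    """Literal reading of the name, not the published GATS design: gather token
    streams from frozen backbones, attend across the combined set, and scatter
    the attended result back into the primary stream as a residual update."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attend = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, primary, others):          # primary: (B, N, D); others: list of (B, M_i, D)
        gathered = torch.cat([primary] + list(others), dim=1)    # gather all modalities
        update, _ = self.attend(primary, gathered, gathered, need_weights=False)
        return primary + update                  # scatter back into the primary stream

vision = torch.randn(2, 10, 64)   # e.g. frozen vision backbone tokens
text = torch.randn(2, 5, 64)      # e.g. frozen language backbone tokens
print(GatherAttendScatterSketch(64)(vision, [text]).shape)  # torch.Size([2, 10, 64])
```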
Token Prediction as Implicit Classification to Identify LLM-Generated Text
Chen, Yutian, Kang, Hao, Zhai, Vivian, Li, Liangze, Singh, Rita, Raj, Bhiksha
This paper introduces a novel approach for identifying the possible large language models (LLMs) involved in text generation. Instead of adding an additional classification layer to a base LM, we reframe the classification task as a next-token prediction task and directly fine-tune the base LM to perform it. We utilize the Text-to-Text Transfer Transformer (T5) model as the backbone for our experiments. We compare our approach to the more direct approach of utilizing hidden states for classification. Evaluation shows the exceptional performance of our method in the text classification task, highlighting its simplicity and efficiency. Furthermore, interpretability studies on the features extracted by our model reveal its ability to differentiate distinctive writing styles among various LLMs even in the absence of an explicit classifier. We also collected a dataset named OpenLLMText, containing approximately 340k text samples from humans and LLMs, including GPT3.5, PaLM, LLaMA, and GPT2.
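The reframing can be sketched with Hugging Face T5; the prompt format and label string below are illustrative, not the paper's exact setup or the OpenLLMText label set.

```python
# Sketch of classification as next-token prediction with T5 (Hugging Face
# transformers). The prompt format and label strings here are illustrative,
# not the paper's exact setup.
from transformers import T5ForConditionalGeneration, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

text = "classify source: The mitochondria is the powerhouse of the cell."
label = "gpt3.5"                       # target emitted as ordinary tokens, no extra head

enc = tokenizer(text, return_tensors="pt")
target = tokenizer(label, return_tensors="pt").input_ids

# Fine-tuning step: the standard seq2seq cross-entropy over the label tokens.
loss = model(input_ids=enc.input_ids, attention_mask=enc.attention_mask,
             labels=target).loss
loss.backward()

# Inference: generate the label string directly.
pred_ids = model.generate(**enc, max_new_tokens=4)
print(tokenizer.decode(pred_ids[0], skip_special_tokens=True))
```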
PyPose v0.6: The Imperative Programming Interface for Robotics
Zhan, Zitong, Li, Xiangfu, Li, Qihang, He, Haonan, Pandey, Abhinav, Xiao, Haitao, Xu, Yangmengfei, Chen, Xiangyu, Xu, Kuan, Cao, Kun, Zhao, Zhipeng, Wang, Zihan, Xu, Huan, Fang, Zihang, Chen, Yutian, Wang, Wentao, Fang, Xu, Du, Yi, Wu, Tianhao, Lin, Xiao, Qiu, Yuheng, Yang, Fan, Shi, Jingnan, Su, Shaoshu, Lu, Yiren, Fu, Taimeng, Dantu, Karthik, Wu, Jiajun, Xie, Lihua, Hutter, Marco, Carlone, Luca, Scherer, Sebastian, Huang, Daning, Hu, Yaoyu, Geng, Junyi, Wang, Chen
PyPose is an open-source library for robot learning. It combines a learning-based approach with physics-based optimization, which enables seamless end-to-end robot learning. It has been used in many tasks due to its meticulously designed application programming interface (API) and efficient implementation. Since its initial launch in early 2022, PyPose has experienced significant enhancements, incorporating a wide variety of new features into its platform. To satisfy the growing demand for understanding and utilizing the library and to reduce the learning curve for new users, we present the fundamental design principle of the imperative programming interface, and showcase the flexible usage of diverse functionalities and modules using an extremely simple Dubins car example. We also demonstrate that PyPose can be easily used to navigate a real quadruped robot with a few lines of code.
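For reference, a Dubins car rollout is only a few lines even in plain PyTorch; the sketch below does not use the PyPose API itself and merely illustrates the toy model the tutorial example builds on.

```python
import torch

def dubins_step(state, v, omega, dt=0.1):
    """One Euler step of the Dubins car: state = (x, y, heading), constant
    forward speed v and turn rate omega. Plain PyTorch, not the PyPose API."""
    x, y, theta = state
    return torch.stack([x + v * torch.cos(theta) * dt,
                        y + v * torch.sin(theta) * dt,
                        theta + omega * dt])

state = torch.zeros(3)                       # start at the origin, facing +x
for _ in range(50):                          # drive a gentle left arc
    state = dubins_step(state, v=1.0, omega=0.3)
print(state)                                 # final pose after 5 seconds
```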