Goto

Collaborating Authors

 Weiss, Ron


Control of Microrobots Using Model Predictive Control and Gaussian Processes for Disturbance Estimation

arXiv.org Artificial Intelligence

This paper presents a control framework for magnetically actuated micron-scale robots ($\mu$bots) designed to mitigate disturbances and improve trajectory tracking. To address the challenges posed by unmodeled dynamics and environmental variability, we combine data-driven modeling with model-based control to accurately track desired trajectories using a relatively small amount of data. The system is represented with a simple linear model, and Gaussian Processes (GPs) are employed to capture and estimate disturbances. This disturbance-enhanced model is then integrated into a Model Predictive Controller (MPC). Our approach demonstrates promising performance in both simulation and experimental setups, showcasing its potential for precise and reliable microrobot control in complex environments.
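
As a rough illustration of the approach the abstract describes, the sketch below pairs a simple linear motion model with per-dimension GPs fit to one-step residuals, then rolls the GP-corrected model forward inside a short-horizon MPC. The model matrices, kernel choices, horizon, and cost weights are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: linear microrobot model + GP disturbance estimate + MPC.
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

dt = 0.1
A = np.eye(2)       # assumed position-only linear model
B = dt * np.eye(2)  # control input = commanded velocity

def fit_disturbance_gps(X, U, X_next):
    """Fit one GP per state dimension to the observed model mismatch
    d_k = x_{k+1} - (A x_k + B u_k), from logged (x, u, x') data."""
    Z = np.hstack([X, U])                # GP input: state and control
    D = X_next - (X @ A.T + U @ B.T)     # residual disturbance
    kern = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-4)
    return [GaussianProcessRegressor(kernel=kern).fit(Z, D[:, i])
            for i in range(D.shape[1])]

def predict_step(x, u, gps):
    z = np.hstack([x, u])[None, :]
    d = np.array([gp.predict(z)[0] for gp in gps])  # GP mean disturbance
    return A @ x + B @ u + d

def mpc_action(x0, ref, gps, H=5, u_max=1.0):
    """Track waypoints ref (H x 2) with the disturbance-enhanced model."""
    def cost(u_flat):
        u_seq = u_flat.reshape(H, 2)
        x, c = x0, 0.0
        for k in range(H):
            x = predict_step(x, u_seq[k], gps)
            c += np.sum((x - ref[k]) ** 2) + 1e-2 * np.sum(u_seq[k] ** 2)
        return c
    res = minimize(cost, np.zeros(2 * H),
                   bounds=[(-u_max, u_max)] * (2 * H))
    return res.x[:2]  # apply only the first input of the sequence
```

Only the first optimized input is applied each step and the GPs can be refit as new data arrives, which is the usual receding-horizon pattern for this kind of disturbance-enhanced MPC.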


Learning a Tracking Controller for Rolling $\mu$bots

arXiv.org Artificial Intelligence

Micron-scale robots ($\mu$bots) have recently shown great promise for emerging medical applications. Accurately controlling $\mu$bots, while critical to their successful deployment, is challenging. In this work, we consider the problem of tracking a reference trajectory using a $\mu$bot in the presence of disturbances and uncertainty. The disturbances primarily come from Brownian motion and other environmental phenomena, while the uncertainty originates from errors in the model parameters. We model the $\mu$bot as an uncertain unicycle that is controlled by a global magnetic field. To compensate for disturbances and uncertainties, we develop a nonlinear mismatch controller. We define the model mismatch error as the difference between our model's predicted velocity and the actual velocity of the $\mu$bot. We employ a Gaussian Process to learn the model mismatch error as a function of the applied control input. Then we use a least-squares minimization to select a control action that minimizes the difference between the actual velocity of the $\mu$bot and a reference velocity. We demonstrate the online performance of our joint learning and control algorithm in simulation, where our approach accurately learns the model mismatch and improves tracking performance. We also validate our approach in an experiment and show that certain error metrics are reduced by up to $40\%$.
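
The sketch below illustrates the mismatch-learning loop the abstract describes: a GP maps the applied control input to the observed velocity error, and each step selects the input whose corrected predicted velocity best matches the reference. A grid search stands in for the paper's least-squares step, and the nominal rolling model and all constants are illustrative assumptions.

```python
# Minimal sketch: GP model-mismatch learning for a rolling microrobot.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

speed_gain = 1.0  # assumed rolling-speed gain of the nominal model

def model_velocity(u):
    """Nominal model: u = (spin rate, heading) of the magnetic field."""
    rate, heading = u
    return speed_gain * rate * np.array([np.cos(heading), np.sin(heading)])

kern = RBF(length_scale=0.5) + WhiteKernel(noise_level=1e-4)
gps = [GaussianProcessRegressor(kernel=kern) for _ in range(2)]
U_hist, E_hist = [], []  # (input, mismatch) training pairs

def record(u, v_actual):
    """Log a mismatch sample and refit the per-axis GPs online."""
    U_hist.append(u)
    E_hist.append(v_actual - model_velocity(u))  # model mismatch error
    U, E = np.array(U_hist), np.array(E_hist)
    for i, gp in enumerate(gps):
        gp.fit(U, E[:, i])

def select_control(v_ref, rates, headings):
    """Pick the candidate input whose corrected velocity best matches
    v_ref (an unfitted sklearn GP predicts its zero prior mean)."""
    best_u, best_err = None, np.inf
    for r in rates:
        for h in headings:
            u = np.array([r, h])
            e = np.array([gp.predict(u[None, :])[0] for gp in gps])
            err = np.sum((model_velocity(u) + e - v_ref) ** 2)
            if err < best_err:
                best_u, best_err = u, err
    return best_u
```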


Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

Neural Information Processing Systems

We describe a neural network-based system for text-to-speech (TTS) synthesis that is able to generate speech audio in the voice of many different speakers, including those unseen during training. Our system consists of three independently trained components: (1) a speaker encoder network, trained on a speaker verification task using an independent dataset of noisy speech from thousands of speakers without transcripts, to generate a fixed-dimensional embedding vector from only seconds of reference speech from a target speaker; (2) a sequence-to-sequence synthesis network based on Tacotron 2, which generates a mel spectrogram from text, conditioned on the speaker embedding; (3) an auto-regressive WaveNet-based vocoder that converts the mel spectrogram into a sequence of time-domain waveform samples. We demonstrate that the proposed model is able to transfer the knowledge of speaker variability learned by the discriminatively trained speaker encoder to the new task, and is able to synthesize natural speech from speakers that were not seen during training. We quantify the importance of training the speaker encoder on a large and diverse speaker set in order to obtain the best generalization performance. Finally, we show that randomly sampled speaker embeddings can be used to synthesize speech in the voice of novel speakers dissimilar from those used in training, indicating that the model has learned a high-quality speaker representation.
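
The three-stage pipeline in the abstract can be summarized structurally as below; `speaker_encoder`, `synthesizer`, and `vocoder` are hypothetical stand-ins for the independently trained networks, and no real library API is implied.

```python
# Structural sketch of the three independently trained TTS components.
import numpy as np

def synthesize(text, reference_wav, speaker_encoder, synthesizer, vocoder):
    # 1) Fixed-dimensional speaker embedding from seconds of reference
    #    audio; the encoder is trained without transcripts.
    embedding = speaker_encoder(reference_wav)  # shape: (d,)
    # 2) Tacotron 2-style network: mel spectrogram from text,
    #    conditioned on the speaker embedding.
    mel = synthesizer(text, embedding)          # shape: (frames, n_mels)
    # 3) Autoregressive WaveNet-style vocoder inverts the spectrogram.
    return vocoder(mel)                         # shape: (samples,)

def synthesize_novel(text, embedding_dim, synthesizer, vocoder, rng):
    """Novel-voice synthesis from a randomly sampled embedding, as the
    abstract's final result describes (unit-norm is an assumed choice)."""
    embedding = rng.normal(size=embedding_dim)
    embedding /= np.linalg.norm(embedding)
    return vocoder(synthesizer(text, embedding))
```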


Affinity Weighted Embedding

arXiv.org Machine Learning

Supervised (linear) embedding models like Wsabie and PSI have proven successful at ranking, recommendation and annotation tasks. However, despite being scalable to large datasets, they do not take full advantage of the extra data due to their linear nature, and they typically underfit. We propose a new class of models which aim to provide improved performance while retaining many of the benefits of the existing class of embedding models. Our new approach works by iteratively learning a linear embedding model where the next iteration's features and labels are reweighted as a function of the previous iteration. We describe several variants of the family and give some initial results.
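
A minimal sketch of the iterative scheme just described: each round fits a linear embedding with a Wsabie-style margin ranking update, then reweights each example's features by the current model's affinity to its true label before the next round. The sigmoid reweighting and all hyperparameters are illustrative choices, not the paper's.

```python
# Sketch: iterative affinity-weighted linear embedding.
import numpy as np

def fit_linear_embedding(X, y, n_labels, dim=32, lr=0.05, epochs=5, seed=0):
    """Wsabie-style SGD on a hinge ranking loss with sampled negatives."""
    rng = np.random.default_rng(seed)
    W = 0.01 * rng.normal(size=(X.shape[1], dim))  # feature embedding
    V = 0.01 * rng.normal(size=(n_labels, dim))    # label embedding
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            z = X[i] @ W
            neg = rng.integers(n_labels)           # sampled negative label
            if neg == y[i]:
                continue
            margin = 1.0 - z @ V[y[i]] + z @ V[neg]
            if margin > 0:                         # hinge violated: step
                W += lr * np.outer(X[i], V[y[i]] - V[neg])
                V[y[i]] += lr * z
                V[neg] -= lr * z
    return W, V

def affinity_weighted_rounds(X, y, n_labels, rounds=3):
    Xw = X.copy()
    for _ in range(rounds):
        W, V = fit_linear_embedding(Xw, y, n_labels)
        # Reweight features by the model's affinity to the true label
        # (sigmoid keeps weights positive and bounded; an assumed choice).
        aff = 1.0 / (1.0 + np.exp(-(Xw @ W * V[y]).sum(axis=1)))
        Xw = X * aff[:, None]
    return W, V
```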


Latent Collaborative Retrieval

arXiv.org Artificial Intelligence

Retrieval tasks typically require a ranking of items given a query. Collaborative filtering tasks, on the other hand, learn to model users' preferences over items. In this paper we study the joint problem of recommending items to a user with respect to a given query, which is a surprisingly common task. This setup differs from the standard collaborative filtering one in that we are given a query × user × item tensor for training instead of the more traditional user × item matrix. Compared to document retrieval, we do have a query, but we may or may not have content features (we will consider both cases), and we can also take the user's profile into account. We introduce a factorized model for this new task that optimizes the top-ranked items returned for the given query and user. We report empirical results where it outperforms several baselines.
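
The sketch below shows one plausible factorized scorer over (query, user, item) triples, trained with a pairwise ranking loss on observed triples. The specific parameterization (elementwise product of query and user embeddings, dotted with the item embedding) is an illustrative assumption, not necessarily the paper's exact model.

```python
# Sketch: factorized ranking model over a query x user x item tensor.
import numpy as np

class TripleRanker:
    def __init__(self, n_queries, n_users, n_items, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        self.Q = 0.01 * rng.normal(size=(n_queries, dim))
        self.U = 0.01 * rng.normal(size=(n_users, dim))
        self.I = 0.01 * rng.normal(size=(n_items, dim))
        self.rng = rng

    def score(self, q, u, i):
        return (self.Q[q] * self.U[u]) @ self.I[i]

    def fit(self, triples, epochs=10, lr=0.05):
        """triples: int array of observed (query, user, item) rows."""
        n_items = self.I.shape[0]
        for _ in range(epochs):
            for q, u, i in triples[self.rng.permutation(len(triples))]:
                j = self.rng.integers(n_items)     # sampled negative item
                if j == i:
                    continue
                margin = 1.0 - self.score(q, u, i) + self.score(q, u, j)
                if margin > 0:                     # hinge violated: step
                    ctx = self.Q[q] * self.U[u]    # query-user context
                    self.I[i] += lr * ctx
                    self.I[j] -= lr * ctx
                    diff = self.I[i] - self.I[j]
                    self.Q[q] += lr * self.U[u] * diff
                    self.U[u] += lr * self.Q[q] * diff

    def rank_items(self, q, u, top_k=10):
        """Top-ranked items for the given query and user."""
        scores = (self.Q[q] * self.U[u]) @ self.I.T
        return np.argsort(-scores)[:top_k]
```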