Goto

Collaborating Authors

 sin 1



A Warm up Example Sample Covariance and Mar Equation

Neural Information Processing Systems

Results obtained by averaging over 50 runs. We start by introducing the following lemma. To prove Theorem 2, it indeed suffices to prove the following lemma. ' Q and depends on U Qy which, up to further simplifications, concludes the proof of Theorem 3. F or any > 0 and defined in (8), we have 1. Developing the inverse we obtain = " 1 Results obtained by averaging over 30 runs.


Safe Circumnavigation of a Hostile Target Using Range-Based Measurements

Bhati, Gaurav Singh, Vaishnavi, Arukonda, Jain, Anoop

arXiv.org Artificial Intelligence

Robotic systems are frequently deployed in missions that are dull, dirty, and dangerous, where ensuring their safety is of paramount importance when designing stabilizing controllers to achieve their desired goals. This paper addresses the problem of safe circumnavigation around a hostile target by a nonholonomic robot, with the objective of maintaining a desired safe distance from the target. Our solution approach involves incorporating an auxiliary circle into the problem formulation, which assists in navigating the robot around the target using available range-based measurements. By leveraging the concept of a barrier Lyapunov function, we propose a novel control law that ensures stable circumnavigation around the target while preventing the robot from entering the safety circle. This controller is designed based on a parameter that depends on the radii of three circles, namely the stabilizing circle, the auxiliary circle, and the safety circle. By identifying an appropriate range for this design parameter, we rigorously prove the stability of the desired equilibrium of the closed-loop system. Additionally, we provide an analysis of the robot's motion within the auxiliary circle, which is influenced by a gain parameter in the proposed controller. Simulation and experimental results are presented to illustrate the key theoretical developments.


Development of a Human-Robot Interaction Platform for Dual-Arm Robots Based on ROS and Multimodal Artificial Intelligence

Canh, Thanh Nguyen, Nguyen, Ba Phuong, Tran, Hong Quan, HoangVan, Xiem

arXiv.org Artificial Intelligence

In this paper, we propose the development of an interactive platform between humans and a dual-arm robotic system based on the Robot Operating System (ROS) and a multimodal artificial intelligence model. Our proposed platform consists of two main components: a dual-arm robotic hardware system and software that includes image processing tasks and natural language processing using a 3D camera and embedded computing. First, we designed and developed a dual-arm robotic system with a positional accuracy of less than 2 cm, capable of operating independently, performing industrial and service tasks while simultaneously simulating and modeling the robot in the ROS environment. Second, artificial intelligence models for image processing are integrated to execute object picking and classification tasks with an accuracy of over 90%. Finally, we developed remote control software using voice commands through a natural language processing model. Experimental results demonstrate the accuracy of the multimodal artificial intelligence model and the flexibility of the dual-arm robotic system in interactive human environments.


Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models

Kedia, Akhil, Zaidi, Mohd Abbas, Khyalia, Sushil, Jung, Jungho, Goka, Harshith, Lee, Haejun

arXiv.org Artificial Intelligence

In spite of their huge success, transformer models remain difficult to scale in depth. In this work, we develop a unified signal propagation theory and provide formulae that govern the moments of the forward and backward signal through the transformer model. Our framework can be used to understand and mitigate vanishing/exploding gradients, rank collapse, and instability associated with high attention scores. We also propose DeepScaleLM, an initialization and scaling scheme that conserves unit output/gradient moments throughout the model, enabling the training of very deep models with 100s of layers. We find that transformer models could be much deeper - our deep models with fewer parameters outperform shallow models in Language Modeling, Speech Translation, and Image Classification, across Encoder-only, Decoder-only and Encoder-Decoder variants, for both Pre-LN and Post-LN transformers, for multiple datasets and model sizes. These improvements also translate into improved performance on downstream Question Answering tasks and improved robustness for image classification.


Moral Machine or Tyranny of the Majority?

Feffer, Michael, Heidari, Hoda, Lipton, Zachary C.

arXiv.org Artificial Intelligence

With Artificial Intelligence systems increasingly applied in consequential domains, researchers have begun to ask how these systems ought to act in ethically charged situations where even humans lack consensus. In the Moral Machine project, researchers crowdsourced answers to "Trolley Problems" concerning autonomous vehicles. Subsequently, Noothigattu et al. (2018) proposed inferring linear functions that approximate each individual's preferences and aggregating these linear models by averaging parameters across the population. In this paper, we examine this averaging mechanism, focusing on fairness concerns in the presence of strategic effects. We investigate a simple setting where the population consists of two groups, with the minority constituting an {\alpha} < 0.5 share of the population. To simplify the analysis, we consider the extreme case in which within-group preferences are homogeneous. Focusing on the fraction of contested cases where the minority group prevails, we make the following observations: (a) even when all parties report their preferences truthfully, the fraction of disputes where the minority prevails is less than proportionate in {\alpha}; (b) the degree of sub-proportionality grows more severe as the level of disagreement between the groups increases; (c) when parties report preferences strategically, pure strategy equilibria do not always exist; and (d) whenever a pure strategy equilibrium exists, the majority group prevails 100% of the time. These findings raise concerns about stability and fairness of preference vector averaging as a mechanism for aggregating diverging voices. Finally, we discuss alternatives, including randomized dictatorship and median-based mechanisms.


A Probabilistic Model of Activity Recognition with Loose Clothing

Shen, Tianchen, Di Giulio, Irene, Howard, Matthew

arXiv.org Artificial Intelligence

Human activity recognition has become an attractive research area with the development of on-body wearable sensing technology. With comfortable electronic-textiles, sensors can be embedded into clothing so that it is possible to record human movement outside the laboratory for long periods. However, a long-standing issue is how to deal with motion artefacts introduced by movement of clothing with respect to the body. Surprisingly, recent empirical findings suggest that cloth-attached sensor can actually achieve higher accuracy of activity recognition than rigid-attached sensor, particularly when predicting from short time-windows. In this work, a probabilistic model is introduced in which this improved accuracy and resposiveness is explained by the increased statistical distance between movements recorded via fabric sensing. The predictions of the model are verified in simulated and real human motion capture experiments, where it is evident that this counterintuitive effect is closely captured.


A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent

Liao, Zhenyu, Couillet, Romain, Mahoney, Michael W.

arXiv.org Machine Learning

This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of feature space $N$ are all large and comparable. In this regime, the random RFF Gram matrix no longer converges to the well-known limiting Gaussian kernel matrix (as it does when $N \to \infty$ alone), but it still has a tractable behavior that is captured by our analysis. This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$. Based on these estimates, a precise characterization of two qualitatively different phases of learning, including the phase transition between them, is provided; and the corresponding double descent test error curve is derived from this phase transition behavior. These results do not depend on strong assumptions on the data distribution, and they perfectly match empirical results on finite-dimensional real-world data sets.


Clustering Time-Series by a Novel Slope-Based Similarity Measure Considering Particle Swarm Optimization

Kamalzadeh, Hossein, Ahmadi, Abbas, Mansour, Saeed

arXiv.org Machine Learning

Recently there has been an increase in the studies on time - series data mining specifically time - series clustering due to the vast existe nce of time - series in various domains. The large volume of data in the form of time - series make s it necessary to employ various techniques such as clustering to understand the data and to extract information and hidden patterns. In the field of clustering specifically, time - series clustering, the most important aspects are the similarity measure used and the algorithm employed to conduct the clustering. In this paper, a new similarity measure for time - series clustering is developed based on a combination of a simple representation of time - series, slope of each segment of time - series, Euclidean distance and the so - called dynamic time warping. It is proved in this paper that the proposed distance measure is metric and thus indexing can be applied. For the task of clustering, the Particle Swarm Optimization algorithm is employed. The proposed similarity measure is compared to three existing measures in terms of various criteria used for the evaluation of clustering algorithms. The results indicate that the propo sed similarity measure outperforms the rest in almost every dataset used in this paper.


Mean-field Analysis of Batch Normalization

Wei, Mingwei, Stokes, James, Schwab, David J

arXiv.org Machine Learning

Batch Normalization (BatchNorm) is an extremely useful component of modern neural network architectures, enabling optimization using higher learning rates and achieving faster convergence. In this paper, we use mean-field theory to analytically quantify the impact of BatchNorm on the geometry of the loss landscape for multi-layer networks consisting of fully-connected and convolutional layers. We show that it has a flattening effect on the loss landscape, as quantified by the maximum eigenvalue of the Fisher Information Matrix. These findings are then used to justify the use of larger learning rates for networks that use BatchNorm, and we provide quantitative characterization of the maximal allowable learning rate to ensure convergence. Experiments support our theoretically predicted maximum learning rate, and furthermore suggest that networks with smaller values of the BatchNorm parameter γ achieve lower loss after the same number of epochs of training.