Materials
Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection
Bai, Yu, Chen, Fan, Wang, Huan, Xiong, Caiming, Mei, Song
Neural sequence models based on the transformer architecture have demonstrated remarkable \emph{in-context learning} (ICL) abilities, where they can perform new tasks when prompted with training and test examples, without any parameter update to the model. This work first provides a comprehensive statistical theory for transformers to perform ICL. Concretely, we show that transformers can implement a broad class of standard machine learning algorithms in context, such as least squares, ridge regression, Lasso, learning generalized linear models, and gradient descent on two-layer neural networks, with near-optimal predictive power on various in-context data distributions. Using an efficient implementation of in-context gradient descent as the underlying mechanism, our transformer constructions admit mild size bounds, and can be learned with polynomially many pretraining sequences. Building on these ``base'' ICL algorithms, intriguingly, we show that transformers can implement more complex ICL procedures involving \emph{in-context algorithm selection}, akin to what a statistician can do in real life -- A \emph{single} transformer can adaptively select different base ICL algorithms -- or even perform qualitatively different tasks -- on different input sequences, without any explicit prompting of the right algorithm or task. We both establish this in theory by explicit constructions, and also observe this phenomenon experimentally. In theory, we construct two general mechanisms for algorithm selection with concrete examples: pre-ICL testing, and post-ICL validation. As an example, we use the post-ICL validation mechanism to construct a transformer that can perform nearly Bayes-optimal ICL on a challenging task -- noisy linear models with mixed noise levels. Experimentally, we demonstrate the strong in-context algorithm selection capabilities of standard transformer architectures.
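The post-ICL validation idea named in the abstract can be illustrated as an ordinary statistical procedure: run several base algorithms on part of the in-context examples, then keep the one with the lowest loss on the rest. The sketch below is only that procedure written out by hand for one-dimensional ridge regression, not the paper's transformer construction. The function names, the candidate regularization strengths, and the synthetic data are all assumptions made for illustration.

```python
import random

def ridge_1d(xs, ys, lam):
    """Closed-form 1-D ridge estimate: w = sum(x*y) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(w, xs, ys):
    """Mean squared error of the linear predictor y = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def select_by_validation(xs, ys, lams, n_train):
    """Fit one ridge model per lam on the first n_train examples, then
    return the (lam, w) pair with the lowest loss on the held-out rest."""
    x_tr, y_tr = xs[:n_train], ys[:n_train]
    x_va, y_va = xs[n_train:], ys[n_train:]
    fits = [(lam, ridge_1d(x_tr, y_tr, lam)) for lam in lams]
    return min(fits, key=lambda fit: mse(fit[1], x_va, y_va))

# Synthetic "in-context" examples (illustrative values, not from the paper).
rng = random.Random(0)
true_w, noise_std = 2.0, 0.1
xs = [rng.gauss(0.0, 1.0) for _ in range(40)]
ys = [true_w * x + rng.gauss(0.0, noise_std) for x in xs]

best_lam, best_w = select_by_validation(xs, ys, lams=[0.01, 1.0, 100.0], n_train=20)
print(best_lam, best_w)
```

With low noise, the heavily regularized candidate shrinks the estimate far from the true slope and loses on the validation split, so a lighter penalty is selected.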
FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy
Sun, Yan, Shen, Li, Huang, Tiansheng, Ding, Liang, Tao, Dacheng
Federated learning is an emerging distributed machine learning framework which jointly trains a global model via a large number of local devices with data privacy protections. Its performance suffers from the non-vanishing biases introduced by inconsistent local optima and from the rugged client drift caused by local over-fitting. In this paper, we propose a novel and practical method, FedSpeed, to alleviate the negative impacts posed by these problems. Concretely, FedSpeed applies a prox-correction term to the current local updates to efficiently reduce the biases introduced by the prox-term, a regularizer necessary to maintain strong local consistency. Furthermore, FedSpeed merges the vanilla stochastic gradient with a perturbation computed from an extra gradient ascent step in the neighborhood, thereby alleviating the issue of local over-fitting. Our theoretical analysis indicates that the convergence rate is related to both the communication rounds $T$ and the local intervals $K$, with an upper bound of $\mathcal{O}(1/T)$ when the local interval is set properly. Moreover, we conduct extensive experiments on real-world data to demonstrate the efficiency of FedSpeed, which is significantly faster than several baselines and achieves state-of-the-art (SOTA) performance in general FL experimental settings. Our code is available at \url{https://github.com/woodenchild95/FL-Simulator.git}.
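To make the two ingredients in this abstract concrete, here is a minimal scalar sketch of one local update that merges the plain gradient with a gradient taken after a small ascent step, and adds a prox-term pulling toward the global model. It is an assumption-laden illustration, not FedSpeed itself: the merging weight `alpha`, the ascent radius `rho`, the prox coefficient `prox_lam`, and the exact form of the update are hypothetical, and the prox-correction term described in the abstract is deliberately omitted.

```python
def local_step(w, w_global, grad_fn, lr=0.1, rho=0.05, alpha=0.5, prox_lam=10.0):
    """One hypothetical local update on a scalar parameter w."""
    g = grad_fn(w)
    # Extra gradient-ascent step: move a distance rho uphill, then re-evaluate.
    norm = abs(g) or 1.0  # avoid dividing by zero at a stationary point
    g_perturbed = grad_fn(w + rho * g / norm)
    # Merge the vanilla gradient with the perturbed one.
    merged = (1.0 - alpha) * g + alpha * g_perturbed
    # Prox-term: pulls the local iterate back toward the global model.
    prox = (w - w_global) / prox_lam
    return w - lr * (merged + prox)

# Toy local objective f(w) = 0.5 * (w - 3)^2, so the local optimum is w = 3.
grad = lambda w: w - 3.0

w_global = 0.0
w_local = w_global
for _ in range(200):
    w_local = local_step(w_local, w_global, grad)
print(w_local)
```

In this toy setting the iterate settles short of the local optimum at 3 because the prox-term keeps pulling it toward the global model. That is a small-scale picture of the bias that, according to the abstract, the prox-correction term is meant to reduce.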
Touch, press and stroke: a soft capacitive sensor skin
Sarwar, Mirza S., Ishizaki, Ryusuke, Morton, Kieran, Preston, Claire, Nguyen, Tan, Fan, Xu, Dupont, Bertille, Hogarth, Leanna, Yoshiike, Takahide, Mirabbasi, Shahriar, Madden, John D. W.
Soft sensors that can discriminate shear and normal force could help provide machines the fine control desirable for safe and effective physical interactions with people. A capacitive sensor is made for this purpose, composed of patterned elastomer and containing both fixed and sliding pillars that allow the sensor to deform and buckle, much like skin itself. The sensor differentiates between simultaneously applied pressure and shear. In addition, finger proximity is detectable up to 15 mm, with a pressure and shear sensitivity of 1 kPa and a displacement resolution of 50 µm. The operation is demonstrated on a simple gripper holding a cup. The combination of features and the straightforward fabrication method make this sensor a candidate for implementation as a sensing skin for humanoid robotics applications.
Summary
A 3-axis capacitive sensor with a dielectric composed of elastomer pillars creates a skinlike deformation that allows detection of approach, light touch, pressure and shear.
MAIN TEXT
Introduction
To accommodate complex interactions between humans and robots, it is important to design a method for touch identification that can be active on fingertips and other sensing surfaces. Ideally, the approach will be scalable to cover most of a robot's surface area, forming an artificial or electronic skin (1, 2). Such a technology is also sought for neurally controlled prosthetic devices to enhance motor control (3, 4). The functional requirements of an artificial skin include the ability to sense and differentiate tactile stimuli such as light touch, pressure and shear (1). Having a smooth and soft skin, rather than a hard or bumpy surface, helps make the surface more lifelike, while the compliance allows for lower bandwidth control systems. There is a plethora of work on flexible touch and pressure sensors.
Panel Data Nowcasting: The Case of Price-Earnings Ratios
Babii, Andrii, Ball, Ryan T., Ghysels, Eric, Striaukas, Jonas
The paper uses structured machine learning regressions for nowcasting with panel data consisting of series sampled at different frequencies. Motivated by the problem of predicting corporate earnings for a large cross-section of firms with macroeconomic, financial, and news time series sampled at different frequencies, we focus on the sparse-group LASSO regularization which can take advantage of the mixed frequency time series panel data structures. Our empirical results show the superior performance of our machine learning panel data regression models over analysts' predictions, forecast combinations, firm-specific time series regression models, and standard machine learning methods.
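For readers unfamiliar with the sparse-group LASSO, its penalty in the standard textbook form mixes an L1 term (sparsity within groups) with a sum of per-group L2 norms (sparsity across groups). The sketch below evaluates only that penalty. The function name, the weighting convention with `alpha` in [0, 1], and the example grouping are assumptions for illustration; in a mixed-frequency setting a group might collect the lags of one predictor, but the paper's exact specification is not reproduced here.

```python
import math

def sparse_group_lasso_penalty(beta, groups, lam, alpha):
    """lam * (alpha * ||beta||_1 + (1 - alpha) * sum_g ||beta_g||_2).

    beta   : list of coefficients
    groups : list of index lists, one per group
    alpha  : 1.0 gives the plain LASSO, 0.0 gives the group LASSO
    """
    l1 = sum(abs(b) for b in beta)
    group_l2 = sum(math.sqrt(sum(beta[i] ** 2 for i in g)) for g in groups)
    return lam * (alpha * l1 + (1.0 - alpha) * group_l2)

# Two groups of two coefficients each; the second group is entirely zero.
beta = [3.0, 4.0, 0.0, 0.0]
groups = [[0, 1], [2, 3]]
penalty = sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5)
print(penalty)  # 0.5 * 7 + 0.5 * 5 = 6.0
```

A group whose coefficients are all zero contributes nothing to either term, which is what lets the penalty switch off whole predictors at once.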
Denoise Pretraining on Nonequilibrium Molecules for Accurate and Transferable Neural Potentials
Wang, Yuyang, Xu, Changwen, Li, Zijie, Farimani, Amir Barati
Recent advances in equivariant graph neural networks (GNNs) have made deep learning amenable to developing fast surrogate models to expensive ab initio quantum mechanics (QM) approaches for molecular potential predictions. However, building accurate and transferable potential models using GNNs remains challenging, as the data is greatly limited by the expensive computational costs and level of theory of QM methods, especially for large and complex molecular systems. In this work, we propose denoise pretraining on nonequilibrium molecular conformations to achieve more accurate and transferable GNN potential predictions. Specifically, atomic coordinates of sampled nonequilibrium conformations are perturbed by random noises and GNNs are pretrained to denoise the perturbed molecular conformations which recovers the original coordinates. Rigorous experiments on multiple benchmarks reveal that pretraining significantly improves the accuracy of neural potentials. Furthermore, we show that the proposed pretraining approach is model-agnostic, as it improves the performance of different invariant and equivariant GNNs. Notably, our models pretrained on small molecules demonstrate remarkable transferability, improving performance when fine-tuned on diverse molecular systems, including different elements, charged molecules, biomolecules, and larger systems. These results highlight the potential for leveraging denoise pretraining approaches to build more generalizable neural potentials for complex molecular systems.
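The data side of the denoising objective described above can be sketched in a few lines: perturb each atomic coordinate with Gaussian noise and keep the noise as the regression target. This is a minimal illustration under stated assumptions, not the authors' pipeline. The noise scale `sigma`, the choice to predict the noise rather than the clean coordinates directly, the mean-squared-error loss, and the example coordinates are all hypothetical, and the GNN itself is left out.

```python
import random

def make_denoising_pair(coords, sigma, rng):
    """Return (noisy_coords, noise) for one conformation.

    coords : list of [x, y, z] positions, one per atom
    sigma  : standard deviation of the Gaussian perturbation (assumed)
    """
    noise = [[rng.gauss(0.0, sigma) for _ in atom] for atom in coords]
    noisy = [[c + n for c, n in zip(atom, eps)] for atom, eps in zip(coords, noise)]
    return noisy, noise

def denoising_loss(pred_noise, noise):
    """Mean squared error between predicted and true per-coordinate noise."""
    sq = [(p - n) ** 2
          for p_atom, n_atom in zip(pred_noise, noise)
          for p, n in zip(p_atom, n_atom)]
    return sum(sq) / len(sq)

# Illustrative three-atom conformation (made-up coordinates).
coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
rng = random.Random(0)
noisy, noise = make_denoising_pair(coords, sigma=0.1, rng=rng)

# A model that predicts zero noise everywhere serves as a trivial baseline.
zero_pred = [[0.0, 0.0, 0.0] for _ in coords]
baseline_loss = denoising_loss(zero_pred, noise)
print(baseline_loss)
```

Subtracting the true noise from the noisy coordinates recovers the original conformation, so predicting the noise and recovering the coordinates are equivalent targets in this setup.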
Learning to reconstruct the bubble distribution with conductivity maps using Invertible Neural Networks and Error Diffusion
Kumar, Nishant, Krause, Lukas, Wondrak, Thomas, Eckert, Sven, Eckert, Kerstin, Gumhold, Stefan
Electrolysis is crucial for eco-friendly hydrogen production, but gas bubbles generated during the process hinder reactions, reduce cell efficiency, and increase energy consumption. Additionally, these gas bubbles cause changes in the conductivity inside the cell, resulting in corresponding variations in the induced magnetic field around the cell. Therefore, measuring these gas bubble-induced magnetic field fluctuations using external magnetic sensors and solving the inverse problem of Biot-Savart's Law allows for estimating the conductivity in the cell and, thus, bubble size and location. However, determining high-resolution conductivity maps from only a few induced magnetic field measurements is an ill-posed inverse problem. To overcome this, we exploit Invertible Neural Networks (INNs) to reconstruct the conductivity field. Our qualitative results and quantitative evaluation using random error diffusion show that INN achieves far superior performance compared to Tikhonov regularization.
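The Tikhonov baseline mentioned in this abstract has a simple closed form: for a linear forward model A x = b, the regularized estimate solves (AᵀA + λI) x = Aᵀb. The sketch below shows that formula on a deliberately tiny underdetermined system (one measurement, two unknowns). The forward matrix, the value of `lam`, and the helper names are illustrative assumptions and have nothing to do with the actual Biot-Savart operator used in the paper.

```python
def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def tikhonov(A, b, lam):
    """Minimize ||A x - b||^2 + lam * ||x||^2 via the normal equations."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return solve(AtA, Atb)

# One measurement of the sum of two unknowns: ill-posed without regularization.
x = tikhonov([[1.0, 1.0]], [2.0], lam=2.0)
print(x)
```

Without the λI term the matrix AᵀA is singular here, which is a small-scale version of the ill-posedness the abstract describes. The penalty makes the system solvable at the cost of shrinking the estimate.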
Hierarchical Planning and Policy Shaping Shared Autonomy for Articulated Robots
Yousefi, Ehsan, Chen, Mo, Sharf, Inna
In this work, we propose a novel shared autonomy framework to operate articulated robots. We provide strategies to design both the task-oriented hierarchical planning and policy shaping algorithms for efficient human-robot interactions in context-aware operation of articulated robots. Our framework for interplay between the human and the autonomy, as the participating agents in the system, is particularly influenced by the ideas from multi-agent systems, game theory, and theory of mind for a sliding level of autonomy. We formulate the sequential hierarchical human-in-the-loop decision making process by extending MDPs and Options framework to shared autonomy, and make use of deep RL techniques to train an uncertainty-aware shared autonomy policy. To fine-tune the formulation to a human, we use history of the system states, human actions, and their error with respect to a surrogate optimal model to encode human's internal state embeddings, beyond the designed values, by using conditional VAEs. We showcase the effectiveness of our formulation for different human skill levels and degrees of cooperativeness by using a case study of a feller-buncher machine in the challenging tasks of timber harvesting. Our framework is successful in providing a sliding level of autonomy from fully autonomous to fully manual, and is particularly successful in handling a noisy non-cooperative human agent in the loop. The proposed framework advances the state-of-the-art in shared autonomy for operating articulated robots, but can also be applied to other domains where autonomous operation is the ultimate goal.
A hybrid machine learning framework for clad characteristics prediction in metal additive manufacturing
During the past decade, metal additive manufacturing (MAM) has experienced significant developments and gained much attention due to its ability to fabricate complex parts, manufacture products with functionally graded materials, minimize waste, and enable low-cost customization. Despite these advantages, predicting the impact of processing parameters on the characteristics of an MAM printed clad is challenging due to the complex nature of MAM processes. Machine learning (ML) techniques can help connect the physics underlying the process and processing parameters to the clad characteristics. In this study, we introduce a hybrid approach which uses the data provided by a calibrated multi-physics computational fluid dynamic (CFD) model and experimental research to prepare the essential big dataset, and then uses a comprehensive framework consisting of various ML models to predict and understand clad characteristics. We first compile an extensive dataset by fusing experimental data into the data generated using the CFD model developed for this study. This dataset comprises critical clad characteristics, including geometrical features such as width, height, and depth, labels identifying clad quality, and processing parameters. Second, we use two sets of processing parameters for training the ML models: machine setting parameters and physics-aware parameters, along with versatile ML models and reliable evaluation metrics to create a comprehensive and scalable learning framework for predicting clad geometry and quality. This framework can serve as a basis for clad characteristics control and process optimization. The framework resolves many challenges of conventional modeling methods in MAM by solving the issue of data scarcity using a hybrid approach and introducing an efficient, accurate, and scalable platform for clad characteristics prediction and optimization.
A Biomimetic Fingerprint for Robotic Tactile Sensing
Quilachamín, Oscar Alberto Juiña, Navarro-Guerrero, Nicolás
Tactile sensors have been developed since the early '70s and have greatly improved, but there are still no widely adopted solutions. Various technologies, such as capacitive, piezoelectric, piezoresistive, optical, and magnetic, are used in haptic sensing. However, most sensors are not mechanically robust for many applications and cannot cope well with curved or sizeable surfaces. Aiming to address this problem, we present a 3D-printed fingerprint pattern to enhance the body-borne vibration signal for dynamic tactile feedback. The 3D-printed fingerprint patterns were designed and tested for an RH8D Adult size Robot Hand. The patterns significantly increased the signal's power to over 11 times the baseline. A public haptic dataset including 52 objects of several materials was created using the best fingerprint pattern and material.
UN body discusses potential for deep sea mining, permits may be coming soon
The International Seabed Authority -- the United Nations body that regulates the world's ocean floor -- is preparing to resume negotiations that could open the international seabed for mining, including for materials critical for the green energy transition. Years-long negotiations are reaching a critical point where the authority will soon need to begin accepting mining permit applications, adding to worries over the potential impacts on sparsely researched marine ecosystems and habitats of the deep sea. Here's a look at what deep sea mining is, why some companies and countries are applying for permits to carry it out and why environmental activists are raising concerns.