 transfer


Supplementary Material for PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization Appendix Outline

Neural Information Processing Systems

The appendix is organized as follows. In Appendix A, we report results for additional bounds for SVHN and ImageNet. We also report the compression size corresponding to our best bound values and compare it to the compression size obtained through standard pruning. Furthermore, in Appendix A.1 we prove why models cannot both be compressible and fit random labels. In Appendix B, we describe how optimization over hyperparameters like the intrinsic dimension impacts the PAC-Bayes bound. In Appendix C, we show how our PAC-Bayes bounds benefit from transfer learning.




Safe Continual Domain Adaptation after Sim2Real Transfer of Reinforcement Learning Policies in Robotics

Josifovski, Josip, Gu, Shangding, Malmir, Mohammadhossein, Huang, Haoliang, Auddy, Sayantan, Navarro-Guerrero, Nicolás, Spanos, Costas, Knoll, Alois

arXiv.org Artificial Intelligence

Domain randomization has emerged as a fundamental technique in reinforcement learning (RL) to facilitate the transfer of policies from simulation to real-world robotic applications. Many existing domain randomization approaches have been proposed to improve robustness and sim2real transfer. These approaches rely on wide randomization ranges to compensate for the unknown actual system parameters, leading to robust but inefficient real-world policies. In addition, the policies pretrained in the domain-randomized simulation are fixed after deployment due to the inherent instability of the optimization processes based on RL and the necessity of sampling exploitative but potentially unsafe actions on the real system. This limits the adaptability of the deployed policy to the inevitably changing system parameters or environment dynamics over time. We leverage safe RL and continual learning under domain-randomized simulation to address these limitations and enable safe deployment-time policy adaptation in real-world robot control. The experiments show that our method enables the policy to adapt and fit to the current domain distribution and environment dynamics of the real system while minimizing safety risks and avoiding issues like catastrophic forgetting of the general policy found in randomized simulation during the pretraining phase. Videos and supplementary material are available at https://safe-cda.github.io/.


Beyond Literal Token Overlap: Token Alignability for Multilinguality

Hämmerl, Katharina, Limisiewicz, Tomasz, Libovický, Jindřich, Fraser, Alexander

arXiv.org Artificial Intelligence

Previous work has considered token overlap, or even similarity of token distributions, as predictors for multilinguality and cross-lingual knowledge transfer in language models. However, these very literal metrics assign large distances to language pairs with different scripts, which can nevertheless show good cross-linguality. This limits the explanatory strength of token overlap for knowledge transfer between language pairs that use distinct scripts or follow different orthographic conventions. In this paper, we propose subword token alignability as a new way to understand the impact and quality of multilingual tokenisation. In particular, this metric predicts multilinguality much better when scripts are disparate and the overlap of literal tokens is low. We analyse this metric in the context of both encoder and decoder models, look at data size as a potential distractor, and discuss how this insight may be applied to multilingual tokenisation in future work. We recommend our subword token alignability metric for identifying optimal language pairs for cross-lingual transfer, as well as to guide the construction of better multilingual tokenisers in the future. We publish our code and reproducibility details.


Transferring Graph Neural Networks for Soft Sensor Modeling using Process Topologies

Theisen, Maximilian F., Meesters, Gabrie M. H., Schweidtmann, Artur M.

arXiv.org Artificial Intelligence

Data-driven soft sensors help in process operations by providing real-time estimates of otherwise hard-to-measure process quantities, e.g., viscosities or product concentrations. Currently, soft sensors need to be developed individually per plant. Using transfer learning, machine learning-based soft sensors could be reused and fine-tuned across plants and applications. However, transferring data-driven soft sensor models is in practice often not possible, because the fixed input structure of standard soft sensor models prohibits transfer if, e.g., the sensor information is not identical in all plants. We propose a topology-aware graph neural network approach for transfer learning of soft sensor models across multiple plants. In our method, plants are modeled as graphs: unit operations are nodes, streams are edges, and sensors are embedded as attributes. Our approach brings two advantages for transfer learning: First, we include not only sensor data but also crucial information on the plant topology. Second, the graph neural network algorithm is flexible with respect to its sensor inputs. This allows us to model data from different plants with different sensor networks. We test the transfer learning capabilities of our modeling approach on ammonia synthesis loops with different process topologies. We build a soft sensor predicting the ammonia concentration in the product. After training on data from one process, we successfully transfer our soft sensor model to a previously unseen process with a different topology. Our approach promises to extend data-driven soft sensors to cases where data from multiple plants can be leveraged.
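The graph representation described in this abstract can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation: the class names `UnitOperation` and `PlantGraph`, the choice to attach sensors to nodes rather than edges, and all unit names and sensor values are assumptions made for the example. It shows why a graph input is flexible: two plants with different topologies and different sensor sets fit the same structure.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class UnitOperation:
    """A node: one unit operation with whatever sensors this plant provides."""
    name: str
    kind: str
    # Assumption: sensors are stored as node attributes (name -> reading).
    sensors: dict[str, float] = field(default_factory=dict)


@dataclass
class PlantGraph:
    """A plant as a directed graph: units are nodes, streams are edges."""
    units: dict[str, UnitOperation] = field(default_factory=dict)
    streams: list[tuple[str, str]] = field(default_factory=list)

    def add_unit(self, name: str, kind: str, **sensors: float) -> None:
        self.units[name] = UnitOperation(name, kind, dict(sensors))

    def add_stream(self, src: str, dst: str) -> None:
        # A stream may only connect units that already exist in the graph.
        if src not in self.units or dst not in self.units:
            raise KeyError(f"unknown unit in stream {src!r} -> {dst!r}")
        self.streams.append((src, dst))

    def upstream(self, name: str) -> list[str]:
        """Names of units whose outlet streams feed the given unit."""
        return [s for s, d in self.streams if d == name]


# Hypothetical plant A: a loop with a recycle stream (made-up readings).
plant_a = PlantGraph()
plant_a.add_unit("compressor", "compressor", pressure_bar=150.0)
plant_a.add_unit("reactor", "reactor", temperature_c=450.0, pressure_bar=148.0)
plant_a.add_unit("separator", "separator", temperature_c=-5.0)
plant_a.add_stream("compressor", "reactor")
plant_a.add_stream("reactor", "separator")
plant_a.add_stream("separator", "compressor")  # recycle back to the loop

# Hypothetical plant B: different topology and a different sensor set.
plant_b = PlantGraph()
plant_b.add_unit("reactor", "reactor", temperature_c=430.0)
plant_b.add_unit("cooler", "heat_exchanger")  # no sensors installed here
plant_b.add_unit("separator", "separator", level_pct=40.0)
plant_b.add_stream("reactor", "cooler")
plant_b.add_stream("cooler", "separator")
```

Because neither the number of units nor the set of sensors per unit is fixed, a model that consumes such graphs does not need identical sensor information in every plant, which is the limitation of fixed-input soft sensors that the abstract points out.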


Transfer of Knowledge through Reverse Annealing: A Preliminary Analysis of the Benefits and What to Share

Osaba, Eneko, Villar-Rodriguez, Esther

arXiv.org Artificial Intelligence

Being immersed in the NISQ-era, current quantum annealers present limitations for solving optimization problems efficiently. To mitigate these limitations, D-Wave Systems developed a mechanism called Reverse Annealing, a specific type of quantum annealing designed to perform local refinement of good states found elsewhere. Despite the research activity around Reverse Annealing, none has theorized about the possible benefits related to the transfer of knowledge under this paradigm. This work moves in that direction and is driven by experimentation focused on answering two key research questions: i) is reverse annealing a paradigm that can benefit from knowledge transfer between similar problems? and ii) can we infer the characteristics that an input solution should meet to help increase the probability of success? To properly guide the tests in this paper, the well-known Knapsack Problem has been chosen for benchmarking purposes, using a total of 34 instances composed of 14 and 16 items.


On the Transfer of Knowledge in Quantum Algorithms

Villar-Rodriguez, Esther, Osaba, Eneko, Oregi, Izaskun, Romero, Sebastián V., Ferreiro-Vélez, Julián

arXiv.org Artificial Intelligence

The field of quantum computing is generating significant anticipation within the scientific and industrial communities due to its potential to revolutionize computing paradigms. Recognizing this potential, this paper explores the integration of transfer of knowledge techniques, traditionally used in classical artificial intelligence, into quantum computing. We present a comprehensive classification of the transfer models, focusing on Transfer Learning and Transfer Optimization. Additionally, we analyze relevant schemes in quantum computing that can benefit from knowledge sharing, and we delve into the potential synergies, supported by theoretical insights and initial experimental results. Our findings suggest that leveraging the transfer of knowledge can enhance the efficiency and effectiveness of quantum algorithms, particularly in the context of hybrid solvers. This approach not only accelerates the optimization process but also reduces the computational burden on quantum processors, making it a valuable tool for advancing quantum computing technologies.


Large Language Model Enhanced Machine Learning Estimators for Classification

Wu, Yuhang, Wang, Yingfei, Wang, Chu, Zheng, Zeyu

arXiv.org Artificial Intelligence

Pre-trained large language models (LLMs) have emerged as a powerful tool for simulating various scenarios and generating output given specific instructions and multimodal input. In this work, we analyze the specific use of LLMs to enhance a classical supervised machine learning method for classification problems. We propose a few approaches to integrate LLMs into a classical machine learning estimator to further enhance the prediction performance. We examine the performance of the proposed approaches through both standard supervised learning binary classification tasks and a transfer learning task where the test data exhibit distribution changes compared to the training data. Numerical experiments on four publicly available datasets suggest that using LLMs to enhance classical machine learning estimators can provide a significant improvement in prediction performance.


Learning Transfer Learning. Transfer learning is the process of…

#artificialintelligence

This concept is commonly studied in the field of machine learning, where it refers to the practice of storing knowledge gained from solving one problem and applying it to a different, related problem. Transfer learning is often viewed as a design methodology, as it involves applying previously learned information to new situations in order to improve the efficiency and effectiveness of the learning process. In other words, transfer learning allows individuals or machine learning algorithms to build upon their existing knowledge and skills in order to solve new problems. For example, if you have learned how to recognize cars, that knowledge could be useful in learning how to recognize trucks. Similarly, if you have learned how to ride a motorbike, that knowledge may be transferable to learning how to ride an e-scooter.
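As a toy illustration of reusing knowledge from one problem on a related one, the sketch below fits a one-parameter linear model on a "source" task and then uses the learned weight as the starting point for a related "target" task with little data and few training steps. Everything here is an assumption made for the example: the helper names `fit` and `mse`, the synthetic data, and the learning rate and step counts are all hypothetical.

```python
def mse(xs, ys, w):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, w0, steps, lr=0.01):
    """Plain gradient descent on the MSE, starting from weight w0."""
    w = w0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


# Source task: plenty of training steps, true slope 2.0 (synthetic data).
source_x = [1.0, 2.0, 3.0, 4.0]
source_y = [2.0 * x for x in source_x]
w_source = fit(source_x, source_y, w0=0.0, steps=200)

# Related target task: true slope 2.2, few samples, only 5 training steps.
target_x = [1.0, 2.0]
target_y = [2.2 * x for x in target_x]

w_cold = fit(target_x, target_y, w0=0.0, steps=5)       # learn from scratch
w_warm = fit(target_x, target_y, w0=w_source, steps=5)  # start from source

print("cold-start target error:", mse(target_x, target_y, w_cold))
print("warm-start target error:", mse(target_x, target_y, w_warm))
```

With the same small training budget, the warm-started model ends closer to the target solution because the source weight is already near it. That is the efficiency gain described above, in its simplest form; it only holds because the two tasks are related.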