adaptation mechanism
Dynamic Adaptation of LoRA Fine-Tuning for Efficient and Task-Specific Optimization of Large Language Models
Liao, Xiaoxuan, Wang, Chihang, Zhou, Shicheng, Hu, Jiacheng, Zheng, Hongye, Gao, Jia
This paper presents a novel fine-tuning methodology for large language models: dynamic LoRA. Building on the standard Low-Rank Adaptation framework, the methodology adds dynamic adaptation mechanisms to improve efficiency and performance. The key contribution of dynamic LoRA lies in its adaptive weight allocation mechanism coupled with an input feature-based adaptive strategy. These enhancements allow for a more precise fine-tuning process that is better tailored to specific tasks. Traditional LoRA methods use static adapter settings and do not consider the differing importance of model layers. In contrast, dynamic LoRA introduces a mechanism that dynamically evaluates each layer's importance during fine-tuning. This evaluation enables the reallocation of adapter parameters to fit the demands of each individual task, which leads to better optimization results. A further gain in flexibility arises from considering the input feature distribution, which helps the model generalize better when faced with complicated and diverse datasets. The joint approach boosts not only performance on each single task but also the generalization ability of the model. The efficiency of dynamic LoRA was validated in experiments on benchmark datasets, such as GLUE, with surprising results. More specifically, the method achieved 88.1% accuracy with an F1-score of 87.3%. Notably, these improvements came with only a slight increase in computational cost: 0.1% more resources than standard LoRA. This balance between performance and efficiency positions dynamic LoRA as a practical, scalable solution for fine-tuning LLMs, especially in resource-constrained scenarios. Furthermore, its adaptability makes it a promising foundation for more advanced applications, including multimodal tasks.
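The abstract does not say how layer importance is scored or how adapter parameters are reallocated. As a rough illustration only, the sketch below assumes that each layer already has a non-negative importance score and that "reallocation" means splitting a fixed total LoRA rank budget across layers in proportion to those scores. The function name `allocate_ranks` and its parameters are hypothetical and are not taken from the paper.

```python
from typing import List


def allocate_ranks(importance: List[float], total_rank: int, min_rank: int = 1) -> List[int]:
    """Split a total LoRA rank budget across layers in proportion to importance.

    Hypothetical sketch: the scores are assumed to be given; the paper's own
    scoring and reallocation rule is not described in the abstract.
    """
    n = len(importance)
    if n == 0:
        return []
    if any(score < 0 for score in importance):
        raise ValueError("importance scores must be non-negative")
    if total_rank < n * min_rank:
        raise ValueError("total_rank is too small to give every layer min_rank")

    # Every layer first receives min_rank; the rest is shared proportionally.
    spare = total_rank - n * min_rank
    total_score = sum(importance)
    if total_score == 0:
        shares = [spare / n] * n
    else:
        shares = [spare * score / total_score for score in importance]

    ranks = [min_rank + int(share) for share in shares]  # int() floors here

    # Largest-remainder rounding: hand out the ranks lost to flooring.
    leftover = total_rank - sum(ranks)
    by_remainder = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in by_remainder[:leftover]:
        ranks[i] += 1
    return ranks


if __name__ == "__main__":
    # Three layers with importance 1 : 3 : 4 sharing a budget of 19 ranks.
    print(allocate_ranks([1.0, 3.0, 4.0], total_rank=19))
```

Under this assumption a static LoRA configuration corresponds to equal scores, and re-running the allocation as the scores change during fine-tuning would shift capacity toward the layers that currently matter most.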
Hierarchical Adaptation with Hypernetworks for Few-shot Molecular Property Prediction
Wu, Shiguang, Wang, Yaqing, Yao, Quanming
Molecular property prediction (MPP) is important in biomedical applications, but it naturally suffers from a lack of labels, which makes it a few-shot learning problem. State-of-the-art approaches are usually based on gradient-based meta-learning strategies, which ignore differences in model parameters and in molecules' learning difficulty. To address these problems, we propose a novel hierarchical adaptation mechanism for few-shot MPP (HiMPP). The model follows an encoder-predictor framework. First, to make the molecular representation property-adaptive, we selectively adapt the encoder's parameters by designing a hypernetwork to modulate node embeddings during message propagation. Next, we perform molecule-level adaptation by designing another hypernetwork, which assigns more propagation steps to harder molecules in the predictor. In this way, the molecular representation is transformed by HiMPP hierarchically from the property level to the molecule level. Extensive results show that HiMPP obtains state-of-the-art performance on few-shot MPP problems, and that our proposed hierarchical adaptation mechanism is rational and effective.
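To make the idea of a hypernetwork that modulates node embeddings concrete, here is a minimal sketch. It assumes one common form of modulation: a linear hypernetwork maps a property (task) context vector to a per-dimension scale and shift, which are then applied to every node embedding. The abstract does not specify HiMPP's architecture, so the functions `hypernetwork` and `modulate` and the weight matrices are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def hypernetwork(context: Vector, w_gamma: Matrix, w_beta: Matrix) -> Tuple[Vector, Vector]:
    """Map a property context vector to modulation parameters.

    Assumed form (not from the paper): gamma = 1 + W_gamma @ context and
    beta = W_beta @ context, so a zero context leaves embeddings unchanged.
    """
    gamma = [1.0 + dot(row, context) for row in w_gamma]
    beta = [dot(row, context) for row in w_beta]
    return gamma, beta


def modulate(node_embeddings: Matrix, gamma: Vector, beta: Vector) -> Matrix:
    """Apply the same per-dimension scale and shift to every node embedding."""
    return [[g * h + b for h, g, b in zip(node, gamma, beta)] for node in node_embeddings]


if __name__ == "__main__":
    context = [1.0, 0.0]                # hypothetical property embedding
    w_gamma = [[0.5, 0.0], [0.0, 0.0]]  # hypothetical hypernetwork weights
    w_beta = [[0.0, 0.0], [2.0, 0.0]]
    gamma, beta = hypernetwork(context, w_gamma, w_beta)
    print(modulate([[2.0, 4.0]], gamma, beta))
```

In a message-passing encoder, a modulation of this kind would be applied to the node embeddings at each propagation step, so the same molecule yields different representations for different target properties.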
Machine Learning Approaches For Motor Learning: A Short Review
Caramiaux, Baptiste, Françoise, Jules, Liu, Abby Wanyu, Sanchez, Téo, Bevilacqua, Frédéric
The use of machine learning to model motor learning mechanisms is still limited, although it could help in designing novel interactive systems for movement learning or rehabilitation. This approach requires accounting for the motor variability induced by motor learning mechanisms. This poses specific challenges concerning the fast adaptability of the computational models, from small variations to more drastic changes, including new movement classes. We present a short review of machine learning based movement models and their existing adaptation mechanisms. We discuss the current challenges for applying these models in motor learning support systems, delineating promising research directions at the intersection of machine learning and motor learning.
LUNAR: Cellular Automata for Drifting Data Streams
Lobo, Jesus L., Del Ser, Javier, Herrera, Francisco
With the advent of huge volumes of data produced in the form of fast streams, real-time machine learning has become a relevant challenge emerging in a plethora of real-world applications. Processing such fast streams often demands high memory and processing resources. In addition, they can be affected by non-stationary phenomena (concept drift), so learning methods have to detect changes in the distribution of streaming data and adapt to these evolving conditions. A lack of efficient and scalable solutions is particularly noted in real-time scenarios where computing resources are severely constrained, as occurs in networks of small, numerous, interconnected processing units (such as the so-called Smart Dust, Utility Fog, or Swarm Robotics paradigms). In this work we propose LUNAR, a streamified version of cellular automata devised to successfully meet the aforementioned requirements. It is able to act as a real incremental learner while adapting to drifting conditions. Extensive simulations with synthetic and real data provide evidence of its competitive behavior in terms of classification performance when compared to long-established and successful online learning methods.
The ravages of concept drift in stream learning applications and how to deal with it - KDnuggets
The Big Data paradigm has gained momentum over the last decade because of its promise to deliver valuable insights to many real-world applications. With the advent of this emerging paradigm comes not only an increase in the volume of available data, but also the notion of its arrival velocity: these real-world applications generate data in real time at rates faster than those that can be handled by traditional systems. This situation leads us to assume that we have to deal with potentially infinite and ever-growing datasets that may arrive continuously (stream learning), in batches of instances or instance by instance, in contrast to traditional systems where there is free access to all historical data. These traditional processing systems assume that data are at rest and simultaneously accessed. The models based on this traditional processing do not continuously integrate new information into already constructed models but, instead, regularly reconstruct new models from scratch.
Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network
Nishida, Kouhei, Aguirre, Hernan, Saito, Shota, Shirakawa, Shinichi, Akimoto, Youhei
Black box discrete optimization (BBDO) appears in a wide range of engineering tasks. Evolutionary or other BBDO approaches have been applied with the aim of automating the necessary tuning of system parameters, such as hyper-parameter tuning of machine learning based systems when they are installed for a specific task. However, automation is often jeopardized by the need for strategy parameter tuning of the BBDO algorithms themselves. An expert with domain knowledge must undertake time-consuming strategy parameter tuning. This paper proposes a parameterless BBDO algorithm based on information geometric optimization, a recent framework for black box optimization using stochastic natural gradient. Inspired by some theoretical implications, we develop an adaptation mechanism for the strategy parameters of the stochastic natural gradient method for discrete search domains. The proposed algorithm is evaluated on commonly used test problems. It is further extended to two examples of simultaneous optimization of the hyper-parameters and the connection weights of deep learning models, leading to faster optimization than existing approaches without any effort of parameter tuning.
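For readers unfamiliar with stochastic natural gradient search over a discrete domain, the sketch below shows the basic update for independent Bernoulli distributions over bit strings, where the natural-gradient step with truncation weights reduces to moving each probability toward the mean of the best samples. It deliberately uses a fixed `learning_rate` and `pop_size`, which are exactly the kind of strategy parameters the paper aims to adapt automatically; the paper's adaptation mechanism is not reproduced here. The function name and default values are hypothetical.

```python
import random
from typing import Callable, List, Tuple


def bernoulli_natural_gradient_search(
    fitness: Callable[[List[int]], float],
    n_bits: int,
    pop_size: int = 8,
    learning_rate: float = 0.1,  # fixed here; the paper adapts such parameters
    iterations: int = 200,
    seed: int = 0,
) -> Tuple[List[float], List[int], float]:
    """Maximize `fitness` over bit strings with a Bernoulli search distribution."""
    if n_bits < 2:
        raise ValueError("n_bits must be at least 2")
    rng = random.Random(seed)
    theta = [0.5] * n_bits                       # P(bit j == 1)
    lo, hi = 1.0 / n_bits, 1.0 - 1.0 / n_bits    # keep probabilities off 0 and 1
    mu = max(1, pop_size // 2)                   # number of selected samples
    best_x: List[int] = []
    best_f = float("-inf")

    for _ in range(iterations):
        samples = [[1 if rng.random() < p else 0 for p in theta] for _ in range(pop_size)]
        scored = sorted(((fitness(x), x) for x in samples), key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_f:
            best_f, best_x = scored[0]
        elite = [x for _, x in scored[:mu]]
        for j in range(n_bits):
            # Natural-gradient estimate with weights 1/mu on the elite samples:
            # sum_i w_i * (x_i[j] - theta[j]) = elite mean - theta[j].
            grad = sum(x[j] for x in elite) / mu - theta[j]
            theta[j] = min(hi, max(lo, theta[j] + learning_rate * grad))
    return theta, best_x, best_f


if __name__ == "__main__":
    # OneMax: the fitness of a bit string is its number of ones.
    theta, best_x, best_f = bernoulli_natural_gradient_search(sum, n_bits=10, seed=1)
    print(best_x, best_f)
```

With a learning rate that is too large the distribution can collapse prematurely, and with one that is too small progress is slow, which is the tuning burden a parameterless variant is meant to remove.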
Evolving Personalized Content for Super Mario Bros Using Grammatical Evolution
Shaker, Noor (IT University of Copenhagen) | Yannakakis, Georgios N. (IT University of Copenhagen) | Togelius, Julian (IT University of Copenhagen) | Nicolau, Miguel (University College Dublin) | O'Neill, Michael (University College Dublin)
Adapting game content to a particular player's needs and expertise constitutes an important aspect in game design. Most research in this direction has focused on adapting game difficulty to keep the player engaged in the game. Dynamic difficulty adjustment, however, focuses on one aspect of the gameplay experience by adjusting the content to increase or decrease perceived challenge. In this paper, we introduce a method for automatic level generation for the platform game Super Mario Bros using grammatical evolution. The grammatical evolution-based level generator is used to generate player-adapted content by employing an adaptation mechanism as a fitness function in grammatical evolution to optimize the player experience of three emotional states: engagement, frustration and challenge. The fitness functions used are models of player experience constructed in our previous work from crowd-sourced gameplay data collected from over 1500 game sessions.