 iterative learning


Bayesian Optimization for Iterative Learning

Neural Information Processing Systems

The performance of deep (reinforcement) learning systems crucially depends on the choice of hyperparameters. Their tuning is notoriously expensive, typically requiring an iterative training process to run for numerous steps to convergence. Traditional tuning algorithms only consider the final performance of hyperparameters acquired after many expensive iterations and ignore intermediate information from earlier training steps. In this paper, we present a Bayesian optimization (BO) approach which exploits the iterative structure of learning algorithms for efficient hyperparameter tuning. We propose to learn an evaluation function compressing learning progress at any stage of the training process into a single numeric score according to both training success and stability. Our BO framework then trades off the benefit of assessing a hyperparameter setting over additional training steps against the computation cost of those steps. We further increase model efficiency by selectively including scores from different training steps for any evaluated hyperparameter set. We demonstrate the efficiency of our algorithm by tuning hyperparameters for the training of deep reinforcement learning agents and convolutional neural networks. Our algorithm outperforms all existing baselines in identifying optimal hyperparameters in minimal time.


Learning to Throw-Flip

arXiv.org Artificial Intelligence

Dynamic manipulation, such as robot tossing or throwing objects, has recently gained attention as a novel paradigm to speed up logistic operations. However, the focus has predominantly been on the object's landing location, irrespective of its final orientation. In this work, we present a method enabling a robot to accurately "throw-flip" objects to a desired landing pose (position and orientation). Conventionally, objects thrown by revolute robots suffer from parasitic rotation, resulting in highly restricted and uncontrollable landing poses. Our approach is based on two key design choices. First, leveraging the impulse-momentum principle, we design a family of throwing motions that effectively decouple the parasitic rotation, significantly expanding the feasible set of landing poses. Second, we combine a physics-based model of free flight with regression-based learning methods to account for unmodeled effects. Real robot experiments demonstrate that our framework can learn to throw-flip objects to a pose target within a ($\pm$5 cm, $\pm$45 degrees) threshold in dozens of trials. Thanks to data assimilation, incorporating projectile dynamics reduces sample complexity by an average of 40% when throw-flipping to unseen poses compared to end-to-end learning methods. Additionally, we show that past knowledge of in-hand object spinning can be effectively reused, accelerating learning by 70% when throwing a new object with a Center of Mass (CoM) shift. A video summarizing the proposed method and the hardware experiments is available at https://youtu.be/txYc9b1oflU.
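As a rough illustration of what a physics-based free-flight model can look like, the sketch below predicts a planar landing pose from a release state. It is not the authors' model: it assumes a point mass with no drag, a flat landing surface at height zero, and a constant spin rate during flight, and the function and parameter names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def free_flight_landing(x0, z0, vx, vz, theta0, omega, g=G):
    """Predict the planar landing pose of a thrown object.

    Assumptions (not taken from the abstract): no air drag, landing
    surface at z = 0, release height z0 >= 0, and constant angular
    velocity omega (rad/s) throughout the flight.
    Returns (landing x, landing orientation, flight time).
    """
    # Flight time is the positive root of z0 + vz*t - 0.5*g*t^2 = 0.
    t = (vz + math.sqrt(vz ** 2 + 2.0 * g * z0)) / g
    x_land = x0 + vx * t            # horizontal motion is uniform
    theta_land = theta0 + omega * t  # orientation advances at a fixed rate
    return x_land, theta_land, t


if __name__ == "__main__":
    x, theta, t = free_flight_landing(0.0, 0.0, 1.0, 9.81, 0.0, math.pi)
    print(x, theta, t)
```

In a hybrid scheme of the kind the abstract describes, a learned regressor would then correct the residual between such a prediction and the observed landing pose.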



Review for NeurIPS paper: Bayesian Optimization for Iterative Learning

Neural Information Processing Systems

The paper proposes an idea for tuning hyper-parameters in deep (reinforcement) learning using Bayesian optimization. The key idea is to exploit the iterative structure of the problem and use a variable-augmentation trick to learn a score function that compresses the learning progress at any stage. The strengths of the paper are:
- well written
- good relation to prior work
- good experimental study
However, the paper also has weaknesses, which are mostly related to theoretical aspects and chosen heuristics (see some details below). If we are only interested in the predictive mean for the cost-GP, why do we use a GP in the first place, and not a parametric function, which scales much better? That's the one part that caused us the most toothache.


TiniScript: A Simplified Language for Educational Robotics

arXiv.org Artificial Intelligence

The constructionism theory, formulated by Seymour Papert, has been a transformative approach in education, particularly within STEM (Science, Technology, Engineering, and Mathematics) fields. This theory emphasizes learning through creation, where students engage actively by building knowledge structures through hands-on tasks and meaningful projects. One of the early milestones influenced by constructionism was the development of the Logo programming language. Logo's simple, block-based structure enabled students to grasp fundamental programming concepts visually by manipulating blocks, establishing a foundation for educational tools that remain essential in early computer science education. Over time, educational robotics kits, like those from LEGO Education (RCX, NXT, and EV3), have set standards for integrating physical construction with software programming. These kits demonstrate the potential of robotics in educational settings by engaging students in both mechanical assembly and logical problem-solving, thereby fostering an understanding of hardware and software as interconnected aspects of robotics. Building on this foundation, programming environments in educational robotics have largely adopted block-based interfaces. These environments simplify coding for beginners, allowing students to create programs by connecting blocks representing specific actions. Once completed, the program is uploaded to a microcontroller, enabling the robot to execute the instructions.


Iterative Learning for Reliable Crowdsourcing Systems

Neural Information Processing Systems

Crowdsourcing systems, in which tasks are electronically distributed to numerous "information piece-workers", have emerged as an effective paradigm for human-powered solving of large scale problems in domains such as image classification, data entry, optical character recognition, recommendation, and proofreading. Because these low-paid workers can be unreliable, nearly all crowdsourcers must devise schemes to increase confidence in their answers, typically by assigning each task multiple times and combining the answers in some way such as majority voting. In this paper, we consider a general model of such crowdsourcing tasks, and pose the problem of minimizing the total price (i.e., number of task assignments) that must be paid to achieve a target overall reliability. We give new algorithms for deciding which tasks to assign to which workers and for inferring correct answers from the workers' answers. We show that our algorithm significantly outperforms majority voting and, in fact, is asymptotically optimal through comparison to an oracle that knows the reliability of every worker.
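For reference, the majority-voting baseline mentioned above can be sketched in a few lines. This shows only the baseline, not the paper's inference algorithm; the data layout (a mapping from task id to the list of labels collected for that task) and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(answers):
    """Aggregate redundant worker answers by simple majority.

    answers: dict mapping a task id to the list of labels that
    different workers gave for that task (hypothetical layout).
    Ties are broken by taking the smallest label, purely so the
    result is deterministic.
    """
    result = {}
    for task, labels in answers.items():
        counts = Counter(labels)
        top = max(counts.values())
        result[task] = min(label for label, c in counts.items() if c == top)
    return result


if __name__ == "__main__":
    collected = {"t1": [1, 1, -1], "t2": [-1, -1, 1]}
    print(majority_vote(collected))
```

Majority voting weights every worker equally, which is why a method that estimates worker reliability can reach the same accuracy with fewer task assignments.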


Learning Half-Spaces and other Concept Classes in the Limit with Iterative Learners

arXiv.org Machine Learning

In order to model an efficient learning paradigm, iterative learning algorithms access data one by one, updating the current hypothesis without recourse to past data. Past research on iterative learning has analyzed, for example, many important additional requirements and their impact on iterative learners. In this paper, our results are twofold. First, we analyze the relative learning power of various settings of iterative learning, including learning from text and from informant, as well as various further restrictions. For example, we show that strongly non-U-shaped learning is restrictive for iterative learning from informant. Second, we investigate the learnability of the concept class of half-spaces and provide a constructive iterative algorithm to learn the set of half-spaces from informant.
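To make the notion of an iterative learner concrete, the sketch below uses the classic perceptron update as a stand-in: each step sees only the current hypothesis and one labeled example, and never revisits earlier data. This is not the algorithm from the paper, and the function names are hypothetical.

```python
def perceptron_step(w, b, x, y):
    """One memoryless update of a half-space hypothesis (w, b).

    x is a tuple of features and y is +1 or -1. The hypothesis only
    changes when the current example is misclassified (or lies on
    the boundary); no past examples are stored or consulted.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    if y * score <= 0:
        w = tuple(wi + y * xi for wi, xi in zip(w, x))
        b = b + y
    return w, b


def learn(stream, dim):
    """Feed labeled examples one by one, keeping only the hypothesis."""
    w, b = (0.0,) * dim, 0.0
    for x, y in stream:
        w, b = perceptron_step(w, b, x, y)
    return w, b


if __name__ == "__main__":
    stream = [((2.0,), 1), ((-2.0,), -1)] * 3
    print(learn(stream, 1))
```

The only state carried between examples is the pair (w, b), which is the defining restriction of the iterative setting described above.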


Visual Pivoting for (Unsupervised) Entity Alignment

arXiv.org Artificial Intelligence

This work studies the use of visual semantic representations to align entities in heterogeneous knowledge graphs (KGs). Images are natural components of many existing KGs. By combining visual knowledge with other auxiliary information, we show that the proposed new approach, EVA, creates a holistic entity representation that provides strong signals for cross-graph entity alignment. Besides, previous entity alignment methods require human labelled seed alignment, restricting availability. EVA provides a completely unsupervised solution by leveraging the visual similarity of entities to create an initial seed dictionary (visual pivots). Experiments on benchmark data sets DBP15k and DWY15k show that EVA offers state-of-the-art performance on both monolingual and cross-lingual entity alignment tasks. Furthermore, we discover that images are particularly useful to align long-tail KG entities, which inherently lack the structural contexts necessary for capturing the correspondences.

