Why a robot that can 'solve' Rubik's Cube one-handed has the AI community at war

#artificialintelligence

OpenAI, a non-profit co-founded by Elon Musk, recently unveiled its newest trick: a robot hand that can 'solve' Rubik's Cube. Whether this is a feat of science or mere prestidigitation is a matter of some debate in the AI community right now. In case you missed it, OpenAI posted an article on its blog last week titled "Solving Rubik's Cube With a Robot Hand." Based on this title, you'd be forgiven if you thought the research discussed in said article was about solving Rubik's Cube with a robot hand. Don't get me wrong, OpenAI created a software and machine learning pipeline by which a robot hand can physically manipulate a Rubik's Cube from an 'unsolved' state to a solved one. But the truly impressive bit here is that a robot hand can hold an object and move it around (to accomplish a goal) without dropping it.


This robot can now solve a Rubik's cube with one hand

#artificialintelligence

Once again, a robot can do something I cannot do. Researchers at the artificial intelligence lab OpenAI just revealed that their humanoid robotic hand can solve a Rubik's cube. The researchers used a pair of neural networks to make it happen. The team has been working on this project, named Dactyl, since the middle of 2017, and they felt that solving a Rubik's cube would demonstrate that the hand had adequate dexterity. It can now solve the cube about 60 percent of the time.


OpenAI's AI-powered robot learned how to solve a Rubik's cube one-handed

#artificialintelligence

Artificial intelligence research organization OpenAI has achieved a new milestone in its quest to build general purpose, self-learning robots. The group's robotics division says Dactyl, its humanoid robotic hand first developed last year, has learned to solve a Rubik's cube one-handed. OpenAI sees the feat as a leap forward both for the dexterity of robotic appendages and its own AI software, which allows Dactyl to learn new tasks using virtual simulations before it is presented with a real, physical challenge to overcome. In a demonstration video showcasing Dactyl's new talent, we can see the robotic hand fumble its way toward a complete cube solve with clumsy yet accurate maneuvers. It takes many minutes, but Dactyl is eventually able to solve the puzzle.


Minimax Regret of Switching-Constrained Online Convex Optimization: No Phase Transition

arXiv.org Machine Learning

We study the problem of switching-constrained online convex optimization (OCO), where the player has a limited number of opportunities to change her action. While the discrete analog of this online learning task has been studied extensively, previous work in the continuous setting has neither established the minimax rate nor algorithmically achieved it. We here show that $ T $-round switching-constrained OCO with fewer than $ K $ switches has a minimax regret of $ \Theta(\frac{T}{\sqrt{K}}) $. In particular, it is at least $ \frac{T}{\sqrt{2K}} $ for one dimension and at least $ \frac{T}{\sqrt{K}} $ for higher dimensions. The lower bound in higher dimensions is attained by an orthogonal subspace argument. The minimax analysis in one dimension is more involved. To establish the one-dimensional result, we introduce the fugal game relaxation, whose minimax regret lower bounds that of switching-constrained OCO. We show that the minimax regret of the fugal game is at least $ \frac{T}{\sqrt{2K}} $ and thereby establish the minimax lower bound in one dimension. We next show that a mini-batching algorithm provides an $ O(\frac{T}{\sqrt{K}}) $ upper bound, and therefore we conclude that the minimax regret of switching-constrained OCO is $ \Theta(\frac{T}{\sqrt{K}}) $ for any $K$. This is in sharp contrast to its discrete counterpart, the switching-constrained prediction-from-experts problem, which exhibits a phase transition in minimax regret between the low-switching and high-switching regimes. In the case of bandit feedback, we first determine a novel linear (in $T$) minimax regret for bandit linear optimization against the strongly adaptive adversary of OCO, implying that a slightly weaker adversary is appropriate. We also establish the minimax regret of switching-constrained bandit convex optimization in dimension $n>2$ to be $\tilde{\Theta}(\frac{T}{\sqrt{K}})$.
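The mini-batching idea behind the upper bound can be sketched in a few lines. This is a minimal illustration, assuming one-dimensional linear losses on the interval [-radius, radius] and plain projected gradient descent over blocks; the function names, the step size, and the block partition are assumptions made for this sketch and are not taken from the paper.

```python
import math


def minibatch_ogd(gradients, num_blocks, radius=1.0):
    """Play len(gradients) rounds, changing the action only between blocks.

    gradients: per-round gradients g_t of linear losses f_t(x) = g_t * x
    num_blocks: number of blocks; the action changes at most num_blocks - 1 times
    """
    total_rounds = len(gradients)
    block_len = math.ceil(total_rounds / num_blocks)
    # Assumed step size: each block behaves like one round whose gradient
    # can be as large as block_len, and there are about num_blocks such rounds.
    eta = radius / (block_len * math.sqrt(num_blocks))

    x = 0.0
    block_grad = 0.0
    actions = []
    for t, g in enumerate(gradients):
        actions.append(x)  # the same action is replayed inside a block
        block_grad += g
        if (t + 1) % block_len == 0:
            # End of block: one gradient step on the summed block loss,
            # followed by projection back onto [-radius, radius].
            x = max(-radius, min(radius, x - eta * block_grad))
            block_grad = 0.0
    return actions


def count_switches(actions):
    """Number of rounds in which the action differs from the previous round."""
    return sum(1 for a, b in zip(actions, actions[1:]) if a != b)
```

With `num_blocks = K`, the player switches fewer than `K` times because the action can only change at a block boundary.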


Efficient Decoupled Neural Architecture Search by Structure and Operation Sampling

arXiv.org Machine Learning

We propose a novel neural architecture search algorithm via reinforcement learning by decoupling structure and operation search processes. Our approach samples candidate models from the multinomial distribution on the policy vectors defined on the two search spaces independently. The proposed technique improves the efficiency of the architecture search process significantly compared to the conventional methods based on reinforcement learning with RNN controllers, while achieving competitive accuracy and model size in target tasks. Our policy vectors are easily interpretable throughout the training procedure, which allows us to analyze the search progress and the discovered architectures; in contrast, the black-box characteristics of RNN controllers hamper understanding of training progress in terms of policy parameter updates. Our experiments demonstrate outstanding performance compared to the state-of-the-art methods with a fraction of the search cost.
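Sampling from two independent sets of policy vectors might look like the sketch below. It is only an illustration: the operation list, the encoding of the structure (each node picks one earlier node as input), and the REINFORCE-style update are assumptions made here, not details given in the abstract.

```python
import math
import random

OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]  # hypothetical operation set


def softmax(logits):
    """Turn a policy vector of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Draw one index from a categorical (single-trial multinomial) distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def init_policies(num_nodes):
    """Node i may take its input from any of the i + 1 earlier positions."""
    structure_logits = [[0.0] * (i + 1) for i in range(num_nodes)]
    operation_logits = [[0.0] * len(OPS) for _ in range(num_nodes)]
    return structure_logits, operation_logits


def sample_architecture(structure_logits, operation_logits, rng):
    """Sample connections and operations independently of each other."""
    structure = [sample_index(softmax(row), rng) for row in structure_logits]
    operations = [sample_index(softmax(row), rng) for row in operation_logits]
    return structure, operations


def reinforce_update(logits, choice, advantage, lr=0.1):
    """One policy-gradient step on a single policy vector (updated in place)."""
    probs = softmax(logits)
    for i in range(len(logits)):
        indicator = 1.0 if i == choice else 0.0
        logits[i] += lr * advantage * (indicator - probs[i])
```

Because each policy vector is just a short list of logits, `softmax(row)` can be printed at any point during training to see which connection or operation the search currently prefers.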


Auto-Model: Utilizing Research Papers and HPO Techniques to Deal with the CASH problem

arXiv.org Artificial Intelligence

Chunnan Wang, Hongzhi Wang, Tianyu Mu, Jianzhong Li, Hong Gao
Department of Computer Science, Harbin Institute of Technology, Harbin, China
{WangChunnan, wangzh, mutianyu, lijzh, honggao}@hit.edu.cn
Abstract: In many fields, a mass of algorithms with completely different hyperparameters have been developed to address the same type of problems. Choosing the algorithm and hyperparameter setting correctly can improve overall performance greatly, but users often fail to do so due to a lack of knowledge. How to help users effectively and quickly select a suitable algorithm and hyperparameter setting for a given task instance is an important research topic, known as the CASH problem. In this paper, we design the Auto-Model approach, which makes full use of known information in the related research papers and introduces hyperparameter optimization techniques, to solve the CASH problem effectively. Auto-Model tremendously reduces the cost of algorithm implementations and the size of the hyperparameter configuration space, and is thus capable of dealing with the CASH problem efficiently and easily. To demonstrate the benefit of Auto-Model, we compare it with the classical Auto-Weka approach. The experimental results show that our proposed approach can provide superior results and achieves better performance in a short time.
Index Terms: Algorithm selection, Hyperparameter optimization, Combined algorithm selection and hyperparameter optimization problem, Auto-Weka, Classification algorithms
I. INTRODUCTION
In many fields, such as machine learning, data mining, artificial intelligence and constraint satisfaction, a variety of algorithms and heuristics have been developed to address the same type of problem [1], [2].
Each of these algorithms has its own advantages and disadvantages, and often they are complementary in the sense that one algorithm works well when others fail and vice versa [2]. If we are capable of selecting the algorithm and hyperparameter setting best suited to the task instance, any particular task instance will be well solved, and our ability to deal with the problem will be improved considerably [3]. However, it is not trivial to achieve this goal. There are many powerful and different algorithms for dealing with a given problem, and these algorithms have completely different hyperparameters, which have a great effect on their performance. Even domain experts cannot easily and correctly select the appropriate algorithm with its corresponding optimal hyperparameters from such a huge and complex choice space.
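To make the shape of the CASH problem concrete, the sketch below runs plain random search over a joint space of algorithms and their hyperparameters. It is not the Auto-Model method; the search space, the algorithm names, and the toy scoring function are all hypothetical stand-ins chosen so the example is self-contained.

```python
import random

# Hypothetical joint search space: every algorithm has its own hyperparameters.
SEARCH_SPACE = {
    "knn": {"k": (1, 30)},
    "tree": {"max_depth": (1, 20), "min_samples": (1, 10)},
}


def sample_configuration(rng):
    """Pick an algorithm first, then integer values for its own hyperparameters."""
    algorithm = rng.choice(sorted(SEARCH_SPACE))
    ranges = SEARCH_SPACE[algorithm]
    params = {name: rng.randint(lo, hi) for name, (lo, hi) in ranges.items()}
    return algorithm, params


def random_search_cash(evaluate, budget, seed=0):
    """Return the best (score, algorithm, params) found within the budget."""
    rng = random.Random(seed)
    best = None
    for _ in range(budget):
        algorithm, params = sample_configuration(rng)
        score = evaluate(algorithm, params)
        if best is None or score > best[0]:
            best = (score, algorithm, params)
    return best


def toy_evaluate(algorithm, params):
    """Made-up score standing in for cross-validated accuracy on a task."""
    if algorithm == "knn":
        return 0.9 - 0.01 * abs(params["k"] - 7)
    return 0.85 - 0.01 * abs(params["max_depth"] - 5)
```

In practice `evaluate` would train and validate a real model, which is why reducing the number of candidate algorithms and the size of the configuration space matters so much for cost.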


Minimax Rate Optimal Adaptive Nearest Neighbor Classification and Regression

arXiv.org Machine Learning

For both classification and regression problems, existing works have shown that, if the distribution of the feature vector has bounded support and the probability density function is bounded away from zero in its support, the convergence rate of the standard kNN method, in which k is the same for all test samples, is minimax optimal. In contrast, if the distribution has unbounded support, we show that there is a gap between the convergence rate achieved by the standard kNN method and the minimax bound. To close this gap, we propose an adaptive kNN method, in which a different k is selected for different samples. Our selection rule does not require precise knowledge of the underlying distribution of features. The newly proposed method significantly outperforms the standard one. We characterize the convergence rate of the proposed adaptive method, and show that it matches the minimax lower bound.
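The general idea of choosing k per query can be illustrated as follows. The abstract does not state the selection rule, so the rule used here (k grows with the number of training points within a fixed radius of the query) is a hypothetical stand-in, as are the function name and the default parameters.

```python
import math


def adaptive_knn_regress(train_x, train_y, query, radius=1.0, exponent=0.5):
    """One-dimensional kNN regression with a query-dependent k.

    Returns the prediction and the k that was used for this query.
    """
    # Local sample count as a rough proxy for the density around the query.
    n_local = sum(1 for x in train_x if abs(x - query) <= radius)
    # Hypothetical rule: more neighbors in dense regions, at least one everywhere.
    k = max(1, math.ceil(n_local ** exponent))
    # Take the k training points closest to the query and average their labels.
    by_distance = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))
    nearest = by_distance[:k]
    prediction = sum(y for _, y in nearest) / len(nearest)
    return prediction, k
```

A query in the tail of the distribution sees few nearby points and therefore averages over a small k, while a query in a dense region averages over more neighbors.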


New Robot Can Solve a Rubik's Cube with Just One Hand - Lwin Htut Kyaw, Digital Creator, Mandalay, Myanmar

#artificialintelligence

OpenAI has come up with a new robot capable of solving a Rubik's Cube with a single hand. The AI-based company trained neural networks in simulation using reinforcement learning to make this achievement possible. The company has been working on this project since May 2017 and has now achieved its goal, marking a milestone in its progress in the field of AI. The time taken by the robotic hand varies depending on how the cube is shuffled, but on average it takes about four minutes to solve the puzzle. However, it is worth noting that this is not the first robot to solve a Rubik's cube.


This Week In AI: Algolia Raises $110M, OpenAI Debuts Rubik's Cube Solving Bot, Kleiner Perkins Backs Cell Therapy Startup - CB Insights Research

#artificialintelligence

Medical data analytics startup Healx raised $56M from Atomico and others. Standard Cognition patented an inventory management system. Here's what went down in artificial intelligence this week.


Research Guide: Data Augmentation for Deep Learning

#artificialintelligence

AutoAugment is an augmentation strategy that employs a search algorithm to find the augmentation policy that yields the best results for the model. Each policy has several sub-policies. One sub-policy is randomly chosen for each image. Each sub-policy consists of image processing functions and the probabilities with which those functions are applied. The image processing operations could be translation, shearing or rotation.
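Applying such a policy to an image can be sketched as below. This covers only the application step, not the search that finds the policy. The operations are toy stand-ins that act on a flat list of pixel values, and the policy contents and probabilities are invented for illustration.

```python
import random


# Toy stand-ins for image operations; an "image" here is a flat list of pixels.
def shift(image):
    """Rotate the pixel list by one position (a crude stand-in for translation)."""
    return image[-1:] + image[:-1]


def flip(image):
    """Reverse the pixel order."""
    return image[::-1]


def invert(image):
    """Invert 8-bit pixel intensities."""
    return [255 - p for p in image]


# Hypothetical policy: a list of sub-policies, each a list of
# (operation, probability) pairs.
POLICY = [
    [(shift, 0.8), (invert, 0.3)],
    [(flip, 0.5), (shift, 0.6)],
    [(invert, 1.0)],
]


def apply_policy(image, policy, rng):
    """Pick one sub-policy at random and apply each of its operations by chance."""
    sub_policy = rng.choice(policy)
    for operation, probability in sub_policy:
        if rng.random() < probability:
            image = operation(image)
    return image
```

Calling `apply_policy(image, POLICY, random.Random())` once per training image gives each image its own randomly chosen sub-policy.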