 Learning Management


Online Learning with an Almost Perfect Expert

arXiv.org Machine Learning

We study the online learning problem where a forecaster is trying to predict each day the next bit in a sequence, such as whether the stock market will go up or down. Every morning, for T days, he solicits the opinions of a number n of experts, who each make up or down predictions. Based on their predictions, the forecaster makes a choice between up and down, then buys or sells accordingly. The goal of the forecaster is to make as few mistakes as possible given that the bit sequence may be generated adversarially. This is a classical learning problem that has been studied in a large body of literature starting with the development of Blackwell approachability [Bla56] and Hannan consistency [Han57], and continued in learning theory under the paradigm of combining expert advice [LW94, Vov90]. One of the best-known approaches is the Weighted-Majority algorithm [LW94], which keeps track of weights for all the experts and changes them in every round depending on the quality of their predictions. The average number of mistakes made by the forecaster when using such an algorithm can be bounded by the number of mistakes made by the best expert plus log n/T.
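
The weight-update idea described above can be illustrated with a minimal sketch of the deterministic Weighted-Majority rule. The function name `weighted_majority`, the 0/1 encoding of down/up, and the penalty factor `beta = 0.5` are assumptions made for this illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def weighted_majority(
    expert_preds: Sequence[Sequence[int]],
    outcomes: Sequence[int],
    beta: float = 0.5,
) -> Tuple[int, List[float]]:
    """Deterministic Weighted-Majority sketch (hypothetical helper).

    expert_preds[t][i] is expert i's prediction (0 = down, 1 = up) on day t.
    outcomes[t] is the true bit on day t.
    beta in (0, 1) is the factor applied to the weight of a wrong expert.
    Returns the forecaster's mistake count and the final weights.
    """
    n = len(expert_preds[0])
    weights = [1.0] * n  # every expert starts with the same weight
    mistakes = 0
    for preds, y in zip(expert_preds, outcomes):
        # Total weight behind each of the two possible predictions.
        up = sum(w for w, p in zip(weights, preds) if p == 1)
        down = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if up >= down else 0  # ties are broken towards "up"
        if guess != y:
            mistakes += 1
        # Shrink the weight of every expert that was wrong today.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return mistakes, weights


if __name__ == "__main__":
    outcomes = [1, 0, 1, 1, 0, 0]
    # Expert 0 is always right, expert 1 always wrong, expert 2 always says "up".
    expert_preds = [[y, 1 - y, 1] for y in outcomes]
    print(weighted_majority(expert_preds, outcomes))
```

With `beta = 0.5`, the standard analysis of this deterministic variant bounds the forecaster's mistakes by roughly 2.41 (m + log2 n), where m is the number of mistakes of the best expert; the randomized variants behind the average-regret statement above are not shown here.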


Preference-based Online Learning with Dueling Bandits: A Survey

arXiv.org Machine Learning

In machine learning, the notion of multi-armed bandits refers to a class of online learning problems, in which an agent is supposed to simultaneously explore and exploit a given set of choice alternatives in the course of a sequential decision process. In the standard setting, the agent learns from stochastic feedback in the form of real-valued rewards. In many applications, however, numerical reward signals are not readily available -- instead, only weaker information is provided, in particular relative preferences in the form of qualitative comparisons between pairs of alternatives. This observation has motivated the study of variants of the multi-armed bandit problem, in which more general representations are used both for the type of feedback to learn from and the target of prediction. The aim of this paper is to provide a survey of the state of the art in this field, referred to as preference-based multi-armed bandits or dueling bandits. To this end, we provide an overview of problems that have been considered in the literature as well as methods for tackling them. Our taxonomy is mainly based on the assumptions made by these methods about the data-generating process and, related to this, the properties of the preference-based feedback.
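
To make the feedback model concrete, the following is a minimal sketch of learning from pairwise comparisons only. It is a naive baseline that samples every pair uniformly and picks the empirical Copeland winner (the arm that beats the most opponents); it is not one of the methods covered by the survey. The preference matrix `pref`, the helper names `duel` and `copeland_winner`, and the sampling budget are all assumptions made for this example.

```python
import itertools
import random
from typing import List, Sequence, Tuple


def duel(pref: Sequence[Sequence[float]], i: int, j: int, rng: random.Random) -> bool:
    """One noisy comparison: True if arm i is preferred to arm j.

    pref[i][j] is the (assumed, normally unknown) probability that i beats j.
    """
    return rng.random() < pref[i][j]


def copeland_winner(
    pref: Sequence[Sequence[float]], duels_per_pair: int, rng: random.Random
) -> Tuple[int, List[int]]:
    """Compare every pair equally often and return the empirical Copeland winner."""
    n = len(pref)
    wins = [[0] * n for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        for _ in range(duels_per_pair):
            if duel(pref, i, j, rng):
                wins[i][j] += 1
            else:
                wins[j][i] += 1
    # Copeland score of arm i: number of opponents it beat in most duels.
    scores = [
        sum(1 for j in range(n) if j != i and wins[i][j] > wins[j][i])
        for i in range(n)
    ]
    best = max(range(n), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Hypothetical preference matrix: arm 0 tends to beat arms 1 and 2.
    pref = [
        [0.5, 0.9, 0.9],
        [0.1, 0.5, 0.8],
        [0.1, 0.2, 0.5],
    ]
    print(copeland_winner(pref, duels_per_pair=201, rng=random.Random(0)))
```

Note that the learner never sees a numerical reward, only the binary outcome of each duel, which is the weaker form of feedback the paragraph above describes.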


How should one start learning about AI and machine learning?

#artificialintelligence

AI is definitely the future. Machine learning, the current application of artificial intelligence, is based on the idea of giving computers access to data and letting them learn for themselves. There are obviously various ways to start. There are two broad perspectives on getting into AI and machine learning: first, the API, and second, the algorithms. These two perspectives are rarely covered when you start an online course or read a book.



Perspective: The future of education is virtual

#artificialintelligence

Massive open online courses (MOOCs) were supposed to bring a revolution in education. But they haven't lived up to expectations. We have been putting educators in front of cameras and shooting video -- just as the first TV shows did with radio stars, microphone in hand. This is not to say the millions of hours of online content are not valuable; the limits lie in the ability of the underlying technology to customize the material to the individual and to coach. That is about to change, though, through the use of virtual reality, artificial intelligence and sensors.


4 ways artificial intelligence will shape the future of learning technology

#artificialintelligence

With the rapid pace of innovation continually disrupting business models, and in many cases entire industries, how will online learning keep up to provide the relevant courseware for today's and tomorrow's workforce? This will be essential for economic growth and to support a thriving, college-educated workforce that's equipped with the very latest knowledge, ideas and technology. In the future, I believe that institutions at the forefront of online education will be recognized via several capabilities which will have digitally transformed today's EdTech market. They will include a powerful combination of omni-channel learning pathways, cognitive courseware, virtual counselors and AI-enabled course development and grading. These innovations, underpinned by artificial intelligence (AI), will help to provide students the ultimate choice in their courseware – including up-to-the-minute courses on high-interest/high-growth subject matter – as well as highly-innovative digital services that support them every step of the way to help maximize their success and personal objectives.


The Sooner You Get Your First AI Job, the Better for Your Career

#artificialintelligence

Artificial intelligence is already reshaping society as we know it in both business and consumer realms. Early use cases with Alexa, autonomous vehicles and AI-driven supply chains provide just a glimpse of the disruption that AI is poised to deliver in the near future and for years to come. Yet despite all the AI hype and initial successes, it remains in its infancy. That makes now the ideal time for young people to build the knowledge, skill sets and connections they need to capitalize on the fast-growing market for AI jobs and build a strong AI career. One reason is simply practical. Gartner predicts that AI may eliminate 1.8 million jobs by 2020, yet is on track to create 2.3 million new positions.


Delayed Bandit Online Learning with Unknown Delays

arXiv.org Machine Learning

This paper studies bandit learning problems with delayed feedback, which include multi-armed bandit (MAB) and bandit convex optimization (BCO). Given only function value information (a.k.a. bandit feedback), algorithms for both MAB and BCO typically rely on (possibly randomized) gradient estimators based on function values, and then feed them into well-studied gradient-based algorithms. Unlike existing work, however, the setting considered here is more challenging: the bandit feedback is not only delayed, but the presence of its delay is also not revealed to the learner. Existing algorithms for delayed MAB and BCO become intractable in this setting. To tackle such challenging settings, DEXP3 and DBGD have been developed for MAB and BCO, respectively. Leveraging a unified analysis framework, it is established that both DEXP3 and DBGD guarantee an ${\cal O}\big( \sqrt{T+D} \big)$ regret over $T$ time slots with $D$ being the overall delay accumulated over slots. The new regret bounds match those in full information settings.
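
As an illustration of how an estimator built from bandit feedback feeds a full-information style update, here is a minimal sketch of the standard, non-delayed EXP3 scheme with importance-weighted loss estimates. This is not the paper's DEXP3: the handling of unknown delays is not reproduced, and the function name `exp3`, the loss callback `loss_fn`, and the step size used in the example are assumptions made for this sketch.

```python
import math
import random
from typing import Callable, List, Tuple


def exp3(
    loss_fn: Callable[[int, int], float],
    n_arms: int,
    horizon: int,
    eta: float,
    rng: random.Random,
) -> Tuple[float, List[int]]:
    """Standard EXP3 sketch with losses in [0, 1] (hypothetical helper).

    loss_fn(t, arm) returns the loss of the pulled arm only (bandit feedback).
    Returns the total incurred loss and the number of pulls per arm.
    """
    est_losses = [0.0] * n_arms  # cumulative importance-weighted loss estimates
    pulls = [0] * n_arms
    total_loss = 0.0
    for t in range(horizon):
        # Exponential weights; subtracting the minimum keeps exp() well scaled.
        base = min(est_losses)
        scores = [math.exp(-eta * (est - base)) for est in est_losses]
        z = sum(scores)
        probs = [s / z for s in scores]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        loss = loss_fn(t, arm)
        total_loss += loss
        pulls[arm] += 1
        # Dividing by the sampling probability makes the estimate unbiased,
        # since arms that were not pulled implicitly receive an estimate of 0.
        est_losses[arm] += loss / probs[arm]
    return total_loss, pulls


if __name__ == "__main__":
    n_arms, horizon = 3, 2000
    eta = math.sqrt(math.log(n_arms) / (n_arms * horizon))
    # Hypothetical environment: arm 0 has loss 0.1, the other arms 0.9.
    total, pulls = exp3(
        lambda t, arm: 0.1 if arm == 0 else 0.9,
        n_arms, horizon, eta, random.Random(0),
    )
    print(total, pulls)
```

In a delayed setting the loss for round t would only arrive some rounds later, so the update on `est_losses` could not be applied immediately; how that is handled when the delay itself is unknown is the subject of the paper and is left out of this sketch.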


Artificial Intelligence is the bicycle for our Technology -- My Udacity AMA

#artificialintelligence

Firstly, Karen Baker and Martin McGovern from Udacity helped organize and facilitate this AMA for the lifelong learners at Udacity. I am deeply thankful to Karen, Martin and Udacity for this opportunity to share the knowledge. QQ: What is the best piece of advice you've ever received in your career? VK: I have received some good advice from books as well as mentors. QQ: What suggestions do you have around building your portfolio?


More Than Powering Robots, AI Is About Connecting People -- AGE OF ROBOTS Magazine

#artificialintelligence

I can have a much more meaningful interaction with someone sitting across the table from me than I can with a massive group of people, spread out all over the world, using one message. As social media and technology "connect us" in new ways, we're being driven apart by those messages. Just consider the increasing political divisiveness around the world, at least partially the result of people misunderstanding or talking past each other. Artificial intelligence promises a lot: self-driving cars, more complex automation, leaps in medical research. Many of these are, however, still far from realization.