I would recommend that you start with An Introduction to Statistical Learning with R (usually shortened to ISLR). A lot of people have adapted its examples to Python if you google a bit, and it's an excellent book that hides just enough complexity to not be overwhelming. Plus, once you have a good understanding of all of it, you can either graduate to the more extensive version (The Elements of Statistical Learning, usually shortened to ESL) for a more rigorous treatment of the same material, or go for something different like Bishop's Pattern Recognition and Machine Learning. ISLR is free as a PDF and has a corresponding MOOC. ESL doesn't have a MOOC, but it is also free on the authors' website.
I can't offer much in terms of other entry-level recommendations, but I can recommend you learn to use the resource pages on the Coursera course. The way the Andrew Ng course is set up, you first try to get a conceptual sense of how these algorithms work through the videos; then, when you get to the programming assignments, you can skip a lot of the prep work and focus on implementing the machine learning algorithms themselves. Those algorithms might be a little hard to follow at first, which is okay and expected, and that's where the lecture notes and/or wiki come in. From the wiki you can more or less translate the math formulas into code syntax, and at that point the assignments are more or less complete. The weeks build on each other: as you learn how to do one part, the course does a little less prep work for you, so you have to learn how to do the next part yourself, and so on.
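To give a concrete feel for what "translating a math formula into code" means, here is a small sketch of the kind of step involved: the mean-squared-error cost from linear regression, J(θ) = 1/(2m) · Σ(Xθ − y)², written in vectorized NumPy. Note this is my own illustration (the course itself uses Octave/MATLAB, and these function names are not from the course materials):

```python
import numpy as np

def compute_cost(X, y, theta):
    """Linear regression cost J(theta) = 1/(2m) * sum((X @ theta - y)^2),
    written as one vectorized expression instead of an explicit loop."""
    m = len(y)
    errors = X @ theta - y          # residual vector, shape (m,)
    return (errors @ errors) / (2 * m)

# Tiny check: y = 2x fits exactly with theta = [0, 2], so the cost is 0.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # first column is the intercept term
y = np.array([2.0, 4.0, 6.0])
print(compute_cost(X, y, np.array([0.0, 2.0])))  # → 0.0
```

Once you can read the formula and the line of code side by side like this, the rest of each assignment is mostly plumbing.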
We propose an online prediction version of submodular set cover with connections to ranking and repeated active learning. In each round, the learning algorithm chooses a sequence of items. The algorithm then receives a monotone submodular function and suffers loss equal to the cover time of the function: the number of items needed, when items are selected in order of the chosen sequence, to achieve a coverage constraint. We develop an online learning algorithm whose loss converges to approximately that of the best sequence in hindsight. Our proposed algorithm is readily extended to a setting where multiple functions are revealed at each round and to bandit and contextual bandit settings.
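To make the loss concrete, here is a minimal sketch of the cover-time quantity the abstract describes: the number of items, taken in the order of the chosen sequence, needed for a monotone submodular function to reach a coverage threshold. The helper names and the particular coverage function (plain set cover) are my own illustration, not from the paper:

```python
def cover_time(sequence, f, threshold):
    """Count how many items of `sequence`, taken in order, are needed
    before the monotone submodular function f reaches `threshold`."""
    chosen = set()
    for t, item in enumerate(sequence, start=1):
        chosen.add(item)
        if f(chosen) >= threshold:
            return t
    return len(sequence)  # constraint never met: suffer the full length

# Example monotone submodular function: set cover. Each item covers some
# ground elements, and f(S) counts the distinct elements covered by S.
covers = {"a": {1, 2}, "b": {2, 3}, "c": {4}}
f = lambda S: len(set().union(*(covers[i] for i in S))) if S else 0

# Covering all of {1, 2, 3, 4} requires all three items here.
print(cover_time(["a", "b", "c"], f, threshold=4))  # → 3
```

In the online setting, the algorithm commits to a sequence each round before `f` is revealed, and this cover time is the loss it suffers for that round.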