CS 229 - Supervised Learning Cheatsheet

Mar-14-2019, 23:08:57 GMT–#artificialintelligence

Given a set of data points $\{x^{(1)}, ..., x^{(m)}\}$ associated with a set of outcomes $\{y^{(1)}, ..., y^{(m)}\}$, we want to build a classifier that learns how to predict $y$ from $x$.

Hypothesis: The hypothesis is noted $h_\theta$ and is the model that we choose. For a given input $x^{(i)}$, the model's predicted output is $h_\theta(x^{(i)})$.

Loss function: A loss function is a function $L:(z,y)\in\mathbb{R}\times Y\longmapsto L(z,y)\in\mathbb{R}$ that takes as inputs the predicted value $z$ and the corresponding real data value $y$, and outputs how different they are.

Remark: Stochastic gradient descent (SGD) updates the parameters after each training example, whereas batch gradient descent updates them using a batch of training examples.
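To make the hypothesis, the loss function, and the difference between the two gradient descent variants concrete, here is a minimal Python sketch. Several choices in it are assumptions for illustration and are not stated in the text: the hypothesis is taken to be linear, the loss is taken to be the least squared error $L(z,y)=\frac{1}{2}(y-z)^2$, and the learning rate `alpha`, the toy data, and all function names are hypothetical.

```python
# Assumptions (not from the text): a linear hypothesis, the least squared
# error loss, a fixed learning rate `alpha`, and a small toy dataset.

def h(theta, x):
    """Hypothesis h_theta(x): here a linear model, the dot product theta . x."""
    return sum(t * xi for t, xi in zip(theta, x))

def loss(z, y):
    """Loss L(z, y): least squared error between prediction z and outcome y."""
    return 0.5 * (y - z) ** 2

def grad(theta, x, y):
    """Gradient of loss(h(theta, x), y) with respect to theta."""
    err = h(theta, x) - y
    return [err * xi for xi in x]

def sgd_epoch(theta, X, Y, alpha):
    """Stochastic gradient descent: one parameter update per training example."""
    for x, y in zip(X, Y):
        g = grad(theta, x, y)
        theta = [t - alpha * gi for t, gi in zip(theta, g)]
    return theta

def batch_step(theta, X, Y, alpha):
    """Batch gradient descent: one update from the gradient averaged over the batch."""
    m = len(X)
    total = [0.0] * len(theta)
    for x, y in zip(X, Y):
        total = [s + gi for s, gi in zip(total, grad(theta, x, y))]
    return [t - alpha * s / m for t, s in zip(theta, total)]

def mean_loss(theta, X, Y):
    """Average loss of the hypothesis over the whole dataset."""
    return sum(loss(h(theta, x), y) for x, y in zip(X, Y)) / len(X)

# Toy data generated by y = 1 + 2 * x1; the leading 1.0 in each x is an intercept feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Y = [1.0, 3.0, 5.0, 7.0]

theta_sgd = [0.0, 0.0]
theta_batch = [0.0, 0.0]
for _ in range(2000):
    theta_sgd = sgd_epoch(theta_sgd, X, Y, alpha=0.05)        # 4 updates per pass
    theta_batch = batch_step(theta_batch, X, Y, alpha=0.05)   # 1 update per pass

print(theta_sgd, theta_batch)
```

On this noise-free toy dataset both parameter vectors approach $[1, 2]$. The only difference between the two routines is how often $\theta$ changes: `sgd_epoch` updates it once per example, while `batch_step` updates it once per pass over the batch.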

