- select a class of models and a loss function
- fit a model from that class to the training data
- use the fitted model to predict outputs for data that is not in the training set
Class of functions
Y=f(X,θ)
- X input data
- θ parameters
A loss function
l(Y,Y^)
- Y known output values from the data
- Y^ estimated output values computed using f, i.e. Y^=f(X,θ)
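empirical risk: the average of the loss over a given dataset (here, the training dataset)
ERM (empirical risk minimization): choose the parameters θ that minimize the empirical risk; this is useful to the extent that low average loss on the training data carries over to data not used in training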
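Written out, the empirical risk might look as follows. This is a sketch of the notation only: the sample count n, the index i and the symbol \hat{R} are introduced here for illustration and do not appear in the notes above.

```latex
\hat{R}(\theta) = \frac{1}{n} \sum_{i=1}^{n} l\bigl(Y_i, \hat{Y}_i\bigr),
\qquad \hat{Y}_i = f(X_i, \theta)
```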
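The three steps can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the linear model, the squared-error loss, the synthetic data and all names (`f`, `loss`, `empirical_risk`, `fit_erm`) are assumptions chosen for the example. For this particular model and loss, minimizing the empirical risk happens to be ordinary least squares.

```python
import numpy as np

def f(X, theta):
    """Model class (assumed linear for illustration): Y^ = X @ theta."""
    return X @ theta

def loss(Y, Y_hat):
    """Squared-error loss l(Y, Y^), computed per sample (an assumed choice)."""
    return (Y - Y_hat) ** 2

def empirical_risk(theta, X, Y):
    """Average loss of f(., theta) over the dataset (X, Y)."""
    return float(np.mean(loss(Y, f(X, theta))))

def fit_erm(X, Y):
    """ERM for a linear model with squared loss, solved by least squares."""
    theta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return theta_hat

# Synthetic data (hypothetical): Y = X @ theta_true + small Gaussian noise.
rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0])
X = rng.normal(size=(200, 2))
Y = X @ theta_true + 0.1 * rng.normal(size=200)

# Hold out the last 50 samples; they are not used for fitting.
X_train, Y_train = X[:150], Y[:150]
X_test, Y_test = X[150:], Y[150:]

theta_hat = fit_erm(X_train, Y_train)                       # fit on training data
train_risk = empirical_risk(theta_hat, X_train, Y_train)    # empirical risk
test_risk = empirical_risk(theta_hat, X_test, Y_test)       # loss on unseen data

print("theta_hat:", theta_hat)
print("train risk:", train_risk, "test risk:", test_risk)
```

Comparing `train_risk` with `test_risk` gives a rough check of whether minimizing the empirical risk also gave low loss on data not used in training.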