Machine Learning Engineer Interview Review - Machine Learning

1.1 Supervised Learning

1.1.1 Linear Regression

1.1.1.1 Linear regression with multiple variables

$x^{i}= vector[x_1, x_2, …, x_n]$

$h_\theta(x)=\theta_0+\theta_1x_1+\theta_2x_2+…+\theta_nx_n $

To simplify the above equation, we set $x_0=1$, so that we can use $h_\theta(x)=\theta^TX$ to represent the above euqation, where $X$ is a $n+1$ dimension vector.

Loss function: $J(\theta_0, \theta_1, …, \theta_n)=\frac{1}{2m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})^2$.

We want to find $\theta$ that can minimize the loss function.

Repeat $\theta_j := \theta_j-\alpha\frac{\part}{\part \theta_j}J(\theta_0, \theta_1, …, \theta_n)$ for $j=0,1,…,n$.

1.1.1.2 Feature Scaling

Suppose we have two features, the size of houses (0-2000) and the number of houses (0-5),