Machine Learning Engineer Interview Review - Machine Learning
1.1 Supervised Learning
1.1.1 Linear Regression
1.1.1.1 Linear regression with multiple variables
Each training example is a vector of $n$ features: $x^{(i)}=[x_1^{(i)}, x_2^{(i)}, …, x_n^{(i)}]$
$h_\theta(x)=\theta_0+\theta_1x_1+\theta_2x_2+…+\theta_nx_n $
To simplify the above equation, we set $x_0=1$, so that the hypothesis can be written as $h_\theta(x)=\theta^Tx$, where $x$ and $\theta$ are both $(n+1)$-dimensional vectors.
Loss function: $J(\theta_0, \theta_1, …, \theta_n)=\frac{1}{2m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})^2$.
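As a minimal sketch of the hypothesis and loss above, the following pure-Python snippet evaluates $h_\theta(x)$ and $J(\theta)$ on a tiny data set. The function names `predict` and `cost` and the toy data are illustrative assumptions, not part of the original notes; each row of `X` is assumed to already start with $x_0=1$.

```python
def predict(theta, x):
    """Hypothesis h_theta(x) = theta^T x, where x already includes x_0 = 1."""
    return sum(t * xi for t, xi in zip(theta, x))


def cost(theta, X, y):
    """Squared-error loss J(theta) = 1/(2m) * sum_i (h_theta(x_i) - y_i)^2."""
    m = len(X)
    return sum((predict(theta, xi) - yi) ** 2 for xi, yi in zip(X, y)) / (2 * m)


# Hypothetical toy data generated from y = 1 + 2 * x_1; each row starts with x_0 = 1.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]

print(cost([1.0, 2.0], X, y))  # parameters that fit the data exactly give a loss of 0.0
print(cost([0.0, 0.0], X, y))  # all-zero parameters give (1 + 9 + 25) / 6
```

A lower value of `cost` means the parameters fit the training data more closely.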
We want to find the $\theta$ that minimizes the loss function.
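Gradient descent: repeat until convergence $\theta_j := \theta_j-\alpha\frac{\partial}{\partial \theta_j}J(\theta_0, \theta_1, …, \theta_n)$, updating all $\theta_j$ for $j=0,1,…,n$ simultaneously, where $\alpha$ is the learning rate.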
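For the squared-error loss, the partial derivative works out to $\frac{\partial}{\partial \theta_j}J=\frac{1}{m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}$. The update rule can then be sketched in pure Python as below. The function name `gradient_descent`, the defaults for `alpha` and `num_iters`, and the toy data are assumptions chosen for illustration; a fixed iteration count stands in for a real convergence check.

```python
def gradient_descent(X, y, alpha=0.1, num_iters=2000):
    """Batch gradient descent for linear regression.

    X: list of rows, each starting with x_0 = 1; y: list of targets.
    alpha and num_iters are illustrative defaults, not tuned values.
    """
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(num_iters):
        # Prediction error h_theta(x_i) - y_i for every training example.
        errors = [sum(t * xj for t, xj in zip(theta, xi)) - yi
                  for xi, yi in zip(X, y)]
        # Partial derivative of J with respect to each theta_j.
        grad = [sum(e * xi[j] for e, xi in zip(errors, X)) / m
                for j in range(n)]
        # Simultaneous update: every theta_j uses the gradient from the old theta.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical toy data generated from y = 1 + 2 * x_1.
X_train = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y_train = [1.0, 3.0, 5.0]

theta = gradient_descent(X_train, y_train)
print(theta)  # should be close to [1.0, 2.0]
```

Building the whole gradient list before touching `theta` is what makes the update simultaneous. Updating each $\theta_j$ in place inside the loop would mix old and new values.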
1.1.1.2 Feature Scaling
Suppose we have two features, the size of houses (0-2000) and the number of houses (0-5),