
[1주차] Coursera Machine Learning : Regression

by 엔딴 2021. 5. 27.

x : input

y : output

y can be predicted from x

model : how we assume the world works

f(x) : expected relationship between x and y

 

Regression model :

yi = f(xi) + ei

E(ei) = 0 

- E(ei) is the expected value of the error

- it is equally likely that the error is positive or negative

- yi is equally likely to be above or below f(xi)
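A minimal sketch of this model (my own illustration, not from the lecture): generating data yi = f(xi) + ei with zero-mean noise, under an assumed f(x).

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # assumed "true" relationship between x and y, for illustration only
    return 2.0 + 0.5 * x

x = rng.uniform(0, 10, size=100)               # inputs x_i
e = rng.normal(loc=0.0, scale=1.0, size=100)   # errors e_i with E(e_i) = 0
y = f(x) + e                                   # observed outputs y_i
```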

 


"Essentially all models are wrong, but some are useful."

-George Box, 1987

 

Task1 - which model f(x)?

Task2 - For a given model f(x), estimate function fhat(x) from data

 

Flowchart

 

Green box => ML model

The Simple Linear Regression

f(x) = w0 + w1x

yi = f(xi) + ei = w0 + w1xi + ei

parameters : regression coefficients (w0, w1)
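As a quick sketch (coefficient values are made up for illustration, not from the course), the model is just a function of the two regression coefficients:

```python
def predict(x, w0, w1):
    """Simple linear regression prediction f(x) = w0 + w1*x."""
    return w0 + w1 * x

# e.g., with assumed coefficients w0 = 10000 ($ baseline) and w1 = 200 ($ per sqft):
print(predict(1500, w0=10000.0, w1=200.0))   # 310000.0
```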

 

Fitting a line to data

 

Orange box => Quality metric

"Cost" of using a given line

Residual sum of squares (RSS)

the sum of the squared differences between the values of y predicted from x and the actual observed values

 

Find "best" line => Minimize cost over all possible w0, w1

 

 

The fitted line : use + interpretation

Model vs. fitted line

 

Interpreting the coefficients

w0 : predicted price ($) of a house with sqft = 0 (just land); not very meaningful

w1 : predicted change in the output per unit change in the input (here, per 1 sqft)

magnitude depends on units of both features and observations

Units matter!

 

Gray box => ML algorithm

Optimization

Find the w0, w1 that minimize RSS => w0hat, w1hat
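One way to obtain w0hat and w1hat for simple linear regression is the standard closed-form least-squares solution, shown here as a sketch (the course derives it by setting the partial derivatives of RSS to zero):

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Closed-form (w0hat, w1hat) minimizing RSS for simple linear regression."""
    x_mean, y_mean = x.mean(), y.mean()
    w1_hat = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    w0_hat = y_mean - w1_hat * x_mean
    return w0_hat, w1_hat
```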

 

Concave / Convex functions

* concave : shaped like a cave (curves downward)

concave : a line drawn between any two points a, b lies below g(w) everywhere

convex : the line lies above g(w) everywhere

neither : the line lies below g(w) in some places and above it in others
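A quick numeric illustration of the definition (my own example, not from the lecture): for the convex function g(w) = w^2, the chord between any two points lies above the function.

```python
def g(w):
    return w ** 2   # a convex function

a, b, t = -1.0, 3.0, 0.25
w = (1 - t) * a + t * b              # a point between a and b
chord = (1 - t) * g(a) + t * g(b)    # value of the straight line (chord) at w
assert g(w) <= chord                 # convex: the function lies below the chord
```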

 

Finding the Max or Min Analytically

For a concave function, the maximum is where the derivative = 0

For a convex function, the minimum is where the derivative = 0

-> only one place where der=0

If the function is neither, there can be multiple points where the derivative = 0
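A quick worked example (my own, not from the lecture): for the concave function g(w) = -(w - 2)^2 + 5, the derivative is g'(w) = -2(w - 2); setting it to 0 gives w = 2, the unique maximum.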

 

How do we know whether to move w to right or left? (increase or decrease the value of w?)

concave : Hill climbing 

if the derivative is positive, move w to the right (increase w); if the derivative is negative, move left (decrease w)

while not converged : w ← w + η·(dg/dw)

 

convex : Hill descent

when der is positive, we want to decrease w and when der is negative, we want to increase w. 
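For example (my own numbers): with the convex function g(w) = (w - 3)^2, at w = 5 the derivative is +4, so hill descent decreases w toward the minimum at w = 3; at w = 0 the derivative is -6, so it increases w.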

 

 

Choosing the stepsize

η (eta)

 

In theory we know the optimum is where the derivative = 0, but in practice the algorithm stops when the derivative is smaller than some threshold.

der < ε (threshold to be set) 

 

Multiple dimensions

a partial derivative is like a derivative with respect to one variable (e.g., w1), treating all the other variables as constants

 

Convex => solution is unique + gradient descent algorithm will converge to minimum
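In multiple dimensions the same idea uses partial derivatives. A sketch of gradient descent on the RSS of simple linear regression, updating w0 and w1 with their partial derivatives (the eta and epsilon values here are assumptions):

```python
import numpy as np

def gradient_descent_rss(x, y, eta=1e-5, epsilon=1e-3, max_iters=100000):
    """Minimize RSS(w0, w1) by gradient descent; RSS is convex, so this converges."""
    w0, w1 = 0.0, 0.0
    for _ in range(max_iters):
        err = y - (w0 + w1 * x)               # residuals
        grad_w0 = -2.0 * np.sum(err)          # partial derivative of RSS w.r.t. w0
        grad_w1 = -2.0 * np.sum(err * x)      # partial derivative of RSS w.r.t. w1
        if np.sqrt(grad_w0 ** 2 + grad_w1 ** 2) < epsilon:
            break                             # gradient magnitude below threshold
        w0 -= eta * grad_w0
        w1 -= eta * grad_w1
    return w0, w1
```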

 

 
