[Fast Campus Course Review] Data Analysis Online Course 100% Refund Challenge: Mission 35

Lectures covered: 36. ch05. sklearn - Regression - 08. Lasso through 38. ch05. sklearn - Regression - 10. Applying a Scaler (StandardScaler)

36. ch05. sklearn - Regression - 08. Lasso

from sklearn.linear_model import Lasso

# The larger the alpha, the stronger the regularization.
alphas = [100, 10, 1, 0.1, 0.01, 0.001, 0.0001]

# mse_eval is a helper that is not defined in this post.
for alpha in alphas:
    lasso = Lasso(alpha=alpha)
    lasso.fit(x_train, y_train)
    pred = lasso.predict(x_test)
    mse_eval('Lasso(alpha={})'.format(alpha), pred, y_test)
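The loop above depends on x_train, y_train and the mse_eval helper, none of which appear in this post. As a rough, self-contained sketch of the same alpha sweep, the version below swaps in synthetic data from make_regression and scikit-learn's mean_squared_error in place of mse_eval. The dataset, the variable names and max_iter=10000 are my own assumptions, not part of the lecture.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption; the post uses its own x_train / y_train).
X, y = make_regression(n_samples=300, n_features=10, n_informative=4,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

alphas = [100, 10, 1, 0.1, 0.01, 0.001, 0.0001]
results = {}
for alpha in alphas:
    # max_iter is raised so that small alphas have room to converge.
    model = Lasso(alpha=alpha, max_iter=10000)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # mean_squared_error stands in for the mse_eval helper.
    results[alpha] = mean_squared_error(y_test, pred)

for alpha, mse in results.items():
    print('Lasso(alpha={}): MSE={:.2f}'.format(alpha, mse))
```

On data like this, a very large alpha tends to underfit because most coefficients are pushed to zero, so its test MSE should come out clearly worse than that of a moderate alpha.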



lasso_100 = Lasso(alpha=100)
lasso_100.fit(x_train, y_train)
lasso_pred_100 = lasso_100.predict(x_test)

lasso_001 = Lasso(alpha=0.001)
lasso_001.fit(x_train, y_train)
lasso_pred_001 = lasso_001.predict(x_test)


plot_coef(x_train.columns, lasso_100.coef_)

lasso_100.coef_

plot_coef(x_train.columns, lasso_001.coef_)

lasso_001.coef_
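What the two coefficient plots are meant to show is that the L1 penalty drives coefficients to exactly zero as alpha grows. A quick way to check this without plot_coef (a helper not shown in this post) is to count the nonzero entries of coef_. The sketch below uses synthetic data as an assumption, so the exact counts will differ from the lecture's dataset.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data (an assumption; not the dataset used in the post).
X, y = make_regression(n_samples=300, n_features=10, n_informative=4,
                       noise=10.0, random_state=0)

nonzero_counts = {}
for alpha in [100, 0.001]:
    model = Lasso(alpha=alpha, max_iter=10000).fit(X, y)
    # Coefficients removed by the L1 penalty are exactly 0.0.
    nonzero_counts[alpha] = int(np.count_nonzero(model.coef_))
    print('alpha={}: {} of {} coefficients are nonzero'.format(
        alpha, nonzero_counts[alpha], model.coef_.size))
```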



37. ch05. sklearn - Regression - 09. ElasticNet

ElasticNet

l1_ratio (default=0.5)

l1_ratio = 0 (L2 penalty only).
l1_ratio = 1 (L1 penalty only).
0 < l1_ratio < 1 (a mix of the L1 and L2 penalties).
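To make the role of l1_ratio concrete, here is a small sketch of the penalty term that scikit-learn documents for ElasticNet, alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2, along with a check that l1_ratio=1 reproduces Lasso. The function name elastic_net_penalty and the synthetic data are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso


def elastic_net_penalty(w, alpha, l1_ratio):
    """Penalty term of the ElasticNet objective (illustrative helper)."""
    w = np.asarray(w, dtype=float)
    l1 = np.abs(w).sum()      # L1 norm
    l2_sq = (w ** 2).sum()    # squared L2 norm
    return alpha * l1_ratio * l1 + 0.5 * alpha * (1 - l1_ratio) * l2_sq


w = [1.0, -2.0]
only_l1 = elastic_net_penalty(w, alpha=0.5, l1_ratio=1.0)  # 0.5 * 3
only_l2 = elastic_net_penalty(w, alpha=0.5, l1_ratio=0.0)  # 0.5 * 0.5 * 5

# With l1_ratio=1 the ElasticNet fit should match Lasso at the same alpha.
X, y = make_regression(n_samples=200, n_features=8, noise=5.0, random_state=0)
lasso = Lasso(alpha=0.5).fit(X, y)
enet_l1 = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X, y)
same_as_lasso = np.allclose(lasso.coef_, enet_l1.coef_)
```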

from sklearn.linear_model import ElasticNet

ratios = [0.2, 0.5, 0.8]


for ratio in ratios:
    elasticnet = ElasticNet(alpha=0.5, l1_ratio=ratio)
    elasticnet.fit(x_train, y_train)
    pred = elasticnet.predict(x_test)
    mse_eval('ElasticNet(l1_ratio={})'.format(ratio), pred, y_test)




elasticnet_20 = ElasticNet(alpha=5, l1_ratio=0.2)
elasticnet_20.fit(x_train, y_train)
elasticnet_pred_20 = elasticnet_20.predict(x_test)

elasticnet_80 = ElasticNet(alpha=5, l1_ratio=0.8)
elasticnet_80.fit(x_train, y_train)
elasticnet_pred_80 = elasticnet_80.predict(x_test)


plot_coef(x_train.columns, elasticnet_20.coef_)

plot_coef(x_train.columns, elasticnet_80.coef_)

elasticnet_80.coef_


38. ch05. sklearn - Regression - 10. Applying a Scaler (StandardScaler)

Scaler

from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler

x_train.describe()

StandardScaler

A scaler that transforms each feature to have a mean of 0 and a standard deviation (std) of 1.



std_scaler = StandardScaler()

std_scaled = std_scaler.fit_transform(x_train)


round(pd.DataFrame(std_scaled).describe(), 2)
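As a sanity check on what StandardScaler does, the sketch below compares it with the manual formula (x - mean) / std on made-up data. The arrays and variable names are assumptions. One detail worth knowing: StandardScaler divides by the population standard deviation (ddof=0), while pandas describe() reports the sample standard deviation (ddof=1), so the std row in describe() can come out slightly different from 1 on small datasets.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data (an assumption; the post scales its own x_train).
rng = np.random.default_rng(0)
X_train = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
X_test = rng.normal(loc=50.0, scale=10.0, size=(20, 3))

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std on train
X_test_scaled = scaler.transform(X_test)        # reuse the train statistics

# Manual version: np.std defaults to ddof=0, matching StandardScaler.
manual = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)

print(np.round(X_train_scaled.mean(axis=0), 2))
print(np.round(X_train_scaled.std(axis=0), 2))
```

Calling transform (not fit_transform) on the test split keeps the test data on the same scale that the model saw during training.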



MinMaxScaler


Normalizes each feature into the 0 to 1 range, so that its minimum becomes 0 and its maximum becomes 1.

minmax_scaler = MinMaxScaler()
minmax_scaled = minmax_scaler.fit_transform(x_train)

round(pd.DataFrame(minmax_scaled).describe(), 2)
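With its default feature_range of (0, 1), MinMaxScaler amounts to (x - min) / (max - min) per column. The sketch below checks that on made-up data; the array and variable names are assumptions.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up data (an assumption; the post scales its own x_train).
rng = np.random.default_rng(0)
X = rng.uniform(low=-20.0, high=300.0, size=(50, 4))

scaled = MinMaxScaler().fit_transform(X)

# Manual version of the same transform, column by column.
col_min = X.min(axis=0)
col_max = X.max(axis=0)
manual = (X - col_min) / (col_max - col_min)

print(scaled.min(axis=0), scaled.max(axis=0))
```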



RobustScaler


Transforms each feature so that its median is 0 and its IQR (interquartile range) is 1.

Useful when the data contains outliers.


robust_scaler = RobustScaler()
robust_scaled = robust_scaler.fit_transform(x_train)


round(pd.DataFrame(robust_scaled).median(), 2)
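To see why RobustScaler copes with outliers, the sketch below scales a tiny column that contains one extreme value. With default settings the transform is (x - median) / IQR, where the IQR is the 75th percentile minus the 25th percentile. The sample values are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# One column with a single extreme outlier (made-up values).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]).reshape(-1, 1)

robust_scaled = RobustScaler().fit_transform(x)

# Manual version: center on the median, divide by the IQR.
median = np.median(x)
q25, q75 = np.percentile(x, [25, 75])
manual = (x - median) / (q75 - q25)

scaled_median = np.median(robust_scaled)
scaled_iqr = (np.percentile(robust_scaled, 75)
              - np.percentile(robust_scaled, 25))
print(robust_scaled.ravel())
```

The median and IQR are computed from the middle of the distribution, so the value 1000 barely influences how the ordinary points are scaled. It simply stays far away from them after the transform.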



Fast Campus data analysis course link:
bit.ly/3imy2uN