36. ch05. sklearn - Regression - 08. Lasso - 38. ch05. sklearn - Regression - 10. Applying a Scaler (StandardScaler)
36. ch05. sklearn - Regression - 08. Lasso
from sklearn.linear_model import Lasso
# The larger the value, the stronger the regularization.
alphas = [100, 10, 1, 0.1, 0.01, 0.001, 0.0001]
for alpha in alphas:
    lasso = Lasso(alpha=alpha)
    lasso.fit(x_train, y_train)
    pred = lasso.predict(x_test)
    mse_eval('Lasso(alpha={})'.format(alpha), pred, y_test)
lasso_100 = Lasso(alpha=100)
lasso_100.fit(x_train, y_train)
lasso_pred_100 = lasso_100.predict(x_test)
lasso_001 = Lasso(alpha=0.001)
lasso_001.fit(x_train, y_train)
lasso_pred_001 = lasso_001.predict(x_test)
plot_coef(x_train.columns, lasso_100.coef_)
lasso_100.coef_
plot_coef(x_train.columns, lasso_001.coef_)
lasso_001.coef_
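To see why a large alpha can push Lasso coefficients to exactly zero while a small alpha leaves them almost untouched, here is a minimal sketch in plain Python. It assumes a single feature and no intercept (scikit-learn's Lasso fits an intercept by default), so the minimizer of (1 / (2n)) * sum((y - w * x)^2) + alpha * |w| has a closed form based on soft-thresholding. The helper names `soft_threshold` and `lasso_1d` and the toy data are hypothetical, chosen only for illustration.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; return exactly 0.0 when |rho| <= alpha."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_1d(x, y, alpha):
    """Minimize (1 / (2n)) * sum((y_i - w * x_i)^2) + alpha * |w| over w.

    Assumption: one feature and no intercept, so a closed form exists.
    """
    n = len(x)
    rho = sum(xi * yi for xi, yi in zip(x, y)) / n  # mean of x_i * y_i
    z = sum(xi * xi for xi in x) / n                # mean of x_i squared
    return soft_threshold(rho, alpha) / z


# Hypothetical toy data: y is roughly 2 * x.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.1, 2.0, 3.9]

for alpha in [100, 10, 1, 0.1, 0.01, 0.001, 0.0001]:
    print('alpha={} -> w={:.4f}'.format(alpha, lasso_1d(x, y, alpha)))
```

On this toy data the coefficient is exactly 0 for alpha=100 and alpha=10, and it approaches the unpenalized least-squares value as alpha shrinks. The real model handles many features jointly, so the sketch only conveys the mechanism, not the values you would see in `lasso_100.coef_` or `lasso_001.coef_`.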
37. ch05. sklearn - Regression - 09. ElasticNet
ElasticNet
l1_ratio (default=0.5)
l1_ratio = 0 (uses the L2 penalty only).
l1_ratio = 1 (uses the L1 penalty only).
0 < l1_ratio < 1 (uses a mix of the L1 and L2 penalties).
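The effect of l1_ratio can be made concrete by computing the penalty term by hand. The sketch below follows the penalty form documented for scikit-learn's ElasticNet, alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2; the function name `elastic_net_penalty` and the coefficient vector are hypothetical and used only for illustration.

```python
def elastic_net_penalty(coef, alpha, l1_ratio):
    """Return alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2."""
    l1 = sum(abs(w) for w in coef)    # L1 norm of the coefficients
    l2_sq = sum(w * w for w in coef)  # squared L2 norm of the coefficients
    return alpha * l1_ratio * l1 + 0.5 * alpha * (1.0 - l1_ratio) * l2_sq


# Hypothetical coefficient vector: ||w||_1 = 7, ||w||_2^2 = 25.
coef = [3.0, -4.0, 0.0]

for ratio in [0.0, 0.2, 0.5, 0.8, 1.0]:
    penalty = elastic_net_penalty(coef, alpha=0.5, l1_ratio=ratio)
    print('l1_ratio={} -> penalty={}'.format(ratio, penalty))
```

With l1_ratio=0 only the squared L2 term contributes, with l1_ratio=1 only the L1 term contributes, and values in between blend the two.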
from sklearn.linear_model import ElasticNet
ratios = [0.2, 0.5, 0.8]
for ratio in ratios:
    elasticnet = ElasticNet(alpha=0.5, l1_ratio=ratio)
    elasticnet.fit(x_train, y_train)
    pred = elasticnet.predict(x_test)
    mse_eval('ElasticNet(l1_ratio={})'.format(ratio), pred, y_test)
elsticnet_20 = ElasticNet(alpha=5, l1_ratio=0.2)
elsticnet_20.fit(x_train, y_train)
elasticnet_pred_20 = elsticnet_20.predict(x_test)
elsticnet_80 = ElasticNet(alpha=5, l1_ratio=0.8)
elsticnet_80.fit(x_train, y_train)
elasticnet_pred_80 = elsticnet_80.predict(x_test)
plot_coef(x_train.columns, elsticnet_20.coef_)
plot_coef(x_train.columns, elsticnet_80.coef_)
elsticnet_80.coef_
38. ch05. sklearn - Regression - 10. Applying a Scaler (StandardScaler)
Scaler
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler
x_train.describe()
StandardScaler
A scaler that transforms each feature to have mean 0 and standard deviation (std) 1.
std_scaler = StandardScaler()
std_scaled = std_scaler.fit_transform(x_train)
round(pd.DataFrame(std_scaled).describe(), 2)
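What the standardization does to a single column can be sketched by hand in plain Python. The function name `standard_scale` and the sample column are hypothetical; the sketch assumes the column is not constant (otherwise the standard deviation would be 0). It uses the population standard deviation (ddof=0), which is what StandardScaler uses, whereas pandas `describe()` reports the sample standard deviation (ddof=1), so the std row of the table above can differ very slightly from 1 before rounding.

```python
import statistics


def standard_scale(values):
    """Return (v - mean) / std for every value, using the population std (ddof=0)."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]


# Hypothetical column with mean 5 and population std 2.
column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standard_scale(column)

print(scaled)
print('mean:', statistics.mean(scaled), 'std:', statistics.pstdev(scaled))
```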
MinMaxScaler
Normalizes each feature to the range 0 to 1, mapping its minimum to 0 and its maximum to 1.
minmax_scaler = MinMaxScaler()
minmax_scaled = minmax_scaler.fit_transform(x_train)
round(pd.DataFrame(minmax_scaled).describe(), 2)
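The min-max transformation for one column amounts to (v - min) / (max - min). A minimal sketch in plain Python, with a hypothetical function name `minmax_scale_column` and a made-up column, assuming the column is not constant:

```python
def minmax_scale_column(values):
    """Map the smallest value to 0 and the largest to 1, linearly in between."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical column.
column = [10.0, 20.0, 25.0, 50.0]
scaled = minmax_scale_column(column)

print(scaled)
```

Because the minimum and maximum define the whole mapping, a single extreme value compresses all other values into a narrow part of the 0 to 1 range.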
RobustScaler
Transforms each feature so that its median is 0 and its IQR (interquartile range) is 1.
Useful when the data contain outliers.
robust_scaler = RobustScaler()
robust_scaled = robust_scaler.fit_transform(x_train)
round(pd.DataFrame(robust_scaled).median(), 2)
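The robust transformation for one column can be sketched as (v - median) / IQR. The function name `robust_scale_column` and the sample column are hypothetical; the quartiles are computed with linear interpolation (`method='inclusive'`), which is assumed here to match the percentile convention used for RobustScaler's default 25th to 75th percentile range.

```python
import statistics


def robust_scale_column(values):
    """Center on the median and divide by the interquartile range (Q3 - Q1)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method='inclusive')
    iqr = q3 - q1
    return [(v - median) / iqr for v in values]


# Hypothetical column with one outlier (100.0).
column = [1.0, 2.0, 3.0, 4.0, 100.0]
scaled = robust_scale_column(column)

print(scaled)
print('median:', statistics.median(scaled))
```

The outlier stays far from the rest after scaling, but it does not influence the median or the IQR, so the other values keep a sensible spread around 0.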
Link to the Fast Campus data analysis course
bit.ly/3imy2uN