
[python] Removing missing values and outliers

by Chandler.j 2021. 2. 21.

fig1. title

Removing missing values

# drop every row that contains at least one missing value
# (pandas counterpart of R's df[complete.cases(df), ])
df = df.dropna()
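
As a slightly fuller sketch of missing-value removal in pandas: the toy DataFrame and its column names below are assumptions made up for illustration, not data from this post. It counts the missing values first, then drops incomplete rows either across all columns or only with respect to selected columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names "a" and "b" are illustrative only.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]})

# Count missing values per column before dropping anything.
missing_per_column = df.isna().sum()

# Keep only the rows that have no missing value in any column.
df_complete = df.dropna()

# Alternatively, require non-missing values only in selected columns.
df_a_present = df.dropna(subset=["a"])

print(missing_per_column)
print(df_complete)
print(df_a_present)
```

Checking the counts first is optional, but it shows how many rows will be lost before they are dropped.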

 

 Remove the outliers of the 'pred-true' variable in y_train_pd

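# remove outliers with the 1.5 * IQR rule

Q1 = y_train_pd['pred-true'].quantile(0.25)   # first quartile
Q3 = y_train_pd['pred-true'].quantile(0.75)   # third quartile
IQR = Q3 - Q1    # IQR is the interquartile range

# boolean mask: True for rows inside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]
# (named mask rather than filter to avoid shadowing the built-in filter)
mask = (y_train_pd['pred-true'] >= Q1 - 1.5 * IQR) & (y_train_pd['pred-true'] <= Q3 + 1.5 * IQR)
ro_list = y_train_pd.loc[mask]   # rows that remain after outlier removal

# compare summary statistics before and after removal
print(y_train_pd['pred-true'].describe())
print(ro_list['pred-true'].describe())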

fig2. output of #1
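
The same IQR rule can be wrapped in a small reusable function. This is a minimal sketch: the helper name remove_iqr_outliers, its parameter k, and the toy values are assumptions for illustration and do not come from the original data.

```python
import pandas as pd


def remove_iqr_outliers(frame, column, k=1.5):
    """Return the rows of frame whose column value lies within
    [Q1 - k * IQR, Q3 + k * IQR]. Hypothetical helper for illustration."""
    q1 = frame[column].quantile(0.25)
    q3 = frame[column].quantile(0.75)
    iqr = q3 - q1
    # between() is inclusive on both ends by default, matching >= and <=.
    mask = frame[column].between(q1 - k * iqr, q3 + k * iqr)
    return frame.loc[mask]


# Toy data with one obvious outlier (5.0); the values are made up.
toy = pd.DataFrame({"pred-true": [-0.2, -0.1, 0.0, 0.1, 0.2, 5.0]})
cleaned = remove_iqr_outliers(toy, "pred-true")

print(toy["pred-true"].describe())
print(cleaned["pred-true"].describe())
```

A larger k keeps more points and a smaller k removes more; 1.5 is the conventional box-plot choice used in the code above.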

 

