[Code contents]
(1) Loading the dataset (pd.read_csv, df.shape, setting the DataFrame columns)
(2) EDA (Uni - Non Graphic, Uni - Graphic, Multi - Non Graphic, Multi - Graphic)
(3) After EDA (data preprocessing)
(4) Feature engineering
(5) String & type cast (df.apply)
(6) Data manipulation (concat, merge, isin, groupby)
(7) Tidy data (melt, pivot_table, wide -> tidy, tidy -> wide)
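The steps in the outline can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the inline CSV, the column names (`store`, `q1_sales`, `q2_sales`, `region`), and the helper `to_int` are all hypothetical, chosen only to exercise the pandas calls named above.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real file on disk.
csv_text = """store,q1,q2
A,"1,200","1,500"
B,"900","1,100"
"""

# (1) Load the dataset, check its shape, and set the column names.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)
df.columns = ["store", "q1_sales", "q2_sales"]

# (5) String handling and type casting with apply:
# strip the thousands separator, then convert to int.
def to_int(value: str) -> int:
    return int(value.replace(",", ""))

for col in ["q1_sales", "q2_sales"]:
    df[col] = df[col].apply(to_int)

# (2) Non-graphic EDA: summary statistics and missing-value counts.
print(df.describe())
print(df.isna().sum())

# (6) Data manipulation: concat, merge, isin, groupby.
extra = pd.DataFrame({"store": ["C"], "q1_sales": [700], "q2_sales": [800]})
regions = pd.DataFrame(
    {"store": ["A", "B", "C"], "region": ["east", "west", "east"]}
)
all_sales = pd.concat([df, extra], ignore_index=True)     # stack rows
merged = all_sales.merge(regions, on="store", how="left")  # add region column
east_only = merged[merged["region"].isin(["east"])]        # filter rows
by_region = merged.groupby("region")[["q1_sales", "q2_sales"]].sum()

# (7) Tidy data: wide -> tidy with melt, tidy -> wide with pivot_table.
tidy = merged.melt(
    id_vars=["store", "region"],
    value_vars=["q1_sales", "q2_sales"],
    var_name="quarter",
    value_name="sales",
)
wide = tidy.pivot_table(
    index="store", columns="quarter", values="sales", aggfunc="sum"
).reset_index()
```

Here `aggfunc="sum"` is passed to `pivot_table` because its default aggregation is the mean, which would turn the integer sales into floats. With one row per store and quarter, the sum simply returns the original value.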
https://github.com/khalidpark/data_engineer_whitepaper/blob/main/data_engineer_whitepaper.ipynb
khalidpark/data_engineer_whitepaper: data preprocessing, EDA, and the other steps that come before applying machine learning and deep learning.
Source: Code States