
Section 2 Overall Retrospective

by khalidpark 2021. 3. 2.

SECTION 2 Review
01 - Linear Models
02 - Tree Based Model
03 - Applied Predictive Modeling
04 - Project (machine learning model for predicting NBA game results)

01 - KEYWORD

  • Linear regression
  • Supervised learning
  • Baseline model (classification: the most frequent class of the target) (regression: the mean of the target) (time series: the value at the previous timestamp)
  • Regression line
  • Residual
  • RSS (Residual Sum of Squares), also called SSE (Sum of Squared Errors)
  • Cost function
  • Ordinary least squares regression (OLS)
  • Dependent variable (response variable, label, target)
  • Independent variable (predictor, explanatory variable, feature)
  • Simple linear regression
  • Coefficients
  • Intercept
  • Multiple linear regression
  • MSE (Mean Squared Error)
  • MAE (Mean Absolute Error)
  • RMSE (Root Mean Squared Error)
  • R-squared (coefficient of determination)
  • Generalization
  • Overfitting
  • Underfitting
  • Bias
  • Variance
  • Trade-off
  • Ridge Regression
  • RidgeCV - alpha, lambda
  • One-hot encoding
  • Feature selection _ feature engineering, the process of choosing or creating features suited to the task
  • Regularization
  • Categorical data - nominal, ordinal
  • Classification problem
  • Logistic Regression
  • Training / validation / test sets
  • Hyperparameter
  • Cross-validation (K-fold cross-validation)
  • Odds (the ratio of the probability of success to the probability of failure)
  • Logit transformation
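
Several of the keywords above fit together in a few lines of code. The following is a minimal sketch, assuming plain Python lists as input; the function names (regression_metrics, odds, logit) and the toy numbers are made up for illustration and are not taken from the bootcamp material. It shows the mean baseline for a regression problem, the error metrics built from residuals, and the odds and logit of a probability.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for two equal-length sequences."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    rss = sum(r * r for r in residuals)           # residual sum of squares
    mse = rss / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    tss = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - rss / tss
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


def odds(p):
    """Odds: probability of success divided by probability of failure."""
    return p / (1 - p)


def logit(p):
    """Logit transformation: the natural log of the odds."""
    return math.log(odds(p))


# Toy target values (illustrative only).
y = [3.0, 5.0, 7.0, 9.0]

# Regression baseline: always predict the mean of the target.
baseline_pred = [sum(y) / len(y)] * len(y)
baseline_scores = regression_metrics(y, baseline_pred)

# Hypothetical model predictions to compare against the baseline.
model_pred = [3.5, 4.5, 7.5, 8.5]
model_scores = regression_metrics(y, model_pred)

print(baseline_scores)
print(model_scores)
print(odds(0.8), logit(0.5))
```

By construction the mean baseline gets an R-squared of 0, so any R-squared above 0 means the model beats the baseline on that data.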
   
Linear Regression Analysis, first session on linear models_Day21 khalidpark2029.tistory.com/67
Tabular Data, classification and regression_Day21(2) khalidpark2029.tistory.com/68
Simple Linear Regression_Day21(3) khalidpark2029.tistory.com/69
How to calculate R Squared, coefficient of determination_Day22 khalidpark2029.tistory.com/70
Mean Square Error, root-mean-square deviation, residuals and errors_Day22(2) khalidpark2029.tistory.com/71
Training & Test, Bias & Variance_Day22(3) khalidpark2029.tistory.com/72
Overfitting and Underfitting_Day22(4) khalidpark2029.tistory.com/73
Bias/Variance, one more time_Day22(5) khalidpark2029.tistory.com/75
Regression error metrics, MAE, MSE, RMSE, MAPE, MPE_Day22(6) khalidpark2029.tistory.com/76
Regularization, Ridge regression_Day23 khalidpark2029.tistory.com/77
Multiple regression vs polynomial regression_Day23(2) khalidpark2029.tistory.com/78
Difference between fit_transform and transform, scikit-learn khalidpark2029.tistory.com/82
Summary of regression types khalidpark2029.tistory.com/83
Regularization, Lasso regression_Day23(3) khalidpark2029.tistory.com/85
Regularization, Elastic Net regression_Day23(4) khalidpark2029.tistory.com/86
Cross Validation (CV) and lambda, Regression_Day23(5) khalidpark2029.tistory.com/87
Overfitting, train, validation and test_Day24 khalidpark2029.tistory.com/88
Logistic Regression_Day24(2) khalidpark2029.tistory.com/89
Logistic Regression, Coefficients_Day24(3) khalidpark2029.tistory.com/90
Logistic Regression, Maximum Likelihood_Day24(4) khalidpark2029.tistory.com/91

 

02 - KEYWORD

  • Decision tree
  • Node
  • Root node, internal node, terminal node (external, leaf)
  • Edge
  • Gini impurity
  • Pipelines
  • Ways to address decision tree overfitting (min_samples_split, min_samples_leaf, max_depth)
  • Feature importances
  • Random forest
  • Bagging
  • Bootstrap
  • Aggregation - the process of combining the base models built from the bootstrap sets
  • Ordinal encoding
  • Confusion matrix
  • Accuracy
  • Precision
  • Recall
  • Threshold
  • ROC curve
  • AUC (area under the ROC curve)
  • Model selection
  • Cross-validation
  • Hyperparameter tuning _ optimization, generalization
  • Validation curve
  • RandomizedSearchCV
  • GridSearchCV
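
The confusion matrix, accuracy, precision, recall and threshold keywords can be tied together with a small example. This is only a sketch in plain Python, not the scikit-learn API; the helper names (confusion_counts, classification_scores) and the toy labels and probabilities are assumptions made up for illustration.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN after turning probabilities into 0/1 predictions."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_scores(y_true, y_prob, threshold=0.5):
    """Return accuracy, precision and recall at the given threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels and predicted probabilities (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]

default_scores = classification_scores(y_true, y_prob, threshold=0.5)
lowered_scores = classification_scores(y_true, y_prob, threshold=0.35)

print(default_scores)
print(lowered_scores)
```

On this toy data, lowering the threshold from 0.5 to 0.35 catches the one positive that was missed, so recall rises. Sweeping the threshold this way and tracking the true and false positive rates is the trade-off the ROC curve plots.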

   
Decision Trees_Day26 khalidpark2029.tistory.com/92
Random Forest_Day27 khalidpark2029.tistory.com/94
Confusion Matrix_Day28 khalidpark2029.tistory.com/101
Decision tree, random forest, confusion matrix, cross-validation: keyword concept summary khalidpark2029.tistory.com/104
Confusion matrix, accuracy, precision, recall: one more summary khalidpark2029.tistory.com/105

 

03 - KEYWORD

  • Leakage (information leakage)
  • Data wrangling
  • Feature importances
  • 1) MDI (Mean Decrease Impurity)
  • 2) Drop-Column Importance
  • 3) Permutation importance (MDA, Mean Decrease Accuracy)
  • eli5 library
  • AdaBoost
  • XGBoost
  • LightGBM
  • PDP (Partial Dependence Plot)
  • SHAP
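
Of the three feature importance methods listed above, permutation importance is the easiest to sketch by hand. The code below is a minimal illustration in plain Python, not the eli5 or scikit-learn implementation. The toy_model function, the data, and the n_repeats and seed parameters are all assumptions made up for the example. The idea is to shuffle one feature column at a time and measure how much accuracy drops.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows where the model's prediction matches the label."""
    correct = sum(1 for row, label in zip(X, y) if model(row) == label)
    return correct / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            X_perm = [row[:col] + [value] + row[col + 1:]
                      for row, value in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted model: uses feature 0 only and ignores feature 1."""
    return 1 if row[0] > 0.5 else 0


# Toy data (illustrative only): labels follow the same rule as the model.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.4, 2.0],
     [0.6, 3.0], [0.7, 6.0], [0.8, 0.0], [0.9, 7.0]]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the toy model ignores feature 1, shuffling that column cannot change any prediction and its importance is exactly 0, while shuffling feature 0 lowers accuracy. In practice the drop is usually measured on a validation set rather than on the training data.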
   
Classification accuracy and class imbalance, the problem with focusing only on accuracy_Day 31 khalidpark2029.tistory.com/107
How to use pandas groupby_Day 32 khalidpark2029.tistory.com/108
bootstrap, bagging review_Day33 khalidpark2029.tistory.com/109
AdaBoost, decision tree, random tree_Day33(2) khalidpark2029.tistory.com/110
Gradient Boost, for Regression_Day33(3) khalidpark2029.tistory.com/113
Partial Dependence Plot (PDP)_Day34 khalidpark2029.tistory.com/114

 

04 - Project

  • NBA machine learning prediction model project

 

 

   
Project) NBA machine learning prediction model_Day36~39 khalidpark2029.tistory.com/115
Completed) NBA machine learning prediction model project_Day40 khalidpark2029.tistory.com/119
Feedback) NBA machine learning prediction model project_Day40 khalidpark2029.tistory.com/120

 
