1-sample vs 2-sample chi-square code

[1-sample chi-square]

Purpose: check whether the given data follow a uniform distribution

Expected => sum / number of data points
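
Written out as a formula, the expected count and the test statistic look like this. The symbols are notation introduced here for illustration and do not appear in the original post: O_i is the observed count in category i, E_i the expected count, and k the number of categories.

```latex
E_i = \frac{1}{k}\sum_{j=1}^{k} O_j,
\qquad
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad
\mathrm{df} = k - 1
```

For the data used in this post, k = 6 and the counts sum to 64, so every E_i is 64 / 6 ≈ 10.67.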

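import numpy as np
from scipy import stats  # needed for stats.chi2 below

data = np.array([10, 11, 10, 12, 10, 11])

# Expected count per category under a uniform distribution
exp = np.sum(data) / len(data)  # 64 / 6 = 10.666..., the same value for every category

chi = np.sum(np.power(data - exp, 2) / exp)  # chi-square statistic = 0.3125
print(chi)

# Upper-tail p-value with df = number of categories - 1
print(1 - stats.chi2.cdf(chi, df=len(data) - 1))  # p-value: 0.9974013615235537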

from scipy.stats import chisquare

data = np.array([10, 11, 10, 12, 10, 11])

# With no f_exp given, chisquare assumes equal expected frequencies
print(chisquare(data))  # statistic=0.3125, pvalue=0.9974013615235537
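
The manual steps above can be wrapped in a small function that works for any number of categories. This is only a sketch: the name `uniform_chisquare` is hypothetical, and `stats.chi2.sf` (the survival function) is used here in place of `1 - cdf`, which gives the same upper-tail probability.

```python
import numpy as np
from scipy import stats


def uniform_chisquare(observed):
    """Chi-square goodness-of-fit test against a uniform distribution.

    Hypothetical helper for illustration; returns (statistic, p_value).
    """
    observed = np.asarray(observed, dtype=float)
    k = observed.size
    # Under a uniform distribution every category expects the mean count
    expected = np.full(k, observed.sum() / k)
    statistic = np.sum((observed - expected) ** 2 / expected)
    # Upper-tail probability with k - 1 degrees of freedom
    p_value = stats.chi2.sf(statistic, df=k - 1)
    return statistic, p_value


data = np.array([10, 11, 10, 12, 10, 11])
statistic, p_value = uniform_chisquare(data)
reference = stats.chisquare(data)

print(statistic, p_value)
# Both comparisons should agree with SciPy's built-in test
print(np.isclose(statistic, reference.statistic),
      np.isclose(p_value, reference.pvalue))
```

A p-value this close to 1 means the data give no evidence against the uniform distribution.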

[2-sample chi-square]

Purpose: check whether two sets of data are associated (frequency-based)

Expected = rowsum * colsum / totalsum
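
In formula form, with notation assumed here for illustration (O_ij is the observed count in row i and column j, R_i the row sum, C_j the column sum, N the grand total, r and c the number of rows and columns):

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
\mathrm{df} = (r - 1)(c - 1)
```

For the 2x2 table in this post, N = 52 and the top-left expected count is (10 + 12)(10 + 14) / 52 = 528 / 52 ≈ 10.15.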

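import numpy as np
from scipy import stats

data = np.array([
    [10, 12],
    [14, 16],
])

# Expected count for each cell: row sum * column sum / total sum
exp = np.array([
    [(10+12)*(10+14), (10+12)*(12+16)],
    [(14+16)*(10+14), (14+16)*(12+16)],
]) / np.sum(data)

chi = np.sum(np.power(data - exp, 2) / exp)  # 0.007503607503607451

# df = (rows - 1) * (columns - 1)
print(1 - stats.chi2.cdf(chi, df=(2-1)*(2-1)))  # 0.9309708924815491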

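from scipy.stats import chi2_contingency

data = np.array([
    [10, 12],
    [14, 16],
])

# correction=False turns off Yates' continuity correction, which SciPy applies
# to 2x2 tables by default, so the result matches the manual calculation
print(chi2_contingency(data, correction=False))  # chi = 0.007503607503607451, pvalue = 0.9309708924815491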
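
The hand-typed expected table only works for this one 2x2 example. As a sketch of how the same rowsum * colsum / totalsum rule could extend to a table of any size, `np.outer` can build every expected count at once. The function name `independence_chisquare` is hypothetical and not from the original post.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency


def independence_chisquare(table):
    """Chi-square test of independence without continuity correction.

    Hypothetical helper for illustration;
    returns (statistic, p_value, dof, expected).
    """
    table = np.asarray(table, dtype=float)
    row_sums = table.sum(axis=1)
    col_sums = table.sum(axis=0)
    # Outer product yields rowsum * colsum for every cell
    expected = np.outer(row_sums, col_sums) / table.sum()
    statistic = np.sum((table - expected) ** 2 / expected)
    dof = (table.shape[0] - 1) * (table.shape[1] - 1)
    p_value = chi2.sf(statistic, df=dof)
    return statistic, p_value, dof, expected


data = np.array([
    [10, 12],
    [14, 16],
])
statistic, p_value, dof, expected = independence_chisquare(data)
ref_stat, ref_p, ref_dof, ref_expected = chi2_contingency(data, correction=False)

print(statistic, p_value, dof)
print(expected)
# The manual result should agree with SciPy's output
print(np.isclose(statistic, ref_stat), np.allclose(expected, ref_expected))
```

The expected table that `chi2_contingency` returns as its fourth value should equal the one built here with `np.outer`.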
