
Non-parametric tests

  • Use these when the data do not meet the normality assumption or when the sample is small.
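
One common way to judge the normality condition is the Shapiro-Wilk test from base R. The sketch below is only an illustration: the simulated vector `skewed` is a hypothetical stand-in, not data from this post, and the 0.05 cut-off is a convention rather than a rule.

```r
# Hypothetical right-skewed sample (simulated, not taken from the acs data)
set.seed(1)
skewed <- rexp(30, rate = 1)

# Shapiro-Wilk test: the null hypothesis is that the sample is normal
sw <- shapiro.test(skewed)
print(sw$p.value)

# A small p-value argues against normality and points to a rank-based test
prefer_nonparametric <- sw$p.value < 0.05
print(prefer_nonparametric)
```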

1. Mann-Whitney test (or Wilcoxon rank-sum test)

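  • Nonparametric counterpart of the independent-samples t-test (difference in the mean of a numeric variable between the 2 groups of one categorical variable)
  • There are two ways to run it.
# Prepare the data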
library(moonBook)
data(acs)

acs2 <- acs[1:100, ]
head(acs2)
##   age    sex cardiogenicShock   entry              Dx   EF height weight
## 1  62   Male               No Femoral           STEMI 18.0    168     72
## 2  78 Female               No Femoral           STEMI 18.4    148     48
## 3  76 Female              Yes Femoral           STEMI 20.0     NA     NA
## 4  89 Female               No Femoral           STEMI 21.8    165     50
## 5  56   Male               No  Radial          NSTEMI 21.8    162     64
## 6  73 Female               No  Radial Unstable Angina 22.0    153     59
##        BMI obesity  TC LDLC HDLC  TG  DM HBP smoking
## 1 25.51020     Yes 215  154   35 155 Yes  No  Smoker
## 2 21.91381      No  NA   NA   NA 166  No Yes   Never
## 3       NA      No  NA   NA   NA  NA  No Yes   Never
## 4 18.36547      No 121   73   20  89  No  No   Never
## 5 24.38653      No 195  151   36  63 Yes Yes  Smoker
## 6 25.20398     Yes 184  112   38 137 Yes Yes   Never
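# Method 1: Mann-Whitney test with a formula
#  - Give the column names with a tilde (~) and set the exact option (a logical indicating whether an exact p-value should be computed.)
# **** exact option - when the values are ranked, ties make the ordering ambiguous, so the exact test cannot produce a p-value; R warns and falls back to a normal approximation. If exact is left unset, R attempts the exact test only when both groups have fewer than 50 values. Passing exact = FALSE requests the normal approximation directly, so the exact test is never attempted and the warning goes away.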
wilcox.test( BMI ~ sex, data = acs2, exact = FALSE ) 
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  BMI by sex
## W = 946, p-value = 0.6327
## alternative hypothesis: true location shift is not equal to 0
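
The tie warning described in the comment above can be reproduced on a toy example. This is a minimal sketch with hypothetical vectors `a` and `b` (not the acs data). It captures the warning text that a default call raises, then shows that `exact = FALSE` runs cleanly.

```r
# Hypothetical small samples; the value 3 appears in both, so there is a tie
a <- c(1, 2, 3, 4, 5)
b <- c(3, 6, 7, 8)

# Default call: both samples are small, so R tries the exact test,
# hits the tie, and warns before falling back to the normal approximation
msg <- tryCatch(
  { wilcox.test(a, b); "no warning" },
  warning = function(w) conditionMessage(w)
)
print(msg)

# exact = FALSE skips the exact test, so no warning is raised
res <- wilcox.test(a, b, exact = FALSE)
print(res$p.value)
```
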
# Method 2: Mann-Whitney test with two vectors
# - Index each of the 2 groups (levels) into its own vector, then pass both vectors
# - exact = FALSE can be omitted here.
male_bmi <- acs2[ acs2$sex == "Male", c("BMI")]
female_bmi <- acs2[ acs2$sex == "Female", c("BMI")]
wilcox.test(male_bmi, female_bmi)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  male_bmi and female_bmi
## W = 1069, p-value = 0.6327
## alternative hypothesis: true location shift is not equal to 0
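
The two methods report different W values (946 and 1069) with the same p-value. W counts, over all cross-group pairs, how often a value from the first sample exceeds one from the second, so it depends on which group is passed first. The two versions add up to the product of the group sizes. A small sketch with hypothetical vectors (not the acs data):

```r
x <- c(1.1, 2.3, 3.8, 4.0, 5.6)
y <- c(2.9, 6.1, 7.4, 8.2)

w_xy <- unname(wilcox.test(x, y)$statistic)  # x first: W = 3
w_yx <- unname(wilcox.test(y, x)$statistic)  # y first: W = 17

# The two statistics sum to n_x * n_y = 5 * 4 = 20
print(w_xy + w_yx == length(x) * length(y))

# Swapping the groups leaves the two-sided p-value unchanged
print(all.equal(wilcox.test(x, y)$p.value, wilcox.test(y, x)$p.value))
```

By the same logic, 946 + 1069 = 2015 in the output above should equal the product of the number of non-missing BMI values in the two sex groups.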

2. Wilcoxon signed-rank test

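  • Nonparametric counterpart of the paired-samples t-test (difference in means between before/after measurements)
# Prepare the data (before/after measurements, so both vectors must have the same length)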
x1 <- c(51.4, 52.0, 45.5, 54.5, 52.3, 50.9, 52.7, 50.3, 53.8, 53.1)
x2 <- c(50.1, 51.5, 45.9, 53.1, 51.8, 50.3, 52.0, 49.9, 52.5, 53.0)

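# Wilcoxon signed-rank test: the nonparametric counterpart of the paired t-test
wilcox.test(x1, x2,
            alternative = "greater", paired = TRUE, conf.level = 0.95, # 95% confidence level -> 5% significance level
            exact = FALSE) # added because the call complained without it (the paired differences contain ties)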
## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  x1 and x2
## V = 52.5, p-value = 0.006172
## alternative hypothesis: true location shift is greater than 0
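
With `paired = TRUE`, the test is run on the pairwise differences, so it should match a one-sample signed-rank test on `x1 - x2`. The sketch below checks this with the same two vectors; the names `paired_res` and `diff_res` are introduced only for illustration.

```r
x1 <- c(51.4, 52.0, 45.5, 54.5, 52.3, 50.9, 52.7, 50.3, 53.8, 53.1)
x2 <- c(50.1, 51.5, 45.9, 53.1, 51.8, 50.3, 52.0, 49.9, 52.5, 53.0)

# Paired call, as in the post
paired_res <- wilcox.test(x1, x2, alternative = "greater",
                          paired = TRUE, exact = FALSE)

# One-sample call on the differences (H0: the shift is 0)
diff_res <- wilcox.test(x1 - x2, alternative = "greater", exact = FALSE)

# Both the V statistic and the p-value should agree
print(c(paired = unname(paired_res$statistic),
        diff   = unname(diff_res$statistic)))
print(all.equal(paired_res$p.value, diff_res$p.value))
```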

3. Kruskal-Wallis test and post-hoc analysis

  • Nonparametric counterpart of one-way ANOVA
# Prepare the data
head(acs2)
##   age    sex cardiogenicShock   entry              Dx   EF height weight
## 1  62   Male               No Femoral           STEMI 18.0    168     72
## 2  78 Female               No Femoral           STEMI 18.4    148     48
## 3  76 Female              Yes Femoral           STEMI 20.0     NA     NA
## 4  89 Female               No Femoral           STEMI 21.8    165     50
## 5  56   Male               No  Radial          NSTEMI 21.8    162     64
## 6  73 Female               No  Radial Unstable Angina 22.0    153     59
##        BMI obesity  TC LDLC HDLC  TG  DM HBP smoking
## 1 25.51020     Yes 215  154   35 155 Yes  No  Smoker
## 2 21.91381      No  NA   NA   NA 166  No Yes   Never
## 3       NA      No  NA   NA   NA  NA  No Yes   Never
## 4 18.36547      No 121   73   20  89  No  No   Never
## 5 24.38653      No 195  151   36  63 Yes Yes  Smoker
## 6 25.20398     Yes 184  112   38 137 Yes Yes   Never
# 1. Kruskal-Wallis test
kruskal.test( weight ~ factor( smoking ), data=acs2)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  weight by factor(smoking)
## Kruskal-Wallis chi-squared = 11.699, df = 2, p-value = 0.002882
# 2. Post-hoc analysis
# - Run this when the Kruskal-Wallis test indicates that at least one of the 3 groups differs
# 2_1. PMCMR package: Kruskal-Wallis test -> Nemenyi post-hoc test with the Tukey distribution
# install.packages("PMCMR")
library(PMCMR)
## PMCMR is superseded by PMCMRplus and will be no longer maintained. You may wish to install PMCMRplus instead.
posthoc.kruskal.nemenyi.test(x=acs2$weight, g=as.factor(acs2$smoking), method="Tukey")
## Warning in posthoc.kruskal.nemenyi.test.default(x = acs2$weight, g =
## as.factor(acs2$smoking), : Ties are present, p-values are not corrected.
## 
##  Pairwise comparisons using Tukey and Kramer (Nemenyi) test  
##                    with Tukey-Dist approximation for independent samples 
## 
## data:  acs2$weight and as.factor(acs2$smoking) 
## 
##        Ex-smoker Never 
## Never  0.2784    -     
## Smoker 0.3899    0.0019
## 
## P value adjustment method: none
# install.packages("PMCMRplus") # loading PMCMR prints a message recommending this package instead (PMCMR will no longer be maintained)
# library(PMCMRplus)
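
If installing PMCMR or PMCMRplus is not an option, base R offers a different post-hoc route: pairwise Wilcoxon rank-sum tests with a multiplicity adjustment via `pairwise.wilcox.test()`. This is not the Nemenyi test used above, so its p-values will differ. The sketch uses the built-in `airquality` data rather than `acs2` so that it runs without moonBook. Holm is one reasonable choice of adjustment, not the only one.

```r
# Overall test first: does ozone differ across months?
kw <- kruskal.test(Ozone ~ Month, data = airquality)
print(kw$p.value)

# Pairwise rank-sum tests with Holm-adjusted p-values.
# exact = FALSE is passed on to wilcox.test() to avoid the tie warnings.
ph <- pairwise.wilcox.test(airquality$Ozone, airquality$Month,
                           p.adjust.method = "holm", exact = FALSE)

# Lower-triangular matrix of adjusted p-values, one cell per pair of months
print(ph$p.value)
```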



# 2_2. nparcomp (nonparametric comparisons) package - mctp() function
# This also appears to use Tukey-type contrasts.
# - Estimation Method: Global Pseudo ranks 
# - Type of Contrast : Tukey 

# install.packages("nparcomp")
require(nparcomp)
## Loading required package: nparcomp
## Loading required package: multcomp
## Loading required package: mvtnorm
## Loading required package: survival
## Loading required package: TH.data
## Loading required package: MASS
## 
## Attaching package: 'TH.data'
## The following object is masked from 'package:MASS':
## 
##     geyser
result <- mctp(weight ~ smoking, data = acs2)
## 
##  #----------------Nonparametric Multiple Comparisons for relative effects---------------# 
##  
##  - Alternative Hypothesis:  True differences of relative effects are less or equal than 0 
##  - Estimation Method:  Global Pseudo Ranks 
##  - Type of Contrast : Tukey 
##  - Confidence Level: 95 % 
##  - Method = Fisher with 33 DF 
##  
##  #--------------------------------------------------------------------------------------# 
## 
summary(result) # check the pairwise p-values in the Analysis section
## 
##  #----------------Nonparametric Multiple Comparisons for relative effects---------------# 
##  
##  - Alternative Hypothesis:  True differences of relative effects are less or equal than 0 
##  - Estimation Method: Global Pseudo ranks 
##  - Type of Contrast : Tukey 
##  - Confidence Level: 95 % 
##  - Method = Fisher with 33 DF 
##  
##  #--------------------------------------------------------------------------------------# 
##  
##  #----Data Info-------------------------------------------------------------------------# 
##              Sample Size    Effect     Lower     Upper
## Ex-smoker Ex-smoker   20 0.5054632 0.4243261 0.5863135
## Never         Never   39 0.3836336 0.3237192 0.4473026
## Smoker       Smoker   37 0.6109032 0.5455423 0.6725069
## 
##  #----Contrast--------------------------------------------------------------------------# 
##                    Ex-smoker Never Smoker
## Never - Ex-smoker         -1     1      0
## Smoker - Ex-smoker        -1     0      1
## Smoker - Never             0    -1      1
## 
##  #----Analysis--------------------------------------------------------------------------# 
##                    Estimator  Lower Upper Statistic     p.Value
## Never - Ex-smoker     -0.122 -0.309 0.074    -1.522 0.289976447
## Smoker - Ex-smoker     0.105 -0.093 0.296     1.296 0.403133946
## Smoker - Never         0.227  0.081 0.364     3.767 0.001777051
## 
##  #----Overall---------------------------------------------------------------------------# 
##   Quantile     p.Value
## 1 2.443167 0.001777051
## 
##  #--------------------------------------------------------------------------------------#
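
The "relative effect" reported by nparcomp can be read as a probability. For two samples X and Y it is commonly defined as p = P(X < Y) + 0.5 P(X = Y), where 0.5 means neither group tends to be larger. The sketch below computes this two-sample version by hand on hypothetical vectors. As far as I understand, `mctp()` instead measures each group against an average of all group distributions (via pseudo-ranks), so treat this as an aid to reading the output, not a reimplementation.

```r
x <- c(1.1, 2.3, 3.8, 4.0, 5.6)
y <- c(2.9, 6.1, 7.4, 8.2)

# Compare every x with every y: 1 if x < y, 0.5 if tied, 0 otherwise
less <- outer(x, y, "<")
tied <- outer(x, y, "==")
p_hat <- mean(less + 0.5 * tied)
print(p_hat)  # 17 of the 20 pairs have x < y, so 0.85

# Same number from the Mann-Whitney statistic with y as the first sample
w_yx <- unname(wilcox.test(y, x)$statistic)
print(w_yx / (length(x) * length(y)))
```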

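What about mytable()? (no post-hoc analysis)

  • method = 1 : assumes normality (parametric) -> reports mean ± standard deviation per group and runs a t-test (2 groups) or ANOVA (3 or more groups)
  • method = 2 : assumes a non-normal distribution (nonparametric) -> reports median [Q1; Q3] and runs a rank-based test: Mann-Whitney (2 groups) or Kruskal-Wallis (3 or more groups)
  • method = 3 : first runs a normality test on the residuals (Shapiro-Wilk) and then picks the parametric or nonparametric route accordingly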
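
The decision rule described for method = 3 can be sketched in base R. This is not moonBook's actual implementation. `choose_test()` is a hypothetical helper, the 0.05 threshold is an assumption, and the built-in `PlantGrowth` data stand in for `acs2` so that the example runs on its own.

```r
# Hypothetical helper: test residual normality, then pick ANOVA or Kruskal-Wallis
choose_test <- function(formula, data, alpha = 0.05) {
  fit <- lm(formula, data = data)
  # Shapiro-Wilk on the residuals of the group-means model
  normal <- shapiro.test(residuals(fit))$p.value >= alpha
  if (normal) {
    list(method = "one-way ANOVA",
         p.value = anova(fit)[["Pr(>F)"]][1])
  } else {
    list(method = "Kruskal-Wallis",
         p.value = kruskal.test(formula, data = data)$p.value)
  }
}

# 3 groups (ctrl, trt1, trt2) compared on plant weight
res <- choose_test(weight ~ group, data = PlantGrowth)
print(res$method)
print(res$p.value)
```
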
head(acs2)
##   age    sex cardiogenicShock   entry              Dx   EF height weight
## 1  62   Male               No Femoral           STEMI 18.0    168     72
## 2  78 Female               No Femoral           STEMI 18.4    148     48
## 3  76 Female              Yes Femoral           STEMI 20.0     NA     NA
## 4  89 Female               No Femoral           STEMI 21.8    165     50
## 5  56   Male               No  Radial          NSTEMI 21.8    162     64
## 6  73 Female               No  Radial Unstable Angina 22.0    153     59
##        BMI obesity  TC LDLC HDLC  TG  DM HBP smoking
## 1 25.51020     Yes 215  154   35 155 Yes  No  Smoker
## 2 21.91381      No  NA   NA   NA 166  No Yes   Never
## 3       NA      No  NA   NA   NA  NA  No Yes   Never
## 4 18.36547      No 121   73   20  89  No  No   Never
## 5 24.38653      No 195  151   36  63 Yes Yes  Smoker
## 6 25.20398     Yes 184  112   38 137 Yes Yes   Never
library(moonBook)
# method = 2 : force the nonparametric tests
# - 2 groups: Mann-Whitney test; 3 groups: Kruskal-Wallis test
mytable(smoking ~ weight, data = acs2, method = 2) # p = 0.003, which matches the direct kruskal.test() result after rounding
## 
##               Descriptive Statistics by 'smoking'              
## ________________________________________________________________ 
##            Ex-smoker          Never            Smoker        p  
##              (N=20)           (N=42)           (N=38)     
## ---------------------------------------------------------------- 
##  weight 60.0 [54.0;69.5] 59.0 [50.0;64.0] 65.0 [60.0;71.0] 0.003
## ----------------------------------------------------------------
# - Kruskal-Wallis test run directly
kruskal.test( weight ~ factor( smoking ), data=acs2) #  0.002882
## 
##  Kruskal-Wallis rank sum test
## 
## data:  weight by factor(smoking)
## Kruskal-Wallis chi-squared = 11.699, df = 2, p-value = 0.002882
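# method = 3 : test the normality of the residuals first, then run the parametric or nonparametric test accordingly (2 groups: t-test or Mann-Whitney) (3 groups: ANOVA or Kruskal-Wallis)
mytable(smoking ~ weight, data = acs2, method = 3) # mean ± sd in the output means normality held and the parametric test was used; median [Q1; Q3] would mean the nonparametric test was used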
## 
##        Descriptive Statistics by 'smoking'      
## _________________________________________________ 
##          Ex-smoker     Never      Smoker      p  
##           (N=20)      (N=42)      (N=38)   
## ------------------------------------------------- 
##  weight 62.6 ± 10.2 57.7 ±  9.0 66.0 ±  9.7 0.001
## -------------------------------------------------
