Bayesian Test of Independence

14 Jul 2012 - Michael Tsikerdekis

Introduction

The following test behaves alot like the chi-square test of independence. It can work with ordinal, categorical and even dichotomous variables (any case that can give you a contingency table).

References

This method is based on the book "Bayesian Computation With R" by Jim Albert. If you want to learn more about the model and the code you can read the book or the article.

For the procedure you need R and the LearnBayes package that can be installed in R using the commandinstall.packages('LearnBayes').

Procedure Highlights

Input

You need to type the data for your contingency table or feed it to your tabledata variable. Additionally, you need to specify the rows and columns for your table.

 #---------------INPUT DATA------------------
 tabledata = c(6,9,40,34) # Enter data first row by row and then column by column
 tablerows = 3 # rows in the contigency table
 tablecolumns = 4 # columns in the contigency table
 #-------------------------------------------

The hypotheses are:

H0: There is no dependency between the two variables
H1: There is a dependency between the two variables
H~0: There is almost no dependency. This hypothesis tests for a model close to independence.

Output

 Your table: 
      [,1] [,2]
 [1,]    6   40
 [2,]    9   34
 
 The uniform table to be compared with your table: 
      [,1] [,2]
 [1,]    1    1
 [2,]    1    1
 
 -----------Results--------------
 Bayes Factor(BF10) for H1 Dependence over H0 Independence:  0.4660114 
 Bayes Factor(BF01) for H0 Independence over H1 Dependence:  2.14587 
 
 -----------Additional Model Results--------------
 Bayes factor in support of the model close to independence versus the model of independence:
   log.K log.BF   BF
 1     2  -1.76 0.17
 2     3  -0.50 0.61
 3     4  -0.25 0.78
 4     5  -0.07 0.93
 5     6  -0.02 0.98
 6     7   0.00 1.00
 
 -------Results--------------
 Bayes Factor(BF10) for H~0 Close to Independence over H0 Independence:  0.9954232 
 Bayes Factor(BF01) for H0 Independence over H~0 Close to Independence:  1.004598

You need to always verify that your table looks the way that it should. The code is still a bit buggy and sometimes rows get changed for columns. In case your table looks the opposite way, just change the numbers between rows and columns.

The code performs two analyses. The first, tests the independence hypothesis against the dependence hypothesis. The second analysis tests the hypothesis of Independence against the hypothesis close to independence.

Read the Bayes Factor page for how you should interpret these results.

In this example, H0 is 2.14 times more likely than H1. The evidence is not really strong however. Additionally, the second test failed to provide support for a model close to independence.

Code

 # clears workspace:  
 rm(list=ls(all=TRUE))
 
 #---------------INPUT DATA------------------
 tabledata = c(6,9,40,34) # Enter data first row by row and then column by column
 tablerows = 2 # rows in the contigency table
 tablecolumns = 2 # columns in the contigency table
 #-------------------------------------------
 
 library(LearnBayes)
 
 tablesize = c(tablecolumns,tablerows)
 data=matrix(tabledata,tablesize)
 cat("\r\nYour table: \r\n")
 print(data)
 
 #chisq.test(data)
 #fisher.test(data)
 totalrowscolumns = tablerows * tablecolumns
 
 a=matrix(rep(1,totalrowscolumns),tablesize)
 cat("\r\nThe uniform table to be compared with your table: \r\n")
 print(a)
 
 BF10 = ctable(data,a) #BF in support of the dependence hypothesis
 BF01 =  1 /BF10
 cat("\r\n-----------Results--------------\r\n")
 cat("Bayes Factor(BF10) for H1 Dependence over H0 Independence: ",BF10,"\r\n")
 cat("Bayes Factor(BF01) for H0 Independence over H1 Dependence: ",BF01,"\r\n")
 
 log.K=seq(2,7)
 compute.log.BF=function(log.K)
   log(bfindep(data,exp(log.K),100000)$bf)
 
 log.BF=sapply(log.K,compute.log.BF)
 BF=exp(log.BF)
 
 #BF in support of the alternative model close to independence
 #Bayes factor against independence assuming alternatives close to independence
 cat("\r\n-----------Additional Model Results--------------\r\n")
 cat("Bayes factor in support of the model close to independence versus the model of independence:\r\n")
 print(round(data.frame(log.K,log.BF,BF),2))
 
 #Plotting
 plot(log.K,log.BF)
 lines(log.K,log.BF)
 
 cat("\r\n-----------Results--------------\r\n")
 cat("Bayes Factor(BF~00) for H~0 Close to Independence over H0 Independence: ",max(BF),"\r\n")
 cat("Bayes Factor(BF0~0) for H0 Independence over H~0 Close to Independence: ",1/max(BF),"\r\n")