Bayesian Test of Independence
14 Jul 2012 - Michael Tsikerdekis
Introduction
The following test behaves a lot like the chi-square test of independence. It can work with ordinal, categorical and even dichotomous variables (any case that can give you a contingency table).
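For comparison, the classical tests can be run on the same kind of table with base R alone. The sketch below uses the 2x2 example table that appears in the Output section of this post; the variable names tab, chi and fis are my own, chosen only for illustration.

```r
# Classical counterparts of the Bayesian test, using base R only.
# `tab` is an illustrative name; the counts are the 2x2 example table from this post.
tab <- matrix(c(6, 9, 40, 34), nrow = 2, ncol = 2)

chi <- chisq.test(tab)   # Pearson chi-square test (Yates correction is the default for 2x2 tables)
fis <- fisher.test(tab)  # Fisher's exact test

cat("Chi-square p-value:  ", chi$p.value, "\n")
cat("Fisher exact p-value:", fis$p.value, "\n")
```

A large p-value here only means that independence was not rejected, whereas the Bayes factor used in this post can also express evidence in favour of independence.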
References
This method is based on the book "Bayesian Computation With R" by Jim Albert. If you want to learn more about the model and the code you can read the book or the article.
For the procedure you need R and the LearnBayes package, which can be installed in R using the command install.packages('LearnBayes').
Procedure Highlights
Input
Type the counts of your contingency table into the tabledata variable, or assign them to it from another source. You also need to specify the number of rows and columns of your table.
tabledata = c(6,9,40,34) # Enter the counts column by column (down the rows of the first column, then the next column)
tablerows = 2 # rows in the contingency table
tablecolumns = 2 # columns in the contingency table
#-------------------------------------------
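Because the order of the counts is easy to get wrong, here is a small base R sketch of how the vector is laid out into a table. It assumes the same example values as above; by_col and by_row are names I made up for the illustration.

```r
tabledata <- c(6, 9, 40, 34)
tablerows <- 2
tablecolumns <- 2

# matrix() fills column by column by default:
# 6 and 9 form the first column, 40 and 34 the second.
by_col <- matrix(tabledata, nrow = tablerows, ncol = tablecolumns)
print(by_col)

# If you prefer to type the counts row by row, use byrow = TRUE.
by_row <- matrix(c(6, 40, 9, 34), nrow = tablerows, ncol = tablecolumns, byrow = TRUE)

# Both ways of entering the data describe the same table.
stopifnot(identical(by_col, by_row))
```

Either way, the first row of the table is 6, 40 and the second row is 9, 34.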
The hypotheses are:
- H0: There is no dependency between the two variables
- H1: There is a dependency between the two variables
- H~0: There is almost no dependency. This hypothesis corresponds to a model close to independence.
Output
Your table:
     [,1] [,2]
[1,]    6   40
[2,]    9   34
The uniform table to be compared with your table:
     [,1] [,2]
[1,]    1    1
[2,]    1    1
-----------Results--------------
Bayes Factor(BF10) for H1 Dependence over H0 Independence: 0.4660114
Bayes Factor(BF01) for H0 Independence over H1 Dependence: 2.14587
-----------Additional Model Results--------------
Bayes factor in support of the model close to independence versus the model of independence:
  log.K log.BF   BF
1     2  -1.76 0.17
2     3  -0.50 0.61
3     4  -0.25 0.78
4     5  -0.07 0.93
5     6  -0.02 0.98
6     7   0.00 1.00
-----------Results--------------
Bayes Factor(BF~00) for H~0 Close to Independence over H0 Independence: 0.9954232
Bayes Factor(BF0~0) for H0 Independence over H~0 Close to Independence: 1.004598
Always verify that the printed table looks the way it should, because it is easy for rows and columns to end up swapped. If your table comes out transposed, swap the values of tablerows and tablecolumns.
The code performs two analyses. The first tests the independence hypothesis against the dependence hypothesis. The second tests the independence hypothesis against the hypothesis close to independence.
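To make the first analysis less of a black box, here is a minimal base R sketch of the Bayes factor for dependence over independence under uniform priors (a prior table of ones), written from the formula as I understand it from Albert's book. The names log_dirichlet and bf_dependence are hypothetical and not part of LearnBayes; for real work use ctable from the package.

```r
# Log of the Dirichlet normalising constant: sum(lgamma(a)) - lgamma(sum(a)).
log_dirichlet <- function(a) sum(lgamma(a)) - lgamma(sum(a))

# Bayes factor BF10 (dependence over independence) for a table of counts y,
# assuming uniform priors on the cell, row and column probabilities.
bf_dependence <- function(y) {
  I <- nrow(y)
  J <- ncol(y)
  # Log marginal likelihood under dependence: one Dirichlet over all I*J cells.
  # (The multinomial coefficient is left out because it cancels in the ratio.)
  log_m_dep <- log_dirichlet(c(y) + 1) - log_dirichlet(rep(1, I * J))
  # Log marginal likelihood under independence: separate Dirichlets
  # for the row margins and the column margins.
  log_m_ind <- (log_dirichlet(rowSums(y) + 1) - log_dirichlet(rep(1, I))) +
               (log_dirichlet(colSums(y) + 1) - log_dirichlet(rep(1, J)))
  exp(log_m_dep - log_m_ind)
}

tab <- matrix(c(6, 9, 40, 34), nrow = 2, ncol = 2)
BF10 <- bf_dependence(tab)
cat("BF10:", BF10, " BF01:", 1 / BF10, "\n")
```

For the example table this should agree with the BF10 of about 0.466 shown in the output above. The second analysis relies on simulation inside bfindep and is not reproduced here.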
Read the Bayes Factor page for how you should interpret these results.
In this example, the data are about 2.14 times more likely under H0 than under H1, which is only weak evidence for independence. Additionally, the second test gives a Bayes factor of about 1, so it fails to provide support for a model close to independence.
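The arithmetic behind this reading can be sketched in a few lines of base R. The equal prior odds used here are my assumption for illustration, not something the original analysis fixes; with different prior odds the posterior probability changes accordingly.

```r
BF10 <- 0.4660114          # Bayes factor for H1 over H0, taken from the output above
BF01 <- 1 / BF10           # Bayes factor for H0 over H1

prior_odds_H0 <- 1         # ASSUMPTION: H0 and H1 equally likely before seeing the data
post_odds_H0 <- BF01 * prior_odds_H0
post_prob_H0 <- post_odds_H0 / (1 + post_odds_H0)

cat("BF01:", round(BF01, 2), "\n")
cat("Posterior probability of H0 under equal prior odds:", round(post_prob_H0, 2), "\n")
```

Under that assumption the posterior probability of independence is roughly 0.68, which matches the reading that the evidence for H0 is modest.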
Code
rm(list=ls(all=TRUE)) # clears the workspace: removes all existing objects
#---------------INPUT DATA------------------
tabledata = c(6,9,40,34) # Enter the counts column by column (down the rows of the first column, then the next column)
tablerows = 2 # rows in the contingency table
tablecolumns = 2 # columns in the contingency table
#-------------------------------------------
library(LearnBayes)
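#Pass rows and columns explicitly so that the table is not transposed
data=matrix(tabledata,nrow=tablerows,ncol=tablecolumns) #filled column by column
cat("\r\nYour table: \r\n")
print(data)
#chisq.test(data)
#fisher.test(data)
totalrowscolumns = tablerows * tablecolumns
a=matrix(rep(1,totalrowscolumns),nrow=tablerows,ncol=tablecolumns) #uniform prior: a table of ones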
cat("\r\nThe uniform table to be compared with your table: \r\n")
print(a)
BF10 = ctable(data,a) #BF in support of the dependence hypothesis
BF01 = 1 /BF10
cat("\r\n-----------Results--------------\r\n")
cat("Bayes Factor(BF10) for H1 Dependence over H0 Independence: ",BF10,"\r\n")
cat("Bayes Factor(BF01) for H0 Independence over H1 Dependence: ",BF01,"\r\n")
#Bayes factor against independence, assuming alternatives close to independence,
#computed over a grid of values of the Dirichlet precision parameter K
log.K=seq(2,7)
compute.log.BF=function(log.K)
log(bfindep(data,exp(log.K),100000)$bf) #100000 = number of simulation draws
log.BF=sapply(log.K,compute.log.BF)
BF=exp(log.BF) #BF in support of the alternative model close to independence
cat("\r\n-----------Additional Model Results--------------\r\n")
cat("Bayes factor in support of the model close to independence versus the model of independence:\r\n")
print(round(data.frame(log.K,log.BF,BF),2))
#Plotting
plot(log.K,log.BF)
lines(log.K,log.BF)
cat("\r\n-----------Results--------------\r\n")
cat("Bayes Factor(BF~00) for H~0 Close to Independence over H0 Independence: ",max(BF),"\r\n")
cat("Bayes Factor(BF0~0) for H0 Independence over H~0 Close to Independence: ",1/max(BF),"\r\n")