Chi-Square Test of Independence (for categorical variables)

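To create a contingency table (the lung dataset comes from the survival package):

   library(survival)
   tab <- table(lung$sex, lung$ph.ecog)
   tab

To convert the counts to row percentages (each row sums to 100):

   row_perc <- prop.table(tab, margin = 1) * 100

To round all percentages to 1 decimal place:

   round(row_perc, 1)

To run the chi-square test of independence:

   chisq.test(tab)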

 

The p-value is much greater than 0.05, so we fail to reject the null hypothesis, i.e. the data give no evidence that sex and performance score are associated.
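To see where the test statistic comes from, the expected counts and the chi-square statistic can also be computed by hand and compared with the output of chisq.test(). The following is a minimal sketch, assuming lung is the dataset from the survival package; the names expected, chi_sq, dof and p_value are illustrative and not part of the original tutorial.

```r
library(survival)  # assumed source of the lung dataset

tab <- table(lung$sex, lung$ph.ecog)

# Expected count for each cell under independence:
# (row total * column total) / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Pearson chi-square statistic, degrees of freedom and p-value
chi_sq  <- sum((tab - expected)^2 / expected)
dof     <- (nrow(tab) - 1) * (ncol(tab) - 1)
p_value <- pchisq(chi_sq, df = dof, lower.tail = FALSE)

# chisq.test() may warn when some expected counts are small
res <- suppressWarnings(chisq.test(tab))
all.equal(unname(res$statistic), chi_sq)  # manual and built-in statistic agree
all.equal(res$p.value, p_value)           # manual and built-in p-value agree

# If any expected count is below 5, a simulated p-value is a common fallback
if (any(expected < 5)) {
  set.seed(1)  # arbitrary seed, only for reproducibility
  sim <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)
  print(sim$p.value)
}
```

The check on small expected counts matters here because very few patients fall in the highest ECOG category, so the chi-square approximation may be unreliable for that column.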

To look at the pattern more closely, we can plot a grouped bar chart:

rownames(tab) <- c("Male", "Female")
colnames(tab) <- c("0: Fully active",
                   "1: Restricted activity",
                   "2: Ambulatory but unable to work",
                   "3: Limited self-care")

barplot(tab,
        main = "ECOG Performance Status by Sex",
        xlab = "ECOG Status", ylab = "Frequency",
        col = c("skyblue", "salmon"),
        legend.text = rownames(tab), beside = TRUE, las = 2,
        cex.names = 0.8) # scale x-axis label text

 

Setting beside = FALSE gives a stacked bar chart instead:

barplot(tab,
        main = "ECOG Performance Status by Sex",
        xlab = "ECOG Status", ylab = "Frequency",
        col = c("skyblue", "salmon"), legend.text = rownames(tab),
        beside = FALSE, las = 2,
        cex.names = 0.8) # scale x-axis label text
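Because the two sexes need not have the same number of patients, raw frequencies can make the groups hard to compare. One option is to plot the row percentages instead of the counts. This is a sketch, assuming lung comes from the survival package and that sex is coded 1 = Male, 2 = Female as in the labels used in this tutorial; the title and axis labels are illustrative.

```r
library(survival)  # assumed source of the lung dataset

tab <- table(lung$sex, lung$ph.ecog)
rownames(tab) <- c("Male", "Female")

# Percentage of each ECOG status within each sex (each row sums to 100)
row_perc <- prop.table(tab, margin = 1) * 100

# Grouped bar chart of percentages rather than counts
barplot(row_perc,
        main = "ECOG Performance Status by Sex (row percentages)",
        xlab = "ECOG Status", ylab = "Percent within sex",
        col = c("skyblue", "salmon"),
        legend.text = rownames(row_perc), beside = TRUE)
```

With percentages on the y-axis, bars of similar height within an ECOG category indicate a similar distribution for the two sexes.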

 

The same grouped bar chart can be drawn with ggplot2:

library(ggplot2)

tab <- table(lung$sex, lung$ph.ecog)
rownames(tab) <- c("Male", "Female")
colnames(tab) <- c("Fully active",
                   "Restricted activity",
                   "Ambulatory but unable to work",
                   "Limited self-care")

# as.data.frame() on a table gives columns Var1 (sex), Var2 (ECOG status) and Freq
df <- as.data.frame(tab)

ggplot(df, aes(x = Var2, y = Freq, fill = Var1)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "ECOG Performance Status by Sex",
       x = "ECOG Status", y = "Frequency", fill = "Sex") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

 

 

 
