Creating Two-Way Tables

Is there an association between gender and self-image? The 2003-2004 National Health and Nutrition Examination Survey (NHANES) asked participants to describe their own self-image regarding weight. The options were Underweight, About Right, and Overweight.

One way to get the data is to load it from a comma-separated values (CSV) file into an R data frame and then convert it to a two-way table using the table() function.

self.image.df <- read.csv("https://bclins.github.io/spring26/math222/Examples/SelfImage.csv")
summary(self.image.df)
##     gender           self.image       
##  Length:5876        Length:5876       
##  Class :character   Class :character  
##  Mode  :character   Mode  :character
self.image <- table(self.image.df)
self.image
##         self.image
## gender   About Right Overweight Underweight
##   Female        1175       1730         116
##   Male          1469       1112         274

The other way is to manually enter the data as a matrix:

self.image.matrix <- matrix(c(116, 274, 1175, 1469, 1730, 1112), ncol = 2, byrow = TRUE)
colnames(self.image.matrix) <- c('Female', 'Male')
rownames(self.image.matrix) <- c('Underweight', 'About Right', 'Overweight')
self.image2 <- as.table(self.image.matrix)
self.image2
##             Female Male
## Underweight    116  274
## About Right   1175 1469
## Overweight    1730 1112

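Notice that the two tables are laid out differently: the rows and columns are swapped. That’s easy to fix using the t() function, which transposes the table (swaps the rows and columns). The self-image categories will still appear in a different order (alphabetical rather than the order typed in by hand), but the counts are the same.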

self.image <- t(self.image)
self.image
##              gender
## self.image    Female Male
##   About Right   1175 1469
##   Overweight    1730 1112
##   Underweight    116  274
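
If the row order matters, the rows can be rearranged by name. The following is only a minimal sketch: the counts are copied from the output above, and by.gender and by.image are illustrative names that do not appear elsewhere in these notes.

```r
# Counts copied from the table() output above; by.gender and by.image are
# illustrative names, not objects from the original example.
by.gender <- matrix(c(1175, 1730, 116,
                      1469, 1112, 274),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(gender = c("Female", "Male"),
                                    self.image = c("About Right", "Overweight",
                                                   "Underweight")))

# t() swaps rows and columns, so self-image becomes the row variable.
by.image <- t(by.gender)

# Index the rows by name to match the order used in the manually entered table.
by.image <- by.image[c("Underweight", "About Right", "Overweight"), ]
by.image
```

Indexing by name rather than by position avoids depending on whatever order table() happened to produce.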

Visualizing Two Categorical Variables

You can make either a bar graph or a mosaic plot for this data, and each can be oriented two ways:

Stacked Bar Graph

barplot(self.image, col = topo.colors(3), legend=T)

barplot(t(self.image), col = terrain.colors(3), legend=T)

Stacked Bar Graph Showing Proportions

barplot(prop.table(self.image, margin = 2), col = rainbow(3), legend=T)

barplot(prop.table(t(self.image), margin = 2), col = cm.colors(3), legend=T)
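
To see what margin = 2 does in the calls above, it may help to look at the numbers behind the bars. This is a small sketch with the counts typed in directly; counts and col.props are names chosen here for illustration.

```r
# Counts from the two-way table above (rows: self-image, columns: gender).
counts <- matrix(c(116, 274, 1175, 1469, 1730, 1112),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(self.image = c("Underweight", "About Right",
                                                "Overweight"),
                                 gender = c("Female", "Male")))

# margin = 2 divides each entry by its column total, so every column
# becomes a set of proportions that adds up to 1.
col.props <- prop.table(counts, margin = 2)
round(col.props, 3)

# Check: each column of proportions sums to 1.
colSums(col.props)
```

With margin = 1 the division would be by row totals instead, giving the gender breakdown within each self-image category.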

Mosaic Plots

mosaicplot(self.image, color = TRUE) # mosaicplot() has no legend argument

mosaicplot(t(self.image), color = cm.colors(3))

Chi-Squared Test

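To test the hypotheses

H0: Gender and self-image are independent (there is no association).
HA: Gender and self-image are associated.

we use the R command chisq.test().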

prop.test(self.image) # prop.test only works for a table with two columns.  
## 
##  3-sample test for equality of proportions without continuity correction
## 
## data:  self.image
## X-squared = 226.58, df = 2, p-value < 2.2e-16
## alternative hypothesis: two.sided
## sample estimates:
##    prop 1    prop 2    prop 3 
## 0.4444024 0.6087262 0.2974359
chisq.test(self.image)
## 
##  Pearson's Chi-squared test
## 
## data:  self.image
## X-squared = 226.58, df = 2, p-value < 2.2e-16
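
For reference, the statistic reported above can be reproduced by hand from the expected counts under independence. The sketch below types the counts in directly; the names counts, expected, x2, df, and p.value are chosen here for illustration.

```r
# Observed counts from the two-way table above.
counts <- matrix(c(116, 274, 1175, 1469, 1730, 1112),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(self.image = c("Underweight", "About Right",
                                                "Overweight"),
                                 gender = c("Female", "Male")))

# Expected count for each cell under independence:
# (row total * column total) / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
x2 <- sum((counts - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1).
df <- (nrow(counts) - 1) * (ncol(counts) - 1)

# Upper-tail probability from the chi-squared distribution.
p.value <- pchisq(x2, df, lower.tail = FALSE)

# Compare with the built-in test; the statistics should agree.
result <- chisq.test(counts)
c(by.hand = x2, built.in = unname(result$statistic))
```

The expected counts are also stored in the test result as result$expected, which is a convenient way to confirm that every expected count is large enough for the chi-squared approximation.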

As you can see, the result is highly significant (p-value < 2.2e-16), so we have strong evidence of an association between gender and self-image in the population. The test itself does not say where the difference lies, but the table does: 1730 of the 3021 women (about 57%) described themselves as overweight, compared with 1112 of the 2855 men (about 39%).