Create a ggalluvial plot in R

An alluvial diagram, which the ggalluvial package draws on top of ggplot2, shows how observations are distributed across the categories of several categorical variables. Each variable is drawn as an axis of stacked blocks (strata), and the flows between neighbouring axes show how many observations share each combination of categories.

To create a ggalluvial plot in R, you can follow these steps:
Step 1: Install and load the required packages
install.packages("ggplot2")
install.packages("ggalluvial")
library(ggplot2)
library(ggalluvial)
Step 2: Prepare the data
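The ggalluvial package accepts data in two layouts. In the wide (alluvia) layout used here, each row is one observation, or one group of identical observations, and each categorical variable sits in its own column. In the long (lodes) layout, each row is one observation at one axis, and an identifier column ties together the rows that belong to the same observation. A row identifier is only needed for the long layout, but it does no harm in the wide one.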
Here is an example data frame:
# create example data frame
data <- data.frame(
  id = c(1, 2, 3, 4, 5, 6),
  gender = c("Male", "Male", "Female", "Male", "Female", "Female"),
  age = c("18-24", "25-34", "35-44", "18-24", "25-34", "35-44"),
  country = c("USA", "Canada", "USA", "Canada", "Canada", "USA")
)
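To make the difference between the two layouts concrete, here is a minimal sketch that checks the example data frame with is_alluvia_form() and reshapes it with to_lodes_form(), both from ggalluvial. The new column names (variable, category, row) are chosen only for illustration.

```r
library(ggalluvial)

# Same example data as above, repeated so the snippet runs on its own
data <- data.frame(
  id = c(1, 2, 3, 4, 5, 6),
  gender = c("Male", "Male", "Female", "Male", "Female", "Female"),
  age = c("18-24", "25-34", "35-44", "18-24", "25-34", "35-44"),
  country = c("USA", "Canada", "USA", "Canada", "Canada", "USA")
)

# Check that columns 2 to 4 can be read as the axes of a wide (alluvia) table
wide_ok <- is_alluvia_form(data, axes = 2:4, silent = TRUE)

# Reshape to the long (lodes) layout: one row per observation per axis.
# The names "variable", "category" and "row" are illustrative choices.
lodes <- to_lodes_form(data, axes = 2:4,
                       key = "variable", value = "category", id = "row")
head(lodes)
```

In the long table, variable holds the axis name, category holds the value on that axis, and row marks which original observation each line came from.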
Step 3: Create the ggalluvial plot
# wide (alluvia) layout: map each categorical column to its own axis
ggplot(data = data,
       aes(axis1 = gender, axis2 = age, axis3 = country)) +
  geom_alluvium(aes(fill = country)) +
  geom_stratum() +
  ggtitle("Gender, Age, and Country") +
  theme(legend.position = "bottom")
The geom_alluvium() function creates the flowing paths that connect the different categories, and the geom_stratum() function adds the vertical bars that represent the categories. The ggtitle() function adds a title to the plot, and the theme() function moves the legend to the bottom.
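By default the strata are drawn without labels, so it can be hard to tell which block stands for which category. The sketch below is one possible way to label them, assuming the wide layout with the axis1, axis2 and axis3 aesthetics. The axis names passed to scale_x_discrete() are illustrative.

```r
library(ggplot2)
library(ggalluvial)

# Same example data as above, repeated so the snippet runs on its own
data <- data.frame(
  id = c(1, 2, 3, 4, 5, 6),
  gender = c("Male", "Male", "Female", "Male", "Female", "Female"),
  age = c("18-24", "25-34", "35-44", "18-24", "25-34", "35-44"),
  country = c("USA", "Canada", "USA", "Canada", "Canada", "USA")
)

p <- ggplot(data, aes(axis1 = gender, axis2 = age, axis3 = country)) +
  geom_alluvium(aes(fill = country)) +
  geom_stratum() +
  # write each category name inside its stratum
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), size = 3) +
  # name the three axes along the x-axis (labels are illustrative)
  scale_x_discrete(limits = c("Gender", "Age", "Country"),
                   expand = c(0.1, 0.1)) +
  theme(legend.position = "bottom")
p
```

Because no y aesthetic is given, every row counts as one observation, so the height of each block reflects how many rows fall into that category.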
For the next example, let’s use the diamonds dataset from the ggplot2 package:
data("diamonds")
Now let’s create a ggalluvial plot to visualize the relationship between cut, color, and price of diamonds:
ggplot(diamonds, aes(y = price, axis1 = cut, axis2 = color)) +
  geom_alluvium(aes(fill = cut), width = 0.1) +
  geom_stratum(width = 1/8, fill = "black", color = "grey") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)),
            size = 3, fontface = "bold", color = "white") +
  scale_fill_brewer(type = "qual", palette = "Set1") +
  theme_minimal() +
  labs(title = "Diamonds by Cut, Color, and Price",
       subtitle = "Data from ggplot2::diamonds")
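This code draws an alluvial plot with cut and color as the two axes. Because price is mapped to y, the height of each stratum and flow is proportional to the total price of the diamonds it contains, not to their count. The alluvia are colored by cut, and the strata are filled in black with white text labels.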
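The diamonds dataset has 53,940 rows, and passing them all to geom_alluvium() asks it to handle one alluvium per diamond, which can be slow to render. A possible workaround, sketched here with base R's aggregate(), is to sum price within each cut and color combination first and plot the much smaller summary table. The name diamonds_agg is illustrative.

```r
library(ggplot2)
library(ggalluvial)

# Sum price within each cut/color combination (base R, no extra packages)
diamonds_agg <- aggregate(price ~ cut + color,
                          data = ggplot2::diamonds, FUN = sum)

p <- ggplot(diamonds_agg, aes(y = price, axis1 = cut, axis2 = color)) +
  geom_alluvium(aes(fill = cut), width = 1/8) +
  geom_stratum(width = 1/8, fill = "black", color = "grey") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)),
            size = 3, color = "white") +
  scale_fill_brewer(type = "qual", palette = "Set1") +
  theme_minimal()
p
```

The summed prices are the same totals the full dataset would produce for each block, so the overall picture should look the same while far fewer shapes are drawn.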
You can customize the plot further by adjusting the parameters in the geom_alluvium(), geom_stratum(), and scale_fill_brewer() functions.