geom_hex()

In this document, I will introduce the geom_hex() function and show what it’s for.

# load the tidyverse (includes ggplot2)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0      ✔ purrr   1.0.1 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.5.0 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
#example dataset
library(palmerpenguins)
data(penguins)

# geom_hex() relies on the 'hexbin' package for its binning (stat_bin_hex())
library(hexbin)

What is it for?

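geom_hex() is a function from the ggplot2 package that draws a two-dimensional heatmap made of hexagons. It divides the plotting area into hexagonal bins, counts the observations that fall into each bin, and maps that count to the fill color. It is used to display the joint distribution of two continuous variables.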
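To make the counting step concrete, here is a minimal sketch that bins the penguins data with the hexbin package directly, without drawing anything. It illustrates the idea only; ggplot2 passes its own bounds and shape settings to hexbin, so the cells it draws will not necessarily match these exactly. The object names complete and hb are just illustrative.

```r
library(palmerpenguins)
library(hexbin)

# Drop rows where either variable is missing before binning
complete <- penguins[complete.cases(penguins$body_mass_g,
                                    penguins$flipper_length_mm), ]

# Bin the points into hexagonal cells (xbins = 30 is hexbin's default)
hb <- hexbin(complete$body_mass_g, complete$flipper_length_mm, xbins = 30)

# One entry per non-empty hexagon: its cell id and how many points it holds
head(data.frame(cell = hb@cell, count = hb@count))

# Every observation falls into exactly one hexagon
sum(hb@count) == nrow(complete)
```

The count column is what geom_hex() maps to the fill color by default.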

Example 1

Map two continuous variables to x and y in aes(), then add ‘geom_hex()’ to the ggplot() call. Darker and lighter hexagons show where observations are sparse and where they are concentrated.

ggplot(penguins) +
  aes(x = body_mass_g,
      y = flipper_length_mm) +
  labs(title = "Body Mass by Flipper Length",
       x = "Body Mass (g)",
       y = "Flipper Length (mm)") +
  geom_hex()

Example 2

Other options can be added to change the way the information is displayed on your graph.

The number of bins determines how many hexagons are laid out in both the horizontal and vertical directions: fewer bins give larger hexagons. The default is 30, but you can change it with the option ‘bins =’:

ggplot(penguins) +
  aes(x = body_mass_g,
      y = flipper_length_mm) +
  labs(title = "Body Mass by Flipper Length",
       x = "Body Mass (g)",
       y = "Flipper Length (mm)") +
  geom_hex(bins = 15)
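If you want to check what ‘bins =’ actually changes, one option is to pull the computed hexagons out of the plot with ggplot2's layer_data(). The sketch below compares the default with bins = 15; the names p_default, p_coarse, hex_default and hex_coarse are only illustrative. Both calls warn about the rows with missing values that are dropped.

```r
library(ggplot2)
library(palmerpenguins)

# Same mapping, two bin settings
base <- ggplot(penguins) +
  aes(x = body_mass_g, y = flipper_length_mm)

p_default <- base + geom_hex()           # bins = 30
p_coarse  <- base + geom_hex(bins = 15)  # larger hexagons

# layer_data() returns one row per non-empty hexagon,
# with its centre (x, y) and the number of observations in it (count)
hex_default <- layer_data(p_default)
hex_coarse  <- layer_data(p_coarse)

head(hex_coarse[, c("x", "y", "count")])

# Number of non-empty hexagons under each setting
nrow(hex_default)
nrow(hex_coarse)

# The same observations are binned either way, so the totals agree
sum(hex_default$count) == sum(hex_coarse$count)
```

Changing the number of bins only regroups the observations; it does not change how many of them are counted.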

Example 3

As with other functions in the ggplot2 package, several options can be combined: the ‘binwidth =’ argument of geom_hex(), ‘fill =’ within aes(), and a fill scale such as ‘scale_fill_gradient()’.

ggplot(penguins) +
  labs(title = "Body Mass by Flipper Length",
       x = "Body Mass (g)",
       y = "Flipper Length (mm)") +
  # count is the default fill; after_stat(density) switches it to density
  # (after_stat() replaces the '..density..' notation, deprecated in ggplot2 3.4.0)
  aes(x = body_mass_g,
      y = flipper_length_mm,
      fill = after_stat(density)) +
  # 'binwidth' sets the hexagon size horizontally and vertically, in data units
  geom_hex(binwidth = c(200, 2)) +
  # 'scale_fill_gradient' changes the colors of the fill scale
  scale_fill_gradient(low = "light blue", high = "blue")
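To see how density relates to the default count, here is a short sketch that reads both computed variables back from the plot with layer_data(). It uses after_stat(density), the current spelling of the density mapping, and the names p and hex_data are only illustrative.

```r
library(ggplot2)
library(palmerpenguins)

p <- ggplot(penguins) +
  aes(x = body_mass_g,
      y = flipper_length_mm,
      fill = after_stat(density)) +
  geom_hex(binwidth = c(200, 2))

# One row per non-empty hexagon, with both computed variables
hex_data <- layer_data(p)
head(hex_data[, c("x", "y", "count", "density")])

# density is each hexagon's share of all binned observations
all.equal(hex_data$density, hex_data$count / sum(hex_data$count))

# so the densities add up to 1
sum(hex_data$density)
```

Because density is just a rescaled count, switching between the two changes the legend values but not the pattern of light and dark hexagons.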

Is it helpful?

geom_hex() shows where your data is concentrated, which is hard to judge from a scatterplot when many points overlap. This makes the relationship between two continuous variables easier to read.