Let’s create pipopaplot because your plots deserve to sing 🎶
Overview
pipopaplot is an experimental R package for parameter mapping sonification.
Sonification refers to the use of non-speech audio to convey information or perceptualize data. While data visualization translates data into shapes, positions, and colors, parameter mapping sonification maps data into sound parameters such as pitch, velocity, duration, or timing.
pipopaplot is designed to work seamlessly with the ggplot2 ecosystem: you can use the same data-mapping logic that you use for visual plots, but instead of pixels, the output is musical notes.
The package provides a small set of functions to bridge between ggplot-like data frames and MIDI events:
as_notes() — extract note-like data from ggplot layers or data frames
rollup() — aggregate or transform notes before playback
sonify() — convert note data into MIDI-ready event sequences
(optionally) write_midi() — save the resulting notes as a .mid file using the built-in craigsapp/midifile library
Together, these functions allow you to hear your data, explore its temporal patterns, or even compose algorithmic music directly from statistical graphics.
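In outline, the pieces compose into a single pipeline. The calls below are a minimal sketch assuming default arguments; the real signatures may need more detail, so treat this as an illustration rather than the documented API:

library(ggplot2)
library(pipopaplot)

# Build an ordinary ggplot, then turn its layer data into sound.
gp <- ggplot(airquality, aes(Day, Temp)) + geom_line()

gp |>
  as_notes() |>           # extract note-like data from the plot
  rollup() |>             # aggregate or transform the notes
  sonify() |>             # map values to MIDI pitches, velocities, durations
  write_midi("temp.mid")  # save as a .mid file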
Usage
A typical workflow consists of three steps.
1. Create note data
You can start from a ggplot object or any data frame. Use as_notes() to extract or format the columns that correspond to sonification aesthetics:
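For example, starting from a plain data frame (the column names below, and the idea that as_notes() picks them up as-is, are assumptions for illustration):

library(pipopaplot)

# A small data frame whose columns stand in for sonification aesthetics.
df <- data.frame(
  x = 1:8,                            # position along the timeline
  y = c(3, 5, 4, 7, 6, 9, 8, 10)      # value to be mapped to pitch
)

notes <- as_notes(df)
head(notes)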
2. Aggregate or transform notes
Use rollup() to aggregate or transform the note data before playback. Custom ‘rollup’ functions can also be defined by users, as long as they return a data frame with the required note columns.
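For instance, a user-defined rollup might simply thin out the notes. The helper below and the way it is passed to rollup() are illustrative assumptions, not the package's documented interface:

# Hypothetical user-defined rollup: keep every other note row. Whatever it
# does, the function must return a data frame that still has the note
# columns expected downstream.
thin_notes <- function(notes) {
  notes[seq(1, nrow(notes), by = 2), , drop = FALSE]
}

# Assumed calling convention for supplying a custom rollup function.
notes <- rollup(notes, thin_notes)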
3. Convert to sound
Finally, use sonify() to map data values into MIDI pitches, velocities, and durations. The resulting object can be passed to write_midi() to save as a .mid file:
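A minimal sketch of this last step, assuming sonify()'s result can be handed straight to write_midi() with a filename:

events <- sonify(notes)          # map data values to pitch, velocity, duration
write_midi(events, "notes.mid")  # write a standard MIDI file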
The default pitch range corresponds to a 76-key piano (MIDI note numbers 27–102), and velocity is mapped to roughly mezzo-forte levels (60–100). Because values are rescaled internally, the input ranges are arbitrary, but all values must be finite.
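As an illustration of that rescaling (plain base R arithmetic, not the package's internal code), a linear map of arbitrary finite values onto the 27–102 pitch range looks like this:

x <- c(0.2, 1.5, 3.0, 4.8)                        # any finite input values
pitch <- 27 + (x - min(x)) / (max(x) - min(x)) * (102 - 27)
round(pitch)
#> [1]  27  48  73 102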
Examples
The example above sounds like this:
Below is another example using the ggplot2::diamonds dataset. We start with a ggplot density plot of carat by color and cut.
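The exact call isn't reproduced here, but a plot that matches the layer data shown below could be built roughly like this (the subset to two cut levels, and the choice of Fair and Ideal in particular, are assumptions):

library(ggplot2)

# Density of carat: one facet (panel) per color, one curve (group) per cut.
diamonds_sub <- subset(diamonds, cut %in% c("Fair", "Ideal"))
gp <- ggplot(diamonds_sub, aes(carat, group = cut)) +
  geom_density() +
  facet_wrap(vars(color))
gp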
as_notes() can be used to extract the data from a ggplot object; however, you sometimes need to specify explicitly which columns are extracted, because the variables available in the layer data depend on the type of geometry used.
dat <- get_layer_data(gp, 1)
str(dat)
#> 'data.frame': 7168 obs. of 19 variables:
#>  $ x          : num  0.23 0.231 0.232 0.234 0.235 ...
#>  $ density    : num  2.13 2.13 2.14 2.14 2.15 ...
#>  $ scaled     : num  0.97 0.972 0.974 0.976 0.978 ...
#>  $ ndensity   : num  0.97 0.972 0.974 0.976 0.978 ...
#>  $ count      : num  6.38 6.39 6.41 6.42 6.44 ...
#>  $ wdensity   : num  6.38 6.39 6.41 6.42 6.44 ...
#>  $ n          : int  3 3 3 3 3 3 3 3 3 3 ...
#>  $ flipped_aes: logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
#>  $ group      : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ PANEL      : Factor w/ 7 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ y          : num  1.46 1.46 1.46 1.46 1.46 ...
#>  $ ymin       : num  0 0 0 0 0 0 0 0 0 0 ...
#>  $ ymax       : num  1.46 1.46 1.46 1.46 1.46 ...
#>  $ colour     : chr  "black" "black" "black" "black" ...
#>  $ fill       : logi  NA NA NA NA NA NA ...
#>  $ weight     : num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ alpha      : logi  NA NA NA NA NA NA ...
#>  $ linewidth  : num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
#>  $ linetype   : int  1 1 1 1 1 1 1 1 1 1 ...
For example, the layer data from geom_density does not have a size column, so we need to specify which column goes to velocity here.
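A sketch of that call; the argument name velocity and the choice of ndensity as the source column are assumptions for illustration, not the package's documented signature:

# Point the velocity mapping at an existing layer-data column explicitly
# (hypothetical argument name).
notes <- as_notes(dat, velocity = "ndensity")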
rollup() here aggregates the note data by channel, group, and x, applying the max function. sonify() then transforms the aggregated data into a data frame of ‘note-on’ and ‘note-off’ events, grouping by channel (derived from PANEL, i.e., the faceting variable color). group (derived from cut) is used to lay out the notes along the timeline. As a result, the .mid file contains 2 phrases of 16 beats each.
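Conceptually, that rollup step behaves like the dplyr aggregation below. This is a sketch of the logic, assuming x is binned into 16 beats and ndensity is the aggregated value; it is not pipopaplot's implementation:

library(dplyr)

agg <- dat |>
  mutate(beat = ntile(x, 16)) |>                      # bin x into 16 beats
  group_by(channel = PANEL, group, beat) |>           # channel comes from PANEL
  summarise(value = max(ndensity), .groups = "drop")  # max within each cell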