`AlgebraOfGraphics.jl` can perform **statistical transformations** as layers with five functions:

- `expectation`: calculates the mean (*expectation*) of the underlying Y-axis column
- `frequency`: computes the frequency (*raw count*) of the underlying X-axis column
- `density`: computes the density (*distribution*) of the underlying X-axis column
- `linear`: computes a linear trend relationship between the underlying X- and Y-axis columns
- `smooth`: computes a smooth relationship between the underlying X- and Y-axis columns

Let’s first cover `expectation`:

```
plt = data(df) *
mapping(:name, :grade) *
expectation()
draw(plt)
```

Here, `expectation` adds a statistical transformation layer that tells `AlgebraOfGraphics.jl` to compute the mean of the Y-axis values for every unique X-axis value. In our case, it computed the mean of grades for every student. Note that we could safely omit the visual transformation layer (`visual(BarPlot)`) since it is the default visual transformation for `expectation`.
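To make the "mean of Y for every unique X" idea concrete, here is a minimal sketch in plain Julia of the quantity that `expectation` is described as computing. The `students` and `grades` vectors are hypothetical stand-ins for the `name` and `grade` columns, and this illustrates the computation rather than the library's internal code.

```julia
using Statistics

# Hypothetical data standing in for the `name` and `grade` columns of `df`
students = ["Sally", "Bob", "Sally", "Bob", "Alice"]
grades = [8.0, 6.0, 9.0, 7.0, 10.0]

# Mean of the Y values (grades) for each unique X value (student)
group_means = Dict(s => mean(grades[students .== s]) for s in unique(students))
```

Each key of `group_means` corresponds to one bar in the plot, and its value to the height of that bar.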

Next, we’ll show an example with `frequency`:

```
plt = data(df) *
mapping(:name) *
frequency()
draw(plt)
```

Here we are passing just a single positional argument to `mapping`, since this is the underlying column that `frequency` will use to calculate the raw count. Note that, as before, we could safely omit the visual transformation layer (`visual(BarPlot)`) since it is the default visual transformation for `frequency`.
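The raw count itself can be sketched in plain Julia as a tally of how often each value appears in the mapped column. The `students` vector is hypothetical data, and the loop only illustrates the idea, not the library's implementation.

```julia
# Hypothetical data standing in for the `name` column of `df`
students = ["Sally", "Bob", "Sally", "Bob", "Alice", "Sally"]

# Tally how many rows carry each unique value
counts = Dict{String,Int}()
for s in students
    counts[s] = get(counts, s, 0) + 1
end
```

Because only one column is tallied, a single positional argument in `mapping` is enough.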

Now, an example with `density`:

```
plt = data(df) *
mapping(:grade) *
density()
draw(plt)
```

Analogous to the previous examples, `density` does not need a visual transformation layer. Additionally, we only need to pass a single continuous variable as the only positional argument inside `mapping`. `density` will compute the distribution density of this variable; we can then fuse all the layers together and visualize the plot with `draw`.
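For intuition about what a density estimate of a single continuous variable looks like, here is a rough sketch of a hand-rolled Gaussian kernel density estimate. The helper `gaussian_kde`, the bandwidth `h`, and the `grades` values are all assumptions made for illustration; this is not how `density` is implemented, and the library chooses its own kernel and bandwidth.

```julia
using Statistics

# Hypothetical helper: Gaussian kernel density estimate at point `t`
# for the sample `xs`, with an assumed bandwidth `h`
gaussian_kde(xs, t, h) = mean(exp(-0.5 * ((t - x) / h)^2) / (h * sqrt(2π)) for x in xs)

# Hypothetical data standing in for the `grade` column of `df`
grades = [5.0, 6.5, 7.0, 8.0, 9.5]
h = 0.5

# Evaluate the estimate on a grid; these values trace the density curve
grid = 0.0:0.01:15.0
curve = [gaussian_kde(grades, t, h) for t in grid]

# A density integrates to approximately 1 (Riemann sum over the grid)
area = sum(curve) * step(grid)
```

The curve is highest where observations cluster, which is the shape the density plot displays.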

The last two statistical transformations, `linear` and `smooth`, cannot be used with the `*` operator alone. This is because `*` fuses two or more layers into a *single* layer, and `AlgebraOfGraphics.jl` cannot represent these transformations with a *single* layer. Hence, we need to **superimpose layers with the `+` operator**. First, let’s generate some data:

```
x = rand(1:5, 100)
y = x + rand(100) .* 2
synthetic_df = DataFrame(; x, y)
first(synthetic_df, 5)
```

| x | y |
|---|---|
| 1.0 | 1.96607080855035 |
| 5.0 | 6.993454790161161 |
| 5.0 | 5.709911916334441 |
| 3.0 | 4.544011304772464 |
| 4.0 | 5.382914898287261 |

Let’s begin with `linear`:

```
plt = data(synthetic_df) *
mapping(:x, :y) *
(visual(Scatter) + linear())
draw(plt)
```

We are using the **distributive property** (Section 7) so that the `data` and `mapping` layers are written only once, `a * (b + c) = (a * b) + (a * c)`, where:

- `a`: the `data` and `mapping` layers fused into a single layer
- `b`: the `visual` transformation layer
- `c`: the statistical `linear` transformation layer
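To see the two sides of that identity next to each other, the following sketch builds the plot both in factored and in expanded form. It assumes `AlgebraOfGraphics`, `CairoMakie`, and `DataFrames` are installed; the variable names `a`, `b`, `c`, `factored`, and `expanded` are only illustrative, and the data is regenerated here so the block is self-contained.

```julia
using AlgebraOfGraphics, CairoMakie, DataFrames

# Synthetic data in the same spirit as the example above
x = rand(1:5, 100)
y = x .+ rand(100) .* 2
synthetic_df = DataFrame(; x, y)

a = data(synthetic_df) * mapping(:x, :y)  # data and mapping fused into one layer
b = visual(Scatter)                       # visual transformation layer
c = linear()                              # statistical transformation layer

factored = a * (b + c)        # `a` written once
expanded = (a * b) + (a * c)  # `a` repeated for each superimposed layer

draw(expanded)
```

Both forms describe a scatter layer superimposed with a linear-trend layer; the factored form simply avoids repeating `data` and `mapping`.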

`linear` adds a linear trend between the X- and Y-axis mappings, with a shaded region for the 95% confidence interval.
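As a rough illustration of what a linear trend between two columns means, the sketch below computes an ordinary least-squares line by hand. It assumes the trend is a simple least-squares fit, uses hypothetical noise-free data, and does not reproduce the confidence interval; it is not the library's implementation.

```julia
using Statistics

# Hypothetical data lying exactly on the line y = 2x + 1
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = 2 .* x .+ 1

# Closed-form ordinary least-squares estimates for a single predictor
slope = cov(x, y) / var(x)
intercept = mean(y) - slope * mean(x)

# Fitted values along the trend line
fitted = intercept .+ slope .* x
```

With noisy data such as `synthetic_df`, the same formulas give the line that minimizes the squared vertical distances to the points.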

Finally, the same example as before, but now replacing `linear` with `smooth`:

```
plt = data(synthetic_df) *
mapping(:x, :y) *
(visual(Scatter) + smooth())
draw(plt)
```

`smooth` adds a smooth trend between the X- and Y-axis mappings.