R users are familiar with the pipe operator %>%, which chains operations together: the output of one operation is used as the input of the next, and so on.
In Julia, this can be accomplished with the @chain macro. To use it, we start with @chain followed by the DataFrame and a begin ... end block. The result of each operation inside the block is passed as input to the next operation, chaining the operations together.
Here is a simple example with a groupby followed by a @combine:
```julia
@chain leftjoined begin
    groupby(:name)
    @combine :mean_grade_2020 = mean(:grade_2020)
end
```
| name | mean_grade_2020 |
|---|---|
| Sally | 1.0 |
| Hank | 4.0 |
| Bob | 5.0 |
| Alice | 8.5 |
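To make the chaining explicit, here is a minimal sketch that writes the same pipeline with and without @chain and checks that both give the same result. The `leftjoined` defined here is a hypothetical stand-in, not the DataFrame built earlier; its values are chosen only to be consistent with the tables in this section.

```julia
using DataFrames, DataFramesMeta, Statistics

# Hypothetical stand-in for `leftjoined` (assumption: one row per name,
# with values consistent with the tables shown in this section).
leftjoined = DataFrame(
    name = ["Sally", "Hank", "Bob", "Alice"],
    grade_2020 = [1.0, 4.0, 5.0, 8.5],
    grade_2021 = [9.5, 6.0, 5.0, 5.0],
)

# With @chain: each result is passed on to the next operation.
chained = @chain leftjoined begin
    groupby(:name)
    @combine :mean_grade_2020 = mean(:grade_2020)
end

# The same steps written by hand, with an intermediate variable.
grouped = groupby(leftjoined, :name)
manual = @combine(grouped, :mean_grade_2020 = mean(:grade_2020))

# Both versions produce the same DataFrame.
@assert chained == manual
```

The manual version needs a name for every intermediate result, which is exactly the bookkeeping that @chain removes.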
NOTE:
@chain will pass the result of each operation as the first positional argument of the next one. This is not a problem in DataFrames.jl and DataFramesMeta.jl, since the DataFrame is always the first positional argument.
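For functions that expect their data in a different position, @chain also accepts an underscore `_` as a placeholder for the previous result. The sketch below illustrates both cases; the `leftjoined` here is again a hypothetical stand-in with made-up values, and the choice of `first` and `filter` is only for illustration.

```julia
using DataFrames, DataFramesMeta

# Hypothetical stand-in for `leftjoined` (assumption: illustrative values only).
leftjoined = DataFrame(
    name = ["Sally", "Hank", "Bob", "Alice"],
    grade_2020 = [1.0, 4.0, 5.0, 8.5],
)

# `first(df, n)` takes the data as its first argument,
# so @chain can fill it in without any placeholder.
top = @chain leftjoined begin
    @orderby -:grade_2020
    first(2)
end

# `filter(predicate, collection)` takes the data as its second argument,
# so `_` marks where the previous result should go.
high = @chain leftjoined begin
    _.grade_2020
    filter(>(4), _)
end
```

Here `top` holds the two rows with the highest 2020 grades, and `high` is the vector of 2020 grades greater than 4.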
We can also nest as many begin ... end blocks as we like inside the operations:
```julia
@chain leftjoined begin
    groupby(:name)
    @combine begin
        :mean_grade_2020 = mean(:grade_2020)
        :mean_grade_2021 = mean(:grade_2021)
    end
end
```
| name | mean_grade_2020 | mean_grade_2021 |
|---|---|---|
| Sally | 1.0 | 9.5 |
| Hank | 4.0 | 6.0 |
| Bob | 5.0 | 5.0 |
| Alice | 8.5 | 5.0 |
To conclude, let’s show a @chain example with all of the DataFramesMeta.jl macros we covered so far:
```julia
@chain leftjoined begin
    @rtransform begin
        :grade_2020 = :grade_2020 * 10
        :grade_2021 = :grade_2021 * 10
    end
    groupby(:name)
    @combine begin
        :mean_grade_2020 = mean(:grade_2020)
        :mean_grade_2021 = mean(:grade_2021)
    end
    @rtransform :mean_grades = (:mean_grade_2020 + :mean_grade_2021) / 2
    @rsubset :mean_grades > 50
    @orderby -:mean_grades
end
```
| name | mean_grade_2020 | mean_grade_2021 | mean_grades |
|---|---|---|---|
| Alice | 85.0 | 50.0 | 67.5 |
| Sally | 10.0 | 95.0 | 52.5 |