5.7 Piping Operations

R users are familiar with the pipe operator %>%, which chains operations together: the output of one operation becomes the input of the next, and so on.

This can be accomplished with the @chain macro. To use it, we start with @chain followed by the DataFrame and a begin ... end block. The output of every operation inside the block is used as input for the next operation, therefore chaining the operations together.

Here is a simple example with a groupby followed by a @combine:

@chain leftjoined begin
    groupby(:name)
    @combine :mean_grade_2020 = mean(:grade_2020)
end
| name | mean_grade_2020 |
| --- | --- |
| Sally | 1.0 |
| Hank | 4.0 |
| Bob | 5.0 |
| Alice | 8.5 |
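
To make the rewriting concrete, the following sketch compares a chained pipeline with the same steps written out by hand. The DataFrame `df` and its values are hypothetical stand-ins, not the `leftjoined` table used in this chapter, and the intermediate name `gdf` is introduced only for illustration.

```julia
using DataFramesMeta      # re-exports DataFrames and the @chain macro
using Statistics: mean

# Hypothetical data, used only for this illustration
df = DataFrame(
    name = ["Sally", "Hank", "Sally", "Hank"],
    grade_2020 = [1.0, 4.0, 3.0, 6.0],
)

# Chained version: each result becomes the first argument of the next step
chained = @chain df begin
    groupby(:name)
    @combine :mean_grade_2020 = mean(:grade_2020)
end

# The same steps written out by hand, with an explicit intermediate variable
gdf = groupby(df, :name)
manual = @combine(gdf, :mean_grade_2020 = mean(:grade_2020))

chained == manual  # both approaches produce the same table
```

The chained form avoids naming intermediate results such as `gdf`, which is where most of the readability gain comes from in longer pipelines.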

NOTE: @chain passes the result of each operation as the first positional argument of the next one. This is rarely a problem in DataFrames.jl and DataFramesMeta.jl, since the DataFrame is nearly always the first positional argument.
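
For the occasional function that expects the DataFrame in another position, Chain.jl (the package that provides @chain) accepts an underscore `_` as a placeholder for the previous result. The sketch below assumes a small hypothetical DataFrame `df` and uses `filter`, which takes the predicate first and the DataFrame second.

```julia
using DataFramesMeta   # re-exports DataFrames and the @chain macro

# Hypothetical data, used only for this illustration
df = DataFrame(name = ["Sally", "Hank"], grade_2020 = [1.0, 4.0])

n_high = @chain df begin
    # `_` marks where the previous result goes; here it is the second argument
    filter(:grade_2020 => >(2), _)
    # no underscore: the filtered DataFrame is inserted as the first argument
    nrow()
end
```

Here `n_high` counts the rows whose `grade_2020` exceeds 2. Steps without an underscore keep the default first-argument behavior, so both styles can be mixed in one chain.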

We can also nest as many begin ... end blocks as we like inside the operations:

@chain leftjoined begin
    groupby(:name)
    @combine begin
        :mean_grade_2020 = mean(:grade_2020)
        :mean_grade_2021 = mean(:grade_2021)
    end
end
| name | mean_grade_2020 | mean_grade_2021 |
| --- | --- | --- |
| Sally | 1.0 | 9.5 |
| Hank | 4.0 | 6.0 |
| Bob | 5.0 | 5.0 |
| Alice | 8.5 | 5.0 |

To conclude, here is a @chain example that uses all of the DataFramesMeta.jl macros we have covered so far:

@chain leftjoined begin
    @rtransform begin
        :grade_2020 = :grade_2020 * 10
        :grade_2021 = :grade_2021 * 10
    end
    groupby(:name)
    @combine begin
        :mean_grade_2020 = mean(:grade_2020)
        :mean_grade_2021 = mean(:grade_2021)
    end
    @rtransform :mean_grades = (:mean_grade_2020 + :mean_grade_2021) / 2
    @rsubset :mean_grades > 50
    @orderby -:mean_grades
end
| name | mean_grade_2020 | mean_grade_2021 | mean_grades |
| --- | --- | --- | --- |
| Alice | 85.0 | 50.0 | 67.5 |
| Sally | 10.0 | 95.0 | 52.5 |


CC BY-NC-SA 4.0 Jose Storopoli, Rik Huijzer, Lazaro Alonso