We already covered two macros that operate on columns, @select and @transform.
Now let’s cover the only macro we need to operate on rows: @subset. It follows the same principles we’ve seen so far with DataFramesMeta.jl, except that the operation must return a Boolean value for each row, which decides whether that row is kept.
Let’s filter grades above 7:
@rsubset df :grade > 7
| name | grade |
|---|---|
| Alice | 8.5 |
| Bob | 9.5 |
| Sally | 9.5 |
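To make the row-wise behavior concrete, here is a minimal sketch. The `demo` data frame is a hypothetical stand-in (the chapter's `df` is defined elsewhere and may contain different rows); it compares the row-wise @rsubset call with an equivalent @subset call that broadcasts the comparison explicitly.

```julia
using DataFrames, DataFramesMeta

# Hypothetical stand-in data; the chapter's actual `df` may contain other rows.
demo = DataFrame(name=["Alice", "Bob", "Sally", "Hank"],
                 grade=[8.5, 9.5, 9.5, 4.0])

# Row-wise: inside @rsubset, :grade is a single value per row.
rowwise = @rsubset demo :grade > 7

# Column-wise: inside @subset, :grade is the whole column,
# so the comparison has to be broadcast with `.>`.
columnwise = @subset demo :grade .> 7

# Both forms keep the same rows.
rowwise == columnwise
```

Under these assumptions, the two calls should select the same rows; which one to use is mostly a matter of whether the condition needs to see the whole column.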
As you can see, @subset also has a vectorized variant, @rsubset, which applies the condition to each row. Sometimes we want to mix and match vectorized and non-vectorized function calls. For instance, suppose that we want to keep only the grades above the mean grade:
@subset df :grade .> mean(:grade)
| name | grade |
|---|---|
| Alice | 8.5 |
| Bob | 9.5 |
| Sally | 9.5 |
For this, we need the @subset macro with the > operator vectorized as .>, since we want an element-wise comparison, while the mean function needs to operate on the whole column of values.
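The mixed case can be sketched as follows, again on a hypothetical `demo` data frame rather than the chapter's own `df`. The second expression writes the same filter with plain DataFrames.jl indexing, only to show what the macro call amounts to.

```julia
using DataFrames, DataFramesMeta, Statistics

# Hypothetical stand-in data; the chapter's actual `df` may contain other rows.
demo = DataFrame(name=["Alice", "Bob", "Sally", "Hank"],
                 grade=[8.5, 9.5, 9.5, 4.0])

# mean(:grade) is evaluated on the whole column and yields one number;
# `.>` then compares every grade against that number.
above_mean = @subset demo :grade .> mean(:grade)

# The same filter written with plain DataFrames.jl indexing, for comparison.
above_mean_base = demo[demo.grade .> mean(demo.grade), :]

above_mean == above_mean_base
```

With @rsubset, :grade would refer to one value at a time, so mean(:grade) would not see the whole column. That is why the column-wise @subset is the natural fit here.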
@subset also supports multiple conditions inside a begin ... end statement. A row is kept only if all of the conditions hold:
@rsubset df begin
    :grade > 7
    startswith(:name, "A")
end
| name | grade |
|---|---|
| Alice | 8.5 |
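As a rough illustration of how the conditions combine, the sketch below compares the begin ... end form with a single condition joined by `&&`. The `demo` data frame is again a hypothetical stand-in for the chapter's `df`.

```julia
using DataFrames, DataFramesMeta

# Hypothetical stand-in data; the chapter's actual `df` may contain other rows.
demo = DataFrame(name=["Alice", "Bob", "Sally", "Hank"],
                 grade=[8.5, 9.5, 9.5, 4.0])

# Several conditions, one per line: a row must satisfy all of them.
block = @rsubset demo begin
    :grade > 7
    startswith(:name, "A")
end

# The same filter as one row-wise condition joined with `&&`.
single = @rsubset(demo, :grade > 7 && startswith(:name, "A"))

block == single
```

The block form tends to read better once there are more than two conditions, since each one sits on its own line.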