You want to identify the nth largest or smallest item in a group using R. For example, to filter out the two rows in the table below:

Any time there is some by-group processing, I almost always stick with the `dplyr` package because of its so-called window operations. Below are a few techniques:

Let’s say our data frame is named *stuff*.

    group_by(stuff, type) %>% filter(weight == max(weight))

Result:

            type            name weight
    1     Fruits         Mangoes     19
    2 Vegetables Brussel Sprouts     20

This gets right to the point. We set the data frame up for a grouped operation using `group_by()`. Then we filter the row(s) where weight is equal to the max weight. Because of the `group_by()`, we are looking at `max(weight)` within each different type.
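To make the grouped filter easy to try, here is a minimal, self-contained sketch. It assumes `stuff` holds the twelve rows that appear in the ranked output in this post; the names `heaviest` and `lightest` are only illustrative. Swapping `max()` for `min()` gives the smallest item per group.

```r
library(dplyr)

# Assumption: sample data rebuilt from the values printed in this post
stuff <- data.frame(
  type = c(rep("Fruits", 7), rep("Vegetables", 5)),
  name = c("Mangoes", "Bananas", "Watermelons", "Pineapples", "Apples",
           "Canteloupes", "Oranges", "Brussel Sprouts", "Spinach",
           "Asparagus", "Mushrooms", "Cabbage"),
  weight = c(19, 18, 18, 10, 9, 5, 4, 20, 15, 11, 8, 4),
  stringsAsFactors = FALSE
)

# Largest weight within each type
heaviest <- stuff %>%
  group_by(type) %>%
  filter(weight == max(weight)) %>%
  ungroup()

# Smallest weight within each type: same pattern with min()
lightest <- stuff %>%
  group_by(type) %>%
  filter(weight == min(weight)) %>%
  ungroup()

print(heaviest)
print(lightest)
```

Because the filter compares against the group maximum, it keeps every row that ties for first place, so a group can return more than one row.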

    group_by(stuff, type) %>% mutate(rank = rank(desc(weight))) %>% arrange(type, rank)

Result:

             type            name weight rank
    1      Fruits         Mangoes     19  1.0
    2      Fruits         Bananas     18  2.5
    3      Fruits     Watermelons     18  2.5
    4      Fruits      Pineapples     10  4.0
    5      Fruits          Apples      9  5.0
    6      Fruits     Canteloupes      5  6.0
    7      Fruits         Oranges      4  7.0
    8  Vegetables Brussel Sprouts     20  1.0
    9  Vegetables         Spinach     15  2.0
    10 Vegetables       Asparagus     11  3.0
    11 Vegetables       Mushrooms      8  4.0
    12 Vegetables         Cabbage      4  5.0

Here we created a new column using the `rank()` function. Now we can filter what we’d like from here. E.g., `filter(rank <= 3)` will get you the top 3 within each group. Note the `rank()` function has a few arguments, like `ties.method`, to handle ties (notice Bananas and Watermelons are tied at 2.5, because the default method averages the ranks of tied values).
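As a rough sketch of how the tie handling changes the answer, the example below compares the default `rank()` with `ties.method = "min"` and with dplyr's `dense_rank()`, then pulls the top 3 and the second-heaviest item per group. The data frame is rebuilt from the values printed above, and the column and variable names (`rank_avg`, `rank_min`, `rank_dense`, `top3`, `second_heaviest`) are illustrative assumptions, not part of the original code.

```r
library(dplyr)

# Assumption: sample data rebuilt from the ranked output shown above
stuff <- data.frame(
  type = c(rep("Fruits", 7), rep("Vegetables", 5)),
  name = c("Mangoes", "Bananas", "Watermelons", "Pineapples", "Apples",
           "Canteloupes", "Oranges", "Brussel Sprouts", "Spinach",
           "Asparagus", "Mushrooms", "Cabbage"),
  weight = c(19, 18, 18, 10, 9, 5, 4, 20, 15, 11, 8, 4),
  stringsAsFactors = FALSE
)

ranked <- stuff %>%
  group_by(type) %>%
  mutate(
    rank_avg   = rank(desc(weight)),                      # default: ties get the average rank
    rank_min   = rank(desc(weight), ties.method = "min"), # ties share the lowest rank, gap follows
    rank_dense = dense_rank(desc(weight))                 # ties share a rank, no gap follows
  ) %>%
  ungroup()

# Top 3 within each group (tied rows are all kept)
top3 <- filter(ranked, rank_min <= 3)

# "nth largest" with n = 2: dense ranks count distinct weights
second_heaviest <- filter(ranked, rank_dense == 2)

print(top3)
print(second_heaviest)
```

With dense ranks, "second largest" means the second distinct weight, so both Bananas and Watermelons come back for Fruits. If exactly one row per rank is needed, `row_number()` or `ties.method = "first"` breaks ties by order of appearance instead.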