How can I retrieve only the data.table groups that have a specific number of rows?

asked 2023-05-01 02:58:18 +0000

scrum

1 Answer

answered 2023-05-01 03:25:02 +0000

huitzilopochtli

You can use the .N special symbol in data.table to count the rows in each group, then filter on that count inside the subset operator []. Here is an example:

library(data.table)

# Create a sample data.table: group "A" has 3 rows, group "B" has 7
DT <- data.table(x = rep(c("A", "B"), times = c(3, 7)), y = rnorm(10))

# Count the number of rows per group
n_rows <- DT[, .N, by = x]

# Retrieve the groups that have exactly 3 rows
n_rows[N == 3]

# Alternatively, keep the rows of the original data.table that belong to those groups
DT[x %in% n_rows[N == 3, x]]

In this example, we first create a data.table DT with two columns, x and y. We then use .N to count the rows in each group defined by x and store the result in a new data.table, n_rows. Finally, we filter n_rows to keep the groups that have exactly 3 rows. To get the matching rows rather than the counts, filter the original DT on the x values of those groups.
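The count-then-filter steps above can also be collapsed into a single grouped call. The following is a minimal sketch, not the only way to do it; the sample data and the names exact3 and with_n are made up for illustration.

```r
library(data.table)

# Hypothetical sample data: group "A" has 3 rows, group "B" has 5
DT <- data.table(x = rep(c("A", "B"), times = c(3, 5)), y = 1:8)

# Keep every row of each group whose size is exactly 3.
# For other groups the `if` returns NULL, so they contribute no rows.
exact3 <- DT[, if (.N == 3L) .SD, by = x]

# Equivalent two-step form: add the group size as a column, then filter.
# copy() keeps := from modifying DT by reference.
with_n <- copy(DT)[, N := .N, by = x][N == 3L]
```

The first form returns only the original columns, while the second keeps the extra N column, which can be handy if you want to filter on several group sizes afterwards.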
