
One way to count the occurrences of a value in a data frame while taking the order of the elements into account is to combine apply() with the lag() function from the dplyr package.

Here is an example:

# create a sample data frame
df <- data.frame(col1 = c("A", "B", "C", "D", "E"), 
                 col2 = c("B", "C", "A", "D", "E"), 
                 col3 = c("C", "A", "B", "D", "E"))

# define a function to count occurrences of a value in a sequence,
# treating a run of consecutive repeats as a single occurrence
# (lag() here is dplyr::lag(); base R's stats::lag() has no `default` argument)
count_occurrence <- function(x, value) {
  sum(x == value & dplyr::lag(x, default = "") != value)
}

# apply the function to each row of the data frame
occurrences <- apply(df, 1, count_occurrence, value = "A")

In this example, we create a sample data frame with three columns and five rows. We also define a function called count_occurrence that takes two arguments: x (the sequence to be searched) and value (the value to be counted). The function compares each element of x with its predecessor, obtained with dplyr::lag(x, default = ""), and counts only the positions where x equals value and the previous element does not. Each run of consecutive repeats therefore contributes one to the total, which avoids double-counting when the same value appears in adjacent positions. The empty-string default stands in for the missing predecessor of the first element, so it assumes "" is never the value being searched for.
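The sample data frame has no row in which "A" repeats back to back, so it does not show the effect of the sequence-aware count. The following is a minimal sketch on a single vector, using only base R; the vector x and the names prev, raw_count and run_count are made up for illustration.

```r
# Hypothetical vector in which "A" appears twice in a row, then once more
x <- c("A", "A", "B", "A", "C")

# Predecessor of each element; "" stands in for the missing one before x[1]
prev <- c("", head(x, -1))

# Every position that holds "A"
raw_count <- sum(x == "A")

# Only the positions where a run of "A" starts
run_count <- sum(x == "A" & prev != "A")

raw_count  # 3
run_count  # 2
```

The two adjacent "A"s at the start count as one occurrence in run_count, which is the behaviour the lag-based comparison is meant to give.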

Finally, we use apply() with a margin of 1 to call count_occurrence on each row of the data frame, with the value argument set to "A". apply() passes each row to the function as a character vector. The resulting occurrences vector holds one count per row: the number of times "A" occurs in that row, with consecutive repeats counted once.
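If adding a dependency on dplyr is not desirable, the same row-wise count can probably be obtained with base R's rle(), which collapses a vector into runs of equal values. This is a sketch of an alternative, not the approach above; the data frame df2 and the helper count_runs are hypothetical names introduced here.

```r
# Hypothetical data frame whose rows contain consecutive repeats of "A"
df2 <- data.frame(col1 = c("A", "B", "A"),
                  col2 = c("A", "A", "B"),
                  col3 = c("B", "A", "A"),
                  col4 = c("A", "C", "A"))

# Count the runs of `value` in x: rle() returns one entry per run,
# so each block of consecutive repeats is counted once
count_runs <- function(x, value) {
  r <- rle(as.character(x))
  sum(r$values == value)
}

# One count per row
runs <- apply(df2, 1, count_runs, value = "A")
runs  # 2 1 2
```

Unlike the lag-based version, this one needs no placeholder for the first element, so it makes no assumption about which values can occur in the data.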