Day 2 of the Advent of Code provides us with a tab-delimited input consisting of numbers 2-4 digits long and asks us to calculate its “checksum”. The checksum is defined as the sum of the differences between each row’s largest and smallest values. Awesome! This is a problem that is well-suited for base R.
I started by reading the file in using read.delim, specifying header = F in order to ensure that numbers within the first row of the data are not treated as variable names.
When working with short problems like this where I know I won’t be rerunning my code or reloading my data often, I will use file.choose() in my read.whatever functions for speed. file.choose() opens a file-selection dialog (Windows Explorer on Windows), allowing you to navigate to your file.
```r
input <- read.delim(file.choose(), header = F)

# Check the dimensions of input to ensure the data read in correctly.
dim(input)
```
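To see why header = F matters, here is a small sketch that reads made-up tab-delimited text through the text argument instead of a file. The toy values and the names toy_text, toy and toy_with_header are assumptions for illustration, not the puzzle input.

```r
# Toy tab-delimited data (made up for illustration, not the real puzzle input).
toy_text <- "5\t1\t9\t5\n7\t5\t3\t7\n2\t4\t6\t8"

# With header = FALSE every line is treated as data.
toy <- read.delim(text = toy_text, header = FALSE)
dim(toy)

# With the default header = TRUE the first line is consumed as column names,
# so one row of numbers silently drops out of the data.
toy_with_header <- read.delim(text = toy_text)
dim(toy_with_header)
```

Comparing the two dim() results is a quick way to catch a row that was swallowed as a header.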
After checking the dimensions of our input, everything looks good. As suspected, this is a perfect opportunity to use some vectorization via the apply function.
```r
row_diff <- apply(input, 1, function(x) max(x) - min(x))
checksum <- sum(row_diff)
checksum
```
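If you want to sanity-check the logic before trusting it on the real input, the same apply call can be run on a tiny matrix where the answer is easy to work out by hand. The matrix below is made up for illustration, and toy_diff_alt is just an equivalent spelling I'm assuming you might prefer, not something the puzzle requires.

```r
# Toy matrix (made-up values, not the puzzle input).
toy <- rbind(c(5, 1, 9, 5),
             c(7, 5, 3, 7),
             c(2, 4, 6, 8))

# MARGIN = 1 applies the function to each row.
toy_diff <- apply(toy, 1, function(x) max(x) - min(x))
toy_diff       # 8 4 6

# diff(range(x)) is an equivalent way to write max(x) - min(x).
toy_diff_alt <- apply(toy, 1, function(x) diff(range(x)))

sum(toy_diff)  # 18
```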
Et voilà, the answer is 45,972!
As was the case with Day 1, we are then prompted with a part two. In order to help out a worrisome computer, we now have to find the two evenly divisible numbers within each row, divide them, and add each row’s result.
This is a tad bit trickier, but it’s clear we need to work with the modulo operator. We need to identify the two numbers a and b within each row such that a %% b == 0. For positive numbers, if a < b, a %% b will just return a, so my first thought is that we should sort the rows in ascending order.
```r
# Sort rows of matrix
input <- t(apply(input, 1, sort))
```
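Here is a quick sketch of the two ideas at play: how %% behaves when the left operand is the smaller one, and why the sorted result needs a t(). The toy matrix and the names sorted_cols and toy_sorted are made up for illustration.

```r
# When the left operand is smaller, %% returns it unchanged (positive numbers),
# so only "larger %% smaller" can come out as 0.
3 %% 8   # 3
8 %% 4   # 0

# Toy matrix (made-up values) to show why t() is needed after apply().
toy <- rbind(c(9, 2, 5),
             c(4, 8, 1))

# apply() returns each sorted row as a COLUMN, giving a 3 x 2 result here...
sorted_cols <- apply(toy, 1, sort)
dim(sorted_cols)   # 3 2

# ...so transposing restores the original row/column orientation.
toy_sorted <- t(sorted_cols)
toy_sorted
```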
You can avoid transposing the matrix if you use some helpful packages but per my previous post, I'm trying to stick to base R *sobs quietly*. I used loops to solve this because we need to iterate through each row, comparing each element to every other element. I did try using vectorization here via sapply,
```r
# Compare all elements in first row of input matrix.
sapply(input[1,], function(x) x %% input[1,] == 0)
```
but this produces a 16 x 16 matrix for each row with a diagonal that needs to be ignored, on top of which we need to find the one TRUE element and map it back to the two matrix indices. I think I would pursue this method further if there were more than one pair of numbers we were searching for but since we areeeeen't....
```r
# Initialize vector to store each row value
row_val <- c(rep(NA, nrow(input)))

# For each row..
for(row in 1:nrow(input)){

  # Compare each element to its succeeding elements..
  for(col in 1:(ncol(input) - 1)){
    for(i in 1:(ncol(input) - col)){

      # If the modulo is equal to 0,
      # set the vector element equal to the division result.
      if(input[row, col + i] %% input[row, col] == 0){
        row_val[row] <- input[row, col + i] / input[row, col]
      }
    }
  }
}

sum(row_val)
```
Our sum is 326, the correct answer. I'd love to see some alternative solutions to part 2; I feel like there is definitely a lot of optimization that could occur here!
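As one possible alternative, the "compare everything with everything" idea from the sapply attempt can be carried through with outer(), still in base R. This is only a sketch, and it assumes there is exactly one evenly divisible pair per row, as the puzzle promises. The helper row_quotient and the toy matrix are hypothetical names and values of my own, not part of the solution above.

```r
# Hypothetical helper (not from the solution above): for one row, build the
# full table of remainders and pick out the single evenly divisible pair.
row_quotient <- function(x) {
  remainders <- outer(x, x, "%%")   # remainders[i, j] is x[i] %% x[j]
  diag(remainders) <- NA            # ignore each number compared with itself
  hit <- which(remainders == 0, arr.ind = TRUE)
  x[hit[1, "row"]] / x[hit[1, "col"]]
}

# Toy matrix (made-up values with exactly one divisible pair per row).
toy <- rbind(c(5, 9, 2, 8),
             c(9, 4, 7, 3),
             c(3, 8, 6, 5))

toy_val <- apply(toy, 1, row_quotient)
toy_val        # 4 3 2
sum(toy_val)   # 9
```

Because the remainder table covers both orderings of every pair, this version does not need the rows sorted first. The trade-off is that it computes every remainder rather than only the ones the nested loops visit.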