Day 2 of the Advent of Code provides us with a tab-delimited input consisting of numbers 2-4 digits long and asks us to calculate its “checksum”. The checksum is defined as the sum of the differences between each row’s largest and smallest values. Awesome! This is a problem that is well-suited for base R.
I started by reading the file in using read.delim, specifying header = FALSE to ensure that the numbers in the first row of the data are not treated as variable names.
When working with short problems like this where I know I won’t be rerunning my code or reloading my data often, I will use file.choose() in my read.whatever functions for speed. file.choose() opens an interactive file picker (Windows Explorer on Windows), allowing you to navigate to your file.
input <- read.delim(file.choose(), header = FALSE)
# Check the dimensions of input to ensure the data read in correctly.
dim(input)
After checking the dimensions of our input, everything looks good. As suspected, this is a perfect opportunity to use some vectorization via the apply function.
row_diff <- apply(input, 1, function(x) max(x) - min(x))
checksum <- sum(row_diff)
checksum
Et voilà, the answer is 45,972!
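To sanity-check the logic, here is a minimal sketch on a made-up 3 x 4 matrix. The names toy and toy_diff and the values themselves are invented for this illustration and are not the puzzle input.

```r
# Hypothetical 3 x 4 input, invented for illustration (not the puzzle data).
toy <- matrix(c(5, 1, 9, 5,
                7, 5, 3, 4,
                2, 4, 6, 8),
              nrow = 3, byrow = TRUE)

# Largest minus smallest value in each row.
toy_diff <- apply(toy, 1, function(x) max(x) - min(x))
toy_diff
# 8 4 6

# The checksum is the sum of the per-row differences.
sum(toy_diff)
# 18
```

Writing the inner function as function(x) diff(range(x)) should give the same per-row differences, since range() returns the minimum and maximum.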
As was the case with Day 1, we are then prompted with a part two. In order to help out a worrisome computer, we now have to find the two numbers within each row where one evenly divides the other, divide them, and add up each row’s result.
This is a tad bit trickier, but it’s clear we need to work with the modulo operator. We need to identify the two numbers a and b within each row such that a %% b == 0. If a < b (and both are positive), a %% b will just return a, so my first thought is that we should sort the rows in ascending order.
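A quick sketch of how %% behaves may make the sorting idea more concrete. The values below are hypothetical, chosen only for illustration, and the reasoning assumes distinct positive numbers.

```r
# Hypothetical values, chosen only to illustrate %%.
8 %% 2   # 0: 2 divides 8 evenly
8 %% 3   # 2: a non-zero remainder, so not evenly divisible
3 %% 8   # 3: a smaller positive dividend comes back unchanged

# Once a row is sorted ascending, only "later %% earlier" can equal 0.
x <- sort(c(9, 4, 7, 3))
x             # 3 4 7 9
x[4] %% x[1]  # 0, so 9 / 3 = 3 would be this row's result
```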
# Sort rows of matrix
input <- t(apply(input, 1, sort))
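The t() is there because apply() over rows hands back each row’s sorted result as a column. A minimal sketch on a made-up 2 x 3 matrix shows the shape change; m, sorted_cols and sorted_rows are names invented for this illustration.

```r
# Hypothetical 2 x 3 matrix, invented for illustration.
m <- matrix(c(3, 1, 2,
              9, 7, 8),
            nrow = 2, byrow = TRUE)

sorted_cols <- apply(m, 1, sort)  # each sorted row comes back as a COLUMN
dim(sorted_cols)                  # 3 2

sorted_rows <- t(sorted_cols)     # transpose to restore the row layout
sorted_rows
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    7    8    9
```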
You can avoid transposing the matrix if you use some helpful packages but per my previous post, I’m trying to stick to base R *sobs quietly*. I used loops to solve this because we need to iterate through each row, comparing each element to every other element. I did try using vectorization here via sapply,
# Compare all elements in first row of input matrix.
sapply(input[1,], function(x) x %% input[1,] == 0)
but this produces a 16 x 16 matrix for each row with a diagonal that needs to be ignored, on top of which we need to find the one TRUE element and map it back to the two matrix indices. I think I would pursue this method further if there was more than one pair of numbers we were searching for but since we areeeeen’t….
# Initialize vector to store each row value
row_val <- rep(NA, nrow(input))
# For each row..
for (row in 1:nrow(input)) {
  # Compare each element to its succeeding elements..
  for (col in 1:(ncol(input) - 1)) {
    for (i in 1:(ncol(input) - col)) {
      # If the modulo is equal to 0,
      # set the vector element equal to the division result.
      if (input[row, col + i] %% input[row, col] == 0) {
        row_val[row] <- input[row, col + i] / input[row, col]
      }
    }
  }
}
sum(row_val)
Our sum is 326, the correct answer. I’d love to see some alternative solutions to part 2; I feel like there is definitely a lot of optimization that could occur here!
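As one possible alternative, here is a rough base R sketch that swaps the nested loops for outer(), which builds the full table of remainders for a row in one call. The helper divisible_quotient and the small matrix toy2 are hypothetical names and values made up for this illustration, and the sketch assumes each row contains exactly one evenly divisible pair.

```r
# Hypothetical helper: returns the quotient of the one evenly divisible pair.
# Assumes the row holds exactly one such pair (ignoring self-comparisons).
divisible_quotient <- function(x) {
  rem <- outer(x, x, "%%")           # rem[i, j] is x[i] %% x[j]
  hit <- rem == 0
  diag(hit) <- FALSE                 # ignore each element compared with itself
  idx <- which(hit, arr.ind = TRUE)  # row/col position of the TRUE entry
  x[idx[1, "row"]] / x[idx[1, "col"]]
}

# Small made-up matrix for illustration (not the puzzle input).
toy2 <- matrix(c(5, 9, 2, 8,
                 9, 4, 7, 3,
                 3, 8, 6, 5),
               nrow = 3, byrow = TRUE)

toy_val <- apply(toy2, 1, divisible_quotient)
toy_val       # 4 3 2
sum(toy_val)  # 9
```

Because the remainder table covers both orderings of every pair, this version does not need the rows to be sorted first. It still does the same number of comparisons as the loops, so any gain is in brevity rather than speed.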