Advent of Code 2017 in R: Day 2

Day 2 of the Advent of Code provides us with a tab-delimited input consisting of numbers 2-4 digits long and asks us to calculate its “checksum”. The checksum is defined as the sum of the differences between each row’s largest and smallest values. Awesome! This is a problem that is well-suited for base R.

I started by reading the file in using read.delim, specifying header = FALSE to ensure that the numbers in the first row of the data are not treated as variable names.

When working with short problems like this where I know I won’t be rerunning my code or reloading my data often, I will use file.choose() in my read.whatever functions for speed. file.choose() opens a file-picker dialog (Windows Explorer, in my case), allowing you to navigate to your file.

input <- read.delim(file.choose(), header = FALSE)
 
# Check the dimensions of input to ensure the data read in correctly.
dim(input)
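
To see what header = FALSE protects against, here is a small sketch that reads a made-up tab-delimited snippet from a string instead of a file. The txt value and its numbers are invented for illustration (they are not my puzzle input); read.delim() and its header and text arguments are base R.

```r
# Hypothetical three-row, tab-delimited sample (not the real puzzle input).
txt <- "5\t1\t9\t5\n7\t5\t3\t2\n2\t4\t6\t8"

# Default header = TRUE: the first row is consumed as column names.
with_header <- read.delim(text = txt)

# header = FALSE: every row is kept as data.
no_header <- read.delim(text = txt, header = FALSE)

dim(with_header)  # 2 rows, 4 columns
dim(no_header)    # 3 rows, 4 columns
```

Losing a row this way would silently change the checksum, which is why the dim() check is worth the extra line.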

After checking the dimensions of our input, everything looks good. As suspected, this is a perfect opportunity to use some vectorization via the apply function.

row_diff <- apply(input, 1, function(x) max(x) - min(x))
checksum <- sum(row_diff)
checksum

Et voilà, the answer is 45,972!
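
If you want to check the logic by hand, here is the same calculation on a tiny matrix. The values in m are made up for illustration and stand in for the real input.

```r
# Made-up 3 x 4 matrix standing in for the puzzle input.
m <- matrix(c(5, 1, 9, 5,
              7, 5, 3, 2,
              2, 4, 6, 8),
            nrow = 3, byrow = TRUE)

# Per-row spread: largest value minus smallest value.
row_diff <- apply(m, 1, function(x) max(x) - min(x))
row_diff       # 8 5 6

# The checksum is the sum of those spreads.
sum(row_diff)  # 19
```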

As was the case with Day 1, we are then prompted with a part two. In order to help out a worrisome computer, we now have to find the only two numbers within each row where one evenly divides the other, divide them, and add up each row’s result.

This is a tad bit trickier but it’s clear we need to work with the modulo operator. We need to identify the two numbers a and b within each row such that a %% b == 0. If a < b, a %% b will just return a so my first thought is that we should sort the rows in ascending order.
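
A quick sanity check of that modulo behaviour, as a minimal sketch. The numbers and the variable names (small_mod, x, and so on) are made up for illustration.

```r
small_mod <- 3 %% 8   # 3: when a < b, a comes back unchanged
uneven    <- 8 %% 3   # 2: a remainder, so not evenly divisible
even      <- 8 %% 4   # 0: 4 divides 8 evenly

# After sorting in ascending order, a dividend always sits to the
# right of its divisor, so we only ever need to look "forwards".
x <- sort(c(9, 2, 8, 5))   # hypothetical row
x                          # 2 5 8 9
x[3] %% x[1] == 0          # TRUE, because 8 %% 2 is 0
```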

# Sort rows of matrix
input <- t(apply(input, 1, sort))
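
The t() is there because apply() hands back each row's result as a column. A small sketch on a made-up, deliberately non-square matrix makes the flip visible:

```r
# Hypothetical 2 x 3 input.
m <- matrix(c(9, 2, 8,
              3, 7, 1),
            nrow = 2, byrow = TRUE)

sorted_cols <- apply(m, 1, sort)
dim(sorted_cols)   # 3 2: each sorted row has become a column

sorted_rows <- t(sorted_cols)
sorted_rows        # rows are 2 8 9 and 1 3 7
```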

You can avoid transposing the matrix if you use some helpful packages but per my previous post, I'm trying to stick to base R *sobs quietly*. I used loops to solve this because we need to iterate through each row, comparing each element to every other element. I did try using vectorization here via sapply,

# Compare all elements in first row of input matrix.
sapply(input[1,], function(x) x %% input[1,] == 0)

but this produces a 16 x 16 matrix for each row with a diagonal that needs to be ignored, on top of which we need to find the one TRUE element and map it back to the two matrix indices. I think I would pursue this method further if there was more than one pair of numbers we were searching for but since we areeeeen't....

# Initialize vector to store each row value
row_val <- rep(NA, nrow(input))
 
# For each row..
for(row in 1:nrow(input)){
 
  # Compare each element to its succeeding elements..
  for(col in 1:(ncol(input) - 1)){
    for(i in 1:(ncol(input) - col)){
 
      # If the modulo is equal to 0, 
      # set the vector element equal to the division result.
      if(input[row, col + i] %% input[row, col] == 0){ 
 
        row_val[row] <- input[row, col + i] / input[row, col]
 
      }
    }
  }
}
 
sum(row_val)

Our sum is 326, the correct answer. I'd love to see some alternative solutions to part 2; I feel like there is definitely a lot of optimization that could occur here!
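
For what it's worth, here is one possible loop-free sketch that builds on the divisibility-matrix idea. It is not the code that produced the answer above. The helper name row_quotient and the matrix m are made up for illustration, and the sketch assumes every row contains exactly one evenly divisible pair and no repeated values (duplicates would divide each other and add extra TRUEs).

```r
# Hypothetical helper: quotient of the one evenly divisible pair in a row.
row_quotient <- function(x) {
  # divisible[i, j] is TRUE when x[i] is a multiple of x[j].
  divisible <- outer(x, x, "%%") == 0
  # Every number divides itself, so ignore the diagonal.
  diag(divisible) <- FALSE
  # Locate the remaining TRUE as a (row, col) index pair.
  idx <- which(divisible, arr.ind = TRUE)
  x[idx[1, "row"]] / x[idx[1, "col"]]
}

# Made-up rows, each containing exactly one divisible pair.
m <- matrix(c(5, 9, 2, 8,
              9, 4, 7, 3,
              3, 8, 6, 5),
            nrow = 3, byrow = TRUE)

row_val <- apply(m, 1, row_quotient)
row_val        # 4 3 2
sum(row_val)   # 9
```

Because the matrix records which element is the dividend and which is the divisor, this version does not need the rows to be sorted first.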

Advent of Code 2017 in R: Day 1

My boyfriend recently introduced me to Advent of Code while I was in one of my “learn ALL of the things!” phases. Every year starting December 1st, new programming challenges are posted daily leading up to Christmas. They’re meant to be quick 5-10 minute challenges, so, wanting to test my programming skills, I figured why not try to do all of them in base R!

I went with base R because I know I can dplyr and stringr my way to victory with some of these challenges. I want to force myself to go back to basics and confirm that I have the knowledge to do these things on my own without Hadley Wickham’s (very much appreciated in any other situation) assistance.

Since I’ve started, I’ve also seen a couple of other bloggers attempt to do these challenges in R so I’m really curious how my solutions will compare to theirs.

The first day of the challenge provides you with a string of numbers and asks you to sum all of the digits that match the next digit in a circular list, i.e. the digit after the last digit is the first digit.

My string was…

82317536746839978781792591955653325794933784832649781841433412843796827885185591788222251266254283181153966326811418719528942918983647818989292926147928848832493567287419932248891679282322613251234475698299329512682929539287667557797618379938125285274844872981177398691894155994617469449926517527681586119967154678713815276752194811852173576324457489127264876698818761291929329952827778484965612598397811887192339516191883885326985192981421128537769425452118591341852317689528884624716428515883684457614892257869197789838481138337737682369699239398387559979895376482222179963817575429648443372854286543754993599977926792568813789678523768488127957611181392881527999211768742563776159527582688441395796227549654618848626474234919189136288487487565954631915855553858493357422248554737694112123764465916548461681892789598576813367242214348469461249152711964331443354827874326838485944876484775324989525725151188644756218281182749112983967482131364263577699913146616426127868471354859698892371938227181112695617415634791168323644857247162421762886423718495696645941946747633196877357235176149625755921112861775534356519528538787754312343279195955956586415347654554895619345484742912543872297514728834234131968451627527169251998665918833136388464743211615698925185743462267513663153111457774487818622221269234493118385646858826958893975314139376666732334512169684142881359843942496848865548127611912894854579458665242284151915491685579576333869919311867738438699992844687738662219768739981688189443996614639636587848217962729871552781953555793867681567188136245592645748361344197251878815146658344413596449557686586632787653637896647217365335177742924781921439343183994181882987533518153885613595285337789962962793663943864555444469226539767251138898427491823612535824333193511938627884331138527825961611489922335581446929137917148595166534219178412957491634697514798354927133928615199937919679277731147138884589827965149777179875981654869677869899919981424886311686979638161563742162243861939415663585432
66646516247854435356941566492841213424915682394928959116411457967897614457497279472661229548612777155998358618945222326558176486944695689777438164612198225816646583996426313832539918

My first thought was that I would need to separate this string such that each character was the element of an object, either a vector or a list. I kept things simple and started by just copy-pasting the string into R. I could import it as a .txt file or otherwise but I figured that was unnecessary for such a quick problem. I stored the string as a variable named input.

 
# Split string after each character.
 
input_split <- strsplit(input, "")
 
# As a result, input_split is a list with 1 element:
# a vector containing each character of input as an 
# element. Annoying. Let's unlist() it to extract
# *just* the vector.
 
char_vector <- unlist(input_split)
 
# The problem now is that if we are going to sum
# the elements of our string, we need them to be
# numeric and not characters. Easy enough...
 
num_vector <- as.numeric(char_vector)
 
# Now let's just initialize our sum...
 
num_sum = 0
 
# And use a loop...
 
for(i in 1:length(num_vector)){
 
  # If we have the last element of the input string, 
  # set the next number equal to the first element
  # of the string, else select element i + 1.
  next_num <- ifelse(i == length(num_vector),
                     num_vector[1],
                     num_vector[i + 1])
 
 
  # If our current element is equal to the next element,
  # update the sum.
  if(num_vector[i] == next_num){
 
  	num_sum = num_sum + num_vector[i]
 
  }
}
 
num_sum

Our sum is 1390 which is correct, huzzah. Once you complete part 1, the puzzle surprises you with part 2, which asks us to now consider the digit “halfway around the circular list”. So, if our list is 10 digits long, we would compare element 1 with element 6, element 8 with element 3, etc. We can make a really simple modification to our established loop to solve this…

 
num_sum = 0
skip = length(num_vector)/2
 
for(i in 1:length(num_vector)){
 
  # For the first half of the list we need to
  # add half the length of the vector to the
  # index to get the next number & subtract
  # for the second half.
 
  next_num <- ifelse(i <= skip, 
                     num_vector[i + skip],
                     num_vector[i - skip])
 
 
  if(num_vector[i] == next_num){
 
    num_sum = num_sum + num_vector[i]
 
  }
}
 
num_sum

Our answer is 1,232 which is correct. Woo.
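
As an aside, the add-or-subtract rule can also be written as a single wrap-around expression using the modulo operator. This is only a sketch on the 10-digit example from above; n, i and partner are names I made up for illustration.

```r
n <- 10          # list length from the 10-digit example
skip <- n / 2
i <- seq_len(n)

# Wrap around the end of the list with modulo. The -1 / +1 shift
# converts between R's 1-based indices and 0-based modular arithmetic.
partner <- (i - 1 + skip) %% n + 1
partner   # 6 7 8 9 10 1 2 3 4 5
```

Element 1 pairs with element 6 and element 8 pairs with element 3, matching the ifelse() version in the loop.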

I know that loops in R are often discouraged due to their speed, or lack thereof. I think I might come back and play with this one again to see how I can implement vectorization here. That being said, I also think it’s important to know which tool is right for the job. With a problem of this size, I think a loop is just fine.
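
If I do come back to it, a vectorized version might look something like the sketch below. It is one possible approach, not the code I ran above: the helper name circular_sum and the short test strings are made up, and it assumes the offset is between 1 and one less than the number of digits.

```r
# Hypothetical helper: sum the digits that match the digit `offset`
# places ahead, treating the vector as circular.
circular_sum <- function(digits, offset) {
  # Rotate left by `offset`: drop the first `offset` digits and
  # append them to the end, so each digit lines up with its partner.
  shifted <- c(digits[-seq_len(offset)], digits[seq_len(offset)])
  sum(digits[digits == shifted])
}

# Short made-up strings, not the real puzzle input.
digits <- as.numeric(strsplit("1122", "")[[1]])
circular_sum(digits, 1)                      # 3 (part 1: next digit)

digits2 <- as.numeric(strsplit("1212", "")[[1]])
circular_sum(digits2, length(digits2) / 2)   # 6 (part 2: halfway around)
```

The same function covers both parts because the only thing that changes between them is how far ahead we look.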