How to solve Advent of Code 2022 - Day 7 with Python
                            
                    If you missed any previous days, click here for all my content about that: Advent of Code, if you want to know why you should participate and try to code the solution for yourself, click here: Advent Of Code 2022 – 7 Reasons why you should participate. If you're here to see the solution of Day 7, continue reading ;)
GitHub Repository
https://github.com/GalaxyInfernoCodes/Advent_Of_Code_2022
I will upload all of my solutions there - in the form of Python (or Scala alternatives) notebooks. You have to use your own input for the "input.txt" since everyone gets their own and everyone needs to provide a different solution based on their input.
Day 7 Puzzle
On day 7 of Advent of Code, we had to build a tree of a filesystem that we get knowledge of through recorded terminal commands, such as this:
$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k
Which would generate the following tree:
- / (dir)
  - a (dir)
    - e (dir)
      - i (file, size=584)
    - f (file, size=29116)
    - g (file, size=2557)
    - h.lst (file, size=62596)
  - b.txt (file, size=14848514)
  - c.dat (file, size=8504156)
  - d (dir)
    - j (file, size=4060174)
    - d.log (file, size=8033020)
    - d.ext (file, size=5626152)
    - k (file, size=7214296)
Part 1
Here we then had to calculate the size of the folders to then add up the ones with a total size with less than 100,000. Now you could do this in a shortcut-way without reconstructing the complete tree. However, I chose to train my data structures and build that tree from top to bottom - or rather, from root to leaves.
Tree Nodes
First off, I created a class for the entries in the tree. This would either be a file or a directory, indicated by a boolean value and if it's a file, it has a size.
At the start the node does not have children or a parent. You can however add a child through the method **add_child** which assigns the child a parent before adding it to the children list.
Then I calculated the size recursively, always adding the size of the children to the current size until I arrive at a file (which by its nature does not have children), in which case I return just the file size.
I also included a print method for the tree. Frankly because debugging without being able to see the tree was nearly impossible.
class TreeNode:
    def __init__(self, is_dir: bool, name, size=None) -> None:
        self.is_dir = is_dir
        self.name = name
        self.size = size
        self.children = []
        self.parent = None
    def add_child(self, child):
        child.parent = self
        self.children.append(child)
    def get_size(self):
        if self.is_dir:
            total_size = 0
            for child in self.children:
                total_size += child.get_size()
            return total_size
        else:
            return self.size
    def print_children(self, level):
        if self.is_dir:
            print('--' * level + self.name + ' (total=' + str(self.get_size()) +')')
        else: 
            print('--' * level + self.name + ' (file=' + str(self.get_size()) +')')
        if len(self.children) > 0:
            for child in self.children:
                child.print_children(level+1)
Tree
I also wrote a class for the tree. You don't have to do that, most of the methods are one liners and its main purpose is keeping track of the current directory/node. You could easily do that in your main loop.
class Tree:
    def __init__(self) -> None:
        self.root = TreeNode(is_dir=True, name="root")
        self.current = self.root 
    def reset_to_root(self):
        self.current = self.root
    def go_up_one_level(self):
        self.current = self.current.parent
    def go_to_child(self, name):
        self.current = list(filter(lambda child: child.name == name, self.current.children))[0]
    def add_new_child(self, child):
        self.current.add_child(child)
Parsing the tree
There are a few different cases of lines:
- '$ cd /' which returns us to the root directory
 - '$ ls' which starts the sequence of listing files and directories within the current ones
- here I use a while loop to catch all of those and add them as children
 
 - '$ cd ..' simply goes up to the parent directory/node of the current one
 - '$ cd <name>' which enters the given sub-directory
- side note: thankfully the (or my) input never tries to enter directories that you didn't previously see
 
 
with open('example.txt', 'r') as f:
    lines = f.readlines()
    lines = [entry.strip() for entry in lines]
tree = Tree()
while len(lines) > 0:
    line = lines.pop(0)
    if line == '$ cd /': 
        tree.reset_to_root()
    elif line == '$ ls':
        while len(lines)>0 and '$' not in lines[0]:
            line = lines.pop(0)
            size, name = line.split(' ')
            if size.isdigit():
                new_node = TreeNode(is_dir=False, name=name, size=int(size))
            else:
                new_node = TreeNode(is_dir=True, name=name)
            tree.add_new_child(new_node)
    elif line == '$ cd ..':
        tree.go_up_one_level()
    elif '$ cd' in line:
        _, _, name = line.split(' ')
        tree.go_to_child(name)
Putting it together
This is the method (within the TreeNode class), that solves Part 1:
def find_subdirectories_part1(self):
        dir_sizes = 0
        if self.is_dir:
            for child in self.children:
                if child.is_dir and child.get_size() <= 100000:
                    dir_sizes += child.get_size() + child.find_subdirectories_part1()
                else:
                    dir_sizes += child.find_subdirectories_part1()
        return dir_sizes
Here we recursively go through the tree building up a sum of file sizes as we go. When iterating through a nodes children, we only add the current directory size to our sum, if it's below the 100,000 threshold. Otherwise we just look deeper in the tree.
# call on root node to solve
tree.root.find_subdirectories_part1()
Part 2
After building up a whole tree in Part 1, Part 2 is fairly quick after reading the assignment. Basically, we are looking for a directory that is at least as big as 30000000 - (70000000 - tree.root.get_size()).
To do that I created a similar method as in Part 1. This time, I'm not building a sum, but instead collecting all the possible directory sizes in an array.
# method in TreeNode class
def find_subdirectories_part2(self, min_size):
        dir_sizes = []
        if self.is_dir:
            for child in self.children:
                if child.is_dir and child.get_size() >= min_size:
                    dir_sizes += [child.get_size()] + child.find_subdirectories_part2(min_size)
                else:
                    dir_sizes += child.find_subdirectories_part2(min_size)
        return dir_sizes
The instruction was the smallest sufficiently large directory, so finally I select the minimum of all directories found above.
total_space = 70000000
space_needed = 30000000
current_empty_space = 70000000 - tree.root.get_size()
possible_dirs = tree.root.find_subdirectories_part2(space_needed - current_empty_space)
min(possible_dirs)
Conclusion
I'm tired. I did this way too late.
No spam, no sharing to third party. Only you and me.
                                            
                
                
                
                
                
                
Member discussion