- Sat 17 July 2021
This is my solution with commentary for day 19 of the 2020 set of Advent of Code problems. This was the most challenging day for me of the 2020 set.
In short, these rules describe a set of strings made up of the characters "a" and "b".
0: 8 11
8: 42
11: 42 31
42: 73 29 | 93 114 (matches either 73 29 or 93 114)
31: 29 92 | 114 38
... 100+ more rules ...
29: a
114: b
0: 8 11
means that rule number 0 matches a concatenation of the results of the rules 8 and 11.
And the problem is to count how many of the provided set of strings match the root rule 0.
aaababbaaabababaaabaaabb
bbabbababbbabaababbbbbbabababaaa
baaaaaabaababbbbbaaaaaababbbbbaababbaaaaaaaaaaaa
bababbbbaabaabbaaabababbaabbabab
ababbaaabaabbabbbbaaabba
aaaabaaaababbbababbaababaabbabbababaaabbaabbbbabaabbabababbbabbbaababbaaabaaabbb
... 400+ more strings ...
Solution: Part One
Maybe the problem can be solved with brute force, generating all possible messages and counting how many match the input set. I tried that first to see what would happen.
Here's a function that will return a generator of all possible strings for a given rule. In this case I'll use it for the root rule 0.
def gen_strings(rule_id):
rule = rules[rule_id]
if rule in "ab":
# yield constant character
yield rule
elif "|" in rule:
# handle case with multiple forms (example: 121 29 | 128 114)
elem1, elem2 = rule.split(" | ")
generators = (gen_strings(_rule_id) for _rule_id in elem1.split())
for s in product(*generators):
yield "".join(s)
generators = (gen_strings(_rule_id) for _rule_id in elem2.split())
for s in product(*generators):
yield "".join(s)
else:
# handle case with a single forms (example: 65 114)
generators = (gen_strings(_rule_id) for _rule_id in rule.split())
for s in product(*generators):
yield "".join(s)
In [19]: valid_strings = list(build_strings("0"))
In [20]: len(valid_strings)
Out[20]: 2097152
2 million possible strings were generated, taking less than a second. Fortunately, my computer has millions of bytes of memory.
In [23]: len(set(valid_strings).intersection(set(messages)))
Out[23]: 239
Correct!
Solution: Part Two
Now part two is unlocked. The prompt is to answer the same question after modifying two rules such that they now contain cycles.
Old rules 8 and 11:
0: 8 11 (root)
8: 42
11: 42 31
New rules 8 and 11:
0: 8 11 (root)
8: 42 | 42 8
11: 42 31 | 42 11 31
Though an infinite number of messages can now theoretically be matched, the longest message in the set that needs to be checked is 88 characters. There's possibly a solution here where all messages up to 88 characters can still be generated with reasonable time and memory. But the spirit of these problems is typically that part two can't be brute forced the way part one can. So I'll play along.
The messages matching the root rule 0 will be concatenations of different combinations of rules 42 and 31. I used the same function above to find all strings for each of these subrules, which themselves do not have cycles.
In [24]: valid_strings_31 = set(build_strings("31"))
In [25]: valid_strings_42 = set(build_strings("42"))
In [26]: len(valid_strings_31), len(valid_strings_42)
Out[26]: (128, 128)
In [27]: from collections import Counter
In [28]: Counter(len(s) for s in valid_strings_31)
Out[28]: Counter({8: 128})
In [29]: Counter(len(s) for s in valid_strings_42)
Out[29]: Counter({8: 128})
Each subrule generates 128 possible strings of 8 characters each. So those lists should be easy to work with.
The messages that need to be checked all have a length that's divisible by 8. The shortest is 24 and the longest is 88, so all messages will be a concatenation of a combination of 3 to 11 occurrences of subrules 31 and 42.
The pattern of possible combinations is pretty simple:
0: 8 11
8: 42 | 42 8
11: 42 31 | 42 11 31
One or more instances of 42, followed by [42 31], [42 42 31 31], [42 42 42 31 31 31], etc. There are only 25 possible sequences of these that have 3 to 11 elements:
rules = [[42, 42, 31],
[42, 42, 42, 31, 31],
[42, 42, 42, 42, 31, 31, 31],
[42, 42, 42, 42, 42, 31, 31, 31, 31],
[42, 42, 42, 42, 42, 42, 31, 31, 31, 31, 31],
[42, 42, 42, 31],
[42, 42, 42, 42, 31, 31],
[42, 42, 42, 42, 42, 31, 31, 31],
[42, 42, 42, 42, 42, 42, 31, 31, 31, 31],
[42, 42, 42, 42, 31],
[42, 42, 42, 42, 42, 31, 31],
[42, 42, 42, 42, 42, 42, 31, 31, 31],
[42, 42, 42, 42, 42, 42, 42, 31, 31, 31, 31],
[42, 42, 42, 42, 42, 31],
[42, 42, 42, 42, 42, 42, 31, 31],
[42, 42, 42, 42, 42, 42, 42, 31, 31, 31],
[42, 42, 42, 42, 42, 42, 31],
[42, 42, 42, 42, 42, 42, 42, 31, 31],
[42, 42, 42, 42, 42, 42, 42, 42, 31, 31, 31],
[42, 42, 42, 42, 42, 42, 42, 31],
[42, 42, 42, 42, 42, 42, 42, 42, 31, 31],
[42, 42, 42, 42, 42, 42, 42, 42, 31],
[42, 42, 42, 42, 42, 42, 42, 42, 42, 31, 31],
[42, 42, 42, 42, 42, 42, 42, 42, 42, 31],
[42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 31]]
Here's the algorithm I used to check how many messages satisfy one of these rules. Break each message into blocks of 8 characters, and check that each block is one of the 128 strings matching that element of a rule.
In [34]: part_options_map = {
...: 31: valid_strings_31,
...: 42: valid_strings_42
...: }
In [35]: import textwrap
In [36]: def check_message(message, rule):
...: """Check one message against one rule."""
...:
...: parts = textwrap.wrap(message, 8)
...:
...: for n, part in enumerate(parts):
...:
...: part_options = part_options_map[rule[n]]
...:
...: if not part in part_options:
...: return False
...:
...: return True
...:
In [37]: valid_message_count = 0
In [38]: for message in messages:
...:
...: # every 42/31 message part is 8 characters, so only
...: # rules with size: len(message)/8 are possible
...: possible_rules = [r for r in rules if len(r) * 8 == len(message)]
...:
...: if any(check_message(message, rule) for rule in possible_rules):
...: valid_message_count += 1
...:
In [39]: valid_message_count
Out[39]: 405
Success!