Codebreaking – AI vs. The Zodiac Killer

The Zodiac Killer’s cipher spent decades unbroken. Misspellings and mistakes within his own system made it look like gibberish. But he’d sent less complicated letters before, so investigators knew he wasn’t trolling them. How did he make it?

Language and Hiding It

Ultimately, what makes a language is recognizable patterns. What makes a good cipher is deliberately obscuring those patterns, but in a way only your team can un-obscure. In WWII, Native Americans from the Diné (also known as Navajo) tribe were enlisted to create a code using their language, which was so isolated from the rest of the world that the Axis powers never successfully broke it. In fact, the people in the unit did such a good job of scrambling the code that other bilingual Diné, who weren’t trained in it, couldn’t decipher it either. Simply speaking in another language wasn’t enough to protect the messages, and if the code talkers hadn’t gone to such lengths to double- or triple-layer the code, it might have been cracked.

In contrast, building an unbreakable code out of a well-known language was extraordinarily difficult even before computers could do the deciphering for us: see ULTRA, the name the Brits gave to cracked Nazi messages. There wasn’t much computer aid available back then. That makes the Zodiac Killer’s cipher all the more frustrating to try and break! How did one dude make something so difficult to break, even with modern technology? There are patterns, but they’re difficult to pin down. There’s organization, which is visible in the line breaks, but it doesn’t help. One single person made a code so arbitrarily screwy that it was only broken decades later.

Breaking It

Knowing the base was in English and knowing he did mean to put a message in there somewhere was a great help by itself. The FBI had more important things to work on, and resources were gradually diverted away from the Zodiac Killer case, but that didn’t mean that everyone had to give up on it.

A team of private citizens took on the case. One of them used a program known as AZDecrypt, which combined many features of other AI decrypters into something better suited to the Zodiac’s cipher. The team ran 650,000 rearranged variants of the cipher through it, in an attempt to find something the computer could actually decrypt. Brute forcing sometimes works! The first pass got some near-misses, but one near-miss produced a couple of readable, sensible words in the first paragraph! That alone was huge. The content made sense, so the team then locked those words in and tried to solve for the rest. They discovered that the message ran on a diagonal, where the next letter in the word was down and to the right, until it hit the edge, and then it wrapped around. They also discovered that his formatting split the message into two separate 9-line paragraphs with a two-line parting sentence at the end, all out of a solid block of glyphs. The initial key that got David and his team that far looked like a Caesar cipher in which each letter could be stood in for by several different symbols.
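To make the "diagonal with wraparound" idea concrete, here is a minimal Python sketch. It is a toy, not the team's code or anything from AZDecrypt: the function names, the 3x4 grid, and the default step of one row down and one column right are all assumptions made up for illustration, and the real cipher's step may differ.

```python
def diagonal_order(rows, cols, down=1, right=1):
    """List grid cells in the order a wrapping diagonal visits them.

    The step sizes are illustrative assumptions, not the Zodiac's actual rule.
    """
    order, seen = [], set()
    for start in range(rows * cols):
        r, c = divmod(start, cols)
        # Follow the diagonal, wrapping at the edges, until it loops back
        # onto a cell that has already been visited.
        while (r, c) not in seen:
            seen.add((r, c))
            order.append((r, c))
            r, c = (r + down) % rows, (c + right) % cols
    return order


def write_diagonal(text, rows, cols, down=1, right=1):
    """Scramble: lay the message down along the wrapping diagonal."""
    grid = [[""] * cols for _ in range(rows)]
    for ch, (r, c) in zip(text, diagonal_order(rows, cols, down, right)):
        grid[r][c] = ch
    return ["".join(row) for row in grid]


def read_diagonal(grid, down=1, right=1):
    """Unscramble: read the grid back along the same diagonal."""
    cells = diagonal_order(len(grid), len(grid[0]), down, right)
    return "".join(grid[r][c] for r, c in cells)


scrambled = write_diagonal("HELLOWORLDXX", rows=3, cols=4)
print(scrambled)                 # ['HDOL', 'OEXR', 'LWLX']
print(read_diagonal(scrambled))  # HELLOWORLDXX
```

Read left to right, the scrambled rows look like noise, which is roughly why a solver that assumes normal reading order gets nowhere. Trying many different step sizes and starting points is one way to picture what a pile of "variants" of the same cipher could be.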

That got them a very solid, readable message from the first paragraph. However, it wasn’t quite perfect. As the key shows, each letter could be written as several different characters, so the computer accidentally scrambles words when it can’t recognize them. The second paragraph helped fill in more clues, and their first pass produced sentences that looked like this:

You may notice right before the word “Death”, there’s his common misspelling of “paradice” but backwards. In fact, when spaces are put in between words we can see and words we can’t, the entire sentence is composed of backwards and forwards words, and the stray letters are where the computer got confused. The rules are different for the second set of 9 lines than they are for the first one, so the computer is at a loss – the humans in this are having to help it think creatively. Still, the computer’s help is valuable. Imagine trying to figure this out on paper, and remember: all of this is diagonal relative to normal, left to right English. It’s a solid wall of text that the computer has broken down into paragraphs. The glyphs are different than they were last time, and it took this team 650,000 scramble-translations to get something passably usable. Without that computer, this is years of attempting to manually decipher.
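Here is a small sketch of what "words running forwards and backwards" means for a program. This is an assumed toy approach, not what the team or AZDecrypt actually did: the function name, the tiny word list, and the sample string are all invented for the example.

```python
def split_mixed_direction(text, vocab):
    """Split text into known words that may be written forwards or backwards.

    Returns a list of (word, was_reversed) pairs, or None if no split works.
    Hypothetical helper for illustration only.
    """
    n = len(text)
    best = [None] * (n + 1)  # best[i] holds one valid split of text[:i]
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] is None:
                continue
            chunk = text[start:end]
            if chunk in vocab:
                best[end] = best[start] + [(chunk, False)]
                break
            if chunk[::-1] in vocab:
                best[end] = best[start] + [(chunk[::-1], True)]
                break
    return best[n]


# Made-up word list and sample text, including the "paradice" misspelling.
vocab = {"LIFE", "IN", "PARADICE", "DEATH"}
for word, was_reversed in split_mixed_direction("LIFENIECIDARAPDEATH", vocab):
    print(word, "(backwards)" if was_reversed else "")
```

A person spots "ECIDARAP" as "PARADICE" almost instantly once they know to look for it. A program only finds it if someone tells it that backwards words are allowed, which is the kind of creative nudge the humans on the team had to supply.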

Typos: The Unexpected

The Zodiac Killer misspelled the word “Paradise” as “Paradice”. That much is consistent with other letters he sent before he stopped. This alone confused the computers for ages. “Payalice” was a common accidental interpretation, because “Paradice” wasn’t in the computer’s dictionary.
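To see why one misspelling can steer a solver wrong, here is a deliberately simple word-list scorer in Python. It is only a sketch under loud assumptions: the scoring rule, the function name, and the word list are invented here, and a real solver's scoring is likely more sophisticated than counting dictionary words.

```python
def covered_letters(text, vocab):
    """Count how many letters fall inside known words (greedy, longest first).

    Toy scoring rule, assumed purely for illustration.
    """
    longest = max(map(len, vocab))
    i = score = 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            if text[i:i + size] in vocab:
                score += size
                i += size
                break
        else:
            i += 1  # no known word starts here, move on one letter
    return score


# A "proper" word list knows PARADISE but not the killer's spelling.
standard = {"EASY", "ONE", "IN", "PAY", "PARADISE"}
with_typo = standard | {"PARADICE"}

for words in (standard, with_typo):
    print(covered_letters("INPARADICE", words), covered_letters("INPAYALICE", words))
```

With the standard list, the wrong reading "INPAYALICE" scores higher than the right one, because "PAY" is a real word and "PARADICE" isn't. Add the killer's own spelling to the list and the correct reading wins easily.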

Even worse: one line of text, either deliberately or not, was one character off. Below, you’ll see a screenshot of David’s video again. The line with the letters NOSR was scooted one character to the left, which threw off all of the words that passed through that section, confusing AZDecrypt. Moving it back fixed most of the errors within the second paragraph. It’s possible this was intentionally done by the Zodiac Killer to make deciphering even harder. That H on the bottom row is highlighted because David and his team discovered putting it there fixed the typos. It’s also possible he lost his place in his own handwriting and the H is a happy coincidence, given how many of the words were misspelled even after the correction.
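A one-character slip like that is easy to model, at least in toy form. The Python sketch below treats the slip as a row rotated by one position, which is an assumption made for illustration (the real fix may not have been a clean rotation), and the grid and function names are invented.

```python
def shift_row(grid, row, offset):
    """Return a copy of the grid with one row rotated right by `offset` cells."""
    line = grid[row]
    offset %= len(line)
    fixed = line[-offset:] + line[:-offset] if offset else line
    return grid[:row] + [fixed] + grid[row + 1:]


def single_row_shifts(grid):
    """Yield every grid made by rotating exactly one row (hypothetical search)."""
    for row in range(len(grid)):
        for offset in range(1, len(grid[row])):
            yield row, offset, shift_row(grid, row, offset)


# Made-up grid: the middle row has slipped one place to the left.
slipped = ["ABCD", "FGHE", "IJKL"]
print(shift_row(slipped, row=1, offset=1))  # ['ABCD', 'EFGH', 'IJKL']
print(sum(1 for _ in single_row_shifts(slipped)), "candidate repairs to try")
```

Because the message is read diagonally, one slipped row knocks every word that crosses it out of alignment. Trying each single-row shift and re-scoring the result is one plausible way a program could hunt for the repair, but someone still has to suspect a slipped row in the first place.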

This brings us to the final issue, the typos in the final message:

Look at this mess. The typos are unaccounted for, and David ended up just correcting them without trying to figure out what the Zodiac did to produce them. The message is obvious, and he talks a lot about “paradice”, so it matched up with what they already knew. It also contained a direct response to something someone had said on the news two weeks prior to the FBI receiving this coded message. It all makes sense. He just left typos in the message, deliberately or not.

That alone complicated codebreaking efforts for years. Professionals and amateurs following the case alike couldn’t get a cohesive message out of it no matter what they did.

Some got close, but there were so many missing or transposed letters that sorting a message out of them may as well have been divination. Nobody wanted to give up. But the guy had gone quiet, and there was nothing else they could do. David’s team slaved over AZDecrypt and the Zodiac’s cipher for years to decode it, and while it may not bring him any closer to justice, it leaves a little less mystery in the case. The final message, also taken from the video, made from human and computer power:

Computers can do a lot. But they haven’t always been this awesome. Back when the cipher was sent, people were still beating machines at chess on the regular, so computers barely stood a chance against a cipher like this. Strange, arbitrary rulesets and inconsistencies in the message itself meant that humans were needed to work out the human flaws in the encoding. Without a dedicated team working round the clock to decipher it, it sat for decades, unsolved, until tech caught up enough to simulate that team.

Sources:

https://cryptii.com/pipes/caesar-cipher

https://www.nationalww2museum.org/war/articles/american-indian-code-talkers

https://www.sfchronicle.com/crime/article/Zodiac-340-cypher-cracked-by-code-expert-51-years-15794943.php

https://www.bbc.com/news/world-us-canada-55285805