18 February 2008

a short journey into the realm of math

Ok, So my wife plays on a website called "neopets". At first blush this site appears to be squarely marketed for kids. It's constantly full of tie-ins to the Disney movie of the month, the hot toy of the season, etc. Many of the games are mindless... pure and simple MINDLESS ways to waste time. And there are a LOT of them. As the name implies, the whole point of the site is to keep and entertain "new pets", and of course, their "pet pets". (yes, the new pets have pets of their own... but they only recurse one level deep so far as I know.) Sometimes the site will have a "plot" where the kids have to work out a series of puzzles that tell some story in order to get some virtual trinket for their virtual pets' virtual pets. The couple of these that I've seen... fetch quests at their worst. Mind you, these are still less mindless than the normal games, so I guess there's something to be said for that. (I hesitate to even think about the bandwidth usage, as some of these games remind me of hypercard applications... only implemented in raw HTML, with no client side computing. There have been times when I work from home that I've had to ask her to go find something else to do, because she's chewing up so much bandwidth it's impacting my work.)

So here's the scary part... she's not alone. There are legions of housewives and soccer moms playing these games. And I think the folks behind neopets know it. A case in point came up on the current plot quest. As the hero is wandering the halls of some space station they broke into they come to a computer terminal which they decide to crack into. (note, not "hack into", this is not hacking... this is cracking... they are clearly doing it not for the educational merits of learning a new system, but for the clear purpose of profiting off the information obtained from the system.) Now it's funny, the hero says "they might be using military grade encryption" (we'll look at that in a bit) but they're clearly also using "Hollywood OS" as we call it. That's the operating system that everyone in Hollywood productions that isn't cloning dinosaurs uses. (yes, that is one of my two favorite scenes in that movie... just in front of the lawyer getting eaten by T-Rex while on the john.) The Hollywood OS is recognizable in any movie by any number of it's patented features:

  • The ability to establish a psychic mind meld with users, so that it knows what they meant to type, even if they just hit random keys... or even if they didn't hit enough keys.
  • Each and every option has a wildly stylized icon (there are no menus) with a pop up gui that is used ONLY for that option, and has all the right options defaulted to do exactly what they want (again, via the psychic mind meld... I wonder if that's a plug and play usb option? or do they have to pair a bluetooth device for that to work?)
  • Big flashy "I'm working, don't touch me" screens every time it has to do something that might not return immediately.
  • Even the least likely options, in fact ones you wouldn't want in a normal user interface, are present and optimized. Such as cracking another users password.
In this case several of these options come to play, but the most important is the central tenant of this game... it presents a list of options for crypto keys. Wouldn't that be nice... in the strongest case it reduces 2.25e+25 total combinations (a 75bit keyspace) to 24360 combinations. But because the order of these keylets don't matter, it's farther simplified to just 4060 combinations (just under a 12 bit keyspace). But still... a 12 bit keyspace to search by hand? That's a bit cruel to kids. Ah, but clearly they know it's not just kids playing these quests. So in comes the adult minds, the hopefully college educated minds. Well, at least hopefully GED holding minds. They gang together and start coming up with ideas on how to solve the puzzle. Some of them are pretty close, others, well, I think they "got lucky" once and extrapolated from that to assume how the system worked.

So why did I start looking at this system? Cory tried. For a long time... I'm not sure how many days she was at it before asking for help... I want to say it was probably close to a week.

Before diving into the analysis, let's look at how this puzzle operates. You 're presented with a basic green screen (5250 anyone?) implementation of the good old unix "mail" program... or something eerily similar... only as rendered by the Hollywood OS. There are a couple dozen emails that need to be read, one of them has the plot point to move the story along farther. The mails are all encrypted. The crypto system shows you a "key" and a desired "result" along with spots for, and a variety of "modifiers" to select from. The player simply drags potential modifiers from the list to the spots, and when they get the right set in place, the email is decrypted. The mails can be sorted into four groups by diffiuculty: 1, 2, 3 and 3' (that's "three prime" for those that have forgotten their calculus.) These groups are base on the number of "modifiers" and the number of glyphs used to render the keys, result, and modifiers. Stage 1 has 1 modifier, 2 has 2, and 3 and 3' both have 3. Stage 1 has 3 glyphs, 2 has 5, 3 has 6 and 3' has 7. The number of glyphs turns out to be an important number in the following math, so I'll refer to it as 'N'. For the purpose of analysis, and rendering in conversations, these glyphs tend to be named for letters they vaugely resemble. In stage 1 you have 'C', 'X' and 'V' or 'K' (it's open to interpretation what exactly the glyph is). Stage 2 adds 'B' or '8', and 'P' or 'Q'. Stage 3 adds 'Z' or 'S' or '$'. And stage 3' adds 'O' (though personally, I think it looks more like a Theta). The key/result/modifier lengths also vary, 4,6,8,10.

(a table would go well here)

Ok, so to the meat of the analysis...

So to do this mathematically, we first have to assign some number values to these symbols. Some suggest that any random mapping may be employed. That might work in stage 1. It's going to fall elsewhere. Some playing with the system reveals two important factors about this "crypto algorythim" that the space station mail system uses...

  1. "keys" and "modifiers" are added to produce results.
  2. this addition is caryless, that is to say each position in the key is standalone, it is not influenced by the one "before" it as normal addition would be.
  3. each addition is performed either as modulo N, or in the base of N. (technically, these are one in the same... some folks are more comfortable with one concept than the other. I'll use mod N)

Opening up our first stage 1 email gives an oppertunity to play with the system and establish 4 equations to play with for each modifier to choose from. If you're lucky, you can get enough from a single modifer attempt to build a system of multi-variable equations to work out the values. Here's the one I got on my third try that got me started:


From that we can derive:

1. C+C=C  -> 2C=C  -> C=0
2. X+X=V -> 2X=V
3. X+V=C
4. V+V=X -> 2V=X

Now remember that this is base 3 math. The obvious identity function for C tells us what it's value is, and the 3rd equation confirms this (X and V must be 1 and 2, the only other possible values, and either way 1+2 = 2+1 = 0 (remember, 3 mod 3 = 0)). The other two equations don't really narrow it down any... 1+1=2, 2+2=1 (again, 4 mod 3 = 1). So we try a different modifier, and even a different email to get a different key, and we end up with a few more equations... none of them change anything. So pick either way, V=1 and X=2 OR V=2 and X=1. Either way works just fine. (Thanks to this being mod 3... that changes quickly.)

So with that simple bit of math, we now can solve stage 1 combinations quickly and trivially. It's a simple matter of math... you're filling in A + __ = C for a 4 digit problem. Given:

Mod: ____

The obvious correct key is: VVXV (that's either 1121 or 2212... depending on your choice above). This is sometimes made even easier by the fact that there are only two choices that start with a V, and only one of them has a V at the end, or in the second position.

Now stage 2 becomes a little more difficult, but again, the first thing you have to do is work out the values for the glyphs. A bit of playing in some of the messages and you can build up some identity equations. Most useful are of the form 3X=Y, but you don't even need a full collection of those if you do some "what if" exercises on paper. Here's the collection I used:


No obvious identity function for zero there, but after a few minutes of fiddling and playing what if exercises with that set I came to the following assignments that worked for every equation I'd seen so far (and I went and got about a dozen more to validate):


Now at this point, the stage 2 problems are not much more complex than the stage 1 problems. Given an initial point A, and a target point D, what pair of values B and C can you add in. Since the digits don't wrap, it becomes fairly simple to work out the correct values from the list of options given. An easy way was to look for keys near 0 and targets near N. For example, given:

Mod1: ______
Mod2: ______

the second and third colums were very nice to narrow down options. Their equations work out to P + __ + __ = C and B + __ + __ = X. These are nice because there are some simple options that don't require modulus, in almost every time I saw this type of pattern, the correct key was findable this way... I only recall once that I found a setup like that that I didn't have the appropriate keys available, and in that case I just switched to a different digit. For the second digit, the easy path is either B + B, or P + C. A quick look through the offered keys might show there are only 2 with a B in the second position... drop them in and you might be in luck. Or there might be three to choose from... but that's only 3 combinations. If you're not lucky yet, then it's fall back to the P + C option. There maybe 3 P's in the second column, and 2 C's. This narrows the search space (temporarily) down to just 5 keys. Now we look at the values in position 3 of those 5 keys... will any legit combination of those values work for the equation there? Almost every time there was only one set that solved both columns, and it solved all 6 of them, solving the puzzle.

Phase 3 and 3' are just more of the same. Larger N value, more modifiers, more digits, different mappings of glyphs to values. Which leads to more possible ways to do the math... so if you're doing this, and you're reading this.... take lots of notes as to what combinations you've tried. Now here's an important note... from what I see, everyone's game is different, not only the exact keys/modifiers/results, but also the values of the glyphs. Looking at some of the screen shots and other (non-mathematical) analysis that has been done, I see that other people's numbers aren't compatible with mine. Which means everyone is going to have to do the initial solving for their numbers. That said, there are reports that various folks are getting the same combinations to work... like C+C+B+B=X for several folks. I can only assume therefore that they might only have a limited collection of value maps.

Here is the equation set I used for stage 3:


Nothing obviously stands out as zero this time, three likely candidates though gives some easy places to start the 'what if's. Again some playing with those numbers got me to a set that satisfied all equations and a couple dozen more I tried for validation:


Stage 3' was more of the same... the obvious identity functions were the first I went and generated (had to use 2 different mail messages to get them, as it was hard to find the glyphs in the right spots to get more than about 2/3s of them in one message.)

4O=O -> O=0

Given that nice identity for zero, I was able to quickly make some more simple equations (I had about 2 dozen more 4 digit additions to play with that I had captured in the process of making the above before I found zero):


This let me boil down the mapping for 3' quicker than I did for 3.


Labels: ,


Post a Comment

<< Home