YesWeHack

YesWeHack was a very small python reversing challenge. It looked deceptively complicated but really didn't take much to complete. The challenge consisted of only a single python file seen below:

 1import re
 2from functools import reduce
 3
 4FLAG_REG = re.compile(r"^\d{0,16}9$")
 5KEY = 0xC3F80B633F90C6DE
 6OPS = [
 7    lambda x: ([print(x), x][1]),
 8    lambda x: x - ((KEY >> (x - 8 & 56) >> (x & 7) & 1 ^ 1) & x >> 3) * 8 & 56 | x & 7,
 9    lambda x: x - ((KEY >> (x - 8 & 56) >> (x & 7) & 1 ^ 1) & ~x >> 3) * 8 & 56 | x & 7,
10    lambda x: x + ((KEY >> (x + 8 & 56) >> (x & 7) & 1 ^ 1) & x >> 3) * 8 & 56 | x & 7,
11    lambda x: x + ((KEY >> (x + 8 & 56) >> (x & 7) & 1 ^ 1) & ~x >> 3) * 8 & 56 | x & 7,
12    lambda x: x & 56 | x + ((KEY >> (x & 56) >> (x + 1 & 7) & 1 ^ 1) & ~x) & 7,
13    lambda x: x & 56 | x + ((KEY >> (x & 56) >> (x + 1 & 7) & 1 ^ 1) & x) & 7,
14    lambda x: x & 56 | x - ((KEY >> (x & 56) >> (x - 1 & 7) & 1 ^ 1) & ~x) & 7,
15    lambda x: x & 56 | x - ((KEY >> (x & 56) >> (x - 1 & 7) & 1 ^ 1) & x) & 7,
16    lambda x: ["Good flag!", "Bad flag!"][x > 1],
17]
18
19if __name__ == "__main__":
20    flag = input("FLAG-")
21    if not FLAG_REG.match(flag):
22        exit(1)
23    print(reduce(lambda x, c: OPS[c](x), (int(c) for c in flag), 48))

So we have a little program that asks for the input to be appended to "FLAG-" then it runs some checks on the value and tells you if the flag is correct or not.

The first check is at line 21 using a regular expression created on line 4. You don't actually need to decode this regex as any value that fails it (and some that don't,) will fail the check on line 23 anyway. So anyway the regex matches a series of digits that is anywhere from 0 to 16 characters long and then ends in a 9.

The important thing to do with this information is to resist the urge to push on with brute force. There isn't enough entropy in this value for it to be cryptographically secure but there is more than enough to prevent someone with reasonable means from brute forcing it during a CTF. CTF challenges that require brute force manoeuvres are always tricky you really have to know what you're doing before you start and think about just how much time you're signing yourself up for.

Now we get to the second test on line 23. Line 23 uses the reduce function which is a construct associated with functional programming. You have a large amount of data and you want to reduce it to a single value. The example in the python documentation is using it to sum up all the values in a list. You could do that with sum (and you should,) but reduce lets you specify an arbitrary function. This function is called using the provided container's first two elements then the result from that operation and the 3rd argument. So all the elements in the container other than the first (or 0th if you will,) get passed as the second argument to the provided function and the first argument is this first element and then the result returned from the function from then on. The 48 value that's passed in however sets the initial value to start calculations from so that the first element of the container also gets passed as the 2nd argument to the function.

Let's simplify this a bit and talk specifically about line 23. The function given is a lambda let's just call this function a for now. Then the container is a tuple containing integers for each of the digits given to form part of the flag and finally an initial value of 48 is given. Let's say that the input given was "1234". First reduce calculates a(48,1) and we'll say that result is r0. Then reduce calculates r1 = a(r0,2) then r2 = a(r1,3) then r3 = a(r2,4) and r3 is returned from reduce. Now that example doesn't end in 9 so it doesn't actually meet the regex requirements. If it did end in 9 then it would call a(r3,9) which causes either "Good flag!" or "Bad flag!" to be printed.

Now lets take a closer look at this "a" lambda (I'm continuing to refer to the lambda from line 23 as a here if you skipped ahead). What it does is it uses the digit from the attempted flag as an index into the OPS list to select another lambda function which is given the first argument to a as an argument. If you look at the 10th lambda on line 16, which will be OPS[9] due to 0 indexing, it tests to see if its argument is greater than 1. If this is true it returns "Bad Flag!" and if not it returns "Good flag!". So now the regex makes a bit more sense, it always ends in 9 so that the print function on line 23 gets a reasonable string to print instead of just a number. We can also figure out however that there won't be any other 9s in the flag as the rest of the lambdas in OPS clearly expect to get numeric values and will actually throw unhandled exceptions if they get a value like "Bad flag!".

So we can think of our proposed flag as a little program that has to start with the value 48 and change it until it's less than 1 in 16 characters or less. This program is written using op codes from 0-8 and these characters cause the operations from the lambdas in OPS to be performed. At this point you could start to dig into those lambas and do a bunch of math, but they have a lot of shifts and bitwise functions and depend on the x value they're given so while certainly possible this is a bad idea. There is no reason for us to pull apart these lambdas.

Instead we should consider the impact these lambdas have on the starting value we have. So what I did was to just use ipython. I copied and pasted the KEY and OPS values and then used code similar to this:

1for i,op in enumerate(OPS):
2    print(f"{i}:{op(48)}")

When this is run you can see that every op other than 5 returns 48 and 9 returns "Bad Flag!". So now we know that the first value in the flag has to be 5. Given the first value only lead to one possible solution I continued for a while just running this code manually for the returned values until I saw that it did eventually branch. Once I saw this branching I decided it was time to just let the computer do the work. If you think about what's happening you're creating a tree structure. Every time you find a new number that could be in the flag you need to keep track of all the past values that you've seen to make sure you're making progress and not entering a loop. Every time you consider a character in the flag you're eliminating a huge number of potential values and narrowing your search. If you look at OPS with any detail at all you'll also see that 0 just prints the current reduce value without changing it. This makes 0 redundant. If a 16 character flag works with 3 0 values that means the same flag value without the 0s also works since 0 is a no-op. When you start testing values you also see that most of the time most of the potential operations will return the same value they started with. So you could ignore 0s in potential flags but I chose not to bother as it would have made my solve code a bit more complicated if I had. I did assume that if the flag could be 16 characters long it would be the full 16 characters and decided to only check for success once an attempt reached 16 characters. It would be dead simple to change my code to check shorter flags but this expectation turned out to be correct so I never bothered to change anything.

So let's look closely at the algorithm that worked for me. As I said conceptually this looks like a tree that branches out. To use a tree data structure wouldn't be all that difficult, but would require a lot of walking backwards to the root of the tree to check to see if calculated values are unique to the particular path in question. This is definitely doable but a flat structure is much simpler. For example the first value of the flag is 5 after that 6 then a branch between 2 and 4. You can think of that as a tree that goes 5 -> 6 and then branches to 2 and 4. So when you're testing in this tree when you hit 2 and see the value has changed you go back to 6 and see that the result is different then it was, then again to 5 to confirm the same thing. What's much easier though is to just say I'm guessing "5". When I guess "5" I see the next value is "6" so now my guess is "56" and I just store the previous resulting values in a list associated with that string. When you hit the branch between 2 and 4 you don't need a tree you just now have two guesses "562" and "564" which have different lists of past results associated with them. if one of your guesses doesn't result in a new potential value it means you hit a dead end and you just abandon that guess. When a guess hits 16 characters in length you test to see if it's the flag and throw it away if it doesn't work. If you don't want to assume the full 16 characters are used you can simply test each time through the loop. To test you just have to take the most recent result and see if it's less than or equal to one. Due to a combination of laziness and a desire to ensure I didn't make a mistake somewhere I chose to just use the code as given in the challenge to test my guesses. Once you have a flag that works you exit the program and print the flag. Code that solves the challenge is below (you have to include the KEY and OPS portion of the challenge code of course and note that I filled in the first known value 5 and the history of previous result values 48 and 49):

 1flags = [("5",[48,49])]
 2while True:
 3    for i,(flag,val_history) in enumerate(flags):
 4        new_list = []
 5        if len(flag) == 16:
 6            flag = flag + "9"
 7            if reduce(lambda x, c: OPS[c](x), (int(c) for c in flag), 48)[1] == "Good flag!":
 8                print(flag)
 9                exit()
10        else:
11            for j,op in enumerate(OPS[:-1]):
12                next_try = op(val_history[-1])
13                if next_try not in val_history:
14                    new_list.append((flag+str(j),val_history+[next_try]))
15        flags = new_list