Reverse engineering an unknown CRC
It's been a long time since I've been to any security CTFs, even longer time since I've written my last writeup. But, as life would have it, I came across a task during my AR adventures that was exactly like a CTF task:
I've found a non-standard CRC checksum on a firmware file, and had to reverse-engineer its parameters.
What is a writeup?
Not all of you may be familiar with CTFs and writeups. CTFs in the infosec context are competitions where you have to hack computers, applications, or both. They usually last some days, and there are 5-20 tasks, and done in teams.
After the CTF, it's common practice for the CTF teams to write so-called "writeups", which are half technical description of the task and how it was solved, and half a story of how the author came to the solution.
So prepare for 50% story, 50% information :)
The task
I had an embedded (Cortex-M4-based) device on my hand, some ~5 firmware files that I obtained fully legally.
If I booted the device while holding down the "volume UP" button, this splash screen appeared on USB-to-serial:
========================= Home Menu ==========================
==============================================================
--------------------------------------------------------------
= Version: Vxx.xx.xx =
= Revise Date: xx-xx-xx =
= Software Name: xxxxxxx_Bootloader =
= (c) COPYRIGHT xxxxxxxxxxxxxxxxxxxxxxx All rights reserved =
==============================================================
$1 ------- Download New xxxxxxxxxxx_App
$2 ------- Download xxxxProgram
$3 ------- Download xxxxxxProgram
$5 ------- Jump To xxxxx App
(It has been redacted because it's not important and I don't want to appear in specific search results)
Naturally, I pressed $
1
, and got a new prompt:
Waiting New xxxxxxx_App to be Write Flash_addr 0xxxxxxxx
Start Download xxxxx APP
Not having any better ideas, I switched to a pyserial
to automate the process
and with it, directly sent in one of the firmware files.
Download xxxxxApp Length is 80681 Byte!
Check New xxxxxxx_App CRC Success!
Recvd:
File Name : CONF7
Size: 80681 Bytes
And it started successfully after the $5
command. OK, we are done
with the positive control.
Looking at the firmware files, it was quite apparent that it didn't have any cryptographic signatures (there were no 0x80 or bigger blobs with very high entropy), and in fact the header was incredibly simple:
ee a1 19 6f 29 3b 01 00 43 4f 4e 46 37 00 56 30 |...o);..CONF7.V0|
2e 31 2e 32 00 00 a8 53 01 20 35 10 05 08 97 0d |.1.2...S. 5.....|
05 08 f9 0d 05 08 fb 0d 05 08 fd 0d 05 08 ff 0d |................|
05 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00 00 01 0e 05 08 03 0e 05 08 00 00 00 00 05 0e |................|
05 08 f1 41 04 08 6b 10 05 08 6f 10 05 08 73 10 |...A..k...o...s.|
It may not look that simple to you, so let me break it down:
ee a1 19 6f
: 4 byte checksum29 3b 01 00
: 4 byte length (little endian, ==0x13b29 ==80681)CONF7
: Some kind of filename (was the same in all firmware blobs)V0.1.2.
: Some kind of version (was the same in all firmware blobs, was not used)- Then immediately a Cortex M4 interrupt vector table:
a8 53 01 20
: == 0x200153a8, Initial stack pointer,(in the IRAM)35 10 05 08
: == 0x08051032, Pointer to the reset handler function- and so on...
My goal here is to patch the firmware with some extra functionality that the original manufacturer didn't bother to implement.
But, as you could've guessed, modifying any bytes in the firmware binary resulted in the following error message, and the device failing to boot:
Check New xxxxxxx_App CRC Failed!
I tried some shenanigans like sending only half the firmware, or unplugging it just before any check could happen, etc., but nothing worked.
Time to figure out...
...what kind of checksum is that
Let's first list our assumptions:
- The checksum is a CRC variant. We can't know for sure, but since we don't have a better idea, let's trust the log strings for now.
- The checksum checks from the filename to the end of file. I know this, because I tested modifying only the filename, and that caused a CRC check error. Also checked the last byte, appended bytes, etc, all resulting in a check failure
- The 4 byte size field may or may not be included in the checksum. We will always have to check both versions.
(Later I've been told that some software even include the CRC field in the checksum process with dummy bytes, but fortunately it turned out this was not the case)
First, let's try the easy way: copy-pasting the bytes into an online CRC calculator and crossing fingers:
Nope, none of them are even close to EE A1 19 6F or its inverse (11 5E E6 90).
Next try: bruteforcing parameters with pycrc:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
|
It didn't find anything.
Background
I saw that I won't really be able to brute-force this many arbitrary bits even with GPU stuff, so I resorted to the M-word: using mathemathics. Group theory, specifically.
GF(2)
Tl;dr: addition of polyomials over GF(2) is the xor
operator,
multiplication and division are weird bitshift and xor combinations.
To understand the stuff in this section, you need to understand GF(2)
.
GF(2)
is a finite field field, which means it has:
- A finite number of elements:
0
and1
in this case
- An addition operator:
- For
GF(2)
it isxor
:0 + 0 = 0
,0 + 1 = 1
,1 + 1 = 0
- For
- A multiplication operator:
and
:0 * 0 = 0
,0 * 1 = 0
,1 * 1 = 1
)
- And there are rules how the elements and the operators should behave (e.g. multiplication has to be distributive over addition), which are OK for the above
GF(2)
is not really useful by itself, as it operates on only a single bit.
Polynomials over GF(2)
are much more useful. Let's look at these two example
polynomials:
a = x2 + 1
b = x4
a + b = x4 + x2 + 1
Easy, right? Let's make it a bit more complicated:
c = x3 + x2 + x
a + c = x2 + 1 + x3 + x2 + x = x3 + x + 1
Wait what? What happened to x2? Remember that while x is a symbol in
the above expression, all the constants must be either 0 or 1, since we are operating
above GF(2)
. And according to the addition rule, x2 + x2 = (1 + 1)⋅x2 = 0⋅x2 = 0.
We will need one more operation, multiplication:
a⋅c =
(x2 + 1)⋅(x3 + x2 + x) =
x2⋅x3 +
x2⋅x2 +
x2⋅x +
1⋅x3 +
1⋅x1 +
1⋅x =
x5 +
x4 +
x3 +
x3 +
x1 +
x =
x5 + x4 + x1 + x
It's still not really clear what this has to do with computers or anything. Let's start descending towards Earth. We will start by expanding all of our polynomial notations:
a = 0⋅x4 + 0⋅x3 + 1⋅x2 + 0⋅x1 + 1⋅x0
b = 1⋅x4 + 0⋅x3 + 0⋅x2 + 0⋅x1 + 0⋅x0
c = 0⋅x4 + 1⋅x3 + 1⋅x2 + 1⋅x1 + 0⋅x0
At this point we can leave the x
-es behind:
a = 00101
b = 10000
c = 01110
The above operation results become:
a + b = 10101
a + c = 01011 (basically a xor c)
a ⋅ c = 110110
Only the last one, the multiplication is not what you would expect. This is because if you do the long multiplication, the addition part is done differently.
With pseudocode, it looks like something like this:
1 2 3 4 5 6 7 |
|
CRC
Spoiler: CRC is literally just division over GF(2)
polynomials. The
CRC polynomial is the divisor, CRC checksum itself is the remainder.
It's the same as long division
taught in elementary school, but spicier. Just remember that over GF(2)
,
addition and subtraction are the same.
There's a great example of the steps on wikipedia, which I'll just copy paste here:
11010011101100 000 <--- input right padded by 3 bits
1011 <--- divisor
01100011101100 000 <--- result (note the first four bits are the XOR with the divisor beneath, the rest of the bits are unchanged)
1011 <--- divisor ...
00111011101100 000
1011
00010111101100 000
1011
00000001101100 000 <--- note that the divisor moves over to align with the next 1 in the dividend (since quotient for that step was zero)
1011 (in other words, it doesn't necessarily move one bit per iteration)
00000000110100 000
1011
00000000011000 000
1011
00000000001110 000
1011
00000000000101 000
101 1
00000000000000 100
10 11
00000000000000 100
1 011
-----------------
00000000000000 100 <--- remainder (3 bits).
And an example code with the CCIT-16 polynomial (a.k.a the XMODEM checksum):
1 2 3 4 5 6 7 8 9 10 11 |
|
CRC collisions
CRC is not designed to be secure, so finding collisions, preimages and things like that are actually easy once you know the math behind it.
We will be looking at the following equality specifically:
crc(a ^ b) = crc(a) ^ crc(b)
This should be obvious why this works, but if not, here's the same thing with math notations:
(a + b) mod crc_poly = a mod crc_poly + b mod crc_poly
which is always true (because GF(2)
is a field).
This means that if you have any two messages with checksums, you can always create a third message with a correct checksum, even if you don't know anything else about them. This is very important in case of encryption algos which simply xor the message with a secret byte stream (such as RC4 or AES-CTR), because in some cases a correct checksummed message can be assembled from several secret messages.
This was used in some WEP cracks back in the day.
Further reading
There is a very interesting paper out there which shows several other interesting operations, like forcing the CRC to be always zero, or even specific value.
Unfortunately none of them helped recover the polynomial.
The first success
So I learned all the above yesterday, it was an interesting refresher for my rusty college math skills. Fortunately this little side-quest only took a few hours, which is lucky considering my untreated ADHD.
But... we were doing something important, weren't we?
Oh yeah, I still have a firmware checksum to crack.
But first, I wanted to try the xor thing. I found two firmware files that had the same length. In fact, they differed in two bits (a version character was upped from '1' to '2'). I xored them together, and got this beauty:
$ hexdump -C xxxor.bin
00000000 15 a8 97 7c 00 00 00 00 00 00 00 00 00 00 00 00 |...|............|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001870 00 00 00 00 00 00 00 00 00 00 00 03 00 00 00 00 |................|
00001880 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0000fca0 00 |.|
0000fca1
Literally 3 bits and a supposedly correct checksum. I fixed up the size field (because size == 0 obviously did not work), and (force)fed it to the bootloader:
Download xxxxxApp Length is 64665 Byte!
Check New xxxxxxx_App CRC Success!
Recvd:
File Name :
Size: 64665 Bytes
Lol. Lmao, even.
Notice how the file name is empty (it was not redacted).
Now this is somewhat useless, but it did give some important information:
- I'm now 99.99% sure that the checksum is actually CRC.
- CRC final xor is 0
- If
chk_a = crc(a) ^ finalxor
andchk_b = crc(a) ^ finalxor
,chk_a ^ chk_b = crc(a) ^ crc(b)
without thefinalxor
. Thus it has to be 0.
- If
- CRC initial xor is 0
- The initial xor value is applied to the first 4 bytes of the data before any CRC happens, and xoring the two files together cancels this initial xor out, just like the final one.
- The length is not part of the checksum
- Otherwise changing the 00000000 length to the correct one would've broken the checksum.
Now that we know all this, brute forcing actually became possible.
Brute forcing the CRC polynomial
I had several choices for doing the brute force:
- Writing it in
python
- Writing it in
C
,C++
,Rust
,CUDA
, whatever - Using someone else's solution
I opted for the last one, because obviously. The tool that worked for me is called CRC RevEng. The usage was straightforward:
- I Modified my firmware files so they look like
[data] + [checksumbytes]
, without the length field. - Ran
RevEng
with the following parameters:
./reveng -w 32 -i 0 -x 0 -f -s fw_mod1 fw_mod2 fw_mod3 fw_mod4
It spat out a result after like 30 seconds (on a single thread):
width=32
poly=0xf4acfb13
init=0x00000000
refin=false
refout=false
xorout=0x00000000
check=0x6c9f84a8
residue=0x00000000
name=(none)
Apparently this is actually a known polynomial, known as AUTOSAR CRC. I just forgot to include this poly in the python code above.
Oh and the checksum was not little endian. It was big endian. Those
sneaky bastards. I got lucky that RevEng
tries for that too.
So now all I have to do is
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|
And I get my nice little SUCCESS message:
Download xxxxxApp Length is 80681 Byte!
Check New xxxxxxx_App CRC Success!
Recvd:
File Name : HACKD
Size: 80681 Bytes
The device boots, and I'm free to do all kinds of modifications.
Epilogue
Hey, wait, so all that math was for nothing?
Yeah it was brute force in the end. The polynomial was even standard. Yeah I could've skipped all that math stuff if I was only a tiny bit more thorough. In fact, I could've spared you too.
But maybe the real checksum was that funky xor stuff we learned along the way.
Synthwave slides
AR glasses USB protocols: the Good, the Bad and the Ugly
If you need Augmented Reality problem solving, or want help implementing an AR or VR idea, drop us a mail at info@voidcomputing.hu