As mentioned in the (now deleted) post I wrote describing my plans for 2019, one of my goals this year is to get through at least 50 of the exercises on Dennis Yurichev's challenges.re. I've decided to document my progress in the form of writeups for the challenges I complete, batched in sets of ten exercises. For each challenge, I'll try to explain the intuitions that brought me closer to answering the recurring question from Yurichev, "[w]hat does this code do?"
Challenge #2
In nearly all of the challenges available on the site, we're given equivalent
disassembly listings of a function, f
, as generated by different compilers on
different processor architectures, and we're asked to describe what the code
does. For now, I've decided to take it easy and only pay attention to the
disassemblies for GCC on x86, as that's what I've done the most work with. We
aren't given a target operating system, but I think it's reasonable to assume
that the x86 code uses the cdecl calling convention.
Although I stayed within my comfort zone in terms of instruction set architecture, I refrained from my usual habit of converting the disassembly listing to AT&T syntax for once.
Below is a rough translation of the disassembly listing to C. My process is relatively unchanged from the workflow I described in an older post.
unsigned f(unsigned a) { // mov eax,DWORD PTR [esp+0x4] // bswap eax a = ((a & 0xff) << 24) | ((a & 0xff00) << 8) | ((a & 0xff0000) >> 8) | ((a & 0xff000000) >> 24); // mov edx,eax // and eax,0xf0f0f0f // and edx,0xf0f0f0f0 // shr edx,0x4 // shl eax,0x4 // or eax,edx a = ((a & 0xf0f0f0f) << 4) | ((a & 0xf0f0f0f0) >> 4); // mov edx,eax // and eax,0x33333333 // and edx,0xcccccccc // shr edx,0x2 // shl eax,0x2 // or eax,edx a = ((a & 0x33333333) << 2) | ((a & 0xcccccccc) >> 2); // and eax,0x55555555 // and edx,0xaaaaaaaa // add eax,eax // shr edx,1 // or eax,edx a = ((a & 0x55555555) << 1) | ((a & 0xaaaaaaaa) >> 1); // ret return a; }
I think it should make sense that add eax,eax
is mathematically equivalent to
imul eax, 2
, but it takes another step to see that it's equivalent to shl eax,1
,
which is represented in the C code as << 1
. This isn't terribly complicated, but
it's an optimization detail that some might not be familiar with. bswap
is an
instruction I was unfamiliar with, so I consulted my favorite x86 reference. It
converts the endianness of the word in the register. If you don't know what that
means, I'd suggest you read the page in the ISA reference.
The code seems nonsensical at first, but we can compile it and inspect the output given some test values.
#include <stdio.h> void main(void) { unsigned i; for (i = 0; i <= 256; i++) { printf("%010u %08x\n", i, i); printf("%010u %08x\n", f(i), f(i)); printf("\n"); } }
Which produces:
0000000000 00000000 0000000000 00000000 0000000001 00000001 2147483648 80000000 0000000002 00000002 1073741824 40000000 0000000003 00000003 3221225472 c0000000 0000000004 00000004 0536870912 20000000 ...
What's happening might not be immediately obvious, but there's a pattern in the column of input/output represented in hexadecimal. Let's have a look at the binary representations of a few pairs:
bin(0x00000001) # --> '0b00000000000000000000000000000001' bin(0x80000000) # --> '0b10000000000000000000000000000000' bin(0x00000003) # --> '0b00000000000000000000000000000011' bin(0xc0000000) # --> '0b11000000000000000000000000000000' # ... bin(0x0000004d) # --> '0b00000000000000000000000001001101' bin(0xb2000000) # --> '0b10110010000000000000000000000000'
My answer to the question is that f
reverses the bits of the word it is given.
Challenge #3
This time, we're given an array of 64 32-bit integers and a hint that "[t]he algorithm is well-known, but I've changed [the] constant so it wouldn't be googleable."
int f(unsigned n) { unsigned a, b; // mov edx, edi // shr edx // or edx, edi // mov eax, edx a = b = (n >> 1) | n; // shr eax, 2 // or eax, edx // mov edx, eax a = b = (a >> 2) | b; // shr edx, 4 // or edx, eax // mov eax, edx a = b = (b >> 4) | a; // shr eax, 8 // or eax, edx // mov edx, eax a = b = (a >> 8) | b; // shr edx, 16 // or edx, eax b = (b >> 16) | a; // imul eax, edx, 79355661 ; 0x4badf0d // shr eax, 26 a = (b * 0x4badf0d) >> 26; // mov eax, DWORD PTR v[0+rax*4] // ret return v[a]; }
The first thing that stood out to me was the presence of -1
in the array of
integers. Testing from 0
to UINT_MAX
, the only n
that returns -1
is 0
.
Interesting. It's also worth noting that the array contains every integer from
0, 31, so this function is using some rule to map the input space onto [0, 31].
If we inspect the values of f
for test values from 0
to UINT_MAX
:
f(1) = 31 f(2) = 30 f(3) = 30 f(4) = 29 f(5) = 29 f(6) = 29 f(7) = 29 f(8) = 28 f(9) = 28 f(10) = 28 f(11) = 28 f(12) = 28 f(13) = 28 f(14) = 28 f(15) = 28 f(16) = 27 f(17) = 27 f(18) = 27 f(19) = 27 f(20) = 27 f(21) = 27 f(22) = 27 f(23) = 27 f(24) = 27 f(25) = 27 f(26) = 27 f(27) = 27 f(28) = 27 f(29) = 27 f(30) = 27 f(31) = 27
There's a pattern of exponential growth here – every result occurs twice as
frequently as the previous result. Mathematically, this is \(31 - [log_2(n)]\)
where the brackets represent the Greatest Integer Function (\(f(x)\) returning
the largest integer less than or equal to \(x\)). This can be verified by
comparing the result of f
to the following function for some test values:
int my_f(unsigned n) { return 31 - ((int) (log(n) / log(2))); }
Challenge #4
This time around we're given an additional question to answer: "Some versions
have the 0x1010101
constant, some do not. Why?" I decided that I'd reverse the
x86 disassembly first, and then compare it to the other architectures.
unsigned f(unsigned a) { // mov edx,edi // shr edx,1 // and edx,0x55555555 // sub edi,edx a -= ((a >> 1) & 0x55555555); // mov eax,edi // shr edi,0x2 // and eax,0x33333333 // and edi,0x33333333 // add edi,eax a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // mov eax,edi // shr eax,0x4 // add eax,edi // and eax,0xf0f0f0f // imul eax,eax,0x1010101 // shr eax,0x18 // ret return (((a + (a >> 4)) & 0xf0f0f0f) * 0x1010101) >> 0x18 }
The past few challenges have shown us that a good way of reversing these bit-twiddling functions is to test a few input values and look at the binary representations of the input and output values.
In: 00000000 Out: 0 In: 00000001 Out: 1 In: 00000010 Out: 1 In: 00000011 Out: 2 ... In: 00001100 Out: 2 In: 00001101 Out: 3 In: 00001110 Out: 3 In: 00001111 Out: 4
It doesn't take much effort to see that the function is counting the number of bits set in the input. This was particularly interesting to me as I was asked to derive this algorithm for a past job interview (though I wasn't able to in the time given).
This falls apart for numbers larger than 0xff
, however. It returns the number of
bits plus some constant that changes depending on which bits in the higher bytes
are set. I'll assume that f
is only meant to be called with 8-bit integers.
With that, we can move onto the second question. The disassemblies for x86,
ARM64, and Thumb have the 0x1010101
constant, while the disassemblies for ARM
and MIPS do not.
Returning to the strategy of inspecting binary representations:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000001 00000001 00000001 00000001 00000000 00000000 00000000 00000010 00000010 00000010 00000010 00000010 ... 00000000 00000000 00000000 00010000 00010000 00010000 00010000 00010000 00000000 00000000 00000000 00010001 00010001 00010001 00010001 00010001 00000000 00000000 00000000 00010010 00010010 00010010 00010010 00010010 ... 00000000 00000000 00000000 11111101 11111101 11111101 11111101 11111101 00000000 00000000 00000000 11111110 11111110 11111110 11111110 11111110 00000000 00000000 00000000 11111111 11111111 11111111 11111111 11111111
It would appear that multiplying a 32-bit integer by 0x1010101
propagates the
low byte to the three bytes above it. This makes sense when you notice that the
multiplication is paired with a shr
of 0x18
– moving the highest byte into the
lowest byte.
Though, this doesn't really answer the question. What difference is there
between the architectures that use the 0x1010101
and the architectures that
don't? ARM and MIPS still do a shift by 0x18
, so what's going on?
Looking at the ARM example, the instructions around the shift are:
ADD r0,r0,r0,LSL #16 ADD r0,r0,r0,LSL #8 LSR r0,r0,#24
For MIPS, it looks like:
sll $3,$2,8 addu $2,$3,$2 sll $3,$2,16 addu $2,$2,$3 j $31 srl $2,$2,24
Both of these (humor me, I know the orders are different) are equivalent to:
a = (a << 8) + a; a = (a << 16) + a; a >> 24;
And, with some test values, we can see that this is equivalent to multiplication
by 0x1010101
and shifting by 24.
unsigned a(unsigned n) { n = (n << 8) + n; n = (n << 16) + n; return n >> 24; } unsigned b(unsigned n) { return (n * 0x1010101) >> 24; } void main(void) { for (unsigned i = 0; i < UINT_MAX; i++) { if (a(i) != b(i)) { printf("%u\n", i); } } }
I suspect the reason it doesn't show up in the ARM or MIPS disassemblies is due
to the fixed-width instruction encoding. The compiler likely decided it would be
less efficient to work with the 0x1010101
constant than to break it up into a
pair of shifts and additions.
Challenge #5
This is the first challenge we're given that has loops and conditionals, as
indicated by the telltale labels starting with ".L". Another initial observation
is that the first instruction in f
operates on %rsi
, and the third operates on
%rcx
, so it's very likely that this function has four parameters.
Translation to C is more involved than it was with the previous challenges, but
it is valuable as it makes the purpose of f
very clear. In lieu of an analysis
of inputs and outputs, I'll provide a few notes on the process of translation.
First, cmp
gave me a bit of trouble as I've been out of practice for some time
and the difference between AT&T and Intel syntax threw me for a loop.
Fortunately, the wikibooks for x86 assembly covers this in detail. In AT&T
syntax, the order is cmp subtrahend, minuend
, while in Intel syntax, the order
is cmp minuend, subtrahend
. The subtrahend is subtracted from the minuend, so,
in Intel syntax, cmp rcx, rsi; ja .L10
will jump if %rcx
is greater than %rsi
.
Looking further into the function, there is some dereferencing with BYTE PTR
,
which tipped me off that this was probably a function operating on a string.
There's a curious push rbx
, followed by a pop rbx
before the ret
. I ignored this
initially, taking it to be register preservation. It was. An intuition of what's
worth ignoring is valuable in reverse engineering.
Upon reaching .L16
, there are a lot of registers in use. It helped to look at
each register in isolation and see how they were used. For example, %r10
is used
in the following instructions: xor r10d, r10d
, add r10, 1
, lea rax, [rdi+r10]
,
and cmp r10, r11
. This is very typical of a for-loop counter. %r9
on the other
hand only shows up in two instructions: mov r9d, 1
, and cmovne r8d, r9d
. %r9
is
just used as a source of 1 for cmovne
, since there are no encodings for cmovne
that have an immediate source.
cmovne
was unfamiliar to me, so I did look it up in my favorite x86 reference.
It's a conditional move. movz
was similarly unfamiliar. It simply loads %bl
with
the source byte and zeroes out the higher portions of the register.
char *f(char *a, unsigned b, char *c, unsigned d) { // cmp rcx, rsi // ja .L10 if (d >= b) { // .L10: // xor eax, eax // ret return NULL; } // sub rsi, rcx // add rsi, 1 // mov r11, rsi b = b - d + 1; // je .L10 if (b == 0) { // .L10: // xor eax, eax // ret return NULL; } // test rcx, rcx // jne .L16 // mov rax, rdi // ret if (d == 0) { return a; } // .L16: // push rbx // xor r10d, r10d // mov r9d, 1 // ... // cmp r10, r11 // jne .L4 for (int i = 0; i != b; i++) { // xor r8d, r8d unsigned ret = 0; // .L4: // lea rax, [rdi+r10] // xor esi, esi // ... // add rsi, 1 // cmp rsi, rcx // jne .L8 for (int j = 0; j != d; j++) { // movzx ebx, BYTE PTR [rdx+rsi] // cmp BYTE PTR [rax+rsi], bl // cmovne r8d, r9d if (a[i] != c[j]) { ret = 1; } } // test r8d, r8d // je .L12 if (!ret) { // .L12: // pop rbx // ret return a + i; } } // xor eax, eax // pop rbx // ret return NULL; }
The variable names I chose are pretty opaque, but if you stare at this long
enough, it should be pretty clear that f
returns the offset of c
in a
. b
and d
are just the lengths of a
and c
respectively.
Challenge #6
An additional hint given for this exercise is that, "[t]his is one of the simplest exercises I made, but still this code can be served as useful library function and is certainly used in many modern real-world applications." I'll leave the relative addresses in my annotations of the disassembly, as it appears to be PIC.
For the sake of showing the mapping between assembly instructions and C code,
I'll first give a translation that uses goto
, followed by a cleaned up version.
// 0: push rbp // 1: mov rbp,rsp // 4: mov QWORD PTR [rbp-0x8],rdi // 8: mov QWORD PTR [rbp-0x10],rsi int f(char *a, char *b) { _start: // c: mov rax,QWORD PTR [rbp-0x8] // 10: movzx eax,BYTE PTR [rax] // 13: movsx dx,al // 17: mov rax,QWORD PTR [rbp-0x10] // 1b: mov WORD PTR [rax],dx *b = *a; // 1e: mov rax,QWORD PTR [rbp-0x10] // 22: movzx eax,WORD PTR [rax] // 25: test ax,ax // 28: jne 2c // 2a: jmp 38 if (*a & 0xffff != 0) { // 2c: add QWORD PTR [rbp-0x8],0x1 // 31: add QWORD PTR [rbp-0x10],0x2 // 36: jmp c a++; b++; goto _start; } // 38: pop rbp // 39: ret }
int f(char *a, char *b) { while (*a != '\0') { *b++ = *a++; } }
Cool. Yurichev wasn't lying, this is a damn simple exercise, but it is something
that's used in nearly every C program. It's strcpy
!
Challenge #7
This exercise gives the same hint as last time, and similarly uses address offsets instead of symbols.
Control flow isn't as initially obvious as some of the past exercises, but the
first instruction is a pretty good tell that this function takes a char *
as a
parameter, and the test dl,dl
was a good tell that the control flow depends on
the individual characters in that parameter. The 0x41
in that =lea
esi,[rdx-0x41]
instruction stood out to me, as 0x41
is 'A' in ASCII, and the
0x20
in the add edx,0x20
was also a big clue, as 'a' - 'A'
is 0x20
.
void f(char *a) { char *cur; // 0: movzx edx,BYTE PTR [rdi] // 3: mov rax,rdi // 6: mov rcx,rdi // 9: test dl,dl // b: je 29 // 29: repz ret if (*a == '\0') return; // 6: mov rcx,rdi cur = a; // 25: test dl,dl // 27: jne 10 while (*cur != '\0') { // 10: lea esi,[rdx-0x41] // 13: cmp sil,0x19 // 17: ja 1e // 19: add edx,0x20 // 1c: mov BYTE PTR [rcx],dl if (*cur - 0x41 <= 0x19) *cur += 0x20; // 1e: add rcx,0x1 // 22: movzx edx,BYTE PTR [rcx] cur++; } // 29: repz ret }
Just from the tells outlined in the previous paragraph, I don't even need to run
f
to know that it converts a
to lowercase, albeit only capable of transforming
capital ASCII characters (producing garbage for, say, a space character).
Challenge #8
The hint we're given this time is, "[t]his is one of the busiest algorithms under the hood, though, usually hidden from programmers. It implements one of the most popular algorithms in computer science. It features recursion and a callback function."
In preparation for an exercise that's would likely be more difficult than the
past few, I did a couple quick perusals to get a basic idea of the control flow,
the parameters, and the return values. The mov rbp,rdx
early on indicates that
there are at least three parameters.
There's a push rbp
instruction, but explicit creation of a stack frame. There
are also push r12
and push rbx
instructions. These all occur at the beginning of
the function, so we see some register preservation and an indication that these
are the registers that are going to be used in the code.
I find that a lot of reverse engineering involves getting good footing, so this is the information you want when starting out.
What I normally try to find out next is whether the parameters and return type
are integers or pointers: mov rsi,QWORD PTR [rbx]
, after %rsi
was moved into
%rbx
is a good tell that the second parameter is a pointer, likely to an array
of pointer as it's dereferenced as QWORD PTR
, and the call r12
tells me that the
first parameter is the callback that was mentioned in the hint. The js 40
after
testing the callback's return value tells me that its return value is signed –
probably an int, not a pointer – and the pair of mov rsi,QWORD PTR [rbx]
and
mov rdi,rbp
before the call indicate that it takes two parameters.
void *f(int (*a)(void *, int), void **b, int c) { int ret; // 0: push r12 // 2: test rsi,rsi // ... // 10: je 32 if (b == 0) { // 32: pop rbx // 33: pop rbp // 34: xor eax,eax // 36: pop r12 // 38: ret return NULL; } // r12 <- a // rbx <- b // rbp <- c while (1) { // (This code path is also duplicated at 49-54. The branch that // contains the duplicated code has been omitted, as the same // effect arises from this loop continuing to iterate. // // 18: mov rsi,QWORD PTR [rbx] // 1b: mov rdi,rbp // 1e: call r12 ret = a(*b, c); // 21: test eax,eax // 23: je 56 if (ret == 0) { // 56: mov rax,rbx // 59: pop rbx // 5a: pop rbp // 5b: pop r12 // 5d: ret return b; } // 25: js 40 else if (ret < 0) { // 40: mov rbx,QWORD PTR [rbx+0x10] b = b[4]; // 44: test rbx,rbx // 47: je 32 if (b == NULL) { // 32: pop rbx // 33: pop rbp // 34: xor eax,eax // 36: pop r12 // 38: ret return NULL; } } else { // 27: mov rbx,QWORD PTR [rbx+0x18] b = b[6]; // 2b: test rbx,rbx // 30: jne 18 if (b == NULL) { // 32: pop rbx // 33: pop rbp // 34: xor eax,eax // 36: pop r12 // 38: ret return NULL; } } } }
In deriving meaning from this, I have a bit of an advantage; I've just recently
implemented this exact algorithm for my university's computer systems principle
course. This is the search function for a binary search tree, which takes an
arbitrary comparison function, a
,, and returns the first node for which a
returns 0. The function returns NULL
if the item is not in the tree. c
is some
sort of "data" parameter for the callback function, hence why it isn't used in
the algorithm.
b
is probably a pointer to a struct looking something like the following:
struct tree_node { char data[0x10]; struct tree_node *left; struct tree_node *right; }
as QWORD PTR [rbx+0x10]
is followed when a
returns something less than 0
(represented in the struct as left
), and QWORD PTR [rbx+0x18]
is followed when a
returns something greater than 0 - (represented in the struct as right
).
This exercise is a little unusual. The hint mentions recursion, but this algorithm is entirely iterative. Perhaps it was implemented recursively in C, and the compiler performed some sort of tail-call optimization? I honestly have no idea.
Challenge #9
The hint we're given this time is, "[n]ow that's easy." I certainly hope it is.
This is the first challenge we're given that uses libc. It's also the first
challenge in which we see the compiler using binary search to optimize a
conditional with more than one branch. I tend to write these out as switch
statements whenever I see them, but it's perfectly reasonable for a compiler to
optimize an if
in the same way.
#include <stdio.h> #include <stdlib.h> int f(char a) { // sub rsp, 8 // movzx eax, BYTE PTR [rdi] switch (a) { // cmp al, 89 // je .L3 case 'Y': // cmp al, 121 // jne .L2 case 'y': // .L3: // mov eax, 1 // add rsp, 8 // ret return 1; // jle .L21 // ... // .L21: // cmp al, 78 // je .L6 case 'N': // ... // cmp al, 110 // je .L6 case 'n': // .L6: // xor eax, eax // add rsp, 8 // ret return 0; default: // .L2: // mov edi, OFFSET FLAT:.LC0 // call puts // xor edi, edi // call exit puts("error!"); exit(0); } }
Yurichev wasn't lying, this was an easy challenge. In fact, if I were reverse engineering a binary and came across something like this, I probably wouldn't bother translating the assembly to equivalent C. It's a function that converts a character to a boolean (in the sense of a prompt that asks the user for 'Y' or 'N' – "Yes" or "No") and exits prematurely if the character wouldn't make sense in that context.
Challenge #10
The hint time is "[t]his code snippet is short, but tricky. What does it do? It's used heavily in low-level programming and is well-known to many low-level programmers. There are several ways to calculate it, and this is the one of them."
The snippet really is short, clocking in at only four instructions, but I still
felt the need to break out Compiler Explorer for this one. The part about being
"used heavily in low-level programming" threw me off a bit, since I saw neg
and
thought that perhaps that'd correspond to the ~
operator in C, which I've only
seen used in very low-level bit shifting code. This initial assumption would've
led me astray, however, and I'm glad I took the extra minute to verify.
int f(int a) { return -a; }
f(int): push rbp mov rbp, rsp mov DWORD PTR [rbp-4], edi mov eax, DWORD PTR [rbp-4] neg eax pop rbp ret
int f(int a) { return ~a; }
f(int): push rbp mov rbp, rsp mov DWORD PTR [rbp-4], edi mov eax, DWORD PTR [rbp-4] not eax pop rbp ret
not
corresponds to ~
, and neg
corresponds to -
We're dealing with neg
here.
The equivalent C code for the snippet is given. Because I had Compiler Explorer
open already, I decided to throw this in there for kicks and giggles. x86-64 gcc
8.3 with -O2
spits out the exact same series of instructions as the challenge. I
love the predictability of C compilers.
int f(int a, int b) { return (a + b - 1) & -b; }
This doesn't answer our question, though. What does this do? We can test a few
values of a
and b
with the following snippet, replacing 2<<0
with various
constants.
int main(void) { int i, j; j = 2 << 0; for (i = 0; i < 256; i++) { printf("%-8x %-8x %-8x\n", i, j, f(i, j)); } }
0 2 0 1 2 2 2 2 2 3 2 4 4 2 4 5 2 6 6 2 6 7 2 8 8 2 8 9 2 a a 2 a b 2 c c 2 c d 2 e e 2 e f 2 10 ... 0 8 0 1 8 8 2 8 8 3 8 8 4 8 8 5 8 8 6 8 8 7 8 8 8 8 8 9 8 10 a 8 10 b 8 10 c 8 10 d 8 10 e 8 10 f 8 10 10 8 10 11 8 18 12 8 18
It would seem that this is some sort of "least multiple of \(b\) such that \(b < a\) given that \(b\) is a power of two, but I feel as though I'm grasping at straws here.
As a Gentoo user, I have the Linux source tree checked out at /usr/src/linux
,
and because the hint mentions low-level programming, I decided to create a
regular expression for the C I came up with and let ag
have a go at it.
ag "\\(.*-[^>].*\\).*&.*\\-" /usr/src/linux
yielded quite a few results. Before
I ran the command, I wasn't expecting much, thinking that my regex was too
permissive, but the first result I saw looked remarkably like the C expression I
had come up with – right at the beginning of sysv_readdir
in fs/sysv/dir.c
:
static int sysv_readdir(struct file *file, struct dir_context *ctx) { unsigned long pos = ctx->pos; struct inode *inode = file_inode(file); struct super_block *sb = inode->i_sb; unsigned long npages = dir_pages(inode); unsigned offset; unsigned long n; ctx->pos = pos = (pos + SYSV_DIRSIZE-1) & ~(SYSV_DIRSIZE-1); if (pos >= inode->i_size) return 0;
Hm. Remember how I mentioned that I expected neg
to correspond to a ~
? Well,
jumping back to Compiler Explorer:
int f(int a) { return ~a; }
f(int): mov eax, edi not eax ret
int f(int a) { return ~(a - 1); }
f(int): mov eax, edi neg eax ret
Modifying our search slightly to ag "\\(.*-[^>].*\\).*&.*\\~.*\\-.*1"
yields a
massive number of results, some of which are named macros. Here's one of them,
in include/uapi/linux/if_packet.h
:
#define TPACKET_ALIGN(x) (((x)+TPACKET_ALIGNMENT-1)&~(TPACKET_ALIGNMENT-1))
Cool. That makes me feel much more confident in my answer.
Challenge #11
The hint for this exercise is: "[t]his is a somewhat large function (in contrast to the other exercises in this blog), but heavily used nowadays in various software. As it can be clearly seen, it uses standard C/C++ functions including strlen() and sscanf(). Some other helper function is also used. I intentionally gave it this name to conceal its real function. So what does the whole code snippet do?"
I'd like to apologize in advance for the sloppiness of the code that follows.
Also, I've renamed helper
to is_hex_digit
, as it makes the code for f
clearer.
#include <string.h> #include <stdio.h> int is_hex_digit(char a) { // lea edx, [rdi-48] // mov eax, 1 // cmp edx, 9 // jbe .L2 if (a <= '9') { // .L2: // ret return 1; } // and edi, -33 // xor eax, eax // sub edi, 65 // cmp edi, 5 // setbe al // .L2: // ret return (a & -33) <= 'F' ? 1 : 0; } int f(char *a, char *b) { int len; int local_12; char *cur; char *end; char *dst; char *next; // push r15 // xor eax, eax // or rcx, -1 // push r14 // push r13 // push r12 // mov r12, rsi // push rbp // mov rbp, rsi // push rbx // mov rbx, rdi // sub rsp, 24 // repnz scasb // not rcx dst = b; cur = a; len = strlen(a); // lea r14, [rbx-1+rcx] // .L6: // cmp rbx, r14 // ja .L24 while (cur <= end) { // movsx eax, BYTE PTR [rbx] // ... // mov DWORD PTR [rsp+12], eax local_12 = (int) *cur; // lea r13, [rbx+1] next = cur + 1; // mov r15, r13 // cmp eax, 43 // jne .L7 if (*cur == '+') { // mov DWORD PTR [rsp+12], 32 local_12 = ' '; // jmp .L8 } else { // .L7: // cmp eax, 37 // jne .L8 // movsx edi, BYTE PTR [rbx+1] // call helper // test eax, eax // jne .L9 if (*cur == '%' && is_hex_digit(*(cur + 1))) { // .L9: // movsx edi, BYTE PTR [rbx+2] // lea r13, [rbx+3] next = cur + 3; // call helper // test eax, eax // je .L11 if (!is_hex_digit(*(cur + 2))) { // .L11: // or eax, -1 // jmp .L10 // .L10: // add rsp, 24 // pop rbx // pop rbp // pop r12 // pop r13 // pop r14 // pop r15 // ret return -1; } // lea rdx, [rsp+12] // xor eax, eax // mov esi, OFFSET FLAT:.LC0 // mov rdi, r15 // call __isoc99_sscanf // test eax, eax // je .L11 if (!sscanf(cur + 1, "%2X", &local_12)) { // .L11: // or eax, -1 // jmp .L10 // .L10: // add rsp, 24 // pop rbx // pop rbp // pop r12 // pop r13 // pop r14 // pop r15 // ret return -1; } } } // .L8: // test r12, r12 // je .L12 if (b != NULL) { // mov eax, DWORD PTR [rsp+12] // mov BYTE PTR [rbp+0], al *dst = local_12; } // .L12: // inc rbp // mov rbx, r13 // jmp .L6 dst++; cur = next; } // .L24: // mov eax, ebp // sub eax, r12d // .L10: // add rsp, 24 // pop rbx // pop rbp // pop r12 // pop r13 // pop r14 // pop r15 // ret return dst - b; }
This could very well be cleaned up. In fact, I'm not even sure that my
translation is completely correct, but I got to the point where I felt it was
"good enough" and I could explain that f
is a function for decoding a
percent-encoded string, where a
is the encoded string and b
is a destination to
decode to. If not for the telltale '+'
corresponding to a ' '
and use of a '%'
character, I probably would have spent more time cleaning up my translation and
making sense of it. But I've seen code like this many times in my life, it
really is "heavily used nowadays in various software."
I began this challenge by reversing helper
, which I think was a good move as it
gave me some footing. I didn't even notice '%' or '+' in f
at first, but the
realization that helper
worked with hexadecimal digits got me started on ideas
for what f
might do.
On the topic of helper
, the reason I was able to pick out that it's checking for
hexadecimal digits was realizing that \(a - 48 \leq 9\) is equivalent to \(a
\leq 49 + 9\). The comparison is otherwise pretty unclear. And I suspect that
the -33
is related to how ASCII is encoded.
The control flow for f
is pretty intimidating with its 8 labels. When it came
time to look at f
, I drew out a rudimentary control flow graph on paper –
scribbling down the label names and drawing arrows between the different labels.
I found this to be very useful in identifying which jumps are loops (cycles in
the graph), which are conditionals (branches), and which labels are related
(linear relationships).
Hello Jakob, The function in challenge 6 appears to be a conversion from an ANSI string to a Unicode ones It does a string copy but adds a 0x00 in between each char, that's the reason of the add 0x2.