jitted eBPF code

December 1, 2021, 5:55 am


Yesterday I added a disassembler for jitted eBPF code. To put it mildly, the generated code is very poor

Every function has 7 bytes of nops in its prologue. A comment says that this is for the BPF trampoline - well, ok

Lots of code like

mov eax, 0x1

cmp r14, 0x2

jnz 0xc0561497

xor eax, eax

0xc0561497:

...

Somebody, tell them about the cmovXX instructions

Lots of code like

mov rdi, 0xffff8fd687f3e000

add rdi, 0x110

Lots of repeated instructions:
and rdi, 0xfff

and rdi, 0xfff

this is obviously a bug

And finally

you can patch it. Sure, it is protected as read-only - see the call to bpf_jit_binary_lock_ro in function bpf_int_jit_compile - but

you can use the old trick with cr0
you can call set_memory_rw

and yes - these patches are very hard to detect. Really HARD

↧

overhead of eBPF JIT

December 4, 2021, 5:39 am


Let's try to estimate the overhead of the JIT compiler

I wrote a simple perl script - it just counts redundant bytes for several cases:

the pair mov reg, rbp/add reg, imm (total length 7 bytes) can be replaced with lea reg, [rbp-imm], which is only 4 bytes
the pair mov reg, imm/add reg, imm can be replaced with a single load of the final address, so the second instruction can be removed
add reg, 1/sub reg, 1 (length 4 bytes) can be replaced with inc/dec reg (length 3 bytes)

etc etc

Results

total: 105374 odd 2439 2.3%

Other samples show similar overhead - between 2.3 and 3.4%
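The counting itself is easy to sketch. This is not my perl script, just a minimal Python illustration of the three rules above; the tuple layout (mnemonic, destination, source, length in bytes) is an assumption made for the example, a real tool would take it from a disassembler.

```python
def count_redundant(insns):
    """Count bytes that the three peephole rules above would save.

    insns: list of (mnemonic, dst, src, length) tuples in program order.
    src is a register name (str) or an immediate (int). Hypothetical layout.
    """
    saved = 0
    i = 0
    while i < len(insns):
        mnem, dst, src, length = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        is_pair = (mnem == "mov" and nxt is not None and nxt[0] == "add"
                   and nxt[1] == dst and isinstance(nxt[2], int))
        if is_pair and src == "rbp" and length + nxt[3] == 7:
            saved += 7 - 4          # 7-byte pair -> 4-byte lea reg, [rbp+imm]
            i += 2
            continue
        if is_pair and isinstance(src, int):
            saved += nxt[3]         # fold the add into the loaded constant
            i += 2
            continue
        if mnem in ("add", "sub") and src == 1 and length == 4:
            saved += 1              # 4-byte add/sub reg, 1 -> 3-byte inc/dec
        i += 1
    return saved


sample = [
    ("mov", "rdi", "rbp", 3),
    ("add", "rdi", -8, 4),
    ("mov", "rdi", 0xFFFF8FD687F3E000, 10),
    ("add", "rdi", 0x110, 7),
    ("add", "rax", 1, 4),
]
total = sum(insn[3] for insn in sample)
print(f"total: {total} odd {count_redundant(sample)}")
```

The sample mixes one case of each rule, so the numbers say nothing about real programs; only the bookkeeping is the point.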

of course lots of code like

mov eax, 0x1

cmp r14, 0x2

jnz 0xc05674ab

xor eax, eax

c05674ab:

...

leave

ret

can be replaced with something like:
xor eax, eax

cmp r14, 0x2

setnz al

but it matters only in big IP filters

Unfortunately that's not all - we can see lots of repeated code like

mov [r13+0x58], bl

mov [r13+0x57], bl

...

mov [r13+0x39], bl

but that is a big question for the LLVM eBPF backend

↧

plugin for Binary Ninja

January 25, 2022, 5:26 am


Due to the sad fact that IDA Pro is moving to the cloud (just think about confidentiality) I decided to look at an alternative - Binary Ninja. The first impression was terrible

totally unfamiliar API. Guys, why not make some compatibility layer with IDAPython?
counterintuitive types in LLIL - for example, a constant pointer has type RegisterValue. whut?
I found a bug in the conversion of LLIL types to python types (and I suspect it is not the only one)

Anyway, after a couple of weeks I was able to write a simple plugin for finding functions which leave some linux kernel resource locked. Perhaps it can be adapted for the windows kernel too

↧

ida pro plugin for unpacking lzma compressed linux kernel

May 16, 2022, 6:25 am


UOS linux for mips64 ships a strange linux kernel which cannot be unpacked with the famous extract-vmlinux
Let's see what happens:

zimage_start = (unsigned long)(&__image_begin);
zimage_size = (unsigned long)(&__image_end) -
    (unsigned long)(&__image_begin);
...
/* Decompress the kernel with according algorithm */
__decompress((char *)zimage_start, zimage_size, 0, 0,
	   (void *)VMLINUX_LOAD_ADDRESS_ULL, 0, 0, error);

The problem is that System.map does not contain the symbols __image_begin & __image_end. Investigation showed that the compressed body of the kernel is located in the .data section, so the only unknown parameters for unpacking are the start address and the size of the unpacked data. Fortunately the compressed blob (lzma is used here) carries the size of the unpacked data in its last DWORD. And the address can be taken from System.map, from the symbol _text

So the logic of the plugin is

get the filename of the input file
derive the name of the matching System.map from it
read this System.map
try to find xrefs in the .data section - the only two will be __image_begin & __image_end
unpack
add a new segment (this was the most terrible part of development - IDA Pro failed several times, producing memory dumps)
put the unpacked data into the newly added segment
profit
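Outside of IDA the unpacking part can be sketched in a few lines of Python. This is not the plugin code, just an illustration under two assumptions: the blob is in the legacy .lzma container, and the trailing DWORD is little-endian. The System.map line in the usage note is made up.

```python
import lzma
import struct


def find_symbol(system_map_text, name):
    """Return the address of `name` from System.map text, or None."""
    for line in system_map_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2] == name:
            return int(parts[0], 16)
    return None


def unpack_appended_lzma(blob):
    """Decompress an lzma blob whose last 4 bytes hold the unpacked size."""
    if len(blob) < 5:
        raise ValueError("blob too short")
    # Assumption: the size DWORD is stored little-endian.
    expected = struct.unpack("<I", blob[-4:])[0]
    dec = lzma.LZMADecompressor(format=lzma.FORMAT_ALONE)
    data = dec.decompress(blob[:-4])
    if len(data) != expected:
        raise ValueError(f"size mismatch: got {len(data)}, expected {expected}")
    return data
```

Usage would look like find_symbol(open("System.map").read(), "_text") to get the load address, and unpack_appended_lzma(blob) on the bytes between __image_begin and __image_end. The size check is a cheap way to notice that the wrong range was picked.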

Link to github

↧

ida pro plugin to handle loongson elf relocs

May 22, 2022, 6:34 am


It seems that you can't just go ahead and implement your own proc_def_t for a processor module, because the IDA Pro SDK doesn't include the needed symbols. You will just get something like

1>reg.obj : error LNK2019: unresolved external symbol "public: __cdecl proc_def_t::proc_def_t(struct elf_loader_t &,class reader_t &)" (??0proc_def_t@@QEAA@AEAUelf_loader_t@@AEAVreader_t@@@Z) referenced in function "public: virtual __int64 __cdecl xxx_t::on_event(__int64,char *)" (?on_event@xxxson_t@@UEAA_J_JPEAD@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_supports_relocs(void)const " (?proc_supports_relocs@proc_def_t@@UEBA_NXZ)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual char const * __cdecl proc_def_t::proc_handle_reloc(struct rel_data_t const &,struct sym_rel const *,struct elf_rela_t const *,struct reloc_tools_t *)" (?proc_handle_reloc@proc_def_t@@UEAAPEBDAEBUrel_data_t@@PEBUsym_rel@@PEBUelf_rela_t@@PEAUreloc_tools_t@@@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_create_got_offsets(struct Elf64_Shdr const *,struct reloc_tools_t *)" (?proc_create_got_offsets@proc_def_t@@UEAA_NPEBUElf64_Shdr@@PEAUreloc_tools_t@@@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_perform_patching(struct Elf64_Shdr const *,struct Elf64_Shdr const *)" (?proc_perform_patching@proc_def_t@@UEAA_NPEBUElf64_Shdr@@0@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_can_convert_pic_got(void)const " (?proc_can_convert_pic_got@proc_def_t@@UEBA_NXZ)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual unsigned __int64 __cdecl proc_def_t::proc_convert_pic_got(class segment_t const *,struct reloc_tools_t *)" (?proc_convert_pic_got@proc_def_t@@UEAA_KPEBVsegment_t@@PEAUreloc_tools_t@@@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual char const * __cdecl proc_def_t::proc_describe_flag_bit(unsigned int *)" (?proc_describe_flag_bit@proc_def_t@@UEAAPEBDPEAI@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_load_unknown_sec(struct Elf64_Shdr *,bool)" (?proc_load_unknown_sec@proc_def_t@@UEAA_NPEAUElf64_Shdr@@_N@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual char const * __cdecl proc_def_t::proc_handle_dynamic_tag(struct Elf64_Dyn const *)" (?proc_handle_dynamic_tag@proc_def_t@@UEAAPEBDPEBUElf64_Dyn@@@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_is_acceptable_image_type(unsigned short)" (?proc_is_acceptable_image_type@proc_def_t@@UEAA_NG@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual void __cdecl proc_def_t::proc_on_start_data_loading(struct elf_ehdr_t &)" (?proc_on_start_data_loading@proc_def_t@@UEAAXAEAUelf_ehdr_t@@@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_on_end_data_loading(void)" (?proc_on_end_data_loading@proc_def_t@@UEAA_NXZ)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual void __cdecl proc_def_t::proc_on_loading_symbols(void)" (?proc_on_loading_symbols@proc_def_t@@UEAAXXZ)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_handle_symbol(struct sym_rel &,char const *)" (?proc_handle_symbol@proc_def_t@@UEAA_NAEAUsym_rel@@PEBD@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual void __cdecl proc_def_t::proc_handle_dynsym(struct sym_rel const &,unsigned int,char const *)" (?proc_handle_dynsym@proc_def_t@@UEAAXAEBUsym_rel@@IPEBD@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual int __cdecl proc_def_t::proc_handle_special_symbol(struct sym_rel *,char const *,unsigned short)" (?proc_handle_special_symbol@proc_def_t@@UEAAHPEAUsym_rel@@PEBDG@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_should_load_section(struct Elf64_Shdr const &,unsigned int,class _qstring<char> const &)" (?proc_should_load_section@proc_def_t@@UEAA_NAEBUElf64_Shdr@@IAEBV?$_qstring@D@@@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_on_create_section(struct Elf64_Shdr const &,class _qstring<char> const &,unsigned __int64 *)" (?proc_on_create_section@proc_def_t@@UEAA_NAEBUElf64_Shdr@@AEBV?$_qstring@D@@PEA_K@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual char const * __cdecl proc_def_t::calc_procname(unsigned int *,char const *)" (?calc_procname@proc_def_t@@UEAAPEBDPEAIPEBD@Z)
1>reg.obj : error LNK2001: unresolved external symbol "public: virtual unsigned __int64 __cdecl proc_def_t::proc_adjust_entry(unsigned __int64)" (?proc_adjust_entry@proc_def_t@@UEAA_K_K@Z)
1>D:\ida75\procs\xxx64.dll : fatal error LNK1120: 21 unresolved externals

So I wrote a plugin to handle ELF relocs for this new fashionable Chinese processor.
Source

Some description of the relocs can be found here

↧

reversing of sunway sw64 ISA

June 4, 2022, 1:22 am


It seems that the Chinese are hiding information about another homemade processor of theirs, sw64 - try to find some technical details with google, baidu or gitee. At the same time they ported linux to this processor - and you can even find some details in the openEuler project. I think this conspiracy is very funny, and at the very least it violates the licenses of binutils/clang/gcc etc

Anyway, let's see if we can reverse the ISA of sw64 having only a linux image and some source code from the linux kernel (spoiler: and also write a processor module for IDA Pro)

registers

Try to compare the registers of sw64 with Alpha AXP - can you find any difference? At least we now know that the processor has 32 general purpose registers and 32 floating point ones, so the register fields in the encoding must be 5 bits wide

ELF relocs

Relocs can be extracted from arch/sw_64/include/asm/elf.h. So the next thing I wrote was a small IDA Pro plugin to apply these relocs - nothing special, actually it is an almost exact copy of the same plugin for LoongArch

mnemonics

So where can we get mnemonics? They are usually stored somewhere inside binutils, so I just loaded libopcodes-2.31.1-system.so into IDA Pro and dumped the operands table with a simple idc script

I also compared how many mnemonic names match the Alpha processor. We have 383 names in total:

ida pro knows 74 in its processor module for Alpha
binutils knows 84 in opcodes/alpha-opc.c

Not a very big intersection, so we must employ some reversing techniques. The table with opcodes (slightly edited to remove two fields with only zeros) looks like

lldw 20000000 FC00F000 800 A2701

Meaning of the fields:

name of the instruction
value to match after AND with the mask
mask
unknown - this field contains only 2 values, 800 & 2000, so perhaps it is a family or an assembler option
the most interesting field - the schema for operands encoding
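The matching rule from fields 2 and 3 can be sketched in Python. Only the lldw row comes from the dumped table; the function name and the tuple layout are invented for the example.

```python
# (name, value, mask) rows; only lldw is taken from the dumped table above.
OPCODES = [
    ("lldw", 0x20000000, 0xFC00F000),
]


def match(word, table=OPCODES):
    """Return the names of all table rows whose masked bits equal `word`'s."""
    return [name for name, value, mask in table if (word & mask) == value]


print(match(0x20000123))  # bits outside the mask are ignored
print(match(0x20001000))  # bit 12 is covered by the mask, so no row matches
```

It returns a list rather than a single name on purpose: several rows can match the same word, and then the operand schema has to pick the winner.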

So we can write some perl script and quickly realize that we have lots of duplicates. Obviously these are pseudo instructions whose name changes depending on the values of the operands. Some examples:

nop 40000740 90807

clr 40000740 30807

mov 40000740 30207

or 40000740 F0201

operands decoding

Let's look at arch/sw_64/net/bpf_jit.h. We can see that the opcode field is the 6 highest bits; for opcode SW64_BPF_OPCODE_ALU_REG we have 3 registers (ra, rb and rc), and for SW64_BPF_OPCODE_ALU_IMM we have 2 registers (ra & rc) and an imm value. Bit offsets of the register fields:

ra 21
rb 16
rc 0
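With these offsets and 5-bit register fields, pulling the fields out of a 32-bit word is plain shifting and masking. A small Python sketch (the function name is mine, and the position of the imm field is left out because it is not listed above):

```python
def decode_fields(word):
    """Split a 32-bit sw64 instruction word into opcode and register fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # 6 highest bits
        "ra": (word >> 21) & 0x1F,
        "rb": (word >> 16) & 0x1F,
        "rc": word & 0x1F,
    }


# Build a word by hand: opcode 0x10, ra=1, rb=2, rc=3, then decode it back.
word = (0x10 << 26) | (1 << 21) | (2 << 16) | 3
print(decode_fields(word))
```

For the value 40000740 from the table this gives opcode 0x10 and zeros in all three register fields, which is what one would expect from a match value with the register bits masked out.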

Now we only need to understand where these operands live in the 5th field of our table - actually there are 3! = 6 combinations, but it seems that the order is ra, rb, rc:

A B C

nop 40000740 90807

clr 40000740 30807

mov 40000740 30207

or 40000740 F0201

We can conclude that 7 means zero rc, 8 zero rb and 9 zero ra (also 6 stands for floating point register ra and 0xc for floating point register rb), so nop is "or zr, zr, zr", clr is "or ra, zr, zr" and mov is "or ra, rb, zr"

Now, having all this info, we can quickly (he-he, it actually took 3 days) write a processor module for IDA Pro and then look at actual code in it

memory xrefs

and be disappointed, because judging by function emit_sw64_ldu64 I expected to see a long sequence loading a 64bit address followed by a call - something like

ldi reg, imm1

sll reg, 60

ldi reg, imm2

sll reg, 45

ldi reg, imm3

sll reg, 30

etc. But in reality we see something like:

ldih GP, PV, 2

ldi GP, GP, 0x2418

ldih PV, GP, 0

ldl PV, PV, -0x7BD8

call RA, PV, 0

I naively assumed that this is position independent code: the ldih/ldi pair just loads the address of the GOT, and the following ldih/ldl pair loads an address from it. I don't know if this is right for all cases, but the code also contains lots of weird sequences like this:
call RA, PV, 0 ; call with address in PV, store return address in RA

ldih GP, RA, 2 ; load some offset relative RA. weird

ldi GP, GP, 0x23D4

ldih PV, GP, 0

ldl PV, PV, -0x7EF0

In this case the value of GP is assigned based on the return address - for a call this is the address of the ldih instruction. I am too lazy to check whether GP gets a different value each time; it seems that just using the GOT address works fine for small modules where all data lies within a 15bit offset. But you can patch my function emu_insn to track the values of PV, then GP, and then find the call to get the address in RA

↧

position independent sw64 code

June 7, 2022, 1:54 am


Let's see what PIC looks like on sw64, using a function from libLLVM-7.so.1 (a huge shared library - 45Mb) as an example:

1000ED0 ldih GP, PV, 0x1D3

PV almost always contains the address of the called function, so the value of GP is now 1000ED0 + (1D3 << 16) = 2D30ED0
1000ED4 ldi GP, GP, -0x1290

The value of GP is now 2D30ED0 - 1290 = 2D2FC40. I expected this base address to always lie inside .got, but this is not true - it can lie anywhere, sometimes even outside the ELF module! All remaining refs use this base address in the GP register:

1000ED8 ldih PV, GP, 0

1000EDC ldl PV, PV, -0x4EC0

...

1000F14 call RA, PV, 0

1000F18 ldih GP, RA, 0x1D3 ; 2D30F18

1000F20 ldi GP, GP, -0x12D8 ; 2D2FC40

wait, WHAT? They use the return address in RA to fill GP with the same value 2D2FC40. And even worse - they restore the value of GP even in the epilogue, where it is not used
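Both GP calculations are easy to replay in Python. The semantics used here (ldih adds the immediate shifted left by 16, ldi adds the signed immediate as is) are inferred from the worked values in this post, not taken from any official sw64 manual.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def ldih(base, imm):
    """ldih rd, base, imm: base + (imm << 16), as inferred from the listing."""
    return (base + (imm << 16)) & MASK64


def ldi(base, imm):
    """ldi rd, base, imm: base + signed imm."""
    return (base + imm) & MASK64


# Prologue: PV holds the function address 1000ED0.
gp_prologue = ldi(ldih(0x1000ED0, 0x1D3), -0x1290)
# After the call at 1000F14: RA holds the return address 1000F18.
gp_after_call = ldi(ldih(0x1000F18, 0x1D3), -0x12D8)

print(hex(gp_prologue), hex(gp_after_call))
```

Both expressions land on 2D2FC40: the low immediate is adjusted by exactly the distance between the two anchor addresses (0x12D8 - 0x1290 = 0x48 = 0x1000F18 - 0x1000ED0).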

Let's estimate the size overhead. libLLVM-7.so.1 has 41337 functions and 8432116 instructions, of which 781997 set the value of GP. The rate is 781997 / 8432116 = 0.092740

Let's assume that each function needs to set up GP anyway, so the required number of instructions is 41337 * 2 = 82674. The remainder is 781997 - 82674 = 699323

Remove the unneeded GP setups from epilogues: 699323 - 82674 = 616649

This amount can easily be reduced by half - just store the calculated value of GP on the stack with stl gp, sp, offset (+41337 instructions) and then reload it when needed with ldl gp, sp, offset

So the actual number of instructions could be 616649 / 2 + 41337 + 82674 = 432336

New rate: 432336 / 8432116 = 0.05127

The overhead is 0.092740 - 0.05127 = 4.1%
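The same estimate as a few lines of Python, so it can be rerun for other libraries. The three input numbers are the ones measured above; the assumptions (2 setup instructions per function, the same number wasted in epilogues, half of the rest replaceable by a reload from the stack) are the ones stated in the text, and the variable names are mine.

```python
functions = 41337
instructions = 8432116
gp_setup = 781997                    # instructions that set GP

rate = gp_setup / instructions

prologue = functions * 2             # every function sets up GP once
remaining = gp_setup - prologue
without_epilogues = remaining - prologue

# Half of the remaining pairs become a single ldl; one stl per function.
optimized = -(-without_epilogues // 2) + functions + prologue  # ceil division

new_rate = optimized / instructions
print(f"rate {rate:.6f} new rate {new_rate:.5f} overhead {rate - new_rate:.1%}")
```

Ceiling division is used because 616649 is odd and the text rounds the half up.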

cool, almost 2Mb of code is just unnecessary

↧

ebpf maps

June 15, 2022, 5:52 am


As you can see from function bpf_map_alloc_id, all bpf maps are stored in the IDR tree map_idr and protected by the spinlock map_idr_lock. No surprise that you can't view them from user-mode - there is the bpf command BPF_MAP_GET_NEXT_ID, but it can only enumerate the IDs of maps. So today I added some code to view bpf maps: lkmem -c -d -B gives output like

bpf_maps at 0xffffffff929c1880: 15

[0] id 3 UDPrecvAge at 0xffff99e344f48000

type: 1 BPF_MAP_TYPE_HASH

key_size 8 value_size 8

[1] id 4 UDPsendAge at 0xffff99e344cb4c00

type: 1 BPF_MAP_TYPE_HASH

key_size 38 value_size 8

Also the disasm of jitted ebpf code began to look better:
mov rdi, 0xffff99e344f48000 ; UDPrecvAge

call 0xffffffff90c191f0 ; __htab_map_lookup_elem

This letter explains that the JIT replaces the sequence of opcodes

bpf_mov r1, const_internal_map_id
bpf_call bpf_map_lookup

with a direct load of the 64bit address of the map (BPF_LD_IMM64 pseudo op). But this code is not optimal - every such instruction occupies 10 bytes. Let's consider the case where we employ a constants pool and put all map addresses somewhere after the function - sure, this will require at least 8 bytes for each address + perhaps some space for alignment. But now we can produce code like:
mov rdi, qword [map1_addr wrt rip] ; 7 bytes

call __htab_map_lookup_elem

...

; somewhere after function

map1_addr: resq 1 ; jit should put real address of map here

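If a function has 3 or more references to the same map, the jitted code becomes smaller: each reference saves 10 - 7 = 3 bytes, which outweighs the 8-byte pool entry starting from the third reference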
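The break-even point can be checked with a trivial Python sketch. The instruction lengths are the ones quoted above; alignment padding of the pool is ignored, so real savings could be slightly smaller.

```python
MOVABS_LEN = 10      # mov rdi, imm64
RIP_LOAD_LEN = 7     # mov rdi, qword [rip + disp32]
POOL_ENTRY = 8       # one 64bit address in the constants pool


def bytes_saved(refs):
    """Net size change for `refs` references to one map (negative = bigger)."""
    return refs * (MOVABS_LEN - RIP_LOAD_LEN) - POOL_ENTRY


for refs in range(1, 6):
    print(refs, bytes_saved(refs))
```

With one or two references the pool entry costs more than it saves, and from three references on the function gets smaller.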

↧

ebpf opcodes patching

June 20, 2022, 5:30 am


Today I made a disasm for eBPF opcodes. Let's see what they look like:
85 00 00 00 C0 10 02 00 call 0x210C0

In jitted code this is call 0xffffffffb4c14110. ffffffffb4c14110 - 210C0 = FFFFFFFFB4BF3050, the address of __bpf_call_base. Suppose that we have some paranoid code in kernel mode that doesn't want to be traced with all this ebpf black magic - what can we do on a machine without JIT?
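Decoding such an opcode by hand is simple: an eBPF instruction is 8 bytes - one byte of opcode, one byte with two 4-bit register numbers, a signed 16-bit offset and a signed 32-bit immediate, all little-endian here. A short Python sketch (not my disasm, the function names are invented) that reproduces the arithmetic above:

```python
import struct

BPF_CALL = 0x85   # BPF_JMP | BPF_CALL
BPF_EXIT = 0x95   # BPF_JMP | BPF_EXIT


def decode_insn(raw):
    """Decode one 8-byte eBPF instruction into its fields."""
    code, regs, off, imm = struct.unpack("<BBhi", raw)
    return {"code": code, "dst": regs & 0xF, "src": regs >> 4,
            "off": off, "imm": imm}


def helper_address(call_base, insn):
    """Native address of a helper call: __bpf_call_base + imm."""
    if insn["code"] != BPF_CALL:
        raise ValueError("not a call instruction")
    return (call_base + insn["imm"]) & 0xFFFFFFFFFFFFFFFF


insn = decode_insn(bytes.fromhex("85000000C0100200"))
print(hex(insn["imm"]), hex(helper_address(0xFFFFFFFFB4BF3050, insn)))
```

The base address is the __bpf_call_base value computed above for this particular boot; with KASLR it will differ on every machine.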

First, we could just patch first opcode to

95 00 00 00 00 00 00 00 ret

Second - we could find some empty native function in the kernel (or even reuse __bpf_call_base) and patch the address of, let's say, htab_map_update_elem to it. Can some linux ~~ebpf-based~~ EDR detect this?

↧

pmu events

June 25, 2022, 6:42 am


Some details

pmus are stored in the tree pmu_idr and protected by the mutex pmus_lock. And as usual they can be used to blind eBPF. How? Let's see:

Generally speaking, there are usually four steps involved in attaching an eBPF program to a perf event:
Open the perf event
Load the eBPF program
Set the eBPF program on the perf event
Enable the perf event

We are interested in point 4 - enabling the perf event involves calling the pmu->event_init & pmu->add methods. And worse - all pmu structures are located in the .data section and are thus writable. So today I added some code to dump them:

lkmem -c -t -d

pmus at 0xffffffffb4a081b0: 6

[0] type 2 capabilities 0 at 0xffffffffb43c2a20 - kernel!perf_tracepoint

pmu_enable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

pmu_disable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_init: 0xffffffffb2639e50 - kernel!perf_tp_event_init

add: 0xffffffffb25d3600 - kernel!perf_trace_add

del: 0xffffffffb25d3680 - kernel!perf_trace_del

start: 0xffffffffb2639340 - kernel!perf_swevent_start

stop: 0xffffffffb2639350 - kernel!perf_swevent_stop

read: 0xffffffffb2639300 - kernel!perf_swevent_read

start_txn: 0xffffffffb2639360 - kernel!perf_pmu_nop_txn

commit_txn: 0xffffffffb2639370 - kernel!perf_pmu_nop_int

cancel_txn: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_idx: 0xffffffffb263d200 - kernel!perf_event_idx_default

check_period: 0xffffffffb2639380 - kernel!perf_event_nop_int

[1] type 5 capabilities 0 at 0xffffffffb43c2da0 - kernel!perf_breakpoint

pmu_enable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

pmu_disable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_init: 0xffffffffb264e010 - kernel!hw_breakpoint_event_init

add: 0xffffffffb264d810 - kernel!hw_breakpoint_add

del: 0xffffffffb264d800 - kernel!hw_breakpoint_del

start: 0xffffffffb264d7c0 - kernel!hw_breakpoint_start

stop: 0xffffffffb264d7e0 - kernel!hw_breakpoint_stop

read: 0xffffffffb2440e10 - kernel!hw_breakpoint_pmu_read

start_txn: 0xffffffffb2639360 - kernel!perf_pmu_nop_txn

commit_txn: 0xffffffffb2639370 - kernel!perf_pmu_nop_int

cancel_txn: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_idx: 0xffffffffb263d200 - kernel!perf_event_idx_default

check_period: 0xffffffffb2639380 - kernel!perf_event_nop_int

[2] type 6 capabilities 0 at 0xffffffffb43c2880 - kernel!perf_kprobe

pmu_enable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

pmu_disable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_init: 0xffffffffb263db00 - kernel!perf_kprobe_event_init

add: 0xffffffffb25d3600 - kernel!perf_trace_add

del: 0xffffffffb25d3680 - kernel!perf_trace_del

start: 0xffffffffb2639340 - kernel!perf_swevent_start

stop: 0xffffffffb2639350 - kernel!perf_swevent_stop

read: 0xffffffffb2639300 - kernel!perf_swevent_read

start_txn: 0xffffffffb2639360 - kernel!perf_pmu_nop_txn

commit_txn: 0xffffffffb2639370 - kernel!perf_pmu_nop_int

cancel_txn: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_idx: 0xffffffffb263d200 - kernel!perf_event_idx_default

check_period: 0xffffffffb2639380 - kernel!perf_event_nop_int

[3] type 7 capabilities 0 at 0xffffffffb43c26c0 - kernel!perf_uprobe

pmu_enable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

pmu_disable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_init: 0xffffffffb263db80 - kernel!perf_uprobe_event_init

add: 0xffffffffb25d3600 - kernel!perf_trace_add

del: 0xffffffffb25d3680 - kernel!perf_trace_del

start: 0xffffffffb2639340 - kernel!perf_swevent_start

stop: 0xffffffffb2639350 - kernel!perf_swevent_stop

read: 0xffffffffb2639300 - kernel!perf_swevent_read

start_txn: 0xffffffffb2639360 - kernel!perf_pmu_nop_txn

commit_txn: 0xffffffffb2639370 - kernel!perf_pmu_nop_int

cancel_txn: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_idx: 0xffffffffb263d200 - kernel!perf_event_idx_default

check_period: 0xffffffffb2639380 - kernel!perf_event_nop_int

[4] type 8 capabilities 81 at 0xffffffffb421f740 - kernel!pmu_msr

pmu_enable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

pmu_disable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_init: 0xffffffffb240c3e0 - kernel!msr_event_init

add: 0xffffffffb240c550 - kernel!msr_event_add

del: 0xffffffffb240c670 - kernel!msr_event_del

start: 0xffffffffb240c680 - kernel!msr_event_start

stop: 0xffffffffb240c660 - kernel!msr_event_stop

read: 0xffffffffb240c5a0 - kernel!msr_event_update

start_txn: 0xffffffffb2639360 - kernel!perf_pmu_nop_txn

commit_txn: 0xffffffffb2639370 - kernel!perf_pmu_nop_int

cancel_txn: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_idx: 0xffffffffb263d200 - kernel!perf_event_idx_default

check_period: 0xffffffffb2639380 - kernel!perf_event_nop_int

[5] type 9 capabilities 80 at 0xffff8ccc42814400 UNKNOWN

pmu_enable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

pmu_disable: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_init: 0xffffffffc0587000 - rapl

add: 0xffffffffc0587390 - rapl

del: 0xffffffffc0587290 - rapl

start: 0xffffffffc0587340 - rapl

stop: 0xffffffffc05871c0 - rapl

read: 0xffffffffc05871b0 - rapl

start_txn: 0xffffffffb2639360 - kernel!perf_pmu_nop_txn

commit_txn: 0xffffffffb2639370 - kernel!perf_pmu_nop_int

cancel_txn: 0xffffffffb263d210 - kernel!perf_pmu_nop_void

event_idx: 0xffffffffb263d200 - kernel!perf_event_idx_default

check_period: 0xffffffffb2639380 - kernel!perf_event_nop_int

↧

verification of jitted ebpf code

June 29, 2022, 1:46 am


There are some projects for running ebpf in usermode, but for verification purposes you need the same code that is used in the kernel. So I ripped out some jit code to run it in usermode

x64
powerpc
risc-v
s390
sparc
sunway sw64

And now we can verify jitted code - we have the actual generated code for some ebpf program, next we run the JIT for its ebpf opcodes in usermode, and finally we can compare them

↧

size of ebpf jit code on different processors

June 30, 2022, 9:42 am


It doesn't make much sense, but since I now have several jit compilers - why not compare the size of jitted code for different processors?

I chose 3 ebpf programs

a simple BPF_PROG_TYPE_CGROUP_SKB with only a comparison, 8 opcodes
BPF_PROG_TYPE_RAW_TRACEPOINT with 3 maps, 68 opcodes
a fairly complex BPF_PROG_TYPE_RAW_TRACEPOINT with 6 maps, 1824 opcodes

results

processor	1st	2nd	3rd
x64	54	312	8195
arm64	99	567	12959
powerpc	78	546	11462
risc-v	102	470	9494
s390	78	534	12622
sparc	79	482	10446

↧

PoC to blind pamspy

July 9, 2022, 6:30 am


Let's disasm the jit code of this spyware:

[8] prog 0xffffb02dc0133000 id 160 len 46 jited_len 215 aux 0xffff8ccb58fab400 used_maps 1 used_btf 0 func_cnt 0

tag: 0F 86 19 76 BC 37 68 B3

stack_depth: 16

num_exentries: 0

type: 2 BPF_PROG_TYPE_KPROBE

expected_attach_type: 0 BPF_CGROUP_INET_INGRESS

used maps:

[0] 0xffff8ccbc1b1c600 - rb

...

ffffffffc07bc801 e80a38e6f1 call 0xffffffffb2620010 ; bpf_ringbuf_submit

ffffffffc07bc806 31c0 xor eax, eax

ffffffffc07bc808 415e pop r14

ffffffffc07bc80a 415d pop r13

ffffffffc07bc80c 5b pop rbx

ffffffffc07bc80d c9 leave

ffffffffc07bc80e c3 ret

and in ebpf opcodes:

43 85 00 00 00 C0 CF 02 00 call 0x2CFC0 ; bpf_ringbuf_submit

44 B7 00 00 00 00 00 00 00 mov r0, 0

45 95 00 00 00 00 00 00 00 ret

Here 0x2CFC0 is the offset of bpf_ringbuf_submit from __bpf_call_base
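On the native side the target of such a call is encoded as a signed 32-bit displacement relative to the end of the 5-byte instruction. A small Python sketch to resolve it from the raw bytes shown in the listings (the function name is mine):

```python
import struct


def call_target(addr, raw):
    """Resolve the target of an x86-64 `call rel32` (opcode E8) at `addr`."""
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("not a call rel32 instruction")
    rel = struct.unpack("<i", raw[1:])[0]
    # The displacement is relative to the address of the next instruction.
    return (addr + 5 + rel) & 0xFFFFFFFFFFFFFFFF


print(hex(call_target(0xFFFFFFFFC07BC801, bytes.fromhex("e80a38e6f1"))))
```

Comparing the resolved target with the address expected for the helper is a cheap consistency check between the jitted code and the ebpf opcodes.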

The last call submits some data to the bpf map rb of type BPF_MAP_TYPE_RINGBUF. If we could patch this function, no data would be passed to usermode. How are these native function addresses filled in at all?

Long story short - they are filled in by the bpf verifier. It's really madness how complex this logic is: there is an array of bpf_verifier_ops for each type of bpf program, where the function pointer get_func_proto returns a structure bpf_func_proto which contains the address of the function to bind.

I wrote a simple PoC to patch the function address inside bpf_ringbuf_submit_proto to point to some ret instruction. Let's look again at the jit code (sure, you first need to load my driver and then run pamspy):

ffffffffc087ff99 e84704daf1 call 0xffffffffb26203e5 ; ptr to c3 byte

ffffffffc087ff9e 31c0 xor eax, eax

ffffffffc087ffa0 415e pop r14

ffffffffc087ffa2 415d pop r13

ffffffffc087ffa4 5b pop rbx

ffffffffc087ffa5 c9 leave

ffffffffc087ffa6 c3 ret

ebpf opcodes:

43 85 00 00 00 95 D3 02 00 call 0x2D395 ; FFFFFFFF812203E5

44 B7 00 00 00 00 00 00 00 mov r0, 0

45 95 00 00 00 00 00 00 00 ret

Especially wonderful is that all bpf programs loaded before this patch continue to work as usual, and moreover - if you unload my driver the patch will be reverted, but all bpf programs loaded with the patched bpf_func_proto will still contain the patched addresses!

Let's see which artifacts of my driver we can find:

./lkmem -r -d -c -B -t -k kernel5.13 System.map

mem at 0xffffffffb36387a0 (bpf_ringbuf_submit_proto) patched to 0xffffffffb26203e5

Perhaps to minimize the lifetime of this patch I could employ a kprobe, for example on bpf_check, and restore all patches in the kretprobe handler. Can your ~~ebpf based~~ EDR detect this?

↧

dirty secrets of ld.so

July 29, 2022, 6:04 am


As you may know, there are several ways to set the library path under linux:

the envvar LD_LIBRARY_PATH, but it can be removed somewhere inside the program, so /proc/pid/environ is useless (as usual, the official API exposes only useless trash while carefully hiding anything really important)
via the option --library-path to ld.so - like /lib64/ld-linux-x86-64.so.2 --library-path path someprogram. Again, the command line can be patched
via /etc/ld.so.conf - this file can also be patched after your program was launched

So a good question is "is there some trusted source to see what library path is in effect for some running program?"

Yes, this is ld.so itself - because it uses this data while dynamically loading modules. So long story short: the value from --library-path & LD_LIBRARY_PATH is stored in the variable library_path, and the whole set of directories in rtld_search_dirs

Bad news - they are not exported, and even worse - they are hard to find even using a disassembler

for example rtld_search_dirs has xrefs from

open_path
_dl_init_paths
_dl_map_object
_dl_rtld_di_serinfo

and only the last one is an exported symbol

Anyway, I wrote a PoC to get the offsets of these internal vars, like

LD_LIBRARY_PATH=~redp:/fake/tmp ./ldso /usr/lib/x86_64-linux-gnu/ld-2.31.so

library_path: 0x2d540
rtld_search_dirs: 0x2d840
base 7FC0E7D5B000
0x7ffd179a5339 /home/redp:/fake/tmp
0x7fc0e7d8aca0
system search path  /lib/x86_64-linux-gnu/
system search path  /usr/lib/x86_64-linux-gnu/
system search path  /lib/
system search path  /usr/lib/

↧

BTI incompatible exported functions in kernel 5.15.0-53

October 31, 2022, 10:45 am


if BTI is enabled, the first instruction encountered after an indirect jump must be a special BTI instruction

from here

I downloaded Ubuntu for arm64 (jammy-desktop-arm64.iso) and decided to check whether there are functions which don't contain BTI c at the start

There are 17804 such functions, while System.map-5.15.0-53-generic contains 62819 functions in total. Next I just intersected them with the exported ones - 1269

This is an obvious bug - maybe in gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0

At least some of these functions are really important - like register_ftrace_function
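The check itself boils down to looking at the first 4 bytes of every function. A minimal Python sketch (not the tool I used; the names are invented). Besides bti c it optionally accepts paciasp/pacibsp, which the architecture also treats as valid landing pads for indirect calls, so it is worth counting both ways before blaming the compiler.

```python
import struct

# A64 hint-space encodings.
BTI_C = 0xD503245F
BTI_JC = 0xD50324DF
PACIASP = 0xD503233F
PACIBSP = 0xD503237F


def is_call_landing_pad(code, accept_pac=True):
    """True if the first instruction of `code` can be the target of BLR."""
    if len(code) < 4:
        raise ValueError("need at least one 4-byte instruction")
    word = struct.unpack_from("<I", code)[0]  # A64 instructions are little-endian
    if word in (BTI_C, BTI_JC):
        return True
    return accept_pac and word in (PACIASP, PACIBSP)
```

Feeding it the first bytes of each function from System.map and counting the False results with accept_pac=False reproduces the strict "starts with BTI c" test described above.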

↧

linux drivers cross-compilation

November 29, 2022, 4:00 am


Just a reminder for myself on how to build a driver for arm64 on an x64 based machine running ubuntu

Install right gcc

for arm64 we need gcc-aarch64-linux-gnu:

sudo apt-get install gcc-aarch64-linux-gnu

Build Kernel

You cannot use the installed kernel and must build one for the appropriate architecture - in my case for arm64 (note - gcc has the prefix aarch64, C - consistency). Clone or unpack the kernel source tree to some directory KROOT and then

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- menuconfig
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- modules

Patch Makefile

Usual trick is to use something like

MACHINE ?= $(shell uname -m)
ifeq ($(MACHINE),x86_64)

but this gives you the arch of the host machine, so you must rewrite all such cases to use the ARCH variable (and set up make -C $(KROOT))

Building

and finally you can cross-compile your driver with something like

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -f Makefile.arm64

↧

timers in linux kernel

December 7, 2022, 3:58 am


Timers are a very important artifact for forensics; for example, Volatility even has a plugin to dump timers from the windows kernel. Unfortunately Volatility cannot dump timers from the linux kernel, so I implemented such a dump in my lkcd (with the -T option)

Kernel timers are just timer_list structures, and the most important field is

void (*function)(unsigned long);

because if your machine has been rootkited, probably one of the timers will contain an address from some unknown module. Timers are chained into linked lists via the entry field, and lots of these lists are stored in the array vectors inside the per-cpu variable timer_base. As you can see, there can be 2 instances of this structure - this depends on the undocumented config option CONFIG_NO_HZ_COMMON

Some timers are part of a so called workqueue - structure delayed_work. In that case timer_list.function contains the address of the exported function delayed_work_timer_fn

↧

dwarfdump

March 30, 2023, 6:13 am


I made a pale analog of the world famous pdbdump to dump types and functions from DWARF. Before introducing my tool I have a few words to say about DWARF - it is redundant, compiler-specific, inconsistent and dangerous

Redundancy

gcc and llvm put the full set of used types into every compilation unit. This is really terrible if you use lots of templates like the STL - you will get duplicated declarations of std::map, std::string. Yep, this is the main reason why stripped binaries become much smaller:

ls -l llvm-dwarfdump llvm-dwarfdump.stripped

-rwxrwxr-x 1 redp redp 471241104 mar 29 00:52 llvm-dwarfdump
-rwxrwxr-x 1 redp redp  22170696 mar 29 17:49 llvm-dwarfdump.stripped

Another example - let's check how many times console_printk is declared in the debug info of the linux kernel:

grep console_printk vm.g | wc -l
2883

It is the same declaration from file include/linux/printk.h, line 65, column 0xc - why can't the linker merge these duplicates when producing debug output?

Golang tries to fix this problem by declaring types once and then referring to them from other units (and at the same time compressing debug sections with zlib) - this is very ironic because Go binaries typically have a size of several MB anyway (btw llvm-dwarfdump cannot process compressed sections)

Compiler-specific

This one is pretty obvious, but just look at this:

 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <c>   DW_AT_name        : internal/cpu
    <19>   DW_AT_language    : 22       (Go)
    <1a>   DW_AT_stmt_list   : 0x0
    <1e>   DW_AT_low_pc      : 0x401000
    <26>   DW_AT_ranges      : 0x0
    <2a>   DW_AT_comp_dir    : .
    <2c>   DW_AT_producer    : Go cmd/compile go1.13.8
    <44>   Unknown AT value: 2905: cpu

I was unable to find the meaning of these custom attributes in the golang sources

Inconsistency

The DWARF specification doesn't define lots of important things. Just to name a few:

order of tags, so you can have a mix of formal parameters and types at the same nesting level
which attributes are mandatory for tags - I saw lots of missing DW_AT_sibling attributes, for example
encoding of addresses. You have DW_AT_low_pc for a function's address, but there are also DW_AT_abstract_origin and DW_AT_specification. Through these attributes the same function can have different addresses even in plain C:
 <1><191cde>: Abbrev Number: 194 (DW_TAG_subprogram)
    <191ce0>   DW_AT_external    : 1
    <191ce0>   DW_AT_name        : (indirect string, offset: 0x24d2f): perf_events_lapic_init
    <191ce4>   DW_AT_decl_file   : 1
    <191ce5>   DW_AT_decl_line   : 1719
    <191ce7>   DW_AT_decl_column : 6
    <191ce8>   DW_AT_prototyped  : 1
    <191ce8>   DW_AT_inline      : 1 (inlined)
 <1><19a945>: Abbrev Number: 96 (DW_TAG_subprogram)
    <19a946>   DW_AT_abstract_origin: <0x191cde>
    <19a94a>   DW_AT_low_pc      : 0xffffffff81004dc0
 <1><19b3c7>: Abbrev Number: 96 (DW_TAG_subprogram)
    <19b3c8>   DW_AT_abstract_origin: <0x191cde>
    <19b3cc>   DW_AT_low_pc      : 0xffffffff81007930

All of this leads us to the conclusion that DWARF is just

Dangerous

A true anti-debugging trick - what if the DW_AT_type attribute of a DW_TAG_pointer_type points to the same tag? How about a negative offset in DW_AT_sibling? I believe this is a very rich area for fuzzing
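A parser can defend itself against both tricks with a couple of cheap checks. The sketch below is a Python model and not code from any real DWARF reader: it assumes the DIEs were already decoded into a dict keyed by offset, and the function names are made up for the example.

```python
def follow_type_chain(dies, start):
    """Follow DW_AT_type references from `start`, refusing to loop forever.

    `dies` maps a DIE offset to a dict with a "tag" and an optional "type"
    (the offset that DW_AT_type refers to). This layout is an assumption.
    """
    seen = set()
    chain = []
    offset = start
    while offset is not None:
        if offset in seen:
            raise ValueError(f"DW_AT_type cycle at DIE {offset:#x}")
        if offset not in dies:
            raise ValueError(f"DW_AT_type points to unknown DIE {offset:#x}")
        seen.add(offset)
        chain.append(offset)
        offset = dies[offset].get("type")
    return chain


def sibling_is_sane(die_offset, sibling_offset, unit_end):
    """A usable DW_AT_sibling must point forward and stay inside the unit."""
    return die_offset < sibling_offset < unit_end


normal = {
    0x10: {"tag": "DW_TAG_pointer_type", "type": 0x20},
    0x20: {"tag": "DW_TAG_base_type"},
}
evil = {0x30: {"tag": "DW_TAG_pointer_type", "type": 0x30}}  # points to itself

print([hex(o) for o in follow_type_chain(normal, 0x10)])
try:
    follow_type_chain(evil, 0x30)
except ValueError as err:
    print("rejected:", err)
print(sibling_is_sane(0x100, 0x80, 0x1000))  # sibling pointing backwards
```

The visited set also catches longer loops (A points to B, B points back to A), not only the self-reference mentioned above.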

Features of dwarfdump

dwarfdump can parse little-endian 32- or 64-bit ELF files and supports compressed sections (from golang and SHF_COMPRESSED with zlib)

It can dump types (structures, unions, classes, enums etc), functions, methods (including vtbl index) and variables

Where possible it shows addresses of functions, methods (they can have several addresses - for example one per template specialization) and variables

It can also show the location of formal parameters, such as their stack offset or the register in which they are passed

dwarfdump has two output formats:

JSON
plain C and some subset of C++. This output may look strange for other languages like Go. I am too lazy to develop renderers for other languages

What is not supported

Inlined functions (tag DW_TAG_inlined_subroutine)

Local variables in functions - they are located inside lexical blocks together with local types. These local types can be included in the output with the -L option (but not local vars)

C++ template specializations - it seems that the C++ compiler includes this information in mangled names anyway, so this is not a big loss

And of course tags and attributes for languages other than C/C++ - like DW_TAG_with_stmt, DW_TAG_variant_part (used in Ada & Rust) or DW_TAG_dwarf_procedure (I don't even have an idea why it was added to the DWARF spec) etc

There are probably many more unsupported things I don't even know about

command-line options

-d - dump lots of useless debug output

-f - include functions. if omitted only types will be dumped

-g - because golang reuses types located in any compilation unit, output will be produced only after the whole debug info has been parsed

-j - produce JSON. if omitted the plain c/c++ renderer will be used

-k - keep dumped types - because in one module type A can be just declared (but with constructor/destructor for example) and its members can be defined in some other module

-l - add nesting level to JSON output for each type

-L - process lexical blocks. This significantly slows down processing

-o <output filename>. if omitted stdout will be used

-v - verbose mode

-V - include global variables

performance

Timings for processing the biggest module I was able to find on my Ubuntu - libLTO.so from fresh llvm (3.5 GB!):

ls -l libLTO.so.17git 
-rwxrwxr-x 1 redp redp 3519267480 mar 28 23:47 libLTO.so.17git

time objdump -g -Wi ./libLTO.so.17git | tail

real    10m50,079s
user    10m29,327s
sys    0m52,881s

time llvm-dwarfdump --debug-info libLTO.so.17git | tail

real    8m13,707s
user    8m10,424s
sys    0m36,764s

time ../dumper -v -V -f -k -L libLTO.so.17git | tail

real    1m11,879s
user    0m48,765s
sys    0m5,249s

dwarfdump outperforms objdump & llvm-dwarfdump because it parses only the necessary sections of debug info and simply skips unsupported tags, relying heavily on the DW_AT_sibling attribute. Given that almost every function has one or more lexical blocks, this cuts processing time at least in half
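The effect of DW_AT_sibling can be illustrated with a toy model. This is a sketch in Python and not the actual dwarfdump code: DIEs are assumed to be pre-decoded into a flat list in file order with a depth and an optional sibling offset, and the counter stands in for the cost of decoding an entry.

```python
def walk(dies, supported):
    """Return (offsets of kept DIEs, number of DIEs that had to be decoded).

    `dies` is a flat list in file order; each entry has "offset", "tag",
    "depth" and optionally "sibling". This layout is an assumption.
    """
    index_of = {die["offset"]: i for i, die in enumerate(dies)}
    kept, decoded = [], 0
    i = 0
    while i < len(dies):
        die = dies[i]
        decoded += 1
        if die["tag"] in supported:
            kept.append(die["offset"])
            i += 1
            continue
        target = index_of.get(die.get("sibling"))
        if target is not None and target > i:
            i = target  # fast path: jump over the whole subtree
            continue
        # slow path: no sibling, so every descendant must be decoded
        i += 1
        while i < len(dies) and dies[i]["depth"] > die["depth"]:
            decoded += 1
            i += 1
    return kept, decoded


SUPPORTED = {"DW_TAG_subprogram", "DW_TAG_formal_parameter"}
dies = [
    {"offset": 0x10, "tag": "DW_TAG_subprogram", "depth": 1},
    {"offset": 0x20, "tag": "DW_TAG_lexical_block", "depth": 2, "sibling": 0x50},
    {"offset": 0x30, "tag": "DW_TAG_variable", "depth": 3},
    {"offset": 0x40, "tag": "DW_TAG_variable", "depth": 3},
    {"offset": 0x50, "tag": "DW_TAG_formal_parameter", "depth": 2},
    {"offset": 0x60, "tag": "DW_TAG_subprogram", "depth": 1},
]
no_sibling = [{k: v for k, v in die.items() if k != "sibling"} for die in dies]

print(walk(dies, SUPPORTED))        # 4 entries decoded
print(walk(no_sibling, SUPPORTED))  # 6 entries decoded, same result
```

Both runs keep the same DIEs; the only difference is how many entries had to be decoded to get past the lexical block.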

example of output

A good sample of a structure with lots of anonymous nested types - restart_block

struct restart_block {
// Offset 0x0
long unsigned int arch_data;
// Offset 0x8
long int (*fn)(struct restart_block*);
// Offset 0x10
union {
  // Offset 0x0
  struct {
    // Offset 0x0
    u32* uaddr;
    // Offset 0x8
    u32 val;
    // Offset 0xC
    u32 flags;
    // Offset 0x10
    u32 bitset;
    // Offset 0x18
    u64 time;
    // Offset 0x20
    u32* uaddr2;
  } futex;
  // Offset 0x0
  struct {
    // Offset 0x0
    clockid_t clockid;
    // Offset 0x4
    enum timespec_type type;
    // Offset 0x8
    union {
      // Offset 0x0
      struct __kernel_timespec* rmtp;
      // Offset 0x0
      struct old_timespec32* compat_rmtp;
    };
    // Offset 0x10
    u64 expires;
  } nanosleep;
  // Offset 0x0
  struct {
    // Offset 0x0
    struct pollfd* ufds;
    // Offset 0x8
    int nfds;
    // Offset 0xC
    int has_timeout;
    // Offset 0x10
    long unsigned int tv_sec;
    // Offset 0x18
    long unsigned int tv_nsec;
  } poll;
};
};

Function with strange calling convention from linux kernel:
// Addr 0xFFFFFFFF810082F0
// TypeId 1A408F
// kobj id 1A40B1: OP_reg rdi
// attr id 1A40BF: OP_reg rsi
// i id 1A40CD: OP_reg rdx
umode_t not_visible(struct kobject* kobj,struct attribute* attr,int i);

Class tableRegNames:

// Size 0x18
struct tableRegNames {
// Offset 0x8
const const char** tab_;
// Offset 0x10
size_t tab_size;
// --- methods
 tableRegNames(struct tableRegNames* this,struct tableRegNames&&);
 tableRegNames(struct tableRegNames* this,const struct tableRegNames&);
// specification
//  addr AC42 type_id 39AE5 _ZN13tableRegNamesC2EPKPKcm
 tableRegNames(struct tableRegNames* this,const const char**,size_t);
// Vtbl index 2
// specification
//  addr AC90 type_id 39A78
virtual const char* reg_name(struct tableRegNames* this,unsigned int);
// specifications: 2
//  addr AD2A type_id 3998E _ZN13tableRegNamesD0Ev
//  addr ACFC type_id 399BA _ZN13tableRegNamesD2Ev
virtual  ~tableRegNames(struct tableRegNames* this,int);
};

↧

DWARF size overhead

March 31, 2023, 6:10 am

Today I made a simple script to estimate the size overhead due to type duplication. This is a hard task for C++, because some types can have specialized (or partially specialized) template parameters, and of course such types should be considered different. But for plain C we can safely take all top-level types and assume that types with the same name, declared at the same line and column, are equal
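The idea can be sketched like this. It is a Python illustration and not the original script: it assumes the top-level types were already extracted as (name, decl_line, decl_column, size) tuples, where size is the number of bytes the type's entry occupies in .debug_info, and all sample values are made up.

```python
from collections import defaultdict


def duplicated_bytes(types):
    """Sum the sizes of all copies of a type beyond the first one.

    Two entries count as the same type when name, declaration line and
    declaration column all match (the plain C assumption from the text).
    """
    groups = defaultdict(list)
    for name, line, column, size in types:
        groups[(name, line, column)].append(size)
    # keep the first copy of every type, count all the others as overhead
    return sum(sum(sizes[1:]) for sizes in groups.values())


# Made-up sample: list_head appears in three compilation units
sample = [
    ("list_head", 178, 8, 40),
    ("list_head", 178, 8, 40),
    ("list_head", 178, 8, 40),
    ("hlist_node", 186, 8, 30),
]
print(duplicated_bytes(sample))
```

Keeping the first copy and counting the rest is one possible choice; for C++ the grouping key would also have to include template parameters, which is exactly the hard part mentioned above.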

Next I ran this script on the objdump -g dump of the linux kernel. The script gave me the number 252741370

Let's find the size of the .debug_info section

objdump -h vmlinux | grep debug_info
 35 .debug_info   118205ec  0000000000000000  0000000000000000  03037230  2**0

Size is 0x118205ec = 293733868

And finally let's calculate the share of unnecessary info: 252741370 / 293733868 = 0.8604

I am shocked - 86%!!! Looks like a conspiracy of hard drive manufacturers
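The arithmetic is easy to re-check. A few lines of Python, using only the two numbers quoted above (the variable names are just for the example):

```python
# Numbers quoted in the text above
DUPLICATED = 252741370                   # reported by the script
DEBUG_INFO_SIZE = int("118205ec", 16)    # size column from objdump -h

share = DUPLICATED / DEBUG_INFO_SIZE
print(DEBUG_INFO_SIZE)    # 293733868
print(f"{share:.4f}")     # 0.8604
```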

↧

custom dwarf attributes in golang

April 10, 2023, 8:36 am

Finally I found them

0x2900

DW_AT_go_kind, form DW_FORM_data1. The internal golang kind of a type. For example DW_TAG_structure_type can have kind Struct, Slice or String. I made a script to collect statistics on which kinds can be attached to which dwarf tags

0x2901

DW_AT_go_key, form DW_FORM_ref_addr - tag ID. Can be attached to kindMap

0x2902

DW_AT_go_elem, form DW_FORM_ref_addr - tag ID. Can be attached to

kindChan
kindMap
kindSlice

0x2903

DW_AT_go_embedded_field, form DW_FORM_flag. If non-zero, the member is an embedded structure

0x2904

DW_AT_go_runtime_type, form DW_FORM_addr. I don't know what it is - it is certainly not a real VA, because it can point to random sections and even lie outside the ELF module

0x2905

DW_AT_go_package_name, form DW_FORM_string. Just the package name of the compilation unit

0x2906

DW_AT_go_dict_index, form DW_FORM_udata. Index of the dictionary entry describing the real type of this type shape
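With the list above, the "Unknown AT value" lines in a dump of a Go binary can be given names. The following is a small Python sketch: the table only repeats the codes and forms listed above, the annotate helper is hypothetical, and it assumes the dumper prints the attribute code in hex without a 0x prefix.

```python
# Attribute code -> (name, form), exactly as listed above
GO_DWARF_ATTRS = {
    0x2900: ("DW_AT_go_kind", "DW_FORM_data1"),
    0x2901: ("DW_AT_go_key", "DW_FORM_ref_addr"),
    0x2902: ("DW_AT_go_elem", "DW_FORM_ref_addr"),
    0x2903: ("DW_AT_go_embedded_field", "DW_FORM_flag"),
    0x2904: ("DW_AT_go_runtime_type", "DW_FORM_addr"),
    0x2905: ("DW_AT_go_package_name", "DW_FORM_string"),
    0x2906: ("DW_AT_go_dict_index", "DW_FORM_udata"),
}

MARKER = "Unknown AT value: "


def annotate(line):
    """Replace 'Unknown AT value: <hex>' with the Go attribute name if known."""
    pos = line.find(MARKER)
    if pos < 0:
        return line
    code_text, _, value = line[pos + len(MARKER):].partition(":")
    try:
        code = int(code_text.strip(), 16)  # assumption: code is printed in hex
    except ValueError:
        return line
    if code not in GO_DWARF_ATTRS:
        return line
    name, _form = GO_DWARF_ATTRS[code]
    return f"{line[:pos]}{name}:{value}"


print(annotate("    <44>   Unknown AT value: 2905: cpu"))
```

Lines with other attributes, or with codes outside the table, are returned unchanged.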

↧