Quantcast
Channel: windows deep internals
Viewing all articles
Browse latest Browse all 264

epbf maps

$
0
0

As you can see from function bpf_map_alloc_id all bpf maps stored in balanced tree map_idr and synced on spinlock map_idr_lock. No surprise that you can`t view them in user-mode - there is bpf command BPF_MAP_GET_NEXT_ID but it can only enumerate ID of maps. So I add today some code to view bpf maps: lkmem -c -d -B gives output like

bpf_maps at 0xffffffff929c1880: 15
 [0] id 3 UDPrecvAge at 0xffff99e344f48000
  type: 1 BPF_MAP_TYPE_HASH
  key_size 8 value_size 8
 [1] id 4 UDPsendAge at 0xffff99e344cb4c00
  type: 1 BPF_MAP_TYPE_HASH
  key_size 38 value_size 8

also disasm of jitted ebpf code began to look better:
mov rdi, 0xffff99e344f48000 ; UDPrecvAge
call 0xffffffff90c191f0 ; __htab_map_lookup_elem

This letter explains that JIT replacing sequence of opcodes
bpf_mov r1, const_internal_map_id
bpf_call bpf_map_lookup

with direct loading of 64bit address of map (BPF_LD_IMM64 pseudo op). But this code is not optimal - every instruction occupy 10 bytes. Lets consider case where we employ constants pool and put all map addresses somewhere after function - sure this will require at least 8 bytes for each address + perhaps some space for alignment. But now we can produce code like:
mov rdi, qword [map1_addr wrt rip] ; 7 bytes
call __htab_map_lookup_elem
...
; somewhere after function
map1_addr: resq 1 ; jit should put real address of map here

if function has 3 or more reference to the same map we can have some decreasing of jitted code size

Viewing all articles
Browse latest Browse all 264

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>