2108274 botan.db
grep Err: botan.db | wc -l
540
Value is structure or union component.
Operand 0 is the structure or union (an expression).
Operand 1 is the field (a node of type FIELD_DECL).
Operand 2, if present, is the value of DECL_FIELD_OFFSET
Sounds easy? "In theory there is no difference between theory and practice." In practice you can encounter many other types in any combination, as in this relatively simple RTL:
(call_insn:TI 1482 1481 2856 35 (call (mem:QI (mem/f:DI (plus:DI (reg/f:DI 0 ax [orig:340 MEM[(struct Server_Hello_13 *)_325].D.264452.D.264115._vptr.Handshake_Message ] [340])
(const_int 24 [0x18])) [744 MEM[(int (*) () *)_199 + 24B]+0 S8 A64]) [0 *OBJ_TYPE_REF(_200;&MEM[(struct _Uninitialized *)&D.349029].D.305525._M_storage->3B) S1 A8])
(const_int 0 [0])) "/usr/local/include/c++/12.2.1/bits/stl_construct.h":88:18 898 {*call}
(expr_list:REG_CALL_ARG_LOCATION (expr_list:REG_DEP_TRUE (concat:DI (reg:DI 5 di)
(reg/f:DI 41 r13 [386]))
(nil))
(expr_list:REG_DEAD (reg:DI 5 di)
(expr_list:REG_DEAD (reg/f:DI 0 ax [orig:340 MEM[(struct Server_Hello_13 *)_325].D.264452.D.264115._vptr.Handshake_Message ] [340])
(expr_list:REG_EH_REGION (const_int 0 [0])
(expr_list:REG_CALL_DECL (nil)
(nil))))))
(expr_list:DI (use (reg:DI 5 di))
(nil)))
COMPONENT_REF
- COMPONENT_REF: Op1 contains the FIELD_DECL for field f3, and Op2 refers to
- COMPONENT_REF: Op1 contains the FIELD_DECL for field f2, and Op2 refers to
- COMPONENT_REF: Op1 contains the FIELD_DECL for field f1, and Op2 finally refers to the RECORD_TYPE/UNION_TYPE for SomeStruct
Pretty easy? Actually not - there are at least two problems:
- Both Op1 & Op2 can contain other node types, for example SSA_NAME
- A record anywhere in the chain can be nameless. For C++ you can find the enclosing class with the function get_containing_scope, but in C all nested nameless structures actually have TRANSLATION_UNIT_DECL as their scope. In that case there is a chance that the chain will be unlinked
Dirty hack: you don't even need the RECORD_TYPE for each field, because you can extract it from the field with DECL_CONTEXT
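The chain walk described above can be sketched with a toy model. This is not the GCC API: `Node`, `Kind` and `access_path` are hypothetical names invented for illustration, and `inner` plays the role of what is called Op2 above. The walk collects field names until it reaches something that is not a COMPONENT_REF, which covers both the normal RECORD_TYPE case and the SSA_NAME problem.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for tree codes; a real plugin would use TREE_CODE() instead.
enum class Kind { ComponentRef, RecordType, SsaName };

struct Node {
    Kind kind;
    std::string name;   // field name, record name, or SSA name
    const Node* inner;  // next link in the chain ("Op2" in the text), may be null
};

// Walks the chain from the outermost COMPONENT_REF inwards and
// returns a dotted path such as "SomeStruct.f1.f2.f3".
std::string access_path(const Node* n) {
    assert(n != nullptr && "caller must pass a node");
    std::vector<std::string> fields;
    while (n != nullptr && n->kind == Kind::ComponentRef) {
        fields.push_back(n->name);
        n = n->inner;
    }
    std::string path;
    if (n == nullptr) {
        path = "<unlinked>";          // chain ended without a base
    } else if (n->kind == Kind::RecordType) {
        path = n->name;               // normal case: named record
    } else {
        path = "<" + n->name + ">";   // some other node, e.g. an SSA_NAME
    }
    // Fields were collected outermost-first, so append them in reverse.
    for (auto it = fields.rbegin(); it != fields.rend(); ++it) {
        path += "." + *it;
    }
    return path;
}
```

When the chain ends in something other than a record, the sketch still returns a path, but with a bracketed placeholder as the base, so the caller can decide how to resolve it.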
SSA_NAME
MEM_REF
The type of the MEM_REF is the type the bytes at the memory location are interpreted as.
MEM_REF <p, c> is equivalent to ((typeof(c))p)->x... where x... is a
chain of component references offsetting p by c
The type can be extracted with TMR_BASE and the offset with TMR_OFFSET.
Well, it would be good to find the field at this offset, right? The first field can be extracted with TYPE_FIELDS and each next one with TREE_CHAIN. See the function dump_mem_ref for details
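The lookup itself is a linear scan over the field list. Below is a minimal sketch on a toy model, not the code of dump_mem_ref: `FieldInfo`, `field_at_offset`, `Sample` and `sample_fields` are hypothetical names, and the field list is built with `offsetof` instead of being read from TYPE_FIELDS and TREE_CHAIN.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a FIELD_DECL: name, byte offset and byte size.
struct FieldInfo {
    std::string name;
    std::size_t offset;
    std::size_t size;
};

// Returns the field covering `offset`, or nullptr if the offset
// falls into padding or past the end of the record.
const FieldInfo* field_at_offset(const std::vector<FieldInfo>& fields,
                                 std::size_t offset) {
    for (const FieldInfo& f : fields) {
        if (offset >= f.offset && offset < f.offset + f.size) {
            return &f;
        }
    }
    return nullptr;
}

// Example record used only to produce realistic offsets.
struct Sample {
    int a;
    double b;
    char c;
};

std::vector<FieldInfo> sample_fields() {
    assert(offsetof(Sample, a) == 0);  // first member of a standard-layout struct
    return {
        {"a", offsetof(Sample, a), sizeof(int)},
        {"b", offsetof(Sample, b), sizeof(double)},
        {"c", offsetof(Sample, c), sizeof(char)},
    };
}
```

Checking a range instead of an exact match means an offset pointing into the middle of a field still resolves to that field, which is useful when the MEM_REF offset addresses part of a larger member.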
ADDR_EXPR
& in C. Value is the address at which the operand's value resides
OBJ_TYPE_REF
Used to represent lookup in a virtual method table which is dependent on
the runtime type of an object. Operands are:
OBJ_TYPE_REF_EXPR: An expression that evaluates the value to use.
OBJ_TYPE_REF_OBJECT: Is the object on whose behalf the lookup is
being performed. Through this the optimizers may be able to statically
determine the dynamic type of the object.
OBJ_TYPE_REF_TOKEN: An integer index to the virtual method table.
The integer index should have as type the original type of
OBJ_TYPE_REF_OBJECT
So now we can collect all kinds of accesses to class/structure fields and methods. The only uncovered kind is the pointer to method: can it be tracked? Unfortunately not, neither where the offset of the method is assigned nor where it is called. I wrote a simple test, and the method get_ref looks like this in the disassembly:
mov eax, 33 ; just some const
mov edx, 0
pop rbp
(insn 6 3 7 2 (set (reg:DI 0 ax [orig:82 D.3252 ] [82])
(const_int 33 [0x21])) "vtest.cc":40:24 80 {*movdi_internal}
(nil))
For some unknown reason there is no OFFSET_REF or PTRMEM_CST in the RTL
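One plausible reading of the constant 33, assuming the Itanium C++ ABI that GCC uses on x86-64 Linux: a pointer to a virtual member function is a pair {ptr, adj}, where ptr is 1 plus the byte offset of the method's slot in the vtable and adj is the adjustment to this. With 8-byte pointers, 33 = 1 + 32 would be slot 4, and the zero in edx would be adj. The sketch below uses a hypothetical class (not the original vtest.cc) and hypothetical helpers `decode` and `vtable_slot` to inspect that representation. It relies on ABI details and is not portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class for illustration; not the author's test file.
struct Base {
    virtual int f0() { return 0; }
    virtual int f1() { return 1; }
    virtual int get_value() { return 2; }
};

using Method = int (Base::*)();

// Assumed layout of a pointer to member function under the Itanium C++ ABI.
struct PmfBits {
    std::ptrdiff_t ptr;  // function address, or 1 + vtable byte offset if virtual
    std::ptrdiff_t adj;  // adjustment applied to `this`
};

// Copies the raw bytes of the member pointer into the assumed layout.
PmfBits decode(Method m) {
    static_assert(sizeof(Method) == sizeof(PmfBits),
                  "unexpected pointer-to-member layout");
    PmfBits bits;
    std::memcpy(&bits, &m, sizeof bits);
    return bits;
}

// Converts the encoded value back to a vtable slot index.
std::ptrdiff_t vtable_slot(const PmfBits& bits) {
    assert((bits.ptr & 1) == 1 && "not a pointer to a virtual method");
    return (bits.ptr - 1) / static_cast<std::ptrdiff_t>(sizeof(void*));
}
```

Under these assumptions `vtable_slot(decode(&Base::get_value))` gives the slot of get_value. If this reading is right, by the time RTL is generated the member pointer is already just an integer constant, and nothing in it links back to the method.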