-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
trace exec upper_layer: fix BTF type size relocation #2400
Conversation
I also tested that this PR fixes the built-in gadget too:
|
struct _ovl_inode { | ||
struct inode vfs_inode; | ||
struct dentry *__upperdentry; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it can be fixed by adding __attribute__((preserve_access_index));
to this structure? (I think it'll force clang to emit a relocation when the container_of macro needs to calculate offsetof(type,member)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why is this structure so different from the upstream one? https://github.com/torvalds/linux/blob/v5.0/fs/overlayfs/ovl_entry.h#L91-L105
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I now understand, we have the inode, and we suppose the dentry is the next member on the structure, so that's why we don't care about any other members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it can be fixed by adding attribute((preserve_access_index)); to this structure? (I think it'll force clang to emit a relocation when the container_of macro needs to calculate offsetof(type,member))
cilium/ebpf emits a bad relocation when I add this attribute because it does not find struct _ovl_inode
in the kernel BTF.
const badRelo = 0xbad2310
I think I now understand, we have the inode, and we suppose the dentry is the next member on the structure, so that's why we don't care about any other members.
Correct. In that way, no need to define types for other fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cilium/ebpf emits a bad relocation when I add this attribute because it does not find struct _ovl_inode in the kernel BTF.
That's unfortunate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use "flavour names" like struct ovl_inode___gadget
. Note the three _
. Anything after the underscore is ignored by the library when trying to find a target. https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html#dealing-with-kernel-version-and-configuration-differences
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lmb Thanks for the link, I didn't know about struct flavors.
The following seems to work. Tested without cilium/ebpf#1300, so struct ovl_inode
is unknown to BTF and only the "else" branch is tested.
Is it the approach you recommend?
#include <vmlinux.h>
struct ovl_inode___gadget {
struct inode vfs_inode;
struct dentry *__upperdentry;
} __attribute__((preserve_access_index));
...
if (bpf_core_type_exists(struct ovl_inode___gadget)) {
struct ovl_inode___gadget *oi = container_of(
inode, struct ovl_inode___gadget, vfs_inode);
bpf_probe_read_kernel(&upperdentry, sizeof upperdentry,
&oi->__upperdentry);
} else {
bpf_probe_read_kernel(&upperdentry, sizeof upperdentry,
((void *)inode) +
bpf_core_type_size(struct inode));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something like checking for the existence of the field would be more reliable: https://github.com/libbpf/libbpf/blob/f81eef23b33c0dbf923e863a72ce51ea4d32e291/src/bpf_core_read.h#L151-L164
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave it a try: #2406
About bpf_core_field_exists
: I'll give it a try. If the struct does not exist, I guess bpf_core_field_exists
would just return false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi.
It seems to work but I am concerned about how it would behave if upstream changes.
Moreover, can you push it to citest/
? This way, we ensure everything is correct with regard to the CI. If everything is green on the CI side, I will approve it.
Best regards.
gadgets/trace_exec/program.bpf.c
Outdated
struct inode vfs_inode; | ||
struct dentry *__upperdentry; | ||
}; | ||
|
||
static __always_inline bool upper_layer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename it as has_upper_layer()
since it returns a bool
.
// struct ovl_inode defined in fs/overlayfs/ovl_entry.h | ||
// Unfortunately, not exported to vmlinux.h | ||
// and not available in /sys/kernel/btf/vmlinux | ||
// See https://github.com/cilium/ebpf/pull/1300 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we bring momentum to this PR and then use it once merged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be good. But even if the cilium/ebpf PR is merged, we still have work to do to make use of it in Inspektor Gadget:
- How an image-based gadget can specify which kernel module to load, so that the
run
command uses the correctKernelTypes
:
ebpf.CollectionOptions{
MapReplacements: mapReplacements,
Programs: ebpf.ProgramOptions{
KernelTypes: /* BTF from the kernel and the modules */,
},
}
- How does it interact with BTFGen? image-based gadgets: Support BTFGen #2045
bpf_core_type_size(struct inode)); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If one day the order of the fields in struct ovl_inode
changes upstream, this will not work anymore, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
Sysdig has the same issue, it makes the same assumption (see extract__exe_upper_layer).
To fix that, we will need:
- Add CO-RE support for kernel modules cilium/ebpf#1300
- Use it in the built-in gadget
- By loading /sys/kernel/btf/overlay when we have both
CONFIG_DEBUG_INFO_BTF=y
andCONFIG_DEBUG_INFO_BTF_MODULES=y
- With BTFGen when we have neither
- Some other solutions when we have
CONFIG_DEBUG_INFO_BTF=y
but notCONFIG_DEBUG_INFO_BTF_MODULES=y
?
- By loading /sys/kernel/btf/overlay when we have both
- Use it in image-based gadgets
- Add a way for the gadget to specify which modules to load
- Add BTFGen support for image-based gadgets image-based gadgets: Support BTFGen #2045
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
Sysdig has the same issue, it makes the same assumption (see extract__exe_upper_layer).
We should definitely take inspiration from strong behavior and not weak assumption like this one.
Can you open a follow-up PR to document this limitation?
Also, is this field shown by default? It should be hidden by default.
To fix that, we will need:
* [Add CO-RE support for kernel modules cilium/ebpf#1300](https://github.com/cilium/ebpf/pull/1300)
I will review and test this PR, can you please do so?
Even though this is not enough for this PR, this is something which would definitely be helpful for Inspektor Gadget, so we should definitely participate.
* Use it in the built-in gadget * By loading /sys/kernel/btf/overlay when we have both `CONFIG_DEBUG_INFO_BTF=y` and `CONFIG_DEBUG_INFO_BTF_MODULES=y` * With BTFGen when we have neither * [repository: BTFHUB should contain BTF files for all kernel modules aquasecurity/btfhub#70](https://github.com/aquasecurity/btfhub/issues/70) * Some other solutions when we have `CONFIG_DEBUG_INFO_BTF=y` but not `CONFIG_DEBUG_INFO_BTF_MODULES=y`? * Use it in image-based gadgets * Add a way for the gadget to specify which modules to load
This is not the goal of Inspektor Gadget to load modules.
But we can indeed, use the YAML to read which modules BTF under /sys/kerne/btf
the gadget needs.
* Add BTFGen support for image-based gadgets [image-based gadgets: Support BTFGen #2045](https://github.com/inspektor-gadget/inspektor-gadget/issues/2045)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should definitely take inspiration from strong behavior and not weak assumption like this one.
Can you open a follow-up PR to document this limitation?
Ok, I will.
Also, is this field shown by default? It should be hidden by default.
It's hidden by default.
I will review and test this PR, can you please do so?
I'll try as time allows.
This is not the goal of Inspektor Gadget to load modules.
But we can indeed, use the YAML to read which modules BTF under /sys/kerne/btf the gadget needs.
Yes, I meant loading the BTF file of a kernel module.
bpf_core_type_size() has to be used explicitly to ensure the correct BTF relocation will be used. Using sizeof or other implicit methods don't work. Fixes: 8c433c4 ("image-based trace exec: add upper_layer field") Fixes: a1d98b1 ("built-in trace exec: add upper_layer field") Signed-off-by: Alban Crequy <[email protected]>
a355c26
to
7c8dd8f
Compare
|
The CI is green, on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. It's still "hacky" but the final solution requires a lot of work as mentioned by Alban.
struct _ovl_inode { | ||
struct inode vfs_inode; | ||
struct dentry *__upperdentry; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cilium/ebpf emits a bad relocation when I add this attribute because it does not find struct _ovl_inode in the kernel BTF.
That's unfortunate.
bpf_core_type_size() has to be used explicitly to ensure the correct BTF relocation will be used.
Using sizeof or other implicit methods don't work.
Fixes: 8c433c4 ("image-based trace exec: add upper_layer field")
Fixes: a1d98b1 ("built-in trace exec: add upper_layer field")
Fixes: #2353
How to use
Check the upper layer field in the trace exec gadget on different kernels
Testing done
Deploy this DaemonSet on AKS with 2 node pools: