2020-01-18

While developing new eBPF program type, we need do some small test. We do not want to touch a lot of the libbpf or the higher bcc. What we need is just a eBPF program and loding it to kernel. This post is about how to do this. In this article, I will adds a eBPF program to kprobe tracepoint. This includes three parts, prepare the eBPF program, a loader to load this eBPF program and open the kernel function to kprobe.

Prepare a eBPF program

In Debian 9.1 we install a custome kernel(4.9.208). Go to the samples/bpf, and make(first need to isntall clang and llvm).

Add a test_bpf.c in samples/bpf directory.

    #include <uapi/linux/bpf.h>
    #include "bpf_helpers.h"

    int bpf_prog(void *ctx) {
        char buf[] = "Hello World!\n";
        bpf_trace_printk(buf, sizeof(buf));
        return 0;
    }

Add one line in samples/bpf/Makefile right place.

    always += test_bpf.o

Then type ‘make’ to compile this bpf program.

Now we get a ‘test_bpf.o’ file. But it contains a lot of ELF file metadata. We need to extract the eBPF program itself out.

First let’s see what the eBPF code is. The first try shows that the Debian’s built-in llvm tools is too old, it doesn’t support ‘-S’ option.

    root@192:/home/test/linux-4.9.208/linux-4.9.208/samples/bpf# llvm-objdump -arch-name=bpf -S test_bpf.o
    llvm-objdump: Unknown command line argument '-S'.  Try: 'llvm-objdump -help'
    llvm-objdump: Did you mean '-D'?

Got to ‘http://apt.llvm.org/’ and install the new clang and llvm.

    bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"

Use the new tool to see the eBPF program.

    root@192:/home/test/linux-4.9.208/linux-4.9.208/samples/bpf# llvm-objdump-9 -arch-name=bpf -S test_bpf.o

    test_bpf.o:	file format ELF64-unknown


    Disassembly of section .text:

    0000000000000000 bpf_prog:
        0:	b7 01 00 00 0a 00 00 00	r1 = 10
        1:	6b 1a fc ff 00 00 00 00	*(u16 *)(r10 - 4) = r1
        2:	b7 01 00 00 72 6c 64 21	r1 = 560229490
        3:	63 1a f8 ff 00 00 00 00	*(u32 *)(r10 - 8) = r1
        4:	18 01 00 00 48 65 6c 6c 00 00 00 00 6f 20 57 6f	r1 = 8022916924116329800 ll
        6:	7b 1a f0 ff 00 00 00 00	*(u64 *)(r10 - 16) = r1
        7:	bf a1 00 00 00 00 00 00	r1 = r10
        8:	07 01 00 00 f0 ff ff ff	r1 += -16
        9:	b7 02 00 00 0e 00 00 00	r2 = 14
        10:	85 00 00 00 06 00 00 00	call 6
        11:	b7 00 00 00 00 00 00 00	r0 = 0
        12:	95 00 00 00 00 00 00 00	exit
    root@192:/home/test/linux-4.9.208/linux-4.9.208/samples/bpf# 

As we can see, the eBPF code is contained in the .text section of ‘test_bpf.o’, it’s size is 13*8=104.

Use the dd to dump the eBPF code.

    root@192:/home/test/linux-4.9.208/linux-4.9.208/samples/bpf# dd if=./test_bpf.o  of=test_bpf bs=1 count=104 skip=64
    104+0 records in
    104+0 records out
    104 bytes copied, 0.000221178 s, 470 kB/s
    root@192:/home/test/linux-4.9.208/linux-4.9.208/samples/bpf# hexdump test_bpf
    0000000 01b7 0000 000a 0000 1a6b fffc 0000 0000
    0000010 01b7 0000 6c72 2164 1a63 fff8 0000 0000
    0000020 0118 0000 6548 6c6c 0000 0000 206f 6f57
    0000030 1a7b fff0 0000 0000 a1bf 0000 0000 0000
    0000040 0107 0000 fff0 ffff 02b7 0000 000e 0000
    0000050 0085 0000 0006 0000 00b7 0000 0000 0000
    0000060 0095 0000 0000 0000                    
    0000068

OK, now we have our eBPF program ‘test_bpf’, it contains a ‘helloworld’ eBPF program.

Open the perf event kprobe

And get the event id:

    root@192:/home/test/linux-4.9.208/linux-4.9.208# echo 'p:sys_clone sys_clone' >> /sys/kernel/debug/tracing/kprobe_events 
    root@192:/home/test/linux-4.9.208/linux-4.9.208# cat /sys/kernel/debug/tracing/events/kprobes/sys_clone/id 
    1254

Write a loader

The source code of loader is as following:

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <linux/bpf.h>
    #include <linux/version.h>
    #include <linux/perf_event.h>
    #include <linux/hw_breakpoint.h>
    #include <errno.h>


    int main()
    {
        int bfd;
        unsigned char buf[1024] = {};
        struct bpf_insn *insn;
        union bpf_attr attr = {};
        unsigned char log_buf[4096] = {};
        int ret;
        int efd;
        int pfd;
        int n;
        int i;
        struct perf_event_attr pattr = {};

        bfd = open("./test_bpf", O_RDONLY);
        if (bfd < 0)
        {
        printf("open eBPF program error: %s\n", strerror(errno));
        exit(-1);
        }
        n = read(bfd, buf, 1024);
        for (i = 0; i < n; ++i)
        {
        printf("%02x ", buf[i]);
        if (i % 8 == 0)
            printf("\n");
        }
        close(bfd);
        insn = (struct bpf_insn*)buf;
        attr.prog_type = BPF_PROG_TYPE_KPROBE;
        attr.insns = (unsigned long)insn;
        attr.insn_cnt = n / sizeof(struct bpf_insn);
        attr.license = (unsigned long)"GPL";
        attr.log_size = sizeof(log_buf);
        attr.log_buf = (unsigned long)log_buf;
        attr.log_level = 1;
        attr.kern_version = 264656;
        pfd = syscall(SYS_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
        if (pfd < 0)
        {
        printf("bpf syscall error: %s\n", strerror(errno));
        printf("log_buf = %s\n", log_buf);
        exit(-1);
        }

        pattr.type = PERF_TYPE_TRACEPOINT;
        pattr.sample_type = PERF_SAMPLE_RAW;
        pattr.sample_period = 1;
        pattr.wakeup_events = 1;
        pattr.config = 1254;
        pattr.size = sizeof(pattr);
        efd = syscall(SYS_perf_event_open, &pattr, -1, 0, -1, 0);
        if (efd < 0)
        {
        printf("perf_event_open error: %s\n", strerror(errno));
        exit(-1);
        }
        ret = ioctl(efd, PERF_EVENT_IOC_ENABLE, 0);
        if (ret < 0)
        {
        printf("PERF_EVENT_IOC_ENABLE error: %s\n", strerror(errno));
        exit(-1);
        }
        ret = ioctl(efd, PERF_EVENT_IOC_SET_BPF, pfd);
        if (ret < 0)
        {
        printf("PERF_EVENT_IOC_SET_BPF error: %s\n", strerror(errno));
        exit(-1);
        }
        while(1);
    }

Something notice:

  1. I first uses the eBPF as a ‘BPF_PROG_TYPE_TRACEPOINT’ to attach to the syscalls tracepoints. But it works, I was quite confusion about this until I read this: https://github.com/iovisor/bcc/issues/748 . So I switch to use kprobe.

  2. The ‘attr.kern_version’ is read from linux-4.9.208/usr/include/linux/version.h file ‘LINUX_VERSION_CODE’

  3. The last ‘while’ is to pin the eBPF in program, also there is method to pin I use ‘while’ to simplify thing.

Before we execute the ‘test_loader’, we first read the ‘/sys/kernel/debug/tracing/trace_pipe’ in Terminal 1, this is where the ‘bpf_trace_printk’ output goes.

Then run the ‘test_loader’ in Terminal 2and we can see the output from Terminal 1 as following:

    root@192:/home/test/linux-4.9.208/linux-4.9.208# cat /sys/kernel/debug/tracing/trace_pipe 
                bash-13708 [003] d... 51890.256702: : Hello World!
                bash-13708 [001] d... 51905.890740: : Hello World!
                bash-13708 [000] d... 52578.776651: : Hello World!
        gnome-shell-1429  [000] d... 52581.579554: : Hello World!
        gnome-shell-1429  [001] d... 52582.922830: : Hello World!
        gnome-shell-13773 [000] d... 52582.937085: : Hello World!


blog comments powered by Disqus