A thorough introduction to eBPF

hacknat · on Dec 16, 2017

If anybody is interested, I wrote a Go-only library to create and interact with eBPF programs. It even parses the compiled ELF binary for you and maps it to your variable names:

https://github.com/nathanjsweet/ebpf

UncleEntity · on Dec 16, 2017

> However, eBPF opcode programs themselves must be governed by the GPLv2 anyways, so if you are distributing any software relying on this project you will probably be open-sourcing the most important part (the eBPF opcode) anyways.

Really?

That seems a bit harsh...

hacknat · on Dec 16, 2017

Not my decision, sorry. If you are using them for some kind of server-side tech then you’re probably in the clear, but if you are distributing any eBPF code you write to a consumer (think IoT) then you’ll have to think about it.

UncleEntity · on Dec 16, 2017

I'm just surprised they would impose that restriction in the kernel considering they have very little problem with closed-source binary blobs for drivers and whatnot.

icebraining · on Dec 16, 2017

Apparently you can now load non-GPL licensed eBPF programs, but they can't then access certain functions marked as "GPL only". This matches the behavior for kernel modules.
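
To picture the gate described in this comment, here is a minimal sketch in Python of how a loader could reject a non-GPL program that calls a GPL-only function. It is a toy model, not kernel code: the function name `check_helper_calls` is hypothetical, and both the set of license strings and the per-helper flags in the table are illustrative assumptions. The kernel source is the authority on the real values.

```python
# Toy model of a "GPL only" gate for eBPF helpers. Not kernel code.

# Assumed set of license strings treated as GPL-compatible (illustrative).
GPL_COMPATIBLE_LICENSES = {
    "GPL",
    "GPL v2",
    "Dual BSD/GPL",
    "Dual MIT/GPL",
    "Dual MPL/GPL",
}

# Illustrative flags only: helper name -> whether it is marked GPL-only.
HELPER_GPL_ONLY = {
    "bpf_map_lookup_elem": False,
    "bpf_trace_printk": True,
    "bpf_probe_read": True,
}


def check_helper_calls(license_str, helpers_called):
    """Return True if every helper may be called under this license.

    Raises PermissionError on the first GPL-only helper that a
    non-GPL-compatible program tries to call.
    """
    gpl_ok = license_str in GPL_COMPATIBLE_LICENSES
    for name in helpers_called:
        if HELPER_GPL_ONLY[name] and not gpl_ok:
            raise PermissionError(
                f"{name} is GPL-only but program license is {license_str!r}"
            )
    return True
```

In this model a program declaring "Proprietary" can still call `bpf_map_lookup_elem`, but is rejected as soon as it calls `bpf_trace_printk`.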

drzaeus77 · on Dec 16, 2017

The first BPF programs (original tcpdump use case) were always proprietary-compatible. If you think about it, there is nothing Linux-specific about networking, and any code the user writes to tweak the data that is traversing their machine should be theirs to write, and doesn't taint the kernel. The same can be said for iptables/nft rules (you don't have to GPL your firewall configuration). Therefore, the BPF hooks that deal with packets (XDP, tc, socket) are not GPL-restricted and will remain so in the future.

Where GPL comes in is when you use BPF to introspect the kernel. It is quite possible to use BPF programs with the kprobe hook to extract data about how the kernel itself works, and the opinion of the kernel developers on this is that such use cases are within the realm of GPL protection.

hacknat · on Dec 16, 2017

Interesting. @icebraining, do you know where this list exists?

drzaeus77 · on Dec 16, 2017

The helpers are in http://elixir.free-electrons.com/linux/v4.15-rc3/source/kern.... See the field gpl_only. For the list of which helpers are available in which hooks, the code needs to be read, to find things like: http://elixir.free-electrons.com/linux/v4.15-rc3/source/net/....

Unfortunately, I don't believe a high-level list of which hooks have GPL-only helpers is published, so reading the code is the best method currently.

qeole · on Dec 18, 2017

This list is in progress. I hadn't thought about adding licensing information for the helpers, but that's an excellent idea!

hacknat · on Dec 16, 2017

Firmware can be closed source, but I’m fairly certain you have to open source the drivers too.

monocasa · on Dec 16, 2017

It's a grey area. For instance, Nvidia's probably safe with their closed source driver. Since their driver was originally for Windows and simply ported to Linux mainly via an abstraction layer, it'd be pretty difficult to prove to a judge that they're 'derived from' the Linux codebase.

bonzini · on Dec 16, 2017

It's not "being derived from", it's "being a derivative of", which in turn has a legal meaning that need not exactly coincide with the usual English meaning.

That said, lawyers have certainly given their approval to the way Nvidia packages their driver, so it should be fine or at least very very hard to challenge.

horst_feistel · on Dec 16, 2017

In https://progmp.net/froemmgen-middleware2017.pdf, the authors compile a domain-specific language to eBPF from inside the kernel for a research prototype.

convolvatron · on Dec 16, 2017

I've been working a little on NFSv4 lately, and wondering why they didn't do something like this instead of the relatively limited COMPOUND. Atomic append and server-side file copy should be doable without very much language.

ENOTTY · on Dec 16, 2017

Could someone knowledgeable discuss the loop thing? Is it just checking for loop termination? Or does it forbid loops entirely? If it checks for loop termination, does that imply a counter value that is statically known?

monocasa · on Dec 16, 2017

AFAIK, it forbids true loops entirely. It verifies that the control flow graph is a DAG. You can always unroll.

One of the things I'm playing around with is a higher-level loop construct stolen from graphics shaders. You should still be able to guarantee termination and worst-case execution time if done right.
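
The "control flow graph is a DAG" check mentioned in this comment can be pictured as a search for back-edges. The sketch below is a toy model in Python, not the kernel verifier: the function name `has_back_edge` and the graph encoding (a dict mapping each basic block to its successor blocks) are assumptions made for illustration.

```python
# Toy model: reject a program whose control flow graph contains a loop.

def has_back_edge(cfg, entry=0):
    """Return True if a cycle is reachable from `entry`.

    `cfg` maps each node to a list of successor nodes; every successor
    must also appear as a key. Uses a depth-first search with the usual
    three colors: a GREY node is still on the current DFS path, so an
    edge into a GREY node is a back-edge, i.e. a loop.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in cfg}

    def visit(node):
        color[node] = GREY
        for succ in cfg[node]:
            if color[succ] == GREY:
                return True  # jump back into the current path
            if color[succ] == WHITE and visit(succ):
                return True
        color[node] = BLACK  # fully explored, no loop through here
        return False

    return visit(entry)


# A diamond (if/else that rejoins) is fine; a backward jump is not.
diamond = {0: [1, 2], 1: [3], 2: [3], 3: []}
loop = {0: [1], 1: [2], 2: [1, 3], 3: []}
```

Here `has_back_edge(diamond)` is False and `has_back_edge(loop)` is True. Note that two paths rejoining at the same block (the diamond) is not a loop, which is why the check tracks the current path rather than just "already visited".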

drzaeus77 · on Dec 16, 2017

That's right, although work is being done to improve the situation. The first is the ability to have function calls: https://patchwork.ozlabs.org/cover/848824/. This won't allow loops, but will allow for less inline boilerplate. Building on this base infra, some folks are working to add a loop counter based verification rather than forcing the user to unroll manually.
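
Manual unrolling, as mentioned in this comment, just means replacing a loop with a known trip count by that many copies of its body, so the result is straight-line code with no backward jump. A minimal sketch in Python, where the function name `unroll` and the textual pseudo-instruction format are hypothetical and chosen only for illustration:

```python
# Toy model: expand a constant-trip-count loop into straight-line code.

def unroll(body, trip_count):
    """Return `trip_count` copies of `body`, one per iteration.

    `body` is a list of pseudo-instruction strings; the placeholder
    {i} in each string is replaced by the iteration index.
    """
    out = []
    for i in range(trip_count):
        out.extend(line.format(i=i) for line in body)
    return out


# Summing four packet bytes without any loop in the emitted code.
straight_line = unroll(["r0 += pkt[{i}]"], 4)
```

This only works when the trip count is a constant known before the program is loaded, which is why a counter-based verification would be less restrictive.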

UncleEntity · on Dec 16, 2017

If you can call arbitrary bpf functions I would imagine it would be trivial to allow loops through recursion, perhaps borrowing the 'gas' concept from Ethereum to avoid infinite recursion and/or using too much time for function application?

drzaeus77 · on Dec 16, 2017

The function call infra being added doesn't change the current DAG restrictions, meaning that the functions are verified using a whole-program analysis semantic, so it may end up working more as just a code organization tool than something fundamental. There is already an ability to tail call to other bpf functions, which does more of what you're thinking, and the bpf runtime enforces a limit of 32 of those tail calls.
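
The tail-call limit described in this comment can be pictured as a counter kept by the runtime. The sketch below is a toy model in Python, not the kernel implementation: `run`, the tuple-based return convention, and the `fallback` value are all hypothetical. The limit of 32 is the figure quoted in the comment.

```python
# Toy model of a runtime that caps chained tail calls.

MAX_TAIL_CALLS = 32  # limit quoted in the comment above


def run(programs, start, ctx):
    """Run `start`, following tail calls until exit or the limit.

    Each program is a function taking `ctx` and returning either
    ("exit", value) or ("tail_call", target_name, fallback), where
    `fallback` models what the caller returns if the tail call is
    refused. Returns (value, number_of_tail_calls_taken).
    """
    tail_calls = 0
    current = start
    while True:
        result = programs[current](ctx)
        if result[0] == "exit":
            return result[1], tail_calls
        _, target, fallback = result
        # Refuse the jump once the budget is spent or the target is missing.
        if tail_calls >= MAX_TAIL_CALLS or target not in programs:
            return fallback, tail_calls
        tail_calls += 1
        current = target


def spin(ctx):
    """A program that tries to tail call itself forever."""
    ctx["runs"] += 1
    return ("tail_call", "spin", -1)
```

With `ctx = {"runs": 0}`, `run({"spin": spin}, "spin", ctx)` returns `(-1, 32)` and leaves `ctx["runs"]` at 33: the initial run plus 32 permitted tail calls. The counter plays a role similar to the 'gas' idea raised earlier in the thread, in that it bounds total work at run time rather than proving termination up front.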

monocasa · on Dec 16, 2017

The entire program is still verified as a single unit, so this still shouldn't allow arbitrary code graphs.

UncleEntity · on Dec 16, 2017

Sweet, I was looking for this but forgot its name. Want to use it as the backend of my way over-engineered Lispkit instead of a traditional SECD machine because, why not?

Just need to find the motivation to sort out my lemon grammar...