
Marshal BTF from btf.Type #616

Closed · ti-mo opened this issue Mar 31, 2022 · 1 comment
Labels: enhancement (New feature or request)

@ti-mo (Collaborator)

ti-mo commented Mar 31, 2022

Over the past few months, while working on per-instruction metadata, we've identified the need for marshaling valid BTF blobs based on btf.Types.

A few use cases:

  • caller-provided btf.Types for map keys/values, for pretty-printing dumps and creating valid StructOps maps
  • caller-provided FuncInfo (e.g. Instruction.WithFunc(f *btf.Func)) for manually assembling (and modifying) programs with bpf-to-bpf calls
  • caller-provided source information (e.g. Instruction.WithSource(s fmt.Stringer)), for adding comments or other info on instructions
  • .. probably a few things I haven't thought of

Optimizations/cleanups:

  • BTF blobs accompanying maps can consist of as few as two types (key and value, in the best case), removing the need for the BTF handle cache.
  • Loading the complete BTF blob from larger ELFs is no longer required. For programs, the blob only needs to contain the types and strings referenced by FuncInfos and the strings referenced by LineInfos.

This should also make fixing #344 possible.

@ti-mo ti-mo added bug Something isn't working enhancement New feature or request and removed bug Something isn't working labels Mar 31, 2022
@lmb lmb self-assigned this Apr 23, 2022
lmb added a commit to lmb/ebpf that referenced this issue Apr 25, 2022
Allow building BTF wire format from scratch. We can compare a full vmlinux build to
just re-encoding raw type data as we currently do in Spec.marshal:

    name                    time/op
    BuildVmlinux/builder-4  81.7ms ± 1%
    BuildVmlinux/native-4   23.5ms ± 2%

    name                    alloc/op
    BuildVmlinux/builder-4  46.5MB ± 0%
    BuildVmlinux/native-4   20.1MB ± 0%

    name                    allocs/op
    BuildVmlinux/builder-4    700k ± 0%
    BuildVmlinux/native-4     394k ± 0%

A full rebuild is about 4x slower and uses 2x the memory. We can't use the builder
to replace Spec.marshal yet, since we don't have a good way to include LineInfos
in the generated BTF. This will become easier once LineInfos are stored in
Instruction metadata.

Updates cilium#616
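The time/op and alloc/op figures above are standard Go benchmark output. For reference, such numbers can be produced outside `go test` with the stdlib `testing.Benchmark` helper; in this sketch `reencode` is a hypothetical stand-in workload, not the real vmlinux marshaling code:

```go
package main

import (
	"fmt"
	"testing"
)

// reencode is a stand-in for the marshaling path under test; the real
// benchmarks (BuildVmlinux/builder vs /native) marshal a full vmlinux spec.
func reencode(n int) []byte {
	buf := make([]byte, 0, n)
	for i := 0; i < n; i++ {
		buf = append(buf, byte(i))
	}
	return buf
}

func main() {
	// testing.Benchmark picks b.N automatically, like `go test -bench`.
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			reencode(4096)
		}
	})
	fmt.Printf("%s\t%s\n", res.String(), res.MemString())
}
```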
lmb added a commit to lmb/ebpf that referenced this issue Apr 27, 2022
Allow building BTF wire format from scratch. We can compare a full vmlinux build to
just re-encoding raw type data as we currently do in Spec.marshal:

    name                    time/op
    BuildVmlinux/builder-4  77.9ms ± 2%
    BuildVmlinux/native-4   23.3ms ± 2%

    name                    alloc/op
    BuildVmlinux/builder-4  33.7MB ± 0%
    BuildVmlinux/native-4   20.1MB ± 0%

    name                    allocs/op
    BuildVmlinux/builder-4    473k ± 0%
    BuildVmlinux/native-4     394k ± 0%

A full rebuild is about 3x slower and uses 1.6x the memory. We can't use the builder
to replace Spec.marshal yet, since we don't have a good way to include LineInfos
in the generated BTF. This will become easier once LineInfos are stored in
Instruction metadata.

Updates cilium#616
lmb added a commit to lmb/ebpf that referenced this issue Sep 8, 2022
Allow building BTF wire format from scratch. We can compare a full vmlinux build to
just re-encoding raw type data as we currently do in Spec.marshal:

    name                    time/op
    BuildVmlinux/builder-4  71.4ms ± 9%
    BuildVmlinux/native-4   22.2ms ±10%

    name                    alloc/op
    BuildVmlinux/builder-4  29.4MB ± 0%
    BuildVmlinux/native-4   20.8MB ± 0%

    name                    allocs/op
    BuildVmlinux/builder-4    450k ± 0%
    BuildVmlinux/native-4     340k ± 0%

A full rebuild is about 3x slower and uses 1.4x the memory.

Updates cilium#616
lmb added a commit to lmb/ebpf that referenced this issue Sep 9, 2022
Allow building BTF wire format from scratch. The most interesting problem is which order
to output types in. To understand the issue it's best to start with a non-cyclical type:

    const long unsigned int

libbpf encodes the type like this:

    [1] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
    [2] CONST '(anon)' type_id=1

Since CONST needs to refer to the long unsigned int somehow, INT is written out
first and therefore gets ID 1. Afterwards the CONST is written with type_id=1.
This is exactly the result we get from doing a postorder traversal on the CONST.

Things get more interesting with cyclical types:

    struct list_head {
        struct list_head *next, *prev;
    }

libbpf encodes the type like this:

    [86] STRUCT 'list_head' size=16 vlen=2
            'next' type_id=88 bits_offset=0
            'prev' type_id=88 bits_offset=64
    ...
    [88] PTR '(anon)' type_id=86

Here STRUCT actually refers to type_id 88, even though that type has not been written
out yet. Even if we turn things around and encode PTR before STRUCT we'd still have
the same problem.

As a consequence, we need a way to allocate IDs for types that haven't been marshalled
yet and we need to ensure that we encode types according to their allocated ID.
The solution is the pending queue: when allocating a new ID for a type we also push
that type to the queue. We then always drain the queue before continuing our traversal.

Some benchmark data that compares marshaling Types to just re-encoding
raw type data as we currently do in Spec.marshal:

    name                  time/op
    BuildVmlinux-4        75.7ms ± 8%
    BuildVmlinuxLegacy-4  26.5ms ± 8%

    name                  alloc/op
    BuildVmlinux-4        28.6MB ± 0%
    BuildVmlinuxLegacy-4  20.8MB ± 0%

    name                  allocs/op
    BuildVmlinux-4          463k ± 0%
    BuildVmlinuxLegacy-4    340k ± 0%

A full rebuild is about 3x slower and uses 1.4x the memory.

Updates cilium#616
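The traversal described in the commit message above can be condensed into a self-contained sketch. The toy `Type`, `encoder`, and `visit` below are hypothetical stand-ins for the real implementation (which interleaves marshaling with draining the pending queue); the sketch keeps the key invariants: IDs are handed out in allocation order, acyclic children are numbered before their parents (INT before CONST), and a back-edge to a type still being visited allocates that type's ID early.

```go
package main

import "fmt"

// Toy stand-ins for btf.Type; just enough structure to show ID allocation.
type Type interface{ name() string }

type Int struct{ Name string }
type Pointer struct{ Target Type }
type Const struct{ Type Type }
type Struct struct {
	Name    string
	Members []Type
}

func (t *Int) name() string     { return t.Name }
func (t *Pointer) name() string { return "*" + t.Target.name() }
func (t *Const) name() string   { return "const " + t.Type.name() }
func (t *Struct) name() string  { return t.Name }

type encoder struct {
	ids      map[Type]uint32
	queue    []Type // types in the order their IDs were allocated
	visiting map[Type]bool
	done     map[Type]bool
}

func newEncoder() *encoder {
	return &encoder{ids: map[Type]uint32{}, visiting: map[Type]bool{}, done: map[Type]bool{}}
}

// alloc hands out the next ID and records emission order alongside it.
func (e *encoder) alloc(t Type) {
	if _, ok := e.ids[t]; ok {
		return
	}
	e.ids[t] = uint32(len(e.ids) + 1) // ID 0 is void
	e.queue = append(e.queue, t)
}

// visit walks t in postorder, so children normally get IDs first.
func (e *encoder) visit(t Type) {
	if e.done[t] {
		return
	}
	if e.visiting[t] {
		e.alloc(t) // cycle: hand out the ID before the type is finished
		return
	}
	e.visiting[t] = true
	for _, c := range children(t) {
		e.visit(c)
	}
	delete(e.visiting, t)
	e.done[t] = true
	e.alloc(t)
}

func children(t Type) []Type {
	switch v := t.(type) {
	case *Pointer:
		return []Type{v.Target}
	case *Const:
		return []Type{v.Type}
	case *Struct:
		return v.Members
	}
	return nil
}

func main() {
	// struct list_head { struct list_head *next, *prev; }
	lh := &Struct{Name: "list_head"}
	ptr := &Pointer{Target: lh}
	lh.Members = []Type{ptr, ptr}

	e := newEncoder()
	e.visit(lh)
	// The struct is emitted before the pointer, mirroring how
	// [86] STRUCT precedes [88] PTR in the libbpf output above.
	for _, t := range e.queue {
		fmt.Printf("[%d] %s\n", e.ids[t], t.name())
	}
}
```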
ti-mo pushed a commit that referenced this issue Oct 26, 2022
@ti-mo (Collaborator, Author)

ti-mo commented Nov 4, 2022

This was shipped in #796

@ti-mo ti-mo closed this as completed Nov 4, 2022