Better living through software

Ben Hutchings's diary of life and technology

Email: ben@decadent.org.uk • Twitter: @benhutchingsuk • Debian: benh • Gitweb: git.decadent.org.uk • Github: github.com/bwhacks

Tue, 03 Dec 2019

Debian LTS work, November 2019

I was assigned 24.5 hours of work by Freexian's Debian LTS initiative and carried over 0.5 hours from October. I worked 21.25 hours this month, so will carry over 3.75 hours to December.

I released Linux 3.16.76, rebased the Debian package onto that, and sent out a request for testing.

I backported the mitigation for TSX Asynchronous Abort (CVE-2019-11135) and reporting of iTLB multihit (CVE-2018-12207) to 3.16 (this work started in October). I applied these and a GPU security fix, uploaded the Debian package and issued DLA-1989-1.

I backported the latest security update for Linux 4.9 from stretch to jessie and issued DLA-1990-1 for that.

I prepared and, after review, released Linux 3.16.77 and 3.16.78. I rebased the Debian package onto 3.16.78 and sent out a request for testing.

posted at: 18:16 | path: / | permanent link to this entry

Sun, 03 Nov 2019

Debian LTS work, October 2019

I was assigned 22.75 hours of work by Freexian's Debian LTS initiative. I worked almost all those hours this month, but will carry over 0.5 hours to November.

I prepared and, after review, released Linux 3.16.75, including various important fixes. I then rebased the Debian package onto that, and sent out a request for testing. I prepared and sent out Linux 3.16.76-rc1 for review.

I handled a misdirected request to update the tzdata package, adding that and the related Perl library to the dla-needed.txt file. I responded to a support request regarding Intel microcode updates for security issues. I also spent some time working on security issues that are still under embargo.

posted at: 22:54 | path: / | permanent link to this entry

Fri, 04 Oct 2019

Kernel Recipes 2019, part 2

This conference only has a single track, so I attended almost all the talks. This time I didn't take notes but I've summarised all the talks I attended. This is the second and last part of that; see part 1 if you missed it.

XDP closer integration with network stack

Speaker: Jesper Dangaard Brouer

Details and slides: https://kernel-recipes.org/en/2019/xdp-closer-integration-with-network-stack/

Video: Youtube

The speaker introduced XDP and how it can improve network performance.

The Linux network stack is extremely flexible and configurable, but this comes at some performance cost. The kernel has to generate a lot of metadata about every packet and check many different control hooks while handling it.

The eXpress Data Path (XDP) was introduced a few years ago to provide a standard API for doing some receive packet handling earlier, in a driver or in hardware (where possible). XDP rules can drop unwanted packets, forward them, pass them directly to user-space, or allow them to continue through the network stack as normal.
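
To make the four outcomes concrete, here is a toy model of that early decision. It is my own illustration, not something from the talk: real XDP programs are eBPF programs, normally written in C, whereas this is plain Python, and the policy tables, addresses, ports and the `early_verdict` name are all made up.

```python
from enum import Enum


class Verdict(Enum):
    DROP = "drop"        # discard the packet immediately
    FORWARD = "forward"  # send it straight out again
    TO_USER = "to_user"  # hand it directly to user-space
    PASS = "pass"        # continue through the normal network stack


# Hypothetical policy tables, purely for illustration.
BLOCKED_SOURCES = {"203.0.113.7"}
FORWARDED_PORTS = {4789}
USER_SPACE_PORTS = {9000}


def early_verdict(packet: dict) -> Verdict:
    """Decide what to do with a received packet before the full stack sees it."""
    if packet["src"] in BLOCKED_SOURCES:
        return Verdict.DROP
    if packet["dst_port"] in FORWARDED_PORTS:
        return Verdict.FORWARD
    if packet["dst_port"] in USER_SPACE_PORTS:
        return Verdict.TO_USER
    return Verdict.PASS
```

The point of the sketch is only that the decision needs very little information about the packet, which is why it can be made before the stack has built its per-packet metadata.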

He went on to talk about how recent and proposed future extensions to XDP allow re-using parts of the standard network stack selectively.

This talk was supposed to be aimed at kernel developers in general, but I don't think it would be understandable without some prior knowledge of the Linux network stack.

Faster IO through io_uring

Speaker: Jens Axboe

Details and slides: https://kernel-recipes.org/en/2019/talks/faster-io-through-io_uring/

Video: Youtube. (This is part way through the talk, but the earlier part is missing audio.)

The normal APIs for file I/O, such as read() and write(), are blocking, i.e. they make the calling thread sleep until I/O is complete. There is a separate kernel API and library for asynchronous I/O (AIO), but it is very restricted; in particular it only supports direct (uncached) I/O. It also requires two system calls per operation, whereas blocking I/O only requires one.

Recently the io_uring API was introduced as an entirely new API for asynchronous I/O. It uses ring buffers, similar to hardware DMA rings, to communicate operations and completion status between user-space and the kernel, which is far more efficient. It also removes most of the restrictions of the current AIO API.
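
The ring-buffer idea can be sketched in a few lines. This is a simplified Python model of my own, not the real io_uring interface: the `Ring` class, the `toy_kernel` function and the request tuples are assumptions for illustration, and the real rings live in memory shared between the kernel and the process.

```python
class Ring:
    """Fixed-size single-producer, single-consumer ring buffer."""

    def __init__(self, size: int):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.entries = [None] * size
        self.mask = size - 1
        self.head = 0  # advanced only by the consumer
        self.tail = 0  # advanced only by the producer

    def push(self, item) -> bool:
        if self.tail - self.head == len(self.entries):
            return False  # ring is full
        self.entries[self.tail & self.mask] = item
        self.tail += 1
        return True

    def pop(self):
        if self.head == self.tail:
            return None  # ring is empty
        item = self.entries[self.head & self.mask]
        self.head += 1
        return item


def toy_kernel(submissions: Ring, completions: Ring, files: dict) -> None:
    """Drain the submission ring and post one completion per request."""
    while (req := submissions.pop()) is not None:
        tag, name, offset, length = req
        completions.push((tag, files[name][offset:offset + length]))


# Queue two reads, let the "kernel" process them, then collect the results.
sq, cq = Ring(4), Ring(4)
files = {"a.txt": b"hello world"}
sq.push((1, "a.txt", 0, 5))
sq.push((2, "a.txt", 6, 5))
toy_kernel(sq, cq, files)
results = [cq.pop(), cq.pop()]
```

Several requests are queued before the consumer runs at all, and each side only ever advances its own index. That batching is what lets one transition into the kernel cover many operations.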

The speaker went into the details of this API and showed performance comparisons.

The Next Steps toward Software Freedom for Linux

Speaker: Bradley Kuhn

Details: https://kernel-recipes.org/en/2019/talks/the-next-steps-toward-software-freedom-for-linux/

Slides: http://ebb.org/bkuhn/talks/Kernel-Recipes-2019/kernel-recipes.html

Video: Youtube

The speaker talked about the importance of the GNU GPL to the development of Linux, in particular the ability of individual developers to get complete source code and to modify it to their local needs.

He described how, for a large proportion of devices running Linux, the complete source for the kernel is not made available, even though this is required by the GPL. So there is a need for GPL enforcement—demanding full sources from distributors of Linux and other works covered by GPL, and if necessary suing to obtain them. This is one of the activities of his employer, Software Freedom Conservancy, and has been carried out by others, particularly Harald Welte.

In one notable case, the Linksys WRT54G, the release of source after a lawsuit led to the creation of the OpenWRT project. This is still going many years later and supports a wide range of networking devices. He proposed that the Conservancy's enforcement activity should, in the short term, concentrate on a particular class of device where there would likely be interest in creating a similar project.

Suricata and XDP

Speaker: Eric Leblond

Details and slides: https://kernel-recipes.org/en/2019/talks/suricata-and-xdp/

Video: Youtube

The speaker described briefly how an Intrusion Detection System (IDS) interfaces to a network, and why it's important to be able to receive and inspect all relevant packets.

He then described how the Suricata IDS uses eXpress Data Path (XDP, explained in an earlier talk) to filter and direct packets, improving its ability to handle very high packet rates.

CVEs are dead, long live the CVE!

Speaker: Greg Kroah-Hartman

Details and slides: https://kernel-recipes.org/en/2019/talks/cves-are-dead-long-live-the-cve/

Video: Youtube

Common Vulnerabilities and Exposures Identifiers (CVE IDs) are a standard, compact way to refer to specific software and hardware security flaws.

The speaker explained problems with the way CVE IDs are currently assigned and described, including assignments for bugs that don't impact security, lack of assignment for many bugs that do, incorrect severity scores, and missing information about the changes required to fix the issue. (My work on CIP's kernel CVE tracker addresses some of these problems.)

The average time between assignment of a CVE ID and a fix being published is apparently negative for the kernel, because most such IDs are being assigned retrospectively.

He proposed to replace CVE IDs with "change IDs" (i.e. abbreviated git commit hashes) identifying bug fixes.

Driving the industry toward upstream first

Speaker: Enric Balletbo i Serra

Details and slides: https://kernel-recipes.org/en/2019/talks/driving-the-industry-toward-upstream-first/

Video: Youtube

The speaker talked about how the Chrome OS developers have tried to reduce the difference between the kernels running on Chromebooks and the upstream kernel versions they are based on. This has succeeded to the point that it is possible to run a current mainline kernel on at least some Chromebooks (which he demonstrated).

Formal modeling made easy

Speaker: Daniel Bristot de Oliveira

Details and slides: https://kernel-recipes.org/en/2019/talks/formal-modeling-made-easy/

Video: Youtube

The speaker explained how formal modelling of (parts of) the kernel could be valuable. A formal model will describe how some part of the kernel works, in a way that can be analysed and proven to have certain properties. It is also necessary to verify that the model actually matches the kernel's implementation.

He explained the methodology he used for modelling the real-time scheduler provided by the PREEMPT_RT patch set. The model used a number of finite state machines (automata), with conditions on state transitions that could refer to other state machines. He added (I think) tracepoints for all state transitions in the actual code and a kernel module that verified that at each such transition the model's conditions were met.
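
As I understood it, the checking side works roughly like the following sketch. The two automata, the guard and all the names here are invented for illustration and are not taken from his model; the real checker runs inside the kernel rather than replaying a list of events.

```python
# Hypothetical automata: (current state, event) -> next state.
MACHINES = {
    "irq": {
        ("enabled", "irq_disable"): "disabled",
        ("disabled", "irq_enable"): "enabled",
    },
    "task": {
        ("running", "sleep"): "sleeping",
        ("sleeping", "wake"): "running",
    },
}

# Hypothetical cross-machine condition: event -> (other machine, required state).
GUARDS = {"sleep": ("irq", "enabled")}


class ModelViolation(Exception):
    pass


def check_trace(events):
    """Replay traced (machine, event) pairs; raise on any illegal transition."""
    states = {"irq": "enabled", "task": "running"}
    for index, (machine, event) in enumerate(events):
        guard = GUARDS.get(event)
        if guard is not None and states[guard[0]] != guard[1]:
            raise ModelViolation(
                f"event {index}: {event!r} requires {guard[0]} to be {guard[1]!r}"
            )
        key = (states[machine], event)
        if key not in MACHINES[machine]:
            raise ModelViolation(
                f"event {index}: {event!r} not allowed in state {states[machine]!r}"
            )
        states[machine] = MACHINES[machine][key]
    return states
```

A trace that tries to sleep while the other automaton is in its "disabled" state is rejected, which is the kind of mismatch between model and code that would then need investigating.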

In the process of this he found a number of bugs in the scheduler.

Kernel documentation: past, present, and future

Speaker: Jonathan Corbet

Details and slides: https://kernel-recipes.org/en/2019/kernel-documentation-past-present-and-future/

Video: Youtube

The speaker is the maintainer of the Linux kernel's in-tree documentation. He spoke about how the documentation has been reorganised and reformatted in the past few years, and what work is still to be done.

GNU poke, an extensible editor for structured binary data

Speaker: Jose E Marchesi

Details and slides: https://kernel-recipes.org/en/2019/talks/gnu-poke-an-extensible-editor-for-structured-binary-data/

Video: Youtube

The speaker introduced and demonstrated his project, the "poke" binary editor, which he thinks is approaching a first release. It has a fairly powerful and expressive language which is used for both interactive commands and scripts. Type definitions are somewhat C-like, but poke adds constraints, offset/size types with units, and types of arbitrary bit width.

The expected usage seems to be that you write a script ("pickle") that defines the structure of a binary file format, use poke interactively or through another script to map the structures onto a specific file, and then read or edit specific fields in the file.
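
For comparison, the same define, map and edit workflow looks something like this in Python using the standard `struct` module. This is not poke syntax, and the file format (a "DEMO" magic, a 16-bit version and a 32-bit length) is entirely made up.

```python
import struct

# Hypothetical format: 4-byte magic, 16-bit version, 32-bit payload length,
# little-endian with no padding.
HEADER = struct.Struct("<4sHI")


def read_header(blob) -> dict:
    """Map the header structure onto the start of the data and check it."""
    magic, version, length = HEADER.unpack_from(blob, 0)
    if magic != b"DEMO":  # a constraint on the field value
        raise ValueError("bad magic")
    return {"magic": magic, "version": version, "length": length}


def set_version(blob: bytearray, version: int) -> None:
    """Edit one field in place, leaving the rest of the data untouched."""
    struct.pack_into("<H", blob, 4, version)  # version starts after the magic


image = bytearray(HEADER.pack(b"DEMO", 1, 3) + b"abc")
set_version(image, 2)
header = read_header(image)
```

Here the field offsets and the constraint are written out by hand, which is the sort of thing poke's type definitions are meant to express directly.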

posted at: 14:30 | path: / | permanent link to this entry

Tue, 01 Oct 2019

Debian LTS work, September 2019

I was assigned 20 hours of work by Freexian's Debian LTS initiative and worked all those hours this month.

I prepared and, after review, released Linux 3.16.74, including various security and other fixes. I then rebased the Debian package onto that. I uploaded that with a small number of other fixes and issued DLA-1930-1.

I backported the latest security update for Linux 4.9 from stretch to jessie and issued DLA-1940-1 for that.

posted at: 15:00 | path: / | permanent link to this entry

Mon, 30 Sep 2019

Kernel Recipes 2019, part 1

This conference only has a single track, so I attended almost all the talks. This time I didn't take notes but I've summarised all the talks I attended.

Updated: Noted slides are available for all talks. Added links to the video streams.

ftrace: Where modifying a running kernel all started

Speaker: Steven Rostedt

Details and slides: https://kernel-recipes.org/en/2019/talks/ftrace-where-modifying-a-running-kernel-all-started/

Video: Youtube

This talk explains how the kernel's function tracing mechanism (ftrace) works, and describes some of its development history.

It was quite interesting, but you probably don't need to know this stuff unless you're touching the ftrace implementation.

Analyzing changes to the binary interface exposed by the Kernel to its modules

Speakers: Dodji Seketeli, Jessica Yu, Matthias Männich

Details and slides: https://kernel-recipes.org/en/2019/talks/analyzing-changes-to-the-binary-interface-exposed-by-the-kernel-to-its-modules/

Video: Youtube

The upstream kernel does not have a stable ABI (or API) for use by modules, but OS distributors often want to support the use of out-of-tree modules by ensuring that at least some subset of the kernel ABI remains stable within a given OS release.

Currently the kernel build process generates a "version" or "CRC" for each exported symbol by parsing the relevant type definitions. There is a load-time ABI check based on comparing these, and distributors can compare them at build time to detect ABI breaks. However this doesn't work that well and it's hard to work out what caused a change.
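
A much simplified model of the checksum approach shows both how it works and why it is hard to diagnose. The kernel's own tooling works on expanded type definitions; this sketch just checksums declaration strings, and the symbol names and declarations are invented.

```python
import zlib


def symbol_versions(exports: dict) -> dict:
    """Map each exported symbol to a CRC32 of its whitespace-normalised declaration."""
    return {
        name: zlib.crc32(" ".join(decl.split()).encode())
        for name, decl in exports.items()
    }


def abi_breaks(old: dict, new: dict) -> list:
    """List symbols that were removed or whose checksum changed."""
    old_v, new_v = symbol_versions(old), symbol_versions(new)
    return sorted(name for name in old_v if new_v.get(name) != old_v[name])


old_abi = {
    "frob": "int frob(struct widget *w)",
    "twiddle": "void twiddle(int n)",
}
new_abi = {
    "frob": "int frob(struct widget *w, int flags)",
    "twiddle": "void  twiddle(int n)",  # only the spacing differs
}
broken = abi_breaks(old_abi, new_abi)
```

The comparison can say that `frob` changed, but a checksum carries no information about which part of the definition was responsible.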

The speaker develops the "libabigail" library and tools. These can extract ABI definitions from standard debug information (DWARF), and then analyse and compare ABIs for different versions of shared libraries, or of the Linux kernel and modules. They are likely to replace the kernel's current symbol versioning approach at some point. He talked about the capabilities of libabigail, plans for improving it, and some limitations of C ABI checkers.

BPF at Facebook

Speaker: Alexei Starovoitov

Details and slides: https://kernel-recipes.org/en/2019/talks/bpf-at-facebook/

Video: Youtube

The Berkeley Packet Filter (BPF) is a simple virtual machine implemented by several kernels. It allows user-space to add code that runs in kernel context, without compromising the integrity of the kernel.

In recent years Linux has extended this virtual machine architecture to create eBPF, which is expressive enough to be targeted by general-purpose compilers such as Clang and (in the near future) gcc. eBPF can be used for filtering network packets (the original purpose of BPF), tracing events, and many other purposes.

The speaker talked about practical experiences using eBPF with tracing at Facebook. These mainly involved investigating performance problems. He also talked about the difficulties of doing this on production servers without developer tools installed, and how this is being addressed.

Kernel hacking behind closed doors

Speaker: Thomas Gleixner

Details and slides: https://kernel-recipes.org/en/2019/talks/kernel-hacking-behind-closed-doors/

Video: Youtube

The speaker talked about how kernel developers and hardware vendors have been handling speculative execution vulnerabilities, and the friction between the vendors' preferred process and the usual kernel development processes.

He described the mailing list manager he wrote to support discussion of security issues with a long embargo period, which sends and receives encrypted messages in both S/MIME and PGP/MIME formats (depending on the subscriber).

Finally he talked about the process that has been settled on for handling future issues of this type with minimal legal paperwork.

This was somewhat marred by a lawyer joke and a generally combative attitude to hardware vendors.

What To Do When Your Device Depends on Another One

Speaker: Rafael Wysocki

Details and slides: https://kernel-recipes.org/en/2019/talks/what-to-do-when-your-device-depends-on-another-one/

Video: Youtube

The Linux device model represents all devices as a simple hierarchy. Driver binding and unbinding (probe/remove), and power management operations, are sequenced based on the assumption that a device only depends on its parent in the device model.

On PCs, additional dependencies are often hidden behind abstractions such as ACPI, so that Linux does not need to be aware of them. On most embedded systems, however, such abstractions are usually missing and Linux does need to be aware of additional dependencies.

(A few years ago, the device driver core gained support for an error code from probe (-EPROBE_DEFER) that indicates that some dependency is not yet bound, and causes the device to be re-probed later. But this is an incomplete, stop-gap solution.)

The speaker described the new "device links" API which provides a way to record additional dependencies in the device model. The device driver core will use this information to sequence operations on multiple devices correctly.
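
The effect on ordering can be illustrated with a small dependency graph. This is only a sketch of the idea in Python (3.9 or later, for `graphlib`), not the kernel API: the device names, the `PARENTS` and `LINKS` tables and the `probe_order` function are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical devices: each one's parent in the device hierarchy...
PARENTS = {"soc": None, "i2c0": "soc", "display": "soc", "regulator": "i2c0"}
# ...plus extra supplier links that the hierarchy alone cannot express.
LINKS = {"display": ["regulator"]}


def probe_order(parents: dict, links: dict) -> list:
    """Return an order in which every device comes after all it depends on."""
    graph = {}
    for dev, parent in parents.items():
        deps = set(links.get(dev, []))
        if parent is not None:
            deps.add(parent)
        graph[dev] = deps
    return list(TopologicalSorter(graph).static_order())


order = probe_order(PARENTS, LINKS)
```

With only the hierarchy, nothing forces `regulator` to come before `display`, since they sit in different branches; the extra link does. I would expect operations such as suspend to want the reverse of this order.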

Metrics are money

Speaker: Aurélien Rougemont

Details and slides: https://kernel-recipes.org/en/2019/metrics-are-money/

Video: Youtube

The speaker talked about several instances from his experience where system metrics were used to justify buying or rejecting new hardware. In some cases, these metrics were not accurate or consistent, which could lead to bad decisions. He made a plea for better documentation of metrics reported by the Linux kernel.

No NMI? No Problem! – Implementing Arm64 Pseudo-NMI

Speaker: Julien Thierry

Details and slides: https://kernel-recipes.org/en/2019/talks/no-nmi-no-problem-implementing-arm64-pseudo-nmi/

Video: Youtube

Linux typically uses Non-Maskable Interrupts (NMIs) for Performance Monitoring Unit (PMU) interrupts. NMIs are (almost) never disabled, so this allows interrupt handlers and other code that runs with interrupts disabled to be profiled accurately. On architectures that do not have NMIs, typically Linux can use the highest interrupt priority for this instead, and only mask the lower priorities.

On the Arm architecture, there is no NMI but there are two architectural interrupt priority levels (IRQ and FIQ). However, on 64-bit Arm systems FIQ is typically reserved for system firmware, so Linux only uses IRQ. This results in inaccurate profiling.

The speaker described the implementation of a pseudo-NMI for 64-bit Arm. This is done by leaving IRQs enabled on the CPU and masking them selectively on the Arm generic interrupt controller (GIC), which supports many more priority levels. However this effectively requires GIC v3 or v4 because these operations are prohibitively slow on earlier versions.
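
The masking trick can be modelled as follows. This is a toy of my own rather than anything resembling the real implementation: the priority values and the class and method names are assumptions, though as on the GIC a numerically lower value means a higher priority.

```python
# Hypothetical priority values; a lower number means a higher priority.
PRIO_NMI = 0x20
PRIO_IRQ = 0xA0
UNMASKED = 0xFF


class ToyInterruptController:
    def __init__(self):
        # Only interrupts with a priority value below the mask are delivered.
        self.mask = UNMASKED

    def local_irq_disable(self):
        # Rather than masking at the CPU, raise the bar so that ordinary
        # interrupts are held back but pseudo-NMIs still get through.
        self.mask = PRIO_IRQ

    def local_irq_enable(self):
        self.mask = UNMASKED

    def delivers(self, priority: int) -> bool:
        return priority < self.mask


gic = ToyInterruptController()
gic.local_irq_disable()
irq_blocked = not gic.delivers(PRIO_IRQ)
nmi_delivered = gic.delivers(PRIO_NMI)
```

Because "disabling interrupts" becomes a write to the interrupt controller instead of a CPU flag, the cost of that write matters a great deal, which fits with the restriction to newer GIC versions.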

Marvels of Memory Auto-configuration (SPD)

Speaker: Jean Delvare

Details and slides: https://kernel-recipes.org/en/2019/marvels-of-memory-auto-configuration-also-known-as-spd/

Video: Youtube

The speaker talked about the history of standardised DRAM modules (SIMMs and DIMMs) and how system firmware can detect them and find out their size and timing requirements.

DIMMs expose this information through Serial Presence Detect (SPD) which until recently used standard 256-byte I²C EEPROMs.

For the latest generation of DIMMs (DDR4), the configuration information can be larger than 256 bytes and a new interface was required. Jean described and criticised this interface.

He also talked about the Linux drivers and utilities that can be used to read the SPD EEPROMs.

posted at: 18:28 | path: / | permanent link to this entry