What goes into a Debian package?

Overview

This talk covers the file formats used in Debian and its derivatives for source and binary packages. A binary package contains code and/or data that's ready to install and use. A source package contains the source code and metadata used to build one or more binary packages.

Unpacked source package

We'll look at the 'hello' example package. On a Debian system and many derivatives, this is available by running:

apt source hello

This downloads the source package and unpacks it into a new directory. The directory name will depend on the version, but for me it's hello-2.10.

At the top level of this directory we see all the upstream source. 'hello' is actually a GNU example project, so it has all the clutter that GNU mandates. Ignore that for now; we're interested in the debian subdirectory, which contains the packaging control files.

Mandatory files

These three files are absolutely required by the dpkg development tools.

debian/changelog

This is a human-readable list of changes in each new version, along with machine-readable metadata about it. Entries are in reverse chronological order, with the newest at the top. It's normally installed into every binary package.

There's an emacs mode for this changelog format in the dpkg-dev-el package, and vim also recognises it.

We can extract information from the changelog using the dpkg-parsechangelog command; for example dpkg-parsechangelog -SVersion will output the version in the top entry.
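To make the 'machine-readable metadata' concrete, here is a minimal Python sketch that splits the header line of a changelog entry into its fields. It is only an illustration, not how dpkg-parsechangelog is implemented; the sample line is written by hand, and its distribution and urgency values are assumptions.

```python
import re

# Header line of an entry: "package (version) distribution(s); key=value, ..."
HEADER_RE = re.compile(
    r"^(?P<source>[a-z0-9][a-z0-9+.-]*) "
    r"\((?P<version>[^()\s]+)\) "
    r"(?P<distributions>[^;]+); "
    r"(?P<metadata>.*)$"
)


def parse_changelog_header(line):
    """Split one changelog header line into its fields."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a changelog header: {line!r}")
    fields = match.groupdict()
    fields["distributions"] = fields["distributions"].split()
    # The metadata is a comma-separated list of key=value pairs.
    fields["metadata"] = dict(
        item.strip().split("=", 1) for item in fields["metadata"].split(",")
    )
    return fields


# Hand-written sample header; distribution and urgency are made up.
SAMPLE = "hello (2.10-2) unstable; urgency=low"
print(parse_changelog_header(SAMPLE)["version"])
```

In practice dpkg-parsechangelog is the right tool; the sketch only shows that the top line of each entry carries the package name, version, target distribution and urgency.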

debian/control

This contains most of the other metadata for the source package, and templates for the metadata for the binary packages.

The metadata for binary packages can contain variable references which will be substituted during the build process. For example, ${shlibs:Depends} is replaced by a list of dependencies on shared library packages.
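The substitution step can be pictured with a few lines of Python. This is a sketch of the idea only: the variable value below is hypothetical (the real one is computed during the build), and replacing unknown names with an empty string is a choice made for this example.

```python
import re

# Matches references such as ${shlibs:Depends}.
SUBSTVAR_RE = re.compile(r"\$\{([A-Za-z0-9][A-Za-z0-9:-]*)\}")


def substitute(template, variables):
    """Replace each ${name} with its value; unknown names become empty."""
    return SUBSTVAR_RE.sub(lambda m: variables.get(m.group(1), ""), template)


# Hypothetical value, standing in for what the build would compute.
variables = {"shlibs:Depends": "libc6 (>= 2.14)"}
print(substitute("Depends: ${shlibs:Depends}", variables))
```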

The file format used here is similar to the format of Internet mail headers, and is known as 'deb822' after RFC 822 which specified the Internet mail format. This same general format is used in many different metadata files in Debian packages.
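Since this format turns up so often, it may help to see how little is needed to read it. The following Python sketch handles the basics (paragraphs separated by blank lines, 'Name: value' fields, continuation lines starting with whitespace, and '#' comment lines). The sample text is hypothetical, and the sketch keeps field names exactly as written rather than treating them case-insensitively.

```python
def parse_deb822(text):
    """Parse deb822-style text into a list of paragraphs (dicts)."""
    paragraphs = []
    current = {}
    last_field = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current paragraph.
            if current:
                paragraphs.append(current)
            current, last_field = {}, None
        elif line.startswith("#"):
            continue  # comment line
        elif line[0] in " \t":
            # Continuation of the previous field's value.
            if last_field is None:
                raise ValueError(f"continuation without a field: {line!r}")
            current[last_field] += "\n" + line.strip()
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"missing colon: {line!r}")
            last_field = name
            current[name] = value.strip()
    if current:
        paragraphs.append(current)
    return paragraphs


# Hypothetical control data, loosely modelled on a small package.
SAMPLE = """\
Source: hello
Section: devel

Package: hello
Architecture: any
Depends: ${shlibs:Depends}
Description: example package
 A longer description goes on
 continuation lines.
"""

for paragraph in parse_deb822(SAMPLE):
    print(paragraph)
```

The first paragraph describes the source package and each later paragraph is a template for one binary package, which matches the layout of debian/control described above.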

debian/rules

This is an executable makefile that supports at least some standard targets. 'Executable makefile' means it has the x permission bits set, and its first line is:

#!/usr/bin/make -f

In principle, debian/rules could be implemented in some other scripting language, but Debian policy requires it to be a makefile.

Looking at the makefile, the first rule is something very strange:

%:
        dh $@

This is a pattern rule: the % wildcard matches any target that has no explicit rule of its own. It runs the dh command with the name of the target, which make substitutes for $@.
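As a toy model of that lookup (not of how make works internally), the Python sketch below returns the commands that would run for a given target. The explicit 'clean' rule and its commands are hypothetical, added only to show an explicit rule taking precedence over the catch-all.

```python
def commands_for(target, explicit_rules):
    """Return the commands that would run for a target in this toy model."""
    if target in explicit_rules:
        # An explicit rule takes precedence over the catch-all.
        return explicit_rules[target]
    # The "%" pattern matches anything else; "$@" expands to the target name.
    return [f"dh {target}"]


# Hypothetical rules file with one explicit target besides the catch-all.
explicit_rules = {"clean": ["rm -rf build-tree", "dh clean"]}

for target in ("build", "binary", "clean"):
    print(target, "->", commands_for(target, explicit_rules))
```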

dh is part of the debhelper package, which provides commands to assist in building packages. The vast majority of Debian packages use debhelper, and a large proportion use the newer dh command.

dh is smart enough to recognise that the upstream build system is autotools, so for example it can build using ./configure && make. So instead of writing more or less the same rules as for thousands of other packages, the maintainer only needed to write explicit rules for a few exceptions where dh didn't automatically do the right thing.

debhelper is pretty well documented in manual pages, so it's generally quite easy to write the rules file this way.

As this is an example package, there's also debian/rules-old which uses the older debhelper commands but not dh.

Almost mandatory files

Almost all source packages should also contain these files.

debian/copyright

This contains the copyright statements and licence texts for the package, to be installed into binary packages. Debian specifies a machine-readable format for this file, but it's not mandatory: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/

This can be omitted if the package adds copyright information to its binary packages in another way.

debian/source/format

Debian supports multiple variants of the source package format, and this file specifies which variant we're using. Here it's '3.0 (quilt)' which is the most common variant. The 'quilt' part means that any changes to the upstream source code are stored as a patch series under debian/patches. (There aren't any in this package.) It could also be '3.0 (native)' which means that there is no separation between upstream and Debian parts, or '1.0' which is the original and largely obsolete source format.

This can be omitted by packages using the 1.0 format.
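A tool that wants to know the variant only has to read this one-line file and fall back to '1.0' when it's missing. Here is a minimal Python sketch of that logic; the function names are made up for this example, and the summaries simply restate the descriptions above.

```python
import tempfile
from pathlib import Path


def read_source_format(source_dir):
    """Return the declared source format, defaulting to 1.0 if undeclared."""
    path = Path(source_dir) / "debian" / "source" / "format"
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        return "1.0"


def describe(source_format):
    """Give a one-line summary of the variants mentioned in the text."""
    summaries = {
        "3.0 (quilt)": "upstream source plus a patch series in debian/patches",
        "3.0 (native)": "no separation between upstream and Debian parts",
        "1.0": "original format, largely obsolete",
    }
    return summaries.get(source_format, "variant not covered in this sketch")


# Demonstration in a throwaway directory standing in for a source tree.
with tempfile.TemporaryDirectory() as tmp:
    undeclared = read_source_format(tmp)  # no debian/source/format yet
    fmt_file = Path(tmp) / "debian" / "source" / "format"
    fmt_file.parent.mkdir(parents=True)
    fmt_file.write_text("3.0 (quilt)\n")
    declared = read_source_format(tmp)

print(undeclared, "|", declared, "->", describe(declared))
```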

Optional files

Most source packages will also contain these files.

debian/compat

debhelper is continually being improved, but some of those improvements aren't totally backward compatible so they are opt-in. A source package using debhelper can specify which version of the debhelper API it wants through this file. However, the recommended way to do this is now through a build-dependency on debhelper-compat in the debian/control file; you can read the details in the debhelper(7) manual page.

debian/watch

This defines how to poll for new upstream versions. The implementation is the uscan command, which is part of the 'devscripts' package, and the file format is documented in the uscan(1) manual page.

Others

There are many other files that dpkg, debhelper and other tools look for under the debian subdirectory. These are documented in the respective manual pages.

Packed source package

Unlike source RPMs, Debian source packages are made of multiple files even when packed. Part of the reason for this is to separate upstream code from Debian-specific changes, which some licences require.

For the 'hello' example package, APT downloaded these files:

The first of these is the Debian source control (dsc) file. APT locates and verifies this file through the source package index in the archive. It contains metadata that was generated from the debian/changelog and debian/control files, plus the names, sizes and checksums of all the other files. APT uses this to locate and verify those other files.
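The verification step amounts to comparing each downloaded file with the size and checksum recorded for it. The Python sketch below assumes the layout of a dsc file's Checksums-Sha256 field, where each line holds a digest, a size and a filename. The file name and contents in the demonstration are made up, and this is not APT's actual code.

```python
import hashlib
import os
import tempfile


def parse_checksums(field_value):
    """Parse Checksums-Sha256 lines into {filename: (digest, size)}."""
    entries = {}
    for line in field_value.strip().splitlines():
        digest, size, filename = line.split()
        entries[filename] = (digest, int(size))
    return entries


def verify(path, digest, size):
    """Check one file against its recorded size and SHA-256 digest."""
    if os.path.getsize(path) != size:
        return False
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            sha256.update(chunk)
    return sha256.hexdigest() == digest


# Demonstration with a made-up file standing in for a downloaded tarball.
with tempfile.TemporaryDirectory() as tmp:
    name = "example_1.0.orig.tar.gz"  # hypothetical file name
    data = b"not really a tarball"
    path = os.path.join(tmp, name)
    with open(path, "wb") as handle:
        handle.write(data)
    field = f" {hashlib.sha256(data).hexdigest()} {len(data)} {name}\n"
    digest, size = parse_checksums(field)[name]
    intact = verify(path, digest, size)
    with open(path, "ab") as handle:
        handle.write(b"!")  # simulate corruption
    corrupted = verify(path, digest, size)

print(intact, corrupted)
```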

We can unpack this by running dpkg-source -x hello_2.10-2.dsc, but APT already did that for us.

dpkg-source -b hello-2.10 will do (roughly) the inverse: it will create the dsc and Debian tarball files. It will never create a new upstream tarball; that's typically downloaded or otherwise created by uscan. If anything was built in the source directory, dpkg-source won't remove the build products, so normally a source package is built using the higher-level command dpkg-buildpackage, which cleans the source tree first.

From source to binary

The main command used to build binary (and source) packages and to prepare an upload is dpkg-buildpackage. It has many options for which packages it builds, whether to sign them, whether to include the upstream source in the upload, and so on. It invokes various other commands including dpkg-checkbuilddeps, which checks that build-dependencies are satisfied; dpkg-source; debian/rules; and dpkg-genchanges, which generates a list of files to be uploaded to a package archive server.

Unlike some other build tools, dpkg-buildpackage always builds in the current directory and the current OS installation. To build in a more controlled environment, we would need to use a higher-level tool such as sbuild or pbuilder.

Packed binary packages

You're probably familiar with these. Their filenames follow the convention:

name_version_architecture.deb

so for hello we got:

The format of these files is documented in the deb(5) manual page, but in short it's an 'ar' archive (a format originally designed for static libraries of code). It contains a debian-binary file which specifies the format version, and tarballs for 'control' and 'data'.
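The 'ar' container is simple enough to read by hand. The Python sketch below lists the members of an ar archive using the standard layout: an 8-byte magic string, then a 60-byte header before each member. To stay self-contained it builds a tiny in-memory archive rather than reading a real package; the tarball names, their compression suffix and the placeholder contents are assumptions for the demonstration.

```python
import io

AR_MAGIC = b"!<arch>\n"


def list_ar_members(stream):
    """Yield (name, size) for each member of an ar archive."""
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    while True:
        header = stream.read(60)
        if not header:
            return
        if len(header) < 60 or header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        # Header fields: name(16) mtime(12) uid(6) gid(6) mode(8) size(10).
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        size = int(header[48:58].decode("ascii"))
        yield name, size
        # Member data is padded to an even number of bytes.
        stream.seek(size + size % 2, io.SEEK_CUR)


def make_member(name, data):
    """Build one ar member; timestamp and ownership are zeroed for the demo."""
    header = b"%-16s%-12d%-6d%-6d%-8s%-10d`\n" % (
        name.encode("ascii"), 0, 0, 0, b"100644", len(data))
    return header + data + (b"\n" if len(data) % 2 else b"")


# Tiny stand-in for a .deb; the tarballs are placeholders, not real archives.
sample = AR_MAGIC + b"".join([
    make_member("debian-binary", b"2.0\n"),
    make_member("control.tar.xz", b"placeholder"),
    make_member("data.tar.xz", b"placeholder!"),
])

members = list(list_ar_members(io.BytesIO(sample)))
print(members)
```

Opening a real package in binary mode and passing the file object to list_ar_members should work the same way, though dpkg-deb remains the supported way to inspect packages.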

(The second binary package is an automatically generated debug symbol package, which isn't listed in debian/control. There's some hackery in debhelper and the Debian archive software that allows this.)

Unpacked binary packages

We can unpack (rather than install) a binary package using the ar and tar commands, but it's easier to use dpkg-deb -R. This unpacks the data tarball into the specified directory and the control tarball into the DEBIAN subdirectory.

Looking at DEBIAN/control, we see the metadata for this binary package generated from the debian/control file in the source package. The ${shlibs:Depends} variable has been replaced by a dependency on libc6. This metadata can also be displayed without unpacking the package, using dpkg-deb -I.

The DEBIAN/md5sums file contains checksums for all the files in the data tarball. This can be useful for checking for accidental changes or corruption.

Installed binary packages

We can install a binary package with dpkg -i or a higher-level tool like APT. dpkg checks dependencies and will not complete an installation if they're not met, but it doesn't know how to resolve them. APT knows how to resolve dependencies and to fetch and install packages in the right order.
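To illustrate just the 'right order' part, here is a Python sketch that orders a hypothetical dependency graph so that every package comes after the packages it depends on. The package names are made up, and APT's real resolver does far more than this (versions, alternatives, conflicts and so on). It uses graphlib, which needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each package maps to the packages it depends on.
depends = {
    "app": {"libfoo", "libbar"},
    "libfoo": {"libbar"},
    "libbar": set(),
}

# static_order() yields every package after all of its dependencies.
order = list(TopologicalSorter(depends).static_order())
print(order)
```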

When a package is installed, all the files in the data tarball are extracted into the root of the filesystem. The files in the control tarball are incorporated into dpkg's status files under /var/lib/dpkg. This is private to dpkg, so you shouldn't access it directly.

We can use dpkg to list all the 'data' files for an installed package:

dpkg -L hello

We can check the installed 'data' files against the package's checksums with the debsums command:

debsums hello     # show status of each (non-config) file
debsums -c hello  # only show changed (non-config) files

We can show the metadata for the package - mostly copied from the control file:

dpkg -s hello

We can list and show all the other files from the control tarball:

dpkg-query --control-list hello
dpkg-query --control-show hello md5sums

Documentation