Packaging is hard. Packager-friendly is harder.

Releasing software is no small feat, especially in 2018. You could just upload your source code somewhere (a Git, Subversion, CVS, etc. repo – or tarballs on SourceForge, or whatever), but it matters what that source looks like and how easy it is to consume. What does the required build environment look like? Are there any dependencies on other software, and if so, which versions? What if the versions don’t match exactly?

Most languages feature solutions to the build environment dependency – Ruby has Gems, Perl has CPAN, Java has Maven. You distribute a manifest with your source, detailing the versions of the dependencies which work, and users who download your source can just use those.

Then, however, we have distributions. If openSUSE or Debian wants to include your software, then it’s not just a case of calling into CPAN during the packaging process – distribution builds need to be repeatable, and work offline. And it’s not feasible for packagers to look after 30 versions of every library – generally a distribution will contain 1-3 versions of a given library, and all software in the distribution will be altered one way or another to build against their version of things. It’s a long, slow, arduous process.
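To make the mismatch concrete, here is a minimal sketch of the check a packager effectively performs by hand: compare the exact versions an upstream manifest pins against the handful of versions the distribution actually ships. Every name here (`libfoo`, `libbar`, `libbaz`, `unmet_dependencies`) is hypothetical, and the version parsing assumes plain dotted numbers; real manifests and real distribution metadata are messier.

```python
def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def unmet_dependencies(manifest, distro_versions):
    """Return {name: (pinned, available)} for every pin the distro cannot satisfy exactly.

    manifest:        {name: pinned_version} as declared by upstream (hypothetical format)
    distro_versions: {name: [versions]} as shipped by the distribution (hypothetical format)
    """
    problems = {}
    for name, pinned in manifest.items():
        available = distro_versions.get(name, [])
        if pinned not in available:
            problems[name] = (pinned, sorted(available, key=parse_version))
    return problems


# Illustrative data only: upstream pins three libraries, the distro ships two of them.
manifest = {"libfoo": "2.4.1", "libbar": "1.0.0", "libbaz": "0.9.3"}
distro = {"libfoo": ["2.4.1"], "libbar": ["1.2.0", "2.0.0"]}

for name, (pinned, available) in unmet_dependencies(manifest, distro).items():
    print(f"{name}: upstream wants {pinned}, distro has {available or 'nothing'}")
```

Every entry this reports is a piece of work for somebody: either the distribution packages another version, or the software gets patched to build against what is already there.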

Life gets easier for distribution packagers the more closely released software adheres to their ideal model – no non-source files in the distribution, minimal or well-formed dependencies on third parties, swathes of #ifdefs to handle changes in dependency APIs between versions, etc.
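For a flavour of what those #ifdefs cost, here is a rough Python analogue: a small wrapper that branches on the version of a dependency so the same code works against whichever one the distribution ships. `OldFoo` and `NewFoo` are stand-ins for two releases of an imaginary library whose method was renamed between 1.x and 2.x; nothing here is a real API.

```python
class OldFoo:
    """Stand-in for a hypothetical libfoo 1.x."""
    version = (1, 8)

    def open_file(self, path):
        return f"old:{path}"


class NewFoo:
    """Stand-in for a hypothetical libfoo 2.x, where the method was renamed."""
    version = (2, 1)

    def open(self, path, mode="r"):
        return f"new:{path}:{mode}"


def open_with(lib, path):
    """Call whichever API the installed version provides."""
    if lib.version >= (2, 0):
        return lib.open(path, mode="r")
    # Fallback for 1.x, kept only so distributions shipping the old version still work.
    return lib.open_file(path)


print(open_with(OldFoo(), "song.ogg"))
print(open_with(NewFoo(), "song.ogg"))
```

One branch like this is harmless. Multiply it by every API change in every dependency, and then test every branch, and the cost to upstream becomes clear.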

Problem is, this can actively work against upstream development.

Developers love npm or NuGet because they make dependencies so easy to consume – asking them to abandon those tools is a significant impediment to developer flow. And it doesn’t scale – maybe a friendly upstream can drop one or two dependencies. But 10? 100? If you’re consuming a LOT of packages via the language package manager, as a developer, being told “stop doing that” isn’t just going to slow you down – it’s going to require a monumental engineering effort. And there’s the other side effect – moving from Yarn or pip to a series of separate download/build/install steps will slow down CI significantly – and if your project takes hours to build as-is, slowing it down is not going to improve the project.

Therein lies the rub. When a project has limited developer time allocated to it, spending that time on an effort which will literally make development harder and worse, for the benefit of distribution maintainers, is a hard sell.

So, a concrete example: MonoDevelop. MD in Debian is pretty old. Why isn’t it newer? Well, because the build system moved so far away from the packager ideal that it’s basically impossible at current community & company staffing levels to claw it back. Build-time dependency downloads went from a half dozen in the 5.x era (somewhat easily patched away in distributions) to over 110 today. The underlying build system changed from XBuild (Mono’s reimplementation of Microsoft’s MSBuild, a build system for Visual Studio projects) to real MSBuild (now FOSS, but an enormous shipping container of worms of its own when it comes to distribution-shippable releases, for all the same reasons & worse). It’s significant work for the MonoDevelop team to spend time on ensuring all their project files work on XBuild with Mono’s compiler, in addition to MSBuild with Microsoft’s compiler (and any mix thereof). It’s significant work to strip out the use of NuGet and Paket packages – especially when their primary OS, macOS, doesn’t have “distribution packages” to depend on.

And then there’s the integration testing problem. When a distribution starts messing with your dependencies, all your QA goes out the window – users are getting a combination of literally hundreds of pieces of software which might carry your app’s label, but you have no idea what the end result of that combination is. My usual anecdote here is when Ubuntu shipped Banshee built against a new, not-regression-tested version of SQLite, which caused a huge performance regression in random playback. When a distribution ships a broken version of an app with your name on it – broken by their actions, because you invested significant engineering resources in enabling them to do so – users won’t blame the distribution, they’ll blame you.

Releasing software is hard.

10 Responses to “Packaging is hard. Packager-friendly is harder.”

  1. I’ve been feeling that more and more. In my projects, my ideal dependency management is a Puppet manifest defining which Debian packages to install, but it’s becoming impossible. I need to add Python packages managed by pip (there are too many for me to consider packaging myself) and there are missing packages that are fairly mainstream (e.g. elasticsearch; apache-solr is ancient).

    It’s also a vicious (virtuous?) cycle. The more developers embrace the philosophy of using language-specific package managers, the less likely it is that a critical mass of persons willing to package a resource will exist.

  2. Do Flatpak or snaps make this easier?

    Seems like trying to get everything to use the same version of dependencies is not only unsustainable in terms of workload and scale for the distro maintainers but also dangerous, in that the developers didn’t build or test against that version of the dependency.

  3. My experience is that users are very hostile towards distribution methods which aren’t 100% deeply integrated.

  4. Do Flatpak or snaps solve the problem of having to backport a security fix to multiple vulnerable versions of the libraries in question? That seems to me like the more dangerous problem here. Bundling/vendoring and container-oriented ecosystems are basically promoting a cesspool of vulnerable applications exposed to the Internet. They punt on the hard problems which distributions are there to solve, leaving their users to learn some very hard lessons about prioritizing features over security. The “bleeding edge” is called that for a good reason!

  5. Yes! Snap and Flatpak *do* solve this problem – by making it the responsibility of the vendor *and* making sure that the only data an unresponsive vendor can endanger is the data the user explicitly touches with that application.

    I don’t care if VLC has exploitable codec bugs if the only thing that VLC can access is the file I opened it with. The sandboxes are not *quite* there yet, but they’re getting there.

  6. Of course the spectre of the next meltdown always hangs over such approaches. And data is not even everything. I would rather avoid my VLC being used for mining bitcoin on my electricity bill, or DDoSing someone for that matter.

  7. Suggesting that “the only data an unresponsive vendor can endanger is the data the user explicitly touches with that application” ignores modern vulnerability types such as cross-site scripting (where scripts injected via the vulnerable application are used to coerce the client and take some remote action on a different target system and/or exfiltrate local data not entrusted to the application), reflection attacks (where vulnerable services are leveraged to launch attacks against victims entirely unrelated to that service), and also any vulnerabilities granting local access which can be used in conjunction with escape vulnerabilities in the sandboxing layer.

    In short, unmaintained dependencies in a container are a significant risk. I have doubts that the typical development team for a complex service/application is going to be willing, much less able, to keep track of all the vulnerabilities exposed by the entire dependency chain baked into those container images, snaps, flatpaks, whatever. I hope I’m wrong, but from what I’ve seen so far, the security risks of these newer application distribution models are simply ignored or handwaved away.

  8. Things go especially badly for people who want to use free software and only free software. Distributions, especially Debian, take great care that every package in the distro is free, builds from source, and is buildable in a reproducible way. That ensures that the software is really free, and that I really have the source code and can alter it if I want to.

    As soon as I start to use npm, gem, pip, plugins from mozilla.biz, flatpaks and snaps from any random server, I cannot be sure that the software is completely free, or that it really builds from source code that is actually available.

    It seems that some developers do not care about that aspect. That is their decision, and those are their priorities. It does mean, however, that I will no longer use such software.

  9. And right at this very moment, another software engineering horror story:

    https://bugs.debian.org/890598

    Upstream uses minified JavaScript files without having any sources! Of course, the package will now be removed from Debian.

    Now the users of this software have a choice: either use the upstream .deb packages, which are clearly low quality and come without full sources,

    or don’t use such badly engineered software at all.

  10. […] recent blog post “Packaging is hard. Packager-friendly is harder.” is also […]
