The Story Behind the XZ Backdoor and KDE Unsafe Themes

So, I've been missing for a couple of weeks, and what do we have? Well, apparently, the biggest attack on Linux I've ever seen, and somebody's entire home folder got deleted after downloading a global theme. Complete and utter chaos. Great.

Let's start right away with the global theme thing, since it's closer to KDE and I know more about it. Firstly, how? Just… how could something like that happen?

Well, most themes on the KDE store are actually just themes. Like, the colorscheme is just a file with a lot of colors, and the Plasma Theme is a folder with a lot of SVGs, and so on. However, some themes need to change the behavior of some components too: take the login animation theme, which needs to be able to set a custom animation, or the SDDM theme, which customizes the log-in screen and might also want to move things around a bit. Anything that changes the behavior of components is usually implemented via QML files, and QML is a markup language that uses a JavaScript interpreter under the hood. Which runs code.

Of course, it's not just about themes. Widgets are another example of something that necessarily has to run code. The main difference is that you know widgets can run code, but you don't really know that themes do too.

Finally, we have Global Themes. Most people think of those as a sort-of collection of different specific themes, like a Plasma theme and a colorscheme and a cursor theme, and so on. But really, they can be much more complex than that; KDE Plasma is extremely modular under the hood, and Global Themes are allowed to use that modularity more than you would expect; you shouldn't be surprised if they were even able to replace your entire desktop with some custom component.

So, necessarily, they run code as well, mostly through QML. So, what happened? Well, it seems like some theme on the store had some JavaScript code to handle a temporary config folder; this folder would then be deleted, maybe after installation. The path of this config folder was specified in a file in the theme, and everything was fine.

However, someone else decided to fork that theme and, accidentally, they did not set a new config directory. Thus, the code that handled the config directory said: hey, let's delete the config directory! Hey, according to our file, the config directory is set to nothing, so let's just delete the home folder! Yeah!
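To make the failure mode concrete, here's a minimal sketch of the kind of guard that would have caught it. This is not the theme's actual code (that was JavaScript, and I haven't reproduced it here); the function name `safe_remove_config_dir` and the `allowed_root` parameter are made up for illustration. The idea is simply that an unset value must mean "do nothing", never "delete the parent".

```python
from pathlib import Path
import shutil


def safe_remove_config_dir(config_dir: str, allowed_root: Path) -> bool:
    """Delete config_dir only if it is a real sub-directory of allowed_root.

    Returns True if something was deleted, False if the request was refused.
    """
    # An empty or whitespace-only value is refused outright. Without this
    # check, joining "" onto the root would point at the root itself.
    if not config_dir or not config_dir.strip():
        return False

    root = allowed_root.resolve()
    target = (allowed_root / config_dir).resolve()

    # Refuse the root itself and anything that escapes it (e.g. ".." or an
    # absolute path somewhere else on the disk).
    if target == root or root not in target.parents:
        return False

    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

With a guard like this, a fork that forgets to set the config directory gets a refused deletion instead of an empty home folder.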

It's not the first time that such a mistake has been made, even in reviewed packages from your distribution. Another important incident involved the Steam client, which - again, through a badly guarded rm command - accidentally deleted all of a user's data under certain circumstances.

The faulty Global Theme was removed within hours, and it was a recent theme with very few installs. The KDE Store has been up and running for something like ten years at least, and this is the first time something like this has happened. Please note that you can still absolutely safely download colorschemes, Plasma themes, cursor themes and more, since those do not run code, and that, by simply checking for a few genuine reviews of a Global Theme, you can safely install those too. Still, this should not have happened. A discussion immediately began on how to address this and make sure it never happens again.

Firstly, the warning message. Whenever you enter the store in KDE Plasma, there's a little message at the top that tells you: hey, all of this content comes from users like you, and it wasn't reviewed by your distro, so be careful!

It's pretty clear that this message can work for Plasma Themes, where a bad theme could make your desktop very ugly or even unusable. But for Global Themes, which run arbitrary code, that's not enough. KDE developers have immediately added a new property distinguishing the safe and unsafe store sections, and implemented a new warning message that is much more noticeable and much clearer about the fact that, yes, unreviewed code is being run.

Secondly, there are active efforts (by Marco, I think) to make sure that Global Themes won't be able to run arbitrary code. This would mean losing a lot of cool functionality like SDDM themes and widgets, but you can still install those - extra carefully - by yourself, instead of as part of the theme. This way, Global Themes would become a "safe" category on the store as well. The lockscreen is already out, and that was probably one of the most dangerous components, and more might be removed in the future.

Thirdly, when you click on "apply" after selecting a Global Theme, there's a dialog that actually asks you: hey, which components exactly would you like to apply? And you can choose. We would like to add a clear warning message telling you which components are safe to apply, and which ones currently run arbitrary code. The latter would probably be un-ticked by default, thus giving you the option to select them if you want, but still preserving a safe install out of the box.

Fourth, the KDE store dialog in KDE Plasma currently gives you all the content sorted by most recent. This is cool, because you see different themes and widgets each time you open it, but we should probably display the themes and widgets with the most downloads or the highest rating first instead. This way, the themes you'd be likely to click are also the safest ones. By the way, there's an entire discussion that could be had here about how the store manages the ratings, but we don't have time for that right now.

Finally, there's the long-term big deal: can we actually review, manually, all of the content submitted to the KDE store? At least, the potentially dangerous one? This is a big question, and I would like to apologize, because I initially thought this was completely impossible and even strongly argued against it in Brodie Robertson's video about this. However, other KDE developers have the opinion that this is currently feasible. Let's see what it would take.

Firstly, it would still - I think - be impossible to review all current widgets and themes on the store. However, most of them are Plasma 5 widgets, and what we could do is hide all widgets and themes and ask authors to re-submit them, with Plasma 6 as the target platform, and we could start by looking into the already-ported Plasma 6 items. We should expect something like two to four widgets or themes to be submitted every week, which sounds feasible to manually review. However, this would have to be done in collaboration with Pling, which is the service that actually manages the store, and due to the complexity of the idea (probably somebody would have to get paid to do it and … paid by whom?) it's more of a long-term goal right now.

This is as in-depth as I can go with the time I get with this format; I hope I managed to convince you that some parts of the store are still safe to use, and that we KDE developers are taking this issue very seriously and actively working on steps to make sure it does not happen again.

Finally, let's get to AssBleed. Because, yes, that's the name that… somebody… gave to the backdoor that was introduced in the latest version of the xz compression package. And it's just… wow. I know you already know about it, but let's do a deep dive, because it's worth it.

Given that I'm less of an expert here, the following is mostly inspired by articles from: wired, boehs.org, tukaani.org, rheaeve.substack, and more. All sources will be linked in the video description as soon as the livestream ends.

Let's start at the beginning. "XZ" is a file format designed by developer Lasse Collin et al. between 2005 and 2008. It is now a package used in pretty much any modern distribution to compress and uncompress archives; if you use macOS with Homebrew, you also probably have that package. And yet, as is customary in the Open Source world, its development is severely underfunded and mostly driven by just Lasse Collin who, as we will see, seems to be occasionally overwhelmed by the work necessary to maintain it.

In this context, a developer (or a group of developers) under the name of Jia Tan starts contributing to the project. They had already made a pull request to libarchive, and their late 2021 patches seem genuine. These patches are sent to the xz-devel mailing list, and Collin merges them into the project.

In April of 2022, Jia Tan sends another patch to the mailing list; yet again, this patch is valid and innocuous. After a few days, another user under the name of Jigar Kumar complains that the patch has not yet been merged. They do so in a quite aggressive tone:

Patches spend years on this mailing list. 5.2.0 release was 7 years ago. There is no reason to think anything is coming soon

In May, the patch by Jia still hasn't been merged; again, Kumar says:

Over 1 month and no closer to being merged. Not a suprise.

And, yet again, in June:

Is there any progress on this? Jia I see you have recent commits. Why can't you commit this yourself?

Another developer, under the name of Dennis Ens, sends an email to the same mailing list asking whether Java is supported. It is, but not actively so. Collin apologizes:

It's clear that my resources are too limited (thus the many emails waiting for replies) so something has to change in the long term.

Kumar also sends another email to this new thread, yet again aggressively pressuring Collin:

Progress will not happen until there is new maintainer. XZ for C has sparse commit log too. Dennis you are better off waiting until new maintainer happens or fork yourself. Submitting patches here has no purpose these days. The current maintainer lost interest or doesn't care to maintain anymore. It is sad to see for a repo like this.

We finally get to a key email sent by Collin:

I haven't lost interest but my ability to care has been fairly limited mostly due to longterm mental health issues but also due to some other things. Recently I've worked off-list a bit with Jia Tan on XZ Utils and perhaps he will have a bigger role in the future, we'll see. It's also good to keep in mind that this is an unpaid hobby project.

The plot twist here is that we have reasons to believe that developers Jigar Kumar and Dennis Ens do not actually exist, but rather are aliases for the same group behind Jia Tan. Their email addresses are never seen again anywhere on the internet, not even in data breaches, they all have a similar writing style and - of course - none of them has reached out to anyone for comment after all of this happened.

Most likely, what we're seeing here is an example of a social engineering attack. Collin is already overwhelmed, and the group wants to gain access to the project more quickly, so they start pressuring Lasse as much as possible, even bringing him to admit he has long-term mental health issues.

Anyhow, this works. After further aggressive emails, we get to this statement from Collin:

As I have hinted in earlier emails, Jia Tan may have a bigger role in the project in the future. He has been helping a lot off-list and is practically a co-maintainer already. :-) [...] In any case some change in maintainership is already in progress at least for XZ Utils

And, thus, in January of 2023 Jia Tan is able to merge commits on his own, indicating that he has gained trust from Collin.

Around March, Jia Tan commits the testing infrastructure that will be the key component in the backdoor later on. Interestingly enough, this part of the code does not seem to come from Jia Tan themselves; rather, it was made by a developer called Hans Jansen.

Except, of course, Hans Jansen's email address also shows very little activity apart from this change, and they only appear again to push for the compromised version of xz to be included in Debian. Thus, they are also most likely a fake alias for the same developer group.

In January of 2024, Jia Tan also moved the website of the xz project to a subdomain of tukaani.org, xz.tukaani.org; this has a DNS record that points to GitHub pages, to which Jia Tan has access. Thus, he can now change the website too.

Finally, in February, the attack begins.

Due to this project being, you know, a compress/decompress package, the testing directory often contains raw binary files to test with. According to the README (written long before Jia Tan started contributing to the project), "Many of the files have been created by hand with a hex editor, thus there is no better "source code" than the files themselves". On the 23rd of February, Jia Tan introduces new binary files to use as tests, with names like "good-large_compressed.lzma" and so on. These files, however, contain a backdoor, although it does not do anything on its own.

The following day, he tags and builds a new version of the project, v5.6.0, and he publishes a tar package; however, this package actually contains one extra file that is not found in the source code of the project, build-to-host.m4, which adds the backdoor when building a deb/rpm package. It's not yet completely understood what this backdoor does, but it seems like it would give external users complete control over the affected machine via ssh.
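The "file in the tarball that isn't in the repository" trick can be checked for mechanically. Here's a minimal sketch, assuming you already have the list of tracked files as a set of relative paths (for example from `git ls-files`); the function name `files_only_in_tarball` is made up for illustration. Keep in mind that release tarballs legitimately ship generated files that are not in Git, so a non-empty result is a list of things to look at, not proof of anything.

```python
import tarfile
from pathlib import PurePosixPath


def files_only_in_tarball(tar_path: str, tracked: set[str]) -> list[str]:
    """Return regular files present in the tarball but absent from `tracked`."""
    extras = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Drop the leading "project-x.y.z/" directory that release
            # tarballs usually wrap everything in.
            parts = PurePosixPath(member.name).parts[1:]
            if not parts:
                continue
            relative = "/".join(parts)
            if relative not in tracked:
                extras.append(relative)
    return sorted(extras)
```

Every path it returns then has to be explained by the build system, or by a human.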

Jia Tan starts emailing Richard W.M. Jones, who works at Red Hat, asking for Fedora 40 to include the compromised xz version. At the same time, Debian adds that package to its unstable branch.

Now, this is actually somewhat exciting, because this is the moment that Jia Tan has been working towards for years, literally. And yet, a couple of things happen that put the whole operation at risk.

Firstly, developer teknoraver sends a pull request to "stop linking liblzma to libsystemd". This probably would've defeated the backdoor, which adds some extra pressure on Jia to get the backdoor shipped before such a thing happens.

On the Red Hat side of things, they start seeing Valgrind errors in a liblzma function that is the entry point of the backdoor.

Developer Russ Cox puts it perfectly when he says: "the race is on to fix this before the Linux distributions dig too deeply". Even worse, the above mentioned libsystemd pull request is also merged: "another race is on, to get liblzma backdoor'ed before the distros break the approach entirely".

Jia Tan, on the 9th of March, commits a couple of things.

Firstly, they commit a fake fix for the Valgrind issue. The real issue is in the backdoor, but of course they cannot admit that, so they do this "fake" fix as a misdirection. But, at the same time, they also update the backdoor files; they do this by updating the binary test files, saying that - and I quote -

The original files were generated with random local to my machine. To better reproduce these files in the future, a constant seed was used to recreate these files.

None of this is true; regardless, a new version of xz is immediately published, v5.6.1, which indeed fixes the Red Hat Valgrind issue.

Hans Jansen, who had written the testing infrastructure, also comes back and files a Debian bug to ask for xz-utils to be updated to the latest version, 5.6.1. Other developers that seem just as fake as Jansen show up to support him, and a couple of days later Debian indeed updates the package.

On the following day, Jia Tan themselves files a bug with Ubuntu asking for the package to be updated to 5.6.1 too.

Finally, on the 28th, developer Andres Freund discovers the backdoor. He immediately notifies Debian and other distributions. Debian rolls back to xz 5.4.5, Arch Linux starts building xz from Git instead of relying on the tarball, and Red Hat assigns it the critical vulnerability identifier CVE-2024-3094. The following day, Andres Freund also publishes his findings.

Of course, there's currently great speculation over the exact identity of Jia Tan. Most qualified security experts seem to think that this could've only been done by a group of people, which of course raises some important questions, such as: who paid them to do this? We're talking about hiring a team of very experienced developers for years.

An important part of the speculation relies on analyzing the commit history of Jia Tan, especially the timezones and commit times.

Almost all commits by Jia Tan use the Chinese timezone, +0800. However, some commits switch between UTC+02 and UTC+03, depending on Daylight Saving Time, which would indicate that Jia Tan is working from Eastern Europe. If this were true, then the commit times would generally run from 9am to 6pm, which would correspond to a standard day job. On top of that, it seems like Jia Tan often worked during multiple Chinese festivities, but there are no commits during European festivities such as Christmas or New Year's. However, the commits with those different timezones were merged by Collin, who is also in the UTC+02/03 timezone. Could it be that Collin accidentally changed the timezone of those commits? Please note that other commits merged by Collin preserve the original +0800 timezone, so it's not so clear what the right explanation might be. According to a statement given to Wired by Dave Aitel, former NSA hacker and founder of cybersecurity firm Immunity:

The majority of the clues lead back to Russia, and specifically Russia's APT29 hacking group - widely believed to work for Russia's foreign intelligence agency.

Of course, it could absolutely be someone else - another group, most likely.
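If you want to try the timezone analysis described above yourself, here's a minimal sketch of the idea. It assumes you feed it author dates in strict ISO 8601 form (for example the output of `git log --format=%aI`); the function name `summarize_commit_times` is made up for illustration, and this is not the method any of the cited analysts actually used.

```python
from collections import Counter
from datetime import datetime


def summarize_commit_times(iso_timestamps: list[str]) -> tuple[Counter, Counter]:
    """Count UTC offsets and local commit hours from ISO 8601 timestamps."""
    offsets, hours = Counter(), Counter()
    for stamp in iso_timestamps:
        # fromisoformat keeps the offset the author's machine recorded,
        # so .hour is the local hour as the author saw it.
        moment = datetime.fromisoformat(stamp)
        offsets[moment.strftime("%z")] += 1  # e.g. "+0800"
        hours[moment.hour] += 1
    return offsets, hours
```

The offsets counter tells you which timezones show up and how often, and the hours counter lets you see whether the local times cluster around office hours. Remember that both values come from the author's own machine and are trivial to fake, so this is a clue at best.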

Yet again, I would like to thank Russ Cox, who has provided the most in-depth timeline, which I've extensively used to write this section.

One key point that I'd like to make is: yes, we've been very lucky to have discovered this attack. This could've gone wrong. Many, rightfully, say: hey, surely, there must be lots of attacks like this that go unnoticed, if anybody can just start maintaining some understaffed project and, in a few years, just throw a backdoor in.

But let's remember that this was a very long-term, very complex and very coordinated technical and social attack. Most likely, it required a lot of effort and resources, which not everybody has, even organized groups. Nonetheless, it could be that other similar attacks have happened (or are currently underway) and we don't know about them. The world's unsafe!