Gentoo Logo
Gentoo Logo Side
Gentoo Spaceship

Contributors:
. Aaron W. Swenson
. Agostino Sarubbo
. Alexey Shvetsov
. Alexis Ballier
. Alexys Jacob
. Alice Ferrazzi
. Alice Ferrazzi
. Andreas K. Hüttel
. Anthony Basile
. Arun Raghavan
. Bernard Cafarelli
. Brian Harring
. Christian Ruppert
. Chí-Thanh Christopher Nguyễn
. Denis Dupeyron
. Detlev Casanova
. Diego E. Pettenò
. Domen Kožar
. Doug Goldstein
. Eray Aslan
. Erik Mackdanz
. Fabio Erculiani
. Gentoo Haskell Herd
. Gentoo Miniconf 2016
. Gentoo Monthly Newsletter
. Gentoo News
. Gilles Dartiguelongue
. Greg KH
. Göktürk Yüksek
. Hanno Böck
. Hans de Graaff
. Ian Whyman
. Jan Kundrát
. Jason A. Donenfeld
. Jeffrey Gardner
. Joachim Bartosik
. Johannes Huber
. Jonathan Callen
. Jorge Manuel B. S. Vicetto
. Kristian Fiskerstrand
. Liam McLoughlin
. Luca Barbato
. Marek Szuba
. Mart Raudsepp
. Matt Turner
. Matthew Thode
. Michael Palimaka
. Michal Hrusecky
. Michał Górny
. Mike Doty
. Mike Gilbert
. Mike Pagano
. Nathan Zachary
. Pacho Ramos
. Patrick Kursawe
. Patrick Lauer
. Patrick McLean
. Paweł Hajdan, Jr.
. Piotr Jaroszyński
. Rafael G. Martins
. Remi Cardona
. Richard Freeman
. Robin Johnson
. Sean Amoss
. Sebastian Pipping
. Steev Klimaszewski
. Sven Vermeulen
. Sven Wegener
. Tom Wijsman
. Yury German
. Zack Medico

Last updated:
February 22, 2018, 06:04 UTC

Disclaimer:
Views expressed in the content published here do not necessarily represent the views of Gentoo Linux or the Gentoo Foundation.


Bugs? Comments? Suggestions? Contact us!

Powered by:
Planet Venus

Welcome to Planet Gentoo, an aggregation of Gentoo-related weblog articles written by Gentoo developers. For a broader range of topics, you might be interested in Gentoo Universe.

February 19, 2018
Jason A. Donenfeld a.k.a. zx2c4 (homepage, bugs)
WireGuard in Google Summer of Code (February 19, 2018, 14:55 UTC)

WireGuard is participating in Google Summer of Code 2018. If you're a student — bachelors, masters, PhD, or otherwise — who would like to be funded this summer for writing interesting kernel code, studying cryptography, building networks, making mobile apps, contributing to the larger open source ecosystem, doing web development, writing documentation, or working on a wide variety of interesting problems, then this may be appealing. You'll be mentored by world-class experts, and the summer will certainly boost your skills. Details are on this page — simply contact the WireGuard team to get a proposal into the pipeline.

Alice Ferrazzi a.k.a. alicef (homepage, bugs)

Gentoo has been accepted as a Google Summer of Code 2018 mentoring organization

Gentoo has been accepted into the Google Summer of Code 2018!

Contacts:
If you want to help in any way with Gentoo GSoC 2018 please add yourself here [1] as soon as possible, is important!. Someone from the Gentoo GSoC team will contact you back, as soon as possible. We are always searching for people willing to help with Gentoo GSoC 2018!

Students:
If you are a student and want to spend your summer on Gentoo projects and having fun writing code, you can start discussing ideas in the #gentoo-soc IRC Freenode channel [2].
You can find some ideas example here [3], but of course could be also something different.
Just remember that we strongly recommend that you work with a potential mentor to develop your idea before proposing it formally. Don’t waste any time because there’s typically some polishing which needs to occur before the deadline (March 28th)

I can assure you that Gentoo GSoC is really fun and a great experience for improving yourself. It is also useful for making yourself known in the Gentoo community. Contributing to the Gentoo GSoC will be contributing to Gentoo, a bleeding edge distribution, considered by many to be for experts and with near-unlimited adaptability.

If you require any further information, please do not hesitate to contact us [4] or check the Gentoo GSoC 2018 wiki page [5].

[1] https://wiki.gentoo.org/wiki/Google_Summer_of_Code/2018/Mentors
[2] http://webchat.freenode.net/?channels=gentoo-soc
[3] https://wiki.gentoo.org/wiki/Google_Summer_of_Code/2018/Ideas
[4] soc-mentors@gentoo.org
[5] https://wiki.gentoo.org/wiki/Google_Summer_of_Code/2018

February 14, 2018
Sebastian Pipping a.k.a. sping (homepage, bugs)
I love free software… and Gentoo does! #ilovefs (February 14, 2018, 14:25 UTC)

Some people care if software is free of cost or if it has the best features, above everything else. I don’t. I care that I can legally inspect its inner workings, modify and share modified versions. That’s why I happily avoid macOS, Windows, Skype, Photoshop.

I ran into these two pieces involving Gentoo in the Gallery of Free Software lovers and would like to share them with you:

Images are licensed under CC BY-SA 4.0 (with attribution going to Free Software Foundation Europe) as confirmed by Max Mehl.

February 10, 2018
Matthew Thode a.k.a. prometheanfire (homepage, bugs)
Native ZFS encryption for your rootfs (February 10, 2018, 06:00 UTC)

Disclaimer

I'm not responsible if you ruin your system, this guide functions as documentation for future me. Remember to back up your data.

Why do this instead of luks

I wanted to remove a layer from the File to Disk layering, before it was ZFS -> LUKS -> disk, now it's ZFS -> disk.

Prework

I just got a new laptop and wanted to just migrate the data, luckily the old laptop was using ZFS as well, so the data could be sent/received though native ZFS means.

The actual setup

Set up your root pool with the encryption key, it will be inherited by all child datasets, no child datasets will be allowed to be unencrypted.

In my case the pool name was slaanesh-zp00, so I ran the following to create the fresh pool.

zpool create -O encryption=on -O keyformat=passphrase zfstest /dev/zvol/slaanesh-zp00/zfstest

After that just go on and create your datasets as normal, transfer old data as needed (it'll be encrypted as it's written). See https://wiki.gentoo.org/wiki/ZFS for a good general guide on setting up your datasets.

decrypting at boot

If you are using dracut it should just work. No changes to what you pass on the kernel command line are needed. The code is upstream in https://github.com/zfsonlinux/zfs/blob/master/contrib/dracut/90zfs/zfs-load-key.sh.in

notes

Make sure you install from git master, there was a disk format change for encrypted datasets that just went in a week or so ago.

February 06, 2018
Gentoo at FOSDEM 2018 (February 06, 2018, 19:49 UTC)

Gentoo Linux participated with a stand during this year's FOSDEM 2018, as has been the case for the past several years. Three Gentoo developers had talks this year, Haubi was back with a Gentoo-related talk on Unix? Windows? Gentoo! - POSIX? Win32? Native Portability to the max!, dilfridge talked about Perl in the Physics Lab … Continue reading "Gentoo at FOSDEM 2018"

February 05, 2018
Greg KH a.k.a. gregkh (homepage, bugs)
Linux Kernel Release Model (February 05, 2018, 17:13 UTC)

Note

This post is based on a whitepaper I wrote at the beginning of 2016 to be used to help many different companies understand the Linux kernel release model and encourage them to start taking the LTS stable updates more often. I then used it as a basis of a presentation I gave at the Linux Recipes conference in September 2017 which can be seen here.

With the recent craziness of Meltdown and Spectre , I’ve seen lots of things written about how Linux is released and how we handle handles security patches that are totally incorrect, so I figured it is time to dust off the text, update it in a few places, and publish this here for everyone to benefit from.

I would like to thank the reviewers who helped shape the original whitepaper, which has helped many companies understand that they need to stop “cherry picking” random patches into their device kernels. Without their help, this post would be a total mess. All problems and mistakes in here are, of course, all mine. If you notice any, or have any questions about this, please let me know.

Overview

This post describes how the Linux kernel development model works, what a long term supported kernel is, how the kernel developers approach security bugs, and why all systems that use Linux should be using all of the stable releases and not attempting to pick and choose random patches.

Linux Kernel development model

The Linux kernel is the largest collaborative software project ever. In 2017, over 4,300 different developers from over 530 different companies contributed to the project. There were 5 different releases in 2017, with each release containing between 12,000 and 14,500 different changes. On average, 8.5 changes are accepted into the Linux kernel every hour, every hour of the day. A non-scientific study (i.e. Greg’s mailbox) shows that each change needs to be submitted 2-3 times before it is accepted into the kernel source tree due to the rigorous review and testing process that all kernel changes are put through, so the engineering effort happening is much larger than the 8 changes per hour.

At the end of 2017 the size of the Linux kernel was just over 61 thousand files consisting of 25 million lines of code, build scripts, and documentation (kernel release 4.14). The Linux kernel contains the code for all of the different chip architectures and hardware drivers that it supports. Because of this, an individual system only runs a fraction of the whole codebase. An average laptop uses around 2 million lines of kernel code from 5 thousand files to function properly, while the Pixel phone uses 3.2 million lines of kernel code from 6 thousand files due to the increased complexity of a SoC.

Kernel release model

With the release of the 2.6 kernel in December of 2003, the kernel developer community switched from the previous model of having a separate development and stable kernel branch, and moved to a “stable only” branch model. A new release happened every 2 to 3 months, and that release was declared “stable” and recommended for all users to run. This change in development model was due to the very long release cycle prior to the 2.6 kernel (almost 3 years), and the struggle to maintain two different branches of the codebase at the same time.

The numbering of the kernel releases started out being 2.6.x, where x was an incrementing number that changed on every release The value of the number has no meaning, other than it is newer than the previous kernel release. In July 2011, Linus Torvalds changed the version number to 3.x after the 2.6.39 kernel was released. This was done because the higher numbers were starting to cause confusion among users, and because Greg Kroah-Hartman, the stable kernel maintainer, was getting tired of the large numbers and bribed Linus with a fine bottle of Japanese whisky.

The change to the 3.x numbering series did not mean anything other than a change of the major release number, and this happened again in April 2015 with the movement from the 3.19 release to the 4.0 release number. It is not remembered if any whisky exchanged hands when this happened. At the current kernel release rate, the number will change to 5.x sometime in 2018.

Stable kernel releases

The Linux kernel stable release model started in 2005, when the existing development model of the kernel (a new release every 2-3 months) was determined to not be meeting the needs of most users. Users wanted bugfixes that were made during those 2-3 months, and the Linux distributions were getting tired of trying to keep their kernels up to date without any feedback from the kernel community. Trying to keep individual kernels secure and with the latest bugfixes was a large and confusing effort by lots of different individuals.

Because of this, the stable kernel releases were started. These releases are based directly on Linus’s releases, and are released every week or so, depending on various external factors (time of year, available patches, maintainer workload, etc.)

The numbering of the stable releases starts with the number of the kernel release, and an additional number is added to the end of it.

For example, the 4.9 kernel is released by Linus, and then the stable kernel releases based on this kernel are numbered 4.9.1, 4.9.2, 4.9.3, and so on. This sequence is usually shortened with the number “4.9.y” when referring to a stable kernel release tree. Each stable kernel release tree is maintained by a single kernel developer, who is responsible for picking the needed patches for the release, and doing the review/release process. Where these changes are found is described below.

Stable kernels are maintained for as long as the current development cycle is happening. After Linus releases a new kernel, the previous stable kernel release tree is stopped and users must move to the newer released kernel.

Long-Term Stable kernels

After a year of this new stable release process, it was determined that many different users of Linux wanted a kernel to be supported for longer than just a few months. Because of this, the Long Term Supported (LTS) kernel release came about. The first LTS kernel was 2.6.16, released in 2006. Since then, a new LTS kernel has been picked once a year. That kernel will be maintained by the kernel community for at least 2 years. See the next section for how a kernel is chosen to be a LTS release.

Currently the LTS kernels are the 4.4.y, 4.9.y, and 4.14.y releases, and a new kernel is released on average, once a week. Along with these three kernel releases, a few older kernels are still being maintained by some kernel developers at a slower release cycle due to the needs of some users and distributions.

Information about all long-term stable kernels, who is in charge of them, and how long they will be maintained, can be found on the kernel.org release page.

LTS kernel releases average 9-10 patches accepted per day, while the normal stable kernel releases contain 10-15 patches per day. The number of patches fluctuates per release given the current time of the corresponding development kernel release, and other external variables. The older a LTS kernel is, the less patches are applicable to it, because many recent bugfixes are not relevant to older kernels. However, the older a kernel is, the harder it is to backport the changes that are needed to be applied, due to the changes in the codebase. So while there might be a lower number of overall patches being applied, the effort involved in maintaining a LTS kernel is greater than maintaining the normal stable kernel.

Choosing the LTS kernel

The method of picking which kernel the LTS release will be, and who will maintain it, has changed over the years from an semi-random method, to something that is hopefully more reliable.

Originally it was merely based on what kernel the stable maintainer’s employer was using for their product (2.6.16.y and 2.6.27.y) in order to make the effort of maintaining that kernel easier. Other distribution maintainers saw the benefit of this model and got together and colluded to get their companies to all release a product based on the same kernel version without realizing it (2.6.32.y). After that was very successful, and allowed developers to share work across companies, those companies decided to not do that anymore, so future LTS kernels were picked on an individual distribution’s needs and maintained by different developers (3.0.y, 3.2.y, 3.12.y, 3.16.y, and 3.18.y) creating more work and confusion for everyone involved.

This ad-hoc method of catering to only specific Linux distributions was not beneficial to the millions of devices that used Linux in an embedded system and were not based on a traditional Linux distribution. Because of this, Greg Kroah-Hartman decided that the choice of the LTS kernel needed to change to a method in which companies can plan on using the LTS kernel in their products. The rule became “one kernel will be picked each year, and will be maintained for two years.” With that rule, the 3.4.y, 3.10.y, and 3.14.y kernels were picked.

Due to a large number of different LTS kernels being released all in the same year, causing lots of confusion for vendors and users, the rule of no new LTS kernels being based on an individual distribution’s needs was created. This was agreed upon at the annual Linux kernel summit and started with the 4.1.y LTS choice.

During this process, the LTS kernel would only be announced after the release happened, making it hard for companies to plan ahead of time what to use in their new product, causing lots of guessing and misinformation to be spread around. This was done on purpose as previously, when companies and kernel developers knew ahead of time what the next LTS kernel was going to be, they relaxed their normal stringent review process and allowed lots of untested code to be merged (2.6.32.y). The fallout of that mess took many months to unwind and stabilize the kernel to a proper level.

The kernel community discussed this issue at its annual meeting and decided to mark the 4.4.y kernel as a LTS kernel release, much to the surprise of everyone involved, with the goal that the next LTS kernel would be planned ahead of time to be based on the last kernel release of 2016 in order to provide enough time for companies to release products based on it in the next holiday season (2017). This is how the 4.9.y and 4.14.y kernels were picked as the LTS kernel releases.

This process seems to have worked out well, without many problems being reported against the 4.9.y tree, despite it containing over 16,000 changes, making it the largest kernel to ever be released.

Future LTS kernels should be planned based on this release cycle (the last kernel of the year). This should allow SoC vendors to plan ahead on their development cycle to not release new chipsets based on older, and soon to be obsolete, LTS kernel versions.

Stable kernel patch rules

The rules for what can be added to a stable kernel release have remained almost identical for the past 12 years. The full list of the rules for patches to be accepted into a stable kernel release can be found in the Documentation/process/stable_kernel_rules.rst kernel file and are summarized here. A stable kernel change:

  • must be obviously correct and tested.
  • must not be bigger than 100 lines.
  • must fix only one thing.
  • must fix something that has been reported to be an issue.
  • can be a new device id or quirk for hardware, but not add major new functionality
  • must already be merged into Linus’s tree

The last rule, “a change must be in Linus’s tree”, prevents the kernel community from losing fixes. The community never wants a fix to go into a stable kernel release that is not already in Linus’s tree so that anyone who upgrades should never see a regression. This prevents many problems that other projects who maintain a stable and development branch can have.

Kernel Updates

The Linux kernel community has promised its userbase that no upgrade will ever break anything that is currently working in a previous release. That promise was made in 2007 at the annual Kernel developer summit in Cambridge, England, and still holds true today. Regressions do happen, but those are the highest priority bugs and are either quickly fixed, or the change that caused the regression is quickly reverted from the Linux kernel tree.

This promise holds true for both the incremental stable kernel updates, as well as the larger “major” updates that happen every three months.

The kernel community can only make this promise for the code that is merged into the Linux kernel tree. Any code that is merged into a device’s kernel that is not in the kernel.org releases is unknown and interactions with it can never be planned for, or even considered. Devices based on Linux that have large patchsets can have major issues when updating to newer kernels, because of the huge number of changes between each release. SoC patchsets are especially known to have issues with updating to newer kernels due to their large size and heavy modification of architecture specific, and sometimes core, kernel code.

Most SoC vendors do want to get their code merged upstream before their chips are released, but the reality of project-planning cycles and ultimately the business priorities of these companies prevent them from dedicating sufficient resources to the task. This, combined with the historical difficulty of pushing updates to embedded devices, results in almost all of them being stuck on a specific kernel release for the entire lifespan of the device.

Because of the large out-of-tree patchsets, most SoC vendors are starting to standardize on using the LTS releases for their devices. This allows devices to receive bug and security updates directly from the Linux kernel community, without having to rely on the SoC vendor’s backporting efforts, which traditionally are very slow to respond to problems.

It is encouraging to see that the Android project has standardized on the LTS kernels as a “minimum kernel version requirement”. Hopefully that will allow the SoC vendors to continue to update their device kernels in order to provide more secure devices for their users.

Security

When doing kernel releases, the Linux kernel community almost never declares specific changes as “security fixes”. This is due to the basic problem of the difficulty in determining if a bugfix is a security fix or not at the time of creation. Also, many bugfixes are only determined to be security related after much time has passed, so to keep users from getting a false sense of security by not taking patches, the kernel community strongly recommends always taking all bugfixes that are released.

Linus summarized the reasoning behind this behavior in an email to the Linux Kernel mailing list in 2008:

On Wed, 16 Jul 2008, pageexec@freemail.hu wrote:
>
> you should check out the last few -stable releases then and see how
> the announcement doesn't ever mention the word 'security' while fixing
> security bugs

Umm. What part of "they are just normal bugs" did you have issues with?

I expressly told you that security bugs should not be marked as such,
because bugs are bugs.

> in other words, it's all the more reason to have the commit say it's
> fixing a security issue.

No.

> > I'm just saying that why mark things, when the marking have no meaning?
> > People who believe in them are just _wrong_.
>
> what is wrong in particular?

You have two cases:

 - people think the marking is somehow trustworthy.

   People are WRONG, and are misled by the partial markings, thinking that
   unmarked bugfixes are "less important". They aren't.

 - People don't think it matters

   People are right, and the marking is pointless.

In either case it's just stupid to mark them. I don't want to do it,
because I don't want to perpetuate the myth of "security fixes" as a
separate thing from "plain regular bug fixes".

They're all fixes. They're all important. As are new features, for that
matter.

> when you know that you're about to commit a patch that fixes a security
> bug, why is it wrong to say so in the commit?

It's pointless and wrong because it makes people think that other bugs
aren't potential security fixes.

What was unclear about that?

    Linus

This email can be found here, and the whole thread is recommended reading for anyone who is curious about this topic.

When security problems are reported to the kernel community, they are fixed as soon as possible and pushed out publicly to the development tree and the stable releases. As described above, the changes are almost never described as a “security fix”, but rather look like any other bugfix for the kernel. This is done to allow affected parties the ability to update their systems before the reporter of the problem announces it.

Linus describes this method of development in the same email thread:

On Wed, 16 Jul 2008, pageexec@freemail.hu wrote:
>
> we went through this and you yourself said that security bugs are *not*
> treated as normal bugs because you do omit relevant information from such
> commits

Actually, we disagree on one fundamental thing. We disagree on
that single word: "relevant".

I do not think it's helpful _or_ relevant to explicitly point out how to
tigger a bug. It's very helpful and relevant when we're trying to chase
the bug down, but once it is fixed, it becomes irrelevant.

You think that explicitly pointing something out as a security issue is
really important, so you think it's always "relevant". And I take mostly
the opposite view. I think pointing it out is actually likely to be
counter-productive.

For example, the way I prefer to work is to have people send me and the
kernel list a patch for a fix, and then in the very next email send (in
private) an example exploit of the problem to the security mailing list
(and that one goes to the private security list just because we don't want
all the people at universities rushing in to test it). THAT is how things
should work.

Should I document the exploit in the commit message? Hell no. It's
private for a reason, even if it's real information. It was real
information for the developers to explain why a patch is needed, but once
explained, it shouldn't be spread around unnecessarily.

    Linus

Full details of how security bugs can be reported to the kernel community in order to get them resolved and fixed as soon as possible can be found in the kernel file Documentation/admin-guide/security-bugs.rst

Because security bugs are not announced to the public by the kernel team, CVE numbers for Linux kernel-related issues are usually released weeks, months, and sometimes years after the fix was merged into the stable and development branches, if at all.

Keeping a secure system

When deploying a device that uses Linux, it is strongly recommended that all LTS kernel updates be taken by the manufacturer and pushed out to their users after proper testing shows the update works well. As was described above, it is not wise to try to pick and choose various patches from the LTS releases because:

  • The releases have been reviewed by the kernel developers as a whole, not in individual parts
  • It is hard, if not impossible, to determine which patches fix “security” issues and which do not. Almost every LTS release contains at least one known security fix, and many yet “unknown”.
  • If testing shows a problem, the kernel developer community will react quickly to resolve the issue. If you wait months or years to do an update, the kernel developer community will not be able to even remember what the updates were given the long delay.
  • Changes to parts of the kernel that you do not build/run are fine and can cause no problems to your system. To try to filter out only the changes you run will cause a kernel tree that will be impossible to merge correctly with future upstream releases.

Note, this author has audited many SoC kernel trees that attempt to cherry-pick random patches from the upstream LTS releases. In every case, severe security fixes have been ignored and not applied.

As proof of this, I demoed at the Kernel Recipes talk referenced above how trivial it was to crash all of the latest flagship Android phones on the market with a tiny userspace program. The fix for this issue was released 6 months prior in the LTS kernel that the devices were based on, however none of the devices had upgraded or fixed their kernels for this problem. As of this writing (5 months later) only two devices have fixed their kernel and are now not vulnerable to that specific bug.

January 30, 2018
FOSDEM is near (January 30, 2018, 00:00 UTC)

Excitement is building with FOSDEM 2018 only a few days away. There are now 14 current and one former developer in planned attendance, along with many from the Gentoo community.

This year one Gentoo related talk titled Unix? Windows? Gentoo! will be given by Michael Haubenwallner (haubi ) of the Gentoo Prefix project.

Two more non-Gentoo related talks will be given by current Gentoo developers:

If you attend don’t miss out on your opportunity to build the Gentoo web-of-trust by bringing a valid governmental ID, a printed key fingerprint, and various UIDs you want certified. See you at FOSDEM!

January 23, 2018
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Evaluating ScyllaDB for production 1/2 (January 23, 2018, 17:31 UTC)

I have recently been conducting a quite deep evaluation of ScyllaDB to find out if we could benefit from this database in some of our intensive and latency critical data streams and jobs.

I’ll try to share this great experience within two posts:

  1. The first one (you’re reading) will walk through how to prepare yourself for a successful Proof Of Concept based evaluation with the help of the ScyllaDB team.
  2. The second post will cover the technical aspects and details of the POC I’ve conducted with the various approaches I’ve followed to find the most optimal solution.

But let’s start with how I got into this in the first place…


Selecting ScyllaDB

I got interested in ScyllaDB because of its philosophy and engagement and I quickly got into it by being a modest contributor and its Gentoo Linux packager (not in portage yet).

Of course, I didn’t pick an interest in that technology by chance:

We’ve been using MongoDB in (mass) production at work for a very very long time now. I can easily say we were early MongoDB adopters. But there’s no wisdom in saying that MongoDB is not suited for every use case and the Hadoop stack has come very strong in our data centers since then, with a predominance of Hive for the heavy duty and data hungry workflows.

One thing I was never satisfied with MongoDB was its primary/secondary architecture which makes you lose write throughput and is even more horrible when you want to set up what they call a “cluster” which is in fact some mediocre abstraction they add on top of replica-sets. To say the least, it is inefficient and cumbersome to operate and maintain.

So I obviously had Cassandra on my radar for a long time, but I was pushed back by its Java stack, heap size and silly tuning… Also, coming from the versatile MongoDB world, Cassandra’s CQL limitations looked dreadful at that time…

The day I found myself on ScyllaDB’s webpage and read their promises, I was sure to be challenging our current use cases with this interesting sea monster.


Setting up a POC with the people at ScyllaDB

Through my contributions around my packaging of ScyllaDB for Gentoo Linux, I got to know a bit about the people behind the technology. They got interested in why I was packaging this in the first place and when I explained my not-so-secret goal of challenging our production data workflows using Scylla, they told me that they would love to help!

I was a bit surprised at first because this was the first time I ever saw a real engagement of the people behind a technology into someone else’s POC.

Their pitch is simple, they will help (for free) anyone conducting a serious POC to make sure that the outcome and the comprehension behind it is the best possible. It is a very mature reasoning to me because it is easy to make false assumptions and conclude badly when testing a technology you don’t know, even more when your use cases are complex and your expectations are very high like us.

Still, to my current knowledge, they’re the only ones in the data industry to have this kind of logic in place since the start. So I wanted to take this chance to thank them again for this!

The POC includes:

  • no bullshit, simple tech-to-tech relationship
  • a private slack channel with multiple ScyllaDB’s engineers
  • video calls to introduce ourselves and discuss our progress later on
  • help in schema design and logic
  • fast answers to every question you have
  • detailed explanations on the internals of the technology
  • hardware sizing help and validation
  • funny comments and French jokes (ok, not suitable for everyone)

 

 

 

 

 

 

 

 

 


Lessons for a successful POC

As I said before, you’ve got to be serious in your approach to make sure your POC will be efficient and will lead to an unbiased and fair conclusion.

This is a list of the main things I consider important to have prepared before you start.

Have some background

Make sure to read some literature to have the key concepts and words in mind before you go. It is even more important if like me you do not come from the Cassandra world.

I found that the Cassandra: The Definitive Guide book at O’Reilly is a great read. Also, make sure to go around ScyllaDB’s documentation.

Work with a shared reference document

Make sure you share with the ScyllaDB guys a clear and detailed document explaining exactly what you’re trying to achieve and how you are doing it today (if you plan on migrating like we did).

I made a google document for this because it felt the easiest. This document will be updated as you go and will serve as a reference for everyone participating in the POC.

This shared reference document is very important, so if you don’t know how to construct it or what to put in it, here is how I structured it:

  1. Who’s participating at <your company>
    • photo + name + speciality
  2. Who’s participating at ScyllaDB
  3. POC hardware
    • if you have your own bare metal machines you want to run your POC on, give every detail about their number and specs
    • if not, explain how you plan to setup and run your scylla cluster
  4. Reference infrastructure
    • give every details on the technologies and on the hardware of the servers that are currently responsible for running your workflows
    • explain your clusters and their speciality
  5. Use case #1 : <name>
    • Context
      • give context about your use case by explaining it without tech words, think from the business / user point of view
    • Current implementations
      • that’s where you get technical
      • technology names and where they come into play in your current stack
      • insightful data volumes and cardinality
      • current schema models
    • Workload related to this use case
      • queries per second per data source / type
      • peek hours or no peek hours?
      • criticality
    • Questions we want to answer to
      • remember, the NoSQL world is lead by query-based-modeling schema design logic, cassandra/scylla is no exception
      • write down the real questions you want your data model(s) to be able to answer to
      • group them and rate them by importance
    • Validated models
      • this one comes during the POC when you have settled on the data models
      • write them down, explain them or relate them to the questions they answer to
      • copy/paste some code showcasing how to work with them
    • Code examples
      • depending on the complexity of your use case, you may have multiple constraints or ways to compare your current implementation with your POC
      • try to explain what you test and copy/paste the best code you came up with to validate each point

Have monitoring in place

ScyllaDB provides a monitoring platform based on Docker, Prometheus and Grafana that is efficient and simple to set up. I strongly recommend that you set it up, as it provides valuable insights almost immediately, and on an ongoing basis.

Also you should strive to give access to your monitoring to the ScyllaDB guys, if that’s possible for you. They will provide with a fixed IP which you can authorize to access your grafana dashboards so they can have a look at the performances of your POC cluster as you go. You’ll learn a great deal about ScyllaDB’s internals by sharing with them.

Know when to stop

The main trap in a POC is to work without boundaries. Since you’re looking for the best of what you can get out of a technology, you’ll get tempted to refine indefinitely.

So this is good to have at least an idea on the minimal figures you’d like to reach to get satisfied with your tests. You can always push a bit further but not for too long!

Plan some high availability tests

Even if you first came to ScyllaDB for its speed, make sure to test its high availability capabilities based on your experience.

Most importantly, make sure you test it within your code base and guidelines. How will your code react and handle a failure, partial and total? I was very surprised and saddened to discover so little literature on the subject in the Cassandra community.

POC != production

Remember that even when everything is right on paper, production load will have its share of surprises and unexpected behaviours. So keep a good deal of flexibility in your design and your capacity planning to absorb them.

Make time

Our POC lasted almost 5 months instead of estimated 3, mostly because of my agenda’s unwillingness to cooperate…

As you can imagine this interruption was not always optimal, for either me or the ScyllaDB guys, but they were kind not to complain about it. So depending on how thorough you plan to be, make sure you make time matching your degree of demands. The reference document is also helpful to get back to speed.


Feedback for the ScyllaDB guys

Here are the main points I noted during the POC that the guys from ScyllaDB could improve on.

They are subjective of course but it’s important to give feedback so here it goes. I’m fully aware that everyone is trying to improve, so I’m not pointing any fingers at all.

I shared those comments already with them and they acknowledged them very well.

More video meetings on start

When starting the POC, try to have some pre-scheduled video meetings to set it right in motion. This will provide a good pace as well as making sure that everyone is on the same page.

Make a POC kick starter questionnaire

Having a minimal plan to follow with some key points to set up just like the ones I explained before would help. Maybe also a minimal questionnaire to make sure that the key aspects and figures have been given some thought since the start. This will raise awareness on the real answers the POC aims to answer.

To put it simpler: some minimal formalism helps to check out the key aspects and questions.

Develop a higher client driver expertise

This one was the most painful to me, and is likely to be painful for anyone who, like me, is not coming from the Cassandra world.

Finding good and strong code examples and guidelines on the client side was hard and that’s where I felt the most alone. This was not pleasant because a technology is definitely validated through its usage which means on the client side.

Most of my tests were using python and the python-cassandra driver so I had tons of questions about it with no sticking answers. Same thing went with the spark-cassandra-connector when using scala where some key configuration options (not documented) can change the shape of your results drastically (more details on the next post).

High Availability guidelines and examples

This one still strikes me as the most awkward on the Cassandra community. I literally struggled with finding clear and detailed explanations about how to handle failure more or less gracefully with the python driver (or any other driver).

This is kind of a disappointment to me for a technology that position itself as highly available… I’ll get into more details about it on the next post.

A clearer sizing documentation

Even if there will never be a magic formula, there are some rules of thumb that exist for sizing your hardware for ScyllaDB. They should be written down more clearly in a maybe dedicated documentation (sizing guide is labeled as admin guide at time of writing).

Some examples:

  • RAM per core ? what is a core ? relation to shard ?
  • Disk / RAM maximal ratio ?
  • Multiple SSDs vs one NMVe ?
  • Hardware RAID vs software RAID ? need a RAID controller at all ?

Maybe even provide a bare metal complete example from two different vendors such as DELL and HP.

What’s next?

In the next post, I’ll get into more details on the POC itself and the technical learnings we found along the way. This will lead to the final conclusion and the next move we engaged ourselves with.

Marek Szuba a.k.a. marecki (homepage, bugs)
Randomness in virtual machines (January 23, 2018, 15:47 UTC)

I always felt that entropy available to the operating system must be affected by running said operating system in a virtual environment – after all, unpredictable phenomena used to feed the entropy pool are commonly based on hardware and in a VM most hardware either is simulated or has the hypervisor mediate access to it. While looking for something tangentially related to the subject, I have recently stumbled upon a paper commissioned by the German Federal Office for Information Security which covers this subject, with particular emphasis on entropy sources used by the standard Linux random-number generator (i.e. what feeds /dev/random and /dev/urandom), in extreme detail:

https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Publikationen/Studien/ZufallinVMS/Randomness-in-VMs.pdf?__blob=publicationFile&v=3

Personally I have found this paper very interesting but since not everyone can stomach a 142-page technical document, here are some of the main points made by its authors regarding entropy.

  1. As a reminder – the RNG in recent versions of Linux uses five sources of random noise, three of which do not require dedicated hardware (emulated or otherwise): Human-Interface Devices, rotational block devices, and interrupts.
  2. Running in a virtual machine does not affect entropy sources implemented purely in hardware or purely in software. However, it does affect hybrid sources – which in case of Linux essentially means all of the default ones.
  3. Surprisingly enough, virtualisation seems to have no negative effect on the quality of delivered entropy. This is at least in part due to additional CPU execution-time jitter introduced by the hypervisor compensating for increased predictability of certain emulated devices.
  4. On the other hand, virtualisation can strongly affect the quantity of produced entropy. More on this below.
  5. Low quantity of available entropy means among other things that it takes virtual machines visibly longer to bring /dev/urandom to usable state. This is a problem if /dev/urandom is used by services started at boot time because they can be initialised using low-quality random data.

Why exactly is the quantity of entropy low in virtual machines? The problem is that in a lot of configurations, only the last of the three standard noise sources will be active. On the one hand, even physical servers tend to be fairly inactive on the HID front. On the other, the block-device source does nothing unless directly backed by a rotational device – which has been becoming less and less likely, especially when we talk about large cloud providers who, chances are, hold your persistent storage on distributed networked file systems which are miles away from actual hard drives. This leaves interrupts as the only available noise source. Now take a machine configured this way and have it run a VPN endpoint, a HTTPS server, a DNSSEC-enabled DNS server… You get the idea.

But wait, it gets worse. Chances are many devices in your VM, especially ones like network cards which are expected to be particularly active and therefore the main source of interrupts, are in fact paravirtualised rather than fully emulated. Will such devices still generate enough interrupts? That depends on the underlying hypervisor, or to be precise on the paravirtualisation framework it uses. The BSI paper discusses this for KVM/QEMU, Oracle VirtualBox, Microsoft Hyper-V, and VMWare ESXi:

  • the former two use the the VirtIO framework, which is integrated with the Linux interrupt-handling code;
  • VMWare drivers trigger the interrupt handler similarly to physical devices;
  • Hyper-V drivers use a dedicated communication channel called VMBus which does not invoke the LRNG interrupt handler. This means it is entirely feasible for a Linux Hyper-V guest to have all noise sources disabled, and reenabling the interrupt one comes at a cost of performance loss caused by the use of emulated rather than paravirtualised devices.

 

All in all, the paper in question has surprised me with the news of unchanged quality of entropy in VMs and confirmed my suspicions regarding its quantity. It also briefly mentions (which is how I have ended up finding it) the way I typically work around this problem, i.e. by employing VirtIO-RNG – a paravirtualised hardware RNG in the KVM/VirtualBox guests which which interfaces with a source of entropy (typically /dev/random, although other options are possible too) on the host. Combine that with haveged on the host, sprinkle a bit of rate limiting on top (I typically set it to 1 kB/s per guest, even though in theory haveged is capable of acquiring entropy at a rate several orders of magnitude higher) and chances are you will never have to worry about lack of entropy in your virtual machines again.

January 19, 2018
Greg KH a.k.a. gregkh (homepage, bugs)

I keep getting a lot of private emails about my previous post about the latest status of the Linux kernel patches to resolve both the Meltdown and Spectre issues.

These questions all seem to break down into two different categories, “What is the state of the Spectre kernel patches?”, and “Is my machine vunlerable?”

State of the kernel patches

As always, lwn.net covers the technical details about the latest state of the kernel patches to resolve the Spectre issues, so please go read that to find out that type of information.

And yes, it is behind a paywall for a few more weeks. You should be buying a subscription to get this type of thing!

Is my machine vunlerable?

For this question, it’s now a very simple answer, you can check it yourself.

Just run the following command at a terminal window to determine what the state of your machine is:

1
$ grep . /sys/devices/system/cpu/vulnerabilities/*

On my laptop, right now, this shows:

1
2
3
4
$ grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Vulnerable
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable: Minimal generic ASM retpoline

This shows that my kernel is properly mitigating the Meltdown problem by implementing PTI (Page Table Isolation), and that my system is still vulnerable to the Spectre variant 1, but is trying really hard to resolve the variant 2, but is not quite there (because I did not build my kernel with a compiler to properly support the retpoline feature).

If your kernel does not have that sysfs directory or files, then obviously there is a problem and you need to upgrade your kernel!

Some “enterprise” distributions did not backport the changes for this reporting, so if you are running one of those types of kernels, go bug the vendor to fix that, you really want a unified way of knowing the state of your system.

Note that right now, these files are only valid for the x86-64 based kernels, all other processor types will show something like “Not affected”. As everyone knows, that is not true for the Spectre issues, except for very old CPUs, so it’s a huge hint that your kernel really is not up to date yet. Give it a few months for all other processor types to catch up and implement the correct kernel hooks to properly report this.

And yes, I need to go find a working microcode update to fix my laptop’s CPU so that it is not vulnerable against Spectre…

January 18, 2018
Matthew Thode a.k.a. prometheanfire (homepage, bugs)
Undervolting your CPU for fun and profit (January 18, 2018, 06:00 UTC)

Disclaimer

I'm not responsible if you ruin your system, this guide functions as documentation for future me. While this should just cause MCEs or system resets if done wrong it might also be able to ruin things in a more permanent way.

Why do this

It lowers temps generally and if you hit thermal throttling (as happens with my 5th Generation X1 Carbon) you may thermally throttle less often or at a higher frequency.

Prework

Repaste first, it's generally easier to do and can result in better gains (and it's all about them gains). For instance, my Skull Canyon NUC was resetting itself thermally when stress testing. Repasting lowered the max temps by 20°C.

Undervolting - The How

I based my info on https://github.com/mihic/linux-intel-undervolt which seems to work on my Intel Kaby Lake based laptop.

Using the MSR registers to write the values via msr-tools wrmsr binary.

The following python3 snippet is how I got the values to write.

# for a -110mv offset I run the following
format(0xFFE00000&( (round(-110*1.024)&0xFFF) <<21), '08x')

What you are actually writing is actually as follows, with the plane index being for cpu, gpu or cache for 0, 1 or 2 respectively.

constant plane index constant write/read offset
80000 0 1 1 F1E00000

So we end up with 0x80000011F1E00000 to write to MSR register 0x150.

Undervolting - The Integration

I made a script in /opt/bin/, where I place my custom system scripts called undervolt.sh (make sure it's executable).

#!/usr/bin/env sh
/usr/sbin/wrmsr 0x150 0x80000011F1E00000  # cpu core  -110
/usr/sbin/wrmsr 0x150 0x80000211F1E00000  # cpu cache -110
/usr/sbin/wrmsr 0x150 0x80000111F4800000  # gpu core  -90

I then made a custom systemd unit in /etc/systemd/system called undervolt.service with the following content.

[Unit]
Description=Undervolt Service

[Service]
ExecStart=/opt/bin/undervolt.sh

[Install]
WantedBy=multi-user.target

I then systemctl enable undervolt.service so I'll get my settings on boot.

The following script was also placed in /usr/lib/systemd/system-sleep/undervolt.sh to get the settings after recovering from sleep, as they are lost at that point.

#!/bin/sh
if [ "${1}" == "post" ]; then
  /opt/bin/undervolt.sh
fi

That's it, everything after this point is what I went through in testing.

Testing for stability

I stress tested with mprime in stress mode for the CPU and glxmark or gputest for the GPU. I only recorded the results for the CPU, as that's what I care about.

No changes:

  • 15-17W package tdp
  • 15W core tdp
  • 3.06-3.22 Ghz

50mv Undervolt:

  • 15-16W package tdp
  • 14-15W core tdp
  • 3.22-3.43 Ghz

70mv undervolt:

  • 15-16W package tdp
  • 14W core tdp
  • 3.22-3.43 Ghz

90mv undervolt:

  • 15W package tdp
  • 13-14W core tdp
  • 3.32-3.54 Ghz

100mv undervolt:

  • 15W package tdp
  • 13W core tdp
  • 3.42-3.67 Ghz

110mv undervolt:

  • 15W package tdp
  • 13W core tdp
  • 3.42-3.67 Ghz

started getting gfx artifacts, switched the gfx undervolt to 100 here

115mv undervolt:

  • 15W package tdp
  • 13W core tdp
  • 3.48-3.72 Ghz

115mv undervolt with repaste:

  • 15-16W package tdp
  • 14-15W core tdp
  • 3.63-3.81 Ghz

120mv undervolt with repaste:

  • 15-16W package tdp
  • 14-15W core tdp
  • 3.63-3.81 Ghz

I decided on 110mv cpu and 90mv gpu undervolt for stability, with proper ventilation I get about 3.7-3.8 Ghz out of a max of 3.9 Ghz.

Other notes: Undervolting made the cpu max speed less spiky.

January 08, 2018
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)
FOSDEM 2018 talk: Perl in the Physics Lab (January 08, 2018, 17:07 UTC)

FOSDEM 2018, the "Free and Open Source Developers' European Meeting", takes place 3-4 February at Universite Libre de Bruxelles, Campus Solbosch, Brussels - and our measurement control software Lab::Measurement will be presented there in the Perl devrooom! As all of FOSDEM, the talk will also be streamed live and archived; more details on this follow later. Here's the abstract:

Perl in the Physics Lab
Andreas K. Hüttel
    Track: Perl Programming Languages devroom
    Room: K.4.601
    Day: Sunday
    Start: 11:00
    End: 11:40 
Let's visit our university lab. We work on low-temperature nanophysics and transport spectroscopy, typically measuring current through experimental chip structures. That involves cooling and temperature control, dc voltage sources, multimeters, high-frequency sources, superconducting magnets, and a lot more fun equipment. A desktop computer controls the experiment and records and evaluates data.

Some people (like me) want to use Linux, some want to use Windows. Not everyone knows Perl, not everyone has equal programming skills, not everyone knows equally much about the measurement hardware. I'm going to present our solution for this, Lab::Measurement. We implement a layer structure of Perl modules, all the way from the hardware access and the implementation of device-specific command sets to high level measurement control with live plotting and metadata tracking. Current work focuses on a port of the entire stack to Moose, to simplify and improve the code.

FOSDEM

January 06, 2018
Greg KH a.k.a. gregkh (homepage, bugs)
Meltdown and Spectre Linux kernel status (January 06, 2018, 12:36 UTC)

By now, everyone knows that something “big” just got announced regarding computer security. Heck, when the Daily Mail does a report on it , you know something is bad…

Anyway, I’m not going to go into the details about the problems being reported, other than to point you at the wonderfully written Project Zero paper on the issues involved here. They should just give out the 2018 Pwnie award right now, it’s that amazingly good.

If you do want technical details for how we are resolving those issues in the kernel, see the always awesome lwn.net writeup for the details.

Also, here’s a good summary of lots of other postings that includes announcements from various vendors.

As for how this was all handled by the companies involved, well this could be described as a textbook example of how NOT to interact with the Linux kernel community properly. The people and companies involved know what happened, and I’m sure it will all come out eventually, but right now we need to focus on fixing the issues involved, and not pointing blame, no matter how much we want to.

What you can do right now

If your Linux systems are running a normal Linux distribution, go update your kernel. They should all have the updates in them already. And then keep updating them over the next few weeks, we are still working out lots of corner case bugs given that the testing involved here is complex given the huge variety of systems and workloads this affects. If your distro does not have kernel updates, then I strongly suggest changing distros right now.

However there are lots of systems out there that are not running “normal” Linux distributions for various reasons (rumor has it that it is way more than the “traditional” corporate distros). They rely on the LTS kernel updates, or the normal stable kernel updates, or they are in-house franken-kernels. For those people here’s the status of what is going on regarding all of this mess in the upstream kernels you can use.

Meltdown – x86

Right now, Linus’s kernel tree contains all of the fixes we currently know about to handle the Meltdown vulnerability for the x86 architecture. Go enable the CONFIG_PAGE_TABLE_ISOLATION kernel build option, and rebuild and reboot and all should be fine.

However, Linus’s tree is currently at 4.15-rc6 + some outstanding patches. 4.15-rc7 should be out tomorrow, with those outstanding patches to resolve some issues, but most people do not run a -rc kernel in a “normal” environment.

Because of this, the x86 kernel developers have done a wonderful job in their development of the page table isolation code, so much so that the backport to the latest stable kernel, 4.14, has been almost trivial for me to do. This means that the latest 4.14 release (4.14.12 at this moment in time), is what you should be running. 4.14.13 will be out in a few more days, with some additional fixes in it that are needed for some systems that have boot-time problems with 4.14.12 (it’s an obvious problem, if it does not boot, just add the patches now queued up.)

I would personally like to thank Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra, Josh Poimboeuf, Juergen Gross, and Linus Torvalds for all of the work they have done in getting these fixes developed and merged upstream in a form that was so easy for me to consume to allow the stable releases to work properly. Without that effort, I don’t even want to think about what would have happened.

For the older long term stable (LTS) kernels, I have leaned heavily on the wonderful work of Hugh Dickins, Dave Hansen, Jiri Kosina and Borislav Petkov to bring the same functionality to the 4.4 and 4.9 stable kernel trees. I had also had immense help from Guenter Roeck, Kees Cook, Jamie Iles, and many others in tracking down nasty bugs and missing patches. I want to also call out David Woodhouse, Eduardo Valentin, Laura Abbott, and Rik van Riel for their help with the backporting and integration as well, their help was essential in numerous tricky places.

These LTS kernels also have the CONFIG_PAGE_TABLE_ISOLATION build option that should be enabled to get complete protection.

As this backport is very different from the mainline version that is in 4.14 and 4.15, there are different bugs happening, right now we know of some VDSO issues that are getting worked on, and some odd virtual machine setups are reporting strange errors, but those are the minority at the moment, and should not stop you from upgrading at all right now. If you do run into problems with these releases, please let us know on the stable kernel mailing list.

If you rely on any other kernel tree other than 4.4, 4.9, or 4.14 right now, and you do not have a distribution supporting you, you are out of luck. The lack of patches to resolve the Meltdown problem is so minor compared to the hundreds of other known exploits and bugs that your kernel version currently contains. You need to worry about that more than anything else at this moment, and get your systems up to date first.

Also, go yell at the people who forced you to run an obsoleted and insecure kernel version, they are the ones that need to learn that doing so is a totally reckless act.

Meltdown – ARM64

Right now the ARM64 set of patches for the Meltdown issue are not merged into Linus’s tree. They are staged and ready to be merged into 4.16-rc1 once 4.15 is released in a few weeks. Because these patches are not in a released kernel from Linus yet, I can not backport them into the stable kernel releases (hey, we have rules for a reason…)

Due to them not being in a released kernel, if you rely on ARM64 for your systems (i.e. Android), I point you at the Android Common Kernel tree All of the ARM64 fixes have been merged into the 3.18, 4.4, and 4.9 branches as of this point in time.

I would strongly recommend just tracking those branches as more fixes get added over time due to testing and things catch up with what gets merged into the upstream kernel releases over time, especially as I do not know when these patches will land in the stable and LTS kernel releases at this point in time.

For the 4.4 and 4.9 LTS kernels, odds are these patches will never get merged into them, due to the large number of prerequisite patches required. All of those prerequisite patches have been long merged and tested in the android-common kernels, so I think it is a better idea to just rely on those kernel branches instead of the LTS release for ARM systems at this point in time.

Also note, I merge all of the LTS kernel updates into those branches usually within a day or so of being released, so you should be following those branches no matter what, to ensure your ARM systems are up to date and secure.

Spectre

Now things get “interesting”…

Again, if you are running a distro kernel, you might be covered as some of the distros have merged various patches into them that they claim mitigate most of the problems here. I suggest updating and testing for yourself to see if you are worried about this attack vector

For upstream, well, the status is there is no fixes merged into any upstream tree for these types of issues yet. There are numerous patches floating around on the different mailing lists that are proposing solutions for how to resolve them, but they are under heavy development, some of the patch series do not even build or apply to any known trees, the series conflict with each other, and it’s a general mess.

This is due to the fact that the Spectre issues were the last to be addressed by the kernel developers. All of us were working on the Meltdown issue, and we had no real information on exactly what the Spectre problem was at all, and what patches were floating around were in even worse shape than what have been publicly posted.

Because of all of this, it is going to take us in the kernel community a few weeks to resolve these issues and get them merged upstream. The fixes are coming in to various subsystems all over the kernel, and will be collected and released in the stable kernel updates as they are merged, so again, you are best off just staying up to date with either your distribution’s kernel releases, or the LTS and stable kernel releases.

It’s not the best news, I know, but it’s reality. If it’s any consolation, it does not seem that any other operating system has full solutions for these issues either, the whole industry is in the same boat right now, and we just need to wait and let the developers solve the problem as quickly as they can.

The proposed solutions are not trivial, but some of them are amazingly good. The Retpoline post from Paul Turner is an example of some of the new concepts being created to help resolve these issues. This is going to be an area of lots of research over the next years to come up with ways to mitigate the potential problems involved in hardware that wants to try to predict the future before it happens.

Other arches

Right now, I have not seen patches for any other architectures than x86 and arm64. There are rumors of patches floating around in some of the enterprise distributions for some of the other processor types, and hopefully they will surface in the weeks to come to get merged properly upstream. I have no idea when that will happen, if you are dependant on a specific architecture, I suggest asking on the arch-specific mailing list about this to get a straight answer.

Conclusion

Again, update your kernels, don’t delay, and don’t stop. The updates to resolve these problems will be continuing to come for a long period of time. Also, there are still lots of other bugs and security issues being resolved in the stable and LTS kernel releases that are totally independent of these types of issues, so keeping up to date is always a good idea.

Right now, there are a lot of very overworked, grumpy, sleepless, and just generally pissed off kernel developers working as hard as they can to resolve these issues that they themselves did not cause at all. Please be considerate of their situation right now. They need all the love and support and free supply of their favorite beverage that we can provide them to ensure that we all end up with fixed systems as soon as possible.

January 03, 2018
FOSDEM 2018 (January 03, 2018, 00:00 UTC)

FOSDEM 2018 logo

Put on your cow bells and follow the herd of Gentoo developers to Université libre de Bruxelles in Brussels, Belgium. This year FOSDEM 2018 will be held on February 3rd and 4th.

Our developers will be ready to candidly greet all open source enthusiasts at the Gentoo stand in building K. Visit this year’s wiki page to see which developer will be running the stand during the different visitation time slots. So far seven developers have specified their attendance, with most-likely more on the way!

Unlike past years, Gentoo will not be hosting a keysigning party, however participants are encouraged to exchange and verify OpenPGP key data to continue building the Gentoo Web of Trust. See the wiki article for more details.

December 28, 2017
Michael Palimaka a.k.a. kensington (homepage, bugs)
Helper scripts for CMake-based ebuilds (December 28, 2017, 14:13 UTC)

I recently pushed two new scripts to the KDE overlay to help with CMake-based ebuild creation and maintenance.

cmake_dep_check inspects all CMakeLists.txt files under the current directory and maps find_package calls to Gentoo packages.

cmake_ebuild_check builds on cmake_dep_check and produces a list of CMake dependencies that are not present in the ebuild (and optionally vice versa).

If you find it useful or run into any issues feel free to drop me a line.

December 12, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.7 (December 12, 2017, 14:34 UTC)

This important release has been long awaited as it focused on improving overall performance of py3status as well as dramatically decreasing its memory footprint!

I want once again to salute the impressive work of @lasers, our amazing contributors from the USA who has become top one contributor of 2017 in term of commits and PRs.

Thanks to him, this release brings a whole batch of improvements and QA clean ups on various modules. I encourage you to go through the changelog to see everything.

Highlights

Deep rework of the usage and scheduling of threads to run modules has been done by @tobes.

  •  now py3status does not keep one thread per module running permanently but instead uses a queue to spawn a thread to execute the module only when its cache expires
  • this new scheduling and usage of threads allows py3status to run under asynchronous event loops and gevent will be supported on the upcoming 3.8
  • memory footprint of py3status got largely reduced thanks to the threads modifications and thanks to a nice hunt on ever growing and useless variables
  • modules error reporting is now more detailed

Milestone 3.8

The next release will bring some awesome new features such as gevent support, environment variable support in config file and per module persistent data storage as well as new modules!

Thanks contributors!

This release is their work, thanks a lot guys!

  • JohnAZoidberg
  • lasers
  • maximbaz
  • pcewing
  • tobes

November 17, 2017
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)
Python’s M2Crypto fails to compile (November 17, 2017, 21:29 UTC)

When updating my music server, I ran into a compilation error on Python’s M2Crypto. The error message was a little bit strange and not directly related to the package itself:

fatal error: openssl/ecdsa.h: No such file or directory

Obviously, that error is generated from OpenSSL and not directly within M2Crypto. Remembering that there are some known problems with the “bindist” USE flag, I took a look at OpenSSL and OpenSSH. Indeed, “bindist” was set. Simply removing the USE flag from those two packages took care of the problem:


# grep -i 'openssh\|openssl' /etc/portage/package.use
>=dev-libs/openssl-1.0.2m -bindist
net-misc/openssh -bindist

In this case, the problems makes sense based on the error message. The error indicated that the Elliptic Curve Digital Signature Algorithm (ECDSA) header was not found. In the previously-linked page about the “bindist” USE flag, it clearly states that having bindist set will “Disable/Restrict EC algorithms (as they seem to be patented)”.

Cheers,
Nathan Zachary

November 10, 2017
Alice Ferrazzi a.k.a. alicef (homepage, bugs)
Latex 001 (November 10, 2017, 06:03 UTC)

This is manly a memo for remember each time which packages are needed for Japanese.

Usually I use xetex with cjk for write latex document on Gentoo this is done by adding xetex and cjk support to texlive.

app-text/texlive cjk xetex app-text/texlive-core cjk xetex

for platex is needed to install

dev-texlive/texlive-langjapanese

Google-Summer-of-Code-summary week08 (November 10, 2017, 06:03 UTC)

Google Summer of Code summary week 08

What I did in this week 08 summary:

elivepatch:

  • Working with tempfile for keeping the uncompressed configuration file using the appropiate tempfile module.
  • Refactoring and code cleaning.
  • Check if the ebuild is present in the overlay before trying to merge it.
  • lpatch added locally
  • fix ebuild directory
  • Static patch and config filename on send This is useful for isolating the api requests using only the uuid as session identifier
  • Removed livepatchStatus and lpatch class configurations. Because we need a request to only be identified by is own UUID, for isolating the transaction.
  • return in case the request with same uuid is already present.
  • Fixed livepatch output name for reflecting the static setted patch name
  • module install removed as not needed
  • debug option is now also copying the build.log to the uuid directory, for investigating failed attempt * refactored uuid_dir with uuid in the case where uuid_dir is not actually containing the full path
  • adding debug_info to the configuration file if not already present however this setting is only needed for the livepatch creation (not tested yet)

Kpatc:

  • Investigate about dwarf attribute problems

What I need to do next time:

  • Finish the function for download the livepatch to the client
  • Testing elivepatch
  • Implementing the CVE patch uploader
  • Installing elivepatch to the Gentoo server
  • Fix kpatch-build for automatically work with gentoo-sources
  • Add more features to elivepatch

day 33 What was my plan for today?

  • testing and improving elivepatch

What i did today?

  • Working with tempfile for keeping the uncompressed configuration file using the appropiate tempfile module.
  • Refactoring and code cleaning.
  • Check if the ebuild is present in the overlay before trying to merge it.

what i will do next time?

* testing and improving elivepatch

day 34

What was my plan for today?

  • testing and improving elivepatch

What i did today?

  • lpatch added locally
  • fix ebuild directory
  • Static patch and config filename on send This is useful for isolating the api requests using only the uuid as session identifier
  • Removed livepatchStatus and lpatch class configurations. Because we need a request to only be identified by is own UUID, for isolating the transaction.
  • return in case the request with same uuid is already present.

we still have problem about "can'\''t find special struct alt_instr size."

I will investigate it tomorrow

what i will do next time?

  • testing and improving elivepatch
  • Investigating the missing informations in the livepatch

day 35

What was my plan for today?

  • testing and improving elivepatch

What i did today?

Today I investigate the missing information on the livepatch that we have when we don't have CONFIG_DEBUG_INFO=y in our kernel configuration. In the case we don't have debug_info in the kernel configuration we usually get missing alt_instr errors from kpatch-build and this is stopping elivepatch from creating a livepatch. This DEBUG_INFO is only needed for making the livepatch and dosen't have to be setted also in production (but not tested it yet)

Kpatch need some special section data for find where to inject the livepatch. This special section data existence is checked by kpatch-build in the given vmlinux file. The vmlinux file need CONFIG_DEBUG_INFO=y for making the debug symbols containing the special section data. This special section data is found like this:

# Set state if name matches
a == 0 && /DW_AT_name.* alt_instr[[:space:]]*$/ {a = 1; next}
b == 0 && /DW_AT_name.* bug_entry[[:space:]]*$/ {b = 1; next}
p == 0 && /DW_AT_name.* paravirt_patch_site[[:space:]]*$/ {p = 1; next}
e == 0 && /DW_AT_name.* exception_table_entry[[:space:]]*$/ {e = 1; next}

DW_AT_NAME is the dwarf attributei (AT) for the name of declaration as it appear in the source program.

<1><3a75de>: Abbrev Number: 118 (DW_TAG_variable)
   <3a75df>   DW_AT_name        : (indirect string, offset: 0x2878c):
__alt_instructions
   <3a75e3>   DW_AT_decl_file   : 1
   <3a75e4>   DW_AT_decl_line   : 271
   <3a75e6>   DW_AT_type        : <0x3a75d3>
   <3a75ea>   DW_AT_external    : 1
   <3a75ea>   DW_AT_declaration : 1
  • decl_file is the file containing the source declaration
  • type is the type of declaration
  • external means that the variable is visible outside of its enclosing compilation unit
  • declaration indicates that this entry represents a non-defining declaration of object

more informations can be found here http://dwarfstd.org/doc/Dwarf3.pdf

After the kpatch-build identify the various name attribute

It will re build the original kernel and the patched kernel

With the use of create-diff-object program, kpatch-build will extract new and modified ELF by using the dwarf data special section

Finally it will create the patch module using create-kpatch-module program and by using dynamic linked objects relocation (dynrelas) symbol sections changes

[continue...]

what i will do next time?

  • testing and improving elivepatch
  • Investigating the missing informations in the livepatch

day 36

What was my plan for today?

  • testing and improving elivepatch

What i did today?

  • Fixed livepatch output name for reflecting the static setted patch name
  • module install removed as not needed
  • debug option is now also copying the build.log to the uuid directory, for investigating failed attempt * refactored uuid_dir with uuid in the case where uuid_dir is not actually containing the full path
  • adding debug_info to the configuration file if not already present however this setting is only needed for the livepatch creation (not tested yet)

what i will do next time?

  • testing and improving elivepatch

day 37


Google Summer of Code day41-51 (November 10, 2017, 05:59 UTC)

Google Summer of Code day 41

What was my plan for today?

  • testing and improving elivepatch

What i did today?

elivepatch work: - working on making the code for sending a unknown number of files with Werkzeug

Making and Testing patch manager

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day 42

What was my plan for today?

  • testing and improving elivepatch

What i did today?

elivepatch work: * Fixing sending multiple file using requests I'm making the script for incremental add patches but I'm stuck on this requests problem.

looks like I cannot concatenate files like this files = {'patch': ('01.patch', patch_01, 'multipart/form-data', {'Expires': '0'}), ('02.patch', patch_01, 'multipart/form-data', {'Expires': '0'}), ('03.patch', patch_01, 'multipart/form-data', {'Expires': '0'}), 'config': ('config', open(temporary_config.name, 'rb'), 'multipart/form-data', {'Expires': '0'})} or like this: files = {'patch': ('01.patch', patch_01, 'multipart/form-data', {'Expires': '0'}), 'patch': ('02.patch', patch_01, 'multipart/form-data', {'Expires': '0'}), 'patch': ('03.patch', patch_01, 'multipart/form-data', {'Expires': '0'}), 'config': ('config', open(temporary_config.name, 'rb'), 'multipart/form-data', {'Expires': '0'})} getting AttributeError: 'tuple' object has no attribute 'read'

looks like requests cannot manage to send data with same key but the server part of flask-restful is using parser.add_argument('patch', action='append') for concatenate more arguments togheter. and it suggest to send the informations like this: curl http://api.example.com -d "patch=bob" -d "patch=sue" -d "patch=joe"

Unfortunatly as now I'm still trying to understand how I can do it with requests.

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day 43

What was my plan for today?

  • making the first draft of incremental patches feature

What i did today?

elivepatch work: * Changed send_file for manage more patches at a time * Refactored functions name for reflecting the incremental patches change * Made function for list patches in the eapply_user portage user patches folder directory and temporary folder patches reflecting the applied livepatches.

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day 44

What was my plan for today?

  • making the first draft of incremental patches feature

What i did today?

elivepatch work: * renamed command client function as internal function * put togheter the client incremental patch sender function

what i will do next time?

  • make the server side incremtal patch part and test it

Google Summer of Code day 45

What was my plan for today?

  • testing and improving elivepatch

What i did today?

Meeting with mentor summary

elivepatch work: - working on incremental patch features design and implementatio - putting patch files under /var/run/elivepatch - ordering patch by numbers - cleaning folder when the machine is restarted - sending the patches to the server in order - cleaning client terminal output by catching the exceptions

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day 46

What was my plan for today?

  • making the first draft of incremental patches feature

What i did today?

elivepatch work: * testing elivepatch with incremental patches * code refactoring

what i will do next time?

  • make the server side incremtal patch part and test it

Google Summer of Code day 47

What was my plan for today?

  • making the first draft of incremental patches feature

What i did today?

elivepatch work: * testing elivepatch with incremental patches * trying to make some way for make testing more fast, I'm thinking something like unit test and integration testing. * merged build_livepatch with send_files as it can reduce the steps for making the livepatch and help making it more isolated. * going on writing the incremental patch

Building with kpatch-build takes usually too much time, this is making testing parts where building live patch is needed, long and complicated. Would probably be better to add some unit/integration testing for keep each task/function under control without needing to build live patches every time. any thoughts?

what i will do next time?

  • make the server side incremental patch part and test it

Google Summer of Code day 48

What was my plan for today?

  • making the first draft of incremental patches feature

What i did today?

elivepatch work: * added PORTAGE_CONFIGROOT for set the path from there get the incremental patches. * cannot download kernel sources to /usr/portage/distfiles/ * saving created livepatch and patch to the client but sending only the incremental patches and patch to the elivepatch server * Cleaned output from bash command

what i will do next time?

  • make the server side incremental patch part and test it

Google Summer of Code day 49

What was my plan for today?

  • testing and improving elivepatch incremental patches feature

What i did today?

1) we need a way for having old genpatches. mpagano made a script for saving all the genpatches and they are saved here: http://dev.gentoo.org/~mpagano/genpatches/tarballs/ I think we can redirect the ebuild on our overlay for get the tarballs from there.

2) kpatch can work with initrd

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day 50

What was my plan for today?

  • testing and improving elivepatch incremental patches feature

What i did today?

  • refactoring code
  • starting writing first draft for the automatical kernel livepatching system

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day 51

What was my plan for today?

  • testing and improving elivepatch incremental patches feature

What i did today?

  • checking eapply_user eapply_user is using patch

  • writing elivepatch wiki page https://wiki.gentoo.org/wiki/User:Aliceinwire/elivepatch

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day31-40 (November 10, 2017, 05:57 UTC)

Google Summer of Code day 31

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Dispatcher.py:

  • fixed comments
  • added static method for return the kernel path
  • Added todo
  • fixed how to make directory with uuid

livepatch.py * fixed docstring * removed sudo on kpatch-build * fixed comments * merged ebuild commands in one on build livepatch

restful.py * fixed comment

what I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 32

What was my plan for today?

  • testing and improving elivepatch

What I did today?

client checkers: * used os.path.join for ungz function * implemented temporary folder using python tempfile

what I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 33

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Working with tempfile for keeping the uncompressed configuration file using the appropriate tempfile module.
  • Refactoring and code cleaning.
  • check if the ebuild is present in the overlay before trying to merge it.

what I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 34

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • lpatch added locally
  • fix ebuild directory
  • Static patch and config filename on send This is useful for isolating the API requests using only the uuid as session identifier
  • Removed livepatchStatus and lpatch class configurations. Because we need a request to only be identified by is own UUID, for isolating the transaction.
  • return in case the request with the same uuid is already present.

we still have that problem about "can'\''t find special struct alt_instr size."

I will investigate it tomorrow

what I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 35

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Kpatch needs some special section data for finding where to inject the livepatch. This special section data existence is checked by kpatch-build in the given vmlinux file. The vmlinux file need CONFIG_DEBUG_INFO=y for making the debug symbols containing the special section data. This special section data is found like this:

# Set state if name matches
a == 0 && /DW_AT_name.* alt_instr[[:space:]]*$/ {a = 1; next}
b == 0 && /DW_AT_name.* bug_entry[[:space:]]*$/ {b = 1; next}
p == 0 && /DW_AT_name.* paravirt_patch_site[[:space:]]*$/ {p = 1; next}
e == 0 && /DW_AT_name.* exception_table_entry[[:space:]]*$/ {e = 1; next}

what I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 36

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Fixed livepatch output name for reflecting the static set patch name
  • module install removed as not needed
  • debug option is now also copying the build.log to the uuid directory, for investigating failed attempt
  • refactored uuid_dir with uuid in the case where uuid_dir is not actually containing the full path
  • adding debug_info to the configuration file if not already present however, this setting is only needed for the livepatch creation (not tested yet)

what I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 37

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Fix return message for cve option
  • Added different configuration example [4.9.30,4.10.17]
  • Catch Gentoo-sources not available error
  • Catch missing livepatch errors on the client

Tested kpatch with patch for kernel 4.9.30 and 4.10.17 and it worked without any problem. I checked with coverage for see which code is not used. I think we can maybe remove cve option for now as not implemented yet and we could use a modular implementation of for it, so the configuration could change. We need some documentation about elivepatch on the Gentoo wiki. We need some unit tests for making development more smooth and making it more simple to check the working status with GitHub Travis.

I talked with kpatch creator and we got some feedback:

“this project could also be used for kpatch testing :)
imagine instead of just loading the .ko, the client was to kick off a series of tests and report back.”

“why bother a production or tiny machine when you might have a patch-building server”

what I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 38

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Meeting with mentor summary

What we will do next: - incremental patch tracking on the client side - CVE security vulnerability checker - dividing the repository - ebuild - documentation [optional] - modularity [optional]

Kpatch work: - Started to make the incremental patch feature - Tested kpatch for permission issue

what I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 39

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Meeting with mentor summary

elivepatch work: - working on incremental patch features design and implementation - putting patch files under /var/run/elivepatch - ordering patch by numbers - cleaning folder when the machine is restarted - sending the patches to the server in order - cleaning client terminal output by catching the exceptions

what I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 40

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Meeting with mentor

summary of elivepatch work: - working with incremental patch manager - cleaning client terminal output by catching the exceptions

Making and Testing patch manager

what I will do next time?

  • testing and improving elivepatch

November 09, 2017
Alice Ferrazzi a.k.a. alicef (homepage, bugs)
Google Summer of Code day26-30 (November 09, 2017, 11:14 UTC)

Google Summer of Code day 26

What was my plan for today?

  • testing and improving elivepatch

What i did today?

After discussion with my mentor. I need: * Availability and replicability of downloading kernel sources. [needed] * Represent same kernel sources as the client kernel source (use flags for now and in the future user added patches) [needed] * Support multiple request at the same time. [needed] * Modularity for adding VM machine or container support. (in the future)

Create overlay for keeping old gentoo-sources ebuild where we will keep retrocompatibility for old kernels: https://github.com/aliceinwire/gentoo-sources_overlay

Made function for download and get old kernel sources and install sources under the designated temporary folder.

what i will do next time?

  • Going on improving elivepatch

Google Summer of Code day 27

What was my plan for today?

  • testing and improving elivepatch

What i did today?

  • Fixed git clone of the gentoo-sources overlay under a temporary directory
  • Fixed download directories
  • Tested elivepatch

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day 28

What was my plan for today?

  • testing and improving elivepatch

What i did today?

  • Testing elivepatch with multi threading
  • Removed check for uname kernel version as is getting the kernel version directly from the kernel configuration file header.
  • Starting kpatch-build under the output folder Because kpatch-build is making the livepatch under the $PWD folder we are starting it under the uuid tmp folder and we are getting the livepatch from the uuid folder. this is usefull for dealing with multi-threading
  • Added some helper function for code clarity [not finished yet]
  • Refactored for code clarity [not finished yet]

For making elivepatch multithread we can simply change app.run() of flask with app.run(threaded=True).
This will make flask spawn thread for each request (using class SocketServer.ThreadingMixIn and baseWSGIserver)1, but also if is working pretty nice, the suggested way of threading is probably using gunicorn or uWSGI.2
maybe like using flask-gunicorn: https://github.com/doobeh/flask-gunicorn

what i will do next time?

  • testing and improving elivepatch

[1] https://docs.python.org/2/library/socketserver.html#SocketServer.ThreadingMixIn [2] http://flask.pocoo.org/docs/0.12/deploying/

Google Summer of Code day 29

What was my plan for today?

  • testing and improving elivepatch

What i did today?

  • Added some helper function for code clarity [not finished yet]
  • Refactored for code clarity [not finished yet]
  • Sended pull request to kpatch for adding dynamic output folder option https://github.com/dynup/kpatch/pull/718
  • Sended pull request to kpatch for small style fix https://github.com/dynup/kpatch/pull/719

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day 30

What was my plan for today?

  • testing and improving elivepatch

What i did today?

  • fixing uuid design

moved the uuid generating to the client and read rfc 4122

that states "A UUID is 128 bits long, and requires no central registration process."

  • made regex function for checking the format of uuid is actually correct.

what i will do next time?

  • testing and improving elivepatch

Google Summer of Code day21-25 (November 09, 2017, 10:42 UTC)

Google Summer of Code day 21

What was my plan for today?

  • Testing elivepatch
  • Getting the kernel version dynamically
  • Updating kpatch-build for work with Gentoo better

What I did today?

Fixing the problems pointed out by my mentor.

  • Fixed software copyright header
  • Fixed popen call
  • Used regex for checking file extension
  • Added debug option for the server
  • Changed reserved word shadowing variables
  • used os.path.join for link path
  • dynamical pass kernel version

what I will do next time?
* Testing elivepatch * work on making elivepatch support multiple calls

Google Summer of Code day 22

What was my plan for today?

  • Testing elivepatch
  • work on making elivepatch support multiple calls

What I did today?

  • working on sending the kernel version
  • working on dividing users by UUID (Universally unique identifier)

With the first connection, because the UUID is not generated yet, it will be created and returned by the elivepatch server. The client will store the UUID in shelve so that it can authenticate next time is inquiring the server. By using python UUID I'm assigning different UUID for each user. By using UUID we can start supporting multiple calls.

what I will do next time?
* Testing elivepatch * Cleaning code

Google Summer of Code day 23

What was my plan for today?

  • Testing elivepatch
  • Cleaning code

What I did today?

  • commented some code part
  • First draft of UUID working

elivepatch server need to generate a UUID for a client connection, and assign the UUID to each client, this is needed for managing multiple request. The livepatch and configs files will be generated in different folders for each client request and returned using the UUID. Made a working draft.

what I will do next time?
* Cleaning code * testing it * Go on with programming and starting implementing the CVE

Google Summer of Code day 24

What was my plan for today?

  • Cleaning code
  • testing it
  • Go on with programming and starting implementing the CVE

What I did today?

  • Working with getting the Linux kernel version from configuration file
  • Working with parsing CVE repository

I could implement the Linux kernel version from the configuration file, and I'm working on parsing the CVE repository. Would also be a nice idea to work with the kpatch-build script for making it Gentoo compatible. But I also got into one problem, that is we need to find a way to download old Gentoo-sources ebuild for building an old kernel, where the server there isn't the needed version of kernel sources. This depends on how much back compatibility we want to give. And with old Gentoo-sources, we cannot assure that is working. Because old Gentoo-sources was using different versions of Gentoo repository, eclass. So is a problem to discuss in the next days.

what I will do next time?

  • work on the CVE repository
  • testing it

Google Summer of Code day 25

What was my plan for today?

  • Cleaning code
  • testing and improving elivepatch

What I did today?

As discussed with my mentor I worked on unifying the patch and configuration file RESTful API call. And made the function to standardize the subprocess commands.

what I will do next time?

  • testing it and improving elivepatch

November 02, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Gentoo Linux on DELL XPS 13 9365 and 9360 (November 02, 2017, 17:33 UTC)

Since I received some positive feedback about my previous DELL XPS 9350 post, I am writing this summary about my recent experience in installing Gentoo Linux on a DELL XPS 13 9365.

This installation notes goals:

  • UEFI boot using Grub
  • Provide you with a complete and working kernel configuration
  • Encrypted disk root partition using LUKS
  • Be able to type your LUKS passphrase to decrypt your partition using your local keyboard layout

Grub & UEFI & Luks installation

This installation is a fully UEFI one using grub and booting an encrypted root partition (and home partition). I was happy to see that since my previous post, everything got smoother. So even if you can have this installation working using MBR, I don’t really see a point in avoiding UEFI now.

BIOS configuration

Just like with its ancestor, you should:

  • Turn off Secure Boot
  • Set SATA Operation to AHCI

Live CD

Once again, go for the latest SystemRescueCD (it’s Gentoo based, you won’t be lost) as it’s quite more up to date and supports booting on UEFI. Make it a Live USB for example using unetbootin and the ISO on a vfat formatted USB stick.

NVME SSD disk partitioning

We’ll obviously use GPT with UEFI. I found that using gdisk was the easiest. The disk itself is found on /dev/nvme0n1. Here it is the partition table I used :

  • 10Mo UEFI BIOS partition (type EF02)
  • 500Mo UEFI boot partition (type EF00)
  • 2Go Swap partition
  • 475Go Linux root partition

The corresponding gdisk commands :

# gdisk /dev/nvme0n1

Command: o ↵
This option deletes all partitions and creates a new protective MBR.
Proceed? (Y/N): y ↵

Command: n ↵
Partition Number: 1 ↵
First sector: ↵
Last sector: +10M ↵
Hex Code: EF02 ↵

Command: n ↵
Partition Number: 2 ↵
First sector: ↵
Last sector: +500M ↵
Hex Code: EF00 ↵

Command: n ↵
Partition Number: 3 ↵
First sector: ↵
Last sector: +2G ↵
Hex Code: 8200 ↵

Command: n ↵
Partition Number: 4 ↵
First sector: ↵
Last sector: ↵ (for rest of disk)
Hex Code: ↵

Command: p ↵
Disk /dev/nvme0n1: 1000215216 sectors, 476.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): A73970B7-FF37-4BA7-92BE-2EADE6DDB66E
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1000215182
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048           22527   10.0 MiB    EF02  BIOS boot partition
   2           22528         1046527   500.0 MiB   EF00  EFI System
   3         1046528         5240831   2.0 GiB     8200  Linux swap
   4         5240832      1000215182   474.4 GiB   8300  Linux filesystem

Command: w ↵
Do you want to proceed? (Y/N): Y ↵

No WiFi on Live CD ? no panic

Once again on my (old?) SystemRescueCD stick, the integrated Intel 8265/8275 wifi card is not detected.

So I used my old trick with my Android phone connected to my local WiFi as a USB modem which was detected directly by the live CD.

  • get your Android phone connected on your local WiFi (unless you want to use your cellular data)
  • plug in your phone using USB to your XPS
  • on your phone, go to Settings / More / Tethering & portable hotspot
  • enable USB tethering

Running ip addr will show the network card enp0s20f0u2 (for me at least), then if no IP address is set on the card, just ask for one :

# dhcpcd enp0s20f0u2

You have now access to the internet.

Proceed with installation

The only thing to prepare is to format the UEFI boot partition as FAT32. Do not worry about the UEFI BIOS partition (/dev/nvme0n1p1), grub will take care of it later.

# mkfs.vfat -F 32 /dev/nvme0n1p2

Do not forget to use cryptsetup to encrypt your /dev/nvme0n1p4 partition! In the rest of the article, I’ll be using its device mapper representation.

# cryptsetup luksFormat -s 512 /dev/nvme0n1p4
# cryptsetup luksOpen /dev/nvme0n1p4 root
# mkfs.ext4 /dev/mapper/root

Then follow the Gentoo handbook as usual for the stage3 related next steps. Make sure you mount and bind the following to your /mnt/gentoo LiveCD installation folder (the /sys binding is important for grub UEFI):

# mount -t proc none /mnt/gentoo/proc
# mount -o bind /dev /mnt/gentoo/dev
# mount -o bind /sys /mnt/gentoo/sys

make.conf settings

I strongly recommend using at least the following on your /etc/portage/make.conf :

GRUB_PLATFORM="efi-64"
INPUT_DEVICES="evdev synaptics"
VIDEO_CARDS="intel i965"

USE="bindist cryptsetup"

The GRUB_PLATFORM one is important for later grub setup and the cryptsetup USE flag will help you along the way.

fstab for SSD

Don’t forget to make sure the noatime option is used on your fstab for / and /home.

/dev/nvme0n1p2    /boot    vfat    noauto,noatime    1 2
/dev/nvme0n1p3    none     swap    sw                0 0
/dev/mapper/root  /        ext4    noatime   0 1

Kernel configuration and compilation

I suggest you use a recent >=sys-kernel/gentoo-sources-4.13.0 along with genkernel.

  • You can download my kernel configuration file (iptables, docker, luks & stuff included)
  • Put the kernel configuration file into the /etc/kernels/ directory (with a training s)
  • Rename the configuration file with the exact version of your kernel

Then you’ll need to configure genkernel to add luks support, firmware files support and keymap support if your keyboard layout is not QWERTY.

In your /etc/genkernel.conf, change the following options:

LUKS="yes"
FIRMWARE="yes"
KEYMAP="1"

Then run genkernel all to build your kernel and luks+firmware+keymap aware initramfs.

Grub UEFI bootloader with LUKS and custom keymap support

Now it’s time for the grub magic to happen so you can boot your wonderful Gentoo installation using UEFI and type your password using your favourite keyboard layout.

  • make sure your boot vfat partition is mounted on /boot
  • edit your /etc/default/grub configuration file with the following:
GRUB_CMDLINE_LINUX="crypt_root=/dev/nvme0n1p4 keymap=fr"

This will allow your initramfs to know it has to read the encrypted root partition from the given partition and to prompt for its password in the given keyboard layout (french here).

Now let’s install the grub UEFI boot files and setup the UEFI BIOS partition.

# grub-install --efi-directory=/boot --target=x86_64-efi /dev/nvme0n1
Installing for x86_64-efi platform.
Installation finished. No error reported

It should report no error, then we can generate the grub boot config:

# grub-mkconfig -o /boot/grub/grub.cfg

You’re all set!

You will get a gentoo UEFI boot option, you can disable the Microsoft Windows one from your BIOS to get straight to the point.

Hope this helped!

October 16, 2017
Greg KH a.k.a. gregkh (homepage, bugs)
Linux Kernel Community Enforcement Statement FAQ (October 16, 2017, 09:05 UTC)

Based on the recent Linux Kernel Community Enforcement Statement and the article describing the background and what it means , here are some Questions/Answers to help clear things up. These are based on questions that came up when the statement was discussed among the initial round of over 200 different kernel developers.

Q: Is this changing the license of the kernel?

A: No.

Q: Seriously? It really looks like a change to the license.

A: No, the license of the kernel is still GPLv2, as before. The kernel developers are providing certain additional promises that they encourage users and adopters to rely on. And by having a specific acking process it is clear that those who ack are making commitments personally (and perhaps, if authorized, on behalf of the companies that employ them). There is nothing that says those commitments are somehow binding on anyone else. This is exactly what we have done in the past when some but not all kernel developers signed off on the driver statement.

Q: Ok, but why have this “additional permissions” document?

A: In order to help address problems caused by current and potential future copyright “trolls” aka monetizers.

Q: Ok, but how will this help address the “troll” problem?

A: “Copyright trolls” use the GPL-2.0’s immediate termination and the threat of an immediate injunction to turn an alleged compliance concern into a contract claim that gives the troll an automatic claim for money damages. The article by Heather Meeker describes this quite well, please refer to that for more details. If even a short delay is inserted for coming into compliance, that delay disrupts this expedited legal process.

By simply saying, “We think you should have 30 days to come into compliance”, we undermine that “immediacy” which supports the request to the court for an immediate injunction. The threat of an immediate junction was used to get the companies to sign contracts. Then the troll goes back after the same company for another known violation shortly after and claims they’re owed the financial penalty for breaking the contract. Signing contracts to pay damages to financially enrich one individual is completely at odds with our community’s enforcement goals.

We are showing that the community is not out for financial gain when it comes to license issues – though we do care about the company coming into compliance.  All we want is the modifications to our code to be released back to the public, and for the developers who created that code to become part of our community so that we can continue to create the best software that works well for everyone.

This is all still entirely focused on bringing the users into compliance. The 30 days can be used productively to determine exactly what is wrong, and how to resolve it.

Q: Ok, but why are we referencing GPL-3.0?

A: By using the terms from the GPLv3 for this, we use a very well-vetted and understood procedure for granting the opportunity to come fix the failure and come into compliance. We benefit from many months of work to reach agreement on a termination provision that worked in legal systems all around the world and was entirely consistent with Free Software principles.

Q: But what is the point of the “non-defensive assertion of rights” disclaimer?

A: If a copyright holder is attacked, we don’t want or need to require that copyright holder to give the party suing them an opportunity to cure. The “non-defensive assertion of rights” is just a way to leave everything unchanged for a copyright holder that gets sued.  This is no different a position than what they had before this statement.

Q: So you are ok with using Linux as a defensive copyright method?

A: There is a current copyright troll problem that is undermining confidence in our community – where a “bad actor” is attacking companies in a way to achieve personal gain. We are addressing that issue. No one has asked us to make changes to address other litigation.

Q: Ok, this document sounds like it was written by a bunch of big companies, who is behind the drafting of it and how did it all happen?

A: Grant Likely, the chairman at the time of the Linux Foundation’s Technical Advisory Board (TAB), wrote the first draft of this document when the first copyright troll issue happened a few years ago. He did this as numerous companies and developers approached the TAB asking that the Linux kernel community do something about this new attack on our community. He showed the document to a lot of kernel developers and a few company representatives in order to get feedback on how it should be worded. After the troll seemed to go away, this work got put on the back-burner. When the copyright troll showed back up, along with a few other “copycat” like individuals, the work on the document was started back up by Chris Mason, the current chairman of the TAB. He worked with the TAB members, other kernel developers, lawyers who have been trying to defend these claims in Germany, and the TAB members’ Linux Foundation’s lawyers, in order to rework the document so that it would actually achieve the intended benefits and be useful in stopping these new attacks. The document was then reviewed and revised with input from Linus Torvalds and finally a document that the TAB agreed would be sufficient was finished. That document was then sent to over 200 of the most active kernel developers for the past year by Greg Kroah-Hartman to see if they, or their company, wished to support the document. That produced the initial “signatures” on the document, and the acks of the patch that added it to the Linux kernel source tree.

Q: How do I add my name to the document?

A: If you are a developer of the Linux kernel, simply send Greg a patch adding your name to the proper location in the document (sorting the names by last name), and he will be glad to accept it.

Q: How can my company show its support of this document?

A: If you are a developer working for a company that wishes to show that they also agree with this document, have the developer put the company name in ‘(’ ‘)’ after the developer’s name. This shows that both the developer, and the company behind the developer are in agreement with this statement.

Q: How can a company or individual that is not part of the Linux kernel community show its support of the document?

A: Become part of our community! Send us patches, surely there is something that you want to see changed in the kernel. If not, wonderful, post something on your company web site, or personal blog in support of this statement, we don’t mind that at all.

Q: I’ve been approached by a copyright troll for Netfilter. What should I do?

A: Please see the Netfilter FAQ here for how to handle this

Q: I have another question, how do I ask it?

A: Email Greg or the TAB, and they will be glad to help answer them.

Linux Kernel Community Enforcement Statement (October 16, 2017, 09:00 UTC)

By Greg Kroah-Hartman, Chris Mason, Rik van Riel, Shuah Khan, and Grant Likely

The Linux kernel ecosystem of developers, companies and users has been wildly successful by any measure over the last couple decades. Even today, 26 years after the initial creation of the Linux kernel, the kernel developer community continues to grow, with more than 500 different companies and over 4,000 different developers getting changes merged into the tree during the past year. As Greg always says every year, the kernel continues to change faster this year than the last, this year we were running around 8.5 changes an hour, with 10,000 lines of code added, 2,000 modified, and 2,500 lines removed every hour of every day.

The stunning growth and widespread adoption of Linux, however, also requires ever evolving methods of achieving compliance with the terms of our community’s chosen license, the GPL-2.0. At this point, there is no lack of clarity on the base compliance expectations of our community. Our goals as an ecosystem are to make sure new participants are made aware of those expectations and the materials available to assist them, and to help them grow into our community.  Some of us spend a lot of time traveling to different companies all around the world doing this, and lots of other people and groups have been working tirelessly to create practical guides for everyone to learn how to use Linux in a way that is compliant with the license. Some of these activities include:

Unfortunately the same processes that we use to assure fulfillment of license obligations and availability of source code can also be used unjustly in trolling activities to extract personal monetary rewards. In particular, issues have arisen as a developer from the Netfilter community, Patrick McHardy, has sought to enforce his copyright claims in secret and for large sums of money by threatening or engaging in litigation. Some of his compliance claims are issues that should and could easily be resolved. However, he has also made claims based on ambiguities in the GPL-2.0 that no one in our community has ever considered part of compliance.  

Examples of these claims have been distributing over-the-air firmware, requiring a cell phone maker to deliver a paper copy of source code offer letter; claiming the source code server must be setup with a download speed as fast as the binary server based on the “equivalent access” language of Section 3; requiring the GPL-2.0 to be delivered in a local language; and many others.

How he goes about this activity was recently documented very well by Heather Meeker.

Numerous active contributors to the kernel community have tried to reach out to Patrick to have a discussion about his activities, to no response. Further, the Netfilter community suspended Patrick from contributing for violations of their principles of enforcement. The Netfilter community also published their own FAQ on this matter.

While the kernel community has always supported enforcement efforts to bring companies into compliance, we have never even considered enforcement for the purpose of extracting monetary gain.  It is not possible to know an exact figure due to the secrecy of Patrick’s actions, but we are aware of activity that has resulted in payments of at least a few million Euros.  We are also aware that these actions, which have continued for at least four years, have threatened the confidence in our ecosystem.

Because of this, and to help clarify what the majority of Linux kernel community members feel is the correct way to enforce our license, the Technical Advisory Board of the Linux Foundation has worked together with lawyers in our community, individual developers, and many companies that participate in the development of, and rely on Linux, to draft a Kernel Enforcement Statement to help address both this specific issue we are facing today, and to help prevent any future issues like this from happening again.

A key goal of all enforcement of the GPL-2.0 license has and continues to be bringing companies into compliance with the terms of the license. The Kernel Enforcement Statement is designed to do just that.  It adopts the same termination provisions we are all familiar with from GPL-3.0 as an Additional Permission giving companies confidence that they will have time to come into compliance if a failure is identified. Their ability to rely on this Additional Permission will hopefully re-establish user confidence and help direct enforcement activity back to the original purpose we have all sought over the years – actual compliance.  

Kernel developers in our ecosystem may put their own acknowledgement to the Statement by sending a patch to Greg adding their name to the Statement, like any other kernel patch submission, and it will be gladly merged. Those authorized to ‘ack’ on behalf of their company may add their company name in (parenthesis) after their name as well.

Note, a number of questions did come up when this was discussed with the kernel developer community. Please see Greg’s FAQ post answering the most common ones if you have further questions about this topic.

October 13, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Gentoo Linux listed RethinkDB’s website (October 13, 2017, 08:22 UTC)

 

The rethinkdb‘s website has (finally) been updated and Gentoo Linux is now listed on the installation page!

Meanwhile, we have bumped the ebuild to version 2.3.6 with fixes for building on gcc-6 thanks to Peter Levine who kindly proposed a nice PR on github.

September 27, 2017
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Important!

This is a workaround for a FreeType/fontconfig problem, but my be applicable in other cases as well. For Gentoo users, the related bug is 631502.

Recently, after updating to Mozilla Firefox to version 52 or later (55.0.2, in my case), and Mozilla Thunderbird to version 52 or later (52.3.0, in my case), I found that fonts were rendering horribly under Linux. It looked essentially like there was no anti-aliasing or hinting at all.

Come to find out, this was due to a change in the content rendering engine, which is briefly mentioned in the release notes for Firefox 52 (but it also applies to Thunderbird). Basically, in Linux, the default engine changed from cairo to Google’s Skia.

Ugly fonts in Firefox and Thunderbird under Linux - skia and cairo

For each application, the easiest method for getting the fonts to render nicely again is to make two changes directly in the configuration editor. To do so in Firefox, simply go to the address bar and type about:config. Within Thunderbird, it can be launched by going to Menu > Preferences > Advanced > Config Editor. Once there, the two keys that need to change are:

gfx.canvas.azure.backends
gfx.content.azure.backends

They likely have values of “skia” or a comma-separated list with “skia” being the first value. On my Linux hosts, I changed the value from skia back to cairo, restarted the applications, and all was again right in the world (or at least in the Mozilla font world 😛 ).

Hope that helps.

Cheers,
Zach

September 26, 2017
Sven Vermeulen a.k.a. swift (homepage, bugs)
SELinux Userspace 2.7 (September 26, 2017, 12:50 UTC)

A few days ago, Jason "perfinion" Zaman stabilized the 2.7 SELinux userspace on Gentoo. This release has quite a few new features, which I'll cover in later posts, but for distribution packagers the main change is that the userspace now has many more components to package. The project has split up the policycoreutils package in separate packages so that deployments can be made more specific.

Let's take a look at all the various userspace packages again, learn what their purpose is, so that you can decide if they're needed or not on a system. Also, when I cover the contents of a package, be aware that it is based on the deployment on my system, which might or might not be a complete installation (as with Gentoo, different USE flags can trigger different package deployments).

September 11, 2017
Sven Vermeulen a.k.a. swift (homepage, bugs)
Authenticating with U2F (September 11, 2017, 16:25 UTC)

In order to further secure access to my workstation, after the switch to Gentoo sources, I now enabled two-factor authentication through my Yubico U2F USB device. Well, at least for local access - remote access through SSH requires both userid/password as well as the correct SSH key, by chaining authentication methods in OpenSSH.

Enabling U2F on (Gentoo) Linux is fairly easy. The various guides online which talk about the pam_u2f setup are indeed correct that it is fairly simple. For completeness sake, I've documented what I know on the Gentoo Wiki, as the pam_u2f article.

September 06, 2017
Greg KH a.k.a. gregkh (homepage, bugs)
4.14 == This years LTS kernel (September 06, 2017, 14:41 UTC)

As the 4.13 release has now happened, the merge window for the 4.14 kernel release is now open. I mentioned this many weeks ago, but as the word doesn’t seem to have gotten very far based on various emails I’ve had recently, I figured I need to say it here as well.

So, here it is officially, 4.14 should be the next LTS kernel that I’ll be supporting with stable kernel patch backports for at least two years, unless it really is a horrid release and has major problems. If so, I reserve the right to pick a different kernel, but odds are, given just how well our development cycle has been going, that shouldn’t be a problem (although I guess I just doomed it now…)

As always, if people have questions about this, email me and I will be glad to discuss it, or talk to me in person next week at the LinuxCon^WOpenSourceSummit or Plumbers conference in Los Angeles, or at any of the other conferences I’ll be at this year (ELCE, Kernel Recipes, etc.)

September 04, 2017
Arun Raghavan a.k.a. ford_prefect (homepage, bugs)
A Late GUADEC 2017 Post (September 04, 2017, 03:20 UTC)

It’s been a little over a month since I got back from Manchester, and this post should’ve come out earlier but I’ve been swamped.

The conference was absolutely lovely, the organisation was a 110% on point (serious kudos, I know first hand how hard that is). Others on Planet GNOME have written extensively about the talks, the social events, and everything in between that made it a great experience. What I would like to write about is about why this year’s GUADEC was special to me.

GNOME turning 20 years old is obviously a large milestone, and one of the main reasons I wanted to make sure I was at Manchester this year. There were many occasions to take stock of how far we had come, where we are, and most importantly, to reaffirm who we are, and why we do what we do.

And all of this made me think of my own history with GNOME. In 2002/2003, Nat and Miguel came down to Bangalore to talk about some of the work they were doing. I know I wasn’t the only one who found their energy infectious, and at Linux Bangalore 2003, they got on stage, just sat down, and started hacking up a GtkMozEmbed-based browser. The idea itself was fun, but what I took away — and I know I wasn’t the only one — is the sheer inclusive joy they shared in creating something and sharing that with their audience.

For all of us working on GNOME in whatever way we choose to contribute, there is the immediate gratification of shaping this project, as well as the larger ideological underpinning of making everyone’s experience talking to their computers better and free-er.

But I think it is also important to remember that all our efforts to make our community an inviting and inclusive space have a deep impact across the world. So much so that complete strangers from around the world are able to feel a sense of belonging to something much larger than themselves.

I am excited about everything we will achieve in the next 20 years.

(thanks go out to the GNOME Foundation for helping me attend GUADEC this year)

Sponsored by GNOME!

August 26, 2017
Sebastian Pipping a.k.a. sping (homepage, bugs)
GIMP 2.9.6 now in Gentoo (August 26, 2017, 19:53 UTC)

Here’s what upstream has to say about the new release 2.9.6. Enjoy 🙂

August 23, 2017
Sven Vermeulen a.k.a. swift (homepage, bugs)
Using nVidia with SELinux (August 23, 2017, 17:04 UTC)

Yesterday I've switched to the gentoo-sources kernel package on Gentoo Linux. And with that, I also attempted (succesfully) to use the propriatary nvidia drivers so that I can enjoy both a smoother 3D experience while playing minecraft, as well as use the CUDA support so I don't need to use cloud-based services for small exercises.

The move to nvidia was quite simple, as the nvidia-drivers wiki article on the Gentoo wiki was quite easy to follow.

August 22, 2017
Sven Vermeulen a.k.a. swift (homepage, bugs)
Switch to Gentoo sources (August 22, 2017, 17:04 UTC)

You've might already read it on the Gentoo news site, the Hardened Linux kernel sources are removed from the tree due to the grsecurity change where the grsecurity Linux kernel patches are no longer provided for free. The decision was made due to supportability and maintainability reasons.

That doesn't mean that users who want to stick with the grsecurity related hardening features are left alone. Agostino Sarubbo has started providing sys-kernel/grsecurity-sources for the users who want to stick with it, as it is based on minipli's unofficial patchset. I seriously hope that the patchset will continue to be maintained and, who knows, even evolve further.

Personally though, I'm switching to the Gentoo sources, and stick with SELinux as one of the protection measures. And with that, I might even start using my NVidia graphics card a bit more, as that one hasn't been touched in several years (I have an Optimus-capable setup with both an Intel integrated graphics card and an NVidia one, but all attempts to use nouveau for the one game I like to play - minecraft - didn't work out that well).

Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.6 (August 22, 2017, 06:00 UTC)

After four months of cool contributions and hard work on normalization and modules’ clean up, I’m glad to announce the release of py3status v3.6!

Milestone 3.6 was mainly focused about existing modules, from their documentation to their usage of the py3 helper to streamline their code base.

Other improvements were made about error reporting while some sneaky bugs got fixed along the way.

Highlights

Not an extensive list, check the changelog.

  • LOTS of modules streamlining (mainly the hard work of @lasers)
  • error reporting improvements
  • py3-cmd performance improvements

New modules

  • i3blocks support (yes, py3status can now wrap i3blocks thanks to @tobes)
  • cmus module: to control your cmus music player, by @lasers
  • coin_market module: to display custom cryptocurrency data, by @lasers
  • moc module: to control your moc music player, by @lasers

Milestone 3.7

This milestone will give a serious kick into py3status performance. We’ll do lots of profiling and drastic work to reduce py3status CPU and memory footprints!

For now we’ve been relying a lot on threads, which is simple to operate but not that CPU/memory friendly. Since i3wm users rightly care for their efficiency we think it’s about time we address this kind of points in py3status.

Stay tuned, we have some nice ideas in stock 🙂

Thanks contributors!

This release is their work, thanks a lot guys!

  • aethelz
  • alexoneill
  • armandg
  • Cypher1
  • docwalter
  • enguerrand
  • fmorgner
  • guiniol
  • lasers
  • markrileybot
  • maximbaz
  • tablet-mode
  • paradoxisme
  • ritze
  • rixx
  • tobes
  • valdur55
  • vvoland
  • yabbes

August 21, 2017
sys-kernel/grsecurity-sources available! (August 21, 2017, 15:11 UTC)

Is known that the grsecurity project since few weeks made available the grsecurity patches only for their customers. In the meantime some people made their fork of the latest publicly available patches.

At Gentoo, for some reasons (which I respect) explained by the news item and on the mailing lists, the maintainer decided to drop the hardened-sources package at the end of September 2017

Then, I decided to make my own ebuild that uses the Genpatches plus the Unofficial forward ports of the last publicly available grsecurity patch.

Before you wondering about the code of the ebuild, let me explain the logic used:

1) The ebuild was done in this way because the version bump should result in a copy-paste on the ebuild side.
2) I don’t use the GENPATCHES variable from the kernel eclass because of the previously explained point 1.
3) I generate the tarball via a bash script which takes the genpatches, take the unofficial-grsecurity-patches and deletes the unwanted patches from the genpatches tarball (i.e. in hardened-sources we had UNIPATCH_EXCLUDE=”1500_XATTR_USER_PREFIX.patch 2900_dev-root-proc-mount-fix.patch”).
4) I don’t use the UNIPATCH_EXCLUDE variable because because of the previously explained point 3.

Don’t expect a version bump on each minor release unless there are critical bugs and/or dangerous security bugs. So please not file version bump requests on bugzilla.

If you have any issue regarding grsecurity itself, please file a bug on the github issue tracker and if you will mention the issue elsewhere, please specify that the issue is with the unofficial grsecurity port. This will avoid to “damage” the grsecurity image/credibility.

The ebuild is available into my overlay
If you have trouble on how to install that ebuild, please follow the layman article on our wiki, basically you need:

root ~ $ layman -S && layman -a ago

USE IT AT YOUR OWN RISK 😉

August 19, 2017
Hardened Linux kernel sources removal (August 19, 2017, 00:00 UTC)

As you may know the core of sys-kernel/hardened-sources has been the grsecurity patches. Recently the grsecurity developers have decided to limit access to these patches. As a result, the Gentoo Hardened team is unable to ensure a regular patching schedule and therefore the security of the users of these kernel sources. Thus, we will be masking hardened-sources on the 27th of August and will proceed to remove them from the main ebuild repository by the end of September. We recommend to use sys-kernel/gentoo-sources instead. Userspace hardening and support for SELinux will of course remain in the Gentoo ebuild repository. Please see the full news item for additional information and links.

August 18, 2017

FroSCon logo

Upcoming weekend, 19-20th August 2017, there will be a Gentoo booth again at the FrOSCon “Free and Open Source Conference” 12, in St. Augustin near Bonn! Visitors can see Gentoo live in action, get Gentoo swag, and prepare, configure, and compile their own Gentoo buttons. See you there!

August 08, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
ScyllaDB meets Gentoo Linux (August 08, 2017, 14:19 UTC)

I am happy to announce that my work on packaging ScyllaDB for Gentoo Linux is complete!

Happy or curious users are very welcome to share their thoughts and ping me to get it into portage (which will very likely happen).

Why Scylla?

Ever heard of the Cassandra NoSQL database and Java GC/Heap space problems?… if you do, you already get it 😉

I will not go into the details as their website does this way better than me but I got interested into Scylla because it fits the Gentoo Linux philosophy very well. If you remember my writing about packaging Rethinkdb for Gentoo Linux, I think that we have a great match with Scylla as well!

  • it is written in C++ so it plays very well with emerge
  • the code quality is so great that building it does not require heavy patching on the ebuild (feels good to be a packager)
  • the code relies on system libs instead of bundling them in the sources (hurrah!)
  • performance tuning is handled by smart scripting and automation, allowing the relationship between the project and the hardware is strong

I believe that these are good enough points to go further and that such a project can benefit from a source based distribution like Gentoo Linux. Of course compiling on multiple systems is a challenge for such a database but one does not improve by staying in their comfort zone.

Upstream & contributions

Packaging is a great excuse to get to know the source code of a project but more importantly the people behind it.

So here I got to my first contributions to Scylla to get Gentoo Linux as a detected and supported Linux distribution in the different scripts and tools used to automatically setup the machine it will run upon (fear not, I contributed bash & python, not C++)…

Even if I expected to contribute using Github PRs and got to change my habits to a git-patch+mailing list combo, I got warmly welcomed and received positive and genuine interest in the contributions. They got merged quickly and thanks to them you can install and experience Scylla in Gentoo Linux without heavy patching on our side.

Special shout out to Pekka, Avi and Vlad for their welcoming and insightful code reviews!

I’ve some open contributions about pushing further on the python code QA side to get the tools to a higher level of coding standards. Seeing how upstream is serious about this I have faith that it will get merged and a good base for other contributions.

Last note about reaching them is that I am a bit sad that they’re not using IRC freenode to communicate (I instinctively joined #scylla and found myself alone) but they’re on Slack (those “modern folks”) and pretty responsive to the mailing lists 😉

Java & Scylla

Even if scylla is a rewrite of Cassandra in C++, the project still relies on some external tools used by the Cassandra community which are written in Java.

When you install the scylla package on Gentoo, you will see that those two packages are Java based dependencies:

  • app-admin/scylla-tools
  • app-admin/scylla-jmx

It pained me a lot to package those (thanks to help of @monsieurp) but they are building and working as expected so this gets the packaging of the whole Scylla project pretty solid.

emerge dev-db/scylla

The scylla packages are located in the ultrabug overlay for now until I test them even more and ultimately put them in production. Then they’ll surely reach the portage tree with the approval of the Gentoo java team for the app-admin/ packages listed above.

I provide a live ebuild (scylla-9999 with no keywords) and ebuilds for the latest major version (2.0_rc1 at time of writing).

It’s as simple as:

$ sudo layman -a ultrabug
$ sudo emerge -a dev-db/scylla
$ sudo emerge --config dev-db/scylla

Try it out and tell me what you think, I hope you’ll start considering and using this awesome database!

July 27, 2017
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Last evening, I ran some updates on one of my servers. One of the updates was from MariaDB 10.1 to 10.2 (some minor release as well). After compiling, I went to restart, but it failed with:

# /etc/init.d/mysql start
* Checking mysqld configuration for mysql ...
[ERROR] Can't find messagefile '/usr/share/mysql/errmsg.sys'
[ERROR] Aborting

* mysql config check failed [ !! ]
* ERROR: mysql failed to start

I’m not sure why this just hit me now, but it looks like it is a function within the init script that’s causing it to look for files in the nonexistent directory of /usr/share/mysql/ instead of the appropriate /usr/share/mariadb/. The fast fix here (so that I could get everything back up and running as quickly as possible) was to simply symlink the directory:

cd /usr/share
ln -s mariadb/ mysql

Thereafter, MariaDB came up without any problem:

# /etc/init.d/mysql start
* Caching service dependencies ... [ ok ]
* Checking mysqld configuration for mysql ... [ ok ]
* Starting mysql ... [ ok ]
# /etc/init.d/mysql status
* status: started

I hope that information helps if you’re in a pinch and run into the same error message.

Cheers,
Zach

UPDATE: It seems as if the default locations for MySQL/MariaDB configurations have changed (in Gentoo). Please see this comment for more information about a supportable fix for this problem moving forward. Thanks to Brian Evans for the information. 🙂

July 23, 2017
Michał Górny a.k.a. mgorny (homepage, bugs)
Optimizing ccache using per-package caches (July 23, 2017, 18:03 UTC)

ccache can be of great assistance to Gentoo developers and users who frequently end up rebuilding similar versions of packages. By providing a caching compiler frontend, it can speed up builds by removing the need to build files that have not changed again. However, it uses a single common cache directory by default which can be suboptimal even if you are explicitly enabling ccache only for a subset of packages needing that.

The likeliness of cross-package ccache hits is pretty low — majority of the hits occurs within a single package. If you use a single cache directory for all affected packages, it grows pretty quick. Besides a possible performance hit from having a lot of files in every directory, this means that packages built later can shift earlier packages out of the cache, resulting in meaninglessly lost cache hits. A simple way to avoid both of the problems is to use separate ccache directories.

In my solution, a separate subdirectory of /var/cache/ccache is used for every package, named after the category, package name and slot. While the last one is not strictly necessary, it can be useful for slotted packages such as LLVM where I do not want frequently changing live package sources to shift the release versions out of the cache.

To use it, put a code similar to the following in your /etc/portage/bashrc:

if [[ ${FEATURES} == *ccache* && ${EBUILD_PHASE_FUNC} == src_* ]]; then
	if [[ ${CCACHE_DIR} == /var/cache/ccache ]]; then
		export CCACHE_DIR=/var/cache/ccache/${CATEGORY}/${PN}:${SLOT}
		mkdir -p "${CCACHE_DIR}" || die
	fi
fi

The first condition makes sure the code is only run when ccache is enabled, and only for src_* phases where we can rely on userpriv being used consistently. The second one makes sure the code only applies to a specific (my initial) value of CCACHE_DIR and therefore avoids both nesting the cache indefinitely when Portage calls subsequent phase functions, and applying the replacement if user overrides CCACHE_DIR.

You need to either adjust the value used here to the directory used on your system, or change it in your /etc/portage/make.conf:

CCACHE_DIR="/var/cache/ccache"

Once this is done, Portage should start creating separate cache directories for every package where you enable ccache. This should improve the cache hit ratio, especially if you are using ccache for large packages (why else would you need it?). However, note that you will no longer have a single cache size limit — every package will have its own limit. Therefore, you may want to reduce the limits per-package, or manually look after the cache periodically.

July 16, 2017
Michał Górny a.k.a. mgorny (homepage, bugs)
GLEP 73 check results explained (July 16, 2017, 08:40 UTC)

The pkgcheck instance run for the Repo mirror&CI project has finished gaining a full support for GLEP 73 REQUIRED_USE validation and verification today. As a result, it can report 5 new issues defined by that GLEP. In this article, I’d like to shortly summarize them and explain how to interpret and solve the reports.

Technical note: the GLEP number has not been formally assigned yet. However, since there is no other GLEP request open at the moment, I have taken the liberty of using the next free number in the implementation.

GLEP73Syntax: syntax violates GLEP 73

GLEP 73 specifies a few syntax restrictions as compared to the pretty much free-form syntax allowed by the PMS. The restrictions could be shortly summarized as:

  • ||, ^^ and ?? can not not be empty,
  • ||, ^^ and ?? can not not be nested,
  • USE-conditional groups can not be used inside ||, ^^ and ??,
  • All-of groups (expressed using parentheses without a prefix) are banned completely.

The full rationale for the restrictions, along with examples and proposed fixes is provided in the GLEP. For the purpose of this article, it is enough to say that in all the cases found, there was a simpler (more obvious) way of expressing the same constraint.

Violation of this syntax prevents pkgcheck from performing any of the remaining checks. But more importantly, the report indicates that the constraint is unnecessarily complex and could result in REQUIRED_USE mismatch messages that are unnecessarily confusing to the user. Taking a real example, compare:

  The following REQUIRED_USE flag constraints are unsatisfied:
    exactly-one-of ( ( !32bit 64bit ) ( 32bit !64bit ) ( 32bit 64bit ) )

and the effect of a valid replacement:

  The following REQUIRED_USE flag constraints are unsatisfied:
	any-of ( 64bit 32bit )

While we could debate about usefulness of the Portage output, I think it is clear that the second output is simpler to comprehend. And the best proof is that you actually need to think a bit before confirming that they’re equivalent.

GLEP73Immutability: REQUIRED_USE violates immutability rules

This one is rather simple: it means this constraint may tell user to enable (disable) a flag that is use.masked/forced. Taking a trivial example:

a? ( b )

GLEP73Immutability report will trigger if a profile masks the b flag. This means that if the user has a enabled, the PM would normally tell him to enable b as well. However, since b is masked, it can not be enabled using normal methods (we assume that altering use.mask is not normally expected).

The alternative is to disable a then. But what’s the point of letting user enable it if we afterwards tell him to disable it anyway? It is more friendly to disable both flags together, and this is pretty much what the check is about. So in this case, the solution is to mask a as well.

How to read it? Given the generic message of:

REQUIRED_USE violates immutability rules: [C] requires [E] while the opposite value is enforced by use.force/mask (in profiles: [P])

It indicates that in profiles P (a lot of profiles usually indicates you’re looking for base or top-level arch profile), E is forced or masked, and that you probably need to force/mask C appropriately as well.

GLEP73SelfConflicting: impossible self-conflicting condition

This one is going to be extremely rare. It indicates that somehow the REQUIRED_USE nested a condition and its negation, causing it to never evaluate to true. It is best explained using the following trivial example:

a? ( !a? ( b ) )

This constraint will never be enforced since a and !a can not be true simultaneously.

Is there a point in having such a report at all? Well, such a thing is extremely unlikely to happen. However, it would break the verification algorithms and so we need to account for it explicitly. Since we account for it anyway and it is a clear mistake, why not report it?

GLEP73Conflict: request for conflicting states

This warning indicates that there are at least two constraints that can apply simultaneously and request the opposite states for the same USE flag. Again, best explained on a generic example:

a? ( c ) b? ( !c )

In this example, any USE flag set with both a and b enabled could not satisfy the constraint. However, Portage will happily led us astray:

  The following REQUIRED_USE flag constraints are unsatisfied:
	a? ( c )

If we follow the advice and enable c, we get:

  The following REQUIRED_USE flag constraints are unsatisfied:
	b? ( !c )

The goal of this check is to avoid such a bad advices, and to require constraints to clearly indicate a suggested way forward. For example, the above case could be modified to:

a? ( !b c ) b? ( !c )

to indicate that a takes precedence over b, and that b should be disabled to avoid the impossible constraint. The opposite can be stated similarly — however, note that you need to reorder the constraints to make sure that the PM will get it right:

b? ( !a !c ) a? ( c )

How to read it? Given the generic message of:

REQUIRED_USE can request conflicting states: [Ci] requires [Ei] while [Cj] requires [Ej]

It means that if the user enables Ci and Cj simultaneously, the PM will request conflicting Ei and Ej. Depending on the intent, the solution might involve negating one of the conditions in the other constraint, or reworking the REQUIRED_USE towards another solution.

GLEP73BackAlteration: previous condition starts applying

This warning is the most specific and the least important from all the additions at the moment. It indicates that the specific constraint may cause a preceding condition to start to apply, enforcing additional requirements. Consider the following example:

b? ( c ) a? ( b )

If the user has only a enabled, the second rule will enforce b. Then the condition for the first rule will start matching, and additionally enforce c. Is this a problem? Usually not. However, for the purpose of GLEP 73 we prefer that the REQUIRED_USE can be enforced while processing left-to-right, in a single iteration. If a previous rule starts applying, we may need to do another iteration.

The solution is usually trivial: to reorder (swap) the constraints. However, in some cases developers seem to prefer copying the enforcements into the subsequent rule, e.g.:

b? ( c ) a? ( b c )

Either way works for the purposes of GLEP 73, though the latter increases complexity.

How to read it? Given the generic message of:

REQUIRED_USE causes a preceding condition to start applying: [Cj] enforces [Ej] which may cause preceding [Ci] enforcing [Ei] to evaluate to true

This indicates that if Cj is true, Ej needs to be true as well. Once it is true, a preceding condition of Ci may also become true, adding another requirement for Ei. To fix the issue, you need to either move the latter constraint before the former, or include the enforcement of Ei in the rule for Cj, rendering the application of the first rule unnecessary.

Constructs using ||, ^^ and ?? operators

GLEP 73 specifies a leftmost-preferred behavior for the ||, ^^ and ?? operators. It is expressed in a simple transformation into implications (USE-conditional groups). Long story short:

  • || and ^^ groups force the leftmost unmasked flag if none of the flags are enabled already, and
  • ?? and ^^ groups disable all but the leftmost enabled flag if more than one flag is enabled.

All the verification algorithms work on the transformed form, and so their output may list conditions resulting from it. For example, the following construct:

|| ( a b c ) static? ( !a )

will report a conflict between !b !c ⇒ a and static ⇒ !a. This indicates the fact that per the forementioned rule, || group is transformed into !b? ( !c? ( a ) ) which explains that if none of the flags are enabled, the first one is preferred, causing a conflict with the static flag.

In this particular case you could debate that the algorithm could choose b or c instead in order to avoid the problem. However, we determined that this kind of heuristic is not a goal for GLEP 73, and instead we always obide the developer’s preference expressed in the ordering. The only exception to this rule is when the leftmost flag can not match due to a mask, in which case the first unmasked flag is used.

For completeness, I should add that ?? and ^^ blocks create implications in the form of: a ⇒ !b !c…, b ⇒ !c… and so on.

At some point I might work on making the reports include the original form to avoid ambiguity.

The future

The most important goal for GLEP 73 is to make it possible for users to install packages out-of-the-box without having to fight through mazes of REQUIRED_USE, and for developers to use REQUIRED_USE not only sparingly but whenever possible to improve the visibility of resulting package configuration. However, there is still a lot of testing, some fixing and many bikesheds before that could happen.

Nevertheless, I think we can all agree that most of the reports produced so far (with the exception of the back-alteration case) are meaningful even without automatic enforcing of REQUIRED_USE, and fixing them would benefit our users already. I would like to ask you to look for the reports on your packages and fix them whenever possible. Feel free to ping me if you need any help with that.

Once the number of non-conforming packages goes down, I will convert the reports successively into warning levels, making the CI report new issues and the pull request scans proactively complain about them.

July 04, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
New autoconf, new problems (July 04, 2017, 18:03 UTC)

So a new release of autoconf is in the tree, it’s version 2.62. Guess what? It’s far from glitches-free. Although I suppose the biggest problem is from programs that bend the rules autoconf gives you. While the warnings it spews up during Amarok build are, well, just a warning, which will probably be fixed before a version of autoconf making those problems fatal, there are subtle bugs that might appear, like -bug #217154 which made PAM fail to build.

For a series of fortunate circumstances, since last week I’ve restarted packaging Ruby libraries, that I needed both for my own blog (Typo) and other stuff. As usual, I’m trying to avoid packaging gems since they can be quite of a pain in the bum to deal with. But since that’s something it’s known already I want to post a few more notes, especially in light of the fact we’ll hopefully be having a decent way to package gems in Gentoo by the end of the Summer.

Again gitarella (July 04, 2017, 18:03 UTC)

So, last night I couldn’t sleep, and I then decided to continue with what I do when I cannot sleep: code. I think this would be one of the last times that I have time to do this tho, but I’d wait the official results to say that. Now gitarella loads fine repositories in which there are merges (commits with more than one parent), as ruby-hunspell came to be, plus I’ve added the tag display à-la GitWeb, and added an option to rename the title of Gitarella pages.

No more envelopes! (July 04, 2017, 18:03 UTC)

And finally the job is complete! Tomorrow I’ll probably pass reading Gibson’s Neuromancer to relax, as I’m the tiredness personified. Now, after having to take a forced break yesterday to fix a nasty KDE regression with KSpell2 (you know, zander, ranting won’t bring you anywhere…), I’m working on the next ebuild for Amarok, 1.4.4. With this version, there is a partial Karma device support (as I wrote here); but I didn’t get much requests, and no devs seem to have a Rio Karma device (as far as I remember).

QA by disagreement (July 04, 2017, 18:03 UTC)

A few months ago I criticised Qt team for not following QA indications regarding the installation of documentation. Now, I wish to apologize and thank them: they were tremendously useful in identifying one area of Gentoo that needs fixes in both the QA policy and the actual use of it. The problem: the default QA rules, and the policies encoded in both the Ebuild HOWTO (that should be deprecated!) and Portage itself, is to install the documentation of packages into /usr/share/doc/${PF}, a path that changes between different revisions of the same package.

Blast from the recent past; two years ago (July 04, 2017, 18:03 UTC)

Today I was feeling somewhat blue, mostly because I’m demotivated to do most stuff, and I wanted to see what it was like to work in Gentoo two years ago. One thing I read is that a little shy of exactly two years ago, ICQ broke Kopete, just like they did yesterday. Interestingly enough, even though a workaround has been found for Kopete 0.12 (the one shipped with KDE 3.5), there is no bump I see in the tree this time.

Powered by.. Gentoo/FreeBSD (July 04, 2017, 18:03 UTC)

Okay, I want to thanks really, really much blackace for the “*Powered by Gentoo/FreeBSD*” logo I’ve put on the right on the blog :) I’m also trying to understand how to set a favicon with Gentoo/FreeBSD’s logo, but I’m not yet successful yet. Mostly because I suck at webmastering. This blog might remain a single experiment, a test, as Daniel (dsd) seems to be ready to update b2evolution on Planet Gentoo, so that it will improve antispam capabilities.