September 16 2020

Console-bound systemd services, the right way

Marek Szuba (marecki) September 16, 2020, 17:40

Let’s say that you need to run on your system some sort of server software which, instead of daemonising, has a command console permanently attached to standard input. Let us also say that said console is the only way for the administrator to interact with the service, including requesting its orderly shutdown – whoever has written it has not implemented any sort of signal handling, so sending SIGTERM to the service process causes it to simply drop dead, potentially losing data in the process. And finally, let us say that the server in question is proprietary software, so it isn’t really possible for you to fix any of the above in the source code (yes, I am talking about a specific piece of software – which by the way is very much alive and kicking as of late 2020). What do you do?

According to the collective wisdom of the World Wide Web, the answer to this question is “use a terminal multiplexer like tmux or screen”, or at the very least a stripped-down variant of same such as dtach. OK, that sort of works – but what if you want to run it as a proper system-managed service under e.g. OpenRC? The answer of the Stack Exchange crowd: have your init script invoke the terminal multiplexer. Oooooookay, how about under systemd, which actually prefers services it manages not to daemonise by themselves? Nope, still “use a terminal multiplexer”.

What follows is my attempt to run a service like this under systemd more efficiently and elegantly, or at least with no extra dependencies beyond basic Unix shell commands.

Let us have a closer look at what systemd does with the standard I/O of processes it spawns. The man page systemd.exec(5) tells us that what happens here is controlled by the directives StandardInput, StandardOutput and StandardError. By default the former is assigned to null while the latter two get piped to the journal; there are, however, quite a few other options here. According to the documentation, here is what systemd allows us to connect to standard input:

    • we are not interested in null (for obvious reasons) or any of the tty options (the whole point of this exercise is to run fully detached from any terminals);
    • data would work if we needed to feed some commands to the service when it starts but is useless for triggering a shutdown;
    • file looks promising – just point it to a FIFO on the file system and we’re all set – but it doesn’t actually take care of creating the FIFO for us. While we could in theory work around that by invoking mkfifo (and possibly chown if the service is to run as a specific user) in ExecStartPre (see the sketch after this list), let’s see if we can find a better option;
    • socket “is valid in socket-activated services only” and the corresponding socket unit must “have Accept=yes set”. What we want is the opposite, i.e. for the service to create its socket;
    • finally, there is fd – which seems to be exactly what we need. According to the documentation all we have to do is write a socket unit creating a FIFO with appropriate ownership and permissions, make it a dependency of our service using the Sockets directive, and assign the corresponding named file descriptor to standard input.
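
By the way, just to show what the rejected file-plus-ExecStartPre workaround from the list above might look like, here is a rough, hypothetical sketch (using the same paths and user as in the units below) – not what we will end up using:

[Service]
User=pcd
Group=pcd
# Let systemd create /run/proprietarycrapd owned by the service user...
RuntimeDirectory=proprietarycrapd
RuntimeDirectoryMode=0700
# ...then create the FIFO by hand; since ExecStartPre runs as User=pcd here, no chown is needed
ExecStartPre=/bin/sh -c '[ -p /run/proprietarycrapd/pcd.control ] || mkfifo -m 0600 /run/proprietarycrapd/pcd.control'
StandardInput=file:/run/proprietarycrapd/pcd.control
ExecStart=/opt/proprietarycrap/bin/proprietarycrapd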

Let’s try it out. To begin with, our socket unit “proprietarycrapd.socket”. Note that I have successfully managed to get this to work using unit templates as well – %i expansion works fine both here and while specifying unit or file-descriptor names in the service unit – but in order to avoid any possible confusion caused by the fact that socket-activated services explicitly require being defined with templates, I have based my example on static units:

[Unit]
Description=Command FIFO for proprietarycrapd

[Socket]
ListenFIFO=/run/proprietarycrapd/pcd.control
DirectoryMode=0700
SocketMode=0600
SocketUser=pcd
SocketGroup=pcd
RemoveOnStop=true

Apart from the fact that the unit in question has no [Install] section (which makes sense given that we want this socket to only be activated by the corresponding service, not by systemd itself), there is nothing out of the ordinary here. Note that since we haven’t used the FileDescriptorName directive, systemd will apply the default behaviour and give the file descriptor associated with the FIFO the name of the socket unit itself.
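
If one wanted to be explicit rather than rely on that default, systemd.socket(5) also offers the FileDescriptorName directive. A hypothetical variant of the socket unit above (the name “pcd-control” is made up for illustration) would simply add it to the [Socket] section:

[Socket]
ListenFIFO=/run/proprietarycrapd/pcd.control
# Explicit descriptor name instead of the default (the name of the socket unit);
# this is the name a StandardInput=fd:NAME setting in the service unit would refer to.
FileDescriptorName=pcd-control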

And now, our service unit “proprietarycrapd.service”:

[Unit]
Description=proprietarycrap daemon
After=network.target

[Service]
User=pcd
Group=pcd
Sockets=proprietarycrapd.socket
StandardInput=socket
StandardOutput=journal
StandardError=journal
ExecStart=/opt/proprietarycrap/bin/proprietarycrapd
ExecStop=/usr/local/sbin/proprietarycrapd-stop

[Install]
WantedBy=multi-user.target

StandardInput=socket??? Whatever’s happened to StandardInput=fd:proprietarycrapd.socket??? Here is an odd thing. If I use the latter on my system, the service starts fine and gets the FIFO attached to its standard input – but when I try to stop the service the journal shows “Failed to load a named file descriptor: No such file or directory”, the ExecStop command is not run and systemd immediately fires a SIGTERM at the process. No idea why. Anyway, through trial and error I have found out that StandardInput=socket not only works fine in spite of being used in a service that is not socket-activated but actually does exactly what I wanted to achieve – so that is what I have ended up using.
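
For reference, the fd-based variant described above would differ from the working unit only in that one line of the [Service] section:

[Service]
Sockets=proprietarycrapd.socket
# Starts fine, but on my system breaks ExecStop handling as described above:
StandardInput=fd:proprietarycrapd.socket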

Which brings us to the final topic, the ExecStop command. There are three reasons why I have opted for putting all the commands required to shut the server down in a shell script:

    • first and foremost, writing the shutdown command to the FIFO will return right away even if the service takes time to shut down. systemd sends SIGTERM to the unit process as soon as the last ExecStop command has exited, so we have to follow the echo with something that waits for the server process to finish (see below);
    • systemd does not execute Exec commands in a shell, so simply running echo > /run/proprietarycrapd/pcd.control doesn’t work – we would have to wrap the echo call in an explicit invocation of a shell (see the sketch after this list);
    • between the aforementioned two reasons and the fact that the particular service for which I have created these units actually requires several commands in order to execute an orderly shutdown, I have decided that putting all those commands in a script file instead of cramming them into the unit would be much cleaner.
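
For illustration, the shell wrapping mentioned in the list above would look something like the following – with “stop” standing in for whatever the actual console command is:

# Hypothetical inline variant; note that it would still need to be followed by
# something that waits for the main process to exit, which is why a dedicated
# script ends up being the cleaner choice:
ExecStop=/bin/sh -c 'echo stop > /run/proprietarycrapd/pcd.control'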

The shutdown script itself is mostly unremarkable, so I’ll only quote the bit responsible for waiting for the server to actually shut down. At present I am still looking for a way to do this in a blocking fashion without adding more dependencies (wait only works on child processes of the current shell, the server in question does not create any lock files to which I could attach inotifywait, and attaching the latter to the relevant directory in /proc does not work) but in the meantime, the loop

while kill -0 "${MAINPID}" 2> /dev/null; do
    sleep 1s
done

keeps the script ticking along until either the process has exited or the script has timed out (see the TimeoutStopSec directive in systemd.service(5)) and systemd has killed both it and the service itself.
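
Put together, a minimal sketch of such a proprietarycrapd-stop script could look like the following – with “stop” again being a stand-in for the server-specific console commands:

#!/bin/sh
# Hypothetical shutdown script for the unit above.
# MAINPID is set in the environment by systemd for ExecStop= commands.

fifo=/run/proprietarycrapd/pcd.control

# Send the (server-specific) shutdown command(s) to the console FIFO
echo stop > "${fifo}"

# Writing to the FIFO returns immediately, so wait until the main process is
# actually gone - or until TimeoutStopSec expires and systemd kills us both.
while kill -0 "${MAINPID}" 2> /dev/null; do
    sleep 1s
done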

Acknowledgements: with many thanks to steelman for having figured out the StandardInput=socket bit in particular and having let me bounce my ideas off him in general.

September 15 2020

Distribution kernel for Gentoo

Gentoo News (GentooNews) September 15, 2020, 5:00

The Gentoo Distribution Kernel project is excited to announce that our new Linux Kernel packages are ready for a wide audience! The project aims to create a better Linux Kernel maintenance experience by providing ebuilds that can be used to configure, compile, and install a kernel entirely through the package manager as well as prebuilt binary kernels. We are currently shipping three kernel packages:

  • sys-kernel/gentoo-kernel - providing a kernel with genpatches applied, built using the package manager with either a distribution default or a custom configuration
  • sys-kernel/gentoo-kernel-bin - prebuilt version of gentoo-kernel, saving time on compiling
  • sys-kernel/vanilla-kernel - providing a vanilla (unmodified) upstream kernel

All the packages install the kernel as part of the package installation process — just like the rest of your system! More information can be found in the Gentoo Handbook and on the Distribution Kernel project page. Happy hacking!
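
For example, picking up the prebuilt variant boils down to something like:

emerge --ask sys-kernel/gentoo-kernel-bin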

Larry with Tux as cowboy

September 12 2020

New vulnerability fixes in Python 2.7 (and PyPy)

Michał Górny (mgorny) September 12, 2020, 20:13

As you probably know (and aren’t necessarily happy about it), Gentoo is actively working on eliminating Python 2.7 support from packages until the end of 2020. Nevertheless, we are going to keep the Python 2.7 interpreter around for much longer because of some build-time dependencies. While we do that, we consider it important to keep Python 2.7 as secure as possible.

The last Python 2.7 release was in April 2020. Since then, at least Gentoo and Fedora have backported the CVE-2019-20907 (infinite loop in tarfile) fix to it, mostly because the patch from Python 3 applied cleanly to Python 2.7. I’ve indicated that Python 2.7 may contain more vulnerabilities, and two days ago I finally got to audit it properly as part of bumping PyPy.

The result is matching two more vulnerabilities that were discovered in Python 3.6, and backporting fixes for them: CVE-2020-8492 (ReDoS in basic HTTP auth handling) and bpo-39603 (header injection via HTTP method). I am pleased to announce that Gentoo is probably the first distribution to address these issues, and our Python 2.7.18-r2 should not contain any known vulnerabilities. Of course, this doesn’t mean it’s safe from undiscovered problems.

While at it, I’ve also audited PyPy. Sadly, all current versions of PyPy2.7 were vulnerable to all aforementioned issues, plus partially to CVE-2019-18348 (header injection via hostname, fixed in 2.7.18). PyPy3.6 was even worse, missing 12 fixes from CPython 3.6. All these issues were fixed in Mercurial now, and should be part of 7.3.2 final.

September 09 2020

New Packages site features

Gentoo News (GentooNews) September 09, 2020, 5:00

Our packages.gentoo.org site has recently received major feature upgrades thanks to the continued efforts of Gentoo developer Max Magorsch (arzano). Highlights include:

  • Tracking Gentoo bugs of specific packages (Bugzilla integration)
  • Tracking available upstream package versions (Repology integration)
  • QA check warnings for specific packages (QA reports integration)

Additionally, an experimental command-line client for packages.gentoo.org named “pgo” is in preparation, in particular for our users with accessibility needs.

Gentoo in a package

September 07 2020

py3status v3.29

Alexys Jacob (ultrabug) September 07, 2020, 20:39

Almost 5 months after the latest release (thank you COVID) I’m pleased and relieved to have finally packaged and pushed py3status v3.29 to PyPi and Gentoo portage!

This release comes with a lot of interesting contributions from quite a bunch of first-time contributors so I thought that I’d thank them first for a change!

Thank you contributors!
  • Jacotsu
  • lasers
  • Marc Poulhiès
  • Markus Sommer
  • raphaunix
  • Ricardo Pérez
  • vmoyankov
  • Wilmer van der Gaast
  • Yaroslav Dronskii

So what’s new in v3.29?

Two new exciting modules are in!

  • prometheus module: to display your promQL queries on your bar
  • watson module: for the watson time-tracking tool

Then some interesting bug fixes and enhancements are to be noted

  • py3.requests: return empty json on remote server problem fix #1401
  • core modules: remove deprecated function, fix type annotation support (#1942)

Some modules also got improved

  • battery_level module: add power consumption placeholder (#1939) + support more battery paths detection (#1946)
  • do_not_disturb module: change pause default from False to True
  • mpris module: implement broken chromium mpris interface workaround (#1943)
  • sysdata module: add {mem,swap}_free, {mem,swap}_free_unit, {mem,swap}_free_percent + try to use default intel/amd sensors first
  • google_calendar module: fix imports for newer google-python-client-api versions (#1948)

Next version of py3status will certainly drop support for EOL Python 3.5!

September 05 2020

Portage 3.0 stabilized

Gentoo News (GentooNews) September 05, 2020, 5:00

We have good news! Gentoo’s Portage project has recently stabilized version 3.0 of the package manager.

What’s new? Well, this third version of Portage removes support for Python 2.7, which has been an ongoing effort across the main Gentoo repository by Gentoo’s Python project during 2020 (see this blog post).

In addition, thanks to a user-provided patch, updating to the latest version of Portage can speed up dependency calculations by around 50-60%. We love to see our community engaging in our software! For more details, see this Reddit post from the community member who provided the patch. Stay healthy and keep cooking with Gentoo!

Skating Larry

September 02 2020

New tools to help with package cleanups

Michał Górny (mgorny) September 02, 2020, 7:12

Did you ever have Croaker shout at you because you removed an old version that just happened to still be required by some other package? Did you have to run your cleanups past (slow-ish) CI just to avoid that? If you did, I have just released app-portage/mgorny-dev-scripts version 6, which has a tool just for that!

check-revdep to check depgraph of reverse dependencies

If you have used mgorny-dev-tools before, then you may already know about the rdep tool that prints reverse dependency information collected from qa-reports.gentoo.org. Now I’ve put it into a trivial pipeline with pkgcheck, and made check-revdep. The idea is really simple: it fetches the list of reverse dependencies from the server, filters it through qatom (from app-portage/portage-utils) and passes it to pkgcheck scan -c VisibilityCheck.

So you do something like:


$ cd dev-python/unidecode
$ git rm unidecode-0.04.21.ebuild 
rm 'dev-python/unidecode/unidecode-0.04.21.ebuild'
$ check-revdep 
== rdep of dev-python/unidecode ==
== ddep of dev-python/unidecode ==
== bdep of dev-python/unidecode ==
== pdep of dev-python/unidecode ==
cat: /tmp/pindex/dev-python/unidecode: No such file or directory
dev-python/awesome-slugify
  NonexistentDeps: version 1.6.5: RDEPEND: nonexistent package: <dev-python/unidecode-0.05
  NonsolvableDepsInDev: version 1.6.5: nonsolvable depset(rdepend) keyword(~amd64) dev profile (default/linux/amd64/17.0/no-multilib/prefix/kernel-3.2+) (33 total): solutions: [ <dev-python/unidecode-0.05 ]
  NonsolvableDepsInStable: version 1.6.5: nonsolvable depset(rdepend) keyword(~amd64) stable profile (default/linux/amd64/17.0) (38 total): solutions: [ <dev-python/unidecode-0.05 ]
[...]
$ git restore --staged --worktree .

…and you know you can’t clean it up.

Warning: the tooling uses data from qa-reports that is updated periodically. If the data is not up-to-date (read: someone just added a dependency on your package), check-revdep may miss something.

Enable cache to speed things up

rdep also supports using a local cache to avoid fetching everything from the server (= 4 requests per package). To populate the cache with the current data from the server, just run:

$ rdep-fetch-cache 
--2020-09-02 09:00:05--  https://qa-reports.gentoo.org/output/genrdeps/rdeps.tar.xz
Resolving qa-reports.gentoo.org (qa-reports.gentoo.org)... 140.211.166.190, 2001:470:ea4a:1:230:48ff:fef8:9fdc
Connecting to qa-reports.gentoo.org (qa-reports.gentoo.org)|140.211.166.190|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1197176 (1,1M) [application/x-xz]
Saving to: ‘STDOUT’

-                                    100%[=====================================================================>]   1,14M   782KB/s    in 1,5s    

2020-09-02 09:00:08 (782 KB/s) - written to stdout [1197176/1197176]

The script will fetch the data as a tarball from the server, and unpack it to /tmp/*index.

pkgcheck also has its own caching, so successive checks will run faster if packages don’t change.

Combining rdep with other tools

You can also pass rdep output through to other tools, such as eshowkw (app-portage/gentoolkit) or gpy-showimpls (app-portage/gpyutils). The recommended pipeline is:

$ rdep $(pkg) | grep -v '^\[B' | xargs qatom -C -F '%{CATEGORY}/%{PN}' | sort -u | xargs gpy-showimpls | less
== rdep of dev-python/unidecode ==
== ddep of dev-python/unidecode ==
== bdep of dev-python/unidecode ==
== pdep of dev-python/unidecode ==
cat: /tmp/pindex/dev-python/unidecode: No such file or directory
app-misc/khard:0
          0.13.0: S             3.7 3.8
          0.17.0: ~             3.7 3.8 3.9
app-text/pelican:0
           3.7.1: S   #     3.6
           4.0.0: ~   #     3.6
           4.0.1: ~   #     3.6
           4.1.2: ~   #     3.6
           4.2.0: S         3.6 3.7
            9999:           3.6 3.7
dev-python/awesome-slugify:0
           1.6.5: ~         3.6 3.7 3.8
dev-python/pretty-yaml:0
          20.4.0: S         3.6 3.7 3.8 3.9
dev-python/python-slugify:0
           1.2.6: S   #     3.6 3.7 3.8 3.9
           4.0.1: S         3.6 3.7 3.8 3.9
media-sound/beets:0
        1.4.9-r2: ~ s       3.6 3.7 3.8
            9999:   s       3.6 3.7 3.8
www-apps/nikola:0
          7.8.15: S         3.6
       7.8.15-r1: ~   #     3.6
           8.0.4: ~   #     3.6 3.7 3.8
           8.1.0: ~   #     3.6 3.7 3.8
           8.1.1: ~   #     3.6 3.7 3.8
        8.1.1-r1: ~         3.6 3.7 3.8

Getting redundant versions from pkgcheck

Another nice trick is to have pkgcheck scan for redundant versions, and output them in a format convenient for machine use.

For example, I often use:

git grep -l mgorny@ '**/metadata.xml' | cut -d/ -f1-2 | uniq | xargs pkgcheck scan -c RedundantVersionCheck -R FormatReporter --format '( cd {category}/{package} && eshowkw -C )'| sort -u | bash - |& less

that gives eshowkw output for all packages maintained by me that have potentially redundant versions. Plus, in another terminal:

$ git grep -l mgorny@ '**/metadata.xml' | cut -d/ -f1-2 | uniq | xargs pkgcheck scan -c RedundantVersionCheck -R FormatReporter --format 'git rm {category}/{package}/{package}-{version}.ebuild' | less

that gives convenient commands to copy-paste-execute.

August 25 2020

Is an umbrella organization a good choice for Gentoo?

Michał Górny (mgorny) August 25, 2020, 17:59

The talk of joining an umbrella organization and disbanding the Gentoo Foundation (GF) has been recurring over the last years. To the best of my knowledge, even some unofficial talks have been had earlier. However, so far our major obstacle for joining one was the bad standing of the Gentoo Foundation with the IRS. Now that that is hopefully out of the way, we can start actively working towards it.

But why would we want to join an umbrella in the first place? Isn’t having our own dedicated Foundation better? I believe that an umbrella is better for three reasons:

  1. Long-term sustainability. A dedicated professional entity that supports multiple projects has better chances than a small body run by volunteers from the developer community.
  2. Cost efficiency. Less money spent on organizational support, more money for what really matters to Gentoo.
  3. Added value. Umbrellas can offer us services and status that we currently haven’t been able to achieve.

I’ll expand on all three points.

Long-term sustainability

As you probably know by now, the Gentoo Foundation was not handled properly in the past. For many years, we have failed to file the necessary paperwork or pay due taxes. Successive boards of Trustees have either ignored the problem or were unable to resolve it. Only recently have we finally managed to come clean.

Now, many people point out that since we’re clean now, the problem is solved. However, I would like to point out that our good standing currently depends on one person doing the necessary bookkeeping, and a professional CPA doing the filings for us. The former means a bus factor of one, the latter means expenses. So far all efforts to train a backup have failed.

My point is, as long as Foundation exists we need to rely either on volunteers or on commercial support to keep it running. If we fail, it could be a major problem for Gentoo. We might not get away with it the next time. What’s more important, if we get into bad standing again, the chances of an umbrella taking us would decrease.

Remember that the umbrellas that interest us were founded precisely to support open source projects. They have professional staff to handle the legal and financial affairs of their members. The Gentoo Foundation, on the other hand, has a staff of Gentoo developers — programmers and scientists, but not really bookkeepers or lawyers. Sure, many of us run small companies, but so far we have lacked volunteers equipped and willing to seriously handle the GF.

Cost efficiency

So far I’ve been focusing on the volunteer-run Foundation. However, if we lack capable volunteers we can always rely on commercial support. The problem with that is that it’s really expensive. Admittedly, being part of an umbrella is not free either, but so far it seems that even the costliest umbrellas are cheaper than being on our own. Let’s crunch some numbers!

Right now we’re already relying on a CPA to handle our filings. For a commercial company (we are one now), the cost is $1500 a year. If we wanted to go for proper non-profit, the estimated cost is between $2000 and $3000 a year.

If we were to pass full accounting to an external company, the rough estimate I’ve been given by Trustees is $2400. So once our volunteer bookkeeper retires, we’re talking of around $4000 + larger taxes for a corporation, or $4500 to $5500 + very little taxes for a non-profit.

How does that compare to our income? I’ve created the following chart according to the financial reports.

Gentoo Foundation income chart

The chart is focused on estimating expected cash income within the particular year. Therefore, commission back payments were omitted from it. In the full version (click for it), GSoC back payments were moved to their respective years too.

Small donations are the key point here, as they are more reliable than other sources of income. Over the years, they varied between $5000 and $12000, amounting to $7200 on average. We have also had a few larger (>$1000) donations over the years, but we can’t rely on these in the coming years (especially as there were none in FY2020). The next major source of income was Google Summer of Code, which I’ve split into cash and travel reimbursement. The former only counts towards actual cash, and again, we can’t really rely on it happening in the future. Interest and commission have minimal impact.

The point is, the cost of full bookkeeping services comes dangerously close to our baseline annual income. On average, it would eat half of our budget! In 2014, if not for large donations (which are pretty much a 0/1 thing) we would have ended up with a loss. We’re talking about a situation where we could end up spending more on organizational overhead than on Gentoo!

Even if we take the optimistic approach, we’re talking about costs at around 20% to 45% of income, going by the past years. This is much more than the 10% taken by SFC (and SFC isn’t exactly cheap).

Added value

So far I’ve been focusing on the effort/money necessary to keep the Gentoo Foundation as-is. That is, a for-profit corporation that spends some money on Infrastructure and CPA, and whose biggest non-infra investment in Gentoo was the Nitrokey giveaway.

Over the recent years, the possibility of becoming a non-profit was discussed. The primary advantages of that would be tax deduction for the Foundation, and tax deduction for donors in the USA (hopefully convincing more people to donate). However, becoming a non-profit is non-trivial, requires additional effort and most likely increases maintenance costs. That is, if our application is not rejected like Yorba Foundation was. On the other hand, if we join a non-profit umbrella (such as SFC), we get that as part of the deal!

Another interesting point is increasing actual spending on Gentoo, particularly by issuing bounties on actual development work. If we were to become a non-profit, some legal advice would be greatly desirable here and again, that’s something umbrellas offer. On the other hand, if we spend more and more money on keeping the Gentoo Foundation alive we probably won’t have much to spend on this anyway.

So why keep GF alive?

That’s precisely the question. Some developers argue that an external umbrella could try to take control of Gentoo, and limit our freedom. However, given that we’re going to sign a specific contract with an umbrella, I don’t see this as very likely.

On the other hand, keeping GF alive doesn’t guarantee Gentoo autonomy either — given the lack of interest in becoming a Trustee, it is possible that Foundation will eventually be taken over by people who want to aggressively take control of Gentoo against the will of the greater community. In fact, until very recently you could become a Trustee without getting a single vote of support if there were not enough candidates to compete over seats (and there usually weren’t).

Then, there are snarky people who believe that the GF exists so that non-developers could reap negligible profits from Foundation membership, and people who would never be voted into the Council could win Trustee elections and enhance their CVs.

In any case, I think that the benefits of an umbrella organization outweigh the risks. I believe sustainability is the most important value here — a reasonable guarantee that Gentoo will not get into trouble in a few years because we couldn’t manage to find volunteers to run the Foundation or money to cover the accounting costs.

August 02 2020

Why proactively clean Python 2 up?

Michał Górny (mgorny) August 02, 2020, 10:18

It seems a recurring complaint that we’re too aggressive on cleaning Python 2 up from packages. Why remove it if (package’s) upstream still supports py2? Why remove it when it still works? Why remove it when somebody’s ready to put some work to keep it working?

I’m pretty sure that you’re aware that Python 2 has finally reached its end-of-life. It’s past its last release, and the current version is most likely vulnerable. We know we can’t remove it entirely just yet (but the clock is ticking!), so why remove its support here and there instead of keeping it some more?

This is best explained using the example of dev-python/twisted — but dev-python/pillow is quite similar. Twisted upstream removed support for Python 2 in version 20. This means that we ended up having to keep two versions of Twisted — 19, which still supports Python 2, and 20, which does not. What does that mean for our users?

Firstly, they can’t normally upgrade Twisted if at least one of its reverse dependencies supports Python 2 and is installed. What’s important is that the user does not have to meaningfully need or use Python 2 in that reverse dependency. It is entirely sufficient that it supports Python 2 and the user is using default PYTHON_TARGETS.

Of course, you could argue that changing the default PYTHON_TARGETS would resolve the problem without having to proactively remove Python 2 from Twisted revdeps. Today, I’m not sure which of the two options is better. However, back when the cleanup started, changing the default PT would have involved a lot of pain for the users. We’d have had to reenable 2.7 via package.use for many packages (but which ones?) or the users would have had to reenable it themselves. But that’s really tangential now.

Secondly, when upstream stops supporting the old version, the maintenance cost rises quickly. Since we don’t allow mixing two versions easily (and I don’t really want to go down that path), a single version must provide all the implementations that the union of its reverse dependencies requires. This meant that I had to put significant effort into fixing Python 3.8 and 3.9 support in Twisted 19.

Thirdly, old versions tend to end up becoming vulnerable. This is now the case both with Twisted and Pillow! In both cases, we can’t clean up vulnerable versions yet because they still have unresolved Python 2 reverse dependencies. We have a pretty descriptive phrase for this kind of situation in Polish: «to wake up with your hand in the potty».

What’s my point here? Removing Python 2 proactively means removing it at our leisure. We start with packages that don’t need it (because they fully support Python 3), we unlock the removal in their dependencies, we clean these dependencies… and when one of the upstreams decides to remove it, we don’t have to do anything because we’ve already done that and resolved all the issues. And we don’t have to worry about having to quickly clean up the depgraph and remove vulnerable versions or perform non-trivial backports.

July 20 2020

Updated Gentoo RISC-V stages

Andreas K. Hüttel (dilfridge) July 20, 2020, 16:28

I finally got around to updating the experimental riscv stages. You can find the result on our webserver. All stages use the rv64gc instruction set; there is a multilib stage with both lp64 and lp64d support, and there are non-multilib stages for both the lp64 and lp64d ABIs. Please test, and report bugs if anything doesn't work.

As for the technical details, the stages are built using qemu-user on a big and beefy Gentoo amd64 AWS instance. We are currently working on automating that process, such that riscv (and potentially also arm and others) get the same level of support as amd64 and friends. Thanks a lot to Amazon for the credits via their open source promotional program!
I finally got around to updating the experimental riscv stages. You can find the result on our webserver. All stages use the rv64gc instruction set; there is a multilib stage with both lp64 and lp64d support, and there are non-multilib stages for both lp64 and lp64d ABI. Please test, and report bugs if anything doesn't work.
As for the technical details, the stages are built using qemu-user on a big and beefy Gentoo amd64 AWS instance. We are currently working on automating that process, such that riscv (and potentially also arm and others) get the same level of support as amd64 and friends. Thanks a lot to Amazon for the credits via their open source promotional program!

July 07 2020

Gentoo on Android 64-bit release

Gentoo News (GentooNews) July 07, 2020, 5:00

Gentoo Project Android is pleased to announce a new 64-bit release of the stage3 Android prefix tarball. This is a major release after 2.5 years of development, featuring gcc-10.1.0, binutils-2.34 and glibc-2.31. Enjoy Gentoo in your pocket!


July 04 2020

gentoo tinderbox

Agostino Sarubbo (ago) July 04, 2020, 13:03

If you are visiting this page, it is very likely that the software you maintain has been analyzed by my tinderbox system.

What is a tinderbox?

It is a machine that compiles 24/7, aiming to find build failures, test failures, QA issues and so on in the portage tree.
It can be differentiated into:

– tinderbox
– ci

TINDERBOX:

It compiles the entire portage tree against a particular change like:
– a new version of compiler/libc/linker
– a new C/CXX/LD FLAG
– a different toolchain like clang/llvm/lld
– and so on

In short it uses uncommon but supported settings and looks for breakage.

CI:

It is a continuous integration system; it compiles the packages after they have been touched in gentoo.git.

The CI system uses a standard set of settings, so if you get a bug report from it, it is very likely that the failure is reproducible for users too.

What are the rules you should know when you see a report from these systems?

1) The reports are filed automatically.
2) Because of the first point, it is not possible for me to set an exact error in the bug summary. Instead, a general error is used.
3) Because of the above, the maintainer is encouraged to set an appropriate summary at their convenience.
4) Common additional logs (like test-suite.log, testlog.txt, CMakeOutput.log, CMakeError.log, LastTest.log, config.log, testsuite.log, autoconf.out) are automatically attached; because of point 1, if you need something else please ask for it.
5) If you ask for another log, I have to stop the tinderbox service, so there may be a delay between your request and my reaction.
6) There may be an internal reference between round brackets on the “Discovered on” line. This is for me to understand where that failure was reproduced.
7) If you see ‘ci’ as the internal reference after you have pushed a fix, it is very probable that the bug still exists, or that there is another failure in the same ebuild phase. Please inspect the build log closely. Point 8 may help you with that.
8) At the beginning of the build log a git SHA of the repository at the time of emerging is provided. For convenience there is a link.
9) To avoid making a separate attachment on bugzilla, the output of ‘emerge --info’ is included at the beginning of the build log; please check it CAREFULLY to understand the system configuration and what differs with respect to a more ‘standard’ system.
10) If you see a compressed build log, it is because the plain-text version exceeds the attachment limit on our bugzilla (1 MB).
11) This system is not perfect. There may be duplicates or invalid bugs.
12) My best suggestion is to try to reproduce the issue on an empty stage3 (or in Docker for convenience); see the example after this list.
13) When you close the bug with a resolution other than RESOLVED/FIXED, please do not be cryptic.
14) If new points are added, there may be a note like “Valid from YY:MM:DD”.
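
As an example for point 12, a throwaway environment for reproducing a report might be set up roughly like this (the image name and the package atom are placeholders, not taken from the original text):

docker run -it --rm gentoo/stage3-amd64 /bin/bash
# inside the container: fetch a tree snapshot, then retry the failing package
emerge-webrsync
emerge -1 '<category/package>'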

How to fix the common errors:

1) Compile/build failure:
It depends on the error. Please get in touch with upstream if you are unsure.

2) Test failure:
It depends on the error. Please get in touch with upstream if you are unsure.

3) CFLAGS/LDFLAGS not respected:
You can patch the build system or inject the flags in the ebuild where possible; see the sketch after this list. There are a lot of examples in the tracker.

4) -Wformat-security failure:
TBD

5) Metainfo installed in /usr/share/appdata:
Install the metainfo files into /usr/share/metainfo instead of /usr/share/appdata.

6) Python modules that are not byte-compiled
TBD

7) Unrecognized configure options:
Remove the configure options from the ebuild where possible. Sometimes there are false positives related to options passed to configure in subdirectories.

8) Compressed manpages and documentation:
Decompress documentation and install it as plain text.

9) Icon cache not updated:
TBD

10) Deprecated configure.in:
TBD

11) .desktop files that do not pass validation:
TBD

12) Path that should be created at runtime:
TBD

13) Libraries that lack NEEDED entries:
TBD

14) Libraries that lack a SONAME:
TBD

15) Text relocation:
TBD

16) Toolchain binaries called directly (cc/gcc/g++/c++/nm/ar/ranlib/cpp/ld/strip/objcopy/objdump/size/as/strings/readelf and so on):
TBD

17) Files with names not encoded in UTF-8:
TBD

18) Files with broken symlink:
TBD

19) Commands that do not exist:
TBD

20) Pkg-config files with wrong LDFLAGS:
TBD

21) Pre-stripped files:
TBD

22) File collision:
TBD

23) Compile failure if CPP is set to CC -E:
TBD

24) Compile failure with -fno-common:
TBD

25) Files with unresolved SONAME dependencies:
TBD

26) Files that contain insecure RUNPATHs:
TBD

27) Files installed into unexpected paths:
TBD

28) LD usage instead of CC/CXX:
TBD

29) Link failure with LLD because of /usr/lib:
TBD

30) Compilation in src_install phase:
TBD

31) Automake usage in maintainer-mode:
TBD

32) Mimeinfo cache not updated when .desktop files with MimeType are installed:
TBD

33) Broken png files installed:
TBD

34) Mime-info files installed without updating the mime-info cache:
TBD

35) Udev rules installed into wrong directory:
TBD
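
As for the flag injection mentioned in point 3 above, a hypothetical ebuild fragment might look roughly like this (a sketch only, not taken from any specific package; it assumes the ebuild inherits toolchain-funcs for tc-getCC/tc-getCXX):

src_compile() {
	# pass the user's toolchain and flags explicitly, so that the package's
	# own Makefile defaults cannot silently override them
	emake CC="$(tc-getCC)" CXX="$(tc-getCXX)" \
		CFLAGS="${CFLAGS}" CXXFLAGS="${CXXFLAGS}" LDFLAGS="${LDFLAGS}"
}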

Official Gentoo Docker images

Gentoo News (GentooNews) July 04, 2020, 5:00

Did you already know that we have official Gentoo Docker images available on Docker Hub?! The most popular one is based on the amd64 stage. Images are created automatically; you can peek at the source code for this on our git server. Thanks to the Gentoo Docker project!


June 04 2020

Baïkal (CalDAV) 0.7.0 in Gentoo

Nathan Zachary (nathanzachary) June 04, 2020, 3:30

Just this past week, the new version of Baïkal (0.7.0)—a PHP CalDAV and CardDAV server based on Sabre—was released, and one of the key changes was that support was added for more modern versions of PHP (like 7.4).

Since my personal Gentoo server is running the ~amd64 branch, I had to wait for this release in order to get my CalDAV server up and running. For the most part, installing Baïkal 0.7.0 was a straightforward process, but there were a couple of “gotchas” along the way.

The first (and most confusing) problem came after the installation/initial configuration when I tried to access my newly-created user’s calendar via the URL:

https://dav.MYDOMAIN.com/html/dav.php/calendar/MYUSERNAME/default

I knew that something was wrong when it wouldn’t even prompt me for credentials. Instead, the logs indicated the following error message:

[Tue Jun 02 14:13:05.529805 2020] [proxy_fcgi:error] [pid 32165:tid 139743908050688] [client 71.81.87.208:38910] AH01071: Got error 'PHP message: LogicException: Requested uri (/html/dav.php) is out of base uri (/s/html/dav.php/) in /var/www/domains/MYDOMAIN/dav/htdocs/vendor/sabre/http/lib/Request.php:184

I couldn’t figure out where the “/s/” was coming in before the “/html” portion, but that was certainly the cause of the error message. I filed an issue for it, and though I still don’t know the source of the problem, I was able to work around it by adding a trailing slash to the DocumentRoot for that particular vhost:

# pwd && diff -Nut dav.MYDOMAIN.conf.PRE-20200602_docroot dav.MYDOMAIN.conf
/etc/apache2/vhosts.d/includes
--- dav.MYDOMAIN.conf.PRE-20200602_docroot 2020-06-02 17:23:20.246281195 -0400
+++ dav.MYDOMAIN.conf 2020-06-02 17:20:59.892270352 -0400
@@ -1,7 +1,7 @@
- DocumentRoot "/var/www/domains/MYDOMAIN/dav/htdocs"
+ DocumentRoot "/var/www/domains/MYDOMAIN/dav/htdocs/"

After solving that strange problem, I was at least prompted for credentials when I accessed the calendar URL from above. After logging in, I ran into one more problem, though:

Class 'XMLWriter' not found

This problem was much easier to fix. I simply needed to add the ‘xmlwriter‘ USE flag to dev-lang/php (I also added ‘xmlreader‘ for good measure), emerge it again, and restart PHP-FPM. Other distributions (like CentOS) will likely need to install the ‘php-xml’ package (or something similar).
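
In Gentoo terms that boils down to a line like the following in /etc/portage/package.use (shown here purely for illustration) before re-emerging PHP:

dev-lang/php xmlwriter xmlreader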

After that fix, I am happy to report that Baïkal 0.7.0 is working beautifully, and I have my calendars synced across all my devices. I personally use Thunderbird with Lightning on my computers, and a combination of DAVx5 with Simple Calendar Pro on my Android devices.

May 10 2020

200th Gentoo Council meeting

Gentoo News (GentooNews) May 10, 2020, 5:00

Way back in 2005, the reorganization of Gentoo led to the formation of the Gentoo Council, a steering body elected annually by the Gentoo developers. Fast forward 15 years, and today we had our 200th meeting! (No earth-shaking decisions were taken today though.) The logs and summaries of all meetings can be read online on the archive page.


May 08 2020

Reviving Gentoo Bugday

Gentoo News (GentooNews) May 08, 2020, 5:00

Reviving an old tradition, the next Gentoo Bugday will take place on Saturday 2020-06-06. Let’s contribute to Gentoo and fix bugs! We will focus on two topics in particular:

  • Adding or improving documentation on the Gentoo wiki
  • Fixing packages that fail with -fno-common (bug #705764)

Join us on channel #gentoo-bugday, freenode IRC, for real-time help. See you on 2020-06-06!


April 16 2020

Why I stopped fuzzing research

Agostino Sarubbo (ago) April 16, 2020, 17:53

If you followed me in the past, you may have noticed that I stopped fuzzing research. During this time many people have asked me why…so instead of repeating the same answer every time, why not write a few lines about it…

While fuzz research was in my case fully automated, if you want to do a nice job you should:
– Communicate with upstream by filing an exhaustive bug report;
– Publish an advisory that collects all the needed info (affected versions, fixed version, commit fix, reproducer, PoC, and so on); otherwise you force each downstream maintainer to do that by themselves.

What happens in the majority of cases instead?
– When there is no ticketing system, upstream maintainers do not answer your emails but fix the issues silently, so if you aren’t familiar with the code or don’t have time for investigations, you don’t have enough data to post. Even if you had time and knew the code, you could still make a mistake; so why take the responsibility of pointing out commit fixes and so on?
– If you pass the above step, you have to request a CVE. In the past it was enough to publish on oss-security and you would get a CVE from a member of the Mitre team. Nowadays you have to fill out a request that includes all the mentioned data and… wait 😀

If you pass the above two points and publish your advisory, what’s the next step? Stay tuned and wait for duplicates 😀 .

Let’s see a real example:
In the past I did fuzzing research on audiofile. Here is a screenshot of the issues without any words in the search field:

Do you see anything strange? Yeah there is clearly a duplicate.
I’m showing this image to point out the fact that, in order to avoid the duplicate, it would have been enough to look a little further below, so I am wondering:
if you are able to compile the software, use ASAN, use AFL, why aren’t you able to make a simple search to check if this issue was already filed?
For now, the only answer that I can think of is: everyone is hungry to find security issues and be the discoverer of a CVE.
Let’s clarify: if you find security issues by fuzzing you are not a security researcher at all and you will not be more palatable to the cybersecurity world. You are just creating CVE confusion for the rest of us.

On the other side, dear Mitre: you force us to fill out an exhaustive request, so, since you have all the data, why are you mistakenly assigning CVEs for already-reported issues?

The first few times I saw these duplicates, I tried to report them but, unfortunately, it’s not my job and I found it very hard to do because of the sheer number of them.

So, in short, I stopped fuzzing research because, given the current state of things, it’s a big waste of time.

April 15 2020

Spam increase due to SpamAssassin Bayes database not available for scanning

Nathan Zachary (nathanzachary) April 15, 2020, 20:04

For approximately the past six weeks or so, I’ve noticed an uptick in the amount of spam getting through (and delivered) on my primary mail server. At first the increase in false negatives (meaning spam not getting flagged as such) wasn’t all that bad, so I didn’t think much of it. However, starting last week and especially this week, the increase was so dramatic that it prompted me to look further into the problem.

I started by looking through my SpamAssassin and amavis settings to make sure that nothing was blatantly wrong, but nothing stood out as having recently changed. I made sure that I had the required Perl modules for all of SpamAssassin’s filtering, and again, nothing had recently changed. Coming up empty-handed, I decided to take a look at some headers for an email that came through even though it was spam:

X-Spam-Status: No, score=1.502 required=3.2 tests=[
	DKIM_SIGNED=0.1, DKIM_VALID=-0.1, FREEMAIL_FORGED_FROMDOMAIN=0.25,
	FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
	HTML_MESSAGE=0.001]

I thought that possibly something had changed with how SpamAssassin assigned points for various tests, and temporarily dropped the score required for spam flagging to 1.4. Delving more deeply, though, I found that the point assignments had not changed, so I reverted to 3.2 and kept investigating. After looking again, I noticed that one key test wasn’t showing in the ‘X-Spam-Status’ header for this email: Bayesian filtering. Normally, there would be some type of reference to ‘BAYES_%=#’ (where % represents the chance that the Bayesian filter thought that the message could be spam and # represents the score assigned to the email based on that chance) in the spam header. However, it was no longer showing up, which indicated to me that the Bayes filters weren’t running.

I then started with some basic Bayes troubleshooting steps, and found some clues. By analysing the output of spamassassin -D --lint, I saw that there could be a problem with the Bayes database:

Apr 14 22:10:41.875 [20701] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
Apr 14 22:10:42.061 [20701] dbg: config: fixed relative path: /var/lib/spamassassin/3.004004/updates_spamassassin_org/23_bayes.cf
Apr 14 22:10:42.061 [20701] dbg: config: using "/var/lib/spamassassin/3.004004/updates_spamassassin_org/23_bayes.cf" for included file
Apr 14 22:10:42.061 [20701] dbg: config: read file /var/lib/spamassassin/3.004004/updates_spamassassin_org/23_bayes.cf
Apr 14 22:10:42.748 [20701] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x55f1700749b8) implements 'learner_new', priority 0
Apr 14 22:10:42.748 [20701] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x55f1700749b8), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
Apr 14 22:10:42.759 [20701] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x55f170db5ba8)
Apr 14 22:10:42.759 [20701] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x55f1700749b8) implements 'learner_is_scan_available', priority 0
Apr 14 22:10:42.759 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_toks
Apr 14 22:10:42.760 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_seen
Apr 14 22:10:42.760 [20701] dbg: bayes: found bayes db version 3
Apr 14 22:10:42.760 [20701] dbg: bayes: DB journal sync: last sync: 1586894330
Apr 14 22:10:42.760 [20701] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200
Apr 14 22:10:42.760 [20701] dbg: bayes: untie-ing
Apr 14 22:10:42.764 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_toks
Apr 14 22:10:42.764 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_seen
Apr 14 22:10:42.765 [20701] dbg: bayes: found bayes db version 3
Apr 14 22:10:42.765 [20701] dbg: bayes: DB journal sync: last sync: 1586894330
Apr 14 22:10:42.765 [20701] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200
Apr 14 22:10:42.765 [20701] dbg: bayes: untie-ing

In particular, the following lines indicated a problem to me:

Apr 14 22:10:42.760 [20701] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200
<snip>
Apr 14 22:10:42.765 [20701] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200

Thinking back, I remembered that there were some changes to the amavis implementation in Gentoo that caused me problems in late February of 2020. One of those changes was relocating the amavis user’s home/runtime directory from /var/amavis/ to /var/lib/amavishome/. That’s when I saw it in the debugging output:

Apr 14 22:10:42.759 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_toks
Apr 14 22:10:42.760 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_seen

The directory for the Bayes database shouldn’t be /var/amavis/.spamassassin/bayes* any longer, but instead should be /var/lib/amavishome/.spamassassin/bayes*. I made that change:

# grep bayes_path /etc/mail/spamassassin/local.cf 
bayes_path /var/lib/amavishome/.spamassassin/bayes

and restarted both amavis and spamd, and now I could see the Bayes filter in the ‘X-Spam-Status’ header:

X-Spam-Status: Yes, score=18.483 required=3.2 tests=[BAYES_99=5.75,
	BAYES_999=8, FROM_SUSPICIOUS_NTLD=0.499,
	FROM_SUSPICIOUS_NTLD_FP=0.514, HTML_MESSAGE=0.001,
	HTML_OFF_PAGE=0.927, PDS_OTHER_BAD_TLD=1.999, RDNS_NONE=0.793]

After implementing the change for the Bayes database location in amavis, I have seen the false negative level drop back to where it used to be. 🙂
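
If you run into something similar, a quick way to confirm that SpamAssassin is reading a populated Bayes database is to dump its statistics for the configured path and check that the nspam/nham counts are non-zero (an illustrative command, using the path configured above):

# sa-learn --dbpath /var/lib/amavishome/.spamassassin/bayes --dump magic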

Cheers,
Zach

April 14 2020

py3status v3.28 – goodbye py2.6-3.4

Alexys Jacob (ultrabug) April 14, 2020, 14:51

The newest version of py3status starts to enforce the deprecation of Python 2.6 through 3.4 (inclusive), initiated by Thiago Kenji Okada more than a year ago and orchestrated by Hugo van Kemenade via #1904 and #1896.

Thanks to Hugo, I discovered pyupgrade, a nice tool by @asottile that updates your Python code base to more recent syntax!

Debian buster users might be interested in the installation war story that @TRS-80 kindly described and the final (and documented) solution found.

Changelog since v3.26

  • drop support for EOL Python 2.6-3.4 (#1896), by Hugo van Kemenade
  • i3status: support read_file module (#1909), by @lasers thx to @dohseven
  • clock module: add “locale” config parameter to change time representation (#1910), by inemajo
  • docs: update debian instructions fix #1916
  • mpd_status module: use currentsong command if possible (#1924), by girst
  • networkmanager module: allow using the currently active AP in formats (#1921), by Benoît Dardenne
  • volume_status module: change amixer flag ordering fix #1914 (#1920)

Thank you contributors

  • Thiago Kenji Okada
  • Hugo van Kemenade
  • Benoît Dardenne
  • @dohseven
  • @inemajo
  • @girst
  • @lasers

April 08 2020

Zstandard (zstd) Coming to >= gentoo-sources-5.6.4 (use=experimental)

Mike Pagano (mpagano) April 08, 2020, 18:15

I just added zstd to gentoo-sources which will apply to gentoo-sources kernels >=5.6.4 when the ‘experimental’ use flag is enabled.

zstd is described here[1] as “…a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios. It’s backed by a very fast entropy stage, provided by Huff0 and FSE library.”

You can read more about it here[2].

Thanks to Klemen Mihevc for the request [3]

[1] https://github.com/facebook/zstd

[2] https://facebook.github.io/zstd/

[3] https://bugs.gentoo.org/716520

March 30 2020

Linux kernel 5.6.0 iwlwifi bug

Mike Pagano (mpagano) March 30, 2020, 13:27

Quick note that the Linux kernel 5.6.0 has an iwlwifi bug that will prevent network connectivity. [1]

A patch [2] is out but did not make 5.6.0. This patch IS included in gentoo-sources-5.6.0. It will be in a future vanilla-sources 5.6.x once upstream releases a new version.

[1] https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.6-Broken-Intel-IWLWIFI

[2] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/patch/?id=be8c827f50a0bcd56361b31ada11dc0a3c2fd240
