May 10 2020

200th Gentoo Council meeting

Gentoo News (GentooNews) May 10, 2020, 0:00

Way back in 2005, the reorganization of Gentoo led to the formation of the Gentoo Council, a steering body elected annually by the Gentoo developers. Fast forward 15 years, and today we had our 200th meeting! (No earth-shaking decisions were taken today though.) The logs and summaries of all meetings can be read online on the archive page.

council group photo

May 08 2020

Reviving Gentoo Bugday

Gentoo News (GentooNews) May 08, 2020, 0:00

Reviving an old tradition, the next Gentoo Bugday will take place on Saturday 2020-06-06. Let’s contribute to Gentoo and fix bugs! We will focus on two topics in particular:

  • Adding or improving documentation on the Gentoo wiki
  • Fixing packages that fail with -fno-common (bug #705764)

Join us on channel #gentoo-bugday, freenode IRC, for real-time help. See you on 2020-06-06!
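
For anyone curious about the second bullet point: GCC 10 switched its default from -fcommon to -fno-common, so C code that relies on "tentative definitions" of the same global variable in several translation units no longer links. A minimal sketch of the failure mode (file and variable names are made up for illustration):

cat > a.c <<'EOF'
int counter;            /* tentative definition */
int main(void) { return counter; }
EOF
cat > b.c <<'EOF'
int counter;            /* same symbol, tentatively defined again */
EOF
gcc -fno-common a.c b.c -o demo   # fails: multiple definition of 'counter'
gcc -fcommon    a.c b.c -o demo   # legacy behaviour: links fine

The usual fix is to keep a single definition in one source file and declare the variable extern everywhere else.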

bug outline

April 16 2020

Why I stopped fuzzing research

Agostino Sarubbo (ago) April 16, 2020, 17:53

If you followed me in the past, you may have noticed that I stopped fuzzing research. During this time many people have asked me why…so instead of repeating the same answer every time, why not write a few lines about it…

While fuzzing research was, in my case, fully automated, if you want to do a nice job you should:
– Communicate with upstream by making an exhaustive bug report;
– Publish an advisory that collects all the needed info (affected versions, fixed version, commit fix, reproducer, PoC, and so on); otherwise you force each downstream maintainer to do that by themselves.

What happens in the majority of cases instead?
– When there is no ticketing system, upstream maintainers do not answer your emails but fix the issues silently, so if you aren’t familiar with the code or don’t have time to investigate, you don’t have enough data to post. Even if you had time and you knew the code, you could still make a mistake; so why take the responsibility of pointing out commit fixes and so on?
– If you pass the above step, you have to request a CVE. In the past it was enough to publish on oss-security and you would get a CVE from a member of the Mitre team. Nowadays you have to fill in a request that includes all the mentioned data and………wait 😀

If you pass the above two points and publish your advisory, what’s the next step? Stay tuned and wait for duplicates 😀.

Let’s see a real example:
In the past I did fuzzing research on audiofile. Here is a screenshot of the issues without any words in the search field:

Do you see anything strange? Yeah, there is clearly a duplicate.
I’m showing this image to point out that, in order to avoid the duplicate, it would have been enough to look a little further down the list, so I am wondering:
if you are able to compile the software, use ASAN, and use AFL, why aren’t you able to run a simple search to check whether the issue was already filed?
For now, the only answer that I can think of is: everyone is hungry to find security issues and be the discoverer of a CVE.
Let’s clarify: if you find security issues by fuzzing you are not a security researcher at all and you will not be more palatable to the cybersecurity world. You are just creating CVE confusion for the rest of us.

On the other side, dear Mitre: you force us to fill in an exhaustive request, so, since you have all the data, why are you mistakenly assigning CVEs to already-reported issues?

The first few times I saw these duplicates, I tried to report them but, unfortunately, it’s not my job and I found it very hard to do because of the sheer number of them.

So, in short, I stopped fuzzing research because, given the current state of things, it’s a big waste of time.

April 15 2020

Spam increase due to SpamAssassin Bayes database not available for scanning

Nathan Zachary (nathanzachary) April 15, 2020, 20:04

For approximately the past six weeks, I’ve noticed an uptick in the amount of spam getting through (and being delivered) on my primary mail server. At first the increase in false negatives (meaning spam not getting flagged as such) wasn’t all that bad, so I didn’t think much of it. However, starting last week and especially this week, the increase was so dramatic that it prompted me to look further into the problem.

I started by looking through my SpamAssassin and amavis settings to make sure that nothing was blatantly wrong, but nothing stood out as having recently changed. I made sure that I had the required Perl modules for all of SpamAssassin’s filtering, and again, nothing had recently changed. Coming up empty-handed, I decided to take a look at some headers for an email that came through even though it was spam:

X-Spam-Status: No, score=1.502 required=3.2 tests=[
	DKIM_SIGNED=0.1, DKIM_VALID=-0.1, FREEMAIL_FORGED_FROMDOMAIN=0.25,
	FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25,
	HTML_MESSAGE=0.001]

I thought that possibly something had changed with how SpamAssassin assigned points for various tests, and temporarily dropped the score required for spam flagging to 1.4. Delving more deeply, though, I found that the point assignments had not changed, so I reverted to 3.2 and kept investigating. After looking again, I noticed that one key test wasn’t showing in the ‘X-Spam-Status’ header for this email: Bayesian filtering. Normally, there would be some type of reference to ‘BAYES_%=#’ (where % represents the chance that the Bayesian filter thought that the message could be spam and # represents the score assigned to the email based on that chance) in the spam header. However, it was no longer showing up, which indicated to me that the Bayes filters weren’t running.

I then started with some basic Bayes troubleshooting steps, and found some clues. By analysing the output of spamassassin -D --lint, I saw that there could be a problem with the Bayes database:

Apr 14 22:10:41.875 [20701] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
Apr 14 22:10:42.061 [20701] dbg: config: fixed relative path: /var/lib/spamassassin/3.004004/updates_spamassassin_org/23_bayes.cf
Apr 14 22:10:42.061 [20701] dbg: config: using "/var/lib/spamassassin/3.004004/updates_spamassassin_org/23_bayes.cf" for included file
Apr 14 22:10:42.061 [20701] dbg: config: read file /var/lib/spamassassin/3.004004/updates_spamassassin_org/23_bayes.cf
Apr 14 22:10:42.748 [20701] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x55f1700749b8) implements 'learner_new', priority 0
Apr 14 22:10:42.748 [20701] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x55f1700749b8), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
Apr 14 22:10:42.759 [20701] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x55f170db5ba8)
Apr 14 22:10:42.759 [20701] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x55f1700749b8) implements 'learner_is_scan_available', priority 0
Apr 14 22:10:42.759 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_toks
Apr 14 22:10:42.760 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_seen
Apr 14 22:10:42.760 [20701] dbg: bayes: found bayes db version 3
Apr 14 22:10:42.760 [20701] dbg: bayes: DB journal sync: last sync: 1586894330
Apr 14 22:10:42.760 [20701] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200
Apr 14 22:10:42.760 [20701] dbg: bayes: untie-ing
Apr 14 22:10:42.764 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_toks
Apr 14 22:10:42.764 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_seen
Apr 14 22:10:42.765 [20701] dbg: bayes: found bayes db version 3
Apr 14 22:10:42.765 [20701] dbg: bayes: DB journal sync: last sync: 1586894330
Apr 14 22:10:42.765 [20701] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200
Apr 14 22:10:42.765 [20701] dbg: bayes: untie-ing

In particular, the following lines indicated a problem to me:

Apr 14 22:10:42.760 [20701] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200
<snip>
Apr 14 22:10:42.765 [20701] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200
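
As an aside, the full -D output is quite noisy; SpamAssassin can restrict debugging to specific areas, so a quicker way to get just the Bayes lines is something like:

# limit debug output to the bayes area and filter for the availability check
spamassassin -D bayes --lint 2>&1 | grep 'available for scanning'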

Thinking back, I remembered that there were some changes to the amavis implementation in Gentoo that caused me problems in late February of 2020. One of those changes was relocating the amavis user’s home/runtime directory from /var/amavis/ to /var/lib/amavishome/. That’s when I saw it in the debugging output:

Apr 14 22:10:42.759 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_toks
Apr 14 22:10:42.760 [20701] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_seen

The directory for the Bayes database shouldn’t be /var/amavis/.spamassassin/bayes* any longer, but instead should be /var/lib/amavishome/.spamassassin/bayes*. I made that change:

# grep bayes_path /etc/mail/spamassassin/local.cf 
bayes_path /var/lib/amavishome/.spamassassin/bayes

and restarted both amavis and spamd, and now I could see the Bayes filter in the ‘X-Spam-Status’ header:

X-Spam-Status: Yes, score=18.483 required=3.2 tests=[BAYES_99=5.75,
	BAYES_999=8, FROM_SUSPICIOUS_NTLD=0.499,
	FROM_SUSPICIOUS_NTLD_FP=0.514, HTML_MESSAGE=0.001,
	HTML_OFF_PAGE=0.927, PDS_OTHER_BAD_TLD=1.999, RDNS_NONE=0.793]

After implementing the change for the Bayes database location in amavis, I have seen the false negative level drop back to where it used to be. 🙂
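
If you want to confirm that SpamAssassin can actually see the trained tokens at the new location, sa-learn can dump the database counters; a quick sketch, run as the user owning the Bayes files (amavis here) and pointing --dbpath at the same prefix used for bayes_path:

# the nspam/nham counts in the "magic" output should be well above the 200-message threshold
sudo -u amavis sa-learn --dbpath /var/lib/amavishome/.spamassassin/bayes --dump magic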

Cheers,
Zach

April 14 2020

py3status v3.28 – goodbye py2.6-3.4

Alexys Jacob (ultrabug) April 14, 2020, 14:51

The newest version of py3status starts to enforce the deprecation of Python 2.6 to 3.4 (included) initiated by Thiago Kenji Okada more than a year ago and orchestrated by Hugo van Kemenade via #1904 and #1896.

Thanks to Hugo, I discovered a nice tool by @asottile called pyupgrade, which updates your Python code base to recent syntax sugar!
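
If you want to try it on your own code base, a minimal sketch (the flag and file paths are only examples):

pip install pyupgrade
# rewrite legacy constructs to modern Python 3 syntax in place
pyupgrade --py3-plus mymodule/*.py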

Debian buster users might be interested in the installation war story that @TRS-80 kindly described and the final (and documented) solution found.

Changelog since v3.26

  • drop support for EOL Python 2.6-3.4 (#1896), by Hugo van Kemenade
  • i3status: support read_file module (#1909), by @lasers thx to @dohseven
  • clock module: add “locale” config parameter to change time representation (#1910), by inemajo
  • docs: update debian instructions fix #1916
  • mpd_status module: use currentsong command if possible (#1924), by girst
  • networkmanager module: allow using the currently active AP in formats (#1921), by Benoît Dardenne
  • volume_status module: change amixer flag ordering fix #1914 (#1920)

Thank you contributors

  • Thiago Kenji Okada
  • Hugo van Kemenade
  • Benoît Dardenne
  • @dohseven
  • @inemajo
  • @girst
  • @lasers

April 08 2020

Zstandard (zstd) Coming to >= gentoo-sources-5.6.4 (use=experimental)

Mike Pagano (mpagano) April 08, 2020, 18:15

I just added zstd to gentoo-sources; it will apply to gentoo-sources kernels >=5.6.4 when the ‘experimental’ USE flag is enabled.
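
For reference, enabling this on a Gentoo box boils down to setting the USE flag and pulling in a new enough kernel; a minimal sketch (the package.use file name is arbitrary):

echo "sys-kernel/gentoo-sources experimental" >> /etc/portage/package.use/kernel
emerge --ask --oneshot ">=sys-kernel/gentoo-sources-5.6.4"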

zstd is described here[1] as “…a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios. It’s backed by a very fast entropy stage, provided by Huff0 and FSE library.”

You can read more about it here[2].

Thanks to Klemen Mihevc for the request [3]

[1] https://github.com/facebook/zstd

[2] https://facebook.github.io/zstd/

[3] https://bugs.gentoo.org/716520

March 30 2020

Linux kernel 5.6.0 iwlwifi bug

Mike Pagano (mpagano) March 30, 2020, 13:27

Quick note that the Linux kernel 5.6.0 has an iwlwifi bug that will prevent network connectivity. [1]

A patch [2] is out but did not make 5.6.0. This patch IS included in gentoo-sources-5.6.0. It will be in a future vanilla-sources 5.6.x release once upstream releases a new version.
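
A quick way to check whether you might be affected, i.e. whether you are running a plain 5.6.0 kernel with the iwlwifi driver loaded:

uname -r               # 5.6.0 without the fix is affected
lsmod | grep iwlwifi   # Intel wireless hardware uses the iwlwifi module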

[1] https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.6-Broken-Intel-IWLWIFI

[2] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/patch/?id=be8c827f50a0bcd56361b31ada11dc0a3c2fd240

February 27 2020

Amavisd crashes immediately on start with little in the logs

Nathan Zachary (nathanzachary) February 27, 2020, 17:02

Recently I was updating amavisd and other portions of the mail stack on one of my mail servers (running Gentoo Linux). This particular set of updates didn’t go as smoothly as I would have liked, primarily due to the acct-user/amavis package being created on Thursday, 13 February 2020 as part of the ongoing effort to standardise user accounts used within applications. Other problems with the upgrade of amavisd (to version 2.12.0-r1) and the mail stack were that some dependencies were erroneously removed from the ebuild (bug 710842) and that (though not directly related) permissions within the new amavis user’s home directory were inadequate for ClamAV to run successfully. I will address these problems and their respective fixes below.

Missing dependencies in 2.12.0-r1 (bug 710842)

Some of the Perl modules (since amavisd-new is written in Perl) were erroneously removed from the dependency list in version 2.12.0-r1. The fix here was to simply reinstall the modules manually, which in my case entailed:

emerge -av MIME-tools Net-Server Mail-DKIM MailTools Net-LibIDN

However, my situation required one additional module that wasn’t mentioned in the bug, which I discovered when trying to start amavisd manually and hitting this error message:

# /usr/sbin/amavisd
Problem in Amavis::Unpackers code: Can't locate Archive/Zip.pm in @INC (you may need to install the Archive::Zip module) (@INC contains: /etc/perl /usr/local/lib64/perl5/5.30.1/x86_64-linux /usr/local/lib64/perl5/5.30.1 /usr/lib64/perl5/vendor_perl/5.30.1/x86_64-linux /usr/lib64/perl5/vendor_perl/5.30.1 /usr/local/lib64/perl5 /usr/lib64/perl5/vendor_perl/5.30.0/x86_64-linux /usr/lib64/perl5/vendor_perl/5.30.0 /usr/lib64/perl5/vendor_perl/5.28.0 /usr/lib64/perl5/vendor_perl/5.26.2 /usr/lib64/perl5/vendor_perl/5.26.1 /usr/lib64/perl5/vendor_perl/5.24.0 /usr/lib64/perl5/vendor_perl /usr/lib64/perl5/5.30.1/x86_64-linux /usr/lib64/perl5/5.30.1) at (eval 101) line 47.
BEGIN failed--compilation aborted at (eval 101) line 47.

That one was also simple enough to solve by installing dev-perl/Archive-Zip:

emerge -av dev-perl/Archive-Zip

Fortunately, this ‘missing dependencies’ problem was fixed with version 2.12.0-r2.
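
If you suspect that other Perl modules went missing as well, a quick way to check which ones can actually be loaded is to try importing each of them; a rough sketch (the module list is illustrative, adjust it to your setup):

for mod in Archive::Zip MIME::Tools Net::Server Mail::DKIM Net::LibIDN; do
    perl -M"$mod" -e1 2>/dev/null && echo "OK      $mod" || echo "MISSING $mod"
done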

Changes in the ‘amavis’ user’s home directory

With the addition of the acct-user/amavis package, a few things changed about the default setup for the ‘amavis’ user. When installing that package, Portage warns about some steps that need to be implemented in order to make amavis work again after migrating its home directory from /var/amavis/ to /var/lib/amavishome/. Even after following those steps, I saw the following error message when starting amavis:

Feb 27 00:08:31 [amavis] socket module IO::Socket::IP, protocol families available: INET, INET6
Feb 27 00:08:31 [amavis] will bind to /var/amavis/amavisd.sock|unix, 127.0.0.1:10024/tcp, [::1]:10024/tcp
Feb 27 00:08:31 [amavis] sd_notify (no socket): STATUS=Transferring control to Net::Server.
Feb 27 00:08:31 [amavis] sd_notify (no socket): STATUS=Preparing to bind sockets.
Feb 27 00:08:31 [amavis] Net::Server: 2020/02/27-00:08:31 Amavis (type Net::Server::PreForkSimple) starting! pid(1738)
Feb 27 00:08:31 [amavis] Net::Server: Binding to UNIX socket file "/var/amavis/amavisd.sock"
Feb 27 00:08:31 [amavis] (!)Net::Server: 2020/02/27-00:08:31 Can't connect to UNIX socket at file /var/amavis/amavisd.sock [No such file or directory]\n  at line 66 in file /usr/lib64/perl5/vendor_perl/5.30.1/Net/Server/Proto/UNIX.pm
Feb 27 00:08:31 [amavis] sd_notify (no socket): STOPPING=1\nSTATUS=Server rundown, notifying child processes.
Feb 27 00:08:31 [amavis] Net::Server: 2020/02/27-00:08:31 Server closing!
Feb 27 00:08:31 [amavis] sd_notify (no socket): STATUS=Child processes have been stopped.

In that error message, I noticed that there were still references to /var/amavis/ instead of the new /var/lib/amavishome/ directory, so I updated it using the $MYHOME variable in /etc/amavisd.conf:

# grep -e '^$MYHOME' /etc/amavisd.conf 
$MYHOME = '/var/lib/amavishome';   # a convenient default for other settings, -H

Thereafter, the start-up logs indicated that it was binding to a UNIX socket in the correct home directory:

Feb 27 00:12:15 [amavis] socket module IO::Socket::IP, protocol families available: INET, INET6
Feb 27 00:12:15 [amavis] will bind to /var/lib/amavishome/amavisd.sock|unix, 127.0.0.1:10024/tcp, [::1]:10024/tcp
Feb 27 00:12:15 [amavis] sd_notify (no socket): STATUS=Transferring control to Net::Server.
Feb 27 00:12:15 [amavis] sd_notify (no socket): STATUS=Preparing to bind sockets.
Feb 27 00:12:15 [amavis] Net::Server: 2020/02/27-00:12:15 Amavis (type Net::Server::PreForkSimple) starting! pid(1952)
Feb 27 00:12:15 [amavis] Net::Server: Binding to UNIX socket file "/var/lib/amavishome/amavisd.sock"
Feb 27 00:12:15 [amavis] Net::Server: Binding to TCP port 10024 on host 127.0.0.1 with IPv4
Feb 27 00:12:15 [amavis] Net::Server: Binding to TCP port 10024 on host ::1 with IPv6

ClamAV permissions within the amavis home directory

Though this error wasn’t directly related to the upgrades (and had likely existed for quite some time beforehand), I only just now noticed it whilst combing through the logs (the first error is from my system’s mail log, and the second error is from the clamd log):

Feb 27 00:15:10 [amavis] (01980-01) run_av (ClamAV-clamd) result: /var/lib/amavishome/tmp/amavis-20200227T001510-01980-zrKbp28h/parts: lstat() failed: Permission denied. ERROR\n
Feb 27 00:15:10 [amavis] (01980-01) (!)run_av (ClamAV-clamd) FAILED - unexpected , output="/var/lib/amavishome/tmp/amavis-20200227T001510-01980-zrKbp28h/parts: lstat() failed: Permission denied. ERROR\n"
Feb 27 00:15:10 [amavis] (01980-01) (!)ClamAV-clamd av-scanner FAILED: CODE(0x5611198fa5d8) unexpected , output="/var/lib/amavishome/tmp/amavis-20200227T001510-01980-zrKbp28h/parts: lstat() failed: Permission denied. ERROR\n" at (eval 85) line 951.
# grep 'lstat' /var/log/clamav/clamd.log
Wed Feb 26 23:26:53 2020 -> WARNING: lstat() failed on: /var/amavis/tmp/amavis-20200226T180007-00529-S1gbY8cd/parts
Wed Feb 26 23:28:32 2020 -> WARNING: lstat() failed on: /var/amavis/tmp/amavis-20200226T180038-00592-FK2_Uj2T/parts
Wed Feb 26 23:31:30 2020 -> WARNING: lstat() failed on: /var/amavis/tmp/amavis-20200226T165427-32346-V4WeP0YX/parts
Thu Feb 27 00:15:10 2020 -> WARNING: lstat() failed on: /var/lib/amavishome/tmp/amavis-20200227T001510-01980-zrKbp28h/parts
Thu Feb 27 00:26:21 2020 -> WARNING: lstat() failed on: /var/lib/amavishome/tmp/amavis-20200227T002621-01981-l36mWT4P/parts
Thu Feb 27 00:26:30 2020 -> WARNING: lstat() failed on: /var/lib/amavishome/tmp/amavis-20200227T002630-01982-k0qgJdjl/parts
Thu Feb 27 00:29:12 2020 -> WARNING: lstat() failed on: /var/lib/amavishome/tmp/amavis-20200227T002912-01983-G57aKCmK/parts
Thu Feb 27 00:31:55 2020 -> WARNING: lstat() failed on: /var/lib/amavishome/tmp/amavis-20200227T003155-01984-M9r9r1Gc/parts
Thu Feb 27 00:33:07 2020 -> WARNING: lstat() failed on: /var/lib/amavishome/tmp/amavis-20200227T003307-01985-8n_wS6pQ/parts
Thu Feb 27 00:40:20 2020 -> WARNING: lstat() failed on: /var/lib/amavishome/tmp/amavis-20200227T004020-01986-gH4PrFAk/parts
Thu Feb 27 00:45:45 2020 -> WARNING: lstat() failed on: /var/lib/amavishome/tmp/amavis-20200227T004545-01987-J4YGuNbw/parts

These errors were fixed by 1) adding the ‘clamav’ user to the ‘amavis’ group, 2) setting g+w permissions on the /var/lib/amavishome/tmp/ directory, and 3) restarting clamd and amavisd:

# gpasswd -a clamav amavis
Adding user clamav to group amavis

# ls -alh /var/lib/amavishome/ | grep tmp
drwxr-xr-x 141 amavis amavis  12K Feb 27 00:46 tmp

# chmod 775 /var/lib/amavishome/tmp

# ls -alh /var/lib/amavishome/ | grep tmp
drwxrwxr-x 141 amavis amavis  12K Feb 27 00:46 tmp

# /etc/init.d/clamd restart && /etc/init.d/amavisd restart

Now ClamAV is readily able to access files under /var/lib/amavishome/tmp/:

Feb 27 01:00:12 [amavis] (03346-01) ClamAV-clamd: Connecting to socket  /var/run/clamav/clamd.sock
Feb 27 01:00:12 [amavis] (03346-01) new socket by IO::Socket::UNIX to /var/run/clamav/clamd.sock, timeout set to 10
Feb 27 01:00:12 [amavis] (03346-01) connected to /var/run/clamav/clamd.sock successfully
Feb 27 01:00:12 [amavis] (03346-01) ClamAV-clamd: Sending CONTSCAN /var/lib/amavishome/tmp/amavis-20200227T010012-03346-YjbUpYOk/parts\n to socket /var/run/clamav/clamd.sock
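
As a final sanity check, one can confirm that the group change took effect and that the clamav user can reach the amavis tmp directory; a quick sketch (assumes sudo is available and that the amavis home directory itself is group-accessible):

id clamav                                      # should now list the amavis group
sudo -u clamav ls -ld /var/lib/amavishome/tmp  # should succeed without "Permission denied"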

Hopefully, if you run into these errors, you will be able to take this information and apply it to your particular mail stack.

Cheers,

Zach

February 23 2020

Searx and Gentoo wiki search

Alice Ferrazzi (alicef) February 23, 2020, 15:00

Two years ago I started to get interested in self-hosting services. I started to move away from private services and towards self-hosting, mainly because private services kept disabling most of the features that I liked and I had no way to contribute or see how they worked.
That is what made me look into https://old.reddit.com/r/selfhosted/ and https://www.privacytools.io/. That is when I discovered searx; as the GitHub page says, searx is a "Privacy-respecting metasearch engine".
Like any self-hosted service, you can install it easily on your server or on your local computer.
For the installation instructions, go here or use the searx-docker project.
As I use self-hosted services also because I like to contribute back, after looking around a bit I decided to add a meta-engine to searx.
Specifically, a Gentoo wiki search meta-engine: pull request #1368.

Gentoo wiki search is usually enabled by default in the "it" tab :)

searx_gentoo_default

Gentoo wiki search can also be used through the searx shortcut system (similar to DuckDuckGo’s bangs, if you are familiar with those).

The Gentoo wiki search shortcut is !ge. For example,
https://search.stinpriza.org/?q=%21ge%20project%3Akernel&categories=none&language=en

will give you this:
searx_shortcut_ge
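
The same shortcut also works from the command line against an instance’s HTTP API; a rough sketch (format=json is only available when the instance enables that output format, so treat it as an assumption):

curl -s 'https://search.stinpriza.org/search?q=%21ge+project%3Akernel&format=json' | head -c 500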

To conclude, have fun with searx and Gentoo!

I also highly recommend having your own searx instance, but you can play with the public instances.

February 21 2020

Gentoo Python Guide

Michał Górny (mgorny) February 21, 2020, 10:35

Gentoo provides one of the best frameworks among operating systems for Python support in packages. This includes support for running multiple versions of Python (while most other distributions avoid going beyond simultaneous support for Python 2 and one version of Python 3), alternative implementations of Python, reliable tests, and deep QA checks. While we aim to keep things simple, this is not always possible.

At the same time, the available documentation is limited and not always up to date. Both the built-in eclass documentation and the Python project wiki page provide bits of documentation, but they are mostly in reference form and not very suitable for beginners or for people who do not actively follow the developments within the ecosystem. This results in suboptimal ebuilds, improper dependencies, and missing tests.

The Gentoo Python Guide aims to fill the gap by providing good, complete, by-topic (rather than reference-style) documentation for the ecosystem in Gentoo and the relevant eclasses. Combined with examples, it should help you write good ebuilds and solve common problems as simply as possible.

Gentoo Python Guide sources are available on GitHub. Suggestions and improvements are welcome.

February 10 2020

No more PYTHON_TARGETS in single-r1

Michał Górny (mgorny) February 10, 2020, 7:39

Since its inception in 2012, python-single-r1 has been haunting users with two sets of USE flags: PYTHON_TARGETS and PYTHON_SINGLE_TARGET. While this initially seemed a necessary part of the grand design, today I know we could have done better. Today this chimera is disappearing for real, and python-single-r1 is going to use PYTHON_SINGLE_TARGET flags only.

I would like to take this opportunity to explain why the eclass has been designed this way in the first place, and what has been done to change that.

Why PYTHON_SINGLE_TARGET?

Why did we need a second variable in the first place? After all, we could probably get away with using PYTHON_TARGETS everywhere, and adding an appropriate REQUIRED_USE constraint.

Back in the day, we established that for users’ convenience we need to default to enabling one version of Python 2 and one version of Python 3. If we enabled only one of them, the users would end up having to enable the other for a lot of packages. On the other hand, if we combined both by using PT for single-r1 packages, the users would have to disable the extra implementation for a lot of them. Neither option was good.

The primary purpose of PYTHON_SINGLE_TARGET was to provide a parallel sensible setting for those packages. It was not only to make the default work out of the box but also to let users change it in one step.

Today, with the demise of Python 2 and the effort to remove Python 2 from default PT, it may seem less important to keep the distinction. Nevertheless, a number of developers and at least some users keep multiple versions of Python in PT to test their packages. Having PST is still helpful to them.
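
For illustration, on a system of that era the two variables could look like this in /etc/portage/make.conf (the exact implementations listed are just an example):

PYTHON_TARGETS="python2_7 python3_6 python3_7"
PYTHON_SINGLE_TARGET="python3_7"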

Why additional PYTHON_TARGETS then?

PST is only half of the story. What I explained above does not justify having PYTHON_TARGETS on those packages as well, plus a REQUIRED_USE constraint to make them a superset of the enabled PST. Why did we need to have two flag sets then?

The answer is: PYTHON_USEDEP. The initial design goal was that both python-r1 eclasses would use the same approach to declaring USE dependencies between packages. This also meant that this variable had to work alike for dependencies that are multi-impl packages and for those that are single-r1 packages. In the end, this meant a gross hack.

Without getting into details, the currently available USE dependency syntax does not permit directly depending on PT flags based on PST-based conditions. This needs to be done using the more verbose expanded syntax:

pst2_7? ( foo[pt2_7] )
pst3_7? ( foo[pt3_7] )

While this was doable back in the day, it was not possible with the PYTHON_USEDEP-based approach. Hence, all single-r1 packages gained an additional set of flags merely to construct dependencies conveniently.

What is the problem with that?

I suppose some of you see the problem already. Nevertheless, let’s list them explicitly.

Firstly, enabling additional implementations is inconvenient. Whenever you need to do that, you need to add both PST and PT flags.

Secondly, the PT flags are entirely redundant and meaningless for the package in question. Whenever your value of PT changes, all single-r1 packages trigger rebuilds even if their PST value stays the same.

Thirdly, the PT flags overspecify dependencies. If your PT flags specify multiple implementations (which is normally the case), all dependencies will also have to be built for those interpreters even though PST requires only one of them.

The solution

The user-visible part of the solution is that PYTHON_TARGETS are disappearing from single-r1 packages. From now on, only PYTHON_SINGLE_TARGET will be necessary. Furthermore, PT enforcement on dependencies (if necessary) will be limited to the single implementation selected by PST rather than all of PT.

The developer-oriented part is that PYTHON_USEDEP is no longer valid in single-r1 packages. Instead, PYTHON_SINGLE_USEDEP is provided for dependencies on other single-r1 packages, and the PYTHON_MULTI_USEDEP placeholder is used for multi-impl packages. The former is available as a global variable, the latter only as a placeholder in python_gen_cond_dep (the name is a bit of a misnomer now, but I’ve decided not to introduce an additional function).

All existing uses have been converted, and the eclasses will now fail if someone tries to use the old logic. The conversion of existing ebuilds is rather simple:

  1. Replace all ${PYTHON_USEDEP}s with ${PYTHON_SINGLE_USEDEP} when the dep is single-r1, or with ${PYTHON_MULTI_USEDEP} otherwise.
  2. Wrap all dependencies containing ${PYTHON_MULTI_USEDEP} in a python_gen_cond_dep. Remember that the variable must be a literal placeholder, i.e. use single quotes.

An example of the new logic follows:

RDEPEND="
  dev-libs/libfoo[python,${PYTHON_SINGLE_USEDEP}]
  $(python_gen_cond_dep '
    dev-python/foo[${PYTHON_MULTI_USEDEP}]
    dev-python/bar[${PYTHON_MULTI_USEDEP}]
  ')
"

If you get the dependency type wrong, repoman/pkgcheck will complain about a bad dependency.

January 03 2020

FOSDEM 2020

Gentoo News (GentooNews) January 03, 2020, 0:00

It’s FOSDEM time again! Join us at Université libre de Bruxelles, Campus du Solbosch, in Brussels, Belgium. This year’s FOSDEM 2020 will be held on February 1st and 2nd.

Our developers will be happy to greet all open source enthusiasts at our Gentoo stand in building K where we will also celebrate 20 years compiling! Visit this year’s wiki page to see who’s coming.

FOSDEM logo

December 28 2019

Scylla Summit 2019

Alexys Jacob (ultrabug) December 28, 2019, 19:04

I’ve had the pleasure of attending and presenting at the Scylla Summit in San Francisco again, and the honor of being awarded the Most innovative use case of Scylla.

It was a great event, full of friendly people and passionate conversations. Peter did a great full write-up of it already so I wanted to share some of my notes instead…

This is a curated set of topics that I happened to question or discuss in depth, so this post is not meant to be taken as full coverage of the conference.

Scylla Manager version 2

The upcoming version of scylla-manager is dropping its dependency on an SSH setup, which will be replaced by an agent, most likely shipped as a separate package.

On the features side, I was a bit puzzled by the fact that ScyllaDB is advertising that its manager will provide a repair scheduling window so that you can control when it’s running or not.

Why did it strike me, you ask?

Because MongoDB does the same thing within its balancer process and I always thought of this as a patch to a feature that the database should be able to cope with by itself.

And that database-does-it-better-than-you motto is exactly one of the promises of Scylla, the boring database, so smart at handling workload impacts on performance that you shouldn’t have to start playing tricks to mitigate them… I don’t want this time window feature on scylla-manager to be a trojan horse heralding the demise of that promise!

Kubernetes

They were almost late to this one, but they are working hard to play well with the new toy of every tech company around the world. Helm charts are also being worked on!

The community-developed scylla operator by Yannis is now being worked on and backed by ScyllaDB. It can deploy a cluster and scale it up and down.

A few things to note:

  • it’s using a configmap to store the scylla config
  • no TLS support yet
  • no RBAC support yet
  • kubernetes networking incurs a lighter network performance hit than the one seen on Docker
  • use placement strategies to dedicate kubernetes nodes to scylla!

Change Data Capture

Oh boy this one was awaited… but it’s now coming soon!

I inquired about its performance impact, since every operation will be written to a table. Clearly my questioning was a bit alpha, since CDC is still being worked on.

I had the chance to discuss ideas with Kamil, Tzach and Dor: one of the things my colleague Julien asked for was the ability for CDC to generate an event when a tombstone is written, so we could actually know when a specific piece of data expired!

I want to stress a few other things too:

  • the default TTL on the CDC table is 24h
  • expect I/O impact (logical)
  • TTL tombstones can have a hidden disk space cost, and nobody was able to tell me whether the CDC table would be configured with a lower gc_grace_period than the default 10 days, so that’s something we need to keep in mind and check for
  • there was no plan to add user information that would allow us to know who actually did the operation, so that’s something I asked for because it could be used as a cheap and open source way to get auditing!

LightWeight Transactions

Another long-awaited feature is also coming, thanks to the amazing work and knowledge of Konstantin. We had a great conversation about the differences between the Paxos-based LWT implementation currently being worked on and a possible later Raft-based one.

So yes, the first LWT implementation will be using Paxos as a consensus algorithm. This will make the LWT feature very consistent, while being slower than what could be achieved using Raft. That’s why ScyllaDB has plans for another implementation that could be faster, with fewer data consistency guarantees.

User Defined Functions / Aggregations

This one is bringing the Lua language inside Scylla!

To be precise, it will be LuaJIT, as its footprint is low and Lua can be cooperative enough; still, the ScyllaDB people made sure to monitor its violations (when it should yield but does not) and to act strongly upon them.

I got into implementation details with Avi, this is what I noted:

  • lua function return type is not checked at creation but at execution, so expect runtime errors if your lua code is bad
  • since lua is lightweight, there’s no need to assign a core to lua execution
  • I found UDA examples, like top-k rows, to be very similar to the Map/Reduce logic
  • UDF will allow simpler token range full table scans thanks to syntax sugar
  • there will be memory limits applied to result sets from UDA, and they will be tunable

Text search

Dejan is the text search guy at ScyllaDB and the one who kindly implemented the LIKE feature we asked for, which will be released in the upcoming 3.2 version.

We discussed ideas and projected use cases to make sure that what’s going to be worked on will be used!

Redis API

I’ve always been frustrated about Redis because while I love the technology I never trusted its clustering and scaling capabilities.

What if you could scale your Redis like Scylla without giving up on performance? That’s what the implementation of the Redis API backed by Scylla will get us!

I’m desperately looking forward to seeing this happen!

December 24 2019

Handling PEP 517 (pyproject.toml) packages in Gentoo

Michał Górny (mgorny) December 24, 2019, 22:59

So far, the majority of Python packages have either used distutils, or a build system built upon it. Most frequently, this was setuptools. All those solutions provided a setup.py script with a semi-standard interface, and we were able to handle them reliably within distutils-r1.eclass. PEP 517 changed that.

Instead of a setup script, packages now only need to supply declarative project information in a pyproject.toml file (fun fact: a TOML parser is not even part of the Python stdlib yet). The build system used is specified as a combination of a package requirement and a backend object to use. The backends are expected to provide a very narrow API: it is limited to building wheel packages and source distribution tarballs.

The new build systems built around this concept are troublesome to Gentoo. They are more focused on being standalone package managers than build systems. They lack the APIs matching our needs. They have large dependency trees, including circular dependencies. Hence, we’ve decided to try an alternate route.

Instead of trying to tame the new build systems, or work around their deficiencies (e.g. by making them build wheel packages, then unpacking and repackaging them), we’ve explored the possibility of converting the pyproject.toml files into setup.py scripts. Since the new formats are declarative, this should not be that hard.

We found the poetry-setup project, which seemed to have a similar goal. However, it had already been discontinued at the time in favor of dephell. The latter project looked pretty powerful, but the name was pretty ominous: we did not need most of its functions, and it was hell to package.

Finally, I managed to dedicate some time to building an in-house solution instead. pyproject2setuppy is a small-ish (<100 SLOC) pyproject.toml-to-setuptools adapter which allows us to run flit- or poetry-based projects as if they used regular distutils. While it’s quite limited, it’s good enough to build and install the packages that we have needed to deal with so far.

The design is quite simple — it reads pyproject.toml and calls setuptools’ setup() function with the metadata it has read. As such, the package can even be used to provide a backwards-compatible setup.py script in other packages. In fact, this is how its own setup.py works — it carries a flit-compatible pyproject.toml and uses itself to install itself via setuptools.

dev-python/pyproject2setuppy is already packaged in Gentoo. I’ve sent eclass patches to easily integrate it into distutils-r1. Once they are merged, installing pyproject.toml packages should be as simple as adding the following declaration into ebuilds:

DISTUTILS_USE_SETUPTOOLS=pyproject.toml
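
For illustration, a complete ebuild using it could look roughly like this once the patches are merged (a sketch only; the package name, version and metadata are hypothetical):

# foo-1.0.ebuild (hypothetical example)
EAPI=7

DISTUTILS_USE_SETUPTOOLS=pyproject.toml
PYTHON_COMPAT=( python3_{6,7} )
inherit distutils-r1

DESCRIPTION="Example flit-based package"
HOMEPAGE="https://example.org/foo"
SRC_URI="mirror://pypi/${PN:0:1}/${PN}/${P}.tar.gz"

LICENSE="MIT"
SLOT="0"
KEYWORDS="~amd64"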

This should make things easier both for us (as it saves us from having to hurriedly add new build systems and their NIH dependencies) and for users who will not have to suffer from more circular dependencies in the Python world. It may also help some upstream projects to maintain backwards compatibility while migrating to new build systems.

December 19 2019

A distribution kernel for Gentoo

Michał Górny (mgorny) December 19, 2019, 12:32

The traditional Gentoo way of getting a kernel is to install the sources, and then configure and build one yourself. For those who didn’t want to go through the tedious process of configuring it manually, an alternative route of using genkernel was provided. However, neither of those variants was able to really provide the equivalent of kernels provided by binary distributions.

I manually configured the kernels for my private systems a long time ago. Today, I wouldn’t bother. In fact, I realized that for some time I have been really hesitant to even upgrade them because of the effort needed to update the configuration. The worst part is, whenever a new kernel does not boot, I have to ask myself: is it a real bug, or is it my fault for configuring it wrong?

I’m not alone in this. Recently Михаил Коляда (zlogene) talked to me about providing binary kernels for Gentoo. While I have not strictly implemented what he had in mind, he inspired me to start working on a distribution kernel. The goal was to create a kernel package that users can install to get a working kernel with minimal effort, and that would be upgraded automatically as part of regular @world upgrades.

Pros and cons of your own kernel

If I am to justify switching from the old tradition of custom kernels to a universal kernel package, I should start by discussing the reasons why you may want to configure a custom kernel in the first place.

In my opinion, the most important feature of a custom kernel is that you can fine-tune it to your hardware. You just have to build the drivers you need (or may need), and the features you care about. The modules for my last custom kernel occupied 44 MiB. The modules for the distribution kernel occupy 294 MiB. Such a difference in size also comes with a proportional increase in build time. This can be an important argument for people with low-end hardware. On the other hand, the distribution kernel permits building reusable binary packages that can save more computing power.

The traditional Gentoo argument is performance. However, these days I would be very careful arguing about that. I suppose you are able to reap benefits if you know how to configure your kernel towards a specific workload. But then — a misconfiguration can have the exact opposite effect. We must not forget that binary distributions are important players in the field — and the kernel must also be able to achieve good performance when not using a dedicated configuration.

At some point I worked on achieving a very fast startup. For this reason I switched to using LILO as the bootloader, and a kernel suitable for booting my system without an initramfs. A universal kernel naturally needs an initramfs, and is slower to boot.

The main counterargument is the effort. As mentioned earlier, I’ve personally grown tired of having to manually deal with my kernel. Do the potential gains mentioned outweigh the loss of human time on configuring and maintaining a custom kernel?

Creating a truly universal kernel

A distribution kernel makes sense only if it works on a wide range of systems. Furthermore, I didn’t forget the original idea of binary kernel packages. I didn’t want to merely write an ebuild that can install a working kernel anywhere; I wanted to create an ebuild that can be used to build a binary package that’s going to work on a wide range of setups — including not only different hardware but also different bootloaders and /boot layouts. A package that would work fine both for my ‘traditional’ LILO setup and a UEFI systemd-boot setup.

The first part of a distribution kernel is the right configuration. I wanted to use a well-tested configuration known to build kernels used on many systems, while at the same time minimizing the maintenance effort on our end. Reusing the configuration from a binary distro was the obvious solution. I went for using the config from Arch Linux’s kernel package with minimal changes (e.g. changing the default hostname to Gentoo).

The second part is an initramfs. Since we need to support a wide variety of setups, we can’t get away without it. To follow the configuration used, Dracut was the natural choice.

The third and hardest part is installing it all. Since I’ve already set a goal of reusing the same binary package on different filesystem layouts, the actual installation needed to be moved to the pkg_postinst phase. Our distribution kernel package installs the kernel into an interim location which is entirely setup-independent, rendering the binary packages setup-agnostic as well. The initramfs is created and installed into the final location along with the kernel in pkg_postinst.

Support for different install layouts is provided by reusing the installkernel tool, originally installed by debianutils. As part of the effort, it was extended with initramfs support and moved into a separate sys-kernel/installkernel-gentoo package. Furthermore, an alternative sys-kernel/installkernel-systemd-boot package was created to provide an out-of-the-box support for systemd-boot layout. If neither of those two work for you, you can easily create your own /usr/local/bin/installkernel that follows your own layout.
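
For instance, a custom layout hook could be sketched along these lines (an illustration only; it assumes the conventional installkernel argument order of kernel version, kernel image, System.map and an optional destination directory):

#!/bin/sh
# /usr/local/bin/installkernel: sketch for a custom /boot layout
ver=$1
image=$2
map=$3
dir=${4:-/boot}

install -m 0644 "${image}" "${dir}/vmlinuz-${ver}" || exit 1
install -m 0644 "${map}" "${dir}/System.map-${ver}" || exit 1
# add any bootloader update commands your layout requires here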

Summary

The experimental versions of the distribution kernel are packaged as sys-kernel/vanilla-kernel (as distinct from sys-kernel/vanilla-sources, which installs the sources). Besides providing the default zero-effort setup, the package supports using your own configuration via savedconfig (though there is no easy way to update it at the moment). It also provides a forced flag that can be used by expert users to disable the initramfs.
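
As a rough sketch of both modes (the savedconfig location follows the usual /etc/portage/savedconfig/<category>/<package> convention; the package.use file name is arbitrary):

$ emerge --ask sys-kernel/vanilla-kernel

# or, to reuse your own kernel configuration via savedconfig:
$ mkdir -p /etc/portage/savedconfig/sys-kernel
$ cp /usr/src/linux/.config /etc/portage/savedconfig/sys-kernel/vanilla-kernel
$ echo 'sys-kernel/vanilla-kernel savedconfig' >> /etc/portage/package.use/kernel
$ emerge --ask sys-kernel/vanilla-kernel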

The primary goal at the moment is to test the package and find bugs that could prevent our users from using it. In the future, we’re planning to extend it to other architectures, kernel variants (the Gentoo patch set in particular) and LTS versions. We’re also considering providing prebuilt binary packages — however, this will probably be part of a bigger effort to provide an official Gentoo binhost.

December 12 2019

A better ebuild workflow with pure git and pkgcheck

Michał Górny (mgorny) December 12, 2019, 20:04

Many developers today continue using repoman commit as their primary way of committing to Gentoo. While this tool was quite helpful, if not indispensable, in the times of CVS, today it’s a burden. The workflow of using a single serial tool to check your packages and commit to them is not very efficient. Not only does it waste your time and slow you down — it also discourages you from splitting your changes into more atomic commits.

Upon hearing the pkgcheck advocacy, many developers ask whether it can commit for you. It won’t do that; that’s not its purpose. Not only would it be a waste of time to implement that — it would actually make it a worse tool. With its parallel engine, pkgcheck really shines when dealing with multiple packages — forcing it to work on one package at a time is a waste of its potential.

Rather than perpetuating your bad old habits, you should learn how to use git and pkgcheck efficiently. This post aims to give you a few pieces of advice.

pkgcheck after committing

Repoman was built under the assumption that checks should be done prior to committing. That is understandable when you’re working on a ‘live’ repository such as the ones used by CVS or Subversion. However, in the case of VCSes with staged commits, such as Git, there is no real difference between checking before or after committing. The most efficient pkgcheck workflow is to check once all changes are committed and you are ready to push.

The most recent version of pkgcheck has a command just for that:

$ pkgcheck scan --commits

Yes, it’s that simple. It checks what you’ve committed compared to origin (note: you’ll need to have a correct origin remote), and runs scan on all those packages. Now, if you’re committing changes to multiple packages (which should be pretty common), the scan is run in parallel to utilize your CPU power better.

You might say: but repoman ensures that my commit message is neat these days! Guess what. The --commits option does exactly that — it raises warnings if your commit message is bad. Admittedly, it only checks the summary line at the moment, but that’s something that can (and will) be improved easily.

And I almost forgot the coolest thing of all: pkgcheck also reports if you accidentally remove the newest ebuild with stable keywords on a given arch!

One more tip. You can use the following option to include full live verification of URLs:

$ pkgcheck scan --net --commits

Again, this is a feature missing entirely from repoman.

pkgcommit to ease committing to ebuilds

While the majority of repoman’s VCS support is superficial or better implemented elsewhere, there’s one killer feature worth keeping: automatically prepending the package name to the summary line. Since that is a really trivial thing, I’ve reimplemented it in a few lines of bash as pkgcommit.

When run in a package directory, it opens an editor with a pre-filled commit message template to let you type the message in, then passes it along with its own arguments to git. Usually, I use it as follows (I like to be explicit about signoffs and signing; you can make .git/config take care of it):

$ pkgcommit -sS .

Its extra feature is that it processes the -m option and lets you skip the editor for simple messages:

$ pkgcommit -sS . -m 'Bump to 1.2.3'

Note that it does not go out of its way to figure out what to commit. You need to either stage changes yourself via git add, or pass appropriate paths to the command. What’s important is that it does not limit you to committing to one directory — you can e.g. include some profile changes easily.

You’ll also need the pkg script from the same repository. Or you can just install the whole app-portage/mgorny-dev-scripts bundle.

Amending commits via fixups

Most of you probably know that you can update commits via git commit --amend. However, that’s useful only for editing the most recent commit. You can also use an interactive rebase to choose specific commits for editing, and then amend them. Yet, usually there’s a much more convenient way of doing that.

In order to commit a fixup to a particular past commit, use:

$ git commit --fixup OLD_COMMIT_ID

This will create a specially titled commit that will be automatically picked up and ordered by the interactive rebase:

$ git rebase -i -S origin
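
One caveat: the automatic picking up and reordering of fixup! commits relies on git’s autosquash feature, so make sure it is enabled, either permanently or per invocation:

$ git config --global rebase.autoSquash true
$ git rebase -i --autosquash -S origin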

Again, I have a tool of greater convenience. Frequently, I just want to update the latest commit to a particular package (directory). git-fixup does exactly that — it finds the identifier of the latest commit to a particular file/directory (or the current directory when no parameter is given) and commits a fixup to that:

$ git-fixup .

Note that if you try to push fixups into the repository, nothing will stop you. This is one of the reasons why I don’t enable signoffs and signing on all commits by default: this way, if I forget to rebase my fixups, the git hook will reject them as lacking a signoff and/or signature.

Again, it is part of app-portage/mgorny-dev-scripts.

Interactive rebase to the rescue

When trivial tools are no longer sufficient, interactive rebase is probably one of the best tools for editing your commits. Start by initiating it for all commits since the last push:

$ git rebase -i -S origin

It will bring up your editor with a list of all the commits. Using this list, you can do a lot: reorder commits, drop them, reword their commit messages, use squash or fixup to merge them into other commits, and finally: edit them (open for amending).

The interactive rebase is probably the most powerful porcelain git command. I’ve personally found the immediate tips given by git good enough but I realize that many people find it hard nevertheless. Since it’s not my goal here to provide detailed instructions on using git, I’m going to suggest looking online for tutorials and guides. The Rewriting History section of the Git Book also has a few examples.

Before pushing: git log

git log seems to be one of the most underappreciated pre-push tools. However, it can be of great service to you. When run prior to pushing, it can help you verify that what you’re pushing is actually what you meant to push.

$ git log --stat

will list your commits along with a pretty summary of the affected files. This can help you notice that you’ve forgotten to git add a patch, that you’ve accidentally committed some extraneous change, or that you’ve mixed changes from two commits.
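
If you want to limit the listing to the commits that have not been pushed yet, give it an explicit range:

$ git log --stat origin..HEAD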

Of course, you can go even further and take a look at the changes in patch form:

$ git log -p

While I realize this is nothing new or surprising to you, sometimes it’s worthwhile to reiterate the basics in a different context to make you realize something obvious.

November 06 2019

Gentoo eclass design pitfalls

Michał Górny (mgorny) November 06, 2019, 7:57

I have written my share of eclasses, and I have made my share of mistakes. Designing good eclasses is a non-trivial problem, and there are many pitfalls you should be watching for. In this post, I would like to highlight three of them.

Not all metadata variables are combined

PMS provides a convenient feature for eclass writers: cumulative handling of metadata variables. Quoting the relevant passage:

The IUSE, REQUIRED_USE, DEPEND, BDEPEND, RDEPEND and PDEPEND variables are handled specially when set by an eclass. They must be accumulated across eclasses, appending the value set by each eclass to the resulting value after the previous one is loaded. Then the eclass-defined value is appended to that defined by the ebuild. […]

Package Manager Specification (30th April 2018), 10.2 Eclass-defined Metadata Keys

That’s really handy! However, the important thing that’s not obvious from this description is that not all metadata variables work this way. The following multi-value variables don’t: HOMEPAGE, SRC_URI, LICENSE, KEYWORDS, PROPERTIES and RESTRICT. Admittedly, some of them are not supposed to be set in eclasses, but the last two, for example, are confusing.

This means that technically you need to append when defining them, e.g.:

# my.eclass
RESTRICT+=" !test? ( test )"

However, that’s not the biggest problem. The real issue is that those variables are normally set in ebuilds after inherit, so you actually need to make sure that all ebuilds append to them. For example, the ebuild needs to do:

# my-1.ebuild
inherit my
RESTRICT+=" bindist"

Therefore, this design is prone to mistakes at ebuild level. I’m going to discuss an alternative solution below.

Declarative vs functional

It is common to use declarative style in eclasses — create a bunch of variables that ebuilds can use to control the eclass behavior. However, this style has two significant disadvantages.

Firstly, it is prone to typos. If someone recalls the variable name incorrectly, and its effects are not explicitly visible, it is very easy to commit an ebuild with a silly bug. If the effects are visible, it can still give you a quality debugging headache.

Secondly, in order to affect global scope, the variables need to be set before inherit. This is not trivially enforced, and it is easy to miss that the variable doesn’t work (or partially misbehaves) when set too late.

The alternative is to use functional style, especially for affecting global scope variables. Instead of immediately editing variables in global scope and expecting ebuilds to control the behavior via variables, give them a function to do it:

# my.eclass
my_enable_pytest() {
  IUSE+=" test"
  RESTRICT+=" !test? ( test )"
  BDEPEND+=" test? ( dev-python/pytest[${PYTHON_USEDEP}] )"
  python_test() {
    pytest -vv || die
  }
}

Note that this function is evaluated in ebuild context, so all variables need appending. Its main advantage is that it works independently of where in the ebuild it’s called (but if you call it early, remember to append afterwards!), and in case of a typo you get an explicit error. Example use in an ebuild:

# my-1.ebuild
inherit my
IUSE="randomstuff"
RDEPEND="randomstuff? ( dev-libs/random )"
my_enable_pytest

Think what phases to export

Exporting phase functions is often a matter of convenience. However, doing it poorly can cause ebuild writers more pain than if they weren’t exported in the first place. An example of this is vala.eclass as of today. It wrongly exports dysfunctional src_prepare(), and all ebuilds have to redefine it anyway.

It is often a good idea to consider how your eclass is going to be used. If there are both use cases for having the phases exported and for providing utility functions without any phases, it is probably a good idea to split the eclass in two: a -utils eclass that just provides the functions, and a main eclass that combines them with phase functions. Good examples today are the xdg and xdg-utils eclasses.

When you do need to export phases, it is worthwhile to consider how different eclasses are going to be combined. Generally, a few eclass types could be listed:

  • Unpacking (fetching) eclasses; e.g. git-r3 with src_unpack(),
  • Build system eclasses; e.g. cmake-utils, src_prepare() through src_install(),
  • Post-install eclasses; e.g. xdg, pkg_*inst(), pkg_*rm(),
  • Build environment setup eclasses; e.g. python-single-r1, pkg_setup().

Generally, it’s best to fit your eclass into as few of those as possible. If you do that, there’s a good chance that the ebuild author would be able to combine multiple eclasses easily:

# my-1.ebuild
PYTHON_COMPAT=( python3_7 )
inherit cmake-utils git-r3 python-single-r1

Note that since each of those eclasses uses a different phase function set to do its work, they combine just fine! The inherit order is also irrelevant. If we e.g. need to add llvm to the list, we just have to redefine pkg_setup().
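
For example, adding llvm could be sketched as follows (assuming the usual llvm.eclass and python-single-r1 pkg_setup helpers):

# my-1.ebuild
PYTHON_COMPAT=( python3_7 )
inherit cmake-utils git-r3 llvm python-single-r1

pkg_setup() {
  llvm_pkg_setup
  python-single-r1_pkg_setup
}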
