Last updated:
July 31, 2015, 13:05 UTC

Disclaimer:
Views expressed in the content published here do not necessarily represent the views of Gentoo Linux or the Gentoo Foundation.



Powered by:
Planet Venus

Welcome to Planet Gentoo, an aggregation of Gentoo-related weblog articles written by Gentoo developers. For a broader range of topics, you might be interested in Gentoo Universe.

July 23, 2015
Johannes Huber a.k.a. johu (homepage, bugs)
Tasty calamares in Gentoo (July 23, 2015, 20:36 UTC)

First of all it’s nothing to eat. So what is it then? This is the introduction by upstream:

Calamares is an installer framework. By design it is very customizable, in order to satisfy a wide variety of needs and use cases. Calamares aims to be easy, usable, beautiful, pragmatic, inclusive and distribution-agnostic. Calamares includes an advanced partitioning feature, with support for both manual and automated partitioning operations. It is the first installer with an automated “Replace Partition” option, which makes it easy to reuse a partition over and over for distribution testing. Got a Linux distribution but no system installer? Grab Calamares, mix and match any number of Calamares modules (or write your own in Python or C++), throw together some branding, package it up and you are ready to ship!

I have just added the newest release (1.1.2) to the tree, and a live version (9999) to my dev overlay. The underlying technology stack is mainly Qt5, KDE Frameworks, Python 3, YAML and systemd. It has been picked up, or is at least being evaluated, by several Linux distributions.

You may be asking why I have added it to Gentoo, where OpenRC is the default init system. You are right, at the moment it is not very useful for Gentoo. But Sabayon, for example, as one of our downstreams will (maybe) use it for its next releases, so in the first place it is just a service for our downstreams.

The second reason: there is a discussion on the gentoo-dev mailing list at the moment about rebooting the Gentoo installer. Instead of creating yet another installer implementation, we have two potential ways to pick Calamares up, which are not mutually exclusive:

1. Write modules to make it work with sysvinit aka OpenRC
2. Solve Bug #482702 – Provide alternative stage3 tarballs using sys-apps/systemd

Have fun!

[1] https://calamares.io/about/
[2] johu dev overlay
[3] gentoo-dev ml – Rebooting the Installer Project
[4] Bug #482702 – Provide alternative stage3 tarballs using sys-apps/systemd

July 22, 2015
Rafael Goncalves Martins a.k.a. rafaelmartins (homepage, bugs)
Hello, World! (July 22, 2015, 06:45 UTC)

Hi all,

I'm starting a new blog!

Actually, it is almost the same blog, but powered by a new "blogging engine", and I don't want to spend time migrating the old posts, which are mostly outdated by now.

The old content is archived here, if you need it for some crazy reason: http://old.rafaelmartins.eng.br/.

For Gentoo planet readers, everything should be working just fine. I created rewrite rules to keep the old atom feeds working.

I'll publish another blog post soon, talking about the "blogging engine" and my next plans.

Thanks.

July 20, 2015
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

Since updating to VMware Workstation 11 (from the Gentoo vmware overlay), I've experienced a lot of hangs of my KDE environment whenever a virtual machine was running. Basically my system became unusable, which is bad if your workflow depends on accessing both Linux and (gasp!) Windows 7 (as guest). I first suspected a dbus timeout (doing the "stopwatch test" for 25s waits), but it seems according to some reports that this might be caused by buggy behavior in kwin (4.11.21). Sadly I haven't been able to pinpoint a specific bug report.

Now, I'm not sure if the problem is really 100% fixed, but at least the lags are much smaller now, and here's how to do it (kudos to matthewls and vrenn):

  • Add to /etc/xorg.conf in the Device section
    Option "TripleBuffer" "True"
  • Create a file in /etc/profile.d with content
    __GL_YIELD="USLEEP"
    (yes that starts with a double underscore).
  • Log out, stop your display manager, restart it.
I'll leave it as an exercise to the reader to figure out what these settings do. (Feel free to explain it in a comment. :) ) No guarantees of any kind. If this kills kittens, you have been warned. Cheers.
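
The profile.d step above could be sketched in shell roughly as follows. This is only an illustration: the helper name write_gl_yield and the file name gl-yield.sh are made up, and the export keyword is an assumption added here so that programs started from the login shell actually inherit the variable.

```shell
# Hypothetical helper: write the environment snippet into a profile.d-style
# directory. Pass /etc/profile.d (as root) for the real thing.
write_gl_yield() {
    dir=$1
    # "export" is an assumption: a bare assignment would stay local to the
    # shell that sources the file instead of reaching its child processes.
    printf '%s\n' 'export __GL_YIELD="USLEEP"' > "$dir/gl-yield.sh"
}
```

Calling write_gl_yield /etc/profile.d and then logging in again should make the variable visible in new sessions; echo "$__GL_YIELD" is a quick way to check.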

July 18, 2015
Richard Freeman a.k.a. rich0 (homepage, bugs)
Running cron jobs as units automatically (July 18, 2015, 16:00 UTC)

I just added sys-process/systemd-cron to the Gentoo repository.  Until now I’ve been running it from my overlay and getting it into the tree was overdue.  I’ve found it to be an incredibly useful tool.

All it does is install a set of unit files and a crontab generator.  The unit files (best used by starting/enabling cron.target) will run jobs from /etc/cron.* at the appropriate times.  The generator can parse /etc/crontab and create timer units for every line dynamically.

Note that the default Gentoo install runs the /etc/cron.* jobs from /etc/crontab, so if you aren’t careful you might end up running them twice.  The simplest solutions to this are either to remove those lines from /etc/crontab, or to install systemd-cron using USE=etc-crontab-systemd, which will have the generator ignore /etc/crontab and instead look for /etc/crontab-systemd, where you can install jobs you’d like to run using systemd.

The generator works like you’d expect it to – if you edit the crontab file the units will automatically be created/destroyed dynamically.

One warning about timer units compared to cron jobs is that the jobs are run as services, which means that when the main process dies all its children will be killed.  If you have anything in /etc/cron.* which forks you’ll need to have the main script wait at the end.
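
A minimal sketch of what "wait at the end" can look like in a shell script dropped into /etc/cron.*. The function name run_workers and the two background workers are made up for illustration; a real job would start its own commands instead.

```shell
#!/bin/sh
# Hypothetical cron script body: start two background workers, then wait.
run_workers() {
    outdir=$1
    ( sleep 1; echo done > "$outdir/worker1" ) &
    ( sleep 1; echo done > "$outdir/worker2" ) &
    # Without this wait the script (the unit's main process) would return
    # right away, and the still-running children would be killed with it.
    wait
}
```

The plain wait builtin blocks until all background children of the shell have finished, so the main process outlives everything it forked.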

On the topic of race conditions, each cron.* directory and each /etc/crontab line will create a separate unit.  Those units will all run in parallel (to the extent that one is still running when the next starts), but within a cron.* directory the scripts will run in series.  That may be a bit different from some cron implementations which may limit the number of simultaneous jobs globally.

All the usual timer unit logic applies.  stdout goes to the journal, systemctl list-timers shows what is scheduled, etc.


Filed under: gentoo, linux, systemd

July 16, 2015

Description:
Libav is an open source set of tools for audio and video processing.

After talking with Luca Barbato, who is both a Gentoo and a Libav developer, I spent a bit of my time fuzzing libav, and in particular I fuzzed libavcodec through avplay.
I hit a crash and after I reported it to upstream, they confirmed the issue as a divide-by-zero.

The complete gdb output:

ago@willoughby $ gdb --args /usr/bin/avplay avplay.crash 
GNU gdb (Gentoo 7.7.1 p1) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/avplay...Reading symbols from /usr/lib64/debug//usr/bin/avplay.debug...done.
done.
(gdb) run
Starting program: /usr/bin/avplay avplay.crash
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
avplay version 11.3, Copyright (c) 2003-2014 the Libav developers
  built on Jun 19 2015 09:50:59 with gcc 4.8.4 (Gentoo 4.8.4 p1.6, pie-0.6.1)
[New Thread 0x7fffec4c7700 (LWP 7016)]
[New Thread 0x7fffeb166700 (LWP 7017)]
INFO: AddressSanitizer ignores mlock/mlockall/munlock/munlockall
[New Thread 0x7fffe9e28700 (LWP 7018)]
[h263 @ 0x60480000f680] Format detected only with low score of 25, misdetection possible!
[h263 @ 0x60440001f980] Syntax-based Arithmetic Coding (SAC) not supported
[h263 @ 0x60440001f980] Reference Picture Selection not supported
[h263 @ 0x60440001f980] Independent Segment Decoding not supported
[h263 @ 0x60440001f980] header damaged

Program received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7fffe9e28700 (LWP 7018)]
0x00007ffff21e3313 in ff_h263_decode_mba (s=s@entry=0x60720005a100) at /tmp/portage/media-video/libav-11.3/work/libav-11.3/libavcodec/ituh263dec.c:142
142     /tmp/portage/media-video/libav-11.3/work/libav-11.3/libavcodec/ituh263dec.c: No such file or directory.
(gdb) bt
#0  0x00007ffff21e3313 in ff_h263_decode_mba (s=s@entry=0x60720005a100) at /tmp/portage/media-video/libav-11.3/work/libav-11.3/libavcodec/ituh263dec.c:142
#1  0x00007ffff21f3c2d in ff_h263_decode_picture_header (s=0x60720005a100) at /tmp/portage/media-video/libav-11.3/work/libav-11.3/libavcodec/ituh263dec.c:1112
#2  0x00007ffff1ae16ed in ff_h263_decode_frame (avctx=0x60440001f980, data=0x60380002f480, got_frame=0x7fffe9e272f0, avpkt=) at /tmp/portage/media-video/libav-11.3/work/libav-11.3/libavcodec/h263dec.c:444
#3  0x00007ffff2cd963e in avcodec_decode_video2 (avctx=0x60440001f980, picture=0x60380002f480, got_picture_ptr=got_picture_ptr@entry=0x7fffe9e272f0, avpkt=avpkt@entry=0x7fffe9e273b0) at /tmp/portage/media-video/libav-11.3/work/libav-11.3/libavcodec/utils.c:1600
#4  0x00007ffff44d4fb4 in try_decode_frame (st=st@entry=0x60340002fb00, avpkt=avpkt@entry=0x601c00037b00, options=) at /tmp/portage/media-video/libav-11.3/work/libav-11.3/libavformat/utils.c:1910
#5  0x00007ffff44ebd89 in avformat_find_stream_info (ic=0x60480000f680, options=0x600a00009e80) at /tmp/portage/media-video/libav-11.3/work/libav-11.3/libavformat/utils.c:2276
#6  0x0000000000431834 in decode_thread (arg=0x7ffff7e0b800) at /tmp/portage/media-video/libav-11.3/work/libav-11.3/avplay.c:2268
#7  0x00007ffff0284b08 in ?? () from /usr/lib64/libSDL-1.2.so.0
#8  0x00007ffff02b4be9 in ?? () from /usr/lib64/libSDL-1.2.so.0
#9  0x00007ffff4e65aa8 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.4/libasan.so.0
#10 0x00007ffff0062204 in start_thread () from /lib64/libpthread.so.0
#11 0x00007fffefda957d in clone () from /lib64/libc.so.6
(gdb)

Affected version:
11.3 (and maybe past versions)

Fixed version:
11.5 and 12.0

Commit fix:
https://git.libav.org/?p=libav.git;a=commitdiff;h=0a49a62f998747cfa564d98d36a459fe70d3299b;hp=6f4cd33efb5a9ec75db1677d5f7846c60337129f

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2015-5479

Timeline:
2015-06-21: bug discovered
2015-06-22: bug reported privately to upstream
2015-06-30: upstream committed the fix
2015-07-14: CVE assigned
2015-07-16: advisory release

Note:
This bug was found with American Fuzzy Lop.
This bug does not affect ffmpeg.

Permalink:
http://blogs.gentoo.org/ago/2015/07/16/libav-divide-by-zero-in-ff_h263_decode_mba

July 14, 2015
siege: off-by-one in load_conf() (July 14, 2015, 19:04 UTC)

Description:
Siege is an http load testing and benchmarking utility.

During the test of a webserver, I hit a segmentation fault. I recompiled siege with ASan and it clearly showed an off-by-one in load_conf(). The issue is reproducible without passing any arguments to the binary.
The complete output:

ago@willoughby ~ # siege
=================================================================
==488==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000d7f1 at pc 0x00000051ab64 bp 0x7ffcc3d19a70 sp 0x7ffcc3d19a68
READ of size 1 at 0x60200000d7f1 thread T0
#0 0x51ab63 in load_conf /var/tmp/portage/app-benchmarks/siege-3.1.0/work/siege-3.1.0/src/init.c:263:12
#1 0x515486 in init_config /var/tmp/portage/app-benchmarks/siege-3.1.0/work/siege-3.1.0/src/init.c:96:7
#2 0x5217b9 in main /var/tmp/portage/app-benchmarks/siege-3.1.0/work/siege-3.1.0/src/main.c:324:7
#3 0x7fb2b1b93aa4 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.20-r2/work/glibc-2.20/csu/libc-start.c:289
#4 0x439426 in _start (/usr/bin/siege+0x439426)

0x60200000d7f1 is located 0 bytes to the right of 1-byte region [0x60200000d7f0,0x60200000d7f1)
allocated by thread T0 here:
#0 0x4c03e2 in __interceptor_malloc /var/tmp/portage/sys-devel/llvm-3.6.1/work/llvm-3.6.1.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:40:3
#1 0x7fb2b1bf31e9 in __strdup /var/tmp/portage/sys-libs/glibc-2.20-r2/work/glibc-2.20/string/strdup.c:42

SUMMARY: AddressSanitizer: heap-buffer-overflow /var/tmp/portage/app-benchmarks/siege-3.1.0/work/siege-3.1.0/src/init.c:263 load_conf
Shadow bytes around the buggy address:
0x0c047fff9aa0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9ab0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9ac0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9ad0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9ae0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c047fff9af0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa[01]fa
0x0c047fff9b00: fa fa 03 fa fa fa fd fd fa fa fd fa fa fa fd fd
0x0c047fff9b10: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fd
0x0c047fff9b20: fa fa fd fd fa fa fd fa fa fa fd fa fa fa fd fa
0x0c047fff9b30: fa fa fd fa fa fa fd fd fa fa fd fa fa fa fd fa
0x0c047fff9b40: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==488==ABORTING

Affected version:
3.1.0 (and maybe past versions).

Fixed version:
Not available.

Commit fix:
Not available.

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
Not assigned.

Timeline:
2015-06-09: bug discovered
2015-06-10: bug reported privately to upstream
2015-07-13: no upstream response
2015-07-14: advisory release

Permalink:
http://blogs.gentoo.org/ago/2015/07/14/siege-off-by-one-in-load_conf

July 10, 2015
Johannes Huber a.k.a. johu (homepage, bugs)
Plasma 5 and kdbus testing (July 10, 2015, 22:03 UTC)

Thanks to Mike Pagano, who enabled kdbus support in the Gentoo kernel sources almost two weeks ago, we now have the choice to test it. As described in Mike's blog post, you will need to enable the USE flags kdbus and experimental on sys-kernel/gentoo-sources and kdbus on sys-apps/systemd.

root # echo "sys-kernel/gentoo-sources kdbus experimental" >> /etc/portage/package.use/kdbus

If you are running >=sys-apps/systemd-221, kdbus is already enabled by default; otherwise you have to enable it.

root # echo "sys-apps/systemd kdbus" >> /etc/portage/package.use/kdbus

Any packages affected by the change need to be rebuilt.

root # emerge -avuND @world

Enable the kdbus option in the kernel.

General setup --->
<*> kdbus interprocess communication

Build the kernel, install it and reboot. Now we can check if kdbus is enabled properly. systemd should automatically mask dbus.service and start systemd-bus-proxyd.service instead (Thanks to eliasp for the info).

root # systemctl status dbus
● dbus.service
Loaded: masked (/dev/null)
Active: inactive (dead)



root # systemctl status systemd-bus-proxyd
● systemd-bus-proxyd.service - Legacy D-Bus Protocol Compatibility Daemon
Loaded: loaded (/usr/lib64/systemd/system/systemd-bus-proxyd.service; static; vendor preset: enabled)
Active: active (running) since Fr 2015-07-10 22:42:16 CEST; 16min ago
Main PID: 317 (systemd-bus-pro)
CGroup: /system.slice/systemd-bus-proxyd.service
└─317 /usr/lib/systemd/systemd-bus-proxyd --address=kernel:path=/sys/fs/kdbus/0-system/bus

Plasma 5 starts fine here using sddm as login manager. On Plasma 4 you may be interested in Bug #553460.

Looking forward to Plasma 5 getting user session support.

Have fun!

Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

For quite some time, I have tried to get links in Thunderbird to open automatically in Chrome or Chromium instead of defaulting to Firefox. Moreover, I have Chromium start in incognito mode by default, and I would like those links to do the same. This has been a problem for me since I don’t use a full desktop environment like KDE, GNOME, or even XFCE. As I’m really a minimalist, I only have my window manager (which is Openbox), and the applications that I use on a regular basis.

One thing I found, though, is that by using PCManFM as my file manager, I do have a few other related applications and utilities that help me customise my workspace and workflows. One such application is libfm-pref-apps, which allows for setting preferred applications. I found that I could do just what I wanted to do without mucking around with manually setting MIME types, writing custom hooks for Thunderbird, or any of that other mess.

Here’s how it was done:

  1. Execute /usr/bin/libfm-pref-apps from your terminal emulator of choice
  2. Under “Web Browser,” select “Customise” from the drop-down menu
  3. Select the “Custom Command Line” tab
  4. In the “Command line to execute” box, type /usr/bin/chromium --incognito --start-maximized %U
  5. In the “Application name” box, type “Chromium incognito” (or however else you would like to identify the application)

Voilà! After restarting Thunderbird, my links opened just like I wanted them to. The only modification that you might need to make is the “Command line to execute” portion. If you use the binary of Chrome instead of building the open-source Chromium browser, you would need to change it to the appropriate executable (and the path may be different for you, depending on your system and distribution). Also, in the command line that I have above, here are some notes about the switches used:

  • --incognito starts Chromium in incognito mode by default (that one should be obvious)
  • --start-maximized makes the browser window open in the full size of your screen
  • %U allows Chromium to accept a URL or list of URLs, and thus, opens the link that you clicked in Thunderbird

Under the hood, it seems like libfm-pref-apps is adding some associations in the ~/.config/mimeapps.list file. The relevant lines that I found were:

[Added Associations]
x-scheme-handler/http=userapp-chromium --incognito --start-maximized-8KZNYX.desktop;
x-scheme-handler/https=userapp-chromium --incognito --start-maximized-8KZNYX.desktop;
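
To check what ended up in that file, something like the following could be used. It is only a sketch: show_url_handlers is a made-up helper name, and it simply greps for the two scheme-handler keys shown above.

```shell
# Hypothetical helper: print the http/https handler lines from a
# mimeapps.list-style file given as the first argument.
show_url_handlers() {
    grep -E '^x-scheme-handler/https?=' "$1"
}
```

For example, show_url_handlers ~/.config/mimeapps.list should print the two lines quoted above if the association was written.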

Hope this information helps you get your links to open in your browser of choice (and with the command-line arguments that you want)!

Cheers,
Zach

July 09, 2015
Alexys Jacob a.k.a. ultrabug (homepage, bugs)

In our previous attempt to upgrade our production cluster to MongoDB 3.0, we had to roll back from the WiredTiger engine on primary servers.

Since then, we have switched our whole cluster back to 3.0 with MMAPv1, which has brought us somewhat better performance than 2.6, with no instability.

Production checklist

We decided to use this increase in performance to buy us some time to fulfil the entire production checklist from MongoDB, especially the migration to XFS. We’re slowly upgrading our servers’ kernels and resynchronising our data set after migrating from ext4 to XFS.

Ironically, the strong recommendation of XFS in the production checklist appeared 3 days after our failed attempt at WiredTiger… This is frustrating but gives some kind of hope.

I’ll keep on posting on our next steps and results.

Our hero WiredTiger Replica Set

While we were battling with our production cluster, we got a spontaneous major increase in the daily volumes from another platform which was running on a single Replica Set. This application is write intensive and very disk I/O bound. We were killing the disk I/O with almost a continuous 100% usage on the disk write queue.

Despite our frustration with WiredTiger so far, we decided to give it a chance considering that this time we were talking about a single Replica Set. We were very happy to see WiredTiger live up to its promises with an almost shocking serenity.

Disk I/O went down dramatically, almost as if nothing was happening any more. Compression did magic on our disk usage and our application went Roarrr !

July 06, 2015
Zack Medico a.k.a. zmedico (homepage, bugs)

I’ve created a utility called tardelta (ebuild available) that people using containers may be interested in. Here’s the README:

It is possible to optimize docker containers such that multiple containers are based off of a single copy of a common base image. If containers are constructed from tarballs, then it can be useful to create a delta tarball which contains the differences between a base image and a derived image. The delta tarball can then be layered on top of the base image using a Dockerfile like the following:

FROM base
ADD delta.tar.xz /

Many different types of containers can thus be derived from a common base image, while sharing a single copy of the base image. This saves disk space, and can also reduce memory consumption since it avoids having duplicate copies of base image data in the kernel’s buffer cache.
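
The idea of a delta tarball can be illustrated with a rough shell sketch. This is not how tardelta itself works and not its command line; make_delta is a made-up function that compares two already-extracted directory trees, and it ignores deleted files, symlinks and metadata-only changes. It assumes GNU tar.

```shell
# Illustration only: pack the regular files that are new or changed in
# "derived" (relative to "base") into an uncompressed tarball.
make_delta() {
    base=$(cd "$1" && pwd)       # absolute paths, since we cd below
    derived=$(cd "$2" && pwd)
    out=$3                       # pass an absolute path for the output file
    list=$(mktemp)
    (
        cd "$derived" || exit 1
        find . -type f | while IFS= read -r f; do
            # cmp -s is silent; non-zero means "differs" or "missing in base"
            cmp -s "$f" "$base/$f" 2>/dev/null || printf '%s\n' "$f"
        done
    ) > "$list"
    # Member names in the list are resolved relative to the derived tree
    tar -cf "$out" -C "$derived" -T "$list"
    rm -f "$list"
}
```

The resulting archive could then be compressed and added on top of the base image in the way the Dockerfile above shows.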

July 03, 2015
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)
KDEPIM without Akonadi (July 03, 2015, 14:16 UTC)

As you know, Gentoo is all about flexibility. You can run bleeding edge code (portage, our package manager, even provides you with installation from git master KF5 and friends) or you can focus on stability and trusted code. This is why we've been offering our users for the last few years KDEPIM 4.4.11.1 (the version where KMail e-mail storage was not integrated with Akonadi yet, also known as KMail1) as a drop-in replacement for the newer versions.
Recently the Nepomuk search framework has been replaced by Baloo, and after some discussion we decided that for the Nepomuk-related packages it's now time to go. Problem is, the old KDEPIM packages still depend on it via their Akonadi version. This is why, for those of our users who prefer to run KDEPIM 4.4 / KMail1, we've decided to switch to Pali Rohár's kdepim-noakonadi fork (see also his 2013 blog post and the code). The packages are right now in the KDE overlay, but will move to the main tree after a few days of testing and be treated as an update of KDEPIM 4.4.11.1.
The fork is essentially KDEPIM 4.4 including some additional bugfixes from the KDE/4.4 git branch, with KAddressbook patched back to KDEPIM 4.3 state and references to Akonadi removed elsewhere. This is in some ways a functionality regression, since the integration of e.g. different calendar types is lost; however, in that version it never really worked perfectly anyway.

For now, you will still need the akonadi-server package, since kdepimlibs (outside kdepim and now at version 4.14.9) requires it to build, but you'll never need to start the Akonadi server. As a consequence, Nepomuk support can be disabled everywhere, and the Nepomuk core and client and Akonadi client packages can be removed by the package manager (--depclean; make sure to first globally disable the nepomuk USE flag and rebuild accordingly).

You might ask "Why are you still doing this?"... well. I've been told Akonadi and Baloo are working very nicely, and again I've considered upgrading all my installations... but then on my work desktop, where I am using the newest and greatest KDE4PIM, bug 338658 pops up regularly and stops syncing of important folders. I just don't have the time to pointlessly dig deep into the Akonadi database every few days. So KMail1 it is, and I'd rather spend some time occasionally picking and backporting bugfixes.

June 26, 2015
Mike Pagano a.k.a. mpagano (homepage, bugs)
kdbus in gentoo-sources (June 26, 2015, 23:35 UTC)

 

Keeping with the theme of ‘Gentoo is about choice’, I’ve added the ability for users to include kdbus in their gentoo-sources kernel.  I wanted an easy way for Gentoo users to test the patchset while maintaining the default installation of not having it at all.

In order to include the patchset on your gentoo-sources you’ll need the following:

1. A kernel version >= 4.1.0-r1

2. the ‘experimental’ use flag

3. the ‘kdbus’ use flag

I am not a systemd user, but from the ebuild it looks like if you build systemd with the ‘kdbus’ use flag it will use it.

Please send all kdbus bugs upstream by emailing the developers and including linux-kernel@vger.kernel.org in the CC.

Read as much as you can about kdbus before you decide to build it into your kernel.  There have been security concerns mentioned (warranted or not), so following the upstream patch review at lkml.org would probably be prudent.

When a new version is released, wait a week before opening a bug.  Unless I am on vacation, I will most likely have it included before the week is out. Thanks!

NOTE: This is not some kind of Gentoo endorsement of kdbus.  Nor is it a Mike Pagano endorsement of kdbus.  This is no different than some of the other optional and experimental patches we carry.  I do all the genpatches work, which includes the patches, the ebuilds and the bugs; since I don’t mind the extra work of keeping this up to date, I can’t see any reason not to include it as an option.

 

 

June 25, 2015
Johannes Huber a.k.a. johu (homepage, bugs)
KDE Plasma 5.3.1 testing (June 25, 2015, 23:15 UTC)

After several months of packaging in the kde overlay and almost a month in the tree, we have lifted the mask for KDE Plasma 5.3.1 today. If you want to test it, here is some information on how to get it.

For easy transition we provide two new profiles, one for OpenRC and the other for systemd.

root # eselect profile list
...
[8] default/linux/amd64/13.0/desktop/plasma
[9] default/linux/amd64/13.0/desktop/plasma/systemd
...

The following example activates the Plasma systemd profile:

root # eselect profile set 9

On stable systems you need to unmask the qt5 use flag:

root # echo "-qt5" >> /etc/portage/profile/use.stable.mask

Any packages affected by the profile change need to be rebuilt:

root # emerge -avuND @world

Stable users also need to keyword the required packages. You can let portage handle it with its autokeyword feature, or just grab the keyword files for KDE Frameworks 5.11 and KDE Plasma 5.3.1 from the kde overlay.

Now just install it (this is full Plasma 5, the basic desktop would be kde-plasma/plasma-desktop):

root # emerge -av kde-plasma/plasma-meta

KDM is not supported for Plasma 5 anymore, so if you have installed it kill it with fire. Possible and tested login managers are SDDM and LightDM.

For detailed instructions read the full upgrade guide. Packaging bugs can be filed at bugs.gentoo.org, and bugs in the software itself at bugs.kde.org.

Have fun,
the Gentoo KDE Team

June 23, 2015
Hanno Böck a.k.a. hanno (homepage, bugs)

tl;dr Most servers running a multi-user webhosting setup with Apache HTTPD probably have a security problem. Unless you're using Grsecurity there is no easy fix.

I am part of a small webhosting business that I have been running as a side project for quite a while. We offer customers user accounts on our servers running Gentoo Linux and webspace with the typical Apache/PHP/MySQL combination. We recently became aware of a security problem regarding symlinks. I wanted to share this, because I was appalled by the fact that there was no obvious solution.

Apache has an option FollowSymLinks which basically does what it says. If a symlink in a webroot is accessed the webserver will follow it. In a multi-user setup this is a security problem. Here's why: If I know that another user on the same system is running a typical web application - let's say Wordpress - I can create a symlink to his config file (for Wordpress that's wp-config.php). I can't see this file with my own user account. But the webserver can see it, so I can access it with the browser over my own webpage. As I'm usually allowed to disable PHP I'm able to prevent the server from interpreting the file, so I can read the other user's database credentials. The webserver needs to be able to see all files, therefore this works. While PHP and CGI scripts usually run with user's rights (at least if the server is properly configured) the files are still read by the webserver. For this to work I need to guess the path and name of the file I want to read, but that's often trivial. In our case we have default paths in the form /home/[username]/websites/[hostname]/htdocs where webpages are located.

So the obvious solution one might think about is to disable the FollowSymLinks option and forbid users to set it themselves. However symlinks in web applications are pretty common and many will break if you do that. It's not feasible for a common webhosting server.

Apache supports another Option called SymLinksIfOwnerMatch. It's also pretty self-explanatory, it will only follow symlinks if they belong to the same user. That sounds like it solves our problem. However there are two catches: First of all the Apache documentation itself says that "this option should not be considered a security restriction". It is still vulnerable to race conditions.

But even leaving the race condition aside it doesn't really work. Web applications using symlinks will usually try to set FollowSymLinks in their .htaccess file. An example is Drupal, which by default comes with such an .htaccess file. If you forbid users to set FollowSymLinks then the option won't just be ignored; the whole webpage won't run and will just return an error 500. What you could do is change the FollowSymLinks option in the .htaccess manually to SymLinksIfOwnerMatch. This may be feasible in some cases, but if you have a lot of users you don't want to explain to all of them that in order to install some common web application they have to manually edit some file they don't understand. (There's a bug report for Drupal asking to change FollowSymLinks to SymLinksIfOwnerMatch, but it's been ignored for several years.)

So using SymLinksIfOwnerMatch is neither secure nor really feasible. The documentation for cPanel discusses several possible solutions. The recommended solutions require proprietary modules. None of the proposed fixes work with a plain Apache setup, which I think is a pretty dismal situation. The most common web server has a severe security weakness in a very common situation and no usable solution for it.

The one solution that we chose is a feature of Grsecurity. Grsecurity is a Linux kernel patch that greatly enhances security and we've been very happy with it in the past. There are a lot of reasons to use this patch; I'm often impressed that local root exploits very often don't work on a Grsecurity system.

Grsecurity has an option like SymLinksIfOwnerMatch (CONFIG_GRKERNSEC_SYMLINKOWN) that operates on the kernel level. You can define a certain user group (which in our case is the "apache" group) for which this option will be enabled. For us this was the best solution, as it required very little change.

I haven't checked this, but I'm pretty sure that we were not alone with this problem. I'd guess that a lot of shared web hosting companies are vulnerable to this problem.

Here's the German blog post on our webpage and here's the original blogpost from an administrator at Uberspace (also German) which made us aware of this issue.

June 21, 2015
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)
Perl 5.22 testers needed! (June 21, 2015, 08:14 UTC)

Gentoo users rejoice: for a few days now, Perl 5.22.0 has been packaged in the main tree. Since we don't know yet how much stuff will break because of the update, it is masked for now. That means we need daring testers (preferably running ~arch systems; stable is also fine but may need more work on your part to get things running) who unmask the new Perl, upgrade, and file bugs if needed!!!
Here's what you need in /etc/portage/package.unmask (and possibly package.accept_keywords) to get started; please always use the full block, since partial unmasking will lead to chaos. We're looking forward to your feedback!
# Perl 5.22.0 mask / unmask block
=dev-lang/perl-5.22.0
=virtual/perl-Archive-Tar-2.40.0
=virtual/perl-Attribute-Handlers-0.970.0
=virtual/perl-B-Debug-1.230.0
=virtual/perl-CPAN-2.110.0
=virtual/perl-CPAN-Meta-2.150.1
=virtual/perl-CPAN-Meta-Requirements-2.132.0
=virtual/perl-Carp-1.360.0
=virtual/perl-Compress-Raw-Bzip2-2.68.0
=virtual/perl-Compress-Raw-Zlib-2.68.0
=virtual/perl-DB_File-1.835.0
=virtual/perl-Data-Dumper-2.158.0
=virtual/perl-Devel-PPPort-3.310.0
=virtual/perl-Digest-MD5-2.540.0
=virtual/perl-Digest-SHA-5.950.0
=virtual/perl-Exporter-5.720.0
=virtual/perl-ExtUtils-CBuilder-0.280.221
=virtual/perl-ExtUtils-Command-1.200.0
=virtual/perl-ExtUtils-Install-2.40.0
=virtual/perl-ExtUtils-MakeMaker-7.40.100_rc
=virtual/perl-ExtUtils-ParseXS-3.280.0
=virtual/perl-File-Spec-3.560.0
=virtual/perl-Filter-Simple-0.920.0
=virtual/perl-Getopt-Long-2.450.0
=virtual/perl-HTTP-Tiny-0.54.0
=virtual/perl-IO-1.350.0
=virtual/perl-IO-Compress-2.68.0
=virtual/perl-IO-Socket-IP-0.370.0
=virtual/perl-JSON-PP-2.273.0
=virtual/perl-Locale-Maketext-1.260.0
=virtual/perl-MIME-Base64-3.150.0
=virtual/perl-Math-BigInt-1.999.700
=virtual/perl-Math-BigRat-0.260.800
=virtual/perl-Module-Load-Conditional-0.640.0
=virtual/perl-Module-Metadata-1.0.26
=virtual/perl-Perl-OSType-1.8.0
=virtual/perl-Pod-Escapes-1.70.0
=virtual/perl-Pod-Parser-1.630.0
=virtual/perl-Pod-Simple-3.290.0
=virtual/perl-Safe-2.390.0
=virtual/perl-Scalar-List-Utils-1.410.0
=virtual/perl-Socket-2.18.0
=virtual/perl-Storable-2.530.0
=virtual/perl-Term-ANSIColor-4.30.0
=virtual/perl-Term-ReadLine-1.150.0
=virtual/perl-Test-Harness-3.350.0
=virtual/perl-Test-Simple-1.1.14
=virtual/perl-Text-Balanced-2.30.0
=virtual/perl-Text-ParseWords-3.300.0
=virtual/perl-Time-Piece-1.290.0
=virtual/perl-Unicode-Collate-1.120.0
=virtual/perl-XSLoader-0.200.0
=virtual/perl-autodie-2.260.0
=virtual/perl-bignum-0.390.0
=virtual/perl-if-0.60.400
=virtual/perl-libnet-3.50.0
=virtual/perl-parent-0.232.0
=virtual/perl-threads-2.10.0
=virtual/perl-threads-shared-1.480.0
=dev-perl/Test-Tester-0.114.0
=dev-perl/Test-use-ok-0.160.0

# end of the Perl 5.22.0 mask / unmask block
After the update, first run
emerge --depclean --ask
and afterwards
perl-cleaner --all
perl-cleaner should not need to do anything, ideally. If you have depcleaned first and it still wants to rebuild something, that's a bug. Please file a bug report for the package that is getting rebuilt (but check our wiki page on known Perl 5.22 issues first to avoid duplicates).


June 19, 2015
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Important!

My tech articles—especially Linux ones—are some of the most-viewed on The Z-Issue. If this one has helped you, please consider a small donation to The Parker Fund by using the top widget at the right. Thanks!

Recently, I wrote an article about amavisd not running with Postfix, and getting a “Connection refused to 127.0.0.1” error message that wasn’t easy to diagnose. Yesterday, I ran into another problem with amavisd refusing to start properly, and I wasn’t readily able to figure out why. By default, amavisd logs to your mail log, which for me is located at /var/log/mail.log, but could be different for you based on your syslogger preferences. The thing is, though, that it will not log start-up errors there. So basically, you are seemingly left in the dark if you start amavisd and then realise it isn’t running immediately thereafter.

I decided to take a look at the init script for amavisd, and saw that there were some non-standard functions in it:


# grep 'extra_commands' /etc/init.d/amavisd
extra_commands="debug debug_sa"

These extra commands map to the following functions:


debug() {
    ebegin "Starting ${progname} in debug mode"
    "${prog}" debug
    eend $?
}

debug_sa() {
    ebegin "Starting ${progname} in debug-sa mode"
    "${prog}" debug-sa
    eend $?
}

Though these extra commands may be Gentoo-specific, they are pretty easy to implement on other distributions by directly calling the binary itself. For instance, if you wanted the debug function, it would be the location of the binary with ‘debug’ appended to it. On my system, that would be:


/usr/sbin/amavisd -c $LOCATION_OF_CONFIG_FILE debug

replacing the $LOCATION_OF_CONFIG_FILE with your actual config file location.

When I started amavisd in debug mode, the start-up problem that it was having became readily apparent:


# /etc/init.d/amavisd debug
* Starting amavisd-new in debug mode ...
Jun 18 12:48:21.948 /usr/sbin/amavisd[4327]: logging initialized, log level 5, syslog: amavis.mail
Jun 18 12:48:21.948 /usr/sbin/amavisd[4327]: starting. /usr/sbin/amavisd at amavisd-new-2.10.1 (20141025), Unicode aware, LANG="en_GB.UTF-8"

Jun 18 12:48:22.200 /usr/sbin/amavisd[4327]: Net::Server: 2015/06/18-12:48:22 Amavis (type Net::Server::PreForkSimple) starting! pid(4327)
Jun 18 12:48:22.200 /usr/sbin/amavisd[4327]: (!)Net::Server: 2015/06/18-12:48:22 Unresolveable host [::1]:10024 - could not load IO::Socket::INET6: Can't locate Socket6.pm in @INC (you may need to install the Socket6 module) (@INC contains: lib /etc/perl /usr/local/lib64/perl5/5.20.2/x86_64-linux /usr/local/lib64/perl5/5.20.2 /usr/lib64/perl5/vendor_perl/5.20.2/x86_64-linux /usr/lib64/perl5/vendor_perl/5.20.2 /usr/local/lib64/perl5 /usr/lib64/perl5/vendor_perl/5.20.1/x86_64-linux /usr/lib64/perl5/vendor_perl/5.20.1 /usr/lib64/perl5/vendor_perl /usr/lib64/perl5/5.20.2/x86_64-linux /usr/lib64/perl5/5.20.2) at /usr/lib64/perl5/vendor_perl/5.20.1/Net/Server/Proto.pm line 122.\n\n at line 82 in file /usr/lib64/perl5/vendor_perl/5.20.1/Net/Server/Proto.pm
Jun 18 12:48:22.200 /usr/sbin/amavisd[4327]: Net::Server: 2015/06/18-12:48:22 Server closing!

In that code block, the actual error (the line beginning with “(!)Net::Server”) indicates that it couldn’t load the Perl module IO::Socket::INET6 because Socket6.pm was missing. This problem was easily fixed in Gentoo with emerge -av dev-perl/IO-Socket-INET6, but could be rectified on other systems by installing the module from your distribution’s repositories, or by using CPAN. In my case, it was caused by my recent compilation and installation of a new kernel that, this time, included IPv6 support.
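When the debug output is long, the missing module can be pulled out mechanically. The following is only a sketch: missing_perl_modules is a hypothetical helper, and it relies on Perl's usual “Can't locate X.pm” wording as seen in the log above.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print each distinct
# module file that Perl reported as "Can't locate ...".
missing_perl_modules() {
    grep -o "Can't locate [^ ]*\.pm" | sed "s/^Can't locate //" | sort -u
}
```

It is meant to be fed saved debug output, for example missing_perl_modules < amavisd-debug.log (the file name is an assumption). Keep in mind that a healthy amavisd stays in the foreground in debug mode, so piping it directly only returns once the daemon exits.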

The point of my post, however, wasn’t about my particular problem with amavisd starting, but rather how one can debug start-up problems with the daemon. Hopefully, if you run into woes with amavisd logging, these debug options will help you track down the problem.

Cheers,
Zach

June 10, 2015
Sven Vermeulen a.k.a. swift (homepage, bugs)
Live SELinux userspace ebuilds (June 10, 2015, 18:07 UTC)

In between courses, I pushed out live ebuilds for the SELinux userspace applications: libselinux, policycoreutils, libsemanage, libsepol, sepolgen, checkpolicy and secilc. These live ebuilds (with Gentoo version 9999) pull in the current development code of the SELinux userspace so that developers and contributors can already work with in-progress code developments as well as see how they work on a Gentoo platform.

That being said, I do not recommend the live ebuilds to anyone except developers and contributors working in development zones (definitely not on production). One of the reasons is that the ebuilds do not apply Gentoo-specific patches to the sources. I would also like to remove the Gentoo-specific manipulations that we do, such as small Makefile adjustments, but let’s start with just ignoring the Gentoo patches.

Dropping the patches makes sure that we track upstream libraries and userspace closely, and allows developers to try and send out patches to the SELinux project to fix Gentoo related build problems. But as not all packages can be deployed successfully on a Gentoo system some patches need to be applied anyway. For this, users can drop the necessary patches inside /etc/portage/patches as all userspace ebuilds use the epatch_user method.
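As a rough illustration of dropping a patch in place, the sketch below copies a patch into the per-package directory that epatch_user searches. install_user_patch and the patch file name are hypothetical; the category/package layout under /etc/portage/patches is the usual convention.

```shell
#!/bin/sh
# Hypothetical helper: copy a patch into /etc/portage/patches/<category>/<package>/.
# The third argument overrides the patches root (handy for a dry run).
install_user_patch() {
    patch_file=$1                      # path to the patch to install
    pkg=$2                             # e.g. sys-libs/libselinux
    root=${3:-/etc/portage/patches}
    mkdir -p -- "$root/$pkg" && cp -- "$patch_file" "$root/$pkg/"
}
```

For example, install_user_patch ./fix-build.patch sys-libs/libselinux (run as root) would make the next build of the live ebuild pick up the patch.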

Finally, observant users will notice that “secilc” is also provided. This is a new package, which is probably going to have an official release with a new userspace release. It allows for building CIL-based SELinux policy code, and was one of the drivers for me to create the live ebuilds as I’m experimenting with the CIL constructions. So expect more on that later.

June 07, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)
gentoo.de relaunched (June 07, 2015, 23:02 UTC)

Hi!

Two months ago, the gentoo.org website redesign was announced. As of a few hours ago, gentoo.de is following suit. The similarities in look are no coincidence: both designs were done by Alex Legler.

The new website is based on Jekyll rather than GuideXML, which was used previously. Any bugs you find on the website can be reported on GitHub, where the website sources are hosted.

I would like to take the opportunity to thank both Manitu and SysEleven for providing the hardware running gentoo.de (and other Gentoo e.V. services) free of charge. Very kind!

Best, Sebastian

June 06, 2015
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
LibreSSL, OpenSSL, collisions and problems (June 06, 2015, 12:33 UTC)

Some time ago, on the gentoo-dev mailing list, there was an interesting thread on the state of LibreSSL in Gentoo. In particular I repeated some of my previous concerns about ABI and API compatibility, especially when trying to keep both libraries on the same system.

While I hope that the problems I pointed out are clear to the LibreSSL developers, I thought reiterating them in a blog post would give them a wider reach, in the hope that they can be addressed. Please feel free to reshare in response to people hand-waving the idea that LibreSSL can be either a drop-in or stand-aside replacement for OpenSSL.

Last year, when I first blogged about LibreSSL, I had to write a further clarification as my post was used to imply that you could just replace the OpenSSL binaries with LibreSSL and be done with it. This is not the case and I won't even go back there. What I'm concerned about this time is whether you can install the two in the same system, and somehow decide which one you want to use on a per-package basis.

Let's start with the first question: why would you want to do that? Everybody at this point knows that LibreSSL was forked from the OpenSSL code and started removing code that has been deemed unnecessary or even dangerous – a very positive thing, given the amount of compatibility kludges around OpenSSL! – and as such it was a subset of the same interface as its parent, thus there would be no reason to want the two libraries on the same system.

But then again, LibreSSL was never meant to be considered a drop-in replacement, so they haven't cared as much for the evolution of OpenSSL, and just proceeded in their own direction; said direction included building a new library, libtls, that implements higher-level abstractions of the TLS protocol. This vaguely matches the way NSS (the Netscape-now-Mozilla TLS library) is designed, and I think it makes sense: it reduces the amount of repetition that needs to be coded in multiple parts of the software stack to implement HTTPS for instance, reducing the chance of one of them making a stupid mistake.

Unfortunately, this library was originally tied firmly to LibreSSL and there was no way for it to be usable with OpenSSL — I think this has changed recently as a "portable" build of libtls should be available. Ironically, this wouldn't have been a problem at all if it wasn't that LibreSSL is not a superset of OpenSSL, as this is where the core of the issue lies.

By far, this is not the first time a problem like this happens in Open Source software communities: different people will want to implement the same concept in different ways. I like to describe this as software biodiversity and I find it generally a good thing. Having more people looking at the same concept from different angles can improve things substantially, especially in regard to finding safe implementations of network protocols.

But there is a problem when you apply parallel evolution to software: if you fork a project and then evolve it on your own agenda, but keep the same library names and a mostly compatible (thus conflicting) API/ABI, you're going to make people suffer, whether they are developers, consumers, packagers or users.

LibreSSL, libav, Ghostscript, ... there are plenty of examples. Since the features of the projects, their API and most definitely their ABIs are not the same, when you're building a project on top of any of these (or their originators), you'll end up at some point making a conscious decision on which one you want to rely on. Sometimes you can do that based only on your technical needs, but in most cases you end up with a compromise based on technical needs, licensing concerns and availability in the ecosystem.

These projects didn't change the name of their libraries, that way they can be used as drop-rebuild replacement for consumers that keep to the greatest common divisor of the interface, but that also means you can't easily install two of them in the same system. And since most distributions, with the exception of Gentoo, would not really provide the users with choice of multiple implementations, you end up with either a fractured ecosystem, or one that is very much non-diverse.

So if all distributions decide to standardize on one implementation, that's what the developers will write for. And this is why OpenSSL is likely to stay the standard for a long while still. Of course in this case it's not as bad as the situation with libav/ffmpeg, as the base featureset is going to be more or less the same, and the APIs that have been dropped up to now, such as the entropy-gathering daemon interface, have been considered A Bad Idea™ for a while, so there are not going to be OpenSSL-only source projects in the future.

What becomes an issue here is that software is built against OpenSSL right now, and you can't really change this easily. I've been told before that this is not true, because OpenBSD switched, but there is a huge difference between all of the BSDs and your usual Linux distributions: the former have much more control on what they have to support.

In particular, the whole base system is released in a single scoop, and it generally includes all the binary packages you can possibly install. Very few third party software providers release binary packages for OpenBSD, and not many more do for NetBSD or FreeBSD. So as long as you either use the binaries provided by those projects or those built by you on the same system, switching the provider is fairly easy.

When you have to support third-party binaries, then you have a big problem, because a given binary may be built against one provider, but depend on a library that depends on the other. So unless you have full control of your system, with no binary packages at all, you're going to have to provide the most likely provider — which right now is OpenSSL, for good or bad.

Gentoo Linux is, once again, in a more favourable position than many others. As long as you have a full source stack, you can easily choose your provider without considering its popularity. I have built similar stacks before, and my servers deploy stacks similarly, although I have not tried using LibreSSL for any of them yet. But on the desktop it might be trickier, especially if you want to do things like playing Steam games.

But here's the harsh reality, even if you were to install the libraries in different directories, and you would provide a USE flag to choose between the two, it is not going to be easy to apply the right constraints between final executables and libraries all the way into the tree.

I'm not sure I have an answer for how to balance the ability to just make old software use the new library against side-by-side installation. I'm scared that the "solution" people will settle on is bundling, and you can probably figure out that doing so for software like OpenSSL or LibreSSL is a terrible idea, given how fast you should update in response to a security vulnerability.

June 05, 2015
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Important!

This article is one of the most-viewed on The Z-Issue, and is sometimes read thousands of times per day. If it has helped you, please consider a small donation to The Parker Fund by using the top widget at the right. Thanks!

So, for several weeks now, I’ve been struggling with disk naming and UDEV as they relate to RHEL or CentOS Kickstart automation via PXE. Basically, what used to be a relatively straight-forward process of setting up partitions based on disk name (e.g. /dev/sda for the primary disk, /dev/sdb for the secondary disk, and so on) has become a little more complicated due to the order in which the kernel may identify disks with UDEV implementations. The originally-used naming conventions of /dev/sdX may no longer be consistent across reboots or across machines. How this problem came to my attention was when I was Kickstarting a bunch of hosts, and realised that several of them were indicating that there wasn’t enough space on the drives that I had referenced. After doing some digging, I realised that some servers were identifying KVM-based media (like via a Cisco UCS CIMC interface), or other USB media as primary disks instead of the actual primary SCSI-based disks (SATA or SAS drives).

Though I agree that identifying disks by their path, their UUID, or an otherwise more permanent name is preferred to the ambiguous /dev/sdX naming scheme, it did cause a problem, and one that didn’t have a readily-identifiable workaround for my situation. My first thought was to use UUID, or gather the disk ID from /dev/disk/by-id/. However, that wouldn’t work because it wouldn’t be the same across all servers, so automatic installations via PXE wouldn’t be feasible. Next, I looked at referencing disks by their PCI location, which can be found within /dev/disk/by-path/. Unfortunately, I found that the PCI location might vary between servers as well. For example, I found:

Server 1:
/dev/disk/by-path/pci-0000:82:00.0-scsi-0:2:0:0 -> primary SCSI disk
/dev/disk/by-path/pci-0000:82:00.0-scsi-0:2:1:0 -> secondary SCSI disk

Server 2:
/dev/disk/by-path/pci-0000:0b:00.0-scsi-0:2:0:0 -> primary SCSI disk
/dev/disk/by-path/pci-0000:0b:00.0-scsi-0:2:1:0 -> secondary SCSI disk

That being said, I did notice one commonality between the servers that I used for testing. The by-path names of the primary and secondary disks (which, by the way, are RAID arrays attached to the same controller) always ended with ‘scsi-0:2:0:0’ for the primary disk, and ‘scsi-0:2:1:0’ for the secondary disk. Thinking that I could possibly just specify the disk using a wildcard, I tried:

part /boot --fstype ext2 --size=100 --ondisk=disk/by-path/*scsi-0:2:0:0

but alas, that caused Anaconda to error out stating that the specified drive did not exist. At this point, I thought that all hope was lost in terms of automation. Then, however, I had a flash of genius (they don’t happen all that often). I could probably gather the full path ID in a pre-installation script, and then get the FULL path ID into the Kickstart configuration file with an include. Here are the code snippets that I wrote:


%pre --interpreter=/bin/bash
## Get the full by-path ID of the primary and secondary disks
osdisk=$(ls -lh /dev/disk/by-path/ | grep 'scsi-0:2:0:0 ' | awk '{print $9}')
datadisk=$(ls -lh /dev/disk/by-path/ | grep 'scsi-0:2:1:0 ' | awk '{print $9}')


## Create a temporary file with the partition scheme to be included in the KS config
echo "# Partitioning scheme" > /tmp/partition_layout
echo "clearpart --all" >> /tmp/partition_layout
echo "zerombr" >> /tmp/partition_layout
echo "part /boot --fstype ext2 --size=100 --ondisk=disk/by-path/$osdisk" >> /tmp/partition_layout
echo "part swap --recommended --ondisk=disk/by-path/$osdisk" >> /tmp/partition_layout
echo "part pv.01 --size=100 --grow --ondisk=disk/by-path/$osdisk" >> /tmp/partition_layout
echo "part pv.02 --size=100 --grow --ondisk=disk/by-path/$datadisk" >> /tmp/partition_layout
echo "volgroup vg_os pv.01" >> /tmp/partition_layout
echo "volgroup vg_data pv.02" >> /tmp/partition_layout
echo "logvol / --vgname=vg_os --size=100 --name=lv_os --fstype=ext4 --grow" >> /tmp/partition_layout
echo "logvol /data --vgname=vg_data --size=100 --name=lv_data --fstype=ext4 --grow" >> /tmp/partition_layout
echo "# Bootloader location" >> /tmp/partition_layout
echo "bootloader --location=mbr --driveorder=/dev/disk/by-path/$osdisk,/dev/disk/by-path/$datadisk --append=\"rhgb\"" >> /tmp/partition_layout
%end

Some notes about the pre-installation script would be: 1) yes, I know that it is a bit clumsy, but it functions despite the lack of elegance; 2) in lines 3 and 4, make sure that the grep statements include a final space. The final space (e.g. grep 'scsi-0:2:0:0 ' instead of just grep 'scsi-0:2:0:0') is important because the machine may have partitions already set up, and in that case, those partitions would be identified as '*scsi-0:2:0:0-part#' where # is the partition number.
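As an alternative to the ls, grep, and awk pipeline, the lookup could be done with a shell glob that only matches names ending in the exact suffix, so partition entries such as ‘-part1’ are skipped without relying on a trailing space. This is a sketch and not what the script above uses; by_path_name is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the first entry in a directory whose name
# ends with the given suffix (e.g. scsi-0:2:0:0). Returns 1 if none exists.
by_path_name() {
    dir=$1
    suffix=$2
    for entry in "$dir"/*"$suffix"; do
        # An unmatched glob stays literal, so check that the entry exists.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        basename -- "$entry"
        return 0
    done
    return 1
}
```

In a %pre script this could be used as osdisk=$(by_path_name /dev/disk/by-path scsi-0:2:0:0), with the same caveat that the suffixes are specific to the hardware tested here.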

So, this pre-installation script generates a file called /tmp/partition_layout that essentially has a normal partition scheme for a Kickstart configuration file, but references the primary and secondary SCSI disks (again, as RAID arrays attached to the same controller) by their full ‘by-path’ IDs. Then, the key is to include that file within the Kickstart configuration via:


# Partitioning scheme information gathered from pre-install script below
%include /tmp/partition_layout

I am happy to report that for the servers that I have tested thus far, this method works. Is it possible that there will be primary and secondary disks that are identified via different path IDs not caught by my pre-installation script? Yes, that is possible. It’s also possible that your situation will require a completely different pre-installation script to gather the unique identifier of your choice. That being said, I hope that this post will help you devise your method of reliably identifying disks so that you can create the partition scheme automatically via your Kickstart configuration file. That way, you can then deploy it to many servers via PXE (or any other implementation that you’re currently utilising for mass deployment).

Cheers,
Zach

Robin Johnson a.k.a. robbat2 (homepage, bugs)
gnupg-2.1 mutt (June 05, 2015, 17:25 UTC)

For the mutt users with GnuPG, depending on your configuration, you might notice that mutt's handling of GnuPG mail stopped working with GnuPG 2.1. There were a few specific cases that would have caused this, which I'll detail, but if you just want it to work again, put the below into your Muttrc, and make the tweak to gpg-agent.conf. The underlying cause for most of it is that secret key operations have moved to the agent, and many Mutt users used the agent-less mode, because Mutt handled the passphrase nicely on its own.

  • -u must now come BEFORE --clearsign
  • Add allow-loopback-pinentry to gpg-agent.conf, and restart the agent
  • The below config adds --pinentry-mode loopback before --passphrase-fd 0, so that GnuPG (and the agent) will accept it from Mutt still.
  • --verbose is optional, depending what you're doing, you might find --no-verbose cleaner.
  • --trust-model always is a personal preference for my Mutt mail usage, because I do try and curate my keyring
set pgp_autosign = yes
set pgp_use_gpg_agent = no
set pgp_timeout = 600
set pgp_sign_as="(your key here)"
set pgp_ignore_subkeys = no

set pgp_decode_command="gpg %?p?--pinentry-mode loopback  --passphrase-fd 0? --verbose --no-auto-check-trustdb --batch --output - %f"
set pgp_verify_command="gpg --pinentry-mode loopback --verbose --batch --output - --no-auto-check-trustdb --verify %s %f"
set pgp_decrypt_command="gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --batch --output - %f"
set pgp_sign_command="gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --batch --output - --armor --textmode %?a?-u %a? --detach-sign %f"
set pgp_clearsign_command="gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --batch --output - --armor --textmode %?a?-u %a? --clearsign %f"
set pgp_encrypt_sign_command="pgpewrap gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --batch --textmode --trust-model always --output - %?a?-u %a? --armor --encrypt --sign --armor -- -r %r -- %f"
set pgp_encrypt_only_command="pgpewrap gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --batch --trust-model always --output - --encrypt --textmode --armor -- -r %r -- %f"
set pgp_import_command="gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --import -v %f"
set pgp_export_command="gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --export --armor %r"
set pgp_verify_key_command="gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --batch --fingerprint --check-sigs %r"
set pgp_list_pubring_command="gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --batch --with-colons --list-keys %r"
set pgp_list_secring_command="gpg %?p?--pinentry-mode loopback --passphrase-fd 0? --verbose --batch --with-colons --list-secret-keys %r"
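
The gpg-agent.conf tweak mentioned in the list above could be scripted roughly as follows. This is a sketch: enable_loopback_pinentry is a hypothetical helper, and the default path assumes the usual ~/.gnupg location.

```shell
#!/bin/sh
# Hypothetical helper: add allow-loopback-pinentry to gpg-agent.conf
# unless an identical line is already present.
enable_loopback_pinentry() {
    conf=${1:-$HOME/.gnupg/gpg-agent.conf}
    grep -qxF 'allow-loopback-pinentry' "$conf" 2>/dev/null ||
        printf '%s\n' 'allow-loopback-pinentry' >> "$conf"
}
```

After changing the file, the agent has to be restarted, for instance with gpgconf --kill gpg-agent; it is started again on demand.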

This entry was originally posted at http://robbat2.dreamwidth.org/238770.html. Please comment there using OpenID.

May 26, 2015
Alexys Jacob a.k.a. ultrabug (homepage, bugs)

In my previous post regarding the migration of our production cluster to mongoDB 3.0 WiredTiger, we successfully upgraded all the secondaries of our replica-sets with decent performance and (almost, read on) no breakage.

Step 2 plan

The next step of our migration was to test our work load on WiredTiger primaries. After all, this is where the new engine would finally demonstrate all its capabilities.

  • We thus scheduled a step down from our 3.0 MMAPv1 primary servers so that our WiredTiger secondaries would take over.
  • Not migrating the primaries was a safety net in case something went wrong… And boy, it went so wrong that we’re glad we played it safe that way!
  • We rolled back after 10 minutes of utter bitterness.

The failure

After all the wait and expectation, I can’t express our level of disappointment at work when we saw that the WiredTiger engine could not handle our work load. Our application immediately started to throw 50 to 250 WriteConflict errors per minute!

It turns out that we are affected by this bug and that, of course, we’re not the only ones. So far it seems to affect collections with:

  • heavy insert / update work loads
  • a unique index (or compound index)

The breakages

We also discovered that we’re affected by a weird new mongodump behaviour where the dumped BSON file does not contain the number of documents that mongodump said it was exporting. This is clearly a new problem because it happened right after all our secondaries switched to WiredTiger.

Since we have to ensure strong consistency of our exports, and the mongoDB guys don’t seem so keen on moving on the bug (which I surely can understand), there is a large possibility that we’ll have to roll back even the WiredTiger secondaries altogether.

Not to mention that since the 3.0 version, we experience some CPU overloads crashing the entire server on our MMAPv1 primaries that we’re still trying to tackle before opening another JIRA bug…

Sad panda

Of course, any new major release such as 3.0 causes its headaches and brings its lot of bugs. We were ready for this hence the safety steps we took to ensure that we could roll back on any problem.

But as a long time advocate of mongoDB I must admit my frustration, even more after the time it took to get this 3.0 out and all the expectations that came with it.

I hope I can share some better news on the next blog post.

May 23, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)

Hello!

The Troisdorf Linux User Group (TroLUG for short) is hosting

on Saturday, 1 August 2015
in Troisdorf near Cologne/Bonn

a Gentoo workshop aimed at advanced users.

More details, the exact address, and the schedule can be found on the corresponding TroLUG page.

Regards,

Sebastian

Alexys Jacob a.k.a. ultrabug (homepage, bugs)

Good news for gevent users blocked on python < 2.7.9 due to broken SSL support since python upstream dropped the private API _ssl.sslwrap that eventlet was using.

This issue was starting to get old and problematic since GLSA 201503-10, but I’m happy to say that almost 6 hours after the gevent-1.0.2 release, it is already available in Portage!

We were also affected by this issue at work so I’m glad that the tension between ops and devs this issue was causing will finally be over ;)

May 20, 2015
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Late last night, I decided to apply some needed updates to my personal mail server, which is running Gentoo Linux (OpenRC) with a mail stack of Postfix & Dovecot with AMaViS (filtering based on SpamAssassin, ClamAV, and Vipul’s Razor). After applying the updates, and restarting the necessary components of the mail stack, I ran my usual test of sending an email from one of my accounts to another one. It went through without a problem.

However, I realised that it isn’t a completely valid test to send an email from one internal account to another because I have amavisd configured to not scan anything coming from my trusted IPs and domains. I noticed several hundred mails in the queue when I ran postqueue -p, and they all had notices similar to:


status=deferred (delivery temporarily suspended:
connect to 127.0.0.1[127.0.0.1]:10024: Connection refused)

That indicated to me that it wasn’t a problem with Postfix (and I knew it wasn’t a problem with Dovecot, because I could connect to my accounts via IMAP). Seeing as amavisd is running on localhost:10024, I figured that that is where the problem had to be. A lot of times, when there is a “connection refused” notification, it is because no service is listening on that port. You can test to see what ports are in a listening state and what processes, applications, or daemons are listening by running:

netstat -tupan | grep LISTEN

When I did that, I noticed that amavisd wasn’t listening on port 10024, which made me think that it wasn’t running at all. That’s when I ran into the strange part of the problem: the init script output:

# /etc/init.d/amavisd start
* WARNING: amavisd has already been started
# /etc/init.d/amavisd stop
The amavisd daemon is not running                [ !! ]
* ERROR: amavisd failed to start

So, apparently it is running and not running at the same time (sounds like a Linux version of Schrödinger’s cat to me)! It was obvious, though, that it wasn’t actually running (which could be verified with ‘ps -elf | grep -i amavis’). So, what to do? I tried manually removing the PID file, but that actually just made matters a bit worse. Ultimately, this combination is what fixed the problem for me:


sa-update
/etc/init.d/amavisd zap
/etc/init.d/amavisd start

It seems that the SpamAssassin rules file had gone missing, and that was causing amavisd to not start properly. Manually updating the rules file (with ‘sa-update’) regenerated it, and then I zapped amavisd completely, and lastly restarted the daemon.
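The listening check described above can also be scripted, for instance to run after each mail stack update. This is a minimal sketch, assuming amavisd is expected on 127.0.0.1:10024 as in the Postfix log; the function name `port_is_listening` is made up for illustration and is not part of any of the tools mentioned here.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 10024 is the amavisd port seen in the deferred-mail notices.
    if port_is_listening("127.0.0.1", 10024):
        print("something is accepting connections on 127.0.0.1:10024")
    else:
        print("nothing is listening on 127.0.0.1:10024")
```

A successful connection only shows that some process is listening on the port; it does not prove that the process is amavisd or that it is scanning mail correctly.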

Hope that helps anyone running into the same problem.

Cheers,
Zach

EDIT: I have included a new post about debugging amavisd start-up problems.

May 13, 2015
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
uWSGI, gevent and pymongo 3 threads mayhem (May 13, 2015, 14:56 UTC)

This is a quick heads-up post about a behaviour change when running a gevent based application using the new pymongo 3 driver under uWSGI and its gevent loop.

I was naturally curious about testing this brand new and major update of the python driver for mongoDB so I just played it dumb: update and give it a try on our existing code base.

The first thing I noticed instantly is that a vast majority of our applications were suddenly unable to reload gracefully and were force killed by uWSGI after some time !

worker 1 (pid: 9839) is taking too much time to die...NO MERCY !!!

uWSGI’s gevent-wait-for-hub

All our applications must be able to be gracefully reloaded at any time. Some of them are spawning quite a few greenlets on their own, so as an added measure to make sure we never lose any running greenlet we use the gevent-wait-for-hub option, which is described as follows:

wait for gevent hub's death instead of the control greenlet

… which does not mean a lot but is explained in a previous uWSGI changelog :

During shutdown only the greenlets spawned by uWSGI are taken in account,
and after all of them are destroyed the process will exit.

This is different from the old approach where the process wait for
ALL the currently available greenlets (and monkeypatched threads).

If you prefer the old behaviour just specify the option gevent-wait-for-hub

pymongo 3

Compared to its previous 2.x versions, one of the key aspects of the new pymongo 3 driver is its intensive usage of threads to handle server discovery and connection pools.

Now we can relate this very fact to the gevent-wait-for-hub behaviour explained above :

the process wait for ALL the currently available greenlets
(and monkeypatched threads)

This explained why our applications were hanging until the reload-mercy (force kill) timeout option of uWSGI kicked in!

conclusion

When using pymongo 3 with the gevent-wait-for-hub option, you have to keep in mind that all of pymongo’s threads (so monkey patched threads) are considered as active greenlets, so uWSGI will wait for them to terminate before it recycles the worker!

Two options come to mind to handle this properly:

  1. stop using the gevent-wait-for-hub option and change your code to use a gevent pool group to make sure that all of your important greenlets are taken care of when a graceful reload happens (this is how we do it today, the gevent-wait-for-hub option usage was just overprotective for us).
  2. modify your code to properly close all your pymongo connections on graceful reloads.
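The second option can be illustrated without pymongo or gevent. The sketch below is only an analogy built on the standard library: `FakeClient` is a hypothetical stand-in for a driver that keeps a background monitor thread alive, and `on_graceful_reload` is a made-up hook name. In real code the equivalent step would be calling `close()` on each `MongoClient` from whatever reload hook your application uses.

```python
import threading


class FakeClient:
    """Hypothetical stand-in for a driver with a background monitor thread."""

    def __init__(self, name: str) -> None:
        self._stop = threading.Event()
        self._monitor = threading.Thread(
            target=self._run, name=f"monitor-{name}", daemon=True
        )
        self._monitor.start()

    def _run(self) -> None:
        # Pretend to poll the server until asked to stop.
        while not self._stop.wait(0.05):
            pass

    def close(self) -> None:
        # Signal the monitor thread and wait for it to finish.
        self._stop.set()
        self._monitor.join()


def live_monitors() -> int:
    """Count background monitor threads that are still alive."""
    return sum(1 for t in threading.enumerate() if t.name.startswith("monitor-"))


def on_graceful_reload(clients) -> None:
    """Close every client so no background thread outlives the reload."""
    for client in clients:
        client.close()
```

As long as `live_monitors()` is non-zero, anything that waits for all threads (or monkey patched greenlets) keeps waiting, which is the hang described above.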

Hope this will save some people the trouble of debugging this ;)

May 09, 2015
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)
Reviving Gentoo VMware support (May 09, 2015, 23:41 UTC)

Sadly over the last months the support for VMware Workstation and friends in Gentoo dropped a lot. Why? Well, I was the only developer left who cared, and it's definitely not at the top of my Gentoo priorities list. To be honest that has not really changed. However... let's try to harness the power of the community now.

I've pushed a mirror of the Gentoo vmware overlay to Github, see


If you have improvements, version bumps, ... - feel free to generate pull requests. Everything related to VMware products is acceptable. I hope some more people will over time sign up and help merging. Just be careful when using the overlay, it likely won't get the same level of review as ebuilds in the main tree.

May 07, 2015
Alexys Jacob a.k.a. ultrabug (homepage, bugs)

We’ve been running a nice mongoDB cluster in production for several years now in my company.

This cluster suits quite a wide range of use cases from very simple configuration collections to complex queried ones and real time analytics. This versatility has been the strong point of mongoDB for us since the start as it allows different teams to address their different problems using the same technology. We also run some dedicated replica sets for other purposes and network segmentation reasons.

We’ve waited a long time to see the latest 3.0 release features happening. The new WiredTiger storage engine arrived at the right time for us since we had reached the limits of our main production cluster and were considering alternatives.

So, surprising as it may seem, this cluster is the first part of our mongoDB architecture we’re upgrading to v3.0, as it has become a real necessity.

This post is about sharing our first experience with an ongoing and carefully planned major upgrade of a production cluster and does not claim to be a definitive migration guide.

Upgrade plan and hardware

The upgrade process is well covered in the mongoDB documentation already but I will list the pre-migration base specs of every node of our cluster.

  • mongodb v2.6.8
  • RAID1 spinning HDD 15k rpm for the OS (Gentoo Linux)
  • RAID10 4x SSD for mongoDB files under LVM
  • 64 GB RAM

Our overall philosophy is to keep most of the configuration parameters to their default values to start with. We will start experimenting with them when we have sufficient metrics to compare with later.

Disk (re)partitioning considerations

The master-get-all-the-writes architecture is still one of the main limitations of mongoDB and this does not change with v3.0, so you obviously need to challenge your current disk layout to take advantage of the new WiredTiger engine.

mongoDB 2.6 MMAPv1

Considering our cluster data size, we were forced to use our 4 SSDs in a RAID10 as it was the best compromise to preserve performance while providing sufficient data storage capacity.

We often reached the limits of our I/O and moved the journal out of the RAID10 to the mostly idle OS RAID1 with no significant improvements.

mongoDB 3.0 WiredTiger

The main consideration point for us is the new feature that allows storing the indexes in a separate directory. So we anticipated the data storage consumption reduction thanks to the snappy compression and decided to split our RAID10 into two dedicated RAID1 arrays.

Our test layout so far is :

  • RAID1 SSD for the data
  • RAID1 SSD for the indexes and journal

Our first node migration

After migrating our mongos and config servers to 3.0, we picked our worst performing secondary node to test the actual migration to WiredTiger. After all, we couldn’t do worse right ?

We are aware that the strong suit of WiredTiger is actually about having the writes directed to it and will surely share our experience of this aspect later.

compression is bliss

To make this comparison accurate, we resynchronized this node totally before migrating to WiredTiger so we could compare a non fragmented MMAPv1 disk usage with the WiredTiger compressed one.

While I can’t disclose the actual values, compression worked like a charm for us with a gain ratio of 3.2 on disk usage (data + indexes), which is way beyond our expectations!

This is the DB Storage graph from MMS, showing a gain ratio of 4 surely due to indexes being in a separate disk now.

2015-05-07-115324_461x184_scrot

memory usage

As with the disk usage, the node had been running hot on MMAPv1 before the actual migration so we can compare memory allocation/consumption of both engines.

There again the memory management of WiredTiger and its cache shows great improvement. For now, we left the default setting which has WiredTiger limit its cache to half the available memory of the system. We’ll experiment with this setting later on.

2015-05-07-115347_459x177_scrot

connections

I’m still not sure of the actual cause, but the connection count is higher and steadier than before on this node.

2015-05-07-123449_454x183_scrot

First impressions

The node has been running smoothly for several hours now. We are getting acquainted with the new metrics and statistics from WiredTiger. The overall node and I/O load is better than before!

While all the above graphs show huge improvements, there is no major change from our applications’ point of view. We didn’t expect any, since this is only one node in a whole cluster and the main benefits will come from master node migrations.

I’ll continue to share our experience and progress about our mongoDB 3.0 upgrade.

April 29, 2015
Patrick Lauer a.k.a. bonsaikitten (homepage, bugs)
Code Hygiene (April 29, 2015, 03:03 UTC)

Some convenient Makefile targets that make it very easy to keep code clean:

scan:
        scan-build clang foo.c -o foo

indent:
        indent -linux *.c

scan-build is llvm/clang's static analyzer and generates some decent warnings. Using clang to build (in addition to 'default' gcc in my case) helps diversity and sometimes catches different errors.

indent makes code pretty, the 'linux' default settings are not exactly what I want, but close enough that I don't care to finetune yet.

Every commit should be properly indented and not cause more warnings to appear!

April 27, 2015
Sven Vermeulen a.k.a. swift (homepage, bugs)
Moving closer to 2.4 stabilization (April 27, 2015, 17:18 UTC)

The SELinux userspace project released version 2.4 in February this year, after release candidates had been tested for half a year. After its release, we at the Gentoo Hardened project have been working hard to integrate it within Gentoo. This effort has been made a bit more difficult due to the migration of the policy store from one location to another while at the same time switching to HLL- and CIL-based builds.

Lately, 2.4 itself has been pretty stable, and we’re focusing on the proper migration from 2.3 to 2.4. The SELinux policy has been adjusted to allow the migrations to work, and a few final fixes are being tested so that we can safely transition our stable users from 2.3 to 2.4. Hopefully we’ll be able to stabilize the userspace this month or beginning of next month.

Sebastian Pipping a.k.a. sping (homepage, bugs)
Comment vulnerability in WordPress 4.2 (April 27, 2015, 13:21 UTC)

Hanno Böck tweeted about WordPress 4.2 Stored XSS rather recently. The short version is: if an attacker can comment on your blog, your blog is owned.

Since the latest release is affected and is the version I am using, I have been looking for a way to disable comments globally, at least until a fix has been released.
I’m surprised how difficult disabling comments globally is.

Option “Allow people to post comments on new articles” is filed under “Default article settings”, so it applies to new posts only. Let’s disable that.
There is a plug-in Disable comments, but since it claims to not alter the database (unless in persistent mode), I have a feeling that it may only remove commenting forms but leave commenting active to hand-made GET/POST requests, so that may not be safe.

So without studying WordPress code in depth my impression is that I have two options:

  • a) restrict comments to registered users, deactivate registration (hoping that all existing users are friendly and that disabled registration is waterproof in 4.2) and/or
  • b) disable comments for future posts in the settings (in case I post again before an update) and for every single post from the past.

On database level, the former can be seen here:

mysql> SELECT option_name, option_value FROM wp_options
           WHERE option_name LIKE '%regist%';
+----------------------+--------------+
| option_name          | option_value |
+----------------------+--------------+
| users_can_register   | 0            |
| comment_registration | 1            |
+----------------------+--------------+
2 rows in set (0.01 sec)

For the latter, this is how to disable comments on all previous posts:

mysql> UPDATE wp_posts SET comment_status = 'closed';
Query OK, .... rows affected (.... sec)
Rows matched: ....  Changed: ....  Warnings: 0
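To see the effect of such a statement without touching a live blog, the same kind of UPDATE can be tried on a throwaway table. This is only a sketch under loud assumptions: it uses an in-memory SQLite database rather than MySQL, the two-column `wp_posts` table is a hypothetical miniature of the real one, and the WHERE clause is an addition so that the reported row count reflects only posts that were still open.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny stand-in for wp_posts (hypothetical, not the real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, comment_status TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO wp_posts (comment_status) VALUES (?)",
        [("open",), ("open",), ("closed",)],
    )
    conn.commit()
    return conn


def open_post_count(conn: sqlite3.Connection) -> int:
    """Number of posts that still accept comments."""
    row = conn.execute(
        "SELECT COUNT(*) FROM wp_posts WHERE comment_status <> 'closed'"
    ).fetchone()
    return row[0]


def close_all_comments(conn: sqlite3.Connection) -> int:
    """Close comments on every post that is not closed yet; return rows changed."""
    cur = conn.execute(
        "UPDATE wp_posts SET comment_status = 'closed' "
        "WHERE comment_status <> 'closed'"
    )
    conn.commit()
    return cur.rowcount
```

Running `open_post_count` before and after the update is a cheap way to confirm that no post was left open.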

If you have comments to share, please use e-mail this time. Upgraded to 4.2.1 now.

April 26, 2015
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
New devbox running (April 26, 2015, 16:31 UTC)

I announced in February that Excelsior, which ran the Tinderbox, was no longer at Hurricane Electric. I have also said I'll start working on a new generation Tinderbox, and to do that I need a new devbox, as the only three Gentoo systems I have at home are the laptops and my HTPC, not exactly hardware to run compilation all the freaking time.

So after thinking of options, I decided that it was much cheaper to just rent a single dedicated server, rather than a full cabinet, and after asking around for options I settled for Online.net, because of price and recommendation from friends. Unfortunately they do not support Gentoo as an operating system, which makes a few things a bit more complicated. They do provide you with a rescue system, based on Ubuntu, which is enough to do the install, but not everything is easy that way either.

Luckily, most of the configuration (but not all) was stored in Puppet, so I only had to rename the hosts there, change the MAC addresses for the LAN and WAN interfaces (I use static naming of the interfaces as lan0 and wan0, which makes many other pieces of configuration much easier to deal with), change the IP addresses, and so on. Unfortunately since I didn't start setting up that machine through Puppet, it also meant that it did not carry all the information to replicate the system, so it required some iteration and fixing of the configuration. This also means that the next move is going to be easier.

The biggest problem has been correctly setting up the MDRAID partitions, because of GRUB2: if you didn't know, grub2 has an automagic dependency on mdadm. If you don't install it, it won't be able to install itself on a RAID device, even though it can detect it; the maintainer refused to add a USE flag for it, so you have to know about it.

Given what can and cannot be autodetected by the kernel, I had to fight a little more than usual and just gave up and rebuilt the two (/boot and / — yes laugh at me but when I installed Excelsior it was the only way to get GRUB2 not to throw up) arrays as metadata 0.90. But the problem was being able to tell what the boot up errors were, as I have no physical access to the device of course.

The Online.net server I rented is a Dell server that comes with iDRAC for remote management (Dell's own name for IPMI, essentially), and Online.net allows you to set up connections to it through your browser, which is pretty neat: they use a pool of temporary IP addresses and they only authorize your own IP address to connect to them. On the other hand, they do not change the default certificates, which means you end up with the same untrustable Dell certificate every time.

From the iDRAC console you can't do much, but you can start up the remote, JavaWS-based console, which reminded me of something. Unfortunately the JNLP file that you can download from iDRAC did not work on either Sun, Oracle or IcedTea JREs, segfaulting (no kidding) with an X.509 error log as last output. I seriously thought the problem was with the certificates until I decided to dig deeper and found this set of entries in the JNLP file:

 <resources os="Windows" arch="x86">
   <nativelib href="https://idracip/software/avctKVMIOWin32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLWin32.jar" download="eager"/>
 </resources>
 <resources os="Windows" arch="amd64">
   <nativelib href="https://idracip/software/avctKVMIOWin64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLWin64.jar" download="eager"/>
 </resources>
 <resources os="Windows" arch="x86_64">
   <nativelib href="https://idracip/software/avctKVMIOWin64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLWin64.jar" download="eager"/>
 </resources>
  <resources os="Linux" arch="x86">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="i386">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="i586">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="i686">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="amd64">
    <nativelib href="https://idracip/software/avctKVMIOLinux64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux64.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="x86_64">
    <nativelib href="https://idracip/software/avctKVMIOLinux64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux64.jar" download="eager"/>
  </resources>
  <resources os="Mac OS X" arch="x86_64">
    <nativelib href="https://idracip/software/avctKVMIOMac64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLMac64.jar" download="eager"/>
  </resources>

Turns out if you remove everything but the Linux/x86_64 option, it does fetch the right jar and execute the right code without segfaulting. Mysteries of Java Web Start I guess.
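The manual edit described above can be sketched as a small filter. The assumptions are stated plainly: the function name `keep_only_platform` is made up, the JNLP file is assumed to be well-formed XML without namespaces, and `<resources>` elements that carry no `os` attribute are assumed to be platform-independent and are therefore kept.

```python
import xml.etree.ElementTree as ET


def keep_only_platform(jnlp_text: str, os_name: str = "Linux",
                       arch: str = "x86_64") -> str:
    """Drop every OS-specific <resources> element except os_name/arch."""
    root = ET.fromstring(jnlp_text)
    # Snapshot the elements first so removals do not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != "resources" or "os" not in child.attrib:
                continue  # not a platform-specific resources block
            if child.get("os") != os_name or child.get("arch") != arch:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

Note that `ElementTree` drops comments and the XML declaration when it re-serializes, so it is safer to write the result to a new file than to overwrite the downloaded one.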

So after finally getting the system to boot, the next step is setting up networking — as I said I used Puppet to set up the addresses and everything, so I had working IPv4 at boot, but I had to fight a little longer to get IPv6 working. Indeed IPv6 configuration with servers, virtual and dedicated alike, is very much an unsolved problem. Not because there is no solution, but mostly because there are too many solutions — essentially every single hosting provider I ever used had a different way to set up IPv6 (including none at all in one case, so the only option was a tunnel) so it takes some fiddling around to set it up correctly.

To be honest, Online.net has a better set up than OVH or Hetzner, the latter being very flaky, and a more self-service one than Hurricane, which was very flexible, making it very easy to set up, but at the same time required me to just mail them if I wanted to make changes. They document for dibbler, as they rely on DHCPv6 with DUID for delegation: they give you a single /56 v6 net that you can then split up into subnets and delegate independently.
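To get a feel for what a /56 delegation gives you, the split into /64 subnets can be computed with Python's `ipaddress` module. The prefix below is taken from the IPv6 documentation range and is purely an assumption for illustration, as is the helper name `split_delegation`; it is not the prefix Online.net hands out.

```python
import ipaddress


def split_delegation(prefix: str, new_prefix: int = 64):
    """Return every subnet of length new_prefix inside the delegated prefix."""
    net = ipaddress.ip_network(prefix)
    return list(net.subnets(new_prefix=new_prefix))


if __name__ == "__main__":
    # 2001:db8::/32 is reserved for documentation; this /56 is made up.
    subnets = split_delegation("2001:db8:0:100::/56")
    print(len(subnets), "subnets, from", subnets[0], "to", subnets[-1])
```

A /56 leaves 8 bits between the delegated prefix and a /64, hence 256 subnets to hand out to containers or other networks.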

What DHCPv6 in this configuration does not give you is routing, which kinda makes sense, as you can use RA (Router Advertisement) for it. Unfortunately at first I could not get it to work. Turns out that, since I use subnets for the containerized network, I enabled IPv6 forwarding, through Puppet of course. Turns out that Linux will ignore Router Advertisement packets when forwarding IPv6 unless you ask it nicely to, by setting accept_ra=2 as well. Yey!

Again, this is the kind of problem where finding the information took much longer than it should have; Linux does not really tell you that it's ignoring RA packets, and it is by far not obvious that setting one sysctl will disable another, unless you go and look for it.
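The interaction between the two sysctls can be written down as a small truth function, following the kernel's documented meaning of `accept_ra` (0: never accept, 1: accept only if forwarding is disabled, 2: accept even if forwarding is enabled). The helper names are made up for this sketch, and the `/proc/sys/net/ipv6/conf/<interface>/` path is where these per-interface settings are normally exposed on Linux.

```python
from pathlib import Path


def ra_accepted(accept_ra: int, forwarding: int) -> bool:
    """Tell whether Router Advertisements are honoured for these settings."""
    if accept_ra == 2:
        return True  # overrules the forwarding behaviour
    return accept_ra == 1 and forwarding == 0


def read_iface_setting(iface: str, name: str,
                       root: str = "/proc/sys/net/ipv6/conf") -> int:
    """Read one per-interface IPv6 sysctl, e.g. ('wan0', 'accept_ra')."""
    return int((Path(root) / iface / name).read_text().strip())
```

On a live system, `ra_accepted(read_iface_setting("wan0", "accept_ra"), read_iface_setting("wan0", "forwarding"))` would give a quick answer for the `wan0` interface mentioned earlier, assuming that is the interface receiving the advertisements.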

Luckily this was the last problem I had, after that the server was set up fine and I just had to finish configuring the domain's zone file, and the reverse DNS and the SPF records… yes this is all the kind of trouble you go through if you don't just run your whole infrastructure, or use fully cloud — which is why I don't consider self-hosting a general solution.

What remained is just bits and pieces. The first was me realizing that Puppet does not remove the entries from /etc/fstab by default, so I noticed that the Gentoo default /etc/fstab file still contains the entries for CD-ROM drives as well as /dev/fd0. I don't remember which was the last computer with a floppy disk drive that I used, let alone owned.

The other fun bit has been setting up the containers themselves: similarly to the server itself, they are set up with Puppet. Since the server used to be running a tinderbox, it used to also host a proper rsync mirror, it was just easier, but I didn't want to repeat that here, and since I was unable to find a good mirror through mirrorselect (longer story), I configured Puppet to just provide all the containers with distfiles.gentoo.org as their sync server, which did not work. Turns out that our default mirror address does not have any IPv6 hosts on it; when I asked Robin about it, it seems like we just don't have any IPv6-hosted mirror that can handle that traffic, it is sad.

So anyway, I now have a new devbox and I'm trying to set up the rest of my repositories and access (I have not set up access to Gentoo's repositories yet which is kind of the point here.) Hopefully this will also lead to more technical blogging in the next few weeks as I'm cutting down on the overwork to relax a bit.

April 25, 2015
Git changes & impact to Overlays hostnames (April 25, 2015, 00:00 UTC)

Changes to the Gentoo Git hosting setup may require URL changes in your checkouts: Repositories are now only available via git.gentoo.org for authenticated users and anongit.gentoo.org for read-only traffic.

As previously announced [1] [2], and previously in the discussion of merging Overlays with Gentoo’s primary SCM hosting (CVS+Git): The old overlays hostnames (git.overlays.gentoo.org and overlays.gentoo.org) have now been disabled, as well as non-SSH traffic to git.gentoo.org. This was a deliberate move to separate anonymous versus authenticated Git traffic, and ensure that anonymous Git traffic can continue to be scaled when we go ahead with switching away from CVS. Anonymous and authenticated Git is now served by separate systems, and no anonymous Git traffic is permitted to the authenticated Git server.

If you have anonymous Git checkouts from any of the affected hostnames, you should switch them to using one of these new URLs:

  • https://anongit.gentoo.org/git/$REPO
  • http://anongit.gentoo.org/git/$REPO
  • git://anongit.gentoo.org/$REPO

If you have authenticated Git checkouts from the same hosts, you should switch them to this new URL:

  • git+ssh://git@git.gentoo.org/$REPO

In either case, you can trivially update any existing checkout with:
git remote set-url origin git+ssh://git@git.gentoo.org/$REPO
(be sure to adjust the path of the repository and the name of the remote as needed).
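If you have many checkouts to update, the URL patterns listed above are easy to generate. This is a minimal sketch: the helper names are made up, and `proj/example.git` in the usage lines is a hypothetical repository path standing in for `$REPO`.

```python
from typing import List


def anonymous_urls(repo: str) -> List[str]:
    """The three read-only URL forms listed above for a repository path."""
    return [
        f"https://anongit.gentoo.org/git/{repo}",
        f"http://anongit.gentoo.org/git/{repo}",
        f"git://anongit.gentoo.org/{repo}",
    ]


def authenticated_url(repo: str) -> str:
    """The SSH URL form for authenticated access."""
    return f"git+ssh://git@git.gentoo.org/{repo}"


def set_url_command(url: str, remote: str = "origin") -> List[str]:
    """Build the argument list for 'git remote set-url'."""
    return ["git", "remote", "set-url", remote, url]


if __name__ == "__main__":
    repo = "proj/example.git"  # hypothetical path, adjust to your repository
    print(" ".join(set_url_command(anonymous_urls(repo)[0])))
    print(" ".join(set_url_command(authenticated_url(repo))))
```

The sketch only prints the commands; running them (for example with `subprocess.run`) inside each checkout is left to the reader.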

April 23, 2015
Denis Dupeyron a.k.a. calchan (homepage, bugs)

is 46.

In a previous post I described how to patch QEMU to allow building binutils in a cross chroot. In there I increased the maximal number of argument pages to 64 because I was just after a quick fix. Today I finally bisected that, and the result is you need at least 46 for MAX_ARG_PAGES in order for binutils to build.

In bug 533882 it is discussed that LibreOffice requires an even larger number of pages. It is possible other packages also require such a large limit. Note that it may not be a good idea to increase the MAX_ARG_PAGES limit to an absurdly high number and leave it at that. A large amount of memory will be allocated in the target’s memory space and that may be a problem.
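A rough idea of how the limit translates into memory can be had with simple arithmetic. The sketch assumes 4 KiB pages, which is an assumption rather than something stated above (the real page size depends on the target), and `arg_space_bytes` is a made-up helper name.

```python
def arg_space_bytes(max_arg_pages: int, page_size: int = 4096) -> int:
    """Bytes reserved for arguments, assuming page_size bytes per page."""
    return max_arg_pages * page_size


if __name__ == "__main__":
    for pages in (46, 64):
        print(pages, "pages =", arg_space_bytes(pages) // 1024, "KiB")
```

The cost grows linearly with the limit, which is why picking the smallest value that works for the packages you build is preferable to an arbitrarily large one.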

Hopefully QEMU switches to a dynamic limit someday like the kernel. In the meantime, my upcoming crossroot tool will offer a way to more easily deal with that.

April 21, 2015
Donnie Berkholz a.k.a. dberkholz (homepage, bugs)
How to give a great talk, the lazy way (April 21, 2015, 15:42 UTC)

Got a talk coming up? Want it to go well? Here are some starting points.

I give a lot of talks. Often I’m paid to give them, and I regularly get very high ratings or even awards. But every time I listen to people speaking in public for the first time, or maybe the first few times, I think of some very easy ways for them to vastly improve their talks.

Here, I wanted to share my top tips to make your life (and, selfishly, my life watching your talks) much better:

  1. Presenter mode is the greatest invention ever. Use it. If you ignore or forget everything else in this post, remember the rainbows and unicorns of presenter mode. This magical invention keeps the current slide showing on the projector while your laptop shows something different — the current slide, a small image of the next slide, and your slide notes. The last bit is the key. What I put on my notes is the main points of the current slide, followed by my transition to the next slide. Presentations look a lot more natural when you say the transition before you move to the next slide rather than after. More than anything else, presenter mode dramatically cut down on my prep time, because suddenly I no longer had to rehearse. I had seamless, invisible crib notes while I was up on stage.
  2. Plan your intro. Starting strong goes a long way, as it turns out that making a good first impression actually matters. It’s time very well spent to literally script your first few sentences. It helps you get the flow going and get comfortable, so you can really focus on what you’re saying instead of how nervous you are. Avoid jokes unless most of your friends think you’re funny almost all the time. (Hint: they don’t, and you aren’t.)
  3. No bullet points. Ever. (Unless you’re an expert, and you probably aren’t.) We’ve been trained by too many years of boring, sleep-inducing PowerPoint presentations that bullet points equal naptime. Remember presenter mode? Put the bullet points in the slide notes that only you see. If for some reason you think you’re the sole exception to this, at a minimum use visual advances/transitions. (And the only good transition is an instant appear. None of that fading crap.) That makes each point appear on-demand rather than all of them showing up at once.
  4. Avoid text-filled slides. When you put a bunch of text in slides, people inevitably read it. And they read it at a different pace than you’re reading it. Because you probably are reading it, which is incredibly boring to listen to. The two different paces mean they can’t really focus on either the words on the slide or the words coming out of your mouth, and your attendees consequently leave having learned less than either of those options alone would’ve left them with.
  5. Use lots of really large images. Each slide should be a single concept with very little text, and images are a good way to force yourself to do so. Unless there’s a very good reason, your images should be full-bleed. That means they go past the edges of the slide on all sides. My favorite place to find images is a Flickr advanced search for Creative Commons licenses. Google also has this capability within Search Tools. Sometimes images are code samples, and that’s fine as long as you remember to illustrate only one concept — highlight the important part.
  6. Look natural. Get out from behind the podium, so you don’t look like a statue or give the classic podium death-grip (one hand on each side). You’ll want to pick up a wireless slide advancer and make sure you have a wireless lavalier mic, so you can wander around the stage. Remember to work your way back regularly to check on your slide notes, unless you’re fortunate enough to have them on extra monitors around the stage. Talk to a few people in the audience beforehand, if possible, to get yourself comfortable and get a few anecdotes of why people are there and what their background is.
  7. Don’t go over time. You can go under, even a lot under, and that’s OK. One of the best talks I ever gave took 22 minutes of a 45-minute slot, and the rest filled up with Q&A. Nobody’s going to mind at all if you use up 30 minutes of that slot, but cutting into their bathroom or coffee break, on the other hand, is incredibly disrespectful to every attendee. This is what watches, and the timer in presenter mode, and clocks, are for. If you don’t have any of those, ask a friend or make a new friend in the front row.
  8. You’re the centerpiece. The slides are a prop. If people are always looking at the slides rather than you, chances are you’ve made a mistake. Remember, the focus should be on you, the speaker. If they’re only watching the slides, why didn’t you just post a link to Slideshare or Speakerdeck and call it a day?

I’ve given enough talks that I have a good feel for how long my slides will take, and I’m able to adjust on the fly. But if you aren’t sure of that, it might make sense to rehearse. I generally don’t rehearse, because after all, this is the lazy way.

If you can manage to do those 8 things, you’ve already come a long way. Good luck!



April 16, 2015
Bernard Cafarelli a.k.a. voyageur (homepage, bugs)
Removing old NX packages from tree (April 16, 2015, 21:30 UTC)

I already sent the last rites announcement a few days ago, but here is a more detailed post on the upcoming removal of “old” NX packages. Long story short: migrate to X2Go if possible, or use the NX overlay (“best-effort” support provided).
2015/04/26 note: treecleaning done!

Affected packages

Basically, all NX clients and servers except x2go and nxplayer! Here is the complete list with some specific last rites reasons:

  • net-misc/nxclient, net-misc/nxnode, net-misc/nxserver-freeedition: binary-only original NX client and server. Upstream has moved on to a closed-source technology, and this version bundles potentially vulnerable binary code. It does not work as well as before with current libraries (like Cairo).
  • net-misc/nxserver-freenx, net-misc/nxsadmin: the first open-source alternative server. It could be tricky to get working, and is not updated anymore (last upstream activity around 2009)
  • net-misc/nxcl, net-misc/qtnx: an open-source alternative client (last upstream activity around 2008)
  • net-misc/neatx: Google’s take on a NX server, sadly it never took off (last upstream activity around 2010)
  • app-admin/eselect-nxserver (an eselect module to switch active NX server, useless without these servers in tree)

Continue using these packages on Gentoo

These packages will be dropped from the main tree by the end of this month (2015/04), and then only available in the NX overlay. They will still be supported there in a “best effort” way (no guarantee how long some of these packages will work with current systems).

So, if one of these packages still works better for you, or you need to keep them around before migrating, just run:

# layman -a nx

Alternatives

While it is not a direct drop-in replacement, x2go is the most complete solution currently in Gentoo tree (and my recommendation), with a lot of possible advanced features, active upstream development, … You can connect to net-misc/x2goserver with net-misc/x2goclient, net-misc/pyhoca-gui, or net-misc/pyhoca-cli.

If you want to try the new system from NoMachine (the company that created NX), the client is available in Portage as net-misc/nxplayer. The server itself is not packaged yet; if you are interested in it, see bug #488334.

April 11, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)
Firefox: You may want to update to 37.0.1 (April 11, 2015, 19:23 UTC)

I was pointed to this Mozilla Security Advisory:

Certificate verification bypass through the HTTP/2 Alt-Svc header
https://www.mozilla.org/en-US/security/advisories/mfsa2015-44/

While it doesn’t say if all versions prior to 37.0.1 are affected, it does say that sending a certain server response header disabled warnings of invalid SSL certificates for that domain. Ooops.

I’m not sure how relevant HTTP/2 is by now.

Paweł Hajdan, Jr. a.k.a. phajdan.jr (homepage, bugs)

Slot conflicts can be annoying. It's worse when an attempt to fix them leads to an even bigger mess. I hope this post helps you with some cases - and that portage will keep getting smarter about resolving them automatically.

Read more »

Sebastian Pipping a.k.a. sping (homepage, bugs)

While https://panopticlick.eff.org/ is not really new, I learned about that site only recently.
And while I knew that browser self-identification would reduce my anonymity on the Internet, I didn’t expect this result:

Your browser fingerprint appears to be unique among the 5,198,585 tested so far.

Wow. Why? Let’s try one of the other browsers I use. “Appears to be unique”, again (where Flash is enabled).

What’s so unique about my setup? The two reports say about my setup:

One in x browsers have this value:

Characteristic              | Firefox 36.0.1 | Google Chrome 42.0.2311.68 | Chromium 41.0.2272.76
User Agent                  | 2,030.70       | 472,599.36                 | 16,576.56
HTTP_ACCEPT Headers         | 12.66          | 5,477.97                   | 5,477.97
Browser Plugin Details      | 577,620.56     | 259,929.65                 | 7,351.75
Time Zone                   | 6.51           | 6.51                       | 6.51
Screen Size and Color Depth | 13.72          | 13.72                      | 13.72
System Fonts                | 5,198,585.00   | 5,198,585.00               | 5.10 (Flash and Java disabled)
Are Cookies Enabled?        | 1.35           | 1.35                       | 1.35
Limited supercookie test    | 1.83           | 1.83                       | 1.83

User agent and browser plug-ins hurt, fonts alone kill me altogether. Ouch.
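
To put these numbers on a common scale, each “one in x” value can be converted into bits of identifying information as log2(x). The following is only a minimal sketch using the Firefox column above; the function name is made up for illustration, and the per-characteristic bits cannot simply be added up, because the characteristics are not independent of each other.

```python
import math

# "One in x browsers have this value" figures from the Firefox column above.
firefox = {
    "User Agent": 2030.70,
    "HTTP_ACCEPT Headers": 12.66,
    "Browser Plugin Details": 577620.56,
    "Time Zone": 6.51,
    "Screen Size and Color Depth": 13.72,
    "System Fonts": 5198585.00,
    "Are Cookies Enabled?": 1.35,
    "Limited supercookie test": 1.83,
}

def bits_of_identifying_info(one_in_x):
    """Surprisal in bits of a value shared by one in `one_in_x` browsers."""
    return math.log2(one_in_x)

# Print the characteristics from most to least identifying.
for name, x in sorted(firefox.items(), key=lambda kv: -kv[1]):
    print(f"{name:30s} {bits_of_identifying_info(x):6.2f} bits")
```

A value that is unique among 5,198,585 tested browsers carries about 22.3 bits, which is by itself enough to single out one browser in a data set of that size.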

Update:

  • It’s the very same when browsing with an incognito window. Re-reading what that feature is officially intended to do (being incognito to your own history), that stops being a surprise.
  • Chromium (with Flash/Java disabled) added

Thoughts on fixing this issue:

I’m not sure about how I want to fix this myself. Faking popular values (in a popular combination, so that it does not backfire) could work using a local proxy, a browser patch, or maybe a browser plug-in. Obtaining true popular value combinations is another question. Fake values can reduce the quality of the content I am presented with, e.g. I would probably not fake my screen resolution, or at least make sure not to deviate by much.

Patrick Lauer a.k.a. bonsaikitten (homepage, bugs)
Almost quiet dataloss (April 11, 2015, 11:06 UTC)

Some harddisk manufacturers have interesting ideas ... using some old Samsung disks in a RAID5 config:

[15343.451517] ata3.00: exception Emask 0x0 SAct 0x40008410 SErr 0x0 action 0x6 frozen
[15343.451522] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451527] ata3.00: cmd 61/20:20:d8:7d:6c/01:00:07:00:00/40 tag 4 ncq 147456 out
                        res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)                                                                                                                                                                                            
[15343.451530] ata3.00: status: { DRDY }
[15343.451532] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451536] ata3.00: cmd 61/30:50:d0:2f:40/00:00:0d:00:00/40 tag 10 ncq 24576 out
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)                                                                                                                                                                                            
[15343.451538] ata3.00: status: { DRDY }
[15343.451540] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451544] ata3.00: cmd 61/a8:78:90:be:da/00:00:0b:00:00/40 tag 15 ncq 86016 out
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)                                                                                                                                                                                            
[15343.451546] ata3.00: status: { DRDY }
[15343.451549] ata3.00: failed command: READ FPDMA QUEUED
[15343.451552] ata3.00: cmd 60/38:f0:c0:2b:d6/00:00:0e:00:00/40 tag 30 ncq 28672 in
                        res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)                                                                                                                                                                                            
[15343.451555] ata3.00: status: { DRDY }
[15343.451557] ata3: hard resetting link
[15343.911891] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[15344.062112] ata3.00: configured for UDMA/133
[15344.062130] ata3.00: device reported invalid CHS sector 0
[15344.062139] ata3.00: device reported invalid CHS sector 0
[15344.062146] ata3.00: device reported invalid CHS sector 0
[15344.062153] ata3.00: device reported invalid CHS sector 0
[15344.062169] ata3: EH complete
Hmm, that doesn't look too good ... but mdadm still believes the RAID is functional.

And a while later things like this happen:
[ 2968.701999] XFS (md4): Metadata corruption detected at xfs_dir3_data_read_verify+0x72/0x77 [xfs], block 0x36900a0
[ 2968.702004] XFS (md4): Unmount and run xfs_repair
[ 2968.702007] XFS (md4): First 64 bytes of corrupted metadata buffer:
[ 2968.702011] ffff8802ab5cf000: 04 00 00 00 99 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702015] ffff8802ab5cf010: 03 00 00 00 00 00 00 00 02 00 00 00 9e 00 00 00  ................
[ 2968.702018] ffff8802ab5cf020: 0c 00 00 00 00 00 00 00 13 00 00 00 00 00 00 00  ................
[ 2968.702021] ffff8802ab5cf030: 04 00 00 00 82 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702048] XFS (md4): metadata I/O error: block 0x36900a0 ("xfs_trans_read_buf_map") error 117 numblks 8
[ 2968.702476] XFS (md4): Metadata corruption detected at xfs_dir3_data_reada_verify+0x69/0x6d [xfs], block 0x36900a0
[ 2968.702491] XFS (md4): Unmount and run xfs_repair
[ 2968.702494] XFS (md4): First 64 bytes of corrupted metadata buffer:
[ 2968.702498] ffff8802ab5cf000: 04 00 00 00 99 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702501] ffff8802ab5cf010: 03 00 00 00 00 00 00 00 02 00 00 00 9e 00 00 00  ................
[ 2968.702505] ffff8802ab5cf020: 0c 00 00 00 00 00 00 00 13 00 00 00 00 00 00 00  ................
[ 2968.702508] ffff8802ab5cf030: 04 00 00 00 82 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702825] XFS (md4): Metadata corruption detected at xfs_dir3_data_read_verify+0x72/0x77 [xfs], block 0x36900a0
[ 2968.702831] XFS (md4): Unmount and run xfs_repair
[ 2968.702834] XFS (md4): First 64 bytes of corrupted metadata buffer:
[ 2968.702839] ffff8802ab5cf000: 04 00 00 00 99 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702842] ffff8802ab5cf010: 03 00 00 00 00 00 00 00 02 00 00 00 9e 00 00 00  ................
[ 2968.702866] ffff8802ab5cf020: 0c 00 00 00 00 00 00 00 13 00 00 00 00 00 00 00  ................
[ 2968.702871] ffff8802ab5cf030: 04 00 00 00 82 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702888] XFS (md4): metadata I/O error: block 0x36900a0 ("xfs_trans_read_buf_map") error 117 numblks 8
fsck finds quite a lot of data not being where it should be.
I'm not sure who to blame here - the kernel should actively punch out any harddisk that is fish-on-land flopping around like that, the md layer should hate on any device that even looks weird, but somehow "just doing a link reset" is considered enough.

I'm not really upset that an old cheap disk that is now ~9 years old decides to have dementia, but I'm quite unhappy with the firmware programming that doesn't seem to consider data loss as a problem ... (but at least it's not Seagate!)
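
Since neither the kernel nor md kicked the disk out here, one option is to watch the kernel log yourself. The following is only a rough sketch that counts failed commands and hard link resets per ATA port in dmesg-style text. The two regular expressions are derived solely from the log excerpt above, and the function name is made up for illustration, so other kernels or drivers may word these messages differently.

```python
import re
from collections import Counter

# Matches e.g. "[15343.451522] ata3.00: failed command: WRITE FPDMA QUEUED"
FAILED_CMD = re.compile(r"\b(ata\d+)(?:\.\d+)?: failed command:")
# Matches e.g. "[15343.451557] ata3: hard resetting link"
HARD_RESET = re.compile(r"\b(ata\d+): hard resetting link")

def summarize_ata_errors(lines):
    """Count failed commands and hard link resets per ATA port."""
    failed, resets = Counter(), Counter()
    for line in lines:
        match = FAILED_CMD.search(line)
        if match:
            failed[match.group(1)] += 1
            continue
        match = HARD_RESET.search(line)
        if match:
            resets[match.group(1)] += 1
    return failed, resets

# A few lines copied from the excerpt above.
sample = """\
[15343.451522] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451532] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451540] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451549] ata3.00: failed command: READ FPDMA QUEUED
[15343.451557] ata3: hard resetting link
[15344.062169] ata3: EH complete
""".splitlines()

failed, resets = summarize_ata_errors(sample)
for port in sorted(failed):
    print(f"{port}: {failed[port]} failed commands, {resets[port]} hard resets")
```

What threshold should count as "this disk is flopping around" is a policy decision the sketch deliberately leaves open.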

April 07, 2015
Hanno Böck a.k.a. hanno (homepage, bugs)
How Heartbleed could've been found (April 07, 2015, 13:23 UTC)

tl;dr With a reasonably simple fuzzing setup I was able to rediscover the Heartbleed bug. This uses state-of-the-art fuzzing and memory protection technology (american fuzzy lop and Address Sanitizer), but it doesn't require any prior knowledge about specifics of the Heartbleed bug or the TLS Heartbeat extension. We can learn from this to find similar bugs in the future.

Exactly one year ago a bug in the OpenSSL library became public that is one of the most well-known security bugs of all time: Heartbleed. It is a bug in the code of a TLS extension that up until then was hardly known to anybody. A read buffer overflow allowed an attacker to extract parts of the memory of every server using OpenSSL.

Can we find Heartbleed with fuzzing?

Heartbleed was introduced in OpenSSL 1.0.1, which was released in March 2012, two years earlier. Many people wondered how it could've been hidden there for so long. David A. Wheeler wrote an essay discussing how fuzzing and memory protection technologies could've detected Heartbleed. It covers many aspects in detail, but in the end he only offers speculation on whether or not fuzzing would have found Heartbleed. So I wanted to try it out.

Of course it is easy to find a bug if you know what you're looking for. As best as reasonably possible I tried not to use any specific information I had about Heartbleed. I created a setup that's reasonably simple and similar to what someone would try without knowing anything about the specifics of Heartbleed.

Heartbleed is a read buffer overflow. What that means is that an application is reading outside the boundaries of a buffer. For example, imagine an application has a space in memory that's 10 bytes long. If the software tries to read 20 bytes from that buffer, you have a read buffer overflow. It will read whatever is in the memory located after the 10 bytes. These bugs are fairly common and the basic concept of exploiting buffer overflows is pretty old. Just to give you an idea how old: Recently the Chaos Computer Club celebrated the 30th anniversary of a hack of the German BtX-System, an early online service. They used a buffer overflow that was in many aspects very similar to the Heartbleed bug. (It is actually disputed if this is really what happened, but it seems reasonably plausible to me.)
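
The 10-byte example can be simulated in a few lines. This is only an illustration: Python bounds-checks its own slices, so the overflow is modelled by indexing into one shared bytes object, the way C code indexes raw memory. All names and the "secret" neighbouring data are made up for the sketch.

```python
# Simulated flat memory: a 10-byte buffer directly followed by unrelated data.
BUFFER = b"0123456789"        # the 10 bytes the application owns
NEIGHBOUR = b"secret-key"     # whatever happens to live right after it
MEMORY = BUFFER + NEIGHBOUR
BUFFER_START, BUFFER_LEN = 0, len(BUFFER)

def unchecked_read(claimed_len):
    """Buggy: trusts the requested length, as a read overflow bug does."""
    return MEMORY[BUFFER_START:BUFFER_START + claimed_len]

def checked_read(claimed_len):
    """Fixed: refuses to read past the end of the buffer."""
    if claimed_len > BUFFER_LEN:
        raise ValueError("requested length exceeds buffer size")
    return MEMORY[BUFFER_START:BUFFER_START + claimed_len]

# Asking for 20 bytes returns the buffer plus the 10 bytes stored after it.
print(unchecked_read(20))
```

Note that the buggy read does not crash; it silently hands back data it should never have touched, which is why such bugs are hard to notice without extra tooling.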

Fuzzing is a widely used strategy to find security issues and bugs in software. The basic idea is simple: Give the software lots of inputs with small errors and see what happens. If the software crashes you likely found a bug.
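
As a minimal sketch of that idea, the following mutation fuzzer replaces a few random bytes of a valid input and records every input that makes the target raise. The target toy_parser is a hypothetical length-prefixed parser with a deliberate bug, standing in for real software; in Python an exception plays the role of a crash.

```python
import random

def mutate(data, rng, max_changes=3):
    """Return a copy of non-empty `data` with a few random bytes replaced."""
    out = bytearray(data)
    for _ in range(rng.randint(1, max_changes)):
        out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)

def toy_parser(data):
    """Hypothetical target: a length-prefixed record with a deliberate bug."""
    length = data[0]
    payload = data[1:]
    if length > len(payload):
        # Stand-in for a crash; a C parser might read out of bounds here.
        raise IndexError("declared length exceeds payload")
    return payload[:length]

def fuzz(target, seed_input, runs=1000, seed=0):
    """Feed mutated inputs to `target`; collect the inputs that make it raise."""
    rng = random.Random(seed)   # fixed seed keeps the run reproducible
    crashes = []
    for _ in range(runs):
        candidate = mutate(seed_input, rng)
        try:
            target(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes

crashes = fuzz(toy_parser, b"\x04abcd")
print(f"{len(crashes)} crashing inputs out of 1000")
```

Every crashing input found this way has a length byte larger than the four payload bytes that follow it, which points straight at the missing bounds check.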

When buffer overflows happen an application doesn't always crash. Often it will just read (or write if it is a write overflow) to the memory that happens to be there. Whether it crashes depends on a lot of circumstances. Most of the time read overflows won't crash your application. That's also the case with Heartbleed. There are a couple of technologies that improve the detection of memory access errors like buffer overflows. An old and well-known one is the debugging tool Valgrind. However Valgrind slows down applications a lot (around 20 times slower), so it is not really well suited for fuzzing, where you want to run an application millions of times on different inputs.

Address Sanitizer finds more bugs

A better tool for our purpose is Address Sanitizer. David A. Wheeler calls it “nothing short of amazing”, and I want to reiterate that. I think it should be a tool that every C/C++ software developer should know and should use for testing.

Address Sanitizer is part of the C compiler and has been included in the two most common compilers in the free software world, gcc and llvm. To use Address Sanitizer one has to recompile the software with the command line parameter -fsanitize=address. It slows down applications, but only by a relatively small amount. According to the project's own numbers an application using Address Sanitizer is around 1.8 times slower. This makes it feasible for fuzzing tasks.

For the fuzzing itself a tool that recently gained a lot of popularity is american fuzzy lop (afl). This was developed by Michal Zalewski from the Google security team, who is also known by his nick name lcamtuf. As far as I'm aware the approach of afl is unique. It adds instructions to an application during the compilation that allow the fuzzer to detect new code paths while running the fuzzing tasks. If a new interesting code path is found then the sample that created this code path is used as the starting point for further fuzzing.
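
The feedback loop described above can be sketched with a toy model. This is not how afl is implemented; its real instrumentation and scheduling are far more sophisticated. Here the calls to coverage.add() stand in for the instructions added at compile time, and the target with its nested checks is entirely hypothetical.

```python
import random

def target(data, coverage):
    """Toy target; coverage.add(...) stands in for compile-time instrumentation."""
    coverage.add("entry")
    if data[:1] == b"F":
        coverage.add("F")
        if data[1:2] == b"U":
            coverage.add("FU")
            if data[2:3] == b"Z":
                coverage.add("FUZ")
                if data[3:4] == b"Z":
                    raise RuntimeError("bug reached")

def mutate(data, rng):
    """Replace one randomly chosen byte of non-empty `data`."""
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)

def coverage_guided_fuzz(seed_input, max_runs=200_000, seed=0):
    """Keep every input that reaches a new code path as a starting point."""
    rng = random.Random(seed)
    seen = set()
    target(seed_input, seen)              # record what the seed already covers
    queue = [seed_input]
    for run in range(1, max_runs + 1):
        candidate = mutate(rng.choice(queue), rng)
        coverage = set()
        try:
            target(candidate, coverage)
        except RuntimeError:
            return candidate, run
        if not coverage <= seen:          # new path found: keep this sample
            seen |= coverage
            queue.append(candidate)
    return None, max_runs

crash_input, runs = coverage_guided_fuzz(b"AAAA")
print(crash_input, "found after", runs, "runs")
```

Because each partial match is kept and mutated further, the four-byte sequence is discovered one byte at a time. A fuzzer without feedback would have to guess all four bytes at once, roughly a one in 256^4 chance per attempt.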

Currently afl only uses file inputs and cannot directly fuzz network input. OpenSSL has a command line tool that allows all kinds of file inputs, so you can use it for example to fuzz the certificate parser. But this approach does not allow us to directly fuzz the TLS connection, because that only happens on the network layer. By fuzzing various file inputs I recently found two issues in OpenSSL, but both had been found by Brian Carpenter before, who at the same time was also fuzzing OpenSSL.

Let OpenSSL talk to itself

So to fuzz the TLS network connection I had to create a workaround. I wrote a small application that creates two instances of OpenSSL that talk to each other. This application doesn't do any real networking, it is just passing buffers back and forth and thus doing a TLS handshake between a server and a client. Each message packet is written down to a file. It will result in six files, but the last two are just empty, because at that point the handshake is finished and no more data is transmitted. So we have four files that contain actual data from a TLS handshake. If you want to dig into this, a good description of a TLS handshake is provided by the developers of OCaml-TLS and MirageOS.

Then I added the possibility of replacing individual handshake messages with files I pass on the command line. By calling my test application selftls with a number and a filename, a handshake message gets replaced by this file. So to test just the first part of the server handshake I'd call the test application, take the output file packet-1 and pass it back again to the application by running selftls 1 packet-1. Now we have all the pieces we need to use american fuzzy lop and fuzz the TLS handshake.
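
The structure of such a harness can be sketched without any TLS at all. The following is not the selftls code; it is a hypothetical toy in which two made-up in-process endpoints pass buffers back and forth, every message is written to a packet-N file, and one message can be swapped for the contents of a file. Writing the message as delivered (after replacement) is an assumption of this sketch.

```python
import os
import tempfile

def toy_client(received):
    """Hypothetical client: opens the conversation, then acknowledges replies."""
    return b"client-hello" if not received else b"client-ack:" + received

def toy_server(received):
    """Hypothetical server: echoes what it received behind a fixed prefix."""
    return b"server-reply:" + received

def run_exchange(outdir, rounds=4, replace=None):
    """Pass buffers between client and server without any real networking.

    Each message is written to outdir/packet-N. If `replace` is (n, path),
    message n is swapped for the contents of `path` before it is delivered.
    """
    endpoints = (toy_client, toy_server)
    message = b""
    log = []
    for n in range(1, rounds + 1):
        message = endpoints[(n - 1) % 2](message)
        if replace is not None and replace[0] == n:
            with open(replace[1], "rb") as handle:
                message = handle.read()
        with open(os.path.join(outdir, f"packet-{n}"), "wb") as handle:
            handle.write(message)
        log.append(message)
    return log

with tempfile.TemporaryDirectory() as tmp:
    normal = run_exchange(tmp)
    fuzzed_path = os.path.join(tmp, "fuzzed-1")
    with open(fuzzed_path, "wb") as handle:
        handle.write(b"\xff\x00garbage")
    fuzzed = run_exchange(tmp, replace=(1, fuzzed_path))

print(normal[1])
print(fuzzed[1])
```

With this shape, a file-based fuzzer only has to supply the replacement file; the harness takes care of delivering it at the right step of the conversation.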

I compiled OpenSSL 1.0.1f, the last version that was vulnerable to Heartbleed, with american fuzzy lop. This can be done by calling ./config and then replacing gcc in the Makefile with afl-gcc. We also want to use Address Sanitizer; to do so we have to set the environment variable AFL_USE_ASAN to 1.

There are some issues when using Address Sanitizer with american fuzzy lop. Address Sanitizer needs a lot of virtual memory (many Terabytes). American fuzzy lop limits the amount of memory an application may use. It is not trivially possible to only limit the real amount of memory an application uses and not the virtual amount, therefore american fuzzy lop cannot handle this flawlessly. Different solutions for this problem have been proposed and are currently being developed. I usually go with the simplest solution: I just disable the memory limit of afl (parameter -m -1). This poses a small risk: A fuzzed input may lead an application to a state where it will use all available memory and thereby will cause other applications on the same system to malfunction. Based on my experience this is very rare, so I usually just ignore that potential problem.

After having compiled OpenSSL 1.0.1f we have two files libssl.a and libcrypto.a. These are static versions of OpenSSL and we will use them for our test application. We now also use the afl-gcc to compile our test application:

AFL_USE_ASAN=1 afl-gcc selftls.c -o selftls libssl.a libcrypto.a -ldl

Now we run the application. It needs a dummy certificate. I have put one in the repo. To make things faster I'm using a 512 bit RSA key. This is completely insecure, but as we don't want any security here – we just want to find bugs – this is fine, because a smaller key makes things faster. However if you want to try fuzzing the latest OpenSSL development code you need to create a larger key, because it'll refuse to accept such small keys.

The application will give us six packet files, however the last two will be empty. We only want to fuzz the very first step of the handshake, so we're interested in the first packet. We will create an input directory for american fuzzy lop called in and place packet-1 in it. Then we can run our fuzzing job:

afl-fuzz -i in -o out -m -1 -t 5000 ./selftls 1 @@

american fuzzy lop screenshot

We pass the input and output directory, disable the memory limit and increase the timeout value, because TLS handshakes are slower than common fuzzing tasks. On my test machine around 6 hours later afl found the first crash. Now we can manually pass our output to the test application and will get a stack trace by Address Sanitizer:

==2268==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x629000013748 at pc 0x7f228f5f0cfa bp 0x7fffe8dbd590 sp 0x7fffe8dbcd38
READ of size 32768 at 0x629000013748 thread T0
#0 0x7f228f5f0cf9 (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x2fcf9)
#1 0x43d075 in memcpy /usr/include/bits/string3.h:51
#2 0x43d075 in tls1_process_heartbeat /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/t1_lib.c:2586
#3 0x50e498 in ssl3_read_bytes /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_pkt.c:1092
#4 0x51895c in ssl3_get_message /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_both.c:457
#5 0x4ad90b in ssl3_get_client_hello /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_srvr.c:941
#6 0x4c831a in ssl3_accept /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_srvr.c:357
#7 0x412431 in main /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/selfs.c:85
#8 0x7f228f03ff9f in __libc_start_main (/lib64/libc.so.6+0x1ff9f)
#9 0x4252a1 (/data/openssl/openssl-handshake/openssl-1.0.1f-nobreakrng-afl-asan-fuzz/selfs+0x4252a1)

0x629000013748 is located 0 bytes to the right of 17736-byte region [0x62900000f200,0x629000013748)
allocated by thread T0 here:
#0 0x7f228f6186f7 in malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x576f7)
#1 0x57f026 in CRYPTO_malloc /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/crypto/mem.c:308


We can see here that the crash is a heap buffer overflow doing an invalid read access of around 32 Kilobytes in the function tls1_process_heartbeat(). It is the Heartbleed bug. We found it.

I want to mention a couple of things that I found out while trying this. I did some things that I thought were necessary, but later it turned out that they weren't. After the Heartbleed news broke, a number of reports stated that Heartbleed was partly the fault of OpenSSL's memory management. A mail by Theo de Raadt claiming that OpenSSL has “exploit mitigation countermeasures” was widely quoted. I was aware of that, so I first tried to compile OpenSSL without its own memory management. That can be done by calling ./config with the option no-buf-freelist.

But it turns out that although OpenSSL uses its own memory management, that doesn't defeat Address Sanitizer. I could replicate my fuzzing finding with OpenSSL compiled with its default options. Although it does its own allocation management, it will still do a call to the system's normal malloc() function for every new memory allocation. A blog post by Chris Rohlf digs into the details of the OpenSSL memory allocator.

Breaking random numbers for deterministic behaviour

When fuzzing the TLS handshake american fuzzy lop will report a red number counting variable runs of the application. The reason for that is that a TLS handshake uses random numbers to create the master secret that's later used to derive cryptographic keys. Also the RSA functions will use random numbers. I wrote a patch to OpenSSL to deliberately break the random number generator and let it only output ones (it didn't work with zeros, because OpenSSL will wait for non-zero random numbers in the RSA function).

During my tests this had no noticeable impact on the time it took afl to find Heartbleed. Still I think it is a good idea to remove nondeterministic behavior when fuzzing cryptographic applications. Later in the handshake there are also timestamps used, this can be circumvented with libfaketime, but for the initial handshake processing that I fuzzed to find Heartbleed that doesn't matter.
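
Why a constant "random" source makes runs reproducible can be shown with a toy example. This is not the OpenSSL patch; derive_secret is a made-up stand-in for key derivation that simply hashes bytes taken from whatever random source it is given.

```python
import hashlib
import os

def derive_secret(random_bytes=os.urandom):
    """Toy stand-in for key derivation: hash 32 bytes from the random source."""
    return hashlib.sha256(random_bytes(32)).hexdigest()

def broken_rng(n):
    """Deliberately broken generator: only ever outputs bytes of value one."""
    return b"\x01" * n

# With the real generator two runs on the same input differ, which is what
# a fuzzer reports as variable behaviour.
print(derive_secret() == derive_secret())

# With the broken generator every run yields the same result, so any change
# in behaviour can only come from the fuzzed input.
print(derive_secret(broken_rng) == derive_secret(broken_rng))
```

The same reasoning applies to timestamps: any input to the program that is not controlled by the fuzzer is a source of noise in its measurements.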

Conclusion

You may ask now what the point of all this is. Of course we already know where Heartbleed is, it has been patched, fixes have been deployed and it is mostly history. It's been analyzed thoroughly.

The question has been asked if Heartbleed could've been found by fuzzing. I'm confident to say the answer is yes. One thing I should mention here however: American fuzzy lop was already available back then, but it was barely known. It only received major attention later in 2014, after Michal Zalewski used it to find two variants of the Shellshock bug. Earlier versions of afl were much less handy to use, e.g. they didn't have 64 bit support out of the box. I remember that I failed to use an earlier version of afl with Address Sanitizer; it was only possible after a couple of issues were fixed. A lot of other things have been improved in afl, so at the time Heartbleed was found american fuzzy lop probably wasn't in a state that would've allowed finding it in an easy, straightforward way.

I think the takeaway message is this: We have powerful tools freely available that are capable of finding bugs like Heartbleed. We should use them and look for the other Heartbleeds that are still lingering in our software. Take a look at the Fuzzing Project if you're interested in further fuzzing work. There are beginner tutorials that I wrote with the idea in mind to show people that fuzzing is an easy way to find bugs and improve software quality.

I already used my sample application to fuzz the latest OpenSSL code. Nothing was found yet, but of course this could be further tweaked by trying different protocol versions, extensions and other variations in the handshake.

I also wrote a German article about this finding for the IT news webpage Golem.de.

Update:

I want to point out some feedback I got that I think is noteworthy.

On Twitter it was mentioned that Codenomicon actually found Heartbleed via fuzzing. There's a YouTube video from Codenomicon's Antti Karjalainen explaining the details. However, the way they did this was quite different: they built a protocol-specific fuzzer. The remarkable feature of afl is that it is very powerful without knowing anything specific about the protocol being used. Also it should be noted that Heartbleed was found twice; the first finder was Neel Mehta from Google.

Kostya Serebryany mailed me that he was able to replicate my findings with his own fuzzer which is part of LLVM, and it was even faster.

In the comments Michele Spagnuolo mentions that by compiling OpenSSL with -DOPENSSL_TLS_SECURITY_LEVEL=0 one can use very short and insecure RSA keys even in the latest version. Of course this shouldn't be done in production, but it is helpful for fuzzing and other testing efforts.

April 05, 2015
Denis Dupeyron a.k.a. calchan (homepage, bugs)

Here’s a bug and my response to it which both deserve a little bit more visibility than being buried under some random bug number. I’m actually surprised nobody complained about that before.

GNU R supports to run
> install.packages('ggplot2')
in the R console as user. The library ggplot2 will then be installed in users home. Most distros like debian and the like provide a package per library.

First, thank you for pointing out that it is possible to install and maintain your own packages in your $HOME. It didn’t use to work, and the reason why it now does is a little further down but I will not spoil it.

Here’s my response:

Please, do not ever add R packages to the tree. There are thousands of them and they are mostly very badly written, to be polite. If you look at other distros you will see that they give an illusion of having some R packages, but almost all of them (if not all) are seriously lagging behind their respective upstream or simply unmaintained. The reason is that it’s a massive amount of very frustrating and pointless work.

Upstream recommends maintaining your packages yourself in your $HOME and we’ll go with that. I have sent patches a couple of years ago to fix the way this works, and now (as you can obviously see) it does work correctly on all distros, not just Gentoo. Also, real scientists usually like to lock down the exact versions of packages they use, which is not possible when they are maintained by a third party.

If you want to live on the edge then feel free to ask Benda Xu (heroxbd) for access to the R overlay repository. It serves tens of thousands of ebuilds for R packages automatically converted from a number of sources. It mostly works, and helps in preserving a seemingly low but nonetheless functional level of mental sanity of your beloved volunteer developers.

That, or you maintain your own overlay of packages and have it added to layman.

While we are on that subject, I would like to publicly thank André Erdmann for the fantastic work he has done over the past few years. He wrote, and still occasionally updates, the magical software which runs behind the R overlay server. Thank you, André.

March 31, 2015
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v2.4 (March 31, 2015, 13:53 UTC)

I’m very pleased to announce this new release of py3status because it is by far the release with the most contributions, with a total of 33 files changed, 1625 insertions and 509 deletions!

I’ll start by thanking this release’s contributors, with a special mention for Federico Ceratto for his precious insights, his CLI idea and implementation, and his other module contributions.

Thank you

  • Federico Ceratto
  • @rixx (and her amazing reactivity)
  • J.M. Dana
  • @Gamonics
  • @guilbep
  • @lujeni
  • @obb
  • @shankargopal
  • @thomas-

IMPORTANT

In order to keep a clean and efficient code base, this is the last version of py3status supporting the legacy modules loading and ordering. This behavior will be dropped in the next version, 2.5!

CLI commands

py3status now supports some CLI commands which allow you to get information about all the available modules and their documentation.

  • list all available modules

If you specify your own inclusion folder(s) with the -i parameter, your modules will be listed too!

$ py3status modules list
Available modules:
  battery_level          Display the battery level.
  bitcoin_price          Display bitcoin prices using bitcoincharts.com.
  bluetooth              Display bluetooth status.
  clementine             Display the current "artist - title" playing in Clementine.
  dpms                   Activate or deactivate DPMS and screen blanking.
  glpi                   Display the total number of open tickets from GLPI.
  imap                   Display the unread messages count from your IMAP account.
  keyboard_layout        Display the current keyboard layout.
  mpd_status             Display information from mpd.
  net_rate               Display the current network transfer rate.
  netdata                Display network speed and bandwidth usage.
  ns_checker             Display DNS resolution success on a configured domain.
  online_status          Display if a connection to the internet is established.
  pingdom                Display the latest response time of the configured Pingdom checks.
  player_control         Control music/video players.
  pomodoro               Display and control a Pomodoro countdown.
  scratchpad_counter     Display the amount of windows in your i3 scratchpad.
  spaceapi               Display if your favorite hackerspace is open or not.
  spotify                Display information about the current song playing on Spotify.
  sysdata                Display system RAM and CPU utilization.
  vnstat                 Display vnstat statistics.
  weather_yahoo          Display Yahoo! Weather forecast as icons.
  whoami                 Display the currently logged in user.
  window_title           Display the current window title.
  xrandr                 Control your screen(s) layout easily.

  • get available modules details and configuration

$ py3status modules details
Available modules:
  battery_level          Display the battery level.
                         
                         Configuration parameters:
                             - color_* : None means - get it from i3status config
                             - format : text with "text" mode. percentage with % replaces {}
                             - hide_when_full : hide any information when battery is fully charged
                             - mode : for primitive-one-char bar, or "text" for text percentage ouput
                         
                         Requires:
                             - the 'acpi' command line
                         
                         @author shadowprince, AdamBSteele
                         @license Eclipse Public License
                         ---
[...]

Modules changelog

  • new bluetooth module by J.M. Dana
  • new online_status module by @obb
  • new player_control module, by Federico Ceratto
  • new spotify module, by Pierre Guilbert
  • new xrandr module to handle your screens layout from your bar
  • dpms module activate/deactivate the screensaver as well
  • imap module various configuration and optimizations
  • pomodoro module can use DBUS notify, play sounds and be paused
  • spaceapi module bugfix for space APIs without ‘lastchange’ field
  • keyboard_layout module incorrect parsing of “setxkbmap -query”
  • battery_level module better python3 compatibility

Other highlights

Full changelog here.

  • catch daylight savings time change
  • ensure modules methods are always iterated alphabetically
  • refactor default config file detection
  • rename and move the empty_class example module to the doc/ folder
  • remove obsolete i3bar_click_events module
  • py3status will soon be available on Debian thanks to Federico Ceratto!

Thank you for participating in Gentoo’s 2015 April Fools’ joke!

Now that April 1 has passed, we shed a tear as we say goodbye not only to CGA Web™ but also to our website. Our previous website, that is, which has been with us for more than a decade.

Until all contents are migrated, you can find the previous version on wwwold.gentoo.org. Please note that the contents found there are no longer maintained.

As this is indeed a major change, we’re still working out some rough edges and would appreciate your feedback via email to www@gentoo.org or on IRC in #gentoo-www.

We hope you appreciate the new look and had a great time finding out how terrible you are at Pong and are looking forward to seeing your reactions once again when we celebrate the launch of the new Gentoo Disk™ set.

As for Alex, Robin, and all co-conspirators, thank you again for your participation!


Old April Fools’ day announcement

Gentoo Linux today announced the launch of its new totally revamped and more inclusive website which was built to conform to the CGA Web™ graphics standards.

“Our previous website served the community superbly well for the past 10 years but was frankly not as inclusive as we would have liked as it could not be viewed by aspiring community members who did not have access to the latest hardware,” said a Gentoo Linux Council Member who asked not to be named.

“Dedicated community members worked all hours for many months to get the new site ready for its launch today. We are proud of their efforts and are convinced that the new site will be way more inclusive than ever and thereby deepen the sense of community felt by all,” they said.

“Gentoo Linux’s seven-person council determined that the interests of the community were not being served by the previous site and decided that it had to be made more inclusive,” said Web project lead Alex Legler (a3li). The new site was also available via Gopher (gopher://gopher.gentoo.org/).

“What’s the use of putting millions of colours out there when so many in the world cannot appreciate them and who, indeed, may even feel disappointed by their less capable hardware platforms,” he said.

“We accept that members in more fortunate circumstances may feel that a site with a 16-colour palette and an optimal screen resolution of 640 x 200 pixels is not the best fit for their needs but we urge such members to keep the greater good in mind. The vast majority of potential new Gentoo Linux users are still using IBM XT computers, storing their information on 5.25-inch floppy disks and communicating via dial-up BBS,” said Roy Bamford (neddyseagoon), a Foundation trustee.

“These people will be touched and grateful that their needs are now being taken into account and that they will be able to view the Gentoo Linux site comfortably on whatever hardware they have available.”

“The explosion of gratitude will ensure other leading firms such as Microsoft and Apple begin to move into conformance with CGA Web™ and it is hoped it will help bring knowledge to more and more informationally-disadvantaged people every year,” said Daniel Robbins (drobbins), former Gentoo founder.

Several teams participated in the early development of the new website and would like to showcase their work:

  • Games Team (JavaScript Pong)
  • Multimedia Team (Ghostbusters Theme on 6 floppy drives)
  • Net-News Team (A list of Gentoo newsgroups)

Phase II

The second phase of the project to get Gentoo Linux to a wider user base will involve the creation of floppy disk sets containing a compact version of the operating system and a selection of software essentials. It is estimated that sets could be created using less than 700 disks each and sponsorship is currently being sought. The launch of Gentoo Disk™ can be expected in about a year.

Media release prepared by A Jackson.

Editorial inquiries: PR team.

Interviews, photography and screen shots available on request.

March 30, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)

I found

Send email on SSH login using PAM

to be a great guide for setting up e-mail delivery for any successful log-in through SSH.

My current script:

#! /bin/bash
if [ "$PAM_TYPE" != "open_session" ]; then
  exit 0
fi

cat <<-BODY | mailx -s "Log-in to ${PAM_USER:-???}@$(hostname -f) \
(${PAM_SERVICE:-???}) detected" mail@example.org
        # $(LC_ALL=C date +'%Y-%m-%d %H:%M (UTC%z)')
        $(env | grep '^PAM_' | sort)
BODY

exit 0

March 27, 2015
Luca Barbato a.k.a. lu_zero (homepage, bugs)
Again on assert() (March 27, 2015, 12:25 UTC)

Since apparently there are still people not reading the fine man page.

If the macro NDEBUG was defined at the moment <assert.h> was last included, the macro assert() generates no code, and hence does nothing at all.
Otherwise, the macro assert() prints an error message to standard error and terminates the program by calling abort(3) if expression is false (i.e., compares equal to zero).
The purpose of this macro is to help the programmer find bugs in his program. The message “assertion failed in file foo.c, function do_bar(), line 1287” is of no help at all to a user.
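
The "last included" wording is easy to overlook, so here is a minimal sketch of what it means in practice. The helper names (counted_check, run_with_ndebug, run_without_ndebug) are made up for this illustration; the only thing relied upon is that <assert.h> may be included more than once and redefines assert() according to NDEBUG each time.

```c
#include <stdio.h>

/* NDEBUG is defined when <assert.h> is included here ... */
#define NDEBUG
#include <assert.h>

static int calls = 0;

/* Hypothetical check with a visible side effect. */
static int counted_check(void)
{
    calls++;
    return 1;
}

int run_with_ndebug(void)
{
    calls = 0;
    assert(counted_check());   /* generates no code: counted_check() never runs */
    return calls;
}

/* ... and no longer defined when it is included again. */
#undef NDEBUG
#include <assert.h>

int run_without_ndebug(void)
{
    calls = 0;
    assert(counted_check());   /* the expression is evaluated this time */
    return calls;
}
```

Note that the expression inside assert() disappears completely in the first function, which is one more reason never to put anything with side effects in there.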

I guess it is time to return to security and expand a bit on which practices are good and which are misguided ideas that should be eradicated to reduce the number of Denial of Service issues waiting to happen.

Security issues

The term “security issue” covers a lot of different kinds of situations. Usually, unhandled paths in the code lead to memory corruption, memory leaks, crashes and other less evident problems such as information leaks.

I’m focusing on crashes today. Assume that the others are usually more annoying or dangerous; that may or may not be true depending on the scenario:

If you are watching a movie and a glitch in the bitstream makes the application leak some memory, you would not care at all as long as you can enjoy your movie. If the same glitch makes VLC close suddenly a second before you get to see who the mastermind behind a really twisted plot is… I guess you’ll scream at whoever thought it was a good idea to crash there.

If a glitch might let an attacker run arbitrary code while you are watching your movie, you would probably prefer that your player just crash instead.

It is a false dichotomy, since what you really want is to have the glitch handled properly and keep watching the rest of the movie, without VLC crashing and leaving you with no meaningful information.

Errors must be handled; trading a crash for something else you consider worse is just being naive.

What is assert exactly?

assert is a debugging facility mandated by POSIX, C89 and C99. It is a macro that more or less looks like this:

#define assert(condition)                                 \
    if (condition) {                                      \
        do_nothing();                                     \
    } else {                                              \
        fprintf(stderr, "%d %s\n", __LINE__, __func__);   \
        abort();                                          \
    }

If the condition does not hold, crash. Here is the real-life version from musl:

#define assert(x) ((void)((x) || (__assert_fail(#x, __FILE__, __LINE__, __func__),0)))

How to use it

Asserts should be used to verify assumptions. While developing they help you to verify if your
assumptions meet reality. If not they tell you that you should investigate because something is
clearly wrong. They are not intended to be used in release builds.
– some wise Federico while talking about another language’s asserts

Usually, when you write some code, you might do something like this to make sure you aren’t doing anything wrong. You start with

int my_function_doing_difficult_computations(Structure *s)
{
   int a, b, c, idx;

   a = some_computation(s);
   /* .... */
   b = other_operations(a, s);
   /* .... */
   c = some_input(s, b);
   /* ... */
   idx = some_operation(a, b, c);

   return some_lut[idx];
}

Here idx is a signed integer, and so are a, b and c, each with a range that may or may not depend on some external input.

You do not want idx to end up outside the range of the lookup table array some_lut, and you are not so sure it won’t. How do you check that you aren’t getting outside the array?

When you write code you usually improve a prototype iteratively. You can add tests to make sure every function returns values within the expected range, and you can use assert() as a poor man’s C version of proper unit testing.

If some function depends on values outside your control (e.g. an input file), you usually validate them and cleanly error out there. Leaving external inputs unaccounted for or, even worse, putting an assert() there is really bad.
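
As a rough sketch of what "validate and cleanly error out" can look like, assume the table index comes straight from the first byte of an input buffer. The function name, the error codes and the table contents below are invented for the example; only the name some_lut is borrowed from the snippet above.

```c
#include <stddef.h>
#include <stdint.h>

enum { LUT_SIZE = 4 };

/* Made-up table contents, just to have something to look up. */
static const int some_lut[LUT_SIZE] = { 10, 20, 30, 40 };

/* Returns 0 on success and a negative error code otherwise.
 * The index comes from outside, so it is validated here,
 * before it gets anywhere near the table. */
int lookup_from_input(const uint8_t *buf, size_t len, int *out)
{
    if (!buf || !out || len < 1)
        return -1;               /* missing or truncated input */
    if (buf[0] >= LUT_SIZE)
        return -2;               /* index outside the table */

    *out = some_lut[buf[0]];
    return 0;
}
```

The caller gets an error code it can forward or report, and the table is never touched with an index nobody looked at.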

Unit testing and assert()

We want to make sure our function works fine, so let’s make a really tiny test.

void test_some_computation(void)
{
    Structure *s = NULL;
    int i;
    /* i is assumed to be a plain iteration counter for the generator */
    for (i = 0; input_generator(&s, i); i++) {
       int a = some_computation(s);
       assert(a > 0 && a < 10);
    }
}

It is compact, and you can then run your test under gdb and inspect a bit around. Quite good if you are refactoring the innards of some_computation() and you want to be sure you did not miss some corner case.

Here assert() is quite nice since we can pack the testcase in a single line and have a simple report if something went wrong. We could do better though, since assert does not tell us the value or how we ended up there.
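
One way to "do better" in a test is a small checking macro that prints the offending value and keeps going. This is only a sketch: CHECK_RANGE, failures and fake_computation() are names made up here, with fake_computation() standing in for the function under test.

```c
#include <stdio.h>

static int failures = 0;

/* Reports the expression, its value and the expected range,
 * then lets the test continue instead of aborting. */
#define CHECK_RANGE(val, lo, hi)                                     \
    do {                                                             \
        long v_ = (val);                                             \
        if (v_ < (lo) || v_ > (hi)) {                                \
            fprintf(stderr, "%s:%d: %s = %ld not in [%ld, %ld]\n",   \
                    __FILE__, __LINE__, #val, v_,                    \
                    (long)(lo), (long)(hi));                         \
            failures++;                                              \
        }                                                            \
    } while (0)

/* Hypothetical stand-in for some_computation(). */
static int fake_computation(int x)
{
    return x * 3;
}

/* Returns the number of out-of-range results. */
int test_fake_computation(void)
{
    failures = 0;
    for (int i = 1; i <= 4; i++)
        CHECK_RANGE(fake_computation(i), 1, 9);
    return failures;
}
```

With these inputs the last call yields 12, so the test reports exactly one failure together with the value that caused it.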

You might not be that thorough and you can just decide to put the same assert in your function and check there, assuming you cover all the input space properly using regression tests.

To crash or not to crash

The people who consider crashing at runtime OK (remember the sad user who cannot watch his wonderful movie till the end?) suggest leaving the asserts enabled at runtime.

If you consider the example above, would it be better to crash than to read a random value from memory? Again, this is a false dichotomy!

You can expect failures, e.g. broken bitstreams, and you want to just check for them and return a proper failure message.

In our case the return value of some_input() should be checked for failures and the error forwarded further up to the library user, who will then decide what to do.

What remains is the access to the lookup table. If you didn’t check the other functions sufficiently, you might get a bogus index, and with a bogus index you will read from random memory (crashing or not depending on whether that memory lies at an address mapped into the program). Do you want to have an assert() there? Or would you rather add another normal check with a normal failure path?

A correct answer is to test your code enough that you do not need to add yet another check. In fact, if the problem arises, it is wrong to add a check there or, even worse, an assert(): you should go up the execution path and fix the problem where it is, be it a non-validated input, a wrong “optimization” or something sillier.

There is an open debate on whether having assert() enabled is a good or bad practice when talking about defensive design. In C, in my opinion, it is a complete misuse. If you want to litter your release code with tons of branches, you can also spend the time to implement something better and make sure to clean up correctly. Calling abort() leaves your input and output in a possibly severely inconsistent state.

How to use it the wrong way

I want to trade a crash anytime the alternative is memory corruption
– some misguided guy

Assume you have something like this

int size = some_computation(s);
uint8_t *p;
uint8_t *buf = p = malloc(size);


while (some_related_computations(s)) {
   do_stuff_(s, p);
   p += 4;
}

assert(p - buf == size);

If some_computation() and some_related_computations() do not agree, you might write past the end of the allocated buffer! The naive person above starts talking about how the memory is corrupted by do_stuff_() and horrible things (e.g. foreign code execution) could happen without the assert() and how even calling return at that point is terrible and would lead to horrible horrible things.

Ok, NO. Stop NOW. Go up and look at how assert is implemented. If you check at that point that something went wrong, you have the corruption already. No matter what you do, somebody could exploit it depending on how naive or unlucky you have been.

Remember: assert() does do I/O, allocates memory, raises a signal and calls functions. All that you would rather not do when your memory is corrupted is done by assert().

You can be less naive.

int size = some_computation(s);
uint8_t *p;
uint8_t *buf = p = malloc(size);

/* stop before writing past the end of the buffer */
while (some_related_computations(s) && size >= 4) {
   do_stuff_(s, p);
   p    += 4;
   size -= 4;
}
assert(size == 0);

But then, instead of the assert you can just add

if (size != 0) {
    msg("Something went really wrong!");
    log("The state is %p", s->some_state);
    cleanup(s);
    goto fail;
}

This way when the “impossible” happens the user gets a proper notification and you can recover cleanly and no memory corruption ever happened.
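
Put together as something you can actually compile, the idea might look like the sketch below. It is not the code from any real project: fill_checked() and its chunks parameter are made up, memset() stands in for do_stuff_(), and the extra chunks != 0 test is my addition to also catch the case where the producer wanted to write more than the buffer can hold.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Writes `chunks` blocks of 4 bytes into a buffer of `size` bytes.
 * Returns 0 if both values agree, -1 otherwise (or on allocation failure).
 * The bound is checked before every write, so a mismatch is detected
 * without ever touching memory outside the buffer. */
int fill_checked(int size, int chunks)
{
    uint8_t *buf, *p;
    int ret = -1;

    if (size <= 0)
        return -1;

    buf = p = malloc(size);
    if (!buf)
        return -1;

    while (chunks > 0 && size >= 4) {
        memset(p, 0xAB, 4);      /* stand-in for do_stuff_() */
        p    += 4;
        size -= 4;
        chunks--;
    }

    if (size != 0 || chunks != 0)
        goto fail;               /* the "impossible" happened: report it */

    ret = 0;
fail:
    free(buf);
    return ret;
}
```

fill_checked(16, 4) succeeds, while fill_checked(16, 5) and fill_checked(16, 3) both return -1 through the normal failure path, with the buffer freed and nothing overwritten.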

Better than assert

Albeit easy to use and portable, assert() does not provide that much information; there are plenty of tools that can be leveraged to get better reporting.

In Closing

assert() is a really nice debugging tool and it helps a lot to make sure some state remains invariant while refactoring.

Leaving asserts in release code, on the other hand, is quite wrong: it does not give you any additional safety. Please do not buy the fairy tale that assert() saves you from the scary memory corruption issues, it does NOT.

March 26, 2015
Alex Legler a.k.a. a3li (homepage, bugs)
On Secunia’s Vulnerability Review 2015 (March 26, 2015, 19:44 UTC)

Today, Secunia have released their Vulnerability Review 2015, including various statistics on security issues fixed in the last year.

If you don’t know about Secunia’s services: They aggregate security issues from various sources into a single stream, or as they call it: they provide vulnerability intelligence.
In the past, this intelligence was available to anyone in a free newsletter or on their website. Recent changes however caused much of the useful information to go behind login and/or pay walls. This circumstance has also forced us at the Gentoo Security team to cease using their reports as references when initiating package updates due to security issues.

Coming back to their recently published document, there is one statistic that is of particular interest: Gentoo is listed as having the third largest number of vulnerabilities in a product in 2014.

from Secunia: Secunia Vulnerability Review 2015 (http://secunia.com/resources/reports/vr2015/)

Looking at the whole table, you’d expect at least one other Linux distribution with a similarly large pool of available packages, but you won’t find any.

So is Gentoo less secure than other distros? tl;dr: No.

As Secunia’s website does not let me see the actual “vulnerabilities” they have counted for Gentoo in 2014, there’s no way to actually find out how these numbers came into place. What I can see though are “Secunia advisories” which seem to be issued more or less for every GLSA we send. Comparing the number of posted Secunia advisories for Gentoo to those available for Debian 6 and 7 tells me something is rotten in the state of Denmark (scnr):
While there were 203 Secunia advisories posted for Gentoo in the last year, Debian 6 and 7 had 304, yet Debian would have to have fixed fewer than 105 vulnerabilities in (55+249=) 304 advisories to be at least rank 21 and thus not included in the table above. That doesn’t make much sense. Maybe issues in Gentoo’s packages are counted for the distribution as well, no idea.

That aside, 2014 was a good year in terms of security for Gentoo: The huge backlog of issues waiting for an advisory was heavily reduced as our awesome team managed to clean up old issues and make them known to glsa-check in three wrap-up advisories—and then we also issued 239 others, more than ever since 2007. Thanks to everyone involved!

March 18, 2015
Jan Kundrát a.k.a. jkt (homepage, bugs)

It is that time of the year again, and people are applying for Google Summer of Code positions. It's great to see a big crowd of newcomers. This article explains what sort of students are welcome in GSoC from the point of view of Trojitá, a fast Qt IMAP e-mail client. I suspect that many other projects within KDE share my views, but it's best to ask them. Hopefully, this post will help students understand what we are looking for, and assist in deciding what project to work on.

Finding a motivation

As a mentor, my motivation in GSoC is pretty simple: I want to attract new contributors to the project I maintain. This means that I value long-term sustainability above fancy features. If you are going to apply with us, make sure that you actually want to stick around. What happens when GSoC terminates? What happens when GSoC terminates and the work you've been doing is not ready yet? Do you see yourself continuing the work you've done so far? Or is it going to become abandonware, with some cash in your pocket being your only reward? Who is going to maintain the code which you worked hard to create?

Selecting an area of work

This is probably the most important aspect of your GSoC involvement. You're going to spend three months of full time activity on some project, a project you might not have heard about before. Why are you doing this: is it only about the money, or do you already have a connection to the project you've selected? Is the project trying to solve a problem that you find interesting? Would you use the results of that project even without the GSoC?

My experience shows that it's best to find a project which fills a niche that you find interesting. Do you have a digital camera, and do you think that a random photo editor's interface sucks? Work on that, make the interface better. Do you love listening to music? Maybe your favorite music player has some annoying bug that you could fix. Maybe you could add a feature to, say, synchronize the playlist with your cell phone (this is just an example, of course). Do you like 3D printing? Help improve existing software for 3D printing, then. Are you a database buff? Is there something you find lacking in, e.g., PostgreSQL?

Either way, it is probably a good idea to select something which you need to use, or want to use for some reason. It's of course fine to e.g. spend your GSoC term working on an astronomy tool even though you haven't used one before, but unless you really like astronomy, then you should probably choose something else. In case of Trojitá, if you have been using GMail's web interface for the past five years and you think that it's the best thing since sliced bread, well, chances are that you won't enjoy working on a desktop e-mail client.

Pick something you like, something which you enjoy working with.

Making a proposal

An excellent idea is to make yourself known in advance. This does not happen by joining the IRC channel and saying "I want to work on GSoC", or mailing us to let us know about this. A much better way of getting involved is through showing your dedication.

Try to play with the application you are about to apply for. Do you see some annoying bug? Fix it! Does it work well? Use the application more; you will find bugs. Look at the project's bug tracker, maybe there are some issues which people are hitting. Do you think that you can fix it? Diving into bug fixing is an excellent opportunity to get yourself familiar with the project's code base, and to make sure that our mentors know the style and pace of your work.

Now that you have some familiarity with the code, maybe you can already see opportunities for work besides what's already described on the GSoC ideas wiki page. That's fine: the best proposals usually come from students who have found them on their own. The list of ideas is just that, a list of ideas, not an exhaustive cookbook. There's usually much more that can be done during the course of the GSoC. What would be the most interesting area for you? How does it fit into the bigger picture?

After you've thought about the area to work on, now it's time to write your proposal. Start early, and make sure that you talk about your ideas with your prospective mentors before you spend three hours preparing a detailed roadmap. Define the goals that you want to achieve, and talk with your mentors about them. Make sure that the work fits well with the length and style of the GSoC.

And finally, be sure that you stay open and honest with your mentoring team. Remember, this is not a contest of writing the best project proposal. For me, GSoC is all about finding people who are interested in working on, say, Trojitá. What I'm looking for are honest, fair-behaving people who demonstrate willingness to learn new stuff. On top of that, I like to accept people with whom I have already worked. Hearing about you for the first time when I read your GSoC proposal is not a perfect way of introducing yourself. Make yourself known in advance, and show us how you can help us make our project better. Show us that you want to become a part of that "we".

Patrick Lauer a.k.a. bonsaikitten (homepage, bugs)
Upgrading ThunderBird (March 18, 2015, 01:35 UTC)

With the recent update from the LongTimeSuffering / ExtendedSufferingRelease of Thunderbird from 24 to 31 we encountered some serious badness.

The best description of the symptoms would be "IMAP doesn't work at all"
On some machines the existing accounts would disappear, on others they would just be inert and never receive updates.

After some digging I was finally able to find the cause of this:
Too old config file.

Uhm ... what? Well - some of these accounts have been around since TB2. Some newer ones were enhanced by copying the prefs.js from existing accounts. And so there's a weird TB bugreport that is mostly triggered by some bits being rewritten around Firefox 30, and the config parser screwing up with translating 'old' into 'new', and ... effectively ... IMAP being not-whitelisted, thus by default blacklisted, and hilarity happens.

Should you encounter this bug you "just" need to revert to a prefs.js from before the update (sigh) and then remove all lines involving "capability.policy".
Then update and ... things work. Whew.

Why not just remove profile and start with a clean one you say? Well ... for one TB gets brutally unusably slow if you have emails. So just re-reading the mailbox content from a local fast IMAP server will take ~8h and TB will not respond to user input during that time.
And then you manually have to go into eeeevery single subfolder so that TB remembers it is there and actually updates it. That's about one work-day per user lost to idiocy, so sed'ing the config file into compliance is the easy way out.
Thank you, Mozilla, for keeping our lives exciting!

March 17, 2015
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
mongoDB 3.0.1 (March 17, 2015, 13:46 UTC)

This is a much-awaited version bump coming to portage and I’m glad to announce it made its way to the tree today!

First of all, many thanks to Tomas Mozes and Darko Luketic for their amazing help, feedback and patience!

mongodb-3.0.1

I introduced quite some changes in this ebuild which I wanted to share with you and warn you about. MongoDB upstream have stripped quite a bunch of things out of the main mongo core repository which I have in turn split into ebuilds.

Major changes :

  • respect upstream’s optimization flags : unless in debug build, user’s optimization flags will be ignored to prevent crashes and weird behaviour.
  • shared libraries for C/C++ are not built by the core mongo repository anymore, so I removed the static-libs USE flag.
  • various dependency optimizations to trigger a rebuild of mongoDB when one of its linked dependencies changes.

app-admin/mongo-tools

The new tools USE flag allows you to pull a new ebuild named app-admin/mongo-tools which installs the commands listed below. Obviously, you can now just install this package if you only need those tools on your machine.

  • mongodump / mongorestore
  • mongoexport / mongoimport
  • mongotop
  • mongofiles
  • mongooplog
  • mongostat
  • bsondump

app-admin/mms-agent

The MMS agent now has some real version numbers and I don’t have to host its source on Gentoo’s infra woodpecker. At the moment only the monitoring agent is available; should anyone request the backup one, I’ll be glad to add support for it too.

dev-libs/mongo-c(xx)-driver

I took this opportunity to add the dev-libs/mongo-cxx-driver to the tree and bump the mongo-c-driver one. Thank you to Balint SZENTE for his insight on this.