
Contributors:
. Aaron W. Swenson
. Agostino Sarubbo
. Alec Warner
. Alex Alexander
. Alex Legler
. Alexey Shvetsov
. Alexis Ballier
. Alexys Jacob
. Amadeusz Żołnowski
. Andreas K. Hüttel
. Anthony Basile
. Arun Raghavan
. Bernard Cafarelli
. Bjarke Istrup Pedersen
. Brent Baude
. Brian Harring
. Christian Ruppert
. Chí-Thanh Christopher Nguyễn
. David Abbott
. Denis Dupeyron
. Detlev Casanova
. Diego E. Pettenò
. Domen Kožar
. Donnie Berkholz
. Doug Goldstein
. Eray Aslan
. Fabio Erculiani
. Gentoo Haskell Herd
. Gentoo Miniconf 2016
. Gentoo Monthly Newsletter
. Gentoo News
. Gilles Dartiguelongue
. Greg KH
. Hanno Böck
. Hans de Graaff
. Ian Whyman
. Ioannis Aslanidis
. Jan Kundrát
. Jason A. Donenfeld
. Jeffrey Gardner
. Jeremy Olexa
. Joachim Bartosik
. Johannes Huber
. Jonathan Callen
. Jorge Manuel B. S. Vicetto
. Joseph Jezak
. Kenneth Prugh
. Kristian Fiskerstrand
. Lance Albertson
. Liam McLoughlin
. LinuxCrazy Podcasts
. Luca Barbato
. Luis Francisco Araujo
. Markos Chandras
. Mart Raudsepp
. Matt Turner
. Matthew Marlowe
. Matthew Thode
. Matti Bickel
. Michael Palimaka
. Michal Hrusecky
. Michał Górny
. Mike Doty
. Mike Gilbert
. Mike Pagano
. Nathan Zachary
. Ned Ludd
. Nirbheek Chauhan
. Pacho Ramos
. Patrick Kursawe
. Patrick Lauer
. Patrick McLean
. Pavlos Ratis
. Paweł Hajdan, Jr.
. Peter Wilmott
. Petteri Räty
. Piotr Jaroszyński
. Rafael Goncalves Martins
. Raúl Porcel
. Remi Cardona
. Richard Freeman
. Robin Johnson
. Ryan Hill
. Sean Amoss
. Sebastian Pipping
. Steev Klimaszewski
. Stratos Psomadakis
. Sven Vermeulen
. Sven Wegener
. Thomas Kahle
. Tiziano Müller
. Tobias Heinlein
. Tobias Klausmann
. Tom Wijsman
. Tomáš Chvátal
. Vikraman Choudhury
. Vlastimil Babka
. Yury German
. Zack Medico

Last updated:
July 29, 2016, 09:06 UTC

Disclaimer:
Views expressed in the content published here do not necessarily represent the views of Gentoo Linux or the Gentoo Foundation.



Powered by:
Planet Venus

Welcome to Gentoo Universe, an aggregation of weblog articles on all topics written by Gentoo developers. For a more refined aggregation of Gentoo-related topics only, you might be interested in Planet Gentoo.

July 28, 2016

Description:
Paps is a UTF-8 to PostScript converter that makes use of Pango. It provides both a standalone command-line tool and a library.

It was discovered that a crafted or empty file can cause a heap-based buffer overflow.
Apparently, the project has not had a release since 2007 and seemed to be dead, but I have just discovered that it quietly moved to GitHub, where the pull request with the fix has been sent.

The complete ASan output:

# paps $crafted.file
=================================================================                                                                                                                                                                                                              
==30527==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000dfaf at pc 0x0000004e122d bp 0x7ffd8f3dfe90 sp 0x7ffd8f3dfe88                                                                                                                                      
READ of size 1 at 0x60200000dfaf thread T0                                                                                                                                                                                                                                     
    #0 0x4e122c in read_file /tmp/portage/app-text/paps-0.6.8-r1/work/paps-0.6.8/src/paps.c:573:7                                                                                                                                                                              
    #1 0x4e122c in main /tmp/portage/app-text/paps-0.6.8-r1/work/paps-0.6.8/src/paps.c:493                                                                                                                                                                                     
    #2 0x7fd8aff707af in __libc_start_main (/lib64/libc.so.6+0x207af)                                                                                                                                                                                                          
    #3 0x436968 in _start (/usr/bin/paps+0x436968)                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                               
0x60200000dfaf is located 1 bytes to the left of 4-byte region [0x60200000dfb0,0x60200000dfb4)                                                                                                                                                                                 
allocated by thread T0 here:                                                                                                                                                                                                                                                   
    #0 0x4bdc75 in realloc (/usr/bin/paps+0x4bdc75)                                                                                                                                                                                                                            
    #1 0x7fd8b111c35d in g_realloc (/usr/lib64/libglib-2.0.so.0+0x4e35d)                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                               
SUMMARY: AddressSanitizer: heap-buffer-overflow /tmp/portage/app-text/paps-0.6.8-r1/work/paps-0.6.8/src/paps.c:573 read_file                                                                                                                                                   
Shadow bytes around the buggy address:                                                                                                                                                                                                                                         
  0x0c047fff9ba0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa                                                                                                                                                                                                              
  0x0c047fff9bb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa                                                                                                                                                                                                              
  0x0c047fff9bc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa                                                                                                                                                                                                              
  0x0c047fff9bd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa                                                                                                                                                                                                              
  0x0c047fff9be0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa                                                                                                                                                                                                              
=>0x0c047fff9bf0: fa fa fa fa fa[fa]04 fa fa fa 00 02 fa fa 00 02                                                                                                                                                                                                              
  0x0c047fff9c00: fa fa fd fd fa fa fd fd fa fa fd fd fa fa fd fa                                                                                                                                                                                                              
  0x0c047fff9c10: fa fa fd fa fa fa 00 00 fa fa 00 00 fa fa 00 00                                                                                                                                                                                                              
  0x0c047fff9c20: fa fa 00 00 fa fa 00 fa fa fa 00 00 fa fa fd fa                                                                                                                                                                                                              
  0x0c047fff9c30: fa fa fd fa fa fa fd fa fa fa 00 00 fa fa 00 00                                                                                                                                                                                                              
  0x0c047fff9c40: fa fa 00 fa fa fa 00 00 fa fa 00 00 fa fa 00 fa                                                                                                                                                                                                              
Shadow byte legend (one shadow byte represents 8 application bytes):                                                                                                                                                                                                           
  Addressable:           00                                                                                                                                                                                                                                                    
  Partially addressable: 01 02 03 04 05 06 07                                                                                                                                                                                                                                  
  Heap left redzone:       fa                                                                                                                                                                                                                                                  
  Heap right redzone:      fb                                                                                                                                                                                                                                                  
  Freed heap region:       fd                                                                                                                                                                                                                                                  
  Stack left redzone:      f1                                                                                                                                                                                                                                                  
  Stack mid redzone:       f2                                                                                                                                                                                                                                                  
  Stack right redzone:     f3                                                                                                                                                                                                                                                  
  Stack partial redzone:   f4                                                                                                                                                                                                                                                  
  Stack after return:      f5                                                                                                                                                                                                                                                  
  Stack use after scope:   f8                                                                                                                                                                                                                                                  
  Global redzone:          f9                                                                                                                                                                                                                                                  
  Global init order:       f6                                                                                                                                                                                                                                                  
  Poisoned by user:        f7                                                                                                                                                                                                                                                  
  Container overflow:      fc                                                                                                                                                                                                                                                  
  Array cookie:            ac                                                                                                                                                                                                                                                  
  Intra object redzone:    bb                                                                                                                                                                                                                                                  
  ASan internal:           fe                                                                                                                                                                                                                                                  
  Left alloca redzone:     ca                                                                                                                                                                                                                                                  
  Right alloca redzone:    cb                                                                                                                                                                                                                                                  
==30527==ABORTING

Affected version:
All versions.

Fixed version:
0.6.8-r2 (in Gentoo)

Commit fix:
https://gitweb.gentoo.org/repo/gentoo.git/tree/app-text/paps/files/paps-0.6.8-fix-empty-file.patch

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.
This bug was fixed by Jason A. Donenfeld of Gentoo.

CVE:
N/A

Timeline:
2015-06-09: bug discovered
2015-11-17: bug reported downstream (Gentoo)
2016-07-12: fix produced downstream
2016-07-28: advisory release

Note:
This bug was found with American Fuzzy Lop.

Permalink: paps: heap-based buffer overflow in read_file() (paps.c)

July 27, 2016
Nirbheek Chauhan a.k.a. nirbheek (homepage, bugs)

Two months ago, I talked about how we at Centricular have been working on a Meson port of GStreamer and its basic dependencies (glib, libffi, and orc) for various reasons — faster builds, better cross-platform support (particularly Windows), better toolchain support, ease of use, and for a better build system future in general.

Meson also has built-in support for things like gtk-doc, gobject-introspection, translations, etc. It can even generate Visual Studio project files at build time so projects don't have to expend resources maintaining those separately.

Today I'm here to share instructions on how to use Cerbero (our “aggregating” build system) to build all of GStreamer on Windows using MSVC 2015 (wherever possible). Note that this means you won't see any Meson invocations at all because Cerbero does all that work for you.

Note that this is still all unofficial and has not been proposed for inclusion upstream. We still have a few issues that need to be ironed out before we can do that¹.

First, you need to set up the environment on Windows by installing a bunch of external tools: Python 2, Python 3, Git, etc. You can find the instructions for that here:

https://github.com/centricular/cerbero#windows

This is very similar to the old Cerbero instructions, but some new tools are needed. Once you've done everything there (Visual Studio especially takes a while to fetch and install itself), the next step is fetching Cerbero:

$ git clone https://github.com/centricular/cerbero.git

This will clone and checkout the meson-1.8 branch that will build GStreamer 1.8.x. Next, we bootstrap it:

https://github.com/centricular/cerbero#bootstrap

Now we're (finally) ready to build GStreamer. Just invoke the package command:

python2 cerbero-uninstalled -c config/win32-mixed-msvc.cbc package gstreamer-1.0

This will build all the `recipes` that constitute GStreamer, including the core libraries and all the plugins along with their external dependencies. This comes to about 76 recipes. Out of all these, only the following are ported to Meson and are built with MSVC:

bzip2.recipe
orc.recipe
libffi.recipe (only 32-bit)
glib.recipe
gstreamer-1.0.recipe
gst-plugins-base-1.0.recipe
gst-plugins-good-1.0.recipe
gst-plugins-bad-1.0.recipe
gst-plugins-ugly-1.0.recipe

The rest still mostly use Autotools, plain GNU make, or CMake. Almost all of these are still built with MinGW. The only exception is libvpx, which uses its own custom make-based build system but is built with MSVC.

Eventually we want to build everything including all external dependencies with MSVC by porting everything to Meson, but as you can imagine it's not an easy task. :-)

However, even with just these recipes, there is a large improvement in how quickly you can build all of GStreamer inside Cerbero on Windows. For instance, the time required for building gstreamer-1.0.recipe which builds gstreamer.git went from 10 minutes to 45 seconds. It is now easier to do GStreamer development on Windows since rebuilding doesn't take an inordinate amount of time!

As a further improvement for doing GStreamer development on Windows, for all these recipes (except libffi because of complicated reasons), you can also generate Visual Studio 2015 project files and use them from within Visual Studio for editing, building, and so on.

Go ahead, try it out and tell me if it works for you!

As an aside, I've also been working on some proper in-depth documentation of Cerbero that explains how the tool works, the recipe format, supported configurations, and so on. You can see the work-in-progress if you wish to.

1. Most importantly, the tests cannot be built yet because GStreamer bundles a very old version of libcheck. I'm currently working on fixing that.

July 18, 2016
Sebastian Pipping a.k.a. sping (homepage, bugs)
Gimp 2.9.4 now in Gentoo (July 18, 2016, 16:24 UTC)

Hi there!

Just a quick heads up that Gimp 2.9.4 is now available in Gentoo.

Upstream has an article on what’s new with Gimp 2.9.4: GIMP 2.9.4 Released

Gentoo Miniconf 2016 a.k.a. miniconf-2016 (homepage, bugs)

The call for papers for the Gentoo Miniconf 2016 (8th+9th October 2016 in Prague) will close in two weeks, on 1 August 2016. Time to get your submission ready!

July 15, 2016
Hanno Böck a.k.a. hanno (homepage, bugs)
Insecure updates in Joomla before 3.6 (July 15, 2016, 17:35 UTC)

In early April I reported security problems with the update process to the security contact of Joomla. While the issue has been fixed in Joomla 3.6, the communication process was far from ideal.

The issue itself is pretty simple: Up until recently Joomla fetched information about its updates over unencrypted and unauthenticated HTTP without any security measures.

The update process works in three steps. First of all, the Joomla backend fetches a file list.xml from update.joomla.org that contains information about current versions. If a newer version than the one installed is found, the user gets a button that allows him to update Joomla. The file list.xml references a URL for each version with further information about the update, called extension_sts.xml. Interestingly, this file is fetched over HTTPS, while - in version 3.5 - the file list.xml is not. However, this does not help, as the attacker can already intervene at the first step and serve a malicious list.xml that references whatever he wants. In extension_sts.xml there is a download URL for a zip file that contains the update.

Exploiting this as a man-in-the-middle attacker is trivial: Requests to update.joomla.org need to be redirected to an attacker-controlled host. Then the attacker can place his own list.xml, which will reference his own extension_sts.xml, which will contain a link to a backdoored update. I have created a trivial proof of concept for this (just place it on the HTTP host that update.joomla.org gets redirected to).

I think it should be obvious that software updates are a security sensitive area and need to be protected. Using HTTPS is one way of doing that. Using any kind of cryptographic signature system is another way. Unfortunately it seems common web applications are only slowly learning that. Drupal only switched to HTTPS updates earlier this year. It's probably worth checking whether other web applications that have integrated update processes are secure (WordPress is secure, fwiw).
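
To illustrate the difference in a few lines (Joomla itself is PHP, and the URLs below are placeholders rather than Joomla's real endpoints), here is a minimal Python sketch of fetching an update manifest over plain HTTP versus over verified HTTPS:

import urllib.request

# Plain HTTP: a man-in-the-middle can hand the updater any manifest he wants.
insecure = urllib.request.urlopen('http://update.example.org/list.xml').read()

# HTTPS with certificate verification (the default since Python 3.4.3): the
# manifest can only come from the genuine update server.
secure = urllib.request.urlopen('https://update.example.org/list.xml').read()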

Now here's how the Joomla developers handled this issue: I contacted Joomla via their webpage on April 6th. Their webpage form didn't have a way to attach files, so I offered to send them the proof of concept via email; shortly after, I got a reply asking for it. This was the only communication from their side. Around two months later, on June 14th, I asked about the status of the issue and warned that I would soon publish it if I didn't get a reaction. I never got any reply.

In the meantime Joomla had published beta versions of the then-upcoming version 3.6. I checked them and noted that the update URL had been changed from http://update.joomla.org/ to https://update.joomla.org/. So while they weren't communicating with me, it seemed a fix was on its way. I then found that there was a pull request and a GitHub discussion that had started even before I first contacted them. Joomla 3.6 was released recently, so the issue is fixed. However, the release announcement doesn't mention it.

So all in all I contacted them about a security issue they were already in the process of fixing. The problem itself is therefore solved. But the lack of communication about the issue certainly doesn't cast a good light on Joomla's security process.

July 11, 2016
Michał Górny a.k.a. mgorny (homepage, bugs)
Common filesystem I/O pitfalls (July 11, 2016, 12:41 UTC)

Filesystem I/O is one of the key elements of the standard library in many programming languages. Most of them derive it from the interfaces provided by the standard C library, potentially wrapped in some portability and/or OO sugar. Most of them share an impressive set of pitfalls for careless programmers.

In this article, I would like to briefly go over a few more or less common pitfalls that come to mind.

Overwriting the file in-place

This one will be remembered by me as the ‘setuptools screwup’ for quite some time. Consider the following snippet:

if not self.dry_run:
    ensure_directory(target)
    f = open(target,"w"+mode)
    f.write(contents)
    f.close()

This is the code that setuptools used to install scripts. At first glance, it looks good — and seems to work well, too. However, think of what happens if the file at target already exists.

The obvious answer would be: it is overwritten. The more commonly noticed pitfall here is that the old contents are discarded before the new ones are written. If the user happens to run the script before it is completely written, he’ll get unexpected results. If the writes fail for some reason, the user will be left with a partially written new script.

While in the case of installations this is not very significant (after all, a failure in the middle of an installation is never a good thing, mid-file or not), this becomes very important when dealing with data. Imagine a program updating your data this way — a failure to add new data (or unexpected program termination, power loss…) would instantly cause all previous data to be erased.

However, there is another problem with this concept. In fact, it does not strictly overwrite the file — it opens it in-place and implicitly truncates it. This causes more important issues in a few cases:

  • if the file is hardlinked to another file(s) or is a symbolic link, then the contents of all the linked files are overwritten,
  • if the file is a named pipe, the program will hang waiting for the other end of the pipe to be open for reading,
  • other special files may cause other unexpected behavior.

This is exactly what happened in Gentoo. Package-installed script wrappers were symlinked to python-exec, and setuptools used by pip attempted to install new scripts on top of those wrappers. But instead of overwriting the wrappers, it overwrote python-exec and broke everything relying on it.

The lesson is simple: don’t overwrite files like this. The easy way around it is to unlink the file first — ensuring that any links are broken, and special files are removed. The more correct way is to use a temporary file (created safely), and use the atomic rename() call to replace the target with it (no unlinking needed then). However, it should be noted that the rename can fail, and a fallback with unlink and an explicit copy is then necessary.
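
A minimal Python sketch of that safer approach might look like this (the helper name is made up for illustration):

import os
import tempfile

def replace_atomically(target, contents):
    # The temporary file is created securely in the target's directory, so
    # rename() stays on one filesystem and remains an atomic replacement.
    dirname = os.path.dirname(target) or '.'
    fd, tmppath = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(contents)
        os.rename(tmppath, target)
    except OSError:
        # The write or the rename failed; clean up the temporary file and let
        # the caller decide on a fallback (e.g. unlink + explicit copy).
        os.unlink(tmppath)
        raise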

Path canonicalization

For some reason, many programmers have taken a fancy to canonicalizing paths. While canonicalization itself is not that bad, it’s easy to do it wrongly and cause a major headache. Let’s take a look at the following path:

//foo/../bar/example.txt

You could say it’s ugly. It has a double slash and a parent directory reference. It almost itches to canonicalize it into the prettier:

/bar/example.txt

However, this path is not necessarily the same as the original.

For a start, let’s imagine that foo is actually a symbolic link to baz/ooka. In this case, its parent directory referenced by .. is actually /baz, not /, and the obvious canonicalization fails.

Furthermore, double slashes can be meaningful. For example, on Windows double slash (yes, yes, backslashes are used normally) would mean a network resource. In this case, stripping the adjacent slash would change the path to a local one.

So, if you are really into canonicalization, first make sure to understand all the rules governing your filesystem. On POSIX systems, you really need to take symbolic links into consideration — usually you start with the left-most path component and expand all symlinks recursively (keeping in mind that the link target path may itself contain more symlinks). Once all symbolic links are expanded, you can safely start interpreting the .. components.
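
In Python terms (and leaving the double-slash subtlety aside), the difference between lexical and symlink-aware canonicalization looks roughly like this, reusing the /foo → /baz/ooka symlink from the example above:

import os.path

# Purely lexical cleanup: '..' is collapsed without consulting the filesystem.
print(os.path.normpath('/foo/../bar/example.txt'))   # /bar/example.txt

# Symlink-aware resolution: if /foo is a symlink to /baz/ooka, its parent is
# /baz, so the result becomes /baz/bar/example.txt instead.
print(os.path.realpath('/foo/../bar/example.txt'))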

However, if you are going to do that, think of another path:

/usr/lib/foo

If you expand it on a common Gentoo old-style multilib system, you’ll get:

/usr/lib64/foo

However, now imagine that the /usr/lib symlink is replaced with a directory, and the appropriate files are moved to it. At this point, the path recorded by your program is no longer correct since it relies on a canonicalization done using a different directory structure.

To summarize: think twice before canonicalizing. While it may seem beneficial to have pretty paths or use real filesystem paths, you may end up discarding the user’s preferences (if I set a symlink somewhere, I don’t want the program automagically switching to another path). If you really insist on it, consider all the consequences and make sure you do it correctly.

Relying on xattr as an implementation for ACL/caps

Since common C libraries do not provide proper file copying functions, many people attempted to implement their own with better or worse results. While copying the data is a minor problem, preserving the metadata requires a more complex solution.

The simpler programs focused on copying the properties retrieved via stat() — modes, ownership and times. The more correct ones also added support for copying extended attributes (xattrs).

Now, it is a known fact that Linux filesystems implement many metadata extensions using extended attributes — ACLs, capabilities, security contexts. Sadly, this causes many people to assume that copying extended attributes is guaranteed to copy all of that extended metadata as well. This is a bad assumption to make, even though it is correct on Linux. It will cause your program to work fine on Linux but silently fail to copy ACLs on other systems.

Therefore: always use explicit APIs, and never rely on implementation details. If you want to work on ACLs, use the ACL API (provided by libacl on Linux). If you want to use capabilities, use the capability API (libcap or libcap-ng).

Using incompatible APIs interchangeably

Now for something less common. There are at least three different file locking mechanisms on Linux — the somewhat portable, non-standardized flock() function, the POSIX lockf(), and the (also POSIX) fcntl() commands. The Linux manpages say that both of the other interfaces are commonly implemented using fcntl(). However, this is not guaranteed, and mixing the mechanisms can produce unpredictable results on different systems.
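
Python conveniently exposes both locking styles, which makes the distinction easy to demonstrate; note that, despite its name, fcntl.lockf() is documented as a wrapper around fcntl() locking. A minimal sketch (the lock file path is just an example):

import fcntl
import os

fd = os.open('/tmp/example.lock', os.O_CREAT | os.O_RDWR, 0o600)

# BSD-style lock, i.e. flock(2).
fcntl.flock(fd, fcntl.LOCK_EX)
fcntl.flock(fd, fcntl.LOCK_UN)

# POSIX record lock; in CPython this wraps fcntl(2) locking, not lockf(3).
# Whether the two kinds of locks interact is system-dependent, so pick one.
fcntl.lockf(fd, fcntl.LOCK_EX)
fcntl.lockf(fd, fcntl.LOCK_UN)

os.close(fd)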

Dealing with the two standard file APIs is even more curious. On one hand, we have high-level stdio interfaces including FILE* and DIR*. On the other, we have all fd-oriented interfaces from unistd. Now, POSIX officially supports converting between the two — using fileno(), dirfd(), fdopen() and fdopendir().

However, it should be noted that the result of such a conversion reuses the same underlying file descriptor (rather than duplicating it). Two major points, however:

  1. There is no well-defined way to destroy a FILE* or DIR* without closing the descriptor, nor any guarantee that fclose() or closedir() will work correctly on a closed descriptor. Therefore, you should not create more than one FILE* (or DIR*) for a fd, and if you have one, always close it rather than the fd itself.
  2. The stdio streams are explicitly stateful, buffered and have some extra magic on top (like ungetc()). Once you start using stdio I/O operations on a file, you should not try to use low-level I/O (e.g. read()) or the other way around since the results are pretty much undefined. Supposedly fflush() + rewind() could help but no guarantees.

So, if you want to do I/O, decide whether you want stdio or fd-based I/O. Convert between the two types only when you need to use additional routines not available for the other one; but if those routines involve some kind of content-related operations, avoid using the other type for I/O. If you need to do separate I/O, use dup() to get a clone of the file descriptor.
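
Translated into Python, where open file objects play the role of stdio streams, the dup() advice looks roughly like this (the file name is hypothetical):

import os

fd = os.open('example.txt', os.O_RDONLY)   # hypothetical existing file

# Wrap a *duplicate* of the descriptor: the buffered file object owns and
# closes the duplicate, while the original fd stays valid. The two still
# share one file offset, so avoid mixing I/O on both (see point 2 above).
f = os.fdopen(os.dup(fd))
data = f.read()
f.close()        # closes only the duplicate

os.close(fd)     # the original descriptor is still ours to close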

To summarize: avoid combining different APIs. If you really insist on doing that, check if it is supported and what are the requirements for doing so. You have to be especially careful not to run into undefined results. And as usual — remember that different systems may implement things differently.

Atomicity of operations

Finally, something commonly known, and even more commonly repeated — race conditions due to non-atomic operations. Long story short: all the unexpected results that stem from the assumption that nothing can happen to a file between successive function calls.

I think the most common mistake is the ‘does the file exist?’ problem. It is awfully common for programs to use some wrappers over stat() (like os.path.exists() in Python) to check if a file exists, and then immediately proceed with opening or creating it. For example:

def do_foo(path):
    if not os.path.exists(path):
        return False

    f = open(path, 'r')

Usually, this will work. However, if the file gets removed between the precondition check and the open(), the program will raise an exception instead of returning False. For example, this can practically happen if the file is part of a large directory tree being removed via rm -r.

The double bug here can easily be fixed by introducing explicit error handling, which also renders the precondition check unnecessary:

import errno

def do_foo(path):
    try:
        f = open(path, 'r')
    except OSError as e:
        if e.errno == errno.ENOENT:
            return False
        raise

The new snippet ensures that the file will be open if it exists at the point of open(). If it does not, errno will indicate an appropriate error. For other errors, we are re-raising the exception. If the file is removed post open(), the fd will still be valid.

We could extend this to a few generic rules:

  1. Always check for errors, even if you asserted that they should not happen. Proper error checks make many (unsafe) precondition checks unnecessary.
  2. Open file descriptors will remain valid even when the underlying files are removed; paths can become invalid (i.e. reference non-existing files or directories) or start pointing to another file (created using the same path). So, prefer opening the file as early as possible, and fstat(), fchown(), futimes()… over stat(), chown(), utimes().
  3. Open directory descriptors will continue to reference the same directory even when the underlying path is removed or replaced; paths may start referencing another directory. When performing operations on multiple files in a directory, prefer opening the directory and using openat(), unlinkat()… However, note that the directory can still be removed and therefore further calls may return ENOENT.
  4. If you need to atomically overwrite a file with another one, use rename(). To atomically create a new file, use open() with O_EXCL. Usually, you will want to use the latter to create a temporary file, then the former to replace the actual file with it.
  5. If you need to use temporary files, use mkstemp() or mkdtemp() to create them securely. The former can be used when you only need an open fd (the file is removed immediately), the latter if you need visible files. If you want to use tmpnam(), put it in a loop and try opening with O_EXCL to ensure you do not accidentally overwrite something.
  6. When you can’t guarantee atomicity, use locks to prevent simultaneous operations. For file operations, you can lock the file in question. For directory operations, you can create and lock lock files (however, do not rely on the existence of lock files alone). Note though that POSIX locks are non-mandatory — i.e. they only prevent other programs from explicitly acquiring the lock, but do not block them from performing I/O that ignores the locks.
  7. Think about the order of operations. If you create a world-readable file, and afterwards chmod() it, it is possible for another program to open it before the chmod() and retain the open handle while secure data is being written. Instead, restrict the access via mode parameter of open() (or umask()).
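
As a minimal Python illustration of rules 4 and 7: the file below is created atomically, failing if it already exists, and with a restrictive mode from the start instead of being chmod()ed afterwards.

import os

# The filename is hypothetical. O_EXCL makes the creation fail if the file
# already exists, and the 0o600 mode applies from the very first moment, so
# no other program can grab an open handle before a later chmod().
fd = os.open('secret.conf', os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
try:
    os.write(fd, b'secret data\n')
finally:
    os.close(fd)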

July 08, 2016
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)
Lab::Measurement 3.512 released (July 08, 2016, 16:03 UTC)

Immediately at the heels of the previous post, I've just uploaded Lab::Measurement 3.512. It fixes some problems in the Yokogawa GS200 driver introduced in the previous version. Enjoy!

July 05, 2016
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

It's been some time since the last Lab::Measurement blog post; we are at Lab::Measurement version 3.511 by now. Here are the most important changes since 3.31:

  • One big addition "under the hood", which is still in flux, was a generic input/output framework for status and error messages. 
  • The device cache code has seen updates and bugfixes. 
  • Agilent multimeter drivers have been cleaned up and rewritten. 
  • Minimal support has been added for the Agilent E8362A network analyzer.
  • The Oxford Instruments IPS driver has been sprinkled with consistency checks and debug output, the ITC driver has seen bugfixes.
  • Controlling an Oxford Instruments Triton system is work in progress.
  • The Stanford Research SR830 lock-in now supports using the auxiliary inputs as "multimeters" and the auxiliary outputs as voltage sources.
  • Support for the Keithley 2400 multimeter, the Lakeshore 224 temperature monitor, and the Rohde&Schwarz SMB100A RF source has been added.
  • Work on generic SCPI parsing utilities has started.
  • Sweeps can now also vary pulse length and pulse repetition rate; the "time sweep" has been enhanced.
  • Test routines (both with instruments attached and software-only) are work in progress.

Lab::VISA has also seen a new bugfix release, 3.04. Changes since version 3.01 are:
  • Support for VXI_SERVANT resources has been removed; these are NI-specific and not available in 64bit VISA.
  • The documentation, especially on compiling and installing on recent Windows installations, has been improved. No need for Visual Studio and similar giga-downloads anymore!
  • Compiling on both 32bit and 64bit Windows 10 should now work without manual modifications in Makefile.PL.
Enjoy!

July 04, 2016
Jason A. Donenfeld a.k.a. zx2c4 (homepage, bugs)

After quite a bit of hard work, I've at long last launched WireGuard, a secure network tunnel that uses modern crypto, is extremely fast, and is easy and pleasurable to use. You can read about it at the website, but in short, it's based on the simple idea of an association between public keys and permitted IP addresses. Along the way it uses some nice crypto tricks to achieve its goal. For performance it lives in the kernel, though cross-platform versions in safe languages like Rust, Go, etc. are on their way.

The launch was wildly successful. About 10 minutes after I ran /etc/init.d/nginx restart, somebody had already put it on Hacker News and the Twittersphere, and within 24 hours the site had seen 150,000 unique IPs. The reception has been very warm, and the mailing list has already started to get some patches. Distro maintainers have stepped up and packages are being prepared. There are currently packages for Gentoo, Arch, Debian, and OpenWRT, which is very exciting.

Although it's still experimental and not yet in final stable/secure form, I'd be interested in general feedback from experimenters and testers.

June 29, 2016
Arun Raghavan a.k.a. ford_prefect (homepage, bugs)
Beamforming in PulseAudio (June 29, 2016, 05:22 UTC)

In case you missed it — we got PulseAudio 9.0 out the door, with the echo cancellation improvements that I wrote about. Now is probably a good time for me to make good on my promise to expand upon the subject of beamforming.

As with the last post, I’d like to shout out to the wonderful folks at Aldebaran Robotics who made this work possible!

Beamforming

Beamforming as a concept is used in various aspects of signal processing including radio waves, but I’m going to be talking about it only as applied to audio. The basic idea is that if you have a number of microphones (a mic array) in some known arrangement, it is possible to “point” or steer the array in a particular direction, so sounds coming from that direction are made louder, while sounds from other directions are rendered softer (attenuated).

Practically speaking, it should be easy to see the value of this on a laptop, for example, where you might want to focus a mic array to point in front of the laptop, where the user probably is, and suppress sounds that might be coming from other locations. You can see an example of this in the webcam below. Notice the grilles on either side of the camera — there is a microphone behind each of these.

Webcam with 2 mics

This raises the question of how this effect is achieved. The simplest approach is called “delay-sum beamforming”. The key idea is that if we want to steer an array of microphones at a particular angle, the sound coming from that direction will reach each microphone at a slightly different time. This is illustrated below. The image is taken from this great article describing the principles and math in a lot more detail.

Delay-sum beamforming

In this figure, you can see that the sound from the source we want to listen to reaches the top-most microphone slightly before the next one, which in turn captures the audio slightly before the bottom-most microphone. If we know the distance between the microphones and the angle to which we want to steer the array, we can calculate the additional distance the sound has to travel to each microphone.

The speed of sound in air is roughly 340 m/s, and thus we can also calculate how much of a delay occurs between the same sound reaching each microphone. The signal at the first two microphones is delayed using this information, so that we can line up the signal from all three. Then we take the sum of the signal from all three (actually the average, but that’s not too important).

The signal from the direction we’re pointing in is going to be strongly correlated, so it will turn out loud and clear. Signals from other directions will end up being attenuated because they will only occur in one of the mics at a given point in time when we’re summing the signals — look at the noise wavefront in the illustration above as an example.

Implementation

(this section is a bit more technical than the rest of the article, feel free to skim through or skip ahead to the next section if it’s not your cup of tea!)

The devil is, of course, in the details. Given the microphone geometry and steering direction, calculating the expected delays is relatively easy. We capture audio at a fixed sample rate — let’s assume this is 32000 samples per second, or 32 kHz. That translates to one sample every 31.25 µs. So if we want to delay our signal by 125µs, we can just add a buffer of 4 samples (4 × 31.25 = 125). Sound travels about 4.25 cm in that time, so this is not an unrealistic example.

Now, instead, assume the signal needs to be delayed by 80 µs. This translates to 2.56 samples. We’re working in the digital domain — the mic has already converted the analog vibrations in the air into digital samples that have been provided to the CPU. This means that our buffer delay can either be 2 samples or 3, not 2.56. We need another way to add a fractional delay (else we’ll end up with errors in the sum).
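
The arithmetic is simple enough to sketch in a few lines of Python, using the same numbers as above:

SPEED_OF_SOUND = 340.0    # m/s, approximately
SAMPLE_RATE = 32000       # samples per second

def delay_in_samples(extra_distance_m):
    # Time the wavefront needs to cover the extra distance, expressed in samples.
    return extra_distance_m / SPEED_OF_SOUND * SAMPLE_RATE

print(delay_in_samples(0.0425))   # 4.0  -> a whole-sample buffer works (125 µs)
print(delay_in_samples(0.0272))   # 2.56 -> needs a fractional delay (80 µs)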

There is a fair amount of academic work describing methods to perform filtering on a sample to provide a fractional delay. One common way is to apply an FIR filter. However, to keep things simple, the method I chose was the Thiran approximation — the literature suggests that it performs the task reasonably well, and has the advantage of not having to spend a whole lot of CPU cycles first transforming to the frequency domain (which an FIR filter requires) (edit: converting to the frequency domain isn’t necessary, thanks to the folks who pointed this out).

I’ve implemented all of this as a separate module in PulseAudio as a beamformer filter module.

Now it’s time for a confession. I’m a plumber, not a DSP ninja. My delay-sum beamformer doesn’t do a very good job. I suspect part of it is the limitation of the delay-sum approach, partly the use of an IIR filter (which the Thiran approximation is), and it’s also entirely possible there is a bug in my fractional delay implementation. Reviews and suggestions are welcome!

A Better Implementation

The astute reader has, by now, realised that we are already doing a bunch of processing on incoming audio during voice calls — I’ve written in the previous article about how the webrtc-audio-processing engine provides echo cancellation, automatic gain control, voice activity detection, and a bunch of other features.

Another feature that the library provides is — you guessed it — beamforming. The engineers at Google (who clearly are DSP ninjas) have a pretty good beamformer implementation, and this is also available via module-echo-cancel. You do need to configure the microphone geometry yourself (which means you have to manually load the module at the moment). Details are on our wiki (thanks to Tanu for that!).

How well does this work? Let me show you. The image below is me talking to my laptop, which has two microphones about 4cm apart, on either side of the webcam, above the screen. First I move to the right of the laptop (about 60°, assuming straight ahead is 0°). Then I move to the left by about the same amount (the second speech spike). And finally I speak from the center (a couple of times, since I get distracted by my phone).

The upper section represents the microphone input — you’ll see two channels, one corresponding to each mic. The bottom part is the processed version, with echo cancellation, gain control, noise suppression, etc. and beamforming.

WebRTC beamforming

You can also listen to the actual recordings …

… and the processed output.

Feels like black magic, doesn’t it?

Finishing thoughts

The webrtc-audio-processing-based beamforming is already available for you to use. The downside is that you need to load the module manually, rather than have this automatically plugged in when needed (because we don’t have a way to store and retrieve the mic geometry). At some point, I would really like to implement a configuration framework within PulseAudio to allow users to set configuration from some external UI and have that be picked up as needed.

Nicolas Dufresne has done some work to wrap the webrtc-audio-processing library functionality in a GStreamer element (and this is in master now). Adding support for beamforming to the element would also be good to have.

The module-beamformer bits should be a good starting point for folks who want to wrap their own beamforming library and have it used in PulseAudio. Feel free to get in touch with me if you need help with that.

June 25, 2016
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.0 (June 25, 2016, 15:31 UTC)

Oh boy, this new version is so amazing in terms of improvements and contributions that it’s hard to sum it up !

Before going into more explanations I want to dedicate this release to tobes whose contributions, hard work and patience have permitted this ambitious 3.0 : THANK YOU !

This is the graph of contributed commits since 2.9, just so you realise how much this version is thanks to him:

I can’t continue on without also thanking Horgix, who started this madness by splitting the code base into modular files, and pydsigner for his everlasting contributions and code reviews !

The git stat since 2.9 also speaks for itself:

 73 files changed, 7600 insertions(+), 3406 deletions(-)

So what’s new ?

  • the monolithic code base has been split into modules responsible for the given tasks py3status performs
  • major improvements in module output orchestration and execution, resulting in considerably reduced CPU consumption and better i3bar responsiveness
  • refactoring of user notifications with added dbus support and rate limiting
  • improved modules error reporting
  • py3status can now survive an i3status crash and will try to respawn it
  • a new ‘container’ module output type gives the ability to group modules together
  • refactoring of the time and tztime modules brings support for all the time macros (%d, %Z, etc.)
  • support for stopping py3status and its modules when i3bar hide mode is used
  • refactoring of general, contribution and most noticeably modules documentation
  • more details in the rest of the changelog

Modules

Along with a cool list of improvements on the existing modules, these are the new modules:

  • new group module to cycle display of several modules (check it out, it’s insanely handy !)
  • new fedora_updates module to check for your Fedora packages updates
  • new github module to check a github repository and notifications
  • new graphite module to check metrics from graphite
  • new insync module to check your current insync status
  • new timer module to have a simple countdown displayed
  • new twitch_streaming module to check if a Twitch streamer is online
  • new vpn_status module to check your VPN status
  • new xrandr_rotate module to rotate your screens
  • new yandexdisk_status module to display Yandex.Disk status

Contributors

And of course thank you to all the others who made this version possible !

  • @egeskow
  • Alex Caswell
  • Johannes Karoff
  • Joshua Pratt
  • Maxim Baz
  • Nathan Smith
  • Themistokle Benetatos
  • Vladimir Potapev
  • Yongming Lai

Donnie Berkholz a.k.a. dberkholz (homepage, bugs)
Time to retire (June 25, 2016, 07:08 UTC)

I’m sad to say it’s the end of the road for me with Gentoo, after 13 years volunteering my time (my “anniversary” is tomorrow). My time and motivation to commit to Gentoo have steadily declined over the past couple of years and eventually stopped entirely. It was an enormous part of my life for more than a decade, and I’m very grateful to everyone I’ve worked with over the years.

My last major involvement was running our participation in the Google Summer of Code, which is now fully handed off to others. Prior to that, I was involved in many things from migrating our X11 packages through the Big Modularization and maintaining nearly 400 packages to serving 6 terms on the council and as desktop manager in the pre-council days. I spent a long time trying to change and modernize our distro and culture. Some parts worked better than others, but the inertia I had to fight along the way was enormous.

No doubt I’ve got some packages floating around that need reassignment, and my retirement bug is already in progress.

Thanks, folks. You can reach me by email using my nick at this domain, or on Twitter, if you’d like to keep in touch.


Tagged: gentoo, x.org

June 23, 2016
Bernard Cafarelli a.k.a. voyageur (homepage, bugs)

I think there has been more than enough news on nextcloud origins, the fork from owncloud, … so I will keep this post to the technical bits.

Installing nextcloud on Gentoo

For now, the differences between owncloud 9 and nextcloud 9 are mostly cosmetic, so a few quick edits to the owncloud ebuild resulted in a working nextcloud one. And thanks to the different package names, you can install both in parallel (as long as they do not use the same database, of course).

So if you want to test nextcloud, it’s just a command away:

# emerge -a nextcloud

With the default webapp parameters, it will install alongside owncloud.

Migrating owncloud data

Nothing official again here, but as I run a small instance (not that much data) with a simple sqlite backend, I could copy the data and configuration to nextcloud and test it while keeping owncloud.
Adapt the paths and web user to your setup: I have these webapps in the default /var/www/localhost/htdocs/ path, with www-server as the web server user.

First, clean the test data and configuration (if you have already logged in to nextcloud):

# rm /var/www/localhost/htdocs/nextcloud/config/config.php
# rm -r /var/www/localhost/htdocs/nextcloud/data/*

Then clone owncloud’s data and config. If you feel adventurous (or short on available disk space), you can move these files instead of copying them:

# cp -a /var/www/localhost/htdocs/owncloud/data/* /var/www/localhost/htdocs/nextcloud/data/
# cp -a /var/www/localhost/htdocs/owncloud/config/config.php /var/www/localhost/htdocs/nextcloud/config/

Change all owncloud occurrences in config.php to nextcloud (there should be only one, for ‘datadirectory’). Then run the (nextcloud) updater. You can do it via the web interface, or (safer) with the CLI occ tool:

# sudo -u www-server php /var/www/localhost/htdocs/nextcloud/occ upgrade

As with “standard” owncloud upgrades, you will have to reactivate additional plugins after logging in. Also check the nextcloud log for potential warnings and errors.

In my test, the only non-official plugin that I use, files_reader (for ebooks), installed fine in nextcloud, and the rest worked just as well as in owncloud, with a lighter default theme 🙂
For now, owncloud-client works if you point it to the new /nextcloud URL on your server, but this can (and probably will) change in the future.

More migration tips and news can be found in this nextcloud forum post, including some nice detailed backup steps for mysql-backed systems migration.

June 22, 2016
Michał Górny a.k.a. mgorny (homepage, bugs)

In my previous post I described a number of pitfalls regarding Gentoo dependency specifications. However, I missed a minor point: the correctness of various dependency types in specific dependency classes. I am going to address this in this short post.

There are three classes of dependencies in Gentoo: build-time dependencies, which are installed before the source build happens; runtime dependencies, which should be installed before the package is installed to the live system; and ‘post’ dependencies, which are pretty much runtime dependencies whose installation can be delayed if necessary to avoid dependency loops. Now, there are some fun relationships between dependency classes and dependency types.

Blockers

Blockers are the dependencies used to prevent a specific package from being installed, or to force its uninstall. In modern EAPIs, there are two kinds of blockers: weak blockers (single !) and strong blockers (!!).

Weak blockers indicate that if the blocked package is installed, its uninstall may be delayed until the blocking package is installed. This is mostly used to solve file collisions between two packages — e.g. it allows the package manager to replace colliding files, then unmerge remaining files of the blocked package. It can also be used if the blocked package causes runtime issues on the blocking package.

Weak blockers make sense only in RDEPEND. While they’re technically allowed in DEPEND (making it possible for DEPEND=${RDEPEND} assignment to be valid), they are not really meaningful in DEPEND alone. That’s because weak blockers can be delayed post build, and therefore may not influence the build environment at all. In turn, after the build is done, build dependencies are no longer used, and unmerging the blocker does not make sense anymore.

Strong blockers indicate that the blocked package must be uninstalled before the dependency specification is considered satisfied. Therefore, they are meaningful both for build-time dependencies (where they indicate the blocker must be uninstalled before source build starts) and for runtime dependencies (where they indicate it must be uninstalled before install starts).

This leaves PDEPEND, which is a bit unclear. Again, technically both blocker types are valid. However, weak blockers in PDEPEND would be pretty much equivalent to those in RDEPEND, so there is no reason to use that class. Strong blockers in PDEPEND would logically be equivalent to weak blockers — since the satisfaction of this dependency class can be delayed post install.
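
Expressed as ebuild variables (with a hypothetical package name), the weak/strong distinction boils down to:

# Weak blocker: resolves e.g. a file collision, so it belongs in RDEPEND;
# the blocked package may be unmerged only after this one is installed.
RDEPEND="!app-misc/foo"

# Strong blocker: app-misc/foo must be uninstalled before the source build
# (in DEPEND) or before the install (in RDEPEND) starts.
DEPEND="!!app-misc/foo"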

Any-of dependencies and :* slot operator

This is just going to be a short reminder: those types of dependencies are valid in all dependency classes, but no binding between their occurrences is provided.

An any-of dependency in DEPEND indicates that at least one of the packages will be installed before the build starts. An any-of dependency in RDEPEND (or PDEPEND) indicates that at least one of them will be installed at runtime. There is no guarantee that the dependency used to satisfy DEPEND will be the same as the one used to satisfy RDEPEND, and the latter is fully satisfied when one of the listed packages is replaced by another.
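
For example (with hypothetical package names):

# Either foo or bar will be present for the build, and either will be
# present at runtime -- but not necessarily the same one in both cases.
DEPEND="|| ( app-misc/foo app-misc/bar )"
RDEPEND="${DEPEND}"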

A similar case occurs for :* operator — only that slots are used instead of separate packages.

:= slot operator

Now, the ‘equals’ slot operator is a fun one. Technically, it is valid in all dependency classes — for the simple reason of DEPEND=${RDEPEND}. However, it does not make sense in DEPEND alone, as it is used to force rebuilds of the installed package, while build-time dependencies apply only during the build.

The fun part is that for the := slot operator to be valid, the matching package needs to be installed when the metadata for the package is recorded — i.e. when a binary package is created or the built package is installed from source. For this to happen, a dependency guaranteeing this must be in DEPEND.

So, the common rule would be that a package dependency using := operator would have to be both in RDEPEND and DEPEND. However, strictly speaking the dependencies can be different as long as a package matching the specification from RDEPEND is guaranteed by DEPEND.
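As a minimal sketch of that rule (dev-libs/foo is a made-up package):

# the slot/subslot is recorded at build time, so the package matched by
# the := dependency must also be guaranteed by DEPEND
RDEPEND="dev-libs/foo:="
DEPEND="${RDEPEND}"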

June 21, 2016
Michał Górny a.k.a. mgorny (homepage, bugs)

During my work on Gentoo, I have seen many types of dependency pitfalls that developers fall into. Sad to say, their number is increasing with new EAPI features — we are constantly adding new ways to fail rather than working on simplifying things. I can’t say the learning curve is getting much steeper, but it is considerably easier to make a mistake.

In this article, I would like to point out a few common misunderstandings and pitfalls regarding slots, slot operators and any-of (|| ()) deps. All of those constructs are used to express dependencies that can usually be satisfied by multiple packages or package versions installable in parallel, and missing this point is often the cause of trouble.

Separate package dependencies are not combined into a single slot

One of the most common mistakes is to assume that multiple dependency specifications referring to the same package are going to be combined somehow. However, there is no such guarantee, and when a package becomes slotted this fact actually becomes significant.

Of course, some package managers take various precautions to prevent the following issues. However, such precautions not only can not be relied upon but may also violate the PMS.

For example, consider the following dependency specification:

>=app-misc/foo-2
<app-misc/foo-5

It is a common way of expressing version ranges (in this case, versions 2*, 3* and 4* are acceptable). However, if app-misc/foo is slotted and there are versions satisfying the dependencies in different slots, there is no guarantee that the dependency could not be satisfied by installing foo-1 (satisfies <foo-5) and foo-6 (satisfies >=foo-2) in two slots!

Similarly, consider:

app-misc/foo[foo]
bar? ( app-misc/foo[baz] )

This one is often used to apply multiple sets of USE flags to a single package. Once again, if the package is slotted, there is no guarantee that the dependency specifications will not be satisfied by installing two slots with different USE flag configurations.

However, those problems mostly apply to fully slotted packages such as sys-libs/db where multiple slots are actually meaningfully usable by a package. With the more common use of multiple slots to provide incompatible versions of the package (e.g. binary compatibility slots), there is a more important problem: that even a single package dependency can match the wrong slot.

For non-truly multi-slotted packages, the solution to all those problems is simple: always specify the correct slot. For truly multi-slotted packages, there is no easy solution.

For example, a version range has to be expressed using an any-of dep:

|| (
     =sys-libs/db-5*
     =sys-libs/db-4*
)

Multiple sets of USE flags? Well, if you really insist, you can combine them for each matching slot separately…

|| (
    ( sys-libs/db:5.3 tools? ( sys-libs/db:5.3[cxx] ) )
    ( sys-libs/db:5.1 tools? ( sys-libs/db:5.1[cxx] ) )
    …
)

The ‘equals’ slot operator and multiple slots

A similar problem applies to the use of the EAPI 5 ‘equals’ slot operator. The PMS notes that:

=
Indicates that any slot value is acceptable. In addition, for runtime dependencies, indicates that the package will break unless a matching package with slot and sub-slot equal to the slot and sub-slot of the best installed version at the time the package was installed is available.

[…]

To implement the equals slot operator, the package manager will need to store the slot/sub-slot pair of the best installed version of the matching package. […]

PMS, 8.2.6.3 Slot Dependencies

The significant part is that the slot and subslot are recorded for the best package version matched by the specification containing the operator. So again, if the operator is used on multiple dependencies that can match multiple slots, multiple slots can actually be recorded.

Again, this becomes really significant in truly slotted packages:

|| (
     =sys-libs/db-5*
     =sys-libs/db-4*
)
sys-libs/db:=

While one may expect the code to record the slot of sys-libs/db used by the package, this may actually record any newer version that is installed while the package is being built. In other words, this may implicitly bind to db-6* (and pull it in too).

For this to work, you need to ensure that the dependency with the slot operator can not match any version newer than the two requested:

|| (
     =sys-libs/db-5*
     =sys-libs/db-4*
)
<sys-libs/db-6:=

In this case, the dependency with the operator could still match earlier versions. However, the other dependency enforces (as long as it’s in DEPEND) that at least one of the two versions specified is installed at build-time, and therefore is used by the operator as the best version matching it.

The above block can easily be extended by a single set of USE dependencies (being applied to all the package dependencies including the one with slot operator). For multiple conditional sets of USE dependencies, finding a correct solution becomes harder…

The meaning of any-of dependencies

Since I have already started using the any-of dependencies in the examples, I should point out yet another problem. Many of Gentoo developers do not understand how any-of dependencies work, and make wrong assumptions about them.

In an any-of group, at least one immediate child element must be matched. A blocker is considered to be matched if its associated package dependency specification is not matched.

PMS, 8.2.3 Any-of Dependency Specifications

So, PMS guarantees that if at least one of the immediate child elements (package dependencies, nested blocks) of the any-of block is matched, the dependency is considered satisfied. This is the only guarantee PMS gives you. The two common mistakes are to assume that the order is significant and that any kind of binding between packages installed at build time and at run time is provided.

Consider an any-of dependency specification like the following:

|| (
    A
    B
    C
)

In this case, it is guaranteed that at least one of the listed packages is installed at the point appropriate for the dependency class. If none of the packages are installed already, it is customary to assume the Package Manager will prefer the first one — while this is not specified and may depend on satisfiability of the dependencies, it is a reasonable assumption to make.

If multiple packages are installed, it is undefined which one is actually going to be used. In fact, the package may even provide the user with explicit run time choice of the dependency used, or use multiple of them. Assuming that A will be preferred over B, and B over C is simply wrong.

Furthermore, if one of the packages is uninstalled, while one of the remaining ones is either already installed or being installed, the dependency is still considered satisfied. It is wrong to assume that in any case the Package Manager will bind to the package used at install time, or cause rebuilds when switching between the packages.

The ‘equals’ slot operator in any-of dependencies

Finally, I am reaching the point of lately recurring debates. Let me make it clear: our current policy states that under no circumstances may := appear anywhere inside any-of dependency blocks.

Why? Because it is meaningless, it is contradictory. It is not even undefined behavior, it is a case where requirements put for the slot operator can not be satisfied. To explain this, let me recall the points made in the preceding sections.

First of all, the implementation of the ‘equals’ slot operator requires the Package Manager to explicitly bind the slot/subslot of the dependency to the installed version. This can only happen if the dependency is installed — and an any-of block only guarantees that one of them will actually be installed. Therefore, an any-of block may trigger a case when PMS-enforced requirements can not be satisfied.

Secondly, the definition of an any-of block allows replacing one of the installed packages with another at run time, while the slot operator disallows changing the slot/subslot of one of the packages. The two requested behaviors are contradictory and do not make sense. Why bind to a specific version of one package, while any version of the other package is allowed?

Thirdly, the definition of an any-of block does not specify any particular order/preference of packages. If the listed packages do not block one another, you could end up having multiple of them installed, and bound to specific slots/subslots. Therefore, the Package Manager should allow you to replace A:1 with B:2 but not with B:1 nor with A:2. We’re reaching insanity now.

Now, all of the above is purely theoretical. The Package Manager can do pretty much anything given invalid input, and that is why many developers wrongly assume that slot operators work inside any-of. The truth is: they do not, the developer just did not test all the cases correctly. The Portage behavior varies from allowing replacements with no rebuilds to requiring both of the mutually exclusive packages to be installed simultaneously.

June 19, 2016
Michał Górny a.k.a. mgorny (homepage, bugs)

Since GLEP 67 was approved, bug assignment became easier. However, there were still many metadata.xml files which made this suboptimal. Today, I have fixed most of them and I would like to provide this short guide on how to write good metadata.xml files.

The bug assignment procedure

To understand the points that I am going to make, let’s take a look at how bug assignment happens these days. Assuming a typical case of bug related to a specific package (or multiple packages), the procedure for assigning the bug involves, for each package:

  1. reading all <maintainer/> elements from the package’s metadata.xml file, in order;
  2. filtering the maintainers based on restrict="" attributes (if any);
  3. filtering and reordering the maintainers based on <description/>s;
  4. assigning the bug to the first maintainer left, and CC-ing the remaining ones.

I think the procedure is quite clear. Since we no longer have <herd/> elements with special meaning applied to them, the assignment is mostly influenced by maintainer occurrence order. Restrictions can be used to limit maintenance to specific versions of a package, and descriptions to apply special rules and conditions.
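For illustration (addresses are made up), a metadata.xml following this scheme could look like the sketch below; the first maintainer gets the bug assigned, the second one is only CC-ed:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE pkgmetadata SYSTEM "http://www.gentoo.org/dtd/metadata.dtd">
<pkgmetadata>
	<!-- listed first: bugs get assigned here -->
	<maintainer type="person">
		<email>developer@gentoo.org</email>
	</maintainer>
	<!-- listed second: CC only, with a description a human can act on -->
	<maintainer type="person">
		<email>upstream@example.com</email>
		<description>Upstream; CC on bugs concerning upstream</description>
	</maintainer>
</pkgmetadata>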

Now, for semi-automatic bug assignment, only the first or the first two of the above steps can be clearly automated. Applying restrictions correctly requires understanding whether the bug can be correctly isolated to a specific version range, as some bugs (e.g. invalid metadata) may require being fixed in multiple versions of the package. Descriptions, in turn, are written for humans and require a human to interpret them.

What belongs in a good description

Now, many of the existing metadata.xml files had either useless or even problematic maintainer descriptions. This is a problem since it increases the time needed for bug assignment, and makes automation harder. Common examples of bad maintainer descriptions include:

  1. Assign bugs to him; CC him on bugs — this is either redundant or contradictory. Ensure that maintainers are listed in correct order, and bugs will be assigned correctly. Those descriptions only force a human to read them and possibly change the automatically determined order.
  2. Primary maintainer; proxied maintainer — this is some information but it does not change anything. If the maintainer comes first, he’s obviously the primary one. If the maintainer has a non-Gentoo e-mail address and there are proxies listed, he’s obviously proxied. And even if we did not know that, does it change anything? Again, we are forced to read information we do not need.

Good maintainer descriptions include:

  1. Upstream; CC on bugs concerning upstream, Only CC on bugs that involve USE=”d3d9″ — useful information that influences bug assignment;
  2. Feel free to fix/update, All modifications to this package must be approved by the wxwidgets herd. — important information for other developers.

So, before adding another description, please answer two questions: will the information benefit anyone? Can’t it be expressed in machine-readable form?

Proxy-maintained packages

Since a lot of the affected packages are maintained by proxied maintainers, I’d like to explicitly point out how proxy-maintained packages are to be described. This overlaps with the current Proxy maintainers team policy.

For proxy-maintained packages, the maintainers should be listed in the following order:

  1. actual package maintainers, in appropriate order — including developers maintaining or co-maintaining the package, proxied maintainers and Gentoo projects;
  2. developer proxies, preferably described as such — i.e. developers who do not actively maintain the package but only proxy for the maintainers;
  3. Proxy-maintainers project — serving as the generic fallback proxy.

I would like to put more emphasis on the key point here — the maintainers should be listed in an order making it clearly possible to distinguish packages that are maintained only by a proxied maintainer (with developers acting as proxies) from packages that are maintained by Gentoo developers and co-maintained by a proxied maintainer.
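A sketch of that ordering (the addresses are made up, and I believe the Proxy-maintainers project alias is proxy-maint@gentoo.org):

<pkgmetadata>
	<!-- 1. the actual (proxied) maintainer: gets the bugs -->
	<maintainer type="person">
		<email>contributor@example.com</email>
	</maintainer>
	<!-- 2. the developer acting as proxy, described as such -->
	<maintainer type="person">
		<email>developer@gentoo.org</email>
		<description>Proxy; not actively maintaining the package</description>
	</maintainer>
	<!-- 3. the generic fallback proxy -->
	<maintainer type="project">
		<email>proxy-maint@gentoo.org</email>
	</maintainer>
</pkgmetadata>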

Third-party repositories (overlays)

As a last point, I would like to point out the special case of unofficial Gentoo repositories. Unlike the core repositories, metadata.xml files can not be fully trusted there. The reason for this is quite simple — many users copy (fork) packages from Gentoo along with metadata.xml files. If we were to trust those files — we would be assigning overlay bugs to Gentoo developers maintaining the original package!

For this reason, all bugs on unofficial repository packages are assigned to the repository owners.

June 17, 2016
Michał Górny a.k.a. mgorny (homepage, bugs)
bugs.gentoo.org: bug assignment UserJS (June 17, 2016, 12:48 UTC)

Since time does not permit me to write in more extent, just a short note: yesterday, I have published a Gentoo Bugzilla bug assignment UserJS. When enabled, it automatically tries to find package names in bug summary, fetches maintainers for them (from packages.g.o) and displays them in a table with quick assignment/CC checkboxes.

Note that it’s still early work. If you find any bugs, please let me know. Patches will be welcome too. Some redesign would also be welcome, since it currently looks pretty bad: just the standard Bugzilla style applied to plain HTML.

Update: now on GitHub as bug-assign-user-js

June 15, 2016
Sven Vermeulen a.k.a. swift (homepage, bugs)
Comparing Hadoop with mainframe (June 15, 2016, 18:55 UTC)

At my work, I have the pleasure of being involved in a big data project that uses Hadoop as the primary platform for several services. As an architect, I try to get to know the platform's capabilities, its potential use cases, its surrounding ecosystem, etc. And although the implementation at work is not in its final form (yay agile infrastructure releases) I do start to get a grasp of where we might be going.

For many analysts and architects, this Hadoop platform is a new kid on the block, so I have some work to do explaining what it is and what it is capable of. Not for the fun of it, but to help the company make the right decisions, to support management and operations, and to lift the fear of new environments. One thing I once said is that "Hadoop is the poor man's mainframe", because I notice some high-level similarities between the two.

Somehow, it stuck, and I was asked to elaborate. So why not bring these points into a nice blog post :)

The big fat disclaimer

Now, before embarking on this comparison, I would like to state that I am not saying that Hadoop offers the same services, or even the same quality and functionality, as what can be found in mainframe environments. Considering how much time, effort and experience has already been put into the mainframe platform, it would be strange if Hadoop could match it. This post is to seek some similarities and, who knows, learn a few more tricks from one or the other.

Second, I am not deeply knowledgeable about mainframes. I've been involved as an IT architect in the database and workload automation technical domains, which also spanned the mainframe side of things, but most of the effort was within the distributed world. Mainframes remain somewhat opaque to me. Still, that shouldn't prevent me from making comparisons for those areas that I do have some grasp of.

And if my current understanding is just wrong, I'm sure that I'll learn from the comments that you can leave behind!

With that being said, here it goes...

Reliability, Availability, Serviceability

Let's start with some of the promises that both platforms make - and generally are also able to deliver. Those promises are of reliability, availability and serviceability.

For the mainframe platform, these quality attributes are shown as the mainframe strengths. The platform's hardware has extensive self-checking and self-recovery capabilities, the systems can recover from failed components without service interruption, and failures can be quickly determined and resolved. On the mainframes, this is done through a good balance and alignment of hardware and software, design decisions and - in my opinion - tight control over the various components and services.

I notice the same promises on Hadoop. Various components are checking the state of the hardware and other components, and when something fails, it is often automatically recovered without impacting services. Instead of tight control over the components and services, Hadoop uses a service architecture and APIs with Java virtual machine abstractions.

Let's consider hardware changes.

For hardware failure and component substitutions, both platforms are capable of dealing with those without service disruption.

  • Mainframe probably has a better reputation in this matter, as its components have a very high Mean Time Between Failure (MTBF), and many - if not all - of the components are set up in a redundant fashion. Lots of error detection and failure detection processes try to detect if a component is close to failure, and ensure proper transitioning of any workload towards the other components without impact.
  • Hadoop uses redundancy on a server level. If a complete server fails, Hadoop is usually able to deal with this without impact. Either the sensor-like services disable a node before it goes haywire, or the workload and data that was running on the failed node is restarted on a different node.

Hardware (component) failures on the mainframe side will not impact the services and running transactions. Component failures on Hadoop might have a noticeable impact (especially if it is OLTP-like workload), but will be quickly recovered.

Failures are more likely to happen on Hadoop clusters though, as it was designed to work with many systems that have a worse MTBF design than a mainframe. The focus within Hadoop is on resiliency and fast recoverability. Depending on the service that is being used, active redundancy can be in use (so disruptions are not visible to the user).

If the Hadoop workload includes anything that resembles online transactional processing, you're still better off with enterprise-grade hardware such as ECC memory to at least allow improved hardware failure detection (and perform proactive workload management). CPU failures are not that common (at least not those without any upfront Machine Check Exception - MCE), and disk/controller failures are handled through the abstraction of HDFS anyway.

For system substitutions, I think both platforms can deal with this in a dynamic fashion as well:

  • For the mainframe side (and I'm guessing here) it is possible to switch machines with no service impact if the services are running on LPARs that are joined together in a Parallel Sysplex setup (sort-of clustering through the use of the Coupling Facilities of mainframe, which is supported through high-speed data links and services for handling data sharing and IPC across LPARs). My company switched to the z13 mainframe last year, and was able to keep core services available during the migration.
  • For Hadoop systems, the redundancy on system level is part of its design. Extending clusters, removing nodes, moving services, ... can be done with no impact. For instance, switching the active HiveServer2 instance means de-registering it in the ZooKeeper service. New client connects are then no longer served by that HiveServer2 instance, while active client connections remain until finished. There are also in-memory data grid solutions such as through the Ignite project, allowing for data sharing and IPC across nodes, as well as building up memory-based services with Arrow, allowing for efficient memory transfers.

Of course, application-level code failures also tend to disrupt only that application, and not the other users. Be it because of different address spaces and tight runtime control (mainframe) or the use of different containers / JVMs for the applications (Hadoop), this is a good feature to have (even though it is not something that differentiates these platforms from other platforms or operating systems).

Let's talk workloads

When we look at a mainframe setup, we generally look at different workload patterns as well. There are basically two main workload approaches for the mainframe: batch, and On-Line Transactional Processing (OLTP) workload. In the OLTP type, there is often an additional distinction between synchronous OLTP and asynchronous OLTP (usually message-based).

Well, we have the same on Hadoop. It was once a pure batch-driven platform (and many of its components are still using batches or micro-batches in their underlying designs) but now also provides OLTP workload capabilities. Most of the OLTP workload on Hadoop is in the form of SQL-like or NoSQL database management systems with transaction manager support though.

To manage these (different) workloads, and to deal with prioritization of the workload, both platforms offer the necessary services to make things both managed as well as business (or "fit for purpose") focused.

  • Using the Workload Manager (WLM) on the mainframe, policies can be set on the workload classes so that an over-demand of resources (cross-LPARs) results in the "right" amount of allocations for the "right" workload. To actually manage the jobs themselves, the Job Entry Subsystem (JES) receives jobs and schedules them for processing on z/OS. For transactional workload, WLM provides the right resources to, for instance, the involved IMS regions.
  • On Hadoop, workload management is done through Yet Another Resource Negotiator (YARN), which uses (logical) queues for the different workloads. Workload (Application Containers) running through these queues can be controlled, resource-wise, both on the queue level (high-level resource control) and on the process level (low-level resource control) through the use of Linux Control Groups (CGroups - when using Linux-based systems, of course).

If I would try to compare both against each other, one might say that the YARN queues are like WLMs service classes, and for batch applications, the initiators on mainframe are like the Application Containers within YARN queues. The latter can also be somewhat compared to IMS regions in case of long-running Application Containers.

The comparison will not hold completely though. WLM can be tuned based on goals and will do dynamic decision making on the workloads depending on its parameters, and even do live adjustments on the resources (through the System Resources Manager - SRM). Heavy focus on workload management on mainframe environments is feasible because extending the available resources on mainframes is usually expensive (additional Million Service Units - MSU). On Hadoop, large cluster users who notice resource contention just tend to extend the cluster further. It's a different approach.
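To make the YARN side a bit more tangible, a queue definition in capacity-scheduler.xml can be as simple as the following sketch (the queue names and percentages are made up, and real deployments tune many more properties):

<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>batch,interactive</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.batch.capacity</name>
  <value>70</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.interactive.capacity</name>
  <value>30</value>
</property>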

Files and file access

Another thing that tends to confuse some new users on Hadoop is its approach to files. But when you know some things about the mainframe, this does remain understandable.

Both platforms have a sort-of master repository where data sets (mainframe) or files (Hadoop) are registered in.

  • On the mainframe, the catalog translates data set names into the right location (or points to other catalogs that do the same)
  • On Hadoop, the Hadoop Distributed File System (HDFS) NameNode is responsible for tracking where files (well, blocks) are located across the various systems

Considering the use of the repository, both platforms thus require the allocation of files and offer the necessary APIs to work with them. But this small comparison does not end here.

Depending on what you want to store (or access), the file format you use is important as well.

  • On mainframe, Virtual Storage Access Method (VSAM) provides both the methods (think of it as an API) as well as the format for a particular data organization. Inside a VSAM data set, multiple data entries can be stored in a structured way. Besides VSAM, there is also Partitioned Data Set/Extended (PDSE), which is more like a directory of sorts. Regular files are Physical Sequential (PS) data sets.
  • On Hadoop, a number of file formats are supported which optimize the use of the files across the services. One is Avro, which holds both methods and format (not unlike VSAM); another is Optimized Row Columnar (ORC). HDFS also has a number of options that can be enabled or set on certain locations (HDFS uses a folder-like structure), such as encryption, or on files themselves, such as the replication factor (a quick example follows below).
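As a small illustration of those per-location and per-file options (the paths and key name below are made up):

# raise the replication factor of a single important file
hdfs dfs -setrep -w 3 /data/reference/lookup.orc
# turn a directory into an encryption zone backed by a KMS key
hdfs crypto -createZone -keyName mykey -path /data/secure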

Although I don't say VSAM and Avro are very similar (Hadoop focuses more on the concept of files and then the file structure, whereas mainframe focuses on the organization and allocation aspects, if I'm not mistaken), they seem to be sufficiently similar to bring people's attention back to the table.

Services all around

What makes a platform tick is its multitude of supported services. And even here we can find similarities between the two platforms.

On mainframe, DBMS services can be offered by a multitude of software products. Relational DBMS services can be provided by IBM DB2, CA Datacom/DB, NOMAD, ... while other database types are covered by titles such as CA IDMS and ADABAS. All these titles build upon the capabilities of the underlying components and services to extend the platform's abilities.

On Hadoop, several database technologies exist as well. Hive offers a SQL layer on top of Hadoop managed data (so does Drill btw), HBase is a non-relational database (mainly columnar store), Kylin provides distributed analytics, MapR-DB offers a column-store NoSQL database, etc.

When we look at transaction processing, the mainframe platform shows its decades of experience with solutions such as CICS and IMS. Hadoop is still very much in its infancy here, but with projects such as Omid or commercial software solutions such as Splice Machine, transactional processing is coming here as well. Most of these are based on underlying database management systems which are extended with transactional properties.

And services that offer messaging and queueing are also available on both platforms: mainframe can enjoy Tibco Rendezvous and IBM WebSphere MQ, while Hadoop is hitting the news with projects such as Kafka and Ignite.

Services extend even beyond the ones that are directly user facing. For instance, both platforms can easily be orchestrated using workload automation tooling. Mainframe has a number of popular schedulers up its sleeve (such as IBM TWS, BMC Control-M or CA Workload Automation) whereas Hadoop is generally easily extended with the scheduling and workload automation software of the distributed world (which, given its market, is dominated by the same vendors, although many smaller ones exist as well). Hadoop also has its "own" little scheduling infrastructure called Oozie.

Programming for the platforms

Platforms, however, are more than just the sum of the services and properties that they provide. Platforms are used to build solutions on, and that is true for both mainframe as well as Hadoop.

Let's first look at scripting - using interpreted languages. On mainframe, you can use the Restructured Extended Executor (REXX) or CLIST (Command LIST). Hadoop gives you Tez and Pig, as well as Python and R (through PySpark and SparkR).

If you want to directly interact with the systems, mainframe offers the Time Sharing Option/Extensions (TSO/E) and Interactive System Productivity Facility (ISPF). For Hadoop, regular shells can be used, as well as service-specific ones such as Spark shell. However, for end users, web-based services such as Ambari UI (Ambari Views) are generally better suited.

If you're more fond of compiled code, mainframe supports you with COBOL, Java (okay, it's "a bit" interpreted, but also compiled - don't shoot me here), C/C++ and all the other popular programming languages. Hadoop builds on top of Java, but supports other languages such as Scala and allows you to run native applications as well - it's all about using the right APIs.

To support development efforts, Integrated Development Environments (IDEs) are provided for both platforms as well. You can use Cobos, Micro Focus Enterprise Developer, Rational Developer for System z, Topaz Workbench and more for mainframe development. Hadoop has you covered with web-based notebook solutions such as Zeppelin and JupyterHub, as well as client-level IDEs such as Eclipse (with the Hadoop Development Tools plugins) and IntelliJ.

Governing and managing the platforms

Finally, there is also the aspect of managing the platforms.

When working on the mainframe, management tooling such as the Hardware Management Console (HMC) and z/OS Management Facility (z/OSMF) cover operations for both hardware and system resources. On Hadoop, central management software such as Ambari, Cloudera Manager or Zettaset Orchestrator try to cover the same needs - although most of these focus more on the software side than on the hardware level.

Both platforms also have a reasonable use for multiple roles: application developers, end users, system engineers, database administrators, operators, system administrators, production control, etc. who all need some kind of access to the platform to support their day-to-day duties. And when you talk roles, you talk authorizations.

On the mainframe, the Resource Access Control Facility (RACF) provides access control and auditing facilities, and supports a multitude of services on the mainframe (such as DB2, MQ, JES, ...). Many major Hadoop services, such as HDFS, YARN, Hive and HBase support Ranger, providing a single pane for security controls on the Hadoop platform.

Both platforms also offer the necessary APIs or hooks through which system developers can fine-tune the platform to fit the needs of the business, or develop new integrated solutions - including security oriented ones. Hadoop's extensive plugin-based design (not explicitly named) or mainframe's Security Access Facility (SAF) are just examples of this.

Playing around

Going for a mainframe or a Hadoop platform will always be a management decision. Both platforms have specific roles and need particular profiles in order to support them. They are both, in my opinion, also difficult to migrate away from once you are really using them actively (lock-in) although it is more digestible for Hadoop given its financial implications.

Once you want to start meddling with it, getting access to a full platform used to be hard (the coming age of cloud services means that this is no longer the case though), and both therefore had some potential "small deployment" uses. Mainframe experience could be gained through the Hercules 390 emulator, whereas most Hadoop distributions have a single-VM sandbox available for download.

A full-scale roll-out, however, is much harder to do on your own. You'll need to have quite some experience or even expertise on so many levels that you will soon see that you need teams (plural) to get things done.

This concludes my (apparently longer than expected) write-down of this matter. If you don't agree, or are interested in some insights, be sure to comment!

Jan Kundrát a.k.a. jkt (homepage, bugs)

Trojitá, a fast Qt IMAP e-mail client, has a shiny new release. A highlight of the 0.7 version is support for OpenPGP ("GPG") and S/MIME ("X.509") encryption -- in a read-only mode for now. Here's a short summary of the most important changes:

  • Verification of OpenPGP/GPG/S-MIME/CMS/X.509 signatures and support for decryption of these messages
  • IMAP, MIME, SMTP and general bugfixes
  • GUI tweaks and usability improvements
  • Zooming of e-mail content and improvements for vision-impaired users
  • New set of icons matching the Breeze theme
  • Reworked e-mail header display
  • This release now needs Qt 5 (5.2 or newer, 5.6 is recommended)

As usual, the code is available in our git as a "v0.7" tag. You can also download a tarball (GPG signature). Prebuilt binaries for multiple distributions are available via the OBS, and so is a Windows installer.
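If you would rather build from source, a standard out-of-source CMake build should do the trick; the following is only a rough sketch (the archive name is assumed, and Qt 5 development packages are required):

# generic CMake workflow, not an official build recipe
tar xf trojita-0.7.tar.xz
cd trojita-0.7
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make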

The Trojitá developers

June 10, 2016
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Comet Coffee and Microbakery, St. Louis, MO

As with most cities, Saint Louis has a plethora of places to get a cup of coffee and some pastries or treats. The vast majority of those places are good, some of them are great, even fewer are exceptional standouts, and the top-tier is comprised of those that are truly remarkable. In my opinion, Comet Coffee and Microbakery finds its way into that heralded top tier. What determines whether or not a coffee shop or café earns that high of marks, you ask? Well, to some degree, that’s up to individual preference. For me, the criteria are:

  • A relaxing and inviting environment
  • Friendly and talented people
  • Exceptional food and beverage quality

First and foremost, I look for the environment that the coffee shop provides. Having a rather hectic work schedule, I like the idea of a place that I can go to unwind and just enjoy some of the more simplistic pleasures of life. Comet Coffee offers the small, intimate café-style setting that fits the bill for me. There are five or six individual tables and some longer benches inside, and a handful of tables outside on the front patio. Though the smaller space sometimes leads to crowding and a bit of a “hustle and bustle” feel, it doesn’t ever seem distracting. Also, during non-peak times, it tends to be quiet and peaceful.

Secondly, a café—or any other eatery, really—is about more than just the space itself. The employees make all the difference! At Comet Coffee, everyone is exceptionally talented in their craft, and it’s apparent that they deeply care about not only the customers they’re serving, but also the food and drinks that they’re making!

Comet Coffee employee Gretchen making a latte
Gretchen starting a latte
Comet Coffee employee Daniel making a pour-over
Daniel making a pour-over coffee

Thirdly, it should go without say that the food and drink quality are incredibly important factors for any café. At Comet, the coffee choices are seemingly limitless, so there are options that will satisfy any taste. Just in pour-overs alone, there are several different roasters (like Kuma, Sweet Bloom, Intelligentsia, Saint Louis’s own Blueprint, and others), from whom Comet offers an ever-changing list of varieties based on region (South American, African, et cetera). In addition to the pour-overs, there are many of the other coffee shop standards like lattes, espressos, macchiatos, cappuccinos, flat whites, and so on. Coffee’s not your thing? That’s fine too because they have an excellent and extensive selection of teas, ranging from the standard blacks, whites, and Darjeelings, to less common Oolongs, and my personal favourite green tea— the Genmaicha, which combines the delicate green tea flavours with toasted rice.

Comet Coffee's extensive tea selection

So between the coffees, espressos, and teas, you shouldn’t have any problem finding a beverage for any occasion or mood. But it isn’t just called “Comet Coffee”, it’s “Comet Coffee and Microbakery.” Though it almost sounds like an afterthought to the coffee, I assure you that the pastries and other baked goods share the stage as costars, and ones that often steal the show! There really isn’t a way for me to describe them that will do them justice. As someone who follows a rather rigid eating regimen, I won’t settle for anything less than stellar desserts and treats. That said, I’ve been blown away by every… single… one of them.

 

Comet Coffee Microbakery - Oat Cookies
Oat Cookies
Comet Coffee Microbakery - Strawberry Rhubarb Pie
Strawberry Rhubarb Pie
Comet Coffee Microbakery - Strawberry and Chocolate Ganache Macarons
Strawberry & Chocolate Ganache Macarons
Comet Coffee Microbakery - Buckwheat Muffin - Strawberry and Pistachio
Buckwheat muffin – Strawberry & Pistachio
Comet Coffee Microbakery - Tomato, Basil, and Mozzarella Quiche
Tomato, Basil, & Mozzarella Quiche
Comet Coffee Microbakery - Chocolate Chip Cookies and Cocoa Nibblers
Chocolate Chip Cookies & Cocoa Nibblers

You should definitely click on each image to view them in full size

 
Though I like essentially all of the treats, I do have my favourites. I tend to get the Oat Cookies most often because they are simple and fulfilling. One time, though, I went to Comet on a Sunday afternoon and the only thing left was a Buckwheat Muffin. Knowing that they simply don’t make any bad pastries, I went for it. Little did I know that it would become my absolute favourite! The baked goods vary with the availability of seasonal and local ingredients. For instance, the spring iteration of the Buckwheat Muffin is Strawberry Pistachio, but the previous one (which was insanely delicious) was Milk Chocolate & Hazelnut (think Nutella done in the best possible way). :)

One other testament to the quality of the treats is that Comet makes a few items that I have never liked anywhere else. For instance, I’m not a big fan of scones because they tend to be dry and often have the texture of coarse-grit sandpaper. However, the Lemon Poppy seed Scone and the Pear, Walnut & Goat Cheese Scone are both moist and satisfying. Likewise, I don’t really think much of Macarons, because they’re so light. These ones, however, have some substance to them and don’t make me think of overpriced cotton candy.

Okay, so now that I’ve sold you on Comet’s drinks and baked goods, here’s a little background about this great place. I recently had the opportunity to sit down with owners Mark and Stephanie, and talk with them about Comet’s past, current endeavours, and their future plans.

 

Coffee is about subtle nuances, and it can be continually improved upon. With all those nuances, I like it when one particular flavour note pops out.–Mark

 

Comet Coffee first opened its doors in August of 2012, and Mark immediately started renovating in order to align the space with his visions for the perfect shop. He and his fiancée Stephanie had worked together at Kaldi’s Coffee beforehand, but were inspired to open their own place. Between the two of them—Mark holding a degree in Economics, and Stephanie with degrees in both Hotel & Restaurant Management as well as Baking & Pastry Arts—the decision to foray into the industry together seemed like a given.

Mark had originally anticipated calling the shop “Demitasse,” after the small Turkish coffee cup by the same French name. Stephanie, though, did some research on the area of Saint Louis in which the shop is located (where the Forest Park Highlands amusement parks used to stand), and eventually found out about The Comet roller coaster. The name pays homage to those parks, and may have even been a little hopeful foreshadowing that the shop would become a well-established staple of the community.

When asked what separates Comet from other coffee shops, Mark readily mentioned that they themselves do not roast their own beans. He explained that doing so “requires purchasing [beans] in large quantities,” and that would disallow them from varying the coffee choices day-to-day. Similar to Mark’s comment about the quest of continuously improving the coffee experience, Stephanie indicated that the key to baking is to constantly modify the recipe based on the freshest available ingredients.

 

There are no compromises when baking. You must be meticulous with measurements, and you have to taste throughout the process to make adjustments.–Stephanie

 

Mark and Stephanie are currently in the process of opening an ice cream and bake shop in the Kirkwood area, and plan on carrying many of those items at Comet as well. Looking further to the future, Mark would like to open a doughnut shop where everything is made to order. His rationale—which, being a doughnut connoisseur myself, I find to be completely sound—is that everything fried needs to be as fresh as possible.

I, for one, can’t wait to try the new ice creams that Mark and Stephanie will offer in their Kirkwood location. For now, though, I will continue to enjoy the outstanding brews and unparalleled pastries at Comet Coffee. It has become a weekly go-to spot for me, and one that I look forward to greatly for unwinding after those difficult “Monday through Friday” stretches.

Comet Coffee Macchiato and seltzer
Macchiato and Seltzer
Comet Coffee pour-over brew
Pour-over brew

Cheers,
Zach

 
P.S. No time to leisurely enjoy the excellent café atmosphere? No worries, at least grab one to go. It will definitely beat what you can get from any of those chain coffee shops. :)

Comet Coffee to-go cup

June 06, 2016
Matthew Thode a.k.a. prometheanfire (homepage, bugs)
Gentoo, Openstack and OSIC (June 06, 2016, 05:00 UTC)

What to use it for

I recently applied for, and am using, an allocation from https://osic.org/ to extend support for running Openstack on Gentoo. The end goal of this is to allow Gentoo to become a gated job within the Openstack test infrastructure. To do that, we need to add support for building an image that can be used.

(pre)work

To speed up the work on adding support for generating an openstack infra Gentoo image, I had already completed work on adding Gentoo to diskimage-builder. You can see images at http://gentoo.osuosl.org/experimental/amd64/openstack/
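If you want to try that part yourself, building an image boils down to something like the sketch below (this assumes the element is simply called gentoo; exact flags may differ between diskimage-builder releases):

# install diskimage-builder and build a Gentoo qcow2 image from the
# (assumed) gentoo and vm elements
pip install diskimage-builder
disk-image-create -o gentoo-image gentoo vm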

(actual)work

The actual work has been slow going unfortunately; working with upstreams to add Gentoo support has tended to uncover other issues that need fixing along the way. The main thing that slowed me down, though, was the Openstack summit (Newton). That went on at the same time and reviews were delayed at least a week, usually two.

Since then, though, I've been able to work through some of the issues and have started testing the final image build in diskimage-builder.

More to do

The main things left to do are to add Gentoo support to the bindep element within diskimage-builder and to finish off any other rough edges in other elements (if they exist). After that, Openstack Infra can start caching a Gentoo image and the real work can begin: adding Gentoo support to the Openstack Ansible project to allow for better deployments.

May 27, 2016
New Gentoo LiveDVD "Choice Edition" (May 27, 2016, 00:00 UTC)

We’re happy to announce the availability of an updated Gentoo LiveDVD. As usual, you can find it on our downloads page.

May 26, 2016
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)
Akonadi for e-mail needs to die (May 26, 2016, 10:48 UTC)

So, I'm officially giving up on kmail2 (i.e., the Akonadi-based version of kmail) on the last one of my PCs now. I have tried hard and put in a lot of effort to get it working. However, it costs me a significant amount of time and effort just to be able to receive and read e-mail - meaning hanging IMAP resources every few minutes, the feared "Multiple merge candidates" bug popping up again and again, and other surprise events. That is plainly not acceptable in the workplace, where I need to rely on e-mail as means of communication. By leaving kmail2 I seem to be following many many other people... Even dedicated KDE enthusiasts that I know have by now migrated to Trojita or Thunderbird.

My conclusion after all these years, based on my personal experience, is that the usage of Akonadi for e-mail is a failed experiment. It was a nice idea in theory, and may work fine for some people. I am certain that a lot of effort has been put into improving it, I applaud the developers of both kmail and Akonadi for their tenaciousness and vision and definitely thank them for their work. Sadly, however, if something doesn't become robust and error-tolerant after over 5 (five) years of continuous development effort, the question pops up whether the initial architectural idea wasn't a bad one in the first place - in particular in terms of unhandleable complexity.

I am not sure why precisely in my case things turn out so badly. One possible candidate is the university mail server that I'm stuck with, running Novell Groupwise. I've seen rather odd behaviour in the IMAP replies in the past there. That said, there's the robustness principle for software to consider, and even if Groupwise were to do silly things, other IMAP clients seem to get along with it fine.

Recently I've heard some rumors about a new framework called Sink (or Akonadi-Next), which seems to be currently under development... I hope it'll be less fragile, and less overcomplexified. The choice of name is not really that convincing though (where did my e-mails go again)?

Now for the question and answer session...

Question: Why do you post such negative stuff? You are only discouraging our volunteers.
Answer: Because the motto of the drowned god doesn't apply to software. What is dead should better remain dead, and not suffer continuous revival efforts while users run away and the brand is damaged. Also, I'm a volunteer myself and invest a lot of time and effort into Linux. I've been seeing the resulting fallout. It likely scared off other prospective help.

Question: Have you tried restarting Akonadi? Have you tried clearing the Akonadi cache? Have you tried starting with a fresh database?
Answer: Yes. Yes. Yes. Many times. And yes to many more things. Did I mention that I spent a lot of time with that? I'll miss the akonadiconsole window. Or maybe not.

Question: Do you think kmail2 (the Akonadi-based kmail) can be saved somehow?
Answer: Maybe. One could suggest an additional agent as replacement to the usual IMAP module. Let's call it IMAP-stupid, and mandate that it uses only a bare minimum of server features and always runs in disconnected mode... Then again, I don't know the code, and don't know if that is feasible. Also, for some people kmail2 seems to work perfectly fine.

Question: So what e-mail program will you use now?
Answer: I will use kmail. I love kmail. Precisely, I will use Pali Rohar's noakonadi fork, which is based on kdepim 4.4. It is neither perfect nor bug-free, but accesses all my e-mail accounts reliably. This is what I've been using on my home desktop all the time (never upgraded) and what I downgraded my laptop to some time ago after losing many mails.

Question: So can you recommend running this ages-old kmail1 variant?
Answer: Yes and no. Yes, because (at least in my case) it seems to get the basic job done much more reliably. Yes, because it feels a lot snappier and produces far less random surprises. No, because it is essentially unmaintained, has some bugs, and is written for KDE 4, which is slowly going away. No, because Qt5-based kmail2 has more features and does look sexier. No, because you lose the useful Akonadi integration of addressbook and calendar.
That said, here are the two bugs of kmail1 that I find most annoying right now: 1) PGP/MIME cleartext signature is broken (at random some signatures are not verified correctly and/or bad signatures are produced), and 2), only in a Qt5 / Plasma environment, attachments don't open on click anymore, but can only be saved. (Which is odd since e.g. Okular as viewer is launched but never appears on screen, and the temporary file is written but immediately disappears... need to investigate.)

Question: I have bugfixes / patches for kmail1. What should I do?
Answer: Send them!!! I'll be happy to test and forward.

Question: What will you do when Qt4 / kdelibs goes away?
Answer: Dunno. Luckily I'm involved in packaging myself. :)


May 25, 2016
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Today I saw the cover of the 30 May 2016 edition of The New Yorker, which was designed by artist R. Kikuo Johnson, and it really hit home for me. The illustration depicts the graduating class of 2016 walking out of their commencement ceremony whilst a member of the 2015 graduating class is working as a groundskeeper:

Graduating Class 2016 - The New Yorker - R. Kikuo Johnson
Click for full quality

I won’t go into a full tirade here about my thoughts of higher education within the United States throughout recent years, but I do think that this image sums up a few key points nicely:

  • Many graduates (either from baccalaureate or higher-level programmes) are not working in their respective fields of study
  • A vast majority of students have accrued a nearly insurmountable amount of debt
  • Those two points may be inextricably linked to one another

I know that, for me, I am not able to work in my field of study (child and adolescent development / elementary education) for those very reasons—the corresponding jobs (which I find incredibly rewarding), unfortunately, do not yield high enough salaries for me to even make ends meet. Though the cover artwork doesn’t necessarily offer any suggestion as to a solution to the problem, I think that it very poignantly brings further attention to it.

Cheers,
Zach

May 24, 2016
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

Here's a brief call for help.

Is there anyone out there who uses a recent kmail (I'm running 16.04.1 since yesterday, before that it was the latest KDE4 release) with a Novell Groupwise IMAP server?

I'm trying hard, I really like kmail and would like to keep using it, but for me right now it's extremely unstable (to the point of being unusable) - and I suspect by now that the server IMAP implementation is at least partially to blame. In the past I've seen definitive broken server behaviour (like negative IMAP uids), the feared "Multiple merge candidates" keeps popping up again and again, and the IMAP resource becomes unresponsive every few minutes...

So any datapoints of other kmail plus Groupwise imap users would be very much appreciated.

For reference, the server here is Novell Groupwise 2014 R2, version 14.2.0 11.3.2016, build number 123013.

Thanks!!!

May 23, 2016
Nirbheek Chauhan a.k.a. nirbheek (homepage, bugs)
GStreamer and Meson: A New Hope (May 23, 2016, 10:19 UTC)

Anyone who has written a non-trivial project using Autotools has realized that (and wondered why) it requires you to be aware of 5 different languages. Once you spend enough time with the innards of the system, you begin to realize that it is nothing short of an astonishing feat of engineering. Engineering that belongs in a museum. Not as part of critical infrastructure.

Autotools was created in the 1980s and caters to the needs of an entirely different world of software from what we have at present. Worse yet, it carries over accumulated cruft from the past 40 years — ostensibly for better “cross-platform support” but that “support” is mostly for extinct platforms that five people in the whole world remember.

We've learned how to make it work for most cases that concern FOSS developers on Linux, and it can be made to limp along on other platforms that the majority of people use, but it does not inspire confidence or really anything except frustration. People will not like your project or contribute to it if the build system takes 10x longer to compile on their platform of choice, does not integrate with the preferred IDE, and requires knowledge arcane enough to be indistinguishable from cargo-cult programming.

As a result there have been several (terrible) efforts at replacing it and each has been either incomplete, short-sighted, slow, or just plain ugly. During my time as a Gentoo developer in another life, I came in close contact with and developed a keen hatred for each of these alternative build systems. And so I mutely went back to Autotools and learned that I hated it the least of them all.

Sometime last year, Tim heard about this new build system called ‘Meson’ whose author had created an experimental port of GStreamer that built it in record time.

Intrigued, he tried it out and found that it finished suspiciously quickly. His first instinct was that it was broken and hadn’t actually built everything! Turns out this build system written in Python 3 with Ninja as the backend actually was that fast. About 2.5x faster on Linux and 10x faster on Windows for building the core GStreamer repository.

Upon further investigation, Tim and I found that Meson also has really clean generic cross-compilation support (including iOS and Android), runs natively (and just as quickly) on OS X and Windows, supports GNU, Clang, and MSVC toolchains, and can even (configure and) generate XCode and Visual Studio project files!

But the critical thing that convinced me was that the creator Jussi Pakkanen was genuinely interested in the use-cases of widely-used software such as Qt, GNOME, and GStreamer and had already added support for several tools and idioms that we use — pkg-config, gtk-doc, gobject-introspection, gdbus-codegen, and so on. The project places strong emphasis on both speed and ease of use and is quite friendly to contributions.

Over the past few months, Tim and I at Centricular have been working on creating Meson ports for most of the GStreamer repositories and the fundamental dependencies (libffi, glib, orc) and improving the MSVC toolchain support in Meson.

We are proud to report that you can now build GStreamer on Linux using the GNU toolchain and on Windows with either MinGW or MSVC 2015 using Meson build files that ship with the source (building upon Jussi's initial ports).

Other toolchain/platform combinations haven't been tested yet, but they should work in theory (minus bugs!), and we intend to test and bugfix all the configurations supported by GStreamer (Linux, OS X, Windows, iOS, Android) before proposing it for inclusion as an alternative build system for the GStreamer project.

You can either grab the source yourself and build everything, or use our (with luck, temporary) fork of GStreamer's cross-platform build aggregator Cerbero.

Personally, I really hope that Meson gains widespread adoption. Calling Autotools the Xorg of build systems is flattery. It really is just a terrible system. We really need to invest in something that works for us rather than against us.

PS: If you just want a quick look at what the build system syntax looks like, take a look at this or the basic tutorial.
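For an even quicker impression, here is a toy meson.build of my own making (it has nothing to do with the actual GStreamer port):

# toy example only; configure and build with: meson builddir && ninja -C builddir
project('toy', 'c')
glib_dep = dependency('glib-2.0')
executable('toy', 'toy.c', dependencies : glib_dep)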

May 21, 2016
Rafael Goncalves Martins a.k.a. rafaelmartins (homepage, bugs)
balde internals, part 1: Foundations (May 21, 2016, 15:25 UTC)

For those of you that don't know, as I never actually announced the project here, I'm working on a microframework to develop web applications in C since 2013. It is called balde, and I consider its code ready for a formal release now, despite not having all the features I planned for it. Unfortunately its documentation is not good enough yet.

I haven't worked on it for quite some time, so I don't remember how everything works and can't write proper documentation. To make this easier, I'm starting a series of posts here in this blog, describing the internals of the framework and the design decisions I made when creating it, so I can gradually remember how it works. Hopefully at the end of the series I'll be able to integrate the posts with the official documentation of the project and release it! \o/

Before the release, users who want to try balde must install it manually from Github or use my Gentoo overlay (the package is called net-libs/balde there). The previously released versions are very old and deprecated at this point.

So, I'll start by talking about the foundations of the framework. It is based on GLib, which is the base library used by Gtk+ and GNOME applications. balde uses it as a utility library, without implementing classes or relying on advanced features of the library. That's because I plan to migrate away from GLib in the future, reimplementing the required functionality in a BSD-licensed library. I have a list of functions that must be implemented to achieve this objective in the wiki, but this is not something with high priority for now.

Another important foundation of the framework is the template engine. Instead of parsing templates at runtime, balde parses them at build time, generating C code that is compiled into the application binary. The template engine is based on a recursive-descent parser, built with a parsing expression grammar. The grammar is simple enough to be easily extended, and implements most of the features needed by a basic template engine. The template engine is implemented as a binary that reads the templates and generates the C source files. It is called balde-template-gen and will be the subject of a dedicated post in this series.

A notable deficiency of the template engine is the lack of iterators, like for and while loops. This is a side effect of another basic characteristic of balde: all the data parsed from requests and sent to responses is stored as strings in the internal structures, and all the public interfaces follow the same principle. That means that the current architecture does not allow passing a list of items to a template. And it also means that users must handle type conversions from and to strings, as needed by their applications.

Static files are also converted to C code and compiled into the application binary, but here balde just relies on GLib's GResource infrastructure. This is something that should be reworked in the future too. Integrating templates and static resources, implementing a concept of themes, is something that I want to do as soon as possible.

To make it easier for newcomers to get started with balde, it comes with a binary that can create a skeleton project using GNU Autotools, and with basic unit test infrastructure. The binary is called balde-quickstart and will be the subject of a dedicated post here as well.

That's all for now.

In the next post I'll talk about how URL routing works.

May 20, 2016
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Important!

My tech articles—especially Linux ones—are some of the most-viewed on The Z-Issue. If this one has helped you, please consider a small donation to The Parker Fund by using the top widget at the right. Thanks!

Whenever I purchase music from Bandcamp, SoundCloud, or other music sites that offer uncompressed, full-quality downloads (I can’t bring myself to download anything but original or lossless music files), I will always download the original WAV if it is offered. I prefer to keep the original copy around just in case, but usually I will put that on external storage, and use FLAC compression for actual listening (see my post comparing FLAC compression levels, if you’re interested).

Typically, my workflow for getting songs/albums ready is:

  • Purchase and download the full-quality WAV files
  • Rename them all according to my naming conventions
  • Batch convert them to FLAC using the command line (below)
  • Add in all the tags using EasyTag
  • Standardise the album artwork
  • Add the FLAC files to my playlists on my computers and audio server
  • Batch convert the FLACs to OGG Vorbis using a flac2all command (below) for my mobile and other devices with limited storage

It takes some time, but it’s something that I only have to do once per album, and it’s worth it for someone like me (read “OCD”). 😉 For good measure, here are the commands that I run:

Batch converting files from WAV to FLAC:
find music/wavs/$ARTIST/$ALBUM/ -iname '*.wav' -exec flac -3 {} \;
obviously replacing $ARTIST and $ALBUM with the name of the artist and album, respectively.

Batch converting files from FLAC to OGG using flac2all:
python2 flac2all_v3.38.py vorbis ./music/flac/ -v 'quality=7' -c -o ./music/ogg/

By the way, flac2all is awesome because it copies the tags and the album art as well. That’s a huge time saver for me.

Normally this process goes smoothly, and I’m on my way to enjoying my new music rather quickly. However, I recently downloaded some WAVs from SoundCloud and couldn’t figure out why I was coming up with fewer FLACs than WAVs after converting. I looked back through the output from the command, and saw the following error message on some of the track conversions:

05-Time Goes By.wav: ERROR: unsupported format type 3

That was a rather nebulous and obtuse error message, so I decided to investigate a file that worked versus these ones that didn’t:

File that failed:
$ file vexento/02-inspiration/05-Time\ Goes\ By.wav
RIFF (little-endian) data, WAVE audio, stereo 44100 Hz

File that succeeded:
$ file vexento/02-inspiration/04-Riot.wav
RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, stereo 44100 Hz

The differences are that the working files indicated “Microsoft PCM” and “16 bit.” The fix for the problem was rather simple, actually. I used Audacity (which is a fantastic piece of cross-platform, open-source software for audio editing), and just re-exported the tracks that were failing. Basically, open the file in Audacity, make no edits, and just go to File –> Export –> “Wav (Microsoft) signed 16 bit PCM”, which you can see in the screenshot below:

Using Audacity to fix FLAC to WAV unsupported format 3

Just like that, the problem was gone! Also, I noticed that the file size changed substantially. I’m used to a WAV being about 10MiB for every minute of audio. Before re-exporting these files, they were approximately 20MiB for every minute. So, this track went from ~80MiB to ~40MiB. :)
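
If you have a whole directory of downloads and want to spot the problematic files before converting, the output of the file command shown above can be checked in a loop; a minimal sketch (the paths follow the layout used earlier and are placeholders):

# List WAVs that file does not report as 16-bit Microsoft PCM; those are
# the ones flac is likely to reject with "unsupported format type 3".
find music/wavs/ -iname '*.wav' -print0 | while IFS= read -r -d '' f; do
    file "$f" | grep -q 'Microsoft PCM, 16 bit' || echo "needs re-export: $f"
done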

Hope that helps!

Cheers,
Zach

P.S. By the way, Vexento (the artist who released the tracks mentioned here) is amazingly fun, and I recommend that everyone give him a chance. He’s a young Norwegian guy (actually named Alexander Hansen) who creates a wide array of electronic music. Two tracks (that are very different from one another) that I completely adore are Trippy Love (upbeat and fun), and Peace (calming yet cinematic).

May 17, 2016
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
Saving a non-booting Asus UX31A laptop (May 17, 2016, 23:09 UTC)

I have just come back from a long(ish) trip through UK and US, and decided it's time for me to go back to some simple OSS tasks, while I finally convince myself to talk about the doubts I'm having lately.

To start on that, I tried to turn on my laptop, the Asus UX31A I got three and a half years ago. It didn't turn on. This happened before, so I just left it to charge and tried again. No luck.

Googling around I found a number of people with all kinds of problems with it, one of them being something getting stuck at the firmware level. Given that I had found a random problem with PCIe settings in my previous laptop, which would make it reboot every time I turned it off (but only if the power was still plugged in), I was not completely surprised. Unfortunately, following the advice I read (take off the battery and power over AC) didn't help.

I knew it was not the (otherwise common) problem with the power plug, because when I plugged the cable in, the Yubikey Neo-n would turn on, which means power arrived to the board fine.

Then I remembered two things: one of the pieces of advice was about the keyboard, and the keyboard itself had had problems before (the control key would sometimes stop working for half an hour at a time). Indeed, once I re-seated the keyboard's ribbon cable, it turned on again, yay!

But here's the other problem: the laptop would turn on, light the caps-lock LED, and stay there. And even letting the main battery run out would not be enough to return it to working condition. What to do? Well, I got a hunch, and it turned out to be right.

One of the things that I had tried before was to remove the CMOS battery. Either I didn't keep it out long enough to properly clear things, or something else went wrong, but it turned out that removing the CMOS battery allowed the system to start up. That would mean no RTC, though, which is not great if you start the laptop without an Internet connection.

The way I solved it was as follows:

  • disconnect the CMOS battery;
  • start up the laptop;
  • enter "BIOS" (EFI) setup;
  • make any needed change (such as time);
  • "Save and exit";
  • let the laptop boot up;
  • connect the CMOS battery.

Yes, this does involve running the laptop without the lower plate for a while, so be careful about it; but on the other hand, it did save my laptop from being stomped on the ground out of sheer rage.

May 16, 2016
Michał Górny a.k.a. mgorny (homepage, bugs)
How LINGUAS are thrice wrong! (May 16, 2016, 06:34 UTC)

The LINGUAS environment variable serves two purposes in Gentoo. On one hand, it’s the USE_EXPAND flag group for USE flags controlling installation of localizations. On the other, it’s a gettext-specific environment variable controlling installation of localizations in some of the build systems supporting gettext. The fun fact is, both uses are simply wrong.

Why LINGUAS as an environment variable is wrong?

Let’s start with the upstream-blessed LINGUAS environment variable. If set, it limits localization files installed by autotools+gettext-based build systems (and some more) to the subset matching specified locales. At first, it may sound like a useful feature. However, it is an implicit feature, and therefore one causing a lot of confusion for the package manager.

Long story short, in this context the package manager does not know anything about LINGUAS. It’s just a random environment variable that has some value and possibly may be saved somewhere in package metadata. However, this value can actually affect the installed files in a way that is hard to predict. So, even if package managers actually added some special meaning to LINGUAS (which would be non-PMS compliant), it still would not be good enough.

What does this practically mean? It means that if I set LINGUAS to some value on my system, then most of the binary packages produced by it suddenly have files stripped, as compared to non-LINGUAS builds. If I installed the binary package on some other system, it would match the LINGUAS of build host rather than the install host. And this means the binary packages are simply incorrect.

Even worse, any change to LINGUAS can not be propagated correctly. Even if the package manager decided to rebuild packages based on changes in LINGUAS, it has no way of knowing which locales were supported by a package, and if LINGUAS was used at all. So you end up rebuilding all installed packages, just in case.

Why LINGUAS USE flags are wrong?

So, how do we solve all those problems? Of course, we introduce explicit LINGUAS flags. This way, the developer is expected to list all supported locales in IUSE, the package manager can determine the enabled localizations and match binary packages correctly. All seems fine. Except, there are two problems.

The first problem is that it is cumbersome. Figuring out supported localizations and adding a dozen flags on a number of packages is time-consuming. What’s even worse, those flags need to be maintained once added. Which means you have to check supported localizations for changes on every version bump. Not all developers do that.

The second problem is that it is… a QA violation, most of the time. We already have quite a clear policy that USE flags are not supposed to control installation of small files with no explicit dependencies — and most of the localization files are exactly that!

Let me remind you why we have that policy. There are two reasons: rebuilds and binary packages.

Rebuilds are bad because every time you change LINGUAS, you end up rebuilding relevant packages, and those can be huge. You may think it uncommon — but just imagine you’ve finally finished building your new shiny Gentoo install, and noticed that you forgot to enable the localization. And guess what! You have to build a number of packages, again.

Binary packages are even worse since they are tied to a specific USE flag combination. If you build a binary package with specific LINGUAS, it can only be installed on hosts with exactly the same LINGUAS. While it would be trivial to strip localizations from the installed binary package, you have to build a fresh one instead. And with a dozen lingua flags… you end up having thousands of possible binary package variants, if not more.

Why EAPI 5 makes things worse… or better?

Reusing the LINGUAS name for the USE_EXPAND group looked like a good idea. After all, the value would end up in the ebuild environment for use by the build system, and in most of the affected packages, LINGUAS worked out of the box with no ebuild changes! Except that… it wasn’t really guaranteed to work before EAPI 5.

In earlier EAPIs, LINGUAS could contain pretty much anything, since no special behavior was reserved for it. However, starting with EAPI 5 the package manager guarantees that it will only contain those values that correspond to enabled flags. This is a good thing, after all, since it finally makes LINGUAS work reliably. It has one side effect though.

Since LINGUAS is reduced to enabled USE flags, and enabled USE flags can only contain defined USE flags… it means that in any ebuild missing LINGUAS flags, LINGUAS should be effectively empty (yes, I know Portage does not do that currently, and it is a bug in Portage). To make things worse, this means set to an empty value rather than unset. In other words, disabling localization completely.

This way, a small QA issue of implicitly affecting installed localization files turned into a bigger issue of localizations suddenly no longer being installed. Which in turn can’t be fixed without introducing a proper set of LINGUAS flags everywhere, causing other kinds of QA issues and additional maintenance burden.

What would be the good solution, again?

First of all, kill LINGUAS. Either make it unset for good (and this won’t be easy since PMS kinda implies making all PM-defined variables read-only), or disable any special behavior associated with it. Make the build system compile and install all localizations.

Then, use INSTALL_MASK. It’s designed to handle this. It strips files from installed systems while preserving them in binary packages. Which means your binary packages are now more portable, and every system you install them to will get correct localizations stripped. Isn’t that way better than rebuilding things?
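
For instance, a system that does not need any extra localizations could mask the gettext catalog directory in make.conf; a minimal sketch (the path is just an example and can be extended to localized man pages and the like):

# /etc/portage/make.conf
INSTALL_MASK="/usr/share/locale"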

Now, is that going to happen? I doubt it. People are rather going to focus on claiming that buggy Portage behavior was good, that QA issues are fine as long as I can strip some small files from my system in the ‘obvious’ way, that the specification should be changed to allow a corner case…

May 13, 2016
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Important!

My tech articles—especially Linux ones—are some of the most-viewed on The Z-Issue. If this one has helped you, please consider a small donation to The Parker Fund by using the top widget at the right. Thanks!

Whew, I know that’s a long title for a post, but I wanted to make sure that I mentioned every term so that people having the same problem could readily find the post that explains what solved it for me. For some time now (ever since the 346.x series [340.76, which was the last driver that worked for me, was released on 27 January 2015]), I have had a problem with the NVIDIA Linux Display Drivers (known as nvidia-drivers in Gentoo Linux). The problem that I’ve experienced is that the newer drivers would, upon starting an X session, immediately clock up to Performance Level 2 or 3 within PowerMizer.

Before using these newer drivers, the Performance Level would only increase when it was really required (3D rendering, HD video playback, et cetera). I probably wouldn’t have even noticed that the Performance Level was changing, except that it would cause the GPU fan to spin faster, which was noticeably louder in my office.

After scouring the interwebs, I found that I was not the only person to have this problem. For reference, see this article, and this one about locking to certain Performance Levels. However, I wasn’t able to find a solution for the exact problem that I was having. If you look at the screenshot below, you’ll see that the Performance Level is set at 2 which was causing the card to run quite hot (79°C) even when it wasn’t being pushed.

NVIDIA Linux drivers stuck on high Performance Level in Xorg

It turns out that I needed to add some options to my X Server Configuration. Unfortunately, I was originally making changes in /etc/X11/xorg.conf, but they weren’t being honoured. I added the following lines to /etc/X11/xorg.conf.d/20-nvidia.conf, and the changes took effect:


Section "Device"
     Identifier    "Device 0"
     Driver        "nvidia"
     VendorName    "NVIDIA Corporation"
     BoardName     "GeForce GTX 470"
     Option        "RegistryDwords" "PowerMizerEnable=0x1; PowerMizerDefaultAC=0x3;"
EndSection

The RegistryDwords option was what ultimately fixed the problem for me. More information about the NVIDIA drivers can be found in their README and Installation Guide, and in particular, these settings are described on the X configuration options page. The PowerMizerDefaultAC setting may seem like it is for laptops that are plugged in to AC power, but as this system was a desktop, I found that it was always seen as being “plugged in to AC power.”
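
If you prefer watching the effect from a terminal instead of the nvidia-settings GUI, the same information can be queried on the command line; a sketch (attribute names can vary between driver versions, so check the output of nvidia-settings -q all):

# Query the current PowerMizer performance level and the GPU core temperature.
nvidia-settings -q GPUCurrentPerfLevel -q GPUCoreTemp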

As you can see from the screenshots below, these settings did indeed fix the PowerMizer Performance Levels and subsequent temperatures for me:

NVIDIA Linux drivers Performance Level after Xorg settings

Whilst I was adding X configuration options, I also noticed that Coolbits (search for “Coolbits” on that page) were supported with the Linux driver. Here’s the excerpt about Coolbits for version 364.19 of the NVIDIA Linux driver:

Option “Coolbits” “integer”
Enables various unsupported features, such as support for GPU clock manipulation in the NV-CONTROL X extension. This option accepts a bit mask of features to enable.

WARNING: this may cause system damage and void warranties. This utility can run your computer system out of the manufacturer’s design specifications, including, but not limited to: higher system voltages, above normal temperatures, excessive frequencies, and changes to BIOS that may corrupt the BIOS. Your computer’s operating system may hang and result in data loss or corrupted images. Depending on the manufacturer of your computer system, the computer system, hardware and software warranties may be voided, and you may not receive any further manufacturer support. NVIDIA does not provide customer service support for the Coolbits option. It is for these reasons that absolutely no warranty or guarantee is either express or implied. Before enabling and using, you should determine the suitability of the utility for your intended use, and you shall assume all responsibility in connection therewith.

When “2” (Bit 1) is set in the “Coolbits” option value, the NVIDIA driver will attempt to initialize SLI when using GPUs with different amounts of video memory.

When “4” (Bit 2) is set in the “Coolbits” option value, the nvidia-settings Thermal Monitor page will allow configuration of GPU fan speed, on graphics boards with programmable fan capability.

When “8” (Bit 3) is set in the “Coolbits” option value, the PowerMizer page in the nvidia-settings control panel will display a table that allows setting per-clock domain and per-performance level offsets to apply to clock values. This is allowed on certain GeForce GPUs. Not all clock domains or performance levels may be modified.

When “16” (Bit 4) is set in the “Coolbits” option value, the nvidia-settings command line interface allows setting GPU overvoltage. This is allowed on certain GeForce GPUs.

When this option is set for an X screen, it will be applied to all X screens running on the same GPU.

The default for this option is 0 (unsupported features are disabled).

I found that I would personally like to have the options enabled by “4” and “8”, and that one can combine Coolbits by simply adding them together. For instance, the ones I wanted (“4” and “8”) added up to “12”, so that’s what I put in my configuration:


Section "Device"
     Identifier    "Device 0"
     Driver        "nvidia"
     VendorName    "NVIDIA Corporation"
     BoardName     "GeForce GTX 470"
     Option        "Coolbits" "12"
     Option        "RegistryDwords" "PowerMizerEnable=0x1; PowerMizerDefaultAC=0x3;"
EndSection

and that resulted in the following options being available within the nvidia-settings utility:

NVIDIA Linux drivers Performance Level after Xorg settings

Though the Coolbits portions aren’t required to fix the problems that I was having, I find them to be helpful for maintenance tasks and configurations. I hope, if you’re having problems with the NVIDIA drivers, that these instructions help give you a better understanding of how to work around any issues you may face. Feel free to comment if you have any questions, and we’ll see if we can work through them.

Cheers,
Zach

May 10, 2016
Jason A. Donenfeld a.k.a. zx2c4 (homepage, bugs)
New Company: Edge Security (May 10, 2016, 14:48 UTC)

I've just launched a website for my new information security consulting company, Edge Security. We're expert hackers, with a fairly diverse skill set and a lot of experience. I mention this here because in a few months we plan to release an open-source kernel module for Linux called WireGuard. No details yet, but keep your eyes open in this space.

May 08, 2016
Gentoo Haskell Herd a.k.a. haskell (homepage, bugs)
How to deal with portage blockers (May 08, 2016, 21:27 UTC)

TL;DR:

  • use --autounmask=n
  • use --backtrack=1000 (or more)
  • package.mask all outdated packages that cause conflicts (usually requires more iterations)
  • run world update (see the sketch below)
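
Putting those flags together, the world update looks roughly like the following; a sketch, and the exact option set (and backtrack value) is up to your preferences:

emerge --ask --verbose --update --deep --newuse \
    --autounmask=n --backtrack=1000 @world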

The problem

Occasionally (more frequently with Haskell packages) portage starts taking a long time only to tell you it was not able to figure out the upgrade path.

Some people suggest a "wipe-blockers-and-reinstall" solution. This post will try to explain how to actually upgrade (or find out why it’s not possible) without destroying your system.

Real-world example

I’ll take a real-world example from Gentoo’s bugzilla: bug 579520, where the original upgrade error looked like this:

!!! Multiple package instances within a single package slot have been pulled
!!! into the dependency graph, resulting in a slot conflict:

x11-libs/gtk+:3

  (x11-libs/gtk+-3.18.7:3/3::gentoo, ebuild scheduled for merge) pulled in by
    (no parents that aren't satisfied by other packages in this slot)

  (x11-libs/gtk+-3.20.0:3/3::gnome, installed) pulled in by
    >=x11-libs/gtk+-3.19.12:3[introspection?] required by (gnome-base/nautilus-3.20.0:0/0::gnome, installed)
    ^^              ^^^^^^^^^
    >=x11-libs/gtk+-3.20.0:3[cups?] required by (gnome-base/gnome-core-libs-3.20.0:3.0/3.0::gnome, installed)
    ^^              ^^^^^^^^
    >=x11-libs/gtk+-3.19.4:3[introspection?] required by (media-video/totem-3.20.0:0/0::gnome, ebuild scheduled for merge)
    ^^              ^^^^^^^^
    >=x11-libs/gtk+-3.19.0:3[introspection?] required by (app-editors/gedit-3.20.0:0/0::gnome, ebuild scheduled for merge)
    ^^              ^^^^^^^^
    >=x11-libs/gtk+-3.19.5:3 required by (gnome-base/dconf-editor-3.20.0:0/0::gnome, ebuild scheduled for merge)
    ^^              ^^^^^^^^
    >=x11-libs/gtk+-3.19.6:3[introspection?] required by (x11-libs/gtksourceview-3.20.0:3.0/3::gnome, ebuild scheduled for merge)
    ^^              ^^^^^^^^
    >=x11-libs/gtk+-3.19.3:3[introspection,X] required by (media-gfx/eog-3.20.0:1/1::gnome, ebuild scheduled for merge)
    ^^              ^^^^^^^^
    >=x11-libs/gtk+-3.19.8:3[X,introspection?] required by (x11-wm/mutter-3.20.0:0/0::gnome, installed)
    ^^              ^^^^^^^^
    >=x11-libs/gtk+-3.19.12:3[X,wayland?] required by (gnome-base/gnome-control-center-3.20.0:2/2::gnome, installed)
    ^^              ^^^^^^^^^
    >=x11-libs/gtk+-3.19.11:3[introspection?] required by (app-text/gspell-1.0.0:0/0::gnome, ebuild scheduled for merge)
    ^^              ^^^^^^^^^

x11-base/xorg-server:0

  (x11-base/xorg-server-1.18.3:0/1.18.3::gentoo, installed) pulled in by
    x11-base/xorg-server:0/1.18.3= required by (x11-drivers/xf86-video-nouveau-1.0.12:0/0::gentoo, installed)
                        ^^^^^^^^^^
    x11-base/xorg-server:0/1.18.3= required by (x11-drivers/xf86-video-fbdev-0.4.4:0/0::gentoo, installed)
                        ^^^^^^^^^^
    x11-base/xorg-server:0/1.18.3= required by (x11-drivers/xf86-input-evdev-2.10.1:0/0::gentoo, installed)
                        ^^^^^^^^^^

  (x11-base/xorg-server-1.18.2:0/1.18.2::gentoo, ebuild scheduled for merge) pulled in by
    x11-base/xorg-server:0/1.18.2= required by (x11-drivers/xf86-video-vesa-2.3.4:0/0::gentoo, installed)
                        ^^^^^^^^^^
    x11-base/xorg-server:0/1.18.2= required by (x11-drivers/xf86-input-synaptics-1.8.2:0/0::gentoo, installed)
                        ^^^^^^^^^^
    x11-base/xorg-server:0/1.18.2= required by (x11-drivers/xf86-input-mouse-1.9.1:0/0::gentoo, installed)
                        ^^^^^^^^^^

app-text/poppler:0

  (app-text/poppler-0.32.0:0/51::gentoo, ebuild scheduled for merge) pulled in by
    >=app-text/poppler-0.32:0/51=[cxx,jpeg,lcms,tiff,xpdf-headers(+)] required by (net-print/cups-filters-1.5.0:0/0::gentoo, installed)
                           ^^^^^^
    >=app-text/poppler-0.16:0/51=[cxx] required by (app-office/libreoffice-5.0.5.2:0/0::gentoo, installed)
                           ^^^^^^
    >=app-text/poppler-0.12.3-r3:0/51= required by (app-text/texlive-core-2014-r4:0/0::gentoo, installed)
                                ^^^^^^

  (app-text/poppler-0.42.0:0/59::gentoo, ebuild scheduled for merge) pulled in by
    >=app-text/poppler-0.33[cairo] required by (app-text/evince-3.20.0:0/evd3.4-evv3.3::gnome, ebuild scheduled for merge)
    ^^                 ^^^^

net-fs/samba:0

  (net-fs/samba-4.2.9:0/0::gentoo, installed) pulled in by
    (no parents that aren't satisfied by other packages in this slot)

  (net-fs/samba-3.6.25:0/0::gentoo, ebuild scheduled for merge) pulled in by
    net-fs/samba[smbclient] required by (media-sound/xmms2-0.8-r2:0/0::gentoo, ebuild scheduled for merge)
                 ^^^^^^^^^


It may be possible to solve this problem by using package.mask to
prevent one of those packages from being selected. However, it is also
possible that conflicting dependencies exist such that they are
impossible to satisfy simultaneously.  If such a conflict exists in
the dependencies of two different packages, then those packages can
not be installed simultaneously.

For more information, see MASKED PACKAGES section in the emerge man
page or refer to the Gentoo Handbook.


emerge: there are no ebuilds to satisfy ">=dev-libs/boost-1.55:0/1.57.0=".
(dependency required by "app-office/libreoffice-5.0.5.2::gentoo" [installed])
(dependency required by "@selected" [set])
(dependency required by "@world" [argument])

A lot of text! Let’s trim it down to the essential details first (AKA how to actually read it). I’ve dropped the "cause" of the conflicts from the previous listing and left only the problematic packages:

!!! Multiple package instances within a single package slot have been pulled
!!! into the dependency graph, resulting in a slot conflict:

x11-libs/gtk+:3
  (x11-libs/gtk+-3.18.7:3/3::gentoo, ebuild scheduled for merge) pulled in by
  (x11-libs/gtk+-3.20.0:3/3::gnome, installed) pulled in by

x11-base/xorg-server:0
  (x11-base/xorg-server-1.18.3:0/1.18.3::gentoo, installed) pulled in by
  (x11-base/xorg-server-1.18.2:0/1.18.2::gentoo, ebuild scheduled for merge) pulled in by

app-text/poppler:0
  (app-text/poppler-0.32.0:0/51::gentoo, ebuild scheduled for merge) pulled in by
  (app-text/poppler-0.42.0:0/59::gentoo, ebuild scheduled for merge) pulled in by

net-fs/samba:0
  (net-fs/samba-4.2.9:0/0::gentoo, installed) pulled in by
  (net-fs/samba-3.6.25:0/0::gentoo, ebuild scheduled for merge) pulled in by

emerge: there are no ebuilds to satisfy ">=dev-libs/boost-1.55:0/1.57.0=".

That is more manageable. We have 4 "conflicts" here and one "missing" package.

Note: all the listed requirements under "conflicts" (in the previous listing) suggest these are >= depends. Thus the dependencies themselves can’t block the upgrade path, and the error message is misleading.

For us it means that portage leaves old versions of gtk+ (and others) for some unknown reason.

To get an idea of how to explore the situation, we need to somehow hide the outdated packages from portage and retry the update. It is the same as uninstalling and reinstalling a package, but without actually doing it. :)

package.mask does exactly that. To make the package truly hidden we will also need to disable autounmask: --autounmask=n (the default is y).

Let’s hide outdated x11-libs/gtk+-3.18.7 package from portage:

# /etc/portage/package.mask
<x11-libs/gtk+-3.20.0:3

The blocker list became shorter but still does not make sense:

x11-base/xorg-server:0
  (x11-base/xorg-server-1.18.2:0/1.18.2::gentoo, ebuild scheduled for merge) pulled in by
  (x11-base/xorg-server-1.18.3:0/1.18.3::gentoo, installed) pulled in by
                        ^^^^^^^^^^

app-text/poppler:0
  (app-text/poppler-0.32.0:0/51::gentoo, ebuild scheduled for merge) pulled in by
  (app-text/poppler-0.42.0:0/59::gentoo, ebuild scheduled for merge) pulled in by

Blocking more old stuff:

# /etc/portage/package.mask
<x11-libs/gtk+-3.20.0:3
<x11-base/xorg-server-1.18.3
<app-text/poppler-0.42.0

The output is:

[blocks B      ] <dev-util/gdbus-codegen-2.48.0 ("<dev-util/gdbus-codegen-2.48.0" is blocking dev-libs/glib-2.48.0)

* Error: The above package list contains packages which cannot be
* installed at the same time on the same system.

 (dev-libs/glib-2.48.0:2/2::gentoo, ebuild scheduled for merge) pulled in by

 (dev-util/gdbus-codegen-2.46.2:0/0::gentoo, ebuild scheduled for merge) pulled in by

That’s our blocker! The stable <dev-util/gdbus-codegen-2.48.0 blocks the unstable dev-libs/glib-2.48.0.

The solution is to ~arch keyword dev-util/gdbus-codegen-2.48.0:

# /etc/portage/package.accept_keywords
dev-util/gdbus-codegen

And run world update.

Simple!


May 04, 2016
Arun Raghavan a.k.a. ford_prefect (homepage, bugs)

As we approach the PulseAudio 9.0 release, I thought it would be a good time to talk about one of the things I had a chance to work on that landed in this cycle.

Old-time readers will remember the work I had done in the past on echo cancellation. If you’re unfamiliar with the concept, imagine a situation where you’re making a call from your phone or laptop. You don’t have a headset, so you use your device’s speaker and microphone. Now when the person on the other end speaks, their voice is played out of your speaker and captured by your mic. This means that they can now also hear what they’re saying, with some lag — this is called echo. If this has happened to you, you know how annoying and disruptive it can be.

Using Acoustic Echo Cancellation (AEC), PulseAudio is able to detect this in the captured input, and remove the audio we recently played back. While doing this, we also run some other algorithms to enhance the captured input, such as noise suppression (great at damping out background and fan noise) and acoustic gain control (AGC, which adjusts the mic volume so you are clearly audible). In addition to voice call use cases, this is also handy to have in other applications such as speech recognition (where you want the device to detect what a user is saying, while possibly playing out other sounds).

We don’t implement these algorithms ourselves in PulseAudio. The echo cancellation module (cunningly named module-echo-cancel) provides the infrastructure to plug in different echo canceller implementations. One of these that we support (and recommend) is based on Google’s WebRTC.org implementation, which includes an extremely capable set of voice processing algorithms.
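
For reference, loading the module with the WebRTC canceller selected looks roughly like this; a minimal sketch that leaves the source/sink arguments at their defaults:

# At runtime (in default.pa, drop the leading "pactl"):
pactl load-module module-echo-cancel aec_method=webrtc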

This is a large code-base, intended to support a full real-time communication stack, and we didn’t want to pick up all that code to include in PulseAudio. So what I did was make a copy of the AudioProcessing module, wrap it in an easy-to-package library, and then use that from PulseAudio. Quite some time passed by, and I didn’t get a chance to update that code until last October.

What’s New

The update brought us a number of things since the last one (5 years ago!):

  • The AGC module has essentially been rewritten. In practice, we see that it is slower to change the volume.

  • Voice Activity Detection (VAD) has also been split off into its own module and undergone significant changes.

  • Beamforming has been added, to allow you to use a set of microphones to be able to “point” your microphone array in a specific direction (more on this in a later post).

  • There is now an intelligibility enhancer for applying processing on the stream coming in from the far end (so you can hear the other side better). This feature has not been hooked up in PulseAudio yet.

  • There is a transient suppressor for when you’re on a laptop, and your microphone is picking up keystrokes. This can be important since the sound of the keystroke introduces sharp spikes or “transients” in the audio stream, which can throw off the echo canceller that works best with the frequency range of the human voice. This one seems to be work-in-progress, and not actually used yet.

In addition to this, I’ve also extended support in module-echo-cancel for performing cancellation on multiple channels. So we are now able to deal with hardware that has any number of playback and capture channels (and they don’t even need to be equal), and we no longer have the artificial restriction of having to downmix things to mono.

These changes are in the newly released webrtc-audio-processing v0.2. Unfortunately, we do break API with regards to the previous version. I wrote about this a while back, and hopefully the impact on other users of this library will be minimal.

All this work was made possible thanks to Aldebaran Robotics. A special shout-out to Julien Massot and his excellent team!

These features are already in our master branch, and will be part of the 9.0 release. If you’re using these features, let me know how things work for you, and watch out for a follow up post about beamforming.

If you or your company are looking for help with either PulseAudio or GStreamer, do take a look at the consulting services I currently provide.

April 30, 2016
Sebastian Pipping a.k.a. sping (homepage, bugs)
Loop variable collisions in Bash (April 30, 2016, 16:25 UTC)

Let’s have a look at this simple Bash script:

#! /bin/bash
inner() {
    for i in {1..4}; do
        echo "  $i"
    done
}

outer() {
    for i in {1..3}; do
        echo "$i"
        inner
    done
}

outer

The output is ..

1
  1
  2
  3
  4
2
  1
  2
  3
  4
3
  1
  2
  3
  4

.. as expected. Now if we turn the for loops to C-style, we end up with this code:

#! /bin/bash
inner() {
    for ((i = 1; i <= 4; i++)); do
        echo "  $i"
    done
}

outer() {
    for ((i = 1; i <= 3; i++)); do
        echo "$i"
        inner
    done
}

outer

Before you continue, take a moment: What output do you expect?

The same?

The output we get is:

1
  1
  2
  3
  4

Why?

By default, variables in Bash are global unless explicitly declared local. That includes loop variables.
The initial code already had inner modify the global $i; the collision did not show because the outer loop for i in {1..3}; do resets $i to the next value from the list “1 2 3” rather than adding 1 to $i's current value. So even in the original code, $i is “corrupted” for a brief moment right after inner returns.

If we want to fix the C-style loop version, falling back to plain for loops addresses symptoms more than causes. To address the causes, I would like to propose both a soft and a hard fix:

#! /bin/bash
inner() {
    local i  # soft fix
    for ((i = 1; i <= 4; i++)); do
        echo "  $i"
    done
}

outer() {
    local i
    for ((i = 1; i <= 3; i++)); do
        echo "$i"
        ( inner )  # hard fix
    done
}

outer

The soft fix is declaring $i as local (to inner) so that it does not modify the global $i (“shadowing”). (The new local i in outer is for hygiene, and not taking part in this particular fix.)

The hard fix is calling inner from a subshell so that outer does not have to trust inner on globals.

To summarize:

  • Loop variables are global by default, too: Better turn them local.
  • Declare as many variables local as possible, in general.

April 28, 2016
GSoC 2016: Five projects accepted (April 28, 2016, 00:00 UTC)

We are excited to announce that 5 students have been selected to participate with Gentoo during the Google Summer of Code 2016!

You can follow our students’ progress on the gentoo-soc mailing list and chat with us regarding our GSoC projects via IRC in #gentoo-soc on freenode.
Congratulations to all the students. We look forward to their contributions!

GSoC logo

Accepted projects

Clang native support - Lei Zhang: Bring native clang/LLVM support in Gentoo.

Continuous Stabilization - Pallav Agarwal: Automate the package stabilization process using continuous integration practices.

kernelconfig - André Erdmann: Consistently generate custom Linux kernel configurations from curated sources.

libebuild - Denys Romanchuk: Create a common shared C-based implementation for package management and other ebuild operations in the form of a library.

Gentoo-GPG- Angelos Perivolaropoulos: Code the new Meta­-Manifest system for Gentoo and improve Gentoo Keys capabilities.

Events: Gentoo Miniconf 2016 (April 28, 2016, 00:00 UTC)

Gentoo Miniconf 2016 will be held in Prague, Czech Republic during the weekend of 8 and 9 October 2016. Like last time, it is hosted together with the LinuxDays by the Faculty of Information Technology of the Czech Technical University.

Want to participate? The call for papers is open until 1 August 2016.

April 25, 2016
Gentoo Miniconf 2016 a.k.a. miniconf-2016 (homepage, bugs)

Gentoo Miniconf 2016 will be held in Prague, Czech Republic during the weekend of 8 and 9 October 2016. Like last time, it is hosted together with the LinuxDays by the Faculty of Information Technology of the Czech Technical University in Prague (FIT ČVUT).

The call for papers is now open where you can submit your session proposal until 1 August 2016. Want to have a meeting, discussion, presentation, workshop, do ebuild hacking, or anything else? Tell us!

miniconf-2016

April 22, 2016
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Well, it was that time of year again… time to dress up as one’s favourite Superhero and run for a great cause! The first weekend of April, I made the drive down to the lovely city of Huntsville, Alabama in order to support the National Children’s Advocacy Center by running in the 2016 NCAC Superheroes 5K (here’s my post about last year’s race).

This year’s course was the same as last year, so it was a really nice run through parts of the city centre and through more residential areas of Huntsville. Unlike last year, the race started much later in the afternoon, so the temperatures were a lot higher (last year was, truthfully, a bit chilly). It was beautifully sunny, and actually bordered on a bit warm, but I would gladly take those conditions over the cold! :)

2016 NCAC Superheroes race - pre-race start
Right before the start of the race

I wasn’t quite sure how this race was going to turn out, seeing as it was my first since the knee injury late last year. I was hopeful that my rehabilitation and training since the injury would help me at least come close to my time last year, but I also doubted that possibility. I came in first place overall with a time of 20:13, which was a little over 30 seconds slower than last year. All things considered, I was pleased with my time. A few other fantastic runners to mention this year were Elliott Kliesner (age 14) who came in about 37 seconds after me, Christian Grant (age 12) with a time of 21:42, and Bud Bettler (age 72) who finished with an outstanding time for his age bracket at 28:16.

2016 NCAC Superheroes race - 5K results
5K Results 1st through 5th place

Years ago, I decided that I wouldn’t run in any races unless they benefited a children’s charity, and I can’t think of any organisation whose mission aligns more closely with my goals than the National Children’s Advocacy Center. According to WAFF News in Huntsville, the race raised over $24,000 for the NCAC! That will make a huge difference in the lives of the children that the NCAC serves! Here’s to hoping that next year’s race (the 7th annual) will raise even more. Hope to see you there!

2016 NCAC Superheroes race - Nathan Zachary award acceptance
Nathan Zachary’s award (and Superhero cape) acceptance

Cheers,
Zach

April 15, 2016
Michał Górny a.k.a. mgorny (homepage, bugs)

Those of you who use my Gentoo repository mirrors may have noticed that the repositories are constructed of original repository commits automatically merged with cache updates. While the original commits are signed (at least in the official Gentoo repository), the automated cache updates and merge commits are not. Why?

Actually, I was wondering about signing them more than once, even discussed it a bit with Kristian. However, each time I decided against it. I was seriously concerned that those automatic signatures would not be able to provide sufficient security level — and could cause the users to believe the commits are authentic even if they were not. I think it would be useful to explain why.

Verifying the original commits

While this may not be entirely clear, by signing the merge commits I would implicitly approve the original commits as well. While this might be worked around via some kind of policy requesting the developer to perform additional verification, such a policy would be impractical and confusing. Therefore, it only seems reasonable to verify the original commits before signing merges.

The problem with that is that we still do not have an official verification tool for repository commits. There’s the whole Gentoo-keys project that aims to eventually solve the problem but it’s not there yet. Maybe this year’s Summer of Code will change that…

Not having official verification routines, I would have to implement my own. I’m not saying it would be that hard, but it would always be semi-official, at best. Of course, I could spend a day or two contributing the needed code to Gentoo-keys and preventing some student from getting the $5500 of Google money… but that would be the non-enterprise way of solving the urgent problem.

Protecting the signing key

The other important point is the security of key used to sign commits. For the whole effort to make any sense, it needs to be strongly protected against being compromised. Keeping the key (or even a subkey) unencrypted on the server really diminishes the whole effort (I’m not pointing fingers here!)

Basic rules first. The primary key is kept off-line and used only to generate the signing subkey. The signing subkey is stored encrypted on the server and used via gpg-agent, so that it is never kept unencrypted outside of memory. All nice and shiny.

The problem is that this means someone needs to type the password in. Which means there needs to be an interactive bootstrap process. Which means that every time the server reboots for some reason, or gpg-agent dies, or whatever, the mirrors stop and wait for me to come and type the password in. Hopefully when I’m around some semi-secure device.

Protecting the software

Even with all those points considered and solved satisfactorily, there’s one more issue: the software. I won’t be running all those scripts in my home. So it’s not just me you have to trust: you have to trust all the other people with administrative access to the machine that’s running the scripts, and the employees of the hosting company that have physical access to the machine.

I mean, any one of them could go and attempt to alter the data somehow. Even if I tried hard, I wouldn’t be able to protect my scripts from this. In the worst case, they are going to add a valid, verified signature to data that has been altered externally. What’s the value of that signature then?

And this is the exact reason why I don’t do automatic signatures.

How to verify the mirrors then?

So if automatic signatures are not the way, how can you verify the commits on repository mirrors? The answer is not that complex.

As I’ve mentioned, the mirrors use merge commits to combine metadata updates with original repository commits. What’s important is that this preserves the original commits, along with their valid signatures and therefore provides a way to verify them. What’s the use of that?

Well, you can look for the last merge commit to find the matching upstream commit. Then you can use the usual procedure to verify the upstream commit. And then, you can diff it against the mirror HEAD to see that only caches and other metadata have been altered. While this doesn’t guarantee that the alterations are genuine, the danger coming from them is rather small (if any).
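
In git terms that boils down to a few commands; a sketch under the assumption that the second parent of the merge commit is the original upstream commit (check with git show if unsure):

# Find the most recent merge commit on the mirror branch.
merge=$(git rev-list --merges -n 1 HEAD)
# Take its second parent, assumed here to be the upstream commit.
upstream=$(git rev-parse "${merge}^2")
# Verify the developer's OpenPGP signature on it.
git verify-commit "$upstream"
# Check that only caches and other metadata differ from the mirror HEAD.
git diff --stat "$upstream" HEAD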

April 11, 2016
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Important!

My tech articles—especially Linux ones—are some of the most-viewed on The Z-Issue. If this one has helped you, please consider a small donation to The Parker Fund by using the top widget at the right. Thanks!

A couple of weeks ago, I decided to update my primary laptop’s kernel from 4.0 to 4.5. Everything went smoothly with the exception of my wireless networking. This particular laptop uses a wifi chipset that is controlled by the Intel Wireless DVM Firmware:


# lspci | grep 'Network controller'
03:00.0 Network controller: Intel Corporation Centrino Advanced-N 6205 [Taylor Peak] (rev 34)

According to the Intel Linux support for wireless networking page, I need kernel support for the ‘iwlwifi’ driver. I remembered this requirement from building the previous kernel, so I included it in the new 4.5 kernel. The new kernel had some additional options, though, and they were:


[*] Intel devices
...
< > Intel Wireless WiFi Next Gen AGN - Wireless-N/Advanced-N/Ultimate-N (iwlwifi)
< > Intel Wireless WiFi DVM Firmware support
< > Intel Wireless WiFi MVM Firmware support
Debugging Options --->

As previously mentioned, the Kernel page for iwlwifi indicates that I need the DVM module for my particular chipset, so I selected it. Previously, I chose to build support for the driver into the kernel, and then use the firmware for the device. However, this time, I noticed that it wasn’t loading:


[ 3.962521] iwlwifi 0000:03:00.0: can't disable ASPM; OS doesn't have ASPM control
[ 3.970843] iwlwifi 0000:03:00.0: Direct firmware load for iwlwifi-6000g2a-6.ucode failed with error -2
[ 3.976457] iwlwifi 0000:03:00.0: loaded firmware version 18.168.6.1 op_mode iwldvm
[ 3.996628] iwlwifi 0000:03:00.0: CONFIG_IWLWIFI_DEBUG enabled
[ 3.996640] iwlwifi 0000:03:00.0: CONFIG_IWLWIFI_DEBUGFS disabled
[ 3.996647] iwlwifi 0000:03:00.0: CONFIG_IWLWIFI_DEVICE_TRACING enabled
[ 3.996656] iwlwifi 0000:03:00.0: Detected Intel(R) Centrino(R) Advanced-N 6205 AGN, REV=0xB0
[ 3.996828] iwlwifi 0000:03:00.0: L1 Enabled - LTR Disabled
[ 4.306206] iwlwifi 0000:03:00.0 wlp3s0: renamed from wlan0
[ 9.632778] iwlwifi 0000:03:00.0: L1 Enabled - LTR Disabled
[ 9.633025] iwlwifi 0000:03:00.0: L1 Enabled - LTR Disabled
[ 9.633133] iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0
[ 9.898531] iwlwifi 0000:03:00.0: L1 Enabled - LTR Disabled
[ 9.898803] iwlwifi 0000:03:00.0: L1 Enabled - LTR Disabled
[ 9.898906] iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0
[ 20.605734] iwlwifi 0000:03:00.0: L1 Enabled - LTR Disabled
[ 20.605983] iwlwifi 0000:03:00.0: L1 Enabled - LTR Disabled
[ 20.606082] iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0
[ 20.873465] iwlwifi 0000:03:00.0: L1 Enabled - LTR Disabled
[ 20.873831] iwlwifi 0000:03:00.0: L1 Enabled - LTR Disabled
[ 20.873971] iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0

The strange thing, though, is that the firmware was right where it should be:


# ls -lh /lib/firmware/
total 664K
-rw-r--r-- 1 root root 662K Mar 26 13:30 iwlwifi-6000g2a-6.ucode

After digging around for a while, I finally figured out the problem. The kernel was trying to load the firmware for this device/driver before it was actually available. There are definitely ways to build the firmware into the kernel image as well, but instead of going that route, I just chose to rebuild my kernel with this driver as a module (which is actually the recommended method anyway):


[*] Intel devices
...
<M> Intel Wireless WiFi Next Gen AGN - Wireless-N/Advanced-N/Ultimate-N (iwlwifi)
<M> Intel Wireless WiFi DVM Firmware support
< > Intel Wireless WiFi MVM Firmware support
Debugging Options --->
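
To double-check that the driver really ended up as a module and that it loaded its firmware once the filesystem was available, something like the following works; a minimal sketch (reading /proc/config.gz assumes CONFIG_IKCONFIG_PROC is enabled):

# The driver and its DVM firmware support should now be modules (=m).
zgrep 'CONFIG_IWLWIFI=\|CONFIG_IWLDVM=' /proc/config.gz
# Confirm that the modules loaded and the firmware was picked up.
lsmod | grep '^iwl'
dmesg | grep -i 'loaded firmware version'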

If I had fully read the page instead of just skimming it, I could have saved myself a lot of time. Hopefully this post will help anyone getting the “Direct firmware load for iwlwifi-6000g2a-6.ucode failed with error -2” error message.

Cheers,
Zach

April 04, 2016
Hanno Böck a.k.a. hanno (homepage, bugs)

The Owncloud web application has an encryption module. I first became aware of it when a press release was published advertising this encryption module, containing this:

“Imagine you are an IT organization using industry standard AES 256 encryption keys. Let’s say that a vulnerability is found in the algorithm, and you now need to improve your overall security by switching over to RSA-2048, a completely different algorithm and key set. Now, with ownCloud’s modular encryption approach, you can swap out the existing AES 256 encryption with the new RSA algorithm, giving you added security while still enabling seamless access to enterprise-class file sharing and collaboration for all of your end-users.”

To anyone knowing anything about crypto this sounds quite weird. AES and RSA are very different algorithms – AES is a symmetric algorithm and RSA is a public key algorithm - and it makes no sense to replace one by the other. Also RSA is much older than AES. This press release has since been removed from the Owncloud webpage, but its content can still be found in this Reuters news article. This and some conversations with Owncloud developers caused me to have a look at this encryption module.

First it is important to understand what this encryption module is actually supposed to do and understand the threat scenario. The encryption provides no security against a malicious server operator, because the encryption happens on the server. The only scenario where this encryption helps is if one has a trusted server that is using an untrusted storage space.

When one uploads a file with the encryption module enabled it ends up under the same filename in the user's directory on the file storage. Now here's a first, quite obvious problem: The filename itself is not protected, so an attacker that is assumed to be able to see the storage space can already learn something about the supposedly encrypted data.

The content of the file starts with this:
BEGIN:oc_encryption_module:OC_DEFAULT_MODULE:cipher:AES-256-CFB:HEND----

It is then padded with further dashes until position 0x2000, and then the encrypted content follows, Base64-encoded in blocks of 8192 bytes. The header tells us what encryption algorithm and mode is used: AES-256 in CFB mode. CFB stands for Cipher Feedback.

Authenticated and unauthenticated encryption modes

In order to proceed we need some basic understanding of encryption modes. AES is a block cipher with a block size of 128 bit. That means we cannot just encrypt arbitrary input with it, the algorithm itself only encrypts blocks of 128 bit (or 16 byte) at a time. The naive way to encrypt more data is to split it into 16 byte blocks and encrypt every block. This is called Electronic Codebook mode or ECB and it should never be used, because it is completely insecure.

Common modes for encryption are Cipher Block Chaining (CBC) and Counter mode (CTR). These modes are unauthenticated and have a property that's called malleability. This means an attacker that is able to manipulate encrypted data is able to manipulate it in a way that may cause a certain defined behavior in the output. Often this simply means an attacker can flip bits in the ciphertext and the same bits will be flipped in the decrypted data.

To counter this, these modes are usually combined with some authentication mechanism; a common one is called HMAC. However, experience has shown that this combining of encryption and authentication can go wrong. Many vulnerabilities in both TLS and SSH were due to bad combinations of these two mechanisms. Therefore modern protocols usually use dedicated authenticated encryption modes (AEADs); popular ones include Galois/Counter-Mode (GCM), Poly1305 and OCB.

Cipher Feedback (CFB) mode is a self-correcting mode. When an error happens, which can be a simple data transmission error or a hard disk failure, the decryption will be correct again two blocks later. This also allows decrypting parts of an encrypted data stream. But the crucial thing for our attack is that CFB is not authenticated and is malleable. And Owncloud didn't use any authentication mechanism at all.

Therefore the data is encrypted and an attacker cannot see the content of a file (however he learns some metadata: the size and the filename), but an Owncloud user cannot be sure that the downloaded data is really the data that was uploaded in the first place. The malleability of CFB mode works like this: an attacker can flip arbitrary bits in the ciphertext, and the same bits will be flipped in the decrypted data. However, if he flips a bit in any block, then the following block will contain unpredictable garbage.

Backdooring an EXE file

How does that matter in practice? Let's assume we have a group of people that share a software package over Owncloud. One user uploads a Windows EXE installer and the others download it from there and install it. Let's further assume that the attacker doesn't know the content of the EXE file (this is a generous assumption, in many cases he will know, as he knows the filename).

EXE files start with a so-called MZ-header, which is the old DOS EXE header that usually gets ignored. At a certain offset (0x3C), which is at the end of the fourth 16 byte block, there is an address of the PE header, which on Windows systems is the real EXE header. Even on modern executables there is still a small DOS program after the MZ header. This starts with the fifth 16 byte block. This DOS program usually only shows the message “Th is program canno t be run in DOS mode”. And this DOS stub program is almost always exactly the same.

EXE file

Therefore our attacker can do the following: first, flip any non-relevant bit in the third 16 byte block. This will cause the fourth block to contain garbage. The fourth block contains the offset of the PE header. As this is now garbled, Windows will no longer consider this executable to be a Windows application and will therefore execute the DOS stub.

The attacker can then XOR 16 bytes of his own code with the first 16 bytes of the standard DOS stub code. He then XORs the result with the fifth block of the EXE file where he expects the DOS stub to be. Voila: The resulting decrypted EXE file will contain 16 bytes of code controlled by the attacker.

I created a proof of concept of this attack. This isn't enough to launch a real attack, because an attacker only has 16 bytes of DOS assembler code, which is very little. For a real attack an attacker would have to identify further pieces of the executable that are predictable and jump through the code segments.

The first fix

I reported this to Owncloud via HackerOne in January. The first fix they proposed was a change where they used Counter mode (CTR) in combination with HMAC. They still encrypt the file in blocks of 8192 bytes. While this is certainly less problematic than the original construction, it still had an obvious problem: all the 8192-byte file blocks were encrypted the same way. Therefore an attacker can swap or remove chunks of a file. The encryption is still malleable.

The second fix then included a counter for the file and also prevented attacks where an attacker goes back to an earlier version of a file. This solution shipped in Owncloud 9.0, which has recently been released.

Is this new construction secure? I honestly don't know. It is secure enough that I didn't find another obvious flaw in it, but that doesn't mean a whole lot.

You may wonder at this point why they didn't switch to an authenticated encryption mode like GCM. The reason for that is that PHP doesn't support any authenticated encryption modes. There is a proposal and most likely support for authenticated encryption will land in PHP 7.1. However given that using outdated PHP versions is a very widespread practice it will probably take another decade till anyone can use that in mainstream web applications.

Don't invent your own crypto protocols

The practical relevance of this vulnerability is probably limited, because the scenario this encryption feature protects against is relatively obscure. But I think there is a lesson to learn here. When people without a strong cryptographic background create ad-hoc designs of cryptographic protocols, it will almost always go wrong.

It is widely known that designing your own crypto algorithms is a bad idea and that you should use standardized and well-tested algorithms like AES. But using secure algorithms doesn't automatically create a secure protocol. One has to know the interactions and limitations of crypto primitives, and this is far from trivial. There is a worrying trend – especially since the Snowden revelations – that new crypto products that never saw any professional review are developed and advertised en masse. A lot of these products are probably extremely insecure and shouldn't be trusted at all.

If you do crypto you should either do it right (which may mean paying someone to review your design or to create it in the first place) or you better don't do it at all. People trust your crypto, and if that trust isn't justified you shouldn't ship a product that creates the impression it contains secure cryptography.

There's another thing that bothers me about this. Although this seems to be a pretty standard use case of crypto – you have a symmetric key and you want to encrypt some data – there is no straightforward and widely available standard solution for it. Using authenticated encryption solves a number of issues, but not all of them (this talk by Adam Langley covers some interesting issues and caveats with authenticated encryption).

The proof of concept can be found on Github. I presented this vulnerability in a talk at the Easterhegg conference, a video recording is available.

Michal Hrusecky a.k.a. miska (homepage, bugs)
Turris Omnia and openSUSE (April 04, 2016, 05:29 UTC)

About two weeks ago I was at the annual openSUSE Board face to face meeting. It was great and you can read reports of what was going on there on the openSUSE project mailing list. In this post I would like to focus on the other agenda I had while coming to Nuremberg. Nuremberg is, among other things, SUSE HQ, and therefore there is a high concentration of skilled engineers there, and I wanted to take advantage of that…

A little bit of my personal history: I recently joined the Turris team at CZ.NIC, partly because Omnia is so cool and I wanted to help make it happen. And being a long-term openSUSE contributor, I really wanted to find some way to help both projects. I discussed it with my bosses at CZ.NIC and got in contact with Andreas Färber, whom you might know as one of the guys playing with ARMs within the openSUSE project. The result was that I got approval to bring an Omnia prototype to him during the weekend and let him play with it.

My point was to give him a head start, so that when Omnias start shipping, there will already be some research done and maybe even a howto for openSUSE, so you could replace OpenWRT with openSUSE if you wanted. On the other hand, we will also get some preliminary feedback we can still try to incorporate.

Andreas Färber with Omnia

Andreas Färber with Omnia

Why test whether you can install openSUSE on Omnia? And do you want to do that? As a typical end user, probably not. Here are a few arguments that speak against it. OpenWRT is great for routers – it has a nice interface and anything you want to do regarding the network setup is really easy to do. You are able to set up even a complicated network using a simple web UI. Apart from that, by throwing away OpenWRT you would throw away quite a few of the perks of Omnia – like parental control or the mobile application. You might think that it is worth sacrificing those to get a full-fledged server OS you are familiar with and where you can install everything in a non-stripped-down version. Actually, you don’t have to sacrifice anything – OpenWRT on Omnia will support LXC, so you can install your OS of choice inside an LXC container and have both – an easily manageable router with all the bells and whistles, and a virtual server with very little overhead doing the complicated stuff. Or even two or three of them. So most probably, you want to keep OpenWRT and install openSUSE or some other Linux distribution inside a container.

But if you still do want to replace OpenWRT, can you? And how difficult would it be? Long story short, the answer is yes. Andreas was able to get openSUSE running on Omnia and even wrote instructions on how to do it! One little caveat: Turris Omnia is still under heavy development. What Andreas played with was one of the prototypes we have. The software is still being worked on and even the hardware is being polished a little bit from time to time. But still, the hardware will not change drastically, and therefore the howto probably won’t change much either. It is nice to see that it is possible and quite easy to install your average Linux distribution.

Why is having this option so important given all the arguments I stated against doing so? Because of freedom. I consider it a great advantage, when buying a piece of hardware, to know that I can do whatever I want with it and that I’m not locked in and dependent on the vendor for everything. Being able to install openSUSE on Omnia basically proves that Omnia is really open, and even in the unlikely situation in which hell freezes over and CZ.NIC disappears or turns evil, you will still be able to install the latest kernel 66.6 and continue to do whatever you want with your router.

This post was originally posted on CZ.NIC blog, re-posted here to make it available on Planet openSUSE.

April 03, 2016
Michal Hrusecky a.k.a. miska (homepage, bugs)
Shell calendar generator (April 03, 2016, 18:45 UTC)

Some people still use paper calendars. Stuff where you have a picture for the month and all the days in the month listed. I have some relatives who use those. On a loosely related topic, I like to travel and I like to take pictures in foreign lands. So combining both is an obvious idea – to create a calendar where the pictures of the month are taken by me. I searched for some ready-to-use solution but didn’t find anything. So I decided to create my own simple tool. And this post is about creating that tool.

I know time and date stuff is complicated and I wasn’t really looking forward to learning all the rules regarding dates and times and programming them. There had to be a simple way to use some of the tools that are already implemented. An obvious option would be to use one of the date manipulation libraries like mktime and write the tool in C. But that sounded quite heavyweight for such a simple tool. Using Ruby would be an option, but still kinda too much, and I’m not a fluent Rubyist and my Python and Perl are even rustier. I was also thinking about what output format to use so I could print it easily. As I was targeting pretty printed paper, LaTeX sounded like a good choice, and in theory it could be used to implement the whole thing. I even found somebody who did that, but I didn’t manage to comprehend how it worked, how to modify it or even how to compile it. Turns out my LaTeX is rusty as well.

So I decided to use shell and the powerful date command to generate the content. I started with generating LaTeX code, as I still want it on paper in the end, right? Trouble is, LaTeX makes great papers if you want to look serious and do some serious typography. For a calendar on the wall, you probably want to make it fancy and screw typography. I was trying to make it do what I wanted, but it was hard. So hard I gave up. And I ended up with the winning combo – shell and HTML. HTML is easy to view and print, and CSS supports various options, including different styles for screen and print media.

HTML and CSS made the whole exercise really easy and I now have something working on GitHub in 150 lines of code, half of which is CSS. It’s not perfect, there is plenty of room for optimization, but it is really simple and fast enough. Are you interested? Give it a try and if it doesn’t work well for you, pull requests are welcome 😉

April 02, 2016
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
Last words on diabetes and software (April 02, 2016, 22:24 UTC)

Started as a rant on G+, then became too long and suited for a blog.

I do not understand why we can easily get people together for something like VideoLAN, but the moment health is involved, the results are just horrible.

Projects either end up as "startuppy" ventures, which want to keep things for themselves and by themselves, or we end up fragmented into tiny one-person projects, because every single glucometer is a different beast and nobody wants to talk with the others.

Tonight I ended up in a half-fight with a project to which I came saying "I've started drafting an exchange format, because nobody has written one down, and the format I've seen you use is just terrible and when I told you, you haven't replied" and the answer was "we're working on something we want to standardize by talking with manufacturers."

Their "we talk with these" projects are also something insane — one seem to be more like the idea of building a new device from scratch (great long term solution, terrible usefulness for people) and the other one is yet-another-build-your-own-cloud kind of solution that tells you to get Heroku or Azure with MongoDB to store your data. It also tells you to use a non-manufacturer-approved scanner for the sensors, which the comments point out can fry those sensors to begin with. (I should check whether that's actually within ToS for Play Store.)

So you know what? I'm losing hope in FLOSS once again. Maybe I should just stop caring, give up this laptop for a new Microsoft Surface Pro, and keep my head away from FLOSS until I am ready for retirement, at which point I can probably just go and keep up with the reading.

I have tried reaching out to the people who have written other tools, like I posted before, but it looks like people are just not interested in discussing this — I did talk with a few people over email about some of the glucometers I dealt with, but that came to one person creating yet another project wanting to become a business, and two figuring out which original proprietary tools to use, because they do actually work.

So I guess you won't be reading much about diabetes on my blog in the future, because I don't particularly enjoy writing this for my sole use, and clearly that's the only kind of usage these projects will ever get. Sharing seems to be considered deprecated.

April 01, 2016
Luca Barbato a.k.a. lu_zero (homepage, bugs)
AVScale – part1 (April 01, 2016, 18:53 UTC)

swscale is one of the most annoying parts of Libav; a couple of years after the initial blueprint, we have something almost functional you can play with.

Colorspace conversion and Scaling

Before delving into the library architecture and the outer API, it is probably good to give an extra quick summary of what this library is about.

Most multimedia concepts are more or less intuitive:
  • encoding is taking some data (e.g. video frames, audio samples) and compressing it by leaving out unimportant details
  • muxing is the act of storing such compressed data and timestamps so that audio and video can play back in sync
  • demuxing is getting back the compressed data with the timing information stored in the container format
  • decoding somehow inflates the data so that video frames can be rendered on screen and the audio played on the speakers

After the decoding step it would seem that all the hard work is done, but since there isn’t a single way to store video pixels or audio samples, you need to process them so they work with your output devices.

That process is usually called resampling for audio; for video we have colorspace conversion to change the pixel information and scaling to change the amount of pixels in the image.

Today I’ll introduce you to the new library for colorspace conversion and scaling we are working on.

AVScale

The library aims to be as simple as possible and hide all the gory details from the user; you won’t need to figure out the heads and tails of functions with quite a large number of arguments, nor special-purpose functions.

The API itself is modelled after avresample and approaches the problem of conversion and scaling in a way quite different from swscale, following the same design of NAScale.

Everything is a Kernel

One of the key concepts of AVScale is that the conversion chain is assembled out of different components, separating the concerns.

Those components are called kernels.

The kernels can be conceptually divided into two kinds:
  • Conversion kernels, which take an input in a certain format and provide an output in another (e.g. rgb2yuv) without changing any other property.
  • Process kernels, which modify the data while keeping the format itself unchanged (e.g. scale)

This pipeline approach provides great flexibility and helps code reuse.

The most common use-cases (such as scaling without conversion or conversion without scaling) can be faster than solutions trying to merge scaling and conversion together in a single step.

API

AVScale works with two kinds of structures:
  • AVPixelFormaton: A full description of the pixel format
  • AVFrame: The frame data, its dimensions and a reference to its format details (aka AVPixelFormaton)

The library will have an AVOption-based system to tune specific options (e.g. selecting the scaling algorithm).

For now only avscale_config and avscale_convert_frame are implemented.

So if the input and output are pre-determined the context can be configured like this:

AVScaleContext *ctx = avscale_alloc_context();

if (!ctx)
    ...

ret = avscale_config(ctx, out, in);
if (ret < 0)
    ...

But you can skip it and scale and/or convert from an input to an output like this:

AVScaleContext *ctx = avscale_alloc_context();

if (!ctx)
    ...

ret = avscale_convert_frame(ctx, out, in);
if (ret < 0)
    ...

avscale_free(&ctx);

The context gets lazily configured on the first call.

Notice that avscale_free() takes a pointer to a pointer, to make sure the context pointer does not stay dangling.

As said the API is really simple and essential.

Help welcome!

Kostya kindly provided an initial proof of concept, and Vittorio, Anton and I prepared this preview in our spare time. There is plenty left to do; if you like the idea (since many keep telling us they would love a swscale replacement), we even have a fundraiser.

March 31, 2016
Anthony Basile a.k.a. blueness (homepage, bugs)

I’ll be honest, this is a short post because the aggregation on planet.gentoo.org is failing for my account!  So, Jorge (jmbsvicetto) is debugging it and I need to push out another blog entry to trigger venus, the aggregation program.  Since I don’t like writing trivial stuff, I’m going to write something short, but hopefully important.

C Standard libraries, like glibc, uClibc, musl and the like, were born out of a world in which every UNIX vendor had their own set of useful C functions.  Code portability put pressure on the various libc to incorporate these functions from one another, leading first to a mess and then to standards like POSIX, XOPEN, SUSv4 and so on.  Chapter 1 of Kerrisk’s The Linux Programming Interface has a nice write-up of this history.

We still live in the shadows of that world today.  If you look through the code base of uClibc you’ll see lots of macros like __GLIBC__, __UCLIBC__, __USE_BSD, and __USE_GNU.  These are used in #ifdef … #endif blocks which are meant to shield features unless you want a glibc-only or uClibc-only feature.

musl has stubbornly and correctly refused to include a __MUSL__ macro.  Consider the approach to portability taken by GNU autotools.  Macros such as AC_CHECK_LIBS(), AC_CHECK_FUNC() or AC_CHECK_HEADERS() unambiguously target the feature in question without making use of __GLIBC__ or __UCLIBC__.  Whereas the previous approach globs functions together into sets, the latter simply asks: do you have this function or not?

Now consider how uClibc makes use of both __GLIBC__ and __UCLIBC__.  If a function is provided by the former but not by the latter, then it expects a program to use

#if defined(__GLIBC__) && !defined(__UCLIBC__)

This is getting a bit ugly and syntactically ambiguous.  Someone not familiar with this could easily misinterpret it, or reject it.
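
As a hypothetical sketch of the difference (the function names are invented for illustration, and the HAVE_* macro is assumed to come from an AC_CHECK_FUNCS test in configure.ac):

#ifdef HAVE_CONFIG_H
#include "config.h"   /* generated by autoconf; defines the HAVE_* macros */
#endif

void some_gnu_extension(void);   /* hypothetical glibc-only function */
void portable_fallback(void);    /* hypothetical local replacement */

/* libc-identity style: infer a whole feature set from which
 * implementation we think we are building against. */
void do_work_by_libc_name(void)
{
#if defined(__GLIBC__) && !defined(__UCLIBC__)
    some_gnu_extension();
#else
    portable_fallback();
#endif
}

/* feature-test style: ask only whether the function exists, using the
 * HAVE_* macro produced by AC_CHECK_FUNCS([some_gnu_extension]). */
void do_work_by_feature_test(void)
{
#ifdef HAVE_SOME_GNU_EXTENSION
    some_gnu_extension();
#else
    portable_fallback();
#endif
}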

So I’ve hit bugs like these.  I hit one in gdk-pixbuf and I was not able to convince upstream to consistently use __GLIBC__ and __UCLIBC__.  Alternatively, I hit this in geocode-glib and geoclue, and they did accept it.  I went with the wrong-minded approach because that’s what was already there, and I didn’t feel like sifting through their code base and revamping their build system.  This isn’t just laziness, it’s historical weight.

So kudos to musl.  And for all the faults of GNU autotools, at least its approach to portability is correct.

March 30, 2016
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
Travel cards collection (March 30, 2016, 19:07 UTC)

As some of you might have noticed, for example by following me on Twitter, I have been traveling a significant amount over the past four years. Part of it has been for work, part for my involvement with VideoLAN and part again for personal reasons (i.e. vacation).

When I travel, I don't rent a car. The main reason is that I (still) don't have a driving license, so particularly when I travel for leisure I tend to go where there is at least some form of public transport, and even better if there is a good one. This matched perfectly with my hopes of visiting Japan (which I did last year), and usually tends to work relatively well with conference venues, so I have not had much trouble with it in the past few years.

One thing that is going a bit overboard for me, though, is the number of travel cards I have by now. With the exception of Japan, every city or so here has a different travel card — and while London appears to have solved that, at least for tourists and casual passengers, by accepting contactless cards as if they were its local travel card (Oyster), nobody else seems to have followed suit, as far as I can see.

Indeed I have at this point at home:

  • Clipper for San Francisco and Bay Area; prepaid, I actually have not used it in a while so I have some money "stuck" on it.
  • SmarTrip for Washington DC; also prepaid, but at least I managed to only keep very little on it.
  • Metro dayLink for Belfast; prepaid by tickets.
  • Ridacard for Edinburgh and the Lothian region; this one has my photo on it, and I paid for a weekly ticket when I used it.
  • imob.venezia, which is now discontinued and which I used when I lived in Venice; it's just terrible.
  • Suica, for Japan, which is a stored-value card that can be used for payments as well as travel, so it comes the closest to London's use of contactless.
  • Leap which is the local Dublin transports card, also prepaid.
  • Navigo for Paris, but I only used it once because you can only store Monday-to-Sunday tickets on it.

I might add a few more this year, as I'm hitting a few new places. On the other hand, while in London yesterday, I realized how nice and handy it is to just use my bank card for popping in and out of the Tube. And I've been wondering how we got to this system of incompatible cards.

In the list above, most of the cities are one per State or Country, which might suggest cards work better within a country, but that's definitely not the case. I have been told that Nottingham has recently moved to a consolidated travelcard, which is not compatible with Oyster either, and both of them are in England.

Suica is the exception. The IC system used in Japan is a stored-value system which can be used for both travel and for general payments, in stores and cafes and so on. This is not "limited" to Tokyo (though limited might be the wrong word there), but rather works in most of the cities I've visited — one exception being busses in Hiroshima, while it worked fine for trams and trains. It is essentially an upside-down version of what happens in London, like if instead of using your payment card to travel, you used your travel card for in-store purchases.

The convenience of using a payment card, by the way, lies for me mostly in being able to use (one of) my bank accounts to pay without having to "earmark" money the way I did for Clipper, which is now going to be used only the next time I actually use public transport in SF — and I'm not sure when that will be!

At the same time, I can think of two big obstacles to implementing contactless payment in place of travelcards: contracts and incentives. On the first note, I'm sure that there is some weight that TfL (Transport for London) can pull that your average small town can't. On the other note, it's a matter for finance experts, which I can only guess at: there is value for the travel companies in receiving money before you travel — Clipper has already had my money in their coffers since I topped it up, though I have not used it.

While topped-up customer credit is essentially a liability for the companies, it also increases their liquidity. So there is little incentive for them to change, particularly the smaller ones. Indeed, moving to a payment system in which the companies get their money mostly from banks rather than through cash is likely to be a problem for them. And we're back to the first matter: contracts. I'm sure TfL can get better deals from banks and credit card companies than most.

There is also the matter of the tech behind all of this. TfL has definitely done a good job with keeping compatible systems — the Oyster I got in 2009, the first time I boarded a plane, still works. During the same seven years, Venice changed their system twice: once keeping the same name/brand but with different protocols on the card (making it compatible with more NFC systems), and once by replacing the previous brand — I assume they have kept some compatibility on the cards but since I no longer live there I have not investigated.

I'm definitely not one of those people who insist that opensource is the solution to everything, and that just by being opened, things become better for society. On the other hand, I do wonder if it would make sense for the opensource community to engage with public services like this to provide a solution that can be more easily mirrored by smaller towns, who would not otherwise be able to afford the system themselves.

On the other hand, this would require, most likely, compromises. The contracts with service providers would likely include a number of NDA-like provisions, and at the same time, the hardware would not be available off-the-shelf.

This post is not providing any useful information I'm afraid, it's just a bit of a bigger opinion I have about opensource nowadays, and particularly about how so many people limit their idea of "public interest" to "privacy" and cryptography.

March 29, 2016
Luca Barbato a.k.a. lu_zero (homepage, bugs)
New AVCodec API (March 29, 2016, 11:58 UTC)

Another week, another API landed in the tree, and since I spent some time drafting it, I guess I should describe how to use what is implemented now. This is part I.

What is here now

Between theory and practice there is a bit of discussion and obviously the (lack of) time to implement, so here is what is different from what I drafted originally:

  • Function Names: push got renamed to send and pull got renamed to receive.
  • No separate functions to probe the process state: need_data and have_data are not here.
  • No codecs ported to use the new API yet, so no actual asynchronicity for now.
  • Subtitles aren’t supported yet.

New API

There are just 4 new functions replacing both audio-specific and video-specific ones:

// Decode
int avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt);
int avcodec_receive_frame(AVCodecContext *avctx, AVFrame *frame);

// Encode
int avcodec_send_frame(AVCodecContext *avctx, const AVFrame *frame);
int avcodec_receive_packet(AVCodecContext *avctx, AVPacket *avpkt);

The workflow is sort of simple:
– You setup the decoder or the encoder as usual
– You feed data using the avcodec_send_* functions until you get a AVERROR(EAGAIN), that signals that the internal input buffer is full.
– You get the data back using the matching avcodec_receive_* function until you get a AVERROR(EAGAIN), signalling that the internal output buffer is empty.
– Once you are done feeding data you have to pass a NULL to signal the end of stream.
– You can keep calling the avcodec_receive_* function until you get AVERROR_EOF.
– You free the contexts as usual.

Decoding examples

Setup

The setup uses the usual avcodec_open2.

    ...

    c = avcodec_alloc_context3(codec);

    ret = avcodec_open2(c, codec, &opts);
    if (ret < 0)
        ...

Simple decoding loop

People using the old API usually have some kind of simple loop like

while (get_packet(pkt)) {
    ret = avcodec_decode_video2(c, picture, &got_picture, pkt);
    if (ret < 0) {
        ...
    }
    if (got_picture) {
        ...
    }
}

The old functions can be replaced by calling something like the following.

// The flush packet is a non-NULL packet with size 0 and data NULL
int decode(AVCodecContext *avctx, AVFrame *frame, int *got_frame, AVPacket *pkt)
{
    int ret;

    *got_frame = 0;

    if (pkt) {
        ret = avcodec_send_packet(avctx, pkt);
        // In particular, we don't expect AVERROR(EAGAIN), because we read all
        // decoded frames with avcodec_receive_frame() until done.
        if (ret < 0)
            return ret == AVERROR_EOF ? 0 : ret;
    }

    ret = avcodec_receive_frame(avctx, frame);
    if (ret < 0 && ret != AVERROR(EAGAIN) && ret != AVERROR_EOF)
        return ret;
    if (ret >= 0)
        *got_frame = 1;

    return 0;
}
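
For reference, a minimal sketch of how that wrapper slots into the old-style loop shown earlier (get_packet() and the frame handling are placeholders, as before):

while (get_packet(pkt)) {
    ret = decode(c, picture, &got_picture, pkt);
    if (ret < 0) {
        ...
    }
    if (got_picture) {
        ...
    }
}

// At end of stream, pass the flush packet once and then keep calling
// decode() with pkt == NULL until got_picture stays 0, following the
// workflow described above.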

Callback approach

Since the new API will output multiple frames in certain situations, it is better to process them as they are produced.

// return 0 on success, negative on error
typedef int (*process_frame_cb)(void *ctx, AVFrame *frame);

int decode(AVCodecContext *avctx, AVPacket *pkt,
           process_frame_cb cb, void *priv)
{
    AVFrame *frame = av_frame_alloc();
    int ret;

    ret = avcodec_send_packet(avctx, pkt);
    // Again EAGAIN is not expected
    if (ret < 0)
        goto out;

    while (!ret) {
        ret = avcodec_receive_frame(avctx, frame);
        if (!ret)
            ret = cb(priv, frame);
    }

out:
    av_frame_free(&frame);
    if (ret == AVERROR(EAGAIN))
        return 0;
    return ret;
}

Separated threads

The new API makes it sort of easy to split the workload into two separate threads.

// Assume we have context with a mutex, a condition variable and the AVCodecContext


// Feeding loop
{
    AVPacket *pkt = NULL;

    while ((ret = get_packet(ctx, pkt)) >= 0) {
        pthread_mutex_lock(&ctx->lock);

        ret = avcodec_send_packet(avctx, pkt);
        if (!ret) {
            pthread_cond_signal(&ctx->cond);
        } else if (ret == AVERROR(EAGAIN)) {
            // Signal the draining loop
            pthread_cond_signal(&ctx->cond);
            // Wait here
            pthread_cond_wait(&ctx->cond, &ctx->lock);
        } else if (ret < 0)
            goto out;

        pthread_mutex_unlock(&ctx->lock);
    }

    pthread_mutex_lock(&ctx->lock);
    ret = avcodec_send_packet(avctx, NULL);

    pthread_cond_signal(&ctx->cond);

out:
    pthread_mutex_unlock(&ctx->lock);
    return ret;
}

// Draining loop
{
    AVFrame *frame = av_frame_alloc();

    while (!done) {
        pthread_mutex_lock(&ctx->lock);

        ret = avcodec_receive_frame(avctx, frame);
        if (!ret) {
            pthread_cond_signal(&ctx->cond);
        } else if (ret == AVERROR(EAGAIN)) {
            // Signal the feeding loop
            pthread_cond_signal(&ctx->cond);
            // Wait
            pthread_cond_wait(&ctx->cond, &ctx->lock);
        } else if (ret < 0)
            goto out;

        pthread_mutex_unlock(&ctx->lock);

        if (!ret) {
            do_something(frame);
        }
    }

out:
    pthread_mutex_unlock(&ctx->lock);
    return ret;
}

It isn’t as neat as having all this abstracted away, but is mostly workable.

Encoding Examples

Simple encoding loop

Some compatibility with the old API can be achieved using something along the lines of:

int encode(AVCodecContext *avctx, AVPacket *pkt, int *got_packet, AVFrame *frame)
{
    int ret;

    *got_packet = 0;

    ret = avcodec_send_frame(avctx, frame);
    if (ret < 0)
        return ret;

    ret = avcodec_receive_packet(avctx, pkt);
    if (!ret)
        *got_packet = 1;
    if (ret == AVERROR(EAGAIN))
        return 0;

    return ret;
}
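
And a matching minimal usage sketch (get_frame() and write_packet() are hypothetical placeholders), mirroring the old encode loop:

while (get_frame(frame)) {
    ret = encode(c, pkt, &got_packet, frame);
    if (ret < 0) {
        ...
    }
    if (got_packet) {
        write_packet(pkt);
    }
}

// At end of stream, drain by following the send-NULL / receive-until-EOF
// workflow described at the beginning of the post.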

Callback approach

Since for each input multiple outputs could be produced, it is better to loop over the output as soon as possible.

// return 0 on success, negative on error
typedef int (*process_packet_cb)(void *ctx, AVPacket *pkt);

int encode(AVCodecContext *avctx, AVFrame *frame,
           process_packet_cb cb, void *priv)
{
    AVPacket *pkt = av_packet_alloc();
    int ret;

    ret = avcodec_send_frame(avctx, frame);
    if (ret < 0)
        goto out;

    while (!ret) {
        ret = avcodec_receive_packet(avctx, pkt);
        if (!ret)
            ret = cb(priv, pkt);
    }

out:
    av_packet_free(&pkt);
    if (ret == AVERROR(EAGAIN))
        return 0;
    return ret;
}

The I/O should happen in a different thread when possible so the callback should just enqueue the packets.

Coming Next

This post is long enough so the next one might involve converting a codec to the new API.

March 28, 2016
Anthony Basile a.k.a. blueness (homepage, bugs)

RBAC is a security feature of the hardened-sources kernels.  As its name suggests, it’s a role-based access control system which allows you to define policies for restricting access to files, sockets and other system resources.  Even root is restricted, so attacks that escalate privilege are not going to get far even if they do obtain root.  In fact, you should be able to give out remote root access to anyone on a well-configured system running RBAC and still remain confident that you are not going to be owned!  I wouldn’t recommend it just in case, but it should be possible.

It is important to understand what RBAC will give you and what it will not.  RBAC has to be part of a more comprehensive security plan and is not a single security solution.  In particular, if one can compromise the kernel, then one can proceed to compromise the RBAC system itself and undermine whatever security it offers.  Or put another way, protecting root is pretty much a moot point if an attacker is able to get ring 0 privileges.  So, you need to start with an already hardened kernel, that is a kernel which is able to protect itself.  In practice, this means configuring most of the GRKERNSEC_* and PAX_* features of a hardened-sources kernel.  Of course, if you’re planning on running RBAC, you need to have that option on too.

Once you have a system up and running with a properly configured kernel, the next step is to set up the policy file which lives at /etc/grsec/policy.  This is where the fun begins because you need to ask yourself what kind of a system you’re going to be running and decide on the policies you’re going to implement.  Most of the existing literature is about setting up a minimum privilege system for a server which runs only a few simple processes, something like a LAMP stack.  I did this for years when I ran a moodle server for D’Youville College.  For a minimum privilege system, you want to deny-by-default and only allow certain processes to have access to certain resources as explicitly stated in the policy file.  RBAC is ideally suited for this.  Recently, however, I was asked to set up a system where the opposite was the case, so this article is going to explore the situation where you want to allow-by-default; however, for completeness let me briefly cover deny-by-default first.

The easiest way to proceed is to get all your services running as they should and then turn on learning mode for about a week, or at least until you have one cycle of, say, log rotations and other cron-based jobs.  Basically, your services should have attempted to access each resource at least once so the event gets logged.  You then distill those logs into a policy file describing only what should be permitted and tweak as needed.  You proceed roughly as follows:

1. gradm -P  # Create a password to enable/disable the entire RBAC system
2. gradm -P admin  # Create a password to authenticate to the admin role
3. gradm -F -L /etc/grsec/learning.log # Turn on system wide learning
4. # Wait a week.  Don't do anything you don't want to learn.
5. gradm -F -L /etc/grsec/learning.log -O /etc/grsec/policy  # Generate the policy
6. gradm -E # Enable RBAC system wide
7. # Look for denials.
8. gradm -a admin  # Authenticate to admin to do extraordinary things, like tweak the policy file
9. gradm -R # reload the policy file
10. gradm -u # Drop those privileges to do ordinary things
11. gradm -D # Disable RBAC system wide if you have to

Easy, right?  This will get you pretty far, but you’ll probably discover that some things you want to work are still being denied because those particular events never occurred during the learning.  A typical example here is that you might have ssh’ed in from one IP, but now you’re ssh-ing in from a different IP and you’re getting denied.  To tweak your policy, you first have to escape the restrictions placed on root by transitioning to the admin role.  Then using dmesg you can see what was denied, for example:

[14898.986295] grsec: From 192.168.5.2: (root:U:/) denied access to hidden file / by /bin/ls[ls:4751] uid/euid:0/0 gid/egid:0/0, parent /bin/bash[bash:4327] uid/euid:0/0 gid/egid:0/0

This tells you that root, logged in via ssh from 192.168.5.2, tried to ls / but was denied.  As we’ll see below, this is a one-line fix, but if there is a cluster of denials to /bin/ls, you may want to turn on learning for just that one subject for root.  To do this you edit the policy file and look for subject /bin/ls under role root.  You then add an ‘l’ to the subject line to enable learning for just that subject.

role root uG
…
# Role: root
subject /bin/ls ol {  # Note the ‘l’

You restart RBAC using gradm -E -L /etc/grsec/partial-learning.log and obtain the new policy for just that subject by running gradm -L /etc/grsec/partial-learning.log -O /etc/grsec/partial-learning.policy.  That single subject block can then be spliced into the full policy file to change the restrictions on /bin/ls when run by root.

It’s pretty obvious that RBAC is designed for deny-by-default. If access is not explicitly granted to a subject (an executable) to access some object (some system resource) when it’s running in some role (as some user), then access is denied.  But what if you want to create a policy which is mostly allow-by-default and then just add a few denials here and there?  While RBAC is more suited for the opposite case, we can do something like this on a per-account basis.

Let’s start with a fairly permissive policy file for root:

role admin sA
subject / rvka {
	/			rwcdmlxi
}

role default
subject / {
	/			h
	-CAP_ALL
	connect disabled
	bind    disabled
}

role root uG
role_transitions admin
role_allow_ip 0.0.0.0/0
subject /  {
	/			r
	/boot			h
#
	/bin			rx
	/sbin			rx
	/usr/bin		rx
	/usr/libexec		rx
	/usr/sbin		rx
	/usr/local/bin		rx
	/usr/local/sbin		rx
	/lib32			rx
	/lib64			rx
	/lib64/modules		h
	/usr/lib32		rx
	/usr/lib64		rx
	/usr/local/lib32	rx
	/usr/local/lib64	rx
#
	/dev			hx
	/dev/log		r
	/dev/urandom		r
	/dev/null		rw
	/dev/tty		rw
	/dev/ptmx		rw
	/dev/pts		rw
	/dev/initctl		rw
#
	/etc/grsec		h
#
	/home			rwcdl
	/root			rcdl
#
	/proc/slabinfo		h
	/proc/modules		h
	/proc/kallsyms		h
#
	/run/lock		rwcdl
	/sys			h
	/tmp			rwcdl
	/var			rwcdl
#
	+CAP_ALL
	-CAP_MKNOD
	-CAP_NET_ADMIN
	-CAP_NET_BIND_SERVICE
	-CAP_SETFCAP
	-CAP_SYS_ADMIN
	-CAP_SYS_BOOT
	-CAP_SYS_MODULE
	-CAP_SYS_RAWIO
	-CAP_SYS_TTY_CONFIG
	-CAP_SYSLOG
#
	bind 0.0.0.0/0:0-32767 stream dgram tcp udp igmp
	connect 0.0.0.0/0:0-65535 stream dgram tcp udp icmp igmp raw_sock raw_proto
	sock_allow_family all
}

The syntax is pretty intuitive. The only thing not illustrated here is that a role can, and usually does, have multiple subject blocks which follow it. Those subject blocks belong only to the role that they are under, and not another.

The notion of a role is critical to understanding RBAC. Roles are like UNIX users and groups but within the RBAC system. The first role above is the admin role. It is ‘special’ meaning that it doesn’t correspond to any UNIX user or group, but is only defined within the RBAC system. A user will operate under some role but may transition to another role if the policy allows it. Transitioning to the admin role is reserved only for root above; but in general, any user can transition to any special role provided it is explicitly specified in the policy. No matter what role the user is in, he only has the UNIX privileges for his account. Those are not elevated by transitioning, but the restrictions applied to his account might change. Thus transitioning to a special role can allow a user to relax some restrictions for some special reason. This transitioning is done via gradm -a somerole and can be password protected using gradm -P somerole.

The second role above is the default role. When a user logs in, RBAC determines the role he will be in by first trying to match the user name to a role name. Failing that, it will try to match the group name to a role name and failing that it will assign the user the default role.

The third role above is the root role and it will be the main focus of our attention below.

The flags following the role name specify the role’s behavior. The ‘s’ and ‘A’ in the admin role line say, respectively, that it is a special role (ie, one not to be matched by a user or group name) and that it has extra powers that a normal role doesn’t have (eg, it is not subject to ptrace restrictions). It’s good to have the ‘A’ flag in there, but it’s not essential for most uses of this role. It’s really its subject block which makes it useful for administration. Of course, you can change the name if you want to practice a little bit of security by obfuscation. As long as you leave the rest alone, it’ll still function the same way.

The root role has the ‘u’ and the ‘G’ flags. The ‘u’ flag says that this role is to match a user by the same name, obviously root in this case. Alternatively, you can have the ‘g’ flag instead, which says to match a group by the same name. The ‘G’ flag gives this role permission to authenticate to the kernel, ie, to use gradm. Policy information is automatically added that allows gradm to access /dev/grsec, so you don’t need to add those permissions yourself. Finally, the default role doesn’t and shouldn’t have any flags. If it’s not a ‘u’ or ‘g’ or ‘s’ role, then it’s a default role.

Before we jump into the subject blocks, you’ll notice a couple of lines after the root role. The first says ‘role_transitions admin’ and permits the root role to transition to the admin role. Any special roles you want this role to transition to can be listed on this line, space delimited. The second line says ‘role_allow_ip 0.0.0.0/0’. So when root logs in remotely, it will be assigned the root role provided the login is from an IP address matching 0.0.0.0/0. In this example, this means any IP is allowed. But if you had something like 192.168.3.0/24 then only root logins from the 192.168.3.0 network would get user root assigned role root. Otherwise RBAC would fall back on the default role. If you don’t have the line in there, get used to logging on at the console because you’ll cut yourself off!

Now we can look at the subject blocks. These define the access controls restricting processes running in the role to which those subjects belong. The name following the ‘subject’ keyword is either a path to a directory containing executables or to an executable itself. When a process is started from an executable in that directory, or from the named executable itself, then the access controls defined in that subject block are enforced. Since all roles must have the ‘/’ subject, all processes started in a given role will at least match this subject. You can think of this as the default if no other subject matches. However, additional subject blocks can be defined which further modify restrictions for particular processes. We’ll see this towards the end of the article.

Let’s start by looking at the ‘/’ subject for the default role since this is the most restrictive set of access controls possible. The block following the subject line lists the objects that the subject can act on and what kind of access is allowed. Here we have ‘/ h’ which says that every file in the file system starting from ‘/’ downwards is hidden from the subject. This includes read/write/execute/create/delete/hard link access to regular files, directories, devices, sockets, pipes, etc. Since pretty much everything is forbidden, no process running in the default role can look at or touch the file system in any way. Don’t forget that, since the only role that has a corresponding UNIX user or group is the root role, this means that every other account is simply locked out. However the file system isn’t the only thing that needs protecting since it is possible to run, say, a malicious proxy which simply bounces evil network traffic without ever touching the filesystem. To control network access, there are the ‘connect’ and ‘bind’ lines that define what remote addresses/ports the subject can connect to as a client, or what local addresses/ports it can listen on as a server. Here ‘disabled’ means no connections or bindings are allowed. Finally, we can control what Linux capabilities the subject can assume, and -CAP_ALL means they are all forbidden.

Next, let’s look at the ‘/’ subject for the admin role. This, in contrast to the default role, is about as permissive as you can get. First thing we notice is the subject line has some additional flags ‘rvka’. Here ‘r’ means that we relax ptrace restrictions for this subject, ‘a’ means we do not hide access to /dev/grsec, ‘k’ means we allow this subject to kill protected processes and ‘v’ means we allow this subject to view hidden processes. So ‘k’ and ‘v’ are interesting and have counterparts ‘p’ and ‘h’ respectively. If a subject is flagged as ‘p’ it means its processes are protected by RBAC and can only be killed by processes belonging to a subject flagged with ‘k’. Similarly processes belonging to a subject marked ‘h’ can only be viewed by processes belonging to a subject marked ‘v’. Nifty, eh? The only object line in this subject block is ‘/ rwcdmlxi’. This says that this subject can ‘r’ead, ‘w’rite, ‘c’reate, ‘d’elete, ‘m’ark as setuid/setgid, hard ‘l’ink to, e’x’ecute, and ‘i’nherit the ACLs of the subject which contains the object. In other words, this subject can do pretty much anything to the file system.

Finally, let’s look at the ‘/’ subject for the root role. It is fairly permissive, but not quite as permissive as the previous subject. It is also more complicated, and many of the object lines are there because gradm does a sanity check on policy files to help make sure you don’t open any security holes. Notice that here we have ‘+CAP_ALL’ followed by a series of ‘-CAP_*’. Each of these was included because otherwise gradm would complain. For example, if ‘CAP_SYS_ADMIN’ is not removed, an attacker can mount filesystems to bypass your policies.

So I won’t go through this entire subject block in detail, but let me highlight a few points. First consider these lines

	/			r
	/boot			h
	/etc/grsec		h
	/proc/slabinfo		h
	/proc/modules		h
	/proc/kallsyms		h
	/sys			h

The first line gives ‘r’ead access to the entire file system but this is too permissive and opens up security holes, so we negate that for particular files and directories by ‘h’iding them. With these access controls, if the root user in the root role does ls /sys you get

# ls /sys
ls: cannot access /sys: No such file or directory

but if the root user transitions to the admin role using gradm -a admin, then you get

# ls /sys/
block  bus  class  dev  devices  firmware  fs  kernel  module

Next consider these lines:

	/bin			rx
	/sbin			rx
	...
	/lib32			rx
	/lib64			rx
	/lib64/modules		h

Since the ‘x’ flag is inherited by all the files under those directories, this allows processes like your shell to execute, for example, /bin/ls or /lib64/ld-2.21.so. The ‘r’ flag further allows processes to read the contents of those files, so one could do hexdump /bin/ls or hexdump /lib64/ld-2.21.so. Dropping the ‘r’ flag on /bin would stop you from hexdumping the contents, but it would not prevent execution nor would it stop you from listing the contents of /bin. If we wanted to make this subject a bit more secure, we could drop ‘r’ on /bin and not break our system. This, however, is not the case with the library directories. Dropping ‘r’ on them would break the system, since library files need to have readable contents to be loaded, as well as be executable.

Now consider these lines:

        /dev                    hx
        /dev/log                r
        /dev/urandom            r
        /dev/null               rw
        /dev/tty                rw
        /dev/ptmx               rw
        /dev/pts                rw
        /dev/initctl            rw

The ‘h’ flag will hide /dev and its contents, but the ‘x’ flag will still allow processes to enter into that directory and access /dev/log for reading, /dev/null for reading and writing, etc. The ‘h’ is required to hide the directory and its contents because, as we saw above, ‘x’ is sufficient to allow processes to list the contents of the directory. As written, the above policy yields the following result in the root role

# ls /dev
ls: cannot access /dev: No such file or directory
# ls /dev/tty0
ls: cannot access /dev/tty0: No such file or directory
# ls /dev/log
/dev/log

In the admin role, all those files are visible.

Let’s end our study of this subject by looking at the ‘bind’, ‘connect’ and ‘sock_allow_family’ lines. Note that the addresses/ports include a list of allowed transport protocols from /etc/protocols. One gotcha here is to make sure you include port 0 for icmp! The ‘sock_allow_family’ line allows all socket families, including unix, inet, inet6 and netlink.

Now that we understand this policy, we can proceed to add isolated restrictions to our mostly permissive root role. Remember that the system is totally restricted for all UNIX users except root, so if you want to allow some ordinary user access, you can simply copy the entire role, including the subject blocks, and just rename ‘role root’ to ‘role myusername’. You’ll probably want to remove the ‘role_transitions’ line since an ordinary user should not be able to transition to the admin role. Now, suppose for whatever reason you don’t want this user to be able to list any files or directories. You can simply add a line to his ‘/’ subject block which reads ‘/bin/ls h’ and ls becomes completely unavailable to him! This particular example might not be that useful in practice, but you can use this technique, for example, if you want to restrict access to your compiler suite. Just ‘h’ all the directories and files that make up your suite and it becomes unavailable.

A more complicated and useful example might be to restrict a user’s listing of a directory to just his home. To do this, we’ll have to add a new subject block for /bin/ls. If you’re not sure where to start, you can always begin with an extremely restrictive subject block, tack it on at the end of the subjects for the role you want to modify, and then progressively relax it until it works. Alternatively, you can do partial learning on this subject as described above. Let’s proceed manually and add the following:

subject /bin/ls o {
        /  h
        -CAP_ALL
        connect disabled
        bind    disabled
}

Note that this is identical to the extremely restrictive ‘/’ subject for the default role except that the subject is ‘/bin/ls’ not ‘/’. There is also a subject flag ‘o’ which tells RBAC to override the previous policy for /bin/ls. We have to override it because that policy was too permissive. Now, in one terminal execute gradm -R in the admin role, while in another terminal trigger a denial by trying to ls /home/myusername. Checking our dmesg output we see that:

[33878.550658] grsec: From 192.168.5.2: (root:U:/bin/ls) denied access to hidden file /lib64/ld-2.21.so by /bin/ls[bash:7861] uid/euid:0/0 gid/egid:0/0, parent /bin/bash[bash:7164] uid/euid:0/0 gid/egid:0/0

Well, that makes sense. We’ve started afresh denying everything, but /bin/ls requires access to the dynamic linker/loader, so we’ll restore read access to it by adding a line ‘/lib64/ld-2.21.so r’. Repeating our test, we get a seg fault! Obviously, we don’t just need read access to ld.so, we also need execute privileges. We add ‘x’ and try again. This time the denial is

[34229.335873] grsec: From 192.168.5.2: (root:U:/bin/ls) denied access to hidden file /etc/ld.so.cache by /bin/ls[ls:7917] uid/euid:0/0 gid/egid:0/0, parent /bin/bash[bash:7909] uid/euid:0/0 gid/egid:0/0
[34229.335923] grsec: From 192.168.5.2: (root:U:/bin/ls) denied access to hidden file /lib64/libacl.so.1.1.0 by /bin/ls[ls:7917] uid/euid:0/0 gid/egid:0/0, parent /bin/bash[bash:7909] uid/euid:0/0 gid/egid:0/0

Of course! We need ‘rx’ for all the libraries that /bin/ls links against, as well as the linker cache file. So we add lines for libc, libattr, libacl and ld.so.cache. Our final denial is

[34481.933845] grsec: From 192.168.5.2: (root:U:/bin/ls) denied access to hidden file /home/myusername by /bin/ls[ls:7982] uid/euid:0/0 gid/egid:0/0, parent /bin/bash[bash:7909] uid/euid:0/0 gid/egid:0/0

All we need now is ‘/home/myusername r’ and we’re done! Our final subject block looks like this:

subject /bin/ls o {
        /                         h
        /home/myusername          r
        /etc/ld.so.cache          r
        /lib64/ld-2.21.so         rx
        /lib64/libc-2.21.so       rx
        /lib64/libacl.so.1.1.0    rx
        /lib64/libattr.so.1.1.0   rx
        -CAP_ALL
        connect disabled
        bind    disabled
}

Proceeding in this fashion, we can add isolated restrictions to our mostly permissive policy.

References:

The official documentation is The_RBAC_System.  A good reference for the role, subject and object flags can be found in these  Tables.