
Contributors:
. Aaron W. Swenson
. Agostino Sarubbo
. Alexey Shvetsov
. Alexis Ballier
. Alexys Jacob
. Alice Ferrazzi
. Andreas K. Hüttel
. Anthony Basile
. Arun Raghavan
. Bernard Cafarelli
. Brian Harring
. Christian Ruppert
. Chí-Thanh Christopher Nguyễn
. Denis Dupeyron
. Detlev Casanova
. Diego E. Pettenò
. Domen Kožar
. Doug Goldstein
. Eray Aslan
. Fabio Erculiani
. Gentoo Haskell Herd
. Gentoo Miniconf 2016
. Gentoo Monthly Newsletter
. Gentoo News
. Gilles Dartiguelongue
. Greg KH
. Göktürk Yüksek
. Hanno Böck
. Hans de Graaff
. Ian Whyman
. Jan Kundrát
. Jason A. Donenfeld
. Jeffrey Gardner
. Joachim Bartosik
. Johannes Huber
. Jonathan Callen
. Jorge Manuel B. S. Vicetto
. Kristian Fiskerstrand
. Lance Albertson
. Liam McLoughlin
. Luca Barbato
. Marek Szuba
. Mart Raudsepp
. Matt Turner
. Matthew Thode
. Michael Palimaka
. Michal Hrusecky
. Michał Górny
. Mike Doty
. Mike Gilbert
. Mike Pagano
. Nathan Zachary
. Pacho Ramos
. Patrick Kursawe
. Patrick Lauer
. Patrick McLean
. Paweł Hajdan, Jr.
. Petteri Räty
. Piotr Jaroszyński
. Rafael G. Martins
. Remi Cardona
. Richard Freeman
. Robin Johnson
. Ryan Hill
. Sean Amoss
. Sebastian Pipping
. Steev Klimaszewski
. Stratos Psomadakis
. Sven Vermeulen
. Sven Wegener
. Tom Wijsman
. Tomáš Chvátal
. Yury German
. Zack Medico

Last updated:
July 21, 2017, 04:05 UTC

Disclaimer:
Views expressed in the content published here do not necessarily represent the views of Gentoo Linux or the Gentoo Foundation.



Powered by:
Planet Venus

Welcome to Gentoo Universe, an aggregation of weblog articles on all topics written by Gentoo developers. For a more refined aggregation of Gentoo-related topics only, you might be interested in Planet Gentoo.

July 20, 2017
Hanno Böck a.k.a. hanno (homepage, bugs)

Lately, some attention was drawn to a widespread problem with TLS certificates: many people are accidentally publishing their private keys. Sometimes they are released as part of applications, in GitHub repositories or with common filenames on web servers.

If a private key is compromised, a certificate authority is obliged to revoke the corresponding certificate. The Baseline Requirements – a set of rules that browsers and certificate authorities agreed upon – regulate this and say that in such a case a certificate authority shall revoke the certificate within 24 hours (Section 4.9.1.1 in the current Baseline Requirements 1.4.8). These rules exist despite the fact that revocation has various problems and doesn’t work very well, but that’s another topic.

I reported various key compromises to certificate authorities recently and, while not all of them reacted in time, they eventually revoked all certificates belonging to the private keys. I wondered, however, how thoroughly they actually check the reported key compromises. Obviously one would expect them to cryptographically verify that an exposed private key really is the private key belonging to the certificate.

I registered two test domains at a provider that would allow me to hide my identity and not show up in the whois information. I then ordered test certificates from Symantec (via their brand RapidSSL) and Comodo. These are the biggest certificate authorities and they both offer short term test certificates for free. I then tried to trick them into revoking those certificates with a fake private key.

Forging a private key

To understand this we need to get a bit into the details of RSA keys. In essence a cryptographic key is just a set of numbers. For RSA a public key consists of a modulus (usually named N) and a public exponent (usually called e). You don’t have to understand their mathematical meaning, just keep in mind: They’re nothing more than numbers.

An RSA private key is also just numbers, but more of them. If you have read any introductory description of RSA you may know that a private key consists of a private exponent (called d), but in practice it’s a bit more than that. Private keys usually contain the full public key (N, e), the private exponent (d) and several other values that are redundant, but useful to speed up certain operations. Just keep in mind that a public key consists of two numbers and a private key is a public key plus some additional numbers. A certificate, ultimately, is just a public key with some additional information (like the host name that says for which web page it’s valid) signed by a certificate authority.
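If you want to see those numbers for yourself, OpenSSL can dump the individual components of a key; a minimal sketch, assuming a hypothetical key.pem file:
openssl rsa -in key.pem -noout -text

This prints the modulus, publicExponent, privateExponent, prime1, prime2 and the remaining redundant values mentioned above.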

A naive check whether a private key belongs to a certificate could be done by extracting the public key parts of both the certificate and the private key for comparison. However it is quite obvious that this isn’t secure. An attacker could construct a private key that contains the public key of an existing certificate and the private key parts of some other, bogus key. Obviously such a fake key couldn’t be used and would only produce errors, but it would survive such a naive check.
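In OpenSSL terms, such a naive check boils down to comparing only the modulus of the two files; a minimal sketch, assuming hypothetical cert.pem and key.pem files:
openssl x509 -in cert.pem -noout -modulus | sha256sum
openssl rsa -in key.pem -noout -modulus | sha256sum

If the two hashes match, the naive check passes – which, as just described, a forged key can trivially achieve.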

I created such fake keys for both domains and uploaded them to Pastebin. If you want to create such fake keys on your own here’s a script. To make my report less suspicious I searched Pastebin for real, compromised private keys belonging to certificates. This again shows how problematic the leakage of private keys is: I easily found seven private keys for Comodo certificates and three for Symantec certificates, plus several more for other certificate authorities, which I also reported. These additional keys allowed me to make my report to Symantec and Comodo less suspicious: I could hide my fake key report within other legitimate reports about a key compromise.

Symantec revoked a certificate based on a forged private key

Comodo didn’t fall for it. They answered me that there is something wrong with this key. Symantec however answered me that they revoked all certificates – including the one with the fake private key.

No harm was done here, because the certificate was only issued for my own test domain. But I could also have faked private keys for other people’s certificates. Very likely Symantec would have revoked them as well, causing downtime for those sites. I could even have easily created a fake key belonging to Symantec’s own certificate.

The communication by Symantec with the domain owner was far from ideal. I first got a mail saying they were unable to process my order. Then I got another mail about a “cancellation request”. They didn’t explain what had really happened – that the revocation was due to a key uploaded to Pastebin.

I then informed Symantec about the invalid key (from my “real” identity), claiming that I had just noticed there was something wrong with it. At that point they should have been aware that they had revoked the certificate in error. Then I contacted support with my “domain owner” identity and asked why the certificate was revoked. The answer: “I wanted to inform you that your FreeSSL certificate was cancelled as during a log check it was determined that the private key was compromised.”

To summarize: Symantec never told the domain owner that the certificate was revoked due to a key leaked on Pastebin. I assume they didn’t inform their customers in the other cases either. Those customers may thus have experienced a certificate revocation without knowing why, so they can’t learn from it and can’t improve their processes to make sure this doesn’t happen again. Also, Symantec still insisted to the domain owner that the key was compromised, even after I had already informed them that the key was faulty.

How to check if a private key belongs to a certificate?

In case you wonder how to properly check whether a private key belongs to a certificate, you may of course resort to a Google search. And this was fascinating – and scary – to me: I searched Google for “check if private key matches certificate”. I got plenty of instructions. Almost all of them were wrong. The first result is a page from SSLShopper. They recommend comparing the MD5 hash of the modulus. That they use MD5 is not the problem here; the problem is that this is a naive check comparing only parts of the public key. They even provide a form to check this. (That they ask you to put your private key into a web form is a separate issue on its own, but at least they have a warning about it and recommend checking locally.)

Furthermore we get the same wrong instructions from the University of Wisconsin, Comodo (good that their engineers were smart enough not to rely on their own documentation), tbs internet (“SSL expert since 1996”), ShellHacks, IBM and RapidSSL (aka Symantec). A post on Stackexchange is the only result that actually mentions a proper check for RSA keys. Two more Stackexchange posts are not related to RSA; I haven’t checked their solutions in detail.

Going to Google results page two, among some unrelated links, we find more wrong instructions and tools from Symantec, SSL247 (“Symantec Specialist Partner Website Security” – they learned from the best) and some private blog. A documentation page by Aspera (belonging to IBM) at least mentions that you can check the private key, but in an unrelated section of the document. We also get more tools that ask you to upload your private key and then fail to check it properly, from SSLChecker.com, the SSL Store (Symantec “Website Security Platinum Partner”), GlobeSSL (“in SSL we trust”) and – well – RapidSSL.

Documented Security Vulnerability in OpenSSL

So if people google for instructions they’ll almost inevitably end up with non-working instructions or tools. But what about other options? Let’s say we want to automate this and have a tool that verifies whether a certificate matches a private key using OpenSSL. We may end up finding that OpenSSL has a function X509_check_private_key() that can be used to “check the consistency of a private key with the public key in an X509 certificate or certificate request”. Sounds like exactly what we need, right?

Well, until you read the full docs and find out that it has a BUGS section: “The check_private_key functions don't check if k itself is indeed a private key or not. It merely compares the public materials (e.g. exponent and modulus of an RSA key) and/or key parameters (e.g. EC params of an EC key) of a key pair.”

I think this is a security vulnerability in OpenSSL (discussion with OpenSSL here). And that doesn’t change just because it’s a documented security vulnerability. Notably there are downstream consumers of this function that failed to copy that part of the documentation, see for example the corresponding PHP function (the limitation is however mentioned in a comment by a user).

So how do you really check whether a private key matches a certificate?

Ultimately there are two reliable ways to check whether a private key belongs to a certificate. One way is to check whether the various values of the private key are consistent with each other and then check whether the public key matches. For example, a private key contains the values p and q, which are the prime factors of the public modulus N. If you multiply them and compare the result to N, you can be sure that you have a legitimate private key. It’s one of the core properties of RSA that its security rests on the assumption that it’s not feasible to calculate p and q from N.

You can use OpenSSL to check the consistency of a private key:
openssl rsa -in [privatekey] -check

For my forged keys it will tell you:
RSA key error: n does not equal p q

You can then compare the public key, for example by calculating the so-called SPKI SHA256 hash:
openssl pkey -in [privatekey] -pubout -outform der | sha256sum
openssl x509 -in [certificate] -pubkey | openssl pkey -pubin -pubout -outform der | sha256sum

Another way is to sign a message with the private key and then verify it with the public key. You could do it like this:
openssl x509 -in [certificate] -noout -pubkey > /tmp/pubkey.pem
dd if=/dev/urandom of=/tmp/rnd bs=32 count=1
openssl rsautl -sign -pkcs -inkey [privatekey] -in /tmp/rnd -out /tmp/sig
openssl rsautl -verify -pkcs -pubin -inkey /tmp/pubkey.pem -in /tmp/sig -out /tmp/check
cmp /tmp/rnd /tmp/check

If cmp produces no output then the signature matches.
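If you prefer an explicit confirmation, a small variation using cmp’s silent mode works too (just a convenience, not part of the original commands):
cmp -s /tmp/rnd /tmp/check && echo "signature matches"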

As this is all quite complex due to OpenSSL’s arcane command line interface, I have put it all together in a script. You can pass it a certificate and a private key, both in ASCII/PEM format, and it will do both checks.

Summary

Symantec committed a major blunder by revoking a certificate based on completely forged evidence. There’s hardly any excuse for this, and it indicates that they operate a certificate authority without a proper understanding of the cryptographic background.

Apart from that, the problem of checking whether a private key and a certificate match seems to be largely documented incorrectly. Plenty of erroneous guides and tools may cause others to fall into the same trap.

July 18, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
IPv6: WordPress has a long way to go, too (July 18, 2017, 21:04 UTC)

I recently complained about Hugo and the fact that its development seems to have been taken over by SEO types, who changed its defaults to something I’m not comfortable with. In the comments to that post I let it be understood that I’ve been looking into WordPress as an alternative once again.

The reason why I’m looking into WordPress is that I expected it to be a much easier setup, and (assuming I don’t go crazy on the plugins) an easily upgradeable install. Jürgen told me that they now support Markdown, and of course moving to WordPress means I don’t need to keep using Disqus for comments, and I can own my comments again.

The main problem with WordPress, like with most PHP apps, is that it requires particular care to be set up securely and safely. Luckily I do have some experience with this kind of work, and I thought I might as well share my experience and my setup once I got it running. But here is where things got complicated, to the point that I’m not sure I have any chance of getting this working, so I may have to stick with Hugo for much longer than I was hoping. And almost all of the problems come down to the fact that my battery of test servers is IPv6-only. But don’t let me get ahead of myself.

After installing, configuring, and getting MySQL, Apache, and PHP-FPM to all work together nicely (which believe me was not obvious), I tried to set up the Akismet plugin, which failed. I ignored that, removed it, and then figured out that there is no setting to enable Markdown at all. Turns out it requires a plugin, which, according again to Jürgen, is the Jetpack plugin from WordPress.com itself.

Unfortunately, I couldn’t get the plugins page to work at all: it would just return an error connecting to WordPress.org. A quick tcpdump later told me that WordPress tried connecting to api.wordpress.org, which, despite having eight separate IP addresses to respond from, has no IPv6. Well, that’s okay; I have a TinyProxy running on the host system that I use to fetch distfiles from the “outside world” that is not v6-compatible, so I just need to set this up, right? After all, I was already planning on disallowing direct network access to WordPress, so that’s not a big deal.
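For what it’s worth, the capture itself was nothing fancy; a sketch, assuming a hypothetical eth0 interface on the test server:
tcpdump -ni eth0 'port 53 or port 443'

Watching the DNS queries alone is enough to see which hostnames WordPress tries to reach.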

Well, the first problem is that the way to set up proxies with WordPress is not documented in the default wp-config.php file. Luckily I found that someone else wrote it down, and that started me in the right direction. Except it was not enough: the list of plugins and the search page would come up, but plugins wouldn’t download, with the same error about not being able to establish a (secure) connection to WordPress.org. At first this showed up only in the Apache error log; the page itself would show a debug trace if you ask WP to enable debug reporting.

Quite a bit of debugging later, with tcpdump and editing the source code, I found the problem: some of the requests sent by WordPress target HTTP endpoints, and others (including the downloads, correctly) target HTTPS endpoints. The HTTP endpoints worked fine, but the HTTPS ones failed. And the reason why they failed is that they tried to connect to TinyProxy with TLS. TinyProxy does not support TLS, because it really just performs the minimal amount of work needed of a proxy. And for what it’s worth, in my setup it only allows local connections, so there is no real value in adding TLS to it.

Turns out this bug is only present if PHP does not have curl support and WordPress falls back to fsockopen. Enabling the curl USE flag for the ebuild was enough to fix the problem, and I reported the bug. I honestly wonder if the Gentoo ebuild should actually force curl on for WordPress, but I don’t want to go there yet.
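On Gentoo, enabling that flag looks roughly like this (a sketch; the exact file layout under /etc/portage/package.use is up to you):
echo "dev-lang/php curl" >> /etc/portage/package.use/php
emerge --ask --oneshot dev-lang/php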

By the way, I originally didn’t want to mention this in this blog post, but since it effectively went viral: I also found out at that point that the reason why I could get a list of plugins at all is that when the connection to api.wordpress.org over HTTPS fails, the code retries explicitly with HTTP. It’s effectively a silent connection downgrade (you’d still find the warning in the log, but nothing would appear to break at first). This appears to include the “new version check” of WordPress, which makes it an interesting security issue. I reported it via WordPress’s HackerOne page before my tweet went viral – and sorry, I didn’t at first realize just how bad that downgrade was.

So now I have an installation of WordPress, mostly secure, able to look for, fetch and install plugins. Let me install Jetpack to get that famous Markdown support, which for me is a requirement and a dealbreaker. For some reason (read: because WordPress is more Open Core than Open Source), it requires activating with a WordPress.com account. That should be easy, yes?

Error Details: The Jetpack server was unable to communicate with your site https://[OMISSIS] [IXR -32300: transport error: http_request_failed cURL error 7: ]

I hid away the URL for my test server, simply to avoid spammers. The website is public, and it has a valid certificate (thank you Let’s Encrypt), and it is not firewalled or requiring any particular IP to connect to. But, IPv6 only. Which makes it easy for me, as it reduces the amount of scanners and spammers while I try it out, and since I have an IPv6-enabled connection at both home and the office, it makes it easy to test with.

Unfortunately it seems like the WordPress infrastructure not only is not reachable from the IPv6 Internet, but does not even egress onto it either. Which once again shows how far the idea of IPv6-only networks still is from being feasible. Contacting WordPress.com on Twitter ended up with them opening a support ticket for me, and a few exchanges and logs later, they confirmed their infrastructure does not support IPv6 and, as they said, «[they] don’t have an estimate on when [they] may».

Where does this leave me? Well, right now I can’t activate the normal Jetpack plugin, but they have a “development version”, which they assert is fully functional for Markdown, and that should let me keep evaluating WordPress without this particular hurdle. Of course this requires more time, and may end up with me hitting other roadblocks; at this point, I’m not sure. So we’ll see.

Whether I’ll get it to work or not, I will share my configuration files in the near future, because it took me a while to get them set up properly, and some of them are not really explained. You may end up seeing a new restyle of the blog in the next few months. It’s going to bother me a bit, because I usually prefer to keep the blog the same way for years, but I guess that needs to happen this time around. Also, changing the posts’ paths again means I’ll have to set up another chain of redirects. If I do that, I have a bit of a plan to change the hostname of the blog too.

Sven Vermeulen a.k.a. swift (homepage, bugs)
Project prioritization (July 18, 2017, 18:40 UTC)

This is a long read, skip to “Prioritizing the projects and changes” for the approach details...

Organizations and companies generally have an IT workload (dare I say, backlog?) which needs to be properly assessed, prioritized and taken up. Sometimes, the IT team(s) get an amount of budget and HR resources to "do their thing", while others need to continuously ask for approval to launch a new project or instantiate a change.

Sizeable organizations even require engineering and development effort on IT projects that is not readily available: specialized teams exist, but they are, governance-wise, assigned to projects. And as everyone thinks their project is the top-most priority, many will be disappointed when they hear there are no resources available for their pet project.

So... how should organizations prioritize such projects?

Structure your workload, the SAFe approach

A first exercise you want to implement is to structure the workload, ideas or projects. Some changes are small, others are large. Some are disruptive, others are evolutionary. Trying to prioritize all different types of ideas and changes in the same way is not feasible.

Structuring workload is a common approach. Changes are grouped in projects, projects grouped in programs, programs grouped in strategic tracks. Lately, with the rise in Agile projects, a similar layering approach is suggested in the form of SAFe.

In the Scaled Agile Framework a structure is suggested that uses, as a top-level approach, value streams. These are strategically aligned steps that an organization wants to use to build solutions that provide a continuous flow of value to a customer (which can be internal or external). For instance, for a financial service organization, a value stream could focus on 'Risk Management and Analytics'.

SAFe full framework overview, picture courtesy of www.scaledagileframework.com

The value streams are supported through solution trains, which implement particular solutions. This could be a final product for a customer (fitting in a particular value stream) or a set of systems which enable capabilities for a value stream. It is at this level, imo, that the benefits exercises from IT portfolio management and benefits realization management research play their role (more about that later). For instance, a solution train could focus on an 'Advanced Analytics Platform'.

Within a solution train, agile release trains provide continuous delivery for the various components or services needed within one or more solutions. Here, the necessary deliverables are continuously delivered in support of the solution trains. At this level, focus is given to the culture within the organization (think DevOps), and to the relatively short-lived delivery periods. This is the level where I see 'projects' come into play.

Finally, you have the individual teams working on deliverables supporting a particular project.

SAFe is just one of the many methods for organization and development/delivery management. It is a good blueprint to look into, although I fear that larger organizations will find it challenging to dedicate resources in a manageable way. For instance, how to deal with specific expertise across solutions which you can't dedicate to a single solution at a time? What if your organization only has two telco experts to support dozens of projects? Keep that in mind, I'll come back to that later...

Get non-content information about the value streams and solutions

Next to the structuring of the workload, you need to obtain information about the solutions that you want to implement (keeping with the SAFe terminology). And bear in mind that seemingly dull things such as ensuring your firewalls are up to date are also deliverables within a larger ecosystem. Now, with information about the solutions, I don't mean the content-wise information, but instead focus on other areas.

Way back, in 1952, Harry Markowitz introduced Modern portfolio theory as a mathematical framework for assembling a portfolio of assets such that the expected return is maximized for a given level of risk (quoted from Wikipedia). This was later used in an IT portfolio approach by McFarlan in his Portfolio Approach to Information Systems article, published in September 1981.

There it was already introduced that risk and return shouldn't be looked at from an individual project viewpoint, but from how each project contributes to the overall risk and return. A balance, if you wish. His article attempts to categorize projects based on risk profiles in various areas. Personally, I see the suggested categorization more as a way of supporting workload assessments (how many man-days of work will this be), but I digress.

Since then, other publications came up which tried to document frameworks and methodologies that facilitate project portfolio prioritization and management. The focus often boils down to value or benefits realization. In The Information Paradox, John Thorp comes up with a benefits realization approach, which enables organizations to better define and track benefits realization - although it again focuses on larger transformation exercises rather than the lower-level backlogs. The realm of IT portfolio management and benefits realization management gives interesting pointers on the theory side of prioritizing projects.

Still, although one can hardly state the resources are incorrect, a common question is how to make this tangible. Personally, I tend to view the above on the value stream level and solution train level. Here, we have a strong alignment with benefits and value for customers, and we can leverage the ideas of past research.

The information needed at this level often boils down to strategic insights and business benefits, coarse-grained resource assessments, with an important focus on quality of the resources. For instance, a solution delivery might take up 500 days of work (rough estimation) but will also require significant back-end development resources.

Handling value streams and solutions

As we implement this on the highest level in the structure, it should be conceivable that the overview of the value streams (a dozen or so) and solutions (a handful per value stream) is manageable, and something that at an executive level is feasible to work with. These are the larger efforts for structuring and making strategic alignment. Formal methods for prioritization are generally not implemented or described.

In my company, there are exercises that are aligning with SAFe, but it isn't company-wide. Still, there is a structure in place that (within IT) one could map to value streams (with some twisting ;-) and, within value streams, there are structures in place that one could map to the solution train exercises.

We could assume that the enterprise knows about its resources (people, budget ...) and makes a high-level suggestion on how to distribute the resources in the mid-term (such as the next 6 months to a year). This distribution is challenged and worked out with the value stream owners. See also "lean budgeting" in the SAFe approach for one way of dealing with this.

There is no prioritization of value streams. The enterprise has already made its decision on what it finds to be the important values and benefits and decided those in value streams.

Within a value stream, the owner works together with the customers (internal or external) to position and bring out solutions. My experience here is that prioritization is generally based on timings and expectations from the customer. In case of resource contention, the most challenging decision to make here is to put a solution down (meaning, not to pursue the delivery of a solution), and such decisions are hardly taken.

Prioritizing the projects and changes

In the lower echelons of the project portfolio structure, we have the projects and changes. Let's say that the levels here are projects (agile release trains) and changes (team-level). Here, I tend to look at prioritization on project level, and this is the level that has a more formal approach for prioritization.

Why? Because unlike the higher levels, where the prioritization is generally quality-oriented on a manageable amount of streams and solutions, we have a large quantity of projects and ideas. Hence, prioritization is more quantity-oriented in which formal methods are more efficient to handle.

The method that is used in my company uses scoring criteria on a per-project level. This is not innovative per se, as past research has also revealed that project categorization and mapping is a powerful approach for handling project portfolios. Just look for "categorizing priority projects it portfolio" in Google and you'll find ample resources. Kendall's Advanced Project Portfolio Management and the PMO (book) has several example project scoring criteria. But allow me to explain our approach.

It basically is like so:

  1. Each project selects three value drivers (list decided up front)
  2. For the value drivers, the projects check if they contribute to it slightly (low), moderately (medium) or fully (high)
  3. The value drivers have weights, as do the values. Sum the resulting products to get a priority score
  4. Have the priority score validated by a scoring team

Let's get to the details of it.

For the IT projects within the infrastructure area (which is what I'm active in), we have around 5 scoring criteria (value drivers) that are value-stream agnostic, and then 3 to 5 scoring criteria that are value-stream specific. Each scoring criterion has three potential values: low (2), medium (4) and high (9). The numbers are the scores associated with each value.

A scoring criterion also has a weight. For instance, we have a criterion on efficiency (read: business case) which has a weight of 15, so a score of medium within that criterion gives a total value of 60 (4 times 15). The potential values here are based on the "return on investment": low being a return within two years, medium within a year, and high within a few months (don't hold me to the actual values, but you get the idea).

The sum of all values gives a priority score. Now, hold your horses, because we're not done yet. There is a scoring rule that says a project can only be scored on at most 3 scoring criteria. Hence, project owners need to see which scoring areas their project is most visible in, and use those scoring criteria. This rule supports the notion that people shouldn't bring in ideas that will fix world hunger and cure cancer, but specific, well-scoped ideas (the former are generally huge projects, while the latter require far fewer resources).
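To make the arithmetic concrete with made-up numbers: a hypothetical project scored on three criteria with weights 15, 10 and 8, scoring high (9) on the first, medium (4) on the second and low (2) on the third, gets 9 x 15 + 4 x 10 + 2 x 8 = 191 as its priority score.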

OK, so you have a score - is that your priority? No. As a project always falls within a particular value stream, we have a "scoring team" for each value stream which does a number of things. First, it checks if your project really belongs in that value stream (but that's generally implied) and has a deliverable that fits a solution or target within that stream. Projects that don't give any value or aren't asked for by customers are eliminated.

Next, the team validates if the scoring that was used is correct: did you select the right values (low, medium or high) matching the methodology for said criteria? If not, then the score is adjusted.

Finally, the team validates whether the resulting score is perceived to be OK or not. Sometimes, ideas just don't map correctly onto the scoring criteria, and even though a project has a huge strategic importance or deliverable it might score low. In those cases, the scoring team can adjust the score manually. However, this is more of a fail-safe (due to the methodology) than the norm: about one in 20 projects gets its score adjusted. If too many adjustments come up, the scoring team will suggest a change in methodology to rectify the situation.

With the score obtained and validated by the scoring team, the project is given a "go" to move to the project governance. It is the portfolio manager that then uses the scores to see when a project can start.

Providing levers to management

Now, these scoring criteria are not established from a random number generator. An initial suggestion was made on the scoring criteria, and their associated weights, to the higher levels within the organization (read: the people in charge of the prioritization and challenging of value streams and solutions).

The same people are those that approve the weights on the scoring criteria. If management (as this is often the level at which this is decided) feels that business case is, overall, more important than risk reduction, then they will be able to put a higher value in the business case scoring than in the risk reduction.

The only constraint is that the total value of all scoring criteria must be fixed. So an increase on one scoring criteria implies a reduction on at least one other scoring criteria. Also, changing the weights (or even the scoring criteria themselves) cannot be done frequently. There is some inertia in project prioritization: not the implementation (because that is a matter of following through) but the support it will get in the organization itself.

Management can then use external benchmarks and other sources to gauge the level that an organization is at, and then - if needed - adjust the scoring weights to fit their needs.

Resource allocation in teams

Portfolio managers use the scores assigned to the projects to drive their decisions as to when (and which) projects to launch. The trivial approach is to always pick the projects with the highest scores. But that's not all.

Projects can have dependencies on other projects. If these dependencies are "hard" and non-negotiable, then the upstream project (the one being dependent on) inherits the priority of the downstream project (the one depending on the first) if the downstream project has a higher priority. Soft dependencies however need to validate if they can (or have to) wait, or can implement workarounds if needed.

Projects also have specific resource requirements. A project might have a high priority, but if it requires expertise (say DBA knowledge) which is unavailable (because those resources are already assigned to other ongoing projects) then the project will need to wait (once resources are fully allocated and the projects are started, then they need to finish - another reason why projects have a narrow scope and an established timeframe).

For engineers, operators, developers and other roles, this approach allows them to see which workload is more important versus others. When their scope is always within a single value stream, then the mentioned method is sufficient. But what if a resource has two projects, each of a different value stream? As each value stream has its own scoring criteria it can use (and weight), one value stream could systematically have higher scores than others...

Mixing and matching multiple value streams

To allow projects to be somewhat comparable in priority values, an additional rule has been made in the scoring methodology: value streams must have a comparable number of scoring criteria (value drivers), and the total value of all criteria must be fixed (as was already mentioned before). So if there are four scoring criteria and the total value is fixed at 20, then one value stream can have its criteria at (5,3,8,4) while another has them at (5,5,5,5).

This is still not fully adequate, as one value stream could put the maximum weight on a single criterion (20,0,0,0). However, we elected not to put in an additional constraint, and to have management work things out if the situation ever comes up. Luckily, even managers are just human and they tend to follow the notion of well-balanced value drivers.

The result is that two projects will have priority values that are currently sufficiently comparable to allow cross-value-stream experts to be exchangeable without monopolizing these important resources to a single value stream portfolio.

Current state

The scoring methodology has been around for a few years already. Initially, it had fixed scoring criteria used by three value streams (out of seven, the other ones did not use the same methodology), but this year we switched to support both value stream agnostic criteria (like in the past) as well as value stream specific ones.

The methodology is furthest progressed in one value stream (covering around 1000 projects) and is being taken up by two others (they are still looking at what their stream-specific criteria are before switching).

July 16, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
Who reaps the benefits of Free Software? (July 16, 2017, 14:04 UTC)

I feel silly having to start this post by boasting about my own achievements, but my previous post has stirred up a lot of comments (outside my blog’s own, that is) and a number of those could be summed up with “I don’t know this guy, he’s of course new, has no idea what he’s talking about”. So let’s start with the fact that I’ve been involved in Free Software for just about half of my life at this point.

And while I have not tied my name to any particular project, I have contributed to a significant number of them by now. I’m not an “ideas man”, so you can count on me to help figure out problems and fix bugs, but I would probably start hiding in a corner if I built up a cult of personality; unfortunately that appears to be what many other contributors to Free Software have done over the years, and this brings weight to their names. I don’t have such weight, but you’re probably better off googling my name around before thinking I have no stake in the fire.

Introduction over, let’s get to the meat of this post.

If you’re reading this post, it is likely that you’re a Free Software supporter, creator, user, or a combination of these. Even those people that I know fiercely criticized Free Software can’t say that they don’t use any nowadays: at the very least the two major mobile operating systems have a number of Free Software components they rely upon, which means they are all users of Free Software, whether they want it or not. Which means I have no reason to try to convince you that Free Software is good. But good for who?

RMS notoriously titled his essay anthology Free Software, Free Society, which implies that Free Software is good for everybody — I can agree to a point, but at the same time I don’t think it’s as clear cut as he wants to make it sound.

I wrote about this before, but I want to write it down again. For the majority of people, Free Software is not important. You can argue that we should make it clear to them that it is important, but that sounds like a dogma, and I really don’t like dogmas.

Free Software supporters and users, for the most part, are geeks who are able to make use of available source code of software to do… something about it. Sometimes it’s not a matter of making changes to the code, because even making changes is not enough, but for instance you can reverse engineer an undocumented protocol in such a way that can be reimplemented properly.

But what about my brother-in-law? He can’t code to save his life, and he has no interest in reverse engineering protocols. How is Free Software important to him and benefiting him? Truthfully, the answer is “not at all, directly”. He’s a Free Software user, because he has an Android phone and an iPad, but neither is entirely Free Software, and not even the more “libre” projects are making his life any easier. I’ll go back to that later.

In the post I already referenced, I pointed out how the availability of open-source online diabetes management software would make it possible for doctors to set up their own instances of these apps to give their patients access. But that only works in theory – in practice, no doctor would be able to set this up by themselves safely, and data protection laws would likely require them to hire an external contractor to set it up and maintain it. And in that case, what would be the difference between that and hiring a company that developed its own closed-source application, and maybe provides it as a cloud service?

Here’s the somewhat harsh truth: for most people who are not into IT and geekery, there is no direct advantage to Free Software, but there are multiple indirect ones, almost all of which rotate around one “simple” concept: Free Software makes Free Market. And oh my, is this term loaded, and ready to explode a flame just by me using it. Particularly with a significant amount of Free Software activists nowadays being pretty angry with capitalism as a concept, and that’s without going into the geek supremacists ready to poison the well for their own ego.

When Free Software is released, it empowers companies, freelancers, and people alike to re-use and improve it – I will not go into the discussion of which license allows what; I’ll just hand-wave this problem for now – which increases competition, which is generally good. Re-using the example of online diabetes management software: thanks to open source, it’s no longer only the handful of big companies that spent decades working on diabetes that can provide software to doctors, but any other company that wants to… that is, if they have the means to comply with data protection laws and similar hurdles.

Home routers are another area in which Free Software has clearly left a mark. From the WRT54G, which was effectively the easiest hackable router of its time, we made significant progress with both OpenWRT and LEDE, to the point that using a “pure” (to a point) Free Software home router is not only possible but feasible, and there even are routers that you can buy with Free Software firmware installed by default. But even here you can notice an important distinction: you have to buy the router with the firmware you want. You can’t just build it yourself for obvious reasons.

And this is the important part for me. There is this geek myth that everyone should be a “maker” — I don’t agree with this, because not everyone wants to be one, so they should not be required to become one. I am totally sold that everybody should have the chance and access to information to become one if they so want, and that’s why I also advocate reverse engineering and documenting whatever is not already freely accessible.

But for Free Software to be consumable by users, to improve society, you need a mediating agency, and that agency lies in the companies that provide valuable services to users. And with “valuable services” I do not mean solely services aimed at the elites, or even just at that part of the population living in big metropolises like SF or NYC. Not all the “ubers of” companies that try to provide services with which you can interact online or through apps are soulless entities. Not all the people wanting to use these services are privileged (though I am).

And let me be clear, that I don’t mean that Free Software has to be subject to companies and startups to make sense. I have indeed already complained about startups camouflaged by communities. In a healthy environment, the fact that some Free Software project is suitable to make a startup thrive is not the same as the startup needing Free Software contributions to be alive. The distinction is hard to put down on paper, but I think most of you have seen how that turns out for projects like ownCloud.

Since I have already complained about anti-corporatism in Free Software years ago, well before joining the current “big corporation” I work for, why am I going back to the topic, particularly as I can be seen as a controversial character because of that? Well, there are a few things that made me think. This relates only partially to the fact that I’ve been attacked a time or two for working for said corporation, and some of it is because I stopped contributing to some projects – although in all but one of those cases, the reason was simply my own energy to keep contributing, rather than a work-specific problem.

I have seen other colleagues still maintaining enough energy to keep contributing to open source while working at the same office; I have seen some dedicating enough of their work time to that as well. While of course this kind of job does limit the amount of time you can put into Free Software, I also think that a number of contributors who end up burning themselves out due to the hardship of paying the bills would gladly exchange full-time Free Software work for part-time work if they were so lucky. So in this, I still count myself particularly privileged, but I embrace it, because if I can contribute less time but for a longer time, I think it’s worth it.

But while I do my best to keep improving Free Software, and contribute to the public good, including by documenting glucometer protocols, I hear people criticizing how the only open-source GSM stack is funded, even though Harald Welte is dedicating a lot of his personal time, and doing a lot of ungrateful work, while certain “freedom fighters” decide to cut corners and break licenses.

At the same time, even though GitHub is not my personal favourite company, particularly after the most recent allegations about its conduct, its Open Source Friday is a neat idea to convince companies that rely on Free Software to do something – sometimes the something may just as well be writing documentation for software, because that’s possibly more important than coding! Given that some of the reasons I’ve read for attacking them is that they are not “pure enough”, because they do not open their core business application, I feel it’s a bit of a cheap shot, given that they are probably the company that has most empowered Free Software since the original SourceForge.

So what is it that I am suggesting (given people appear to expect me to have answers in addition to opinions)? If I have to give a suggestion to all Free Software contributors out there, it is to always consider what they can do to make sure that their contributions can be consumed at all. That includes, for instance, not using joke licenses and not discriminating against requests from companies, because those companies might have the resources to make your software successful too.

Which does not mean that you should just bend to whatever the first company passing by requests of you, nor that you should provide them with free labour. Again, it’s a game of balance: you can’t have a successful project that nobody uses, but you’re not there to work for free either. The right way is to provide something that is useful and used. And to make this compromise work, one of the best suggestions I can give to Free Software developers is to learn a bit about the theory of business.

Unfortunately, I have also seen way too many Free Software supporters (luckily, fewer contributors) keep believing that words like “business” and “marketing” are the devil’s own, without even stopping to think about what they actually mean – and that is a bad thing. Even when you don’t like some philosophy, or even more so when you don’t like some philosophy, the best way to fight it is to know it. So if you really think marketing is that evil, you may want to go and read a book about marketing: you’ll understand how it works, and how to defend yourself from its tactics.

Michał Górny a.k.a. mgorny (homepage, bugs)
GLEP 73 check results explained (July 16, 2017, 08:40 UTC)

The pkgcheck instance run for the Repo mirror&CI project has today gained full support for GLEP 73 REQUIRED_USE validation and verification. As a result, it can report 5 new issues defined by that GLEP. In this article, I’d like to shortly summarize them and explain how to interpret and solve the reports.

Technical note: the GLEP number has not been formally assigned yet. However, since there is no other GLEP request open at the moment, I have taken the liberty of using the next free number in the implementation.

GLEP73Syntax: syntax violates GLEP 73

GLEP 73 specifies a few syntax restrictions as compared to the pretty much free-form syntax allowed by the PMS. The restrictions could be shortly summarized as:

  • ||, ^^ and ?? groups can not be empty,
  • ||, ^^ and ?? groups can not be nested,
  • USE-conditional groups can not be used inside ||, ^^ and ??,
  • All-of groups (expressed using parentheses without a prefix) are banned completely.

The full rationale for the restrictions, along with examples and proposed fixes is provided in the GLEP. For the purpose of this article, it is enough to say that in all the cases found, there was a simpler (more obvious) way of expressing the same constraint.
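To give a (hypothetical) example of such a simplification: a nested any-of group like

|| ( a || ( b c ) )

violates the nesting restriction, but can simply be flattened into the equivalent

|| ( a b c )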

Violation of this syntax prevents pkgcheck from performing any of the remaining checks. But more importantly, the report indicates that the constraint is unnecessarily complex and could result in REQUIRED_USE mismatch messages that are unnecessarily confusing to the user. Taking a real example, compare:

  The following REQUIRED_USE flag constraints are unsatisfied:
    exactly-one-of ( ( !32bit 64bit ) ( 32bit !64bit ) ( 32bit 64bit ) )

and the effect of a valid replacement:

  The following REQUIRED_USE flag constraints are unsatisfied:
    any-of ( 64bit 32bit )

While we could debate the usefulness of the Portage output, I think it is clear that the second output is simpler to comprehend. And the best proof is that you actually need to think a bit before confirming that the two are equivalent.

GLEP73Immutability: REQUIRED_USE violates immutability rules

This one is rather simple: it means the constraint may tell the user to enable (or disable) a flag that is use.masked/forced. Taking a trivial example:

a? ( b )

The GLEP73Immutability report will trigger if a profile masks the b flag. This means that if the user has a enabled, the PM would normally tell him to enable b as well. However, since b is masked, it can not be enabled using normal methods (we assume that altering use.mask is not normally expected).

The alternative is to disable a then. But what’s the point of letting the user enable it if we afterwards tell him to disable it anyway? It is more friendly to disable both flags together, and this is pretty much what the check is about. So in this case, the solution is to mask a as well.
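In profile terms that usually means adding the triggering flag to package.use.mask alongside the already-masked one; a sketch, using a hypothetical package name:

# in the relevant profile's package.use.mask
dev-foo/bar a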

How to read it? Given the generic message of:

REQUIRED_USE violates immutability rules: [C] requires [E] while the opposite value is enforced by use.force/mask (in profiles: [P])

It indicates that in profiles P (a lot of profiles usually indicates you’re looking for base or top-level arch profile), E is forced or masked, and that you probably need to force/mask C appropriately as well.

GLEP73SelfConflicting: impossible self-conflicting condition

This one is going to be extremely rare. It indicates that somehow the REQUIRED_USE nested a condition and its negation, causing it to never evaluate to true. It is best explained using the following trivial example:

a? ( !a? ( b ) )

This constraint will never be enforced since a and !a can not be true simultaneously.

Is there a point in having such a report at all? Well, such a thing is extremely unlikely to happen. However, it would break the verification algorithms and so we need to account for it explicitly. Since we account for it anyway and it is a clear mistake, why not report it?

GLEP73Conflict: request for conflicting states

This warning indicates that there are at least two constraints that can apply simultaneously and request the opposite states for the same USE flag. Again, best explained on a generic example:

a? ( c ) b? ( !c )

In this example, any USE flag set with both a and b enabled can not satisfy the constraint. However, Portage will happily lead us astray:

  The following REQUIRED_USE flag constraints are unsatisfied:
    a? ( c )

If we follow the advice and enable c, we get:

  The following REQUIRED_USE flag constraints are unsatisfied:
    b? ( !c )

The goal of this check is to avoid such bad advice, and to require constraints to clearly indicate a suggested way forward. For example, the above case could be modified to:

a? ( !b c ) b? ( !c )

to indicate that a takes precedence over b, and that b should be disabled to avoid the impossible constraint. The opposite can be stated similarly — however, note that you need to reorder the constraints to make sure that the PM will get it right:

b? ( !a !c ) a? ( c )

How to read it? Given the generic message of:

REQUIRED_USE can request conflicting states: [Ci] requires [Ei] while [Cj] requires [Ej]

It means that if the user enables Ci and Cj simultaneously, the PM will request conflicting Ei and Ej. Depending on the intent, the solution might involve negating one of the conditions in the other constraint, or reworking the REQUIRED_USE towards another solution.

GLEP73BackAlteration: previous condition starts applying

This warning is the most specific and the least important from all the additions at the moment. It indicates that the specific constraint may cause a preceding condition to start to apply, enforcing additional requirements. Consider the following example:

b? ( c ) a? ( b )

If the user has only a enabled, the second rule will enforce b. Then the condition for the first rule will start matching, and additionally enforce c. Is this a problem? Usually not. However, for the purpose of GLEP 73 we prefer that the REQUIRED_USE can be enforced while processing left-to-right, in a single iteration. If a previous rule starts applying, we may need to do another iteration.

The solution is usually trivial: to reorder (swap) the constraints. However, in some cases developers seem to prefer copying the enforcements into the subsequent rule, e.g.:

b? ( c ) a? ( b c )

Either way works for the purposes of GLEP 73, though the latter increases complexity.

How to read it? Given the generic message of:

REQUIRED_USE causes a preceding condition to start applying: [Cj] enforces [Ej] which may cause preceding [Ci] enforcing [Ei] to evaluate to true

This indicates that if Cj is true, Ej needs to be true as well. Once it is true, a preceding condition of Ci may also become true, adding another requirement for Ei. To fix the issue, you need to either move the latter constraint before the former, or include the enforcement of Ei in the rule for Cj, rendering the application of the first rule unnecessary.

Constructs using ||, ^^ and ?? operators

GLEP 73 specifies a leftmost-preferred behavior for the ||, ^^ and ?? operators. It is expressed in a simple transformation into implications (USE-conditional groups). Long story short:

  • || and ^^ groups force the leftmost unmasked flag if none of the flags are enabled already, and
  • ?? and ^^ groups disable all but the leftmost enabled flag if more than one flag is enabled.

All the verification algorithms work on the transformed form, and so their output may list conditions resulting from it. For example, the following construct:

|| ( a b c ) static? ( !a )

will report a conflict between !b !c ⇒ a and static ⇒ !a. This reflects the fact that, per the aforementioned rule, the || group is transformed into !b? ( !c? ( a ) ), which expresses that if none of the flags are enabled, the first one is preferred – causing a conflict with the static flag.

In this particular case you could argue that the algorithm could choose b or c instead in order to avoid the problem. However, we determined that this kind of heuristic is not a goal for GLEP 73, and instead we always abide by the developer’s preference expressed in the ordering. The only exception to this rule is when the leftmost flag can not match due to a mask, in which case the first unmasked flag is used.

For completeness, I should add that ?? and ^^ blocks create implications in the form of: a ⇒ !b !c…, b ⇒ !c… and so on.
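For instance, under this transformation a hypothetical ?? ( a b c ) behaves like:

a? ( !b !c ) b? ( !c )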

At some point I might work on making the reports include the original form to avoid ambiguity.

The future

The most important goal for GLEP 73 is to make it possible for users to install packages out-of-the-box without having to fight through mazes of REQUIRED_USE, and for developers to use REQUIRED_USE not only sparingly but whenever possible to improve the visibility of resulting package configuration. However, there is still a lot of testing, some fixing and many bikesheds before that could happen.

Nevertheless, I think we can all agree that most of the reports produced so far (with the exception of the back-alteration case) are meaningful even without automatic enforcing of REQUIRED_USE, and fixing them would benefit our users already. I would like to ask you to look for the reports on your packages and fix them whenever possible. Feel free to ping me if you need any help with that.

Once the number of non-conforming packages goes down, I will convert the reports successively into warning levels, making the CI report new issues and the pull request scans proactively complain about them.

July 14, 2017
Sebastian Pipping a.k.a. sping (homepage, bugs)
Expat 2.2.2 released (July 14, 2017, 17:32 UTC)

Includes security fixes, a short article about the release is up here: Expat 2.2.2 released (XML.com)

 

July 12, 2017

Description:
graphicsmagick is a collection of tools and libraries for many image formats.

The complete ASan output of the issue:

# gm identify $FILE
==20404==ERROR: AddressSanitizer: heap-use-after-free on address 0x6230000053c0 at pc 0x7fc01a253357 bp 0x7fffcd2d2630 sp 0x7fffcd2d2628
READ of size 8 at 0x6230000053c0 thread T0
    #0 0x7fc01a253356 in CloseBlob /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/blob.c:859:3
    #1 0x7fc013fbed77 in ReadMNGImage /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/coders/png.c:5144:11
    #2 0x7fc01a50ee88 in ReadImage /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/constitute.c:1607:13
    #3 0x7fc01a3a1f18 in ConvertImageCommand /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:4348:22
    #4 0x7fc01a3de0c5 in MagickCommand /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:8869:17
    #5 0x7fc01a48985b in GMCommandSingle /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:17396:10
    #6 0x7fc01a486991 in GMCommand /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:17449:16
    #7 0x7fc018cf1680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289
    #8 0x419cd8 in _init (/usr/bin/gm+0x419cd8)

0x6230000053c0 is located 6848 bytes inside of 6856-byte region [0x623000003900,0x6230000053c8)
freed by thread T0 here:
    #0 0x4cf4d0 in __interceptor_cfree /var/tmp/portage/sys-libs/compiler-rt-sanitizers-4.0.1/work/compiler-rt-4.0.1.src/lib/asan/asan_malloc_linux.cc:55
    #1 0x7fc01a8f13d2 in MagickFree /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/memory.c:509:5
    #2 0x7fc01a7dc750 in DestroyImage /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/image.c:1277:3
    #3 0x7fc01a8a7cda in DestroyImageList /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/list.c:239:5
    #4 0x7fc013fbed6f in ReadMNGImage /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/coders/png.c:5143:11
    #5 0x7fc01a50ee88 in ReadImage /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/constitute.c:1607:13
    #6 0x7fc01a3a1f18 in ConvertImageCommand /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:4348:22
    #7 0x7fc01a3de0c5 in MagickCommand /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:8869:17
    #8 0x7fc01a48985b in GMCommandSingle /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:17396:10
    #9 0x7fc01a486991 in GMCommand /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:17449:16
    #10 0x7fc018cf1680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289

previously allocated by thread T0 here:
    #0 0x4cf688 in malloc /var/tmp/portage/sys-libs/compiler-rt-sanitizers-4.0.1/work/compiler-rt-4.0.1.src/lib/asan/asan_malloc_linux.cc:66
    #1 0x7fc01a8f04d6 in MagickMalloc /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/memory.c:156:10
    #2 0x7fc01a7a6fa3 in AllocateImage /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/image.c:336:18
    #3 0x7fc013f7819a in ReadMNGImage /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/coders/png.c:3872:9
    #4 0x7fc01a50ee88 in ReadImage /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/constitute.c:1607:13
    #5 0x7fc01a3a1f18 in ConvertImageCommand /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:4348:22
    #6 0x7fc01a3de0c5 in MagickCommand /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:8869:17
    #7 0x7fc01a48985b in GMCommandSingle /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:17396:10
    #8 0x7fc01a486991 in GMCommand /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/command.c:17449:16
    #9 0x7fc018cf1680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289

SUMMARY: AddressSanitizer: heap-use-after-free /var/tmp/portage/media-gfx/graphicsmagick-1.3.26/work/GraphicsMagick-1.3.26/magick/blob.c:859:3 in CloseBlob
Shadow bytes around the buggy address:
  0x0c467fff8a20: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c467fff8a30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c467fff8a40: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c467fff8a50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c467fff8a60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c467fff8a70: fd fd fd fd fd fd fd fd[fd]fa fa fa fa fa fa fa
  0x0c467fff8a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c467fff8a90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c467fff8aa0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c467fff8ab0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c467fff8ac0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==20404==ABORTING

Affected version:
1.3.26

Fixed version:
N/A

Commit fix:
http://hg.code.sf.net/p/graphicsmagick/code/rev/d0a76868ca37

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-11403

Reproducer:
https://github.com/asarubbo/poc/blob/master/00301-graphicsmagick-UAF-CloseBlob

Timeline:
2017-07-10: bug discovered and reported to upstream
2017-07-10: upstream released a fix
2017-07-12: blog post about the issue
2017-07-18: CVE assigned

Note:
This bug was found with American Fuzzy Lop.

Permalink:

graphicsmagick: use-after-free in CloseBlob (blob.c)

Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
Why I do not like Hugo (July 12, 2017, 10:04 UTC)

Not even a year ago, I decided to start using Hugo as the engine for this blog. This has mostly served me well, except for the fact that it relies on me having some kind of access to a console, a text editor, and my Bitbucket account, which made posting stuff while travelling a bit harder, so I opted instead for writing drafts, and then staggering their posts — which is why you now see that for the most part I post something once every three days, except for the free ideas.

Hugo was sold to me as a static generator for blogs, and indeed when I looked into it, that’s what it was clearly aiming at being. Sure, the support for arbitrary taxonomies makes it possible to use it in setups slightly different than a blog, but it was at that point seriously focused on blogs and a few other similar site types. The integration with Disqus was pretty good from the start, as much as I’m not really happy about that choice, and the conversion proceeded mostly smoothly, although it took me weeks to make sure the articles were converted correctly, and even months in I dedicated a few nights a month just to go through the posts and make sure their formatting was right, or through the tags to collapse duplicates.

All in all, while imperfect, it was not as horrible as having to maintain my own Typo fork. Until last week.

I finally decided that maintaining a separate website for the projects is a bad idea. Not just for the style being out of sync between the two, but most importantly because I barely ever update that content, as most of my projects are dead or have their own website already (like Autotools Mythbuster) or they effectively are just using their GitHub repository as the main description, even though it pains me. So the best option I found is to just build the pages I care about into Hugo, particularly using a custom taxonomy for the projects, and be done with it. Except.

Except that to be able to do what I had in mind, I needed a feature that was committed after the version of Hugo I froze myself at, so I had to update. Updates with Typo were always extremely painful because of new dependencies, new features, changes to the database layout, and all those kinds of problems. Certainly Hugo won’t have these problems! Except it decided not to be able to render the theme I was using, as one function got renamed from RSSlink to RSSLink.

That was an easy fix; a bit less easy at first was figuring out that someone decided that RSS feeds should include, unconditionally, the summary of the article, not the full text, because, and I quote: «This is a somewhat breaking change, but is what most people expect from their RSS feeds.»

I’m not sure who these “most people” are. And I’d say that if you want to change such a default, maybe you want it to be an option, but that does not seem to be Hugo’s style, as I’ll show later. But this is not why I’m angry. I’m angry because changing the RSS feed from full content to summary is a very clear change of intent.

An RSS feed that has full article content is an RSS feed for a blog (or other site) that wants to be read. You can use this feed to syndicate on Planets (yes, they still exist), read it on services like Feedly or NewsBlur (no, they did not all disappear with the death of Google Reader), and have it at hand on offline readers on your mobile devices, too.

RSS feeds that only carry summaries are there to drive traffic to a site. And this is where the nasty smell around SEOs and similar titles comes back in from below the door. I totally understand that if one is trying to make a living off their website, they want to be able to bring in traffic, which includes ad views and the like. I have spoken about ads before, and though I recently removed them from the blog altogether for lack of any useful profit, I totally empathise with those who actually can make a profit and want people to see their ads.

But the fact that the tools decide to switch to this mode makes me feel angry and sick, because they are no longer empowering people to make their views visible; they are empowering them to trick users into opening a website, to either get served ads, or (if they are geek enough to use Brave) give bitcoin to the author.

As it turns out, it’s not the only thing that happens to have changed with Hugo, and the changes all sound like someone decided to follow the path of WordPress, which went from a blogging engine to a total solution for managing websites — which is kind of what Typo did when becoming Publify. Except that instead of going for a general website solution, they decided to one-up all of them. From the same release notes of the version that changed the RSS feed defaults:

Hugo 0.20 introduces the powerful and long sought after feature Custom Output Formats; Hugo isn’t just that “static HTML with an added RSS feed” anymore. Say hello to calendars, e-book formats, Google AMP, and JSON search indexes, to name a few ( #2828 ).

Why would you want to build e-book formats and calendars with the same tool you used to build a blog? Sure, if it actually were practical I could possibly make Autotools Mythbuster use this, but I somehow doubt it would have enough support for what I want to get out of the output, so I don’t even want to consider that for now. But all in all, it looks like it widens the target field a little too much.

Anyway, I went and reverted the changes for my local build of Hugo. I ended up giving up on that by the way, and just applied a local template replacement instead, since that way I could also re-introduce another fix I needed for the RSS that was not merged upstream (the ability to put the taxonomy data into the feed, so you can use NewsBlur’s intelligence trainer to filter out some of my blog’s content). Of course maintaining a forked copy of the builtin template also means that it can break when I update, if they decide that it should be FeedLink next time around.

Then I pushed the new version, including the updated donations page – which is not redirected from the old one yet, still working on that – and stopped looking too closely at it. I did this (purposefully) in the 3-day break between two posts, so that if something broke I would have time to fix it, but it looked like everything was alright.

Until I noticed that I had somehow flooded Planet Gentoo with a bunch of posts dating back to 2006! And someone pinged me on Hangouts for the same reason. So I rolled back to the old version (that did not solve the flooding, unfortunately), regenerated, and started digging into what happened.

In the version of Hugo I used originally, the RSS feeds were fixed to 15 items. This is a perfectly reasonable default for a blog, as I didn’t go anywhere near it even at the time I was spending more time blogging than sleeping. But since Hugo is no longer aiming to be just a blog engine, that’s not enough. “News” sites (and I use the term in quotes, because too many of those are actually either aggregators of other things, or just outright scammers, or fake news sites) would have many more than that per day, so 15 is clearly not a good option for them. So in Hugo 0.19 (the version before the one that changed to use summaries), this change can be found:

Make RSS item limit configurable #3035

This is reasonable. The default is kept to 15, but now you can change it in the configuration file to whatever you want it to be, be it 20, 50, or 100.

What I did not notice at that point was a change in the following version:

Raise the default rssLimit #3145

That still sounds good, no? It raises the limit. To what?

hugolib: Default rssLimit to unlimited

Of course this is perfectly fine for small websites that have a hundred or two pages. But this blog counts over 2400 articles, written over the span of 12 years (as I have recovered a number of blog posts from previous platforms, and I’m still always looking to see if I can find the backups with the posts of my 15-year-old self). It ended up generating a 12MB RSS feed with every single page published up to then.

So what am I doing now? That, unfortunately, I’m not sure of. This is the kind of bad upgrade path that frustrated the heck out of me with Typo. Unfortunately the only serious alternative I know to this is WordPress, and that still does not support Markdown unless you use a strange combination of plugins, and I don’t even want to get into that.

I am tempted to see what’s out there for Ruby-based blog engines, although at this point I’m ready to pay for a solution that works natively on AWS, to avoid having to manage it myself. I would like to be able to edit posts without needing a console and a git client, and I would like to have an integrated view of the comments, instead of relying on Disqus1, which at least a few people hate, and which I don’t particularly enjoy.

For now, I guess I’ll have to be extra careful if I want to update Hugo. But at least I should be able to not break this again so easily as I’ll be checking the output before and after the change.


  1. don’t even try suggesting isso. Maybe my comments are not big data, but at 2400+ blog posts, I don’t really want to have something single-threaded that accesses a SQLite file! [return]

Alice Ferrazzi a.k.a. alicef (homepage, bugs)
Google-Summer-of-Code-day20 (July 12, 2017, 08:54 UTC)

Google Summer of Code day 20

What was my plan for today?

  • work on the livepatch downloader and make the kpatch creator flexible

What did I do today?

  • Created .travis.yml for validating changes https://github.com/aliceinwire/elivepatch/blob/master/.travis.yml
  • Finished making the live patch downloader https://github.com/aliceinwire/elivepatch/commit/6eca2eec3572cad0181b3ce61f521ff40fa85ec1
  • Testing elivepatch

The POC generally works, but I had a problem building the Linux kernel 4.9.29 on my notebook. One problem with the POC is that some variables are still hard-coded.

WARNING: Skipping gcc version matching check (not recommended)
Skipping cleanup
Using source directory at /usr/src/linux-4.9.29-gentoo
Testing patch file
checking file fs/exec.c
Hunk #1 succeeded at 238 (offset -5 lines).
Reading special section data
Building original kernel
Building patched kernel
Extracting new and modified ELF sections
/usr/libexec/kpatch/create-diff-object: ERROR: exec.o: find_local_syms: 136: find_local_syms for exec.c: found_none
ERROR: 1 error(s) encountered. Check /root/.kpatch/build.log for more details.

The failing function is find_local_syms: https://github.com/dynup/kpatch/blob/master/kpatch-build/lookup.c#L80

Now I'm rebuilding everything with debug options to see some more useful information. I'm also thinking of adding a debug option to the elivepatch server.

One question is whether it would be useful to work on a feature for getting the kernel version from the kernel configuration file header,

like this:

.config
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.9.29-gentoo Kernel Configuration
#

i.e. parsing this to get the kernel version without needing to give it manually.
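
For illustration, a minimal sketch of what that parsing could look like (my own example with a hypothetical parse_kernel_version helper, not code from elivepatch):

import re

def parse_kernel_version(config_path):
    # The .config header contains a line like:
    #   # Linux/x86 4.9.29-gentoo Kernel Configuration
    pattern = re.compile(r'^# Linux\S* (\S+) Kernel Configuration')
    with open(config_path) as config:
        for line in config:
            match = pattern.match(line)
            if match:
                return match.group(1)
    return None

print(parse_kernel_version('.config'))  # e.g. 4.9.29-gentoo

The server could then fall back to the explicitly given version whenever the header is missing.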

Another option is to pass it over the REST API as a command-line option,

something like -g 4.9.29

An interesting thing is that kpatch-build already has some built-in ways of dealing with most problems, and works better with distributions like Ubuntu or Fedora.

For example, it is already copying the .config file and building the kernel with the options that we are giving from the REST API:

cp -f /home/alicef/IdeaProjects/elivepatch/elivepatch_server/config /usr/src/linux-4.9.29-gentoo/.config

and the patch:

cp /home/alicef/IdeaProjects/elivepatch/elivepatch_server/1.patch kpatch.patch

It is also checking the .config for missing configuration options:

grep -q CONFIG_DEBUG_INFO_SPLIT=y /home/alicef/IdeaProjects/elivepatch/elivepatch_server/config

What will I do next time?

  • Testing elivepatch
  • Getting the kernel version dynamically
  • Updating kpatch-build to work better with Gentoo

July 11, 2017
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)
Best sushi in St. Louis? J Sushi in Arnold. (July 11, 2017, 04:09 UTC)

As a self-proclaimed connoisseur of Asian cuisine, I’m constantly searching out the best restaurants in Saint Louis of the various regions and genres (Thai, Japanese, Vietnamese, as well as sushi, dim sum, et cetera). Having been to many of the staples of St. Louis sushi—Drunken Fish, Kampai, Wasabi, Cafe Mochi, and others—I’ve always been satisfied with their offerings, but yet felt like they missed the mark in one way or another. Don’t get me wrong, all of those places have some great dishes, but I just found them to be lacking that spark to make them stand out as the leader of the pack.

… and then it happened. One day when I was driving north on 61-67 (Jeffco Boulevard / Lemay Ferry), I noticed that the storefront in Water Tower Place that previously housed a mediocre Thai restaurant was set to reopen as a sushi joint. My first thought was “oh no, that’s probably not going to go over well in Arnold” but I hoped for the best. A couple weeks later, it opened as J Sushi. I added it to my ever-growing list of restaurants to try, but didn’t actually make it in for several more weeks.

Salmon Killer roll at J Sushi in St. Louis, MO
The Salmon Killer Roll with spicy crab, asparagus, salmon, cream cheese, mango sauce and Japanese mayo
(click for full quality)

Named for the original owner, Joon Kim (who, as of this writing, is the owner of Shogun in Farmington, MO), J Sushi came onto the scene offering a huge variety of Japanese fare. From a smattering of traditional appetisers like tempura and gyoza, to a gigantic list of rolls and sashimi, to the “I don’t particularly care for raw fish” offerings in their Bento boxes, J Sushi offers dishes to appease just about anyone interested in trying Japanese cuisine.

Since their initial opening, some things have changed at J Sushi. One of the biggest differences is that it is now owned by an employee that Joon himself trained in the ways of sushi over the years: Amanda, and her partner, Joseph. The two of them have taken an already-outstanding culinary experience and elevated it even further with their immediately noticeable hospitality and friendliness (not to mention, incredible aptitude for sushi)!

VIP roll at J Sushi in St. Louis, MO
The VIP Roll with seared salmon, and shrimp tempura… it’s on fire!
(click for full quality)

So, now that you have a brief history of the restaurant, let’s get to the key components that I look for when rating eateries. First and foremost, the food has to be far above par. I expect the food to not only be tasty, but also a true representation of the culture, elegantly plated, and creative. J Sushi delivers on all four of those aspects! I’ve had many of their appetisers, rolls, sushi/sashimi plates, and non-fish dishes, and have yet to find one that wasn’t good. Of course I have my favourites, but so far, nothing has hit the dreaded “do not order again” list. As for plating, the sushi chefs recognise that one firstly eats with the eyes. Dishes are presented in a clean fashion and many of them warrant taking a minute to appreciate them visually before delving in with your chopsticks.

Second, the service has to be commendable. At J Sushi, Amanda, Joe, and the members of the waitstaff go out of their way to greet everyone as they come in and thank them after their meal. The waiters and waitresses come to the table often to check on your beverages, and to see if you need to order anything else. At a sushi restaurant, it’s very important to check for reorders as it’s commonplace to order just a couple rolls at a time. I can imagine that one of the complaints about the service is how long it takes to get your food after ordering. Though it is a valid concern, great sushi is intricate and takes time to execute properly. That being said, I have personally found the wait times to be completely acceptable, even when they’re really busy with dine-ins and take-away orders.

Mastercard roll at J Sushi in St. Louis, MO
The Master Card Roll with shrimp tempura, and gorgeously overlapped tuna, salmon, & mango
(click for full quality)

Third, the restaurant has to be a good value. Does that mean that it has to be inexpensive? No, not at all. When I’m judging a restaurant’s value, I take into consideration the quality of the ingredients used, the time and labour involved in preparation, the ambience, and the overall dining experience. J Sushi, in my opinion, excels in all of these areas, and still manages to keep their prices affordable. Yes, there are cheaper places to get sushi, and even some that offer “all you can eat” options, but you’re certainly exchanging quality for price at those types of establishments. I, for one, would rather pay a little more money to ensure that I’m getting very high quality fish (especially since the flavours and textures of the fish are exponentially heightened when consumed raw).

The Dragon Bowl at J Sushi in St. Louis, MO
The stunningly beautiful Dragon Bowl – as much artwork as it is food!
(click for full quality)

Now for the meat and potatoes (or in this case, the fish and rice): what dishes stand out to me? As I previously said, I haven’t found anything that I dislike on the menu; just dishes that I like more than others. I enjoy changing up my order and trying new things, but there are some items that I keep going back to time and time again. Here are some of my absolute favourites:

Appetisers:

  • Japanese Crab Rangoon
    • Expecting those Chinese-style fried wontons filled with cream cheese? Think again. This amazing “roll” has spicy pulled crab and cream cheese wrapped in soy paper (Mamenori) and rice. It’s deep-fried and served with eel sauce. NOT to be missed!
  • Tuna Tataki
    • Perfectly seared (read: “nearly raw”) tuna served with shredded radish and a light sauce.

Rolls:

  • Master Card Roll
    • Shrimp tempura and spicy tuna inside, topped with fresh tuna, salmon, and slices of mango (see the photo above).
  • Sweet Ogre Roll
    • One of my original favourites, this roll has shrimp tempura and cucumber inside. On top, there’s seared tuna, Sriracha, a little mayo, crunch, and masago.
  • Missouri Thunder Roll
  • Derby Roll
    • Spicy crab and avocado (I swap that for cucumber). Topped with eight beautifully-grilled shrimp.
  • Poison Spider Roll
    • HUGE, double-stuffed roll with a whole deep fried soft-shell crab and cucumber. On top, a bunch of spicy pulled crab, masago, crunch, and eel sauce.

Other:

  • Tai Nigiri
    • Simple Nigiri of Red Snapper
  • Hamachi Nigiri
    • Simple Nigiri of Yellowtail
  • Sushi sampler
    • 5 pieces of various Nigiri (raw fish on rice with a little wasabi)

If your mouth isn’t watering by now, then you must not care all that much for sushi (or Pavlov was sorely misguided 🙂 ). I hope that you try some of the amazing food that I mentioned above, but more importantly, I hope that you check out J Sushi and find the dishes that speak to you personally!

Cheers,
Zach

Important!

The photographs in this post were taken by me. If you would like to use them elsewhere, please just give credit to Nathan Zachary and link back to my blog. Thanks!

July 09, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)

So my previous post with glucometerutils news got picked up by Hackaday, and though the comments ended up mostly talking about the (more physical, less practical) note about fiddling with the glucometers’ hardware themselves (which would suggest to me that the editor should probably have avoided moving the spotlight in the post, but never mind), I ended up replying to a few comments that were actually topical, to the point that I thought I should write about this more extensively.

In the comments, someone brought up Tidepool, which is a non-profit in California that develops what appears to me to be its own data storage and web application for diabetics. This is not far from what Glucosio is meant to be — and you might remember that an interaction with them had me almost leave open source development, at least as far as diabetes is concerned.

The problem with both projects, and a number of others that I’ve been pointed to over the years, is that I find most of them either not practical or web-oriented, or a mixture of the two. By not practical I mean that while building a “universal glucometer” capable of using any random strip is an interesting proposal, it does nothing to improve the patients’ life, and it actually can significantly increase the risk of misreading values and thus risk the life of the user. For this reason, plus the fact that I do not have enough of a biochemistry understanding to figure out how to evaluate the precision of the meters that are already certified, I don’t invest any time looking into these projects.

Web-based applications such as Tidepool and similar are also far from my interests. I do not have a personal problem with my blood sugar readouts being accessed for the sake of research, but I do have some concerns about which actors are allowed access to them. So in particular a startup like Glucosio is not someone I’d be particularly fond of giving access to my data. Tidepool may be a non-profit, but that does not really make me feel much better, particularly because I would expect that a US-based non-profit would not have gone through all the possible data processing requirements of EU legislation, unlike, say, Abbott. I have already written a lot about why I don’t find self-hosting a good solution, so I don’t think I need to spend much time on it here.

Except, there is one extra problem with those apps that require you to set up your own instance — as some of the people who have not been waiting did, some time ago. While running an app for my own interest may sound like an interesting thing to do, particularly if I want to build up the expertise to run complicated web app stacks, my personal ultimate goal is to have my doctor know what my blood sugar levels are over time. This is the whole reason why I started that tool: I wanted to be able to output a PDF that my doctor could see without having to jump through a number of hoops to produce it — I failed to do so, but in part because I lost interest after I started using the awesome Accu-Chek Mobile.

If I were to tell my doctor «Log in on this site here with these credentials and you can see my readouts» he might actually do it, but mostly because of the novelty and because he entertains my geekery around trying different meters and solutions. If he started to get this request from dozens of his patients, not only would he have to keep a password manager just to deal with credentials, but he probably just couldn’t find the time to deal with it. The LibreLink app does have the ability to share data with a few services, and he did suggest I look into diasend, but it looks like it got merged into something else that might or might not work for now, so I gave up.

Now, here is an interesting prospect, and why such apps are not completely worthless in my opinion. If the protocols are open to be used, and the apps are open source and can be set up by anyone, there is space for doctors to have their own instance set up so that their patients can upload their data. Unfortunately, the idea that, being open source, this does not involve a significant investment of time and money is patently false. Particularly for important data like this, there has to be proper security, starting from every session being encrypted with TLS, and the data being encrypted at rest (it is ironic that neither Tidepool nor Glucosio, at the time of writing, uses TLS for their main websites). So I still don’t expect doctors in the public sector to be using these technologies any time soon. But on the other hand, there are more and more apps for this being built by the diabetes tech companies, so maybe we’ll see something happening in the future.

Where does this leave my project? Well, to begin with it’s not a single project but two of them. glucometerutils was born as a proof of concept and is still a handy tool to have. If someone manages to implement output to HTML or to PDF of the data, that would make it a very useful piece of software that does not need to interact with any remote, online application. The protocols repository serves a distinct need: it provides a way for more people to contribute to this ecosystem without requiring each of them to invest significant time in reversing the protocols, or getting in bed with the manufacturers, which – I can only guess – involves NDAs, data-sharing agreements, and similar bureaucracy that most hobbyist developers can’t afford.
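
Going back to the HTML output idea for glucometerutils, here is a rough Python sketch of what I mean; the readings list and its fields are made up for illustration and are not glucometerutils’ actual data model:

import datetime

# Hypothetical readings: (timestamp, value in mg/dL); not the real data model.
readings = [
    (datetime.datetime(2017, 7, 9, 8, 30), 104),
    (datetime.datetime(2017, 7, 9, 13, 10), 142),
]

rows = "\n".join(
    "<tr><td>{:%Y-%m-%d %H:%M}</td><td>{}</td></tr>".format(when, value)
    for when, value in readings
)

with open("readings.html", "w") as out:
    out.write("<table>\n<tr><th>Time</th><th>Glucose (mg/dL)</th></tr>\n")
    out.write(rows + "\n</table>\n")

The resulting file could then be printed to PDF from any browser, which is close to the hand-it-to-the-doctor goal described above.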

Indeed, I know of at least one app, built for iOS, proprietary and commercial (as in, you have to pay for it), that has built support for meters thanks to my repository (and the author gave back in form of corrections and improvements on the documentation!). This is perfectly in line with my reasons to even have such a repository. I don’t care if the consumers and contributors to the repository build closed-source tools, as long as they share the knowledge on how to get to the data. And after that, may the best tool win.

As I said before, smartphones are no longer a luxury, and for many people they are the only way they can access the Internet. In the same way, it makes sense that for many diabetics a smartphone is their only way to analyse their readouts. This is why the Contour Next One comes with Bluetooth and a nice app, and why there even are standard Bluetooth specifications for glucometers (GLP/GLS) and continuous monitors (CGMP/CGMS). If my work on an open-source tool brings more people the ability to manage their diabetes, even with closed-source software, I’ll consider myself satisfied.

Now, there is one more interesting bit with Tidepool, though: they actually publish a Chrome-based uploader app that is able to download data from many more glucometers than my own tool (and the intersection between the two is minimal). This is great! But, as it happens, it comes with a little bit of a downside: the drivers are not documented at all. I confirmed the reason is that the access to the various meters’ protocols is subject to NDA — so while they can publish the code that access those meters, they cannot publish the specs of the protocols themselves, and that appears to include in-code comments that would make it easy to read what’s going on.

So, one of the things I’m going to do is read through those drivers, and try to write a protocol spec for the meters. It appears that they have a driver for Contour Next meters, which may or may not work for the Contour Next One that I’ve been trying to reverse engineer — I know there is at least one other open-source implementation of accessing data from Contour Next meters, but the other one is GPL-2 and, like OpenGlucose, I’ve avoided looking too closely at the code.

Projects such as Tidepool are extremely important to provide a proper alternative to the otherwise closed garden of proprietary cloud diabetes management software. And if they become simple and secure enough to set up, it is possible that some doctors will start providing their own instances where their patients can upload the readings, and that will make them also practical. But for now, to me they are only a good point of comparison to figure out a way forward for my own tools.

July 06, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
More on IPv6 feasibility (July 06, 2017, 10:04 UTC)

I didn’t think I would have to go back to the topic of IPv6, particularly after my last rant on the topic. But of course it’s the kind of topic that leads itself to harsh discussions over Twitter, so here I am back again (sigh).

As a possibly usual heads-up, and to make sure people understand where I’m coming from: it is correct that I do not have a networking background, and I do not know all the details of IPv6 and the related protocols, but I do know quite a bit about it, I have been using it for years, and so my opinion is not that of the lazy sysadmin who sees a task to be done and wants to say there’s no point. Among other things, because I do not like that class of sysadmins any more (I used to). I also seem to have given some people the impression that I am a hater of IPv6. That’s not me, that’s Todd. I have been using IPv6 for a long time, I have IPv6 at home, I set up IPv6 tunnels back in the days of having my own office and contracting out, and I have a number of IPv6-only services (including the Tinderbox).

So with all this on the table, why am I complaining about IPv6 so much? The main reason is that, like I said in the other post, geeks all over the place appear to think that IPv6 is great right now and you can throw away everything else and be done with it right this moment. And I disagree. I think there’s a long way to go, and I also think that this particular attitude will make the way even longer.

I have already covered in the linked post the particular issue of IPv6 having originally been designed for globally identifiable, static IPv6 addresses, and the fact that there have been at least two major RFCs to work around this particular problem. If you have missed that, please go back to that post and read it there, because I won’t repeat myself here.

I want instead to focus on why I think IPv6 only is currently infeasible for your average end user, and why NAT (including carrier-grade NAT) is not going away any year now.

First of all, let’s define what an average end user is, because that is often lost to geeks. An average end user does not care what a router does, they barely care what a router is, and a good chunk of them probably still just call them modem, as their only interest is referring to “the device that the ISP gives you to connect to the Internet”. An average user does not care what an IP address is, nor cares how DNS resolution happens. And the reason for all of this is because the end user cares about what they are trying to do. And what they are trying to do is browse the Internet, whether it is the Web as a whole, Facebook, Twitter, YouTube or whatever else. They read and write their mail, they watch Netflix, NowTV and HBO GO. They play games they buy on Steam or EA Origin. They may or may not have a job, and if they do they may or may not care to use a VPN to connect to whatever internal resources they need.

I won’t make any stupid or sexist stereotype example for this, because I have combined my whole family and some of my friends in that definition, and they are all different. They all don’t care about IPv6, IPv4, DSL technologies and the like. They just want an Internet connection, and one that works and is fast. And with “that works” I mean “where they can reach the services they need to complete their task, whichever that task is”.

Right now that is not an IPv6-only network. It may be, in the future, but, for a number of reasons, I won’t hold my breath that this is going to happen in the next 10 years, despite the increasing pressure and the increasing growth of IPv6 deployment to end users.

The reason why I say this is that right now, there are plenty of services that can only be reached over IPv4, some of which are “critical” (for some definition of critical, of course) to end users, such as Steam. Since the Steam servers are not available over IPv6, the only way you can reach them is either through IPv4 (which will involve some NAT) or NAT64. While the speed of the latter, at least on closed-source proprietary hardware solutions, is getting good enough to be usable, I don’t expect it to be widely deployed any time soon, as it has the drawback of not working with IP literals. We all hate IP literals, but if you think no company ever issues their VPN instructions with an IP literal in them, you are probably going to be disappointed once you ask around.

There could be an interesting post about this level of “security by obscurity”, but I’ll leave that for later.

No ISP wants to receive calls from their customers that access to a given service is not working for them, particularly when you’re talking about end users who do not want to care about tcpdump and traceroute, and customer support that wouldn’t know how to start debugging the fact that NowTV will send a request to an IPv4 address (a literal) before opening the stream, and then continue the streaming over IPv4. Or that Netflix refuses to play any stream if the DNS resolution happens over IPv4 and the stream tries to connect over IPv6.

Which I thought Netflix finally fixed until…

Now, to be fair, it is true that if you’re using an IPv6 tunnel you are indeed proxying. Before I had DS-Lite at home I was using TunnelBroker and it appeared like I was connecting from Scotland rather than Ireland, and so for a while I unintentionally (but gladly) sidestepped country restrictions. But this also happened a few times on DS-Lite, simply because the GeoIP and WhoIs records didn’t match between the CGNAT and the IPv6 blocks. I can tell you it’s not fun to debug.

The end result is that most consumer ISPs will choose to provide a service in such a way that their users feel the least possible inconvenience. Right now that means DS-Lite, which involves a carrier-grade NAT, which is not great, as it is not cheap to run, and it still can cause problems, particularly when users use torrents or P2P heavily, in which case they can very quickly exhaust the 200-port forwarding blocks that are allocated for CGNAT. Of course DS-Lite also takes away your public IPv4, which is why I heard a number of geeks complaining loudly about DS-Lite as a deployment option.

Now there is another particular end user, in addition to geeks, that may care about IP addresses: gamers. In particular online gamers (rather than, say, Fallout 4 or Skyrim fans like me). The reason for that is that most of the online games use some level of P2P traffic, and so require you to have a way to receive inbound packets. While it is technically possible to set up IGD-based redirection all the way from the CGNAT address to your PC or console, I do not know how many ISPs implement that correctly. Also, NAT in general introduces risks for latency, and requires more resources on the passing routers, and that is indeed a topic that is close to the heart of gamers. Of course, gamers are not your average end user.

An aside: back in the early days of ADSL in Italy, it was a gaming website, building its own ISP infrastructure, that first introduced Fastpath to the country. Other ISPs did indeed follow, but NGI (the ISP noted above) stayed for a long while a niche ISP focused on the need of gamers over other concerns, including price.

There is one caveat that I have not described yet, but I really should, because it’s one of the first objections I receive every time I speak about the infeasibility of IPv6-only end user connections: the mobile world. T-Mobile in the US, in particular, is known for having deployed IPv6-only 3G/4G mobile networks. There is a big asterisk to put here, though. In the US, and in Italy, and a number of other countries to my knowledge, mobile networks have classically been CGNAT before being v6-only, and with a large amount of filtering in what you can actually connect to, even without considering tethering – this is not always the case for specialised contracts that allow tethering or for “mobile DSL” as they marketed it in Italy back in the days – and as such, most of the problems you could face with VPNs, v4 literals and other similar limitations of v6-only with NAT64 (or proxies) already applied.

Up to now I have described a number of complexities related to how end users (generalising) don’t care about IPv6. But ISPs do, or they wouldn’t be deploying DS-Lite either. And so do a number of other “actors” in the world. As Thomas pointed out over Twitter, not having to bother with TCP keepalive for making sure a certain connection is being tracked by a router makes mobile devices faster and use less power, as they don’t have to wake up for no reason. Certain ISPs are also facing problems with the scarcity of IPv4 blocks, particularly as they grow. And of course everybody involved in the industry hates pushing around the technical debt of the legacy IPv4 year after year.

So why are we not there yet? In my opinion and experience, it is because the technical debt, albeit massive, is spread around too much: ISPs, application developers, server/service developers, hosting companies, network operators, etc. Very few of them feel enough pain from v4 being around that they want to push hard for IPv6.

A group of companies that did feel a significant amount of that pain organized the World IPv6 Day. In 2011. That’s six years ago. Why was this even needed? The answer is that there were too many unknowns. Because of the way IPv6 is deployed in dual-stack configurations, and the fact that a lot of systems have to deal with addresses, it seems obvious that there is a need to try things out. And while opt-ins are useful, they clearly didn’t stress test enough of the usage surface of end users. Indeed, I stumbled across one such problem back then: when my hosting provider (which was boasting IPv6 readiness) sent me to their bank infrastructure to make a payment, the IP addresses of the two requests didn’t match, and the payment session failed. Interactions are always hard.

A year after the test day, the “Launch” happened, normalizing the idea that services should be available over IPv6. Even though that was the case, it took quite a while longer for many services to be normally available over IPv6, and I think that Microsoft, despite being one of the biggest proponents and pushers of IPv6, only started providing v6 support by default on its update servers in the last year or so. Things improved significantly over the past five years, and thanks to the forced push of mobile providers such as T-Mobile, it’s a minority of the connections of my mobile phones that still connect to the v4 world, though there are still enough of them that they cannot be ignored.

What are the excuses for those? Once upon a time, the answer was “nobody is using IPv6, so we’re not bothering supporting it”. This is getting less and less valid. You can see the Google IPv6 statistics that show an exponential growth of connections coming from IPv6 addresses. My gut feeling is that the wider acceptance of DS-Lite as a bridge solution is driving that – full disclosure: I work for Google, but I have no visibility into that information, so I’m only guessing this out of personal experience and experience gathered before I joined the company – and it’s making that particular excuse pointless.

Unfortunately, there are still “good” excuses. Or at least reasons that are hard to argue with. Sometimes, you cannot enable IPv6 for your web service, even though you have done all your work, because of dependencies that you do not control, for instance external support services such as the payment system in the OVH example above. Sometimes, the problem is to be found in another piece of infrastructure that your service shares with others and that would need to be adapted, as it may have code expecting a valid IPv4 address at some particular place, and an IPv6 address would make it choke, say in some log analysis pipeline. Or you may rely on hardware for the network layer that just still does not understand IPv6, and you don’t want to upgrade because you still have not found enough of an upside for you to make the switch.

Or you may be using a hosting provider that insists that giving you a single routable IPv6 address is giving you a “/64” (it isn’t — they are giving you a single address in a /64 they control). Any reference to a certain German ISP I had to deal with in the past is not casual at all.

And here is why I think that the debt is currently too spread around. Yes, it is true that mobile phones batteries can be improved thanks to IPv6. But that’s not something your ISP or the random website owner care about – I mean, there are websites so bad that they take megabytes to load a page, that would be even better! – and of course a pure IPv6 without CGNAT is a dream of ISPs all over the world, but it is very unlikely that Steam would care about them.

If we all acted “for the greater good”, we’d all be investing more energy to make sure that v6-only becomes a feasible reality. But in truth, outside of controlled environments, I don’t see that happening any year now, as I said. Controlled environments in this case can refer not only to your own personal network, but to situations like T-Mobile’s mobile data network, or an office’s network — after all, it’s unlikely that an office, outside of Sky’s own, would care whether it can connect to NowTV or Steam. Right now, I feel v6-only networks (without even NAT64) are the realm of backend networks. Because you do not need v4 for connecting between backends you control, such as your database or API provider, and if you push your software images over the same backend network, there is no reason why you would even have to hit the public Internet.

I’m not asking anyone to give a pass to those who are not implementing v6 access now, but as I said when commenting on the FOSDEM network, it is not by bothering the end users that you’ll get better v6 support; it is by asking the services to be reachable.

To finish off, here’s a few personal musings on the topic, that did not quite fit into the discourse of the post:

  • Some ISPs appear to not have as much IPv4 pressure as others; Telecom Italia still appears to not have reassigned or rerouted the /29 network I used to have routed to my home in Mestre. Indeed, whois information for those IPs still has my old physical address as well as an old (now disconnected) phone number.
  • A number of protocols that provide out-of-band signalling, such as RTSP and RTMP, required changes to be used in IPv6 environments. This means that just rebuilding the applications using them against a C library capable of understanding IPv6 would not be enough.
  • I have read at least one Debian developer in the past giving up on running IPv6 on their web server, because their hosting provider was sometimes unreliable and they had no way to make sure the site was actually correctly reachable at all time; this may sound like a minimal problem, but there is a long tail of websites that are not actually hosted on big service providers.
  • Possibility is not feasibility. Things may be possible, but not really feasible. It’s a subtle distinction but an important one.

July 05, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)

This post is part of a series of free ideas that I’m posting on my blog in the hope that someone with more time can implement. It’s effectively a very sketched proposal that comes with no design attached, but if you have time you would like to spend learning something new, but no idea what to do, it may be a good fit for you.

This is clearly not a new idea, as I posted about something very similar over eight years ago. At the time I was looking for a way of encoding audiobooks coming from audio CDs in a format that was compatible with the iPod Classic. Since then, Apple appears to have done their best to make the audiobook experience on iOS the worst possible, to the point that I don’t really use my iPod Touch as my primary audiobook player any more.

As an aside to the free idea, which can probably give a bit more context for you all, let me describe the problems I have with the current approach to audiobooks by Apple. A few iOS major versions ago, they decided to move the audiobooks handling from the Music app to the iBooks app; this would be reasonable, given that they are books, and it was always a bit strange to have them in a separate application, but it also meant you lost the ability to build playlists with them.

Playlists with audiobooks are great, because they allow you to “stitch” multiple books of the same series, so that you can play them for hours on end, for instance if you need them to sleep. I used to have a playlist for the Hitchhikers’ Guide to the Galaxy radio series and one for the books, one for Dresden Files, and one for the News Quiz, including both the collected editions in CD by BBC, my own “audiobooks” built out of the podcasts, and the more recent podcast episodes that I have not collected into audiobook files yet.

So what is the idea? There are two components that, as far as I can see, are currently heavily lacking in the FLOSS world. The first is a way to generate audiobook files, which is what I complained about eight years ago. Indeed, if you look at even a random sample on Project Gutenberg, the audiobook is actually a ton of files (47!), each with a chapter in it. A proper audiobook file would be a single file, with chapter markers, and per-chapter metadata (chapter title, and in that case, the performer).

It’s more than just a matter of having a single file to move around. While of course the hardware improvements made a number of these points moot, the original reason to have a single big file over multiple small files was to avoid having to seek to a different point in the disk in-between chapters. It also allows the decoder to keep going, between chapters, as there is no “end of stream” but rather just a marker that at a given point in time some different metadata applies. Again, as I said this is no longer as relevant as it used to be, but it’s also not entirely gone.
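
To illustrate what such a single file with chapter markers can look like in practice, here is a small Python sketch (my own example of one possible approach, not an existing tool) that writes the chapters in ffmpeg’s FFMETADATA format, which ffmpeg can then mux into a single audio file:

# Sketch: emit an ffmpeg FFMETADATA file with one [CHAPTER] entry per chapter.
# The chapter list below is made up for illustration.
chapters = [
    ("Chapter 1", 1800),  # (title, duration in seconds)
    ("Chapter 2", 2100),
    ("Chapter 3", 1950),
]

def write_ffmetadata(chapters, path="chapters.txt"):
    start_ms = 0
    with open(path, "w") as out:
        out.write(";FFMETADATA1\n")
        for title, duration in chapters:
            end_ms = start_ms + duration * 1000
            out.write("[CHAPTER]\nTIMEBASE=1/1000\n")
            out.write("START={}\nEND={}\ntitle={}\n".format(start_ms, end_ms, title))
            start_ms = end_ms

write_ffmetadata(chapters)
# Something like the following should then mux the markers into a single file:
#   ffmpeg -i book.m4a -i chapters.txt -map_metadata 1 -codec copy book.m4b

Per-chapter metadata such as the performer could presumably be added as extra key=value lines in each [CHAPTER] section.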

The other component that is currently lacking, is a good playback solution. While VLC can obviously play those files right now, and if I’m not mistaken it also extracts the per-chapter metadata correctly, it lacks two features that make enjoying audiobooks possible. The first is possibly complicated, and relates to the ability to store bookmarks and current-playing time. While supposedly VLC supports the feature for resuming from last playback, I have heard it’s still sometimes unreliable (I have no idea how it’s implemented), plus it does not support just bookmarking a given time in a file/book. Bookmarking is particularly important when listening to non-novel audiobooks, as you may want to go back to it afterwards, to re-listen to advice or take a reference to further details.

The other feature is basically UI heavy, and it involves mostly the mobile UI (at least the Android one) and is the ability to scan backward and forward in the file. You have probably seen this in other players including Netflix’s own app, that allow you to scan back 30 seconds — in audiobooks it’s also useful to scan forward 30 seconds, particularly when considering the bookmarks above.

As usual for Free Ideas I have no time to work on this myself. I can give the idea details out, and depending on things I may be able to contribute to a bounty on it, but otherwise, no code I can share about this yet.

July 04, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
More HTTP misbehaviours (July 04, 2017, 18:08 UTC)

Today I have been having some fun: while looking at the backlog on IRCCloud, I found out that it auto-linked Makefile.am, which I promptly decided to register with Gandi — unfortunately I couldn’t get Makefile.in or configure.ac as they are both already registered. After that I decided to set up Google Analytics to report how many referrers arrive to my websites through some of the many vanity domains I registered over time.

The "as needed" return (July 04, 2017, 18:08 UTC)

Okay, some time ago there was an article somewhere talking about the mythical --as-needed flag for GNU ld, which allows linking into a binary only the libraries actually used by that binary. It stirred up a lot of noise and lots of people tried to use it on Gentoo, with mixed results. The timeframe was that of binutils 2.15.9x. Today, I wanted to take another look at that; my own old report on it was not completely clean: we had known issues with GNOME-related packages, and KDE ones weren’t having that much impact without things like --dont-add-needed, which was badly breaking a lot of libraries such as the ones related to gnupg (bad idea to use that flag without caring where it gets enabled… for unieject, you can use it, but many packages do not really behave well with that).

Am I ready to move? (July 04, 2017, 18:08 UTC)

Sometimes I think about my future, especially when it comes to my job. What I have up to now is a bunch of temporary jobs, nice ones most of the time, but temporary. Finding a stable job is another story, and would probably help me in a few ways, although it would disrupt my actual lifestyle, which I don’t really dislike too much. In Italy, I wonder how many possibilities I’d end up having, and especially how many possibilities I’d end up having in the Venice area.

VMware Image Released (July 04, 2017, 18:08 UTC)

So, as I promised, the VMware image, usable with VMware Player and VMware Server, is released to the public. It can be downloaded via torrent at the new torrent tracker that curtis119 set up for the 2006.0 release and the Gentoo/FreeBSD 6.0 release :) Update: as I forgot that we don’t have updated docs yet, I’d better say here what the root password is…. unbelievably “Flameeyes” (no, it’s not my usual password). This is still using the old baselayout and old portage, but it’s a completely set-up box with a compiled kernel and bootloader, directly usable without going through the installation instructions.

July 03, 2017

Description:
mpg123 is a fast console MPEG Audio Player and decoder library.

The complete ASan output of the issue:

# mpg123-mpg123 -t $FILE
==10588==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7f01025c5cbc at pc 0x7f010229bfe3 bp 0x7ffc988ac5b0 sp 0x7ffc988ac5a8
READ of size 4 at 0x7f01025c5cbc thread T0
    #0 0x7f010229bfe2 in III_i_stereo /var/tmp/portage/media-sound/mpg123-1.25.0/work/mpg123-1.25.0/src/libmpg123/layer3.c:1343:10
    #1 0x7f010229bfe2 in INT123_do_layer3 /var/tmp/portage/media-sound/mpg123-1.25.0/work/mpg123-1.25.0/src/libmpg123/layer3.c:2013
    #2 0x7f01021d3708 in decode_the_frame /var/tmp/portage/media-sound/mpg123-1.25.0/work/mpg123-1.25.0/src/libmpg123/libmpg123.c:710:14
    #3 0x7f01021dc61d in mpg123_decode_frame /var/tmp/portage/media-sound/mpg123-1.25.0/work/mpg123-1.25.0/src/libmpg123/libmpg123.c:849:4
    #4 0x535783 in play_frame /var/tmp/portage/media-sound/mpg123-1.25.0/work/mpg123-1.25.0/src/mpg123.c:739:7
    #5 0x53a3a7 in main /var/tmp/portage/media-sound/mpg123-1.25.0/work/mpg123-1.25.0/src/mpg123.c:1363:8
    #6 0x7f0100f1d680 in __libc_start_main /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/csu/../csu/libc-start.c:289
    #7 0x41bec8 in mpg123_seek_frame (/usr/bin/mpg123-mpg123+0x41bec8)

0x7f01025c5cbc is located 4 bytes to the left of global variable 'pow2_1' defined in '/var/tmp/portage/media-sound/mpg123-1.25.0/work/mpg123-1.25.0/src/libmpg123/layer3.c:50:27' (0x7f01025c5cc0) of size 128
0x7f01025c5cbc is located 28 bytes to the right of global variable 'pow1_1' defined in '/var/tmp/portage/media-sound/mpg123-1.25.0/work/mpg123-1.25.0/src/libmpg123/layer3.c:50:13' (0x7f01025c5c20) of size 128
SUMMARY: AddressSanitizer: global-buffer-overflow /var/tmp/portage/media-sound/mpg123-1.25.0/work/mpg123-1.25.0/src/libmpg123/layer3.c:1343:10 in III_i_stereo
Shadow bytes around the buggy address:
  0x0fe0a04b0b40: f9 f9 f9 f9 00 04 f9 f9 f9 f9 f9 f9 00 04 f9 f9
  0x0fe0a04b0b50: f9 f9 f9 f9 00 00 00 00 00 00 00 00 f9 f9 f9 f9
  0x0fe0a04b0b60: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00
  0x0fe0a04b0b70: 00 00 00 00 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0fe0a04b0b80: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0fe0a04b0b90: 00 00 00 00 f9 f9 f9[f9]00 00 00 00 00 00 00 00
  0x0fe0a04b0ba0: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00
  0x0fe0a04b0bb0: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9
  0x0fe0a04b0bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe0a04b0bd0: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe0a04b0be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==10588==ABORTING

Affected version:
1.25.1

Fixed version:
1.25.2 (not released atm)

Commit fix:
https://scm.orgis.org/view/mpg123/trunk/src/libmpg123/layer3.c?view=patch&r1=4275&r2=4274&pathrev=4275

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-11126

Reproducer:
https://github.com/asarubbo/poc/blob/master/00300-mpg123-globaloverflow-III_i_stereo

Timeline:
2017-06-30: bug discovered and reported to upstream
2017-07-03: blog post about the issue
2017-07-10: CVE assigned

Note:
This bug was found with American Fuzzy Lop.

Permalink:

mpg123: global buffer overflow in III_i_stereo (layer3.c)

June 28, 2017

Description:
xar is an easily extensible archive format.

The complete ASan output of the issue:

# xar -t -f $FILE
==5525==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f075cfb35f6 bp 0x7fff705167b0 sp 0x7fff70515f38 T0)
==5525==The signal is caused by a READ memory access.
==5525==Hint: address points to the zero page.
    #0 0x7f075cfb35f5 in strlen /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/string/../sysdeps/x86_64/strlen.S:76
    #1 0x45f5ef in __strdup /tmp/portage/sys-libs/compiler-rt-sanitizers-4.0.0/work/compiler-rt-4.0.0.src/lib/asan/asan_interceptors.cc:562
    #2 0x7f075decebc8 in xar_get_path /var/tmp/portage/app-arch/xar-1.6.1-r1/work/xar-1.6.1/lib/util.c:95:8
    #3 0x523f93 in print_file /var/tmp/portage/app-arch/xar-1.6.1-r1/work/xar-1.6.1/src/xar.c:214:16
    #4 0x513f07 in list /var/tmp/portage/app-arch/xar-1.6.1-r1/work/xar-1.6.1/src/xar.c:1524:4
    #5 0x513f07 in main /var/tmp/portage/app-arch/xar-1.6.1-r1/work/xar-1.6.1/src/xar.c:2666
    #6 0x7f075cf55680 in __libc_start_main /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/csu/../csu/libc-start.c:289
    #7 0x41af38 in _init (/usr/bin/xar+0x41af38)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/string/../sysdeps/x86_64/strlen.S:76 in strlen
==5525==ABORTING

Affected version:
1.6.1

Fixed version:
N/A

Commit fix:
N/A

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-11125

Reproducer:
https://github.com/asarubbo/poc/blob/master/00287-xar-nullptr-xar_get_path

Timeline:
2017-06-17: bug discovered and reported to upstream
2017-06-28: blog post about the issue
2017-07-10: CVE assigned

Note:
This bug was found with American Fuzzy Lop.

Permalink:

xar: NULL pointer dereference in xar_get_path (util.c)

Description:
xar is an easily extensible archive format.

The complete ASan output of the issue:

# xar -t -f $FILE
==7615==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x7f71a859ebd6 bp 0x7fffd8ace150 sp 0x7fffd8acde80 T0)
==7615==The signal is caused by a WRITE memory access.
==7615==Hint: address points to the zero page.
    #0 0x7f71a859ebd5 in xar_unserialize /var/tmp/portage/app-arch/xar-1.6.1-r1/work/xar-1.6.1/lib/archive.c:1767:27
    #1 0x7f71a859ebd5 in xar_open /var/tmp/portage/app-arch/xar-1.6.1-r1/work/xar-1.6.1/lib/archive.c:340
    #2 0x5139ee in list /var/tmp/portage/app-arch/xar-1.6.1-r1/work/xar-1.6.1/src/xar.c:1492:6
    #3 0x5139ee in main /var/tmp/portage/app-arch/xar-1.6.1-r1/work/xar-1.6.1/src/xar.c:2666
    #4 0x7f71a76a2680 in __libc_start_main /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/csu/../csu/libc-start.c:289
    #5 0x41af38 in _init (/usr/bin/xar+0x41af38)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /var/tmp/portage/app-arch/xar-1.6.1-r1/work/xar-1.6.1/lib/archive.c:1767:27 in xar_unserialize
==7615==ABORTING

Affected version:
1.6.1

Fixed version:
N/A

Commit fix:
N/A

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-11124

Reproducer:
https://github.com/asarubbo/poc/blob/master/00288-xar-nullptr-xar_unserialize

Timeline:
2017-06-17: bug discovered and reported to upstream
2017-06-28: blog post about the issue
2017-07-10: CVE assigned

Note:
This bug was found with American Fuzzy Lop.

Permalink:

xar: NULL pointer dereference in xar_unserialize (archive.c)

June 27, 2017
Alice Ferrazzi a.k.a. alicef (homepage, bugs)
Open Source Summit Japan-2017 (June 27, 2017, 13:57 UTC)

Open Source Summit Japan 2017 summary

OSS Japan 2017 was a really great experience.

I sent my paper proposal and waited for a reply; a few weeks later I got an
invitation to participate in the Kernel Keynote.
I thought that participating in the Kernel Keynote as a mentor and giving a presentation
was a good way to talk about the Gentoo Kernel Project and how to contribute to the
Linux kernel and to the Gentoo Kernel Project.
My paper also got accepted, so I could join OSS Japan 2017 as a speaker.
It was three really nice days.

Presentation:

Fast Releasing and Testing of Gentoo Kernel Packages and Future Plans of the Gentoo Kernel Project

My talk was mainly about the past and future of the Gentoo Kernel related projects,
specifically about the Gentoo Kernel Continuous Integration system we are creating:
https://github.com/gentoo/Gentoo_kernelCI

Why it is needed:

  • We need a way to check the linux-patches commits automatically; it can also check pre-commit by pushing to a sandbox branch
  • Check the patch signatures
  • Check the ebuilds committed to https://github.com/gentoo/gentoo/commits/master/sys-kernel
  • Check the kernel eclass commits
  • Check the pull requests to sys-kernel/*
  • Use QEMU to test that the built kernel (vmlinux) boots and runs correctly (see the sketch right after this list)
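As an example of the kind of QEMU smoke test meant in the last point, something along these lines could be used (a rough sketch; the image path and options are illustrative, not the project's actual script):

# boot the freshly built kernel headlessly and watch the serial console;
# even without a root filesystem, getting past early boot shows the image executes
qemu-system-x86_64 \
    -kernel arch/x86/boot/bzImage \
    -append "console=ttyS0 panic=1" \
    -nographic -m 512 -no-reboot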

For any issues or contributions, feel free to send them here:
https://github.com/gentoo/Gentoo_kernelCI

To see Gentoo Kernel CI in action:
http://kernel1.amd64.dev.gentoo.org:8010

slides:
http://schd.ws/hosted_files/ossjapan2017/39/Gentoo%20Kernel%20recent%20and%20Future%20project.pdf

Open Source Summit Japan 2017
Keynote: Linux Kernel Panel - Moderated by Alice Ferrazzi, Gentoo Kernel Project Leader

The keynote was with:
Greg Kroah-Hartman - Fellow, Linux Foundation
Steven Rostedt - VMware
Dan Williams - Intel Open Source Technology Center
Alice Ferrazzi - Gentoo Kernel Project Leader, Gentoo

One interesting part was about how to contribute to the Linux kernel.
After some figures about Linux kernel contributions, the talk moved on to
how to contribute to the Linux kernel.
Contributing to the Linux kernel requires some understanding of C
and of running tests against the kernel,
with tools like fuego, kselftest, coccinelle and many others.
There was also a good talk from Steven Rostedt about working with the Real-Time patch.

Who can find the Gentoo logo in this image?

June 26, 2017
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

With the release of Lab::Measurement 3.550 we've switched to Dist::Zilla as our maintenance tool. If you're not involved in hacking Lab::Measurement, you should essentially not notice this change. However, for the authors of the package, Dist::Zilla makes it much easier to keep track of dependencies, prepare new releases, and eventually also improve and unify the documentation... On the side we've also fixed Issue 4, and Lab::Measurement should now work out of the box with recent Gnuplot on Windows again.

June 25, 2017
Alice Ferrazzi a.k.a. alicef (homepage, bugs)

Google-Summer-of-Code-summary week04 (June 25, 2017, 17:59 UTC)

Google Summer of Code summary week 04

Summary of what I did in week 04:

elivepatch:

  • Created the elivepatch client command line argument parser
  • Added a function for sending patch and configuration files
  • Split the call for sending the files (patch, config) from the call for building the livepatch
  • Made the send_file function more generic so it can send any kind of file over the RESTful API
  • Cleaned up the code following pep8
  • Decided to use only SSL and not to use basic auth
  • The client now sends information about the kernel version when requesting a livepatch build
  • We can now build a livepatch using the RESTful API
  • The server now returns information about the livepatch build status

Kpatch:

  • Working on making kpatch-build work on Gentoo with all its features (right now kpatch-build can only automatically build livepatches for Ubuntu, Debian, Red Hat and Fedora)

Others:

  • Asked infra for a server on which to install the elivepatch server

What I need to do next time:

  • Finish the function for downloading the livepatch to the client
  • Test elivepatch
  • Implement the CVE patch uploader
  • Install elivepatch on the Gentoo server
  • Fix kpatch-build to work automatically with gentoo-sources
  • Add more features to elivepatch

Google-Summer-of-Code-day18 (June 25, 2017, 17:59 UTC)

Google Summer of Code day 18

What was my plan for today?

  • Continue with the code for retrieving the livepatch and installing it

What did I do today?

I checked which directories kpatch-build requires.

kpatch-build find_dirs function:

find_dirs() {
  if [[ -e "$SCRIPTDIR/create-diff-object" ]]; then
      # git repo
      TOOLSDIR="$SCRIPTDIR"
      DATADIR="$(readlink -f $SCRIPTDIR/../kmod)"
  elif [[ -e "$SCRIPTDIR/../libexec/kpatch/create-diff-object" ]]; then
      # installation path
      TOOLSDIR="$(readlink -f $SCRIPTDIR/../libexec/kpatch)"
      DATADIR="$(readlink -f $SCRIPTDIR/../share/kpatch)"
  else
      return 1
  fi
}

$SCRIPTDIR is the directory where kpatch-build lives. Since kpatch-build is installed in /usr/bin/, the relative paths above resolve to directories such as /usr/kmod and /usr/libexec/kpatch, all under /usr.

error "CONFIG_FUNCTION_TRACER, CONFIG_HAVE_FENTRY, CONFIG_MODULES, CONFIG_SYSFS, CONFIG_KALLSYMS_ALL kernel config options are required" Require by kmod/core.c: https://github.com/dynup/kpatch/blob/master/kmod/core/core.c#L62

We probably need some way to check that these settings are enabled in the configuration of the kernel we are going to build, for example as sketched below.
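A minimal sketch of such a check (the .config path and the option list are illustrative, not the actual elivepatch code):

# hypothetical helper: verify the options kpatch needs are enabled in the kernel config
CONFIG=/usr/src/linux/.config
for opt in CONFIG_FUNCTION_TRACER CONFIG_HAVE_FENTRY CONFIG_MODULES CONFIG_SYSFS CONFIG_KALLSYMS_ALL; do
    grep -q "^${opt}=y" "$CONFIG" || echo "missing: $opt"
done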

Updating kpatch-build to work automatically with Gentoo (right now Fedora, for example, can automatically download the kernel RPM and install it; we could do a similar thing with Gentoo): https://github.com/aliceinwire/kpatch/commits/gentoo

Started writing the livepatch downloader: https://github.com/aliceinwire/elivepatch/commit/d26611fb898223f2ea2dcf323078347ca928cbda

Now the elivepatch server can call and build the livepatch with kpatch:

sudo kpatch-build -s /usr/src/linux-4.10.14-gentoo/ -v /usr/src/linux-4.10.14-gentoo//vmlinux -c config 1.patch --skip-gcc-check
ERROR: kpatch build failed. Check /root/.kpatch/build.log for more details.
127.0.0.1 - - [25/Jun/2017 05:27:06] "POST /elivepatch/api/v1.0/build_livepatch HTTP/1.1" 201 -
WARNING: Skipping gcc version matching check (not recommended)
Using source directory at /usr/src/linux-4.10.14-gentoo
Testing patch file
checking file fs/exec.c
Hunk #1 succeeded at 259 (offset 16 lines).
Reading special section data
Building original kernel

Fixed some minor pep8 issues.

What will I do next time?

  • Work on the livepatch downloader and make the kpatch creation more flexible

June 23, 2017
Alice Ferrazzi a.k.a. alicef (homepage, bugs)
Google-Summer-of-Code-day17 (June 23, 2017, 23:12 UTC)

Google Summer of Code day 17

What was my plan for today?

  • Continue with the code for retrieving the livepatch and installing it
  • Ask infra for a server on which to install the elivepatch server

What did I do today?
I sent the request for the server that will offer the elivepatch service, as discussed with my mentor: https://bugs.gentoo.org/show_bug.cgi?id=622476

Fixed some pep8 warnings.

The livepatch server now returns information about the livepatch build status.

Removed basic auth as we will go with SSL.

The client is now sending information about the kernel version when requesting a new build.

The kernel directory under the server is now a livepatch class variable.

What will I do next time?

  • Continue with the code for retrieving the livepatch and installing it

Google-Summer-of-Code-day16 (June 23, 2017, 23:12 UTC)

Google Summer of Code day 16

What was my plan for today?

  • Split the call for sending the files (patch, config) from the call for building the livepatch
  • Make the livepatch call more flexible (right now it is hardcoded)
  • Ask infra for a server on which to install the elivepatch server

What did I do today?

Added a patch file path argument to the elivepatch server API and added the patch call to the elivepatch client.

Added a way to split the work: sending the configuration with one POST call, sending the patch with another POST call, and then starting the livepatch build and getting the result.

Sending the patch now works, and I am working on calling the livepatch build.

Added a docstring to the build patch function.

Cleaned up the GetLive dispatcher function.

Added a call from the client to the server API's livepatch build endpoint.

Made the send_file function more generic so it can send any kind of file.

What will I do next time?

  • Continue with the code for retrieving the livepatch and installing it

June 21, 2017
Alice Ferrazzi a.k.a. alicef (homepage, bugs)
2017-06-21 Google-Summer-of-Code-day14 (June 21, 2017, 17:57 UTC)

Google Summer of Code day 15

What was my plan for today?
Working on sending the configuration file over the RESTful API,
and starting to work on producing the patch .ko file on the server.

What did I do today?
Using werkzeug.datastructures.FileStorage in elivepatch_server,
I could receive the file from the elivepatch_client POST request
over the RESTful API.

# requires: from flask_restful import reqparse; import werkzeug
def post(self):
    parse = reqparse.RequestParser()
    # 'files' makes reqparse look for the upload in the request's multipart form data
    parse.add_argument('file', type=werkzeug.datastructures.FileStorage, location='files')

So now we can get the kernel configuration file, extract it if the filename ends in .gz, and send it to the elivepatch server.
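For manual testing, that kind of multipart upload can also be reproduced with curl against the Flask development server; a rough sketch (the URL and the /config endpoint name are illustrative, not the final elivepatch API):

# hypothetical manual test: POST a gzipped kernel config as multipart form data
curl -F 'file=@config.gz' http://127.0.0.1:5000/elivepatch/api/v1.0/config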

The elivepatch server needs to read the configuration, compare it with the
current kernel configuration and, if they differ, recompile the kernel.
After that we can start building the livepatch with kpatch-build.

This is an example of using kpatch-build:

kpatch-build/kpatch-build -s /usr/src/linux-4.9.16-gentoo/ -v /usr/src/linux-4.9.16-gentoo/vmlinux examples/test.patch --skip-gcc-check
gsoc-2017 kpatch (gentoo) # kpatch-build/kpatch-build --help
usage: kpatch-build [options] <patch file>
            -h, --help         Show this help message
            -r, --sourcerpm    Specify kernel source RPM
            -s, --sourcedir    Specify kernel source directory
            -c, --config       Specify kernel config file
            -v, --vmlinux      Specify original vmlinux
            -t, --target       Specify custom kernel build targets
            -d, --debug        Keep scratch files in /tmp
            --skip-cleanup     Skip post-build cleanup
            --skip-gcc-check   Skip gcc version matching check
                               (not recommended)

This command is called automatically by the elivepatch server after receiving the configuration file.

We also need to send the patch file.

What will I do next time?

  • Split the call for sending the files (patch, config) from the call for building the livepatch
  • Make the livepatch call more flexible (right now it is hardcoded)
  • Ask infra for a server on which to install the elivepatch server

Nathan Zachary a.k.a. nathanzachary (homepage, bugs)
The Book of Henry film review (June 21, 2017, 03:06 UTC)

Notice

Though I make every effort to not include spoilers in my reviews (personal opinions and biases, however, are fair game 🙂 ), some of the linked pages may contain them.

The Book of Henry opening title

This past weekend I went to the theatre to see the newly-released film The Book of Henry (WARNING: the IMDb and Wikipedia pages contain massive spoilers) starring Jaeden Lieberher, Jacob Tremblay, and Naomi Watts. I’ve been anxiously awaiting the release of the film because 1) the trailer piqued my interest, and 2) the cast is filled with some outstanding talent.

Without giving too much away, the basic premise is that Henry (Lieberher) is an incredibly intelligent 11-year-old who essentially plays the “head of household” role for the family. His mother Susan (Watts) has come to rely on him for many adult-oriented tasks, such as managing the finances, setting ground rules, and taking care of his younger brother Peter (Tremblay). When Henry learns that a girl in his class (who is also his neighbour) is being abused by her stepfather, he devises a plan to rescue her and his mother is tasked with helping execute that plan.

Brothers Peter and Henry Carpenter

I went into this film not knowing all that much about it (which is what I prefer: I would rather form my own impressions than start viewing it with preconceived notions and expectations). Based on the opening credits, I anticipated being introduced to a precocious kid that was wise beyond his years. I thought that it may turn into a dramatic representation of his sociocultural interactions at school and how they impacted his outlook on life. Prepared for that type of somewhat slower-paced progression, I was pleasantly surprised to find a stronger emphasis on the family dynamics between Henry, Peter, Susan, and Susan’s coworker and friend, Sheila (Sarah Silverman). As the film continues, we do get to see some of the philosophically advanced aspects of Henry’s worldview, but more importantly we see—through his focus on making the world a better place and his desire for justice—the love that he has for other people.

 

Our legacy isn’t what we write on a résumé or how many commas we have in our bank account. It’s who we’re lucky enough to have in our lives, and what we can leave them with. The one thing we do know: we’re here now, so I say we do the best we can while we’re on this side of the dirt.—Henry Carpenter

Henry and his mother discussing helping others

Is that what sets The Book of Henry apart from other films? No, not at all. There are plenty of examples of films where a child protagonist is longing to make a difference in the world (including another one of Lieberher’s recent roles [and Watts’s, for that matter] in St. Vincent). Instead, what makes this film special is the way in which it is able to elicit the full gamut of human emotion in such a strong, sweeping fashion. For instance, in one scene we’re reminded of the frivolity of childhood by seeing Henry and Peter playing and inventing in their elaborate fort in the woods behind their home. Thereafter, we see some of the strain on Susan as she tries to balance her own needs and wants with doing the best that she can in raising her two boys (as illustrated in a scene where she may have slightly overindulged with her friend Sheila 😉 ). Following that somewhat more serious tone comes a tender moment with Susan putting Henry and Peter to bed by singing a song to them and playing the ukulele. It’s this juxtaposition of emotional evocation that elevates The Book of Henry to levels above others in the genre.

Henry and Peter in their backyard fort

Focusing again on emotionality, there have been very few movies over the years that have actually made me cry (for the curious, Radio Flyer, Life is Beautiful, and The Fountain are a few). I’ve been misty-eyed before, but there were a few scenes in this film that actually brought me to tears. One in particular was so gut-wrenching for me that I thought I might have to briefly excuse myself from the theatre (thankfully, though, I toughened up). Will everyone react in the same way to these types of scenes? No, but that’s one of the beautiful bonds between well-made films and their viewers.

Even with the moments that were extremely difficult to watch due to their forlorn sentiments, there were plenty of convivial parts that provided an overall undertone of warmth throughout the movie. One key example that sticks out in my mind is at the Calvary Elementary School talent show where Peter performs a magic trick. He comes out in full magician garb (reminiscent of one of Tremblay’s first roles in The Magic Ferret [which, by the way, is a cute short film that can be purchased through director Alison Parker’s website]), and wows the audience all the while smiling ear-to-ear. These types of scenes provided the much needed comic relief from the overarching sentimentality.

Jacob Tremblay as Peter the Great

It is likely this roller coaster ride of emotions (or some of the unusual/unresolved details) that led to the overwhelmingly negative Rotten Tomatoes Critics’ Consensus that the movie “deserves a few points for ambition, but its tonal juggling act—and a deeply maudlin twist—may leave viewers gaping in disbelief rather than choking back tears”. Again, that’s the beauty of the world of artistic expression (which, of course, includes film): it can speak to different people in vastly different ways based on what they individually bring to the table. I, for one, thought that it was beautifully done and in no way “maudlin”. Instead of relying solely on the reviews (including this one) of professional critics or viewers, though, I urge you to go see it for yourself and make your own assessment.

8 out of 10 stars:

Cheers,
Zach

June 17, 2017
Sebastian Pipping a.k.a. sping (homepage, bugs)

Expat 2.2.1 has been released. It’s a security release with a variety of security fixes, for instance: An infinite loop denial-of-service fix (that Rhodri James wrote more about), introduction of SipHash against sophisticated hash flooding, use of OS-specific high quality entropy providers like getrandom, integer overflow fixes, and more. We also got better code coverage, moved all but the downloads from SourceForge to GitHub, … but maybe have a look at the detailed change log yourself 🙂

So if you control copies of Expat somewhere, please get them updated.

Let me use the occasion to point out that we are looking for help with a few things around Expat. There are tickets with details up here. If you can help, please get in touch.

Thanks and best

 

Sebastian

lame: two UBSAN crashes (June 17, 2017, 14:52 UTC)

Description:
lame is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.

A few notes before the details of this bug. Some time ago a fuzzing run was done by Brian Carpenter and Jakub Wilk, who posted the results on the Debian bugtracker. In cases like this, when upstream is not active and people do not post on the upstream bugzilla, it is easy to end up with duplicates, so I downloaded all the available testcases, and none of the bugs you will see on my blog is a duplicate of an existing issue. Upstream seems a bit dead, the latest release was in 2011, so this blog post will probably be forwarded to the upstream bugtracker just for the record.

The complete ASan output of the issue:

# lame -f -V 9 $FILE out.wav
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/brhist.c:204:60: runtime error: signed integer overflow: 953447384 + 1908859798 cannot be represented in type 'int'

Reproducer:
https://github.com/asarubbo/poc/blob/master/00298-lame-signintoverflow-brhist.c
CVE:
N/A

#######################

# lame -f -V 9 $FILE out.wav
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:1234:21: runtime error: value -nan is outside the range of representable values of type 'int'

Reproducer:
https://github.com/asarubbo/poc/blob/master/00299-lame-outside-int-get_audio.c
CVE:
N/A

#######################

Affected version:
3.99.5

Fixed version:
N/A

Commit fix:
N/A

Credit:
These bugs were discovered by Agostino Sarubbo of Gentoo.

Timeline:
2017-06-01: bug discovered
2017-06-17: blog post about the issue

Note:
These bugs were found with American Fuzzy Lop.

Permalink:

lame: two UBSAN crashes

lame: multiple left shift (June 17, 2017, 14:52 UTC)

Description:
lame is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.

A few notes before the details of this bug. Some time ago a fuzzing run was done by Brian Carpenter and Jakub Wilk, who posted the results on the Debian bugtracker. In cases like this, when upstream is not active and people do not post on the upstream bugzilla, it is easy to end up with duplicates, so I downloaded all the available testcases, and none of the bugs you will see on my blog is a duplicate of an existing issue. Upstream seems a bit dead, the latest release was in 2011, so this blog post will probably be forwarded to the upstream bugtracker just for the record.

The complete ASan output of the issue:

# lame -f -V 9 $FILE out.wav
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:263:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:265:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:266:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:267:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:268:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:269:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:271:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:272:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:273:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:274:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:276:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:277:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:278:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:279:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/VbrTag.c:280:5: runtime error: left shift of negative value -1
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:845:48: runtime error: left shift of negative value -18
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:848:52: runtime error: left shift of negative value -10

Reproducer:
https://github.com/asarubbo/poc/blob/master/00295-lame-leftshift1
CVE:
N/A

#######################################

# lame -f -V 9 $FILE out.wav
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:848:52: runtime error: left shift of negative value -29398
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/bitstream.c:181:50: runtime error: left shift of 45389699 by 6 places cannot be represented in type 'int'

Reproducer:
https://github.com/asarubbo/poc/blob/master/00296-lame-leftshift2
CVE:
N/A

#######################################

# lame -f -V 9 $FILE out.wav
/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:1195:52: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'

Reproducer:
https://github.com/asarubbo/poc/blob/master/00297-lame-leftshift3
CVE:
N/A

#######################################

Affected version:
3.99.5

Fixed version:
N/A

Commit fix:
N/A

Credit:
These bugs were discovered by Agostino Sarubbo of Gentoo.

Timeline:
2017-06-01: bug discovered
2017-06-17: blog post about the issue

Note:
These bugs were found with American Fuzzy Lop.

Permalink:

lame: multiple left shift

Description:
lame is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.

A few notes before the details of this bug. Some time ago a fuzzing run was done by Brian Carpenter and Jakub Wilk, who posted the results on the Debian bugtracker. In cases like this, when upstream is not active and people do not post on the upstream bugzilla, it is easy to end up with duplicates, so I downloaded all the available testcases, and none of the bugs you will see on my blog is a duplicate of an existing issue. Upstream seems a bit dead, the latest release was in 2011, so this blog post will probably be forwarded to the upstream bugtracker just for the record.

The complete ASan output of the issue:

# lame -f -V 9 $FILE out.wav
==30801==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe82a515a0 at pc 0x7f56d24c9df7 bp 0x7ffe82a4ffb0 sp 0x7ffe82a4ffa8
WRITE of size 4 at 0x7ffe82a515a0 thread T0
    #0 0x7f56d24c9df6 in III_dequantize_sample /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c
    #1 0x7f56d24a664f in decode_layer3_frame /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:1738:17
    #2 0x7f56d24733ca in decodeMP3_clipchoice /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/interface.c:615:13
    #3 0x7f56d2470c13 in decodeMP3 /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/interface.c:696:12
    #4 0x7f56d2431092 in decode1_headersB_clipchoice /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/mpglib_interface.c:149:11
    #5 0x7f56d243694a in hip_decode1_headersB /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/mpglib_interface.c:436:16
    #6 0x7f56d243694a in hip_decode1_headers /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/mpglib_interface.c:379
    #7 0x51e984 in lame_decode_fromfile /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:2089:11
    #8 0x51e984 in read_samples_mp3 /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:877
    #9 0x51e984 in get_audio_common /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:785
    #10 0x51e4fa in get_audio /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:688:16
    #11 0x50f776 in lame_encoder_loop /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:456:17
    #12 0x50f776 in lame_encoder /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:531
    #13 0x50c43f in lame_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:707:15
    #14 0x510793 in c_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:470:15
    #15 0x510793 in main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:438
    #16 0x7f56d1029680 in __libc_start_main /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/csu/../csu/libc-start.c:289
    #17 0x41c998 in _init (/usr/bin/lame+0x41c998)

Address 0x7ffe82a515a0 is located in stack of thread T0 at offset 5024 in frame
    #0 0x7f56d24a548f in decode_layer3_frame /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:1659

  This frame has 4 object(s):
    [32, 344) 'scalefacs'
    [416, 5024) 'hybridIn' 0x1000505422b0: 00 00 00 00[f2]f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2
  0x1000505422c0: f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2
  0x1000505422d0: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000505422e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000505422f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100050542300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==30801==ABORTING

Affected version:
3.99.5

Fixed version:
N/A

Commit fix:
N/A

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-9872

Reproducer:
https://github.com/asarubbo/poc/blob/master/00294-lame-stackoverflow-III_dequantize_sample

Timeline:
2017-06-01: bug discovered
2017-06-17: blog post about the issue
2017-06-25: CVE assigned

Note:
This bug was found with American Fuzzy Lop.

Permalink:

lame: stack-based buffer overflow in III_dequantize_sample (layer3.c)

Description:
lame is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.

A few notes before the details of this bug. Some time ago a fuzzing run was done by Brian Carpenter and Jakub Wilk, who posted the results on the Debian bugtracker. In cases like this, when upstream is not active and people do not post on the upstream bugzilla, it is easy to end up with duplicates, so I downloaded all the available testcases, and none of the bugs you will see on my blog is a duplicate of an existing issue. Upstream seems a bit dead, the latest release was in 2011, so this blog post will probably be forwarded to the upstream bugtracker just for the record.

The complete ASan output of the issue:

# lame -f -V 9 $FILE out.wav
==30141==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd587a7600 at pc 0x7f95cdaf0f34 bp 0x7ffd587a6250 sp 0x7ffd587a6248
WRITE of size 4 at 0x7ffd587a7600 thread T0
    #0 0x7f95cdaf0f33 in III_i_stereo /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:1236:28
    #1 0x7f95cdaf0f33 in decode_layer3_frame /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:1753
    #2 0x7f95cdaaa3ca in decodeMP3_clipchoice /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/interface.c:615:13
    #3 0x7f95cdaa7c13 in decodeMP3 /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/interface.c:696:12
    #4 0x7f95cda68092 in decode1_headersB_clipchoice /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/mpglib_interface.c:149:11
    #5 0x7f95cda6d94a in hip_decode1_headersB /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/mpglib_interface.c:436:16
    #6 0x7f95cda6d94a in hip_decode1_headers /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/mpglib_interface.c:379
    #7 0x51efa6 in lame_decode_fromfile /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:2099:19
    #8 0x51efa6 in read_samples_mp3 /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:877
    #9 0x51efa6 in get_audio_common /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:785
    #10 0x51e4fa in get_audio /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:688:16
    #11 0x50f776 in lame_encoder_loop /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:456:17
    #12 0x50f776 in lame_encoder /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:531
    #13 0x50c43f in lame_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:707:15
    #14 0x510793 in c_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:470:15
    #15 0x510793 in main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:438
    #16 0x7f95cc660680 in __libc_start_main /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/csu/../csu/libc-start.c:289
    #17 0x41c998 in _init (/usr/bin/lame+0x41c998)

Address 0x7ffd587a7600 is located in stack of thread T0 at offset 5024 in frame
    #0 0x7f95cdadc48f in decode_layer3_frame /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:1659

  This frame has 4 object(s):
    [32, 344) 'scalefacs'
    [416, 5024) 'hybridIn' 0x10002b0ecec0:[f2]f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2
  0x10002b0eced0: f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2
  0x10002b0ecee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002b0ecef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002b0ecf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10002b0ecf10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==30141==ABORTING

Affected version:
3.99.5

Fixed version:
N/A

Commit fix:
N/A

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-9871

Reproducer:
https://github.com/asarubbo/poc/blob/master/00293-lame-stackoverflow-III_i_stereo

Timeline:
2017-06-01: bug discovered
2017-06-17: blog post about the issue
2017-06-25: CVE assigned

Note:
This bug was found with American Fuzzy Lop.

Permalink:

lame: stack-based buffer overflow in III_i_stereo (layer3.c)

Description:
lame is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.

A few notes before the details of this bug. Some time ago a fuzzing run was done by Brian Carpenter and Jakub Wilk, who posted the results on the Debian bugtracker. In cases like this, when upstream is not active and people do not post on the upstream bugzilla, it is easy to end up with duplicates, so I downloaded all the available testcases, and none of the bugs you will see on my blog is a duplicate of an existing issue. Upstream seems a bit dead, the latest release was in 2011, so this blog post will probably be forwarded to the upstream bugtracker just for the record.

The complete ASan output of the issue:

# lame -f -V 9 $FILE out.wav
==29263==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60c00000003c at pc 0x7f60ef5a8c12 bp 0x7ffe7420b940 sp 0x7ffe7420b938
READ of size 4 at 0x60c00000003c thread T0
    #0 0x7f60ef5a8c11 in fill_buffer_resample /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/util.c
    #1 0x7f60ef5a8c11 in fill_buffer /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/util.c:677
    #2 0x7f60ef47866b in lame_encode_buffer_sample_t /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/lame.c:1736:9
    #3 0x7f60ef47866b in lame_encode_buffer_template /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/lame.c:1891
    #4 0x7f60ef47e83a in lame_encode_buffer /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/lame.c:1902:12
    #5 0x7f60ef47e83a in lame_encode_flush /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/lame.c:2134
    #6 0x50fa2c in lame_encoder_loop /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:487:16
    #7 0x50fa2c in lame_encoder /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:531
    #8 0x50c43f in lame_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:707:15
    #9 0x510793 in c_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:470:15
    #10 0x510793 in main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:438
    #11 0x7f60ee1c7680 in __libc_start_main /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/csu/../csu/libc-start.c:289
    #12 0x41c998 in _init (/usr/bin/lame+0x41c998)

0x60c00000003c is located 4 bytes to the left of 128-byte region [0x60c000000040,0x60c0000000c0)
allocated by thread T0 here:
    #0 0x4d2540 in calloc /tmp/portage/sys-libs/compiler-rt-sanitizers-4.0.0/work/compiler-rt-4.0.0.src/lib/asan/asan_malloc_linux.cc:74
    #1 0x7f60ef5a3575 in fill_buffer_resample /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/util.c:558:29
    #2 0x7f60ef5a3575 in fill_buffer /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/util.c:677
    #3 0x7f60ef47866b in lame_encode_buffer_sample_t /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/lame.c:1736:9
    #4 0x7f60ef47866b in lame_encode_buffer_template /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/lame.c:1891
    #5 0x7f60ef47e83a in lame_encode_buffer /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/lame.c:1902:12
    #6 0x7f60ef47e83a in lame_encode_flush /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/lame.c:2134
    #7 0x50fa2c in lame_encoder_loop /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:487:16
    #8 0x50fa2c in lame_encoder /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:531
    #9 0x50c43f in lame_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:707:15
    #10 0x510793 in c_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:470:15
    #11 0x510793 in main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:438
    #12 0x7f60ee1c7680 in __libc_start_main /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/csu/../csu/libc-start.c:289

SUMMARY: AddressSanitizer: heap-buffer-overflow /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/util.c in fill_buffer_resample
Shadow bytes around the buggy address:
  0x0c187fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c187fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c187fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c187fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c187fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c187fff8000: fa fa fa fa fa fa fa[fa]00 00 00 00 00 00 00 00
  0x0c187fff8010: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
  0x0c187fff8020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c187fff8030: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c187fff8040: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
  0x0c187fff8050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==29263==ABORTING

Affected version:
3.99.5

Fixed version:
N/A

Commit fix:
N/A

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2015-9101

Reproducer:
https://github.com/asarubbo/poc/blob/master/00292-lame-heapoverflow-fill_buffer_resample

Timeline:
2017-06-01: bug discovered
2017-06-17: blog post about the issue
2017-06-25: CVE assigned

Note:
This bug was found with American Fuzzy Lop.
Mitre decided that this bug can share the same CVE id of https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=777161

Permalink:

lame: heap-based buffer overflow in fill_buffer_resample (util.c)

Description:
lame is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.

A few notes before the details of this bug. Some time ago a fuzzing run was done by Brian Carpenter and Jakub Wilk, who posted the results on the Debian bugtracker. In cases like this, when upstream is not active and people do not post on the upstream bugzilla, it is easy to end up with duplicates, so I downloaded all the available testcases, and none of the bugs you will see on my blog is a duplicate of an existing issue. Upstream seems a bit dead, the latest release was in 2011, so this blog post will probably be forwarded to the upstream bugtracker just for the record.

The complete ASan output of the issue:

# lame -f -V 9 $FILE out.wav
==28403==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7fecc4b7eb6c at pc 0x7fecc489accc bp 0x7fff525972d0 sp 0x7fff525972c8
READ of size 4 at 0x7fecc4b7eb6c thread T0
    #0 0x7fecc489accb in III_i_stereo /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:1149:26
    #1 0x7fecc489accb in decode_layer3_frame /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:1753
    #2 0x7fecc48543ca in decodeMP3_clipchoice /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/interface.c:615:13
    #3 0x7fecc4851c13 in decodeMP3 /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/interface.c:696:12
    #4 0x7fecc4812092 in decode1_headersB_clipchoice /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/mpglib_interface.c:149:11
    #5 0x7fecc481794a in hip_decode1_headersB /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/mpglib_interface.c:436:16
    #6 0x7fecc481794a in hip_decode1_headers /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/libmp3lame/mpglib_interface.c:379
    #7 0x51e984 in lame_decode_fromfile /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:2089:11
    #8 0x51e984 in read_samples_mp3 /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:877
    #9 0x51e984 in get_audio_common /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:785
    #10 0x51e4fa in get_audio /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/get_audio.c:688:16
    #11 0x50f776 in lame_encoder_loop /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:456:17
    #12 0x50f776 in lame_encoder /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:531
    #13 0x50c43f in lame_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/lame_main.c:707:15
    #14 0x510793 in c_main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:470:15
    #15 0x510793 in main /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/frontend/main.c:438
    #16 0x7fecc340a680 in __libc_start_main /tmp/portage/sys-libs/glibc-2.23-r3/work/glibc-2.23/csu/../csu/libc-start.c:289
    #17 0x41c998 in _init (/usr/bin/lame+0x41c998)

0x7fecc4b7eb6c is located 20 bytes to the left of global variable 'pow2_1' defined in '/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:128:28' (0x7fecc4b7eb80) of size 128
0x7fecc4b7eb6c is located 12 bytes to the right of global variable 'pow1_1' defined in '/var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:128:13' (0x7fecc4b7eae0) of size 128
SUMMARY: AddressSanitizer: global-buffer-overflow /var/tmp/portage/media-sound/lame-3.99.5-r1/work/lame-3.99.5/mpglib/layer3.c:1149:26 in III_i_stereo
Shadow bytes around the buggy address:
  0x0ffe18967d10: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9
  0x0ffe18967d20: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x0ffe18967d30: 00 00 00 00 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0ffe18967d40: f9 f9 f9 f9 00 00 00 00 00 00 00 00 f9 f9 f9 f9
  0x0ffe18967d50: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00
=>0x0ffe18967d60: 00 00 00 00 00 00 00 00 00 00 00 00 f9[f9]f9 f9
  0x0ffe18967d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffe18967d80: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffe18967d90: 00 00 00 00 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0ffe18967da0: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00
  0x0ffe18967db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==28403==ABORTING

Affected version:
3.99.5

Fixed version:
N/A

Commit fix:
N/A

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-9870

Reproducer:
https://github.com/asarubbo/poc/blob/master/00291-lame-globaloverflow-III_i_stereo

Timeline:
2017-06-01: bug discovered
2017-06-17: blog post about the issue
2017-06-25: CVE assigned

Note:
This bug was found with American Fuzzy Lop.

Permalink:

lame: global-buffer-overflow in III_i_stereo (layer3.c)

June 16, 2017
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

Recently, Gentoo Linux put GCC 6.3 (released in December 2016) into the testing branch. For a source-based distribution like Gentoo, GCC is a critical part of the toolchain, and upgrading it can sometimes lead to packages failing to compile or run. I recently ran into just this problem with Audacity. The error that I hit was not during compilation, but during runtime:


$ audacity
Fatal Error: Mismatch between the program and library build versions detected.
The library used 3.0 (wchar_t,compiler with C++ ABI 1009,wx containers,compatible with 2.8),
and your program used 3.0 (wchar_t,compiler with C++ ABI 1010,wx containers,compatible with 2.8).
Aborted

The Gentoo Wiki has a nice, detailed page on Upgrading GCC, and it explicitly calls out ABI changes. In this particular case with Audacity, the problematic library is referenced in the error above: “wx containers”. WX containers are handled by the wxGTK package. So, I simply needed to rebuild the currently-installed version of wxGTK to fix this particular problem, as shown below.
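Rebuilding the installed wxGTK is a one-shot emerge away; roughly (adjust the atom to the version/slot actually installed on your system):

# rebuild the currently-installed wxGTK against the new compiler
emerge --ask --oneshot x11-libs/wxGTK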

Hope that helps.

Cheers,
Zach

June 15, 2017
Hanno Böck a.k.a. hanno (homepage, bugs)
Don't leave Coredumps on Web Servers (June 15, 2017, 09:20 UTC)

Coredumps are a feature of Linux and other Unix systems for analyzing crashing software. If a program crashes, for example due to an invalid memory access, the operating system can save the current content of the application's memory to a file. By default it is simply called core.

While this is useful for debugging purposes it can produce a security risk. If a web application crashes, the coredump may simply end up in the web server's root folder. Given that its file name is known, an attacker can simply download it via a URL of the form http://example.org/core. As coredumps contain an application's memory they may expose secret information. A very typical example would be passwords.

PHP used to crash relatively often. Recently a lot of these crash bugs have been fixed, in part because PHP now has a bug bounty program. But there are still situations in which PHP crashes. Some of them likely won't be fixed.

How to disclose?

With a scan of the Alexa Top 1 Million domains for exposed core dumps I found around 1,000 vulnerable hosts. I was faced with a challenge: How can I properly disclose this? It is obvious that I wouldn't write hundreds of manual mails. So I needed an automated way to contact the site owners.

Abusix runs a service where you can query the abuse contacts of IP addresses via a DNS query. This turned out to be very useful for this purpose. One could also imagine contacting domain owners directly, but that's not very practical. The domain whois databases have rate limits and don't always expose contact mail addresses in a machine readable way.
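For reference, the lookup itself is an ordinary DNS TXT query with the IP's octets reversed and appended to Abusix's lookup zone; a sketch for a single address (the zone name here is what Abusix documents for its Abuse Contact DB, check their documentation for the current name):

# abuse contact for 192.0.2.1 (octets reversed, queried as a TXT record)
dig +short TXT 1.2.0.192.abuse-contacts.abusix.zone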

Using the abuse contacts doesn't reach all of the affected host operators. Some abuse contacts were nonexistent mail addresses, others didn't have abuse contacts at all. I also got all kinds of automated replies, some of them asking me to fill out forms or do other things, otherwise my message wouldn't be read. Due to the scale I ignored those. I feel that if people make it hard for me to inform them about security problems that's not my responsibility.

I took away two things that I changed in a second batch of disclosures. Some abuse contacts seem to automatically search for IP addresses in the abuse mails. I originally only included affected URLs. So I changed that to include the affected IPs as well.

In many cases I was informed that the affected hosts are not owned by the company I contacted, but by a customer. Some of them asked me if they're allowed to forward the message to them. I thought that would be obvious, but I made it explicit now. Some of them asked me that I contact their customers, which again, of course, is impractical at scale. And sorry: They are your customers, not mine.

How to fix and prevent it?

If you have a coredump on your web host, the obvious fix is to remove it from there. However you obviously also want to prevent this from happening again.

There are two settings that impact coredump creation: a limits setting, configurable via /etc/security/limits.conf and ulimit, and a sysctl interface that can be found under /proc/sys/kernel/core_pattern.

The limits setting is a size limit for coredumps. If it is set to zero then no core dumps are created. To set this as the default you can add something like this to your limits.conf:

* soft core 0
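
For a quick check or a per-shell override you can also use ulimit directly; note that this only affects processes started from that shell, so the limits.conf entry above is still what you want as a system-wide default:

ulimit -c     # show the current core size limit
ulimit -c 0   # disable coredumps for this shell and its children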

The sysctl interface sets a pattern for the file name and can also contain a path. You can set it to something like this:

/var/log/core/core.%e.%p.%h.%t

This would store all coredumps under /var/log/core/ and add the executable name, process id, host name and timestamp to the filename. The directory needs to be writable by all users; you should use a directory with the sticky bit set (chmod +t).
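
One possible way to create such a directory (world-writable with the sticky bit, i.e. mode 1777, like /tmp) would be:

mkdir -p /var/log/core
chmod 1777 /var/log/core   # world-writable plus sticky bit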

If you set this via the proc file interface it will only be temporary until the next reboot. To set this permanently you can add it to /etc/sysctl.conf:

kernel.core_pattern = /var/log/core/core.%e.%p.%h.%t

Some Linux distributions directly forward core dumps to crash analysis tools. This can be done by prefixing the pattern with a pipe (|). These tools like apport from Ubuntu or abrt from Fedora have also been the source of security vulnerabilities in the past. However that's a separate issue.

Look out for coredumps

My scans showed that this is a relatively common issue. Among popular web pages, around one in a thousand was affected before my disclosure attempts. I recommend that pentesters and developers of security scan tools consider checking for this. It's simple: just try to download the /core file and check whether it looks like an executable. In most cases it will be an ELF file, however sometimes it may be a Mach-O (OS X) or an a.out file (very old Linux and Unix systems).
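
A minimal check along those lines could look like this (the URL is just an example; file recognizes ELF, Mach-O and a.out alike):

# download a potential coredump and see whether it looks like a binary
curl -s -o /tmp/core-check http://example.org/core \
  && file /tmp/core-check | grep -Ei 'ELF|Mach-O|a\.out'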

Image credit: NASA/JPL-Université Paris Diderot

June 13, 2017
Alice Ferrazzi a.k.a. alicef (homepage, bugs)
Moving back to ikiwiki (June 13, 2017, 03:40 UTC)

I was trying to use the official Gentoo blog at blogs.gentoo.org/alicef/,
which is based on WordPress.
As much as I liked the draft feature, it had some big drawbacks.
The biggest one: I couldn't post any syntax-highlighted code.
WordPress maintenance also takes a lot of time, in particular managing plugins.
And because I cannot change the Gentoo blog plugins without admin privileges, it is a bit too much to have to ask every time I have a problem with a plugin or need a new one.

So I decided to come back to ikiwiki.
In ikiwiki I can also make some sort of draft feature, where posts tagged as draft don't show up in the blog list.
This is simply done by using the following pagespec in blog.mdwn:
pages="page(blog/) and !/Discussion and !*/local.css and !tagged(draft)"
That will exclude blog pages tagged with draft from the blog view.

For syntax highlighting I used a Pygments-based plugin made by
tylercipriani.com,
which you can find as pygments.pm.

Pygments.pm output example:

#include <stdio.h>

int main(void) {
    // your code goes here
    return 0;
}

Lastly, for ikiwiki comments I decided to delegate comment spam handling to Disqus. Even though I don't like using a proprietary service that much, it does its job of managing comments well enough.
I also just discovered that Disqus is part of Y Combinator, the company behind Hacker News.

[2017 05] Gentoo activity summary (June 13, 2017, 01:49 UTC)

June 12, 2017
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)

I've just uploaded Lab::Measurement 3.544 to CPAN. This is our first release containing support for Zurich Instruments equipment, in particular the ZI MFLI digital lock-in amplifier, via Simon's Lab::Zhinst package. Enjoy!

June 11, 2017
Sebastian Pipping a.k.a. sping (homepage, bugs)

About rel=noopener (mathiasbynens.github.io)

June 10, 2017
Matthew Thode a.k.a. prometheanfire (homepage, bugs)
Gentoo Puppet Portage Package Provider (June 10, 2017, 05:00 UTC)

Why do this

The previous built-in puppet portage package provider (I'm just going to shorten it to PPPP) only supported very simplistic package interactions, mainly installing and uninstalling by package name (with slot). This has proven fairly limiting: if you wanted to install a specific version of a package and lock it down, you were forced to call out to exec or to edit the package.{mask,unmask,keywords} files.

The new provider (which will be built into puppet in 5.0 or puppet-agent-2.0) supports all the package provider attributes.

How do I get this awesome thing

Emerge puppet or puppet-agent with the experimental use flag.
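
On Gentoo that could look roughly like this (the USE flag name comes from the post; the package.use file name is just one common layout):

echo "app-admin/puppet experimental" >> /etc/portage/package.use/puppet
emerge --ask app-admin/puppet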

What it can do

You can use the following attributes with the new PPPP.

  • name - the full package atom works now, using qatom on the backend.
  • ensure - now allows purging a package as well (CONFIG_PROTECT="-*").
  • install_options - you can now pass options to emerge (--deep or --usepkgonly for example).
  • uninstall_options - just like install_options

Being able to call out specific versions and per-package install options will give much greater flexibility.
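
As a quick smoke test from the command line, something like the following should exercise the provider (a sketch only; the package atom is arbitrary and exact CLI behavior may differ between puppet versions):

# ask puppet to manage a package through the portage provider
puppet resource package app-editors/vim ensure=present provider=portage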

fin

Here is the pull request that upstream puppet merged.

If you have any questions I'm on freenode as prometheanfire.

June 07, 2017
Sven Vermeulen a.k.a. swift (homepage, bugs)
Structuring infrastructural deployments (June 07, 2017, 18:40 UTC)

Many organizations struggle with the ever-increasing number of IP address allocations and the accompanying need for segmentation. In the past, governing the segments within the organization meant keeping close control over the service deployments, firewall rules, etc.

Lately, the idea of micro-segmentation, supported through software-defined networking solutions, seems to do away with the need for segmentation governance. However, I think that is a very short-sighted sales proposition. Even with micro-segmentation, or even pure point-to-point / peer2peer communication flow control, you'll still need a high-level overview of the services within your scope.

In this blog post, I'll give some insights into how we are approaching this in the company I work for. In short, it starts with requirements gathering, creating labels to assign to deployments, creating groups based on one or two labels in a layered approach, and finally fixing the resulting schema and mapping guidance documents (policies) onto the presented architecture.

June 03, 2017
Matthew Thode a.k.a. prometheanfire (homepage, bugs)
Gentoo portage templates (June 03, 2017, 05:00 UTC)

Why do this

Gentoo is known for being somewhat complex to manage, which makes clusters of Gentoo machines even more complex in most scenarios. Using the following methods the configuration becomes easier.

By the end of this you should be able to have a default hiera configuration for Gentoo while still being able to override it for specific use cases. What makes the method I chose particularly powerful is the ability to delete default values entirely, not just set them to something else.

Most of these methods come from my experience with Chef, which I thought would apply well to other config engines. While some don't like shoving logic into the configuration template engine, I'm open to suggestions.

Requirements

  • Puppet 4.x or puppet-agent with hiera support.
  • Puppet's stdlib installed (specifically for delete_values stuff).
  • (optional) use puppetserver instead of running this as a one-off.
  • Hiera configured to use the following configuration.

Hiera config

:merge_behavior: deeper
:deep_merge_options:
  :merge_hash_arrays: true

Basic Setup

  • Convert the common portage configurations to templates.
  • Convert the data in those templates to a datastructure.
  • Use hiera to write the defaults / node overrides.
  • Call the templates via a puppet module.

Datastructures

The easiest way of explaining how this works is to say that the only data ever stored in the 'deepest' value is going to be True or False. The reason for this is that deep_merge in hiera is an additive process, so we need a flag to remove things further down the line.

The datastructure itself is fairly simple; here is an excerpt from my setup.

make_conf:
  emerge_default_opts:
    "--quiet-build": true
    "--changed-use": true

If you wanted to disable --quiet-build further down the line, you would just set the value to False at a higher precedence (the specific node config instead of the general location).

make_conf:
  emerge_default_opts:
    "--quiet-build": false

Configuration Templates

The templates themselves are epp based, not erb (the old method).

package.keywords

For this one I'll also show how I auto-set the right architecture; it works for amd64 at least.

Hiera data

"app-admin/paxtest ~%{facts.architecture}": true

Template

<%- |$packages| -%>
# THIS FILE WAS GENERATED BY PUPPET, CHANGES WILL BE OVERWRITTEN

<%- keys(delete_values($packages, false)).each |$package| { -%>
<%= "$package" %>
<%- } -%>

This one is the simplest: if the value for a key (paxtest in this case) is set to false, it isn't used; the remaining keys are then written out as plain text.

package.use

<%- |$packages| -%>
# THIS FILE WAS GENERATED BY PUPPET, CHANGES WILL BE OVERWRITTEN

<%- keys(delete_values($packages, false)).each |$package| { -%>
  <%- if ! empty(keys(delete_values($packages[$package], false))) { -%>
<%= "$package" %> <%= join(keys(delete_values($packages[$package], false)), ' ') %>
  <%- } -%>
<%- } -%>

This one is fairly straightforward as well: for each package that isn't disabled, if there are keys for the package (signifying USE flags; needed because we remove the unset flags), then set them. This combines the flags set at all levels in hiera.

make.conf

This will be the most complicated one, but it's also likely to be the most important. I'll explain a bit about it after the paste.

<%- |$config| -%>
# THIS FILE WAS GENERATED BY PUPPET, CHANGES WILL BE OVERWRITTEN

CFLAGS="<%= join(keys(delete_values($config['cflags'], false)), ' ') %>"
CXXFLAGS="<%= join(keys(delete_values($config['cxxflags'], false)), ' ') %>"
CHOST="<%= $config['chost'] %>"
MAKEOPTS="<%= join(keys(delete_values($config['makeopts'], false)), ' ') %>"
CPU_FLAGS_X86="<%= join(keys(delete_values($config['cpu_flags_x86'], false)), ' ') %>"
ABI_X86="<%= join(keys(delete_values($config['abi_x86'], false)), ' ') %>"

USE="<%= join(keys(delete_values($config['use'], false)), ' ') %>"

GENTOO_MIRRORS="<%= join(keys(delete_values($config['gentoo_mirrors'], false)), ' ') %>"
<% if has_key($config, 'portage_binhost') { -%>
  <%- if $config['portage_binhost'] != false { -%>
PORTAGE_BINHOST="<%= $config['portage_binhost'] %>"
  <%- } -%>
<% } -%>

FEATURES="<%= join(keys(delete_values($config['features'], false)), ' ') %>"
EMERGE_DEFAULT_OPTS="<%= join(keys(delete_values($config['emerge_default_opts'], false)), ' ') %>"
PKGDIR="<%= $config['pkgdir'] %>"
PORT_LOGDIR="<%= $config['port_logdir'] %>"
PORTAGE_GPG_DIR="<%= $config['portage_gpg_dir'] %>"
PORTAGE_GPG_KEY='<%= $config['portage_gpg_key'] %>'

GRUB_PLATFORMS="<%= join(keys(delete_values($config['grub_platforms'], false)), ' ') %>"
LINGUAS="<%= join(keys(delete_values($config['linguas'], false)), ' ') %>"
L10N="<%= join(keys(delete_values($config['l10n'], false)), ' ') %>"

PORTAGE_ELOG_CLASSES="<%= join(keys(delete_values($config['portage_elog_classes'], false)), ' ') %>"
PORTAGE_ELOG_SYSTEM="<%= join(keys(delete_values($config['portage_elog_system'], false)), ' ') %>"
PORTAGE_ELOG_MAILURI="<%= $config['portage_elog_mailuri'] %>"
PORTAGE_ELOG_MAILFROM="<%= $config['portage_elog_mailfrom'] %>"

<% if has_key($config, 'accept_licence') { -%>
ACCEPT_LICENSE="<%= join(keys(delete_values($config['accept_licence'], false)), ' ') %>"
<%- } -%>
POLICY_TYPES="<%= join(keys(delete_values($config['policy_types'], false)), ' ') %>"
PAX_MARKINGS="<%= join(keys(delete_values($config['pax_markings'], false)), ' ') %>"

USE_PYTHON="<%= join(keys(delete_values($config['use_python'], false)), ' ') %>"
PYTHON_TARGETS="<%= join(keys(delete_values($config['python_targets'], false)), ' ') %>"
RUBY_TARGETS="<%= join(keys(delete_values($config['ruby_targets'], false)), ' ') %>"
PHP_TARGETS="<%= join(keys(delete_values($config['php_targets'], false)), ' ') %>"

<% if has_key($config, 'source') { -%>
source <%= join(keys(delete_values($config['source'], false)), ' ') %>
<%- } -%>

The basic idea is that you pass in the full make.conf datastructure you are going to generate as a single variable; everything else is pulled (or eliminated) from that.

Each option that is selected already has all the values merged, but this can mean that the disabled versions of a given value are still present; these are removed using the delete_values($config['foo'], false) bit.

The puppet module itself

It's fairly easy to call: just make sure the template is in the template location and invoke it as follows.

file { '/etc/portage/make.conf':
  content => epp('portage/portage-make_conf.epp', {'config' => hiera_hash('portage')['make_conf']})
}

fin

If you have any questions I'm on freenode as prometheanfire.

June 02, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Load balancing Hadoop Hive with F5 BIG-IP (June 02, 2017, 12:51 UTC)

In our quest for a highly available HiveServer2, we faced so many problems and such a clear lack of documentation when it came to doing it with F5 BIG-IP load balancers that I think it's worth a blog post to help others out.

We are using the Cloudera Hadoop distribution but this applies whatever your distribution.

Hive HA configuration

This appears to be well documented at first glance, but the HiveServer2 (HS2) documentation had vanished at the time of writing.

Anyway, using Cloudera Manager to set up HS2 HA is not hard but there are a few gotchas that I want to highlight and that you will need to be careful with:

  • As for every Kerberos-based service, make sure you have a dedicated IP for the HiveServer2 Load Balancer URL and that its reverse DNS is set up properly. Otherwise you will get GSSAPI errors.
  • When running a secure cluster with Kerberos, the HiveServer2 Load Balancer URL is to be used as your connection host (obvious) AND in your Kerberos principal connection string (maybe less obvious).

Example beeline connection string before HA:

!connect jdbc:hive2://hive-server:10000/default;principal=hive/_HOST@REALM.COM

and with HA (notice that we also changed the _HOST):

!connect jdbc:hive2://ha-hive-fqdn:10000/default;principal=hive/ha-hive-fqdn@REALM.COM
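
For scripted access, the same HA connection string can be passed straight to beeline on the command line (host and realm are the placeholders from above):

beeline -u "jdbc:hive2://ha-hive-fqdn:10000/default;principal=hive/ha-hive-fqdn@REALM.COM"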

We found out the Kerberos principal gotcha the hard way… The reason behind this is that the _HOST is basically a macro that will get resolved to the client host name, which will then be used to validate the Kerberos ticket. When running in load balanced/HA mode, the actual source IP will be replaced by the load balancer's IP (SNAT) and the Kerberos reverse DNS lookup will then fail!

So if you do not use the HS2 HA URL in the Kerberos principal string, you will get Kerberos GSSAPI errors when the load balancing SNAT is used (see next chapter).

This will require you to update all your jobs using HS2 to reflect these changes before load balancing HS2 with F5 BIG-IP.

Load balancing HiveServer2 with F5

Our pals at Cloudera have put together a good doc for Impala HA with F5, and they instructed us to follow it to set up HS2 HA too because they had nothing better.

Kerberos GSSAPI problem

When we applied it the first time and tried to switch to using the F5, all our jobs failed because of the Kerberos _HOST principal problem mentioned in the previous chapter. This one is not that hard to track down and debug with a Google search, and it is explained on the Cloudera community forums.

We then migrated (again) all our jobs to update the principal connection strings before migrating again to the F5 load balancers.

Connection Reset problems

After our next migration to the F5 load balancers, most of our jobs were running well and the Kerberos problems had vanished, but we faced a new problem: some of our jobs failed with Connection Reset errors on HiveServer2:

java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset

After some debugging and traffic analysis we found out that the F5 was actually responsible for those connection resets, but we struggled to understand why.

It turned out that the Protocol Profile set up on the Virtual Server was the root cause of the problem, specifically its idle timeout setting, which defaults to 300s.

Note the Reset on Timeout setting as well, which is responsible for the actual Reset packet sent by the F5 to the client.

This could also be proven by the Virtual Server statistics showing an increasing Connection Expires count.

The solution is to create a new Protocol Profile based on fastL4 with a higher Idle Timeout setting and update our Virtual Server to use this profile instead of the default one.

It seemed sensible in our case to increase the 5 minute expiration to 1 day, so let's call our new profile fastL4-24h-idle-timeout.

Then change the Hive Virtual Server configuration to use this Protocol Profile.

You will see no more expired connections on the Virtual Server statistics!

Job design consideration

We could argue whether or not a default 5 minute idle timeout is reasonable for Hive or any other Hadoop component, but it is important to point out that the jobs which were affected also had a sub-optimal design pattern in the first place. This also explains why most of our jobs (including long running ones) were not affected.

The affected jobs were Talend jobs where the Hive connection was established at the beginning of the job, used at that point, and then the job went on doing other things before using the Hive connection again.

When those in-between computations took more than 300s, the remainder of the job failed because the initial Hive connection had been reset by the F5.

This is clearly not a good job design for long processing jobs and you should refrain from doing it. Instead, open a connection to Hive when you need it, use it and close it properly. Should you need it later in your job, open a new connection to Hive and use that one.

This will also have the benefit of not keeping idle connections open to Hive itself and will favour resource allocation fairness across your jobs.

I hope this will be of help to anyone facing these kinds of issues.

May 19, 2017
Hanno Böck a.k.a. hanno (homepage, bugs)

Today the OCSP servers from Let’s Encrypt were offline for a while. This has caused far more trouble than it should have, because in theory we have all the technologies available to handle such an incident. However, due to failures in how they are implemented, they don’t really work.

We have to understand some background. Encrypted connections using the TLS protocol like HTTPS use certificates. These are essentially cryptographic public keys together with a signed statement from a certificate authority that they belong to a certain host name.

CRL and OCSP – two technologies that don’t work

Certificates can be revoked. That means that for some reason the certificate should no longer be used. A typical scenario is when a certificate owner learns that his servers have been hacked and his private keys stolen. In this case it’s good to avoid that the stolen keys and their corresponding certificates can still be used. Therefore a TLS client like a browser should check that a certificate provided by a server is not revoked.

That’s the theory at least. However the history of certificate revocation is a history of two technologies that don’t really work.

One method is certificate revocation lists (CRLs). It’s quite simple: a certificate authority provides a list of certificates that are revoked. This has an obvious limitation: these lists can grow large. Given that a revocation check needs to happen during a connection, downloading and processing a large list is not workable in any realistic scenario.

The second method is called OCSP (Online Certificate Status Protocol). Here a client can query a server about the status of a single certificate and will get a signed answer. This avoids the size problem of CRLs, but it still has a number of problems. Given that connections should be fast it’s quite a high cost for a client to make a connection to an OCSP server during each handshake. It’s also concerning for privacy, as it gives the operator of an OCSP server a lot of information.

However there’s a more severe problem: What happens if an OCSP server is not available? From a security point of view one could say that a certificate that can’t be OCSP-checked should be considered invalid. However OCSP servers are far too unreliable. So practically all clients implement OCSP in soft fail mode (or not at all). Soft fail means that if the OCSP server is not available the certificate is considered valid.

That makes the whole OCSP concept pointless: If an attacker tries to abuse a stolen, revoked certificate he can just block the connection to the OCSP server – and thus a client can’t learn that it’s revoked. Due to this inherent security failure Chrome decided to disable OCSP checking altogether. As a workaround they have something called CRLsets and Mozilla has something similar called OneCRL, which is essentially a big revocation list for important revocations managed by the browser vendor. However this is a weak workaround that doesn’t cover most certificates.

OCSP Stapling and Must Staple to the rescue?

There are two technologies that could fix this: OCSP Stapling and Must-Staple.

OCSP Stapling moves the querying of the OCSP server from the client to the server. The server gets OCSP replies and then sends them within the TLS handshake. This has several advantages: It avoids the latency and privacy implications of OCSP. It also allows surviving short downtimes of OCSP servers, because a TLS server can cache OCSP replies (they’re usually valid for several days).

However it still does not solve the security issue: if an attacker has a stolen, revoked certificate it can be used without Stapling. The browser won’t know about it and will query the OCSP server; this request can again be blocked by the attacker, and the browser will accept the certificate.

Therefore an extension for certificates has been introduced that allows us to require Stapling. It’s usually called OCSP Must-Staple and is defined in RFC 7633 (https://tools.ietf.org/html/rfc7633), although the RFC doesn’t mention the name Must-Staple, which can cause some confusion. If a browser sees a certificate with this extension that is used without OCSP Stapling it shouldn’t accept it.

So we should be fine. With OCSP Stapling we can avoid the latency and privacy issues of OCSP and we can avoid failing when OCSP servers have short downtimes. With OCSP Must-Staple we fix the security problems. No more soft fail. All good, right?

The OCSP Stapling implementations of Apache and Nginx are broken

Well, here come the implementations. While a lot of protocols use TLS, the most common use case is the web and HTTPS. According to Netcraft statistics, by far the biggest share of active sites on the Internet runs on Apache (about 46%), followed by Nginx (about 20%). It’s reasonable to say that if these technologies should provide a solution for revocation they should be usable with the major products in that area. On the server side this only concerns OCSP Stapling, as OCSP Must-Staple only needs to be checked by the client.

What would you expect from a working OCSP Stapling implementation? It should try to avoid a situation where it’s unable to send out a valid OCSP response. Thus roughly what it should do is to fetch a valid OCSP response as soon as possible and cache it until it gets a new one or it expires. It should furthermore try to fetch a new OCSP response long before the old one expires (ideally several days). And it should never throw away a valid response unless it has a newer one. Google developer Ryan Sleevi wrote a detailed description of what a proper OCSP Stapling implementation could look like.

Apache does none of this.

If Apache tries to renew the OCSP response and gets an error from the OCSP server – e.g. because it’s currently malfunctioning – it will throw away the existing, still-valid OCSP response and replace it with the error. It will then send out stapled OCSP errors. Which makes zero sense. Firefox will show an error if it sees this. This was reported in 2014 and is still unfixed.

Now there’s an option in Apache to avoid this behavior: SSLStaplingReturnResponderErrors. It’s defaulting to on. If you switch it off you won’t get sane behavior (that is – use the still valid, cached response), instead Apache will disable Stapling for the time it gets errors from the OCSP server. That’s better than sending out errors, but it obviously makes using Must Staple a no go.

It gets even crazier. I have set this option, but this morning I still got complaints that Firefox users were seeing errors. That’s because in this case the OCSP server wasn’t sending out errors, it was completely unavailable. For that situation Apache has a feature that will fake a tryLater error to send out to the client. If you’re wondering how that makes any sense: It doesn’t. The “tryLater” error of OCSP isn’t useful at all in TLS, because you can’t try later during a handshake which only lasts seconds.

This is controlled by another option: SSLStaplingFakeTryLater. However if we read the documentation it says “Only effective if SSLStaplingReturnResponderErrors is also enabled.” So if we disabled SSLStaplingReturnResponderErrors this shouldn’t matter, right? Well: the documentation is wrong.

There are more problems: Apache doesn’t get the OCSP responses on startup, it only fetches them during the handshake. This causes extra latency on the first connection and increases the risk of hitting a situation where you don’t have a valid OCSP response. Also, cached OCSP responses don’t survive server restarts; they’re only kept in an in-memory cache.

There’s currently no way to configure Apache to handle OCSP stapling in a reasonable way. Here’s the configuration I use, which will at least make sure that it won’t send out errors and cache the responses a bit longer than it does by default:

SSLStaplingCache shmcb:/var/tmp/ocsp-stapling-cache/cache(128000000)
SSLUseStapling on
SSLStaplingResponderTimeout 2
SSLStaplingReturnResponderErrors off
SSLStaplingFakeTryLater off
SSLStaplingStandardCacheTimeout 86400
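
To check whether your server actually staples a response after a change like this, you can request the certificate status during a handshake (the hostname is a placeholder):

echo | openssl s_client -connect example.org:443 -servername example.org -status 2>/dev/null | grep -i -A 2 'ocsp response'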


I’m less familiar with Nginx, but from what I hear it isn’t much better either. According to https://blog.crashed.org/nginx-stapling-busted/ this blogpost it doesn’t fetch OCSP responses on startup and will send out the first TLS connections without stapling even if it’s enabled. Here’s a blog post that recommends to work around this by connecting to all configured hosts after the server has started.

To summarize: This is all a big mess. Both Apache and Nginx have OCSP Stapling implementations that are essentially broken. As long as you’re using either of those, enabling Must-Staple is a reliable way to shoot yourself in the foot and get into trouble. Don’t enable it if you plan to use Apache or Nginx.

Certificate revocation is broken. It has been broken since the invention of SSL and it’s still broken. OCSP Stapling and OCSP Must-Staple could fix it in theory. But that would require working and stable implementations in the most widely used server products.

May 18, 2017
Sven Vermeulen a.k.a. swift (homepage, bugs)
Matching MD5 SSH fingerprint (May 18, 2017, 16:20 UTC)

Today I was attempting to update a local repository, when SSH complained about a changed fingerprint, something like the following:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:p4ZGs+YjsBAw26tn2a+HPkga1dPWWAWX+NEm4Cv4I9s.
Please contact your system administrator.
Add correct host key in /home/user/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/user/.ssh/known_hosts:9
ECDSA host key for 192.168.56.101 has changed and you have requested strict checking.
Host key verification failed.
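
One way to compare this SHA256 fingerprint against an older MD5-style one is to ask the tools for the MD5 form explicitly (a sketch assuming OpenSSH 6.8 or later; the host key path is the usual ECDSA default):

# on the server: print the host key fingerprint in MD5 form
ssh-keygen -l -E md5 -f /etc/ssh/ssh_host_ecdsa_key.pub
# or from the client: show MD5 fingerprints while connecting
ssh -o FingerprintHash=md5 192.168.56.101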

May 16, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)

In my previous blog post, I demonstrated how to use the PIV feature of a Yubikey to add a 2nd factor authentication to SSH.

Careful readers such as Grzegorz Kulewski pointed out that using the GPG capability of the Yubikey was also a great, more versatile and more secure option on the table (I love those community insights):

  • GPG keys and subkeys are indeed more flexible and can be used for case-specific operations (signing, encryption, authentication)
  • GPG is more widely used and one could use their Yubikey smartcard for SSH, VPN, HTTP auth and code signing
  • The Yubikey 4 GPG feature supports 4096 bit keys (limited to 2048 for PIV)

While I initially looked at the GPG feature, its apparent complexity got me to discard it for my direct use case (SSH). But I couldn’t resist Grzegorz’s good points, so here I am, back to testing it. Thank you again Grzegorz for the excuse you provided 😉

So let’s get through with the GPG feature of the Yubikey to authenticate our SSH connections. Just like the PIV method, this one has the  advantage to allow a 2nd factor authentication while using the public key authentication mechanism of OpenSSH and thus does not need any kind of setup on the servers.

Method 3 – SSH using Yubikey and GPG

Acknowledgement

The first choice you have to make is to decide whether you allow your master key to be stored on the Yubikey or not. This choice will be guided by how you plan to use and industrialize your usage of GPG-based SSH authentication.

Consider this to choose whether to store the master key on the Yubikey or not:

  • (con) it will not allow the usage of the same GPG key on multiple Yubikeys
  • (con) if you lose your Yubikey, you will have to revoke your entire GPG key and start from scratch (since the secret key is stored on the Yubikey)
  • (pro) by storing everything on the Yubikey, you won’t necessarily have to keep an offline copy of your master key (and all the process that comes with it)
  • (pro) it is easier to generate and store everything on the key, which makes it a good starting point for newcomers or occasional GPG users

Because I want to demonstrate and promote the most straightforward way of using it, I will base this article on generating and storing everything on a Yubikey 4. You can find useful links at the end of the article pointing to references on how to do it differently.

Tools installation

For this to work, we will need some tools on our local machine to setup our Yubikey correctly.

Gentoo users should install those packages:

emerge -av dev-libs/opensc sys-auth/ykpers app-crypt/ccid sys-apps/pcsc-tools app-crypt/gnupg

Gentoo users should also allow the pcscd service to be hotplugged (started automatically upon key insertion) by modifying their /etc/rc.conf and having:

rc_hotplug="pcscd"
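
If the hotplug route doesn’t kick in right away, you can also start pcscd by hand and check that the reader stack sees the Yubikey (pcsc_scan ships with the sys-apps/pcsc-tools package installed above):

rc-service pcscd start   # or rely on the hotplug setup above
pcsc_scan                # should list the Yubikey as a reader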

Yubikey setup

The idea behind the Yubikey setup is to generate and store the GPG keys directly on our Yubikey and to secure them via a PIN code (and an admin PIN code).

  • default PIN code: 123456
  • default admin PIN code: 12345678

First, insert your Yubikey and let’s change its USB operating mode to OTP+U2F+CCID with the MODE_FLAG_EJECT flag.

ykpersonalize -m86
Firmware version 4.3.4 Touch level 783 Program sequence 3

The USB mode will be set to: 0x86

Commit? (y/n) [n]: y

NOTE: if you have an older version of Yubikey (before Sept. 2014), use -m82 instead.

Then, we can generate a new GPG key on the Yubikey. Let’s open the smartcard for editing.

gpg --card-edit --expert

Reader ...........: Yubico Yubikey 4 OTP U2F CCID (0005435106) 00 00
Application ID ...: A7560001240102010006054351060000
Version ..........: 2.1
Manufacturer .....: Yubico
Serial number ....: 75435106
Name of cardholder: [not set]
Language prefs ...: [not set]
Sex ..............: unspecified
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: rsa2048 rsa2048 rsa2048
Max. PIN lengths .: 127 127 127
PIN retry counter : 3 0 3
Signature counter : 0
Signature key ....: [none]
Encryption key....: [none]
Authentication key: [none]
General key info..: [none]

Then switch to admin mode.

gpg/card> admin
Admin commands are allowed

We can start generating the Signature, Encryption and Authentication keys on the Yubikey. During the process, you will be prompted alternately for the admin PIN and the PIN.

gpg/card> generate 
Make off-card backup of encryption key? (Y/n) 

Please note that the factory settings of the PINs are
   PIN = '123456'     Admin PIN = '12345678'
You should change them using the command --change-pin

I advise you to say Yes to the off-card backup of the encryption key.

Yubikey 4 users can choose a 4096 bit key; let’s go for it for every key type.

What keysize do you want for the Signature key? (2048) 4096
The card will now be re-configured to generate a key of 4096 bits
Note: There is no guarantee that the card supports the requested size.
      If the key generation does not succeed, please check the
      documentation of your card to see what sizes are allowed.
What keysize do you want for the Encryption key? (2048) 4096
The card will now be re-configured to generate a key of 4096 bits
What keysize do you want for the Authentication key? (2048) 4096
The card will now be re-configured to generate a key of 4096 bits

Then you’re asked for the expiration of your key. I choose 1 year but it’s up to you (leave 0 for no expiration).

Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 1y
Key expires at mer. 15 mai 2018 21:42:42 CEST
Is this correct? (y/N) y

Finally you give GnuPG details about your user ID and you will be prompted for a passphrase (make it strong).

GnuPG needs to construct a user ID to identify your key.

Real name: Ultrabug
Email address: ultrabug@nospam.com
Comment: 
You selected this USER-ID:
    "Ultrabug <ultrabug@nospam.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

If you chose to make an off-card backup of your key, you will also be notified of its location, as well as that of the revocation certificate.

gpg: Note: backup of card key saved to '/home/ultrabug/.gnupg/sk_8E407636C9C32C38.gpg'
gpg: key 22A73AED8E766F01 marked as ultimately trusted
gpg: revocation certificate stored as '/home/ultrabug/.gnupg/openpgp-revocs.d/A1580FD98C0486D94C1BE63B22A73AED8E766F01.rev'
public and secret key created and signed.

Make sure to store that backup in a secure and offline location.
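
Assuming an (ideally encrypted) USB stick mounted at /mnt/usb (the mount point is a placeholder), copying both the key backup and the revocation certificate printed above is enough:

cp ~/.gnupg/sk_8E407636C9C32C38.gpg /mnt/usb/
cp ~/.gnupg/openpgp-revocs.d/A1580FD98C0486D94C1BE63B22A73AED8E766F01.rev /mnt/usb/
sync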

You can verify that everything went well and take this chance to note the public key ID.

gpg/card> verify

Reader ...........: Yubico Yubikey 4 OTP U2F CCID (0001435106) 00 00
Application ID ...: A7560001240102010006054351060000
Version ..........: 2.1
Manufacturer .....: Yubico
Serial number ....: 75435106
Name of cardholder: [not set]
Language prefs ...: [not set]
Sex ..............: unspecified
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: rsa4096 rsa4096 rsa4096
Max. PIN lengths .: 127 127 127
PIN retry counter : 3 0 3
Signature counter : 4
Signature key ....: A158 0FD9 8C04 86D9 4C1B E63B 22A7 3AED 8E76 6F01
 created ....: 2017-05-16 20:43:17
Encryption key....: E1B6 7009 907D 1D94 B200 37D7 8E40 7636 C9C3 2C38
 created ....: 2017-05-16 20:43:17
Authentication key: AAED AB8E E055 41B2 EFFF 62A4 164F 873A 75D2 AD6B
 created ....: 2017-05-16 20:43:17
General key info..: pub rsa4096/22A73AED8E766F01 2017-05-16 Ultrabug <ultrabug@nospam.com>
sec> rsa4096/22A73AED8E766F01 created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
ssb> rsa4096/164F873A75D2AD6B created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
ssb> rsa4096/8E407636C9C32C38 created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106

You’ll find the public key ID on the “General key info” line (22A73AED8E766F01):

General key info..: pub rsa4096/22A73AED8E766F01 2017-05-16 Ultrabug <ultrabug@nospam.com>

Quit the card editing session.

gpg/card> quit

It is then convenient to upload your public key to a key server, whether a public one or your own web server (you can also keep it to be used and imported directly from a USB stick).

Export the public key:

gpg --armor --export 22A73AED8E766F01 > 22A73AED8E766F01.asc

Then upload it to your http server or a public server (needed if you want to be able to easily use the key on multiple machines):

# Upload it to your http server
scp 22A73AED8E766F01.asc user@server:public_html/static/22A73AED8E766F01.asc

# OR upload it to a public keyserver
gpg --keyserver hkps://hkps.pool.sks-keyservers.net --send-key 22A73AED8E766F01

Now we can finish up the Yubikey setup. Let’s edit the card again:

gpg --card-edit --expert

Reader ...........: Yubico Yubikey 4 OTP U2F CCID (0001435106) 00 00
Application ID ...: A7560001240102010006054351060000
Version ..........: 2.1
Manufacturer .....: Yubico
Serial number ....: 75435106
Name of cardholder: [not set]
Language prefs ...: [not set]
Sex ..............: unspecified
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: rsa4096 rsa4096 rsa4096
Max. PIN lengths .: 127 127 127
PIN retry counter : 3 0 3
Signature counter : 4
Signature key ....: A158 0FD9 8C04 86D9 4C1B E63B 22A7 3AED 8E76 6F01
 created ....: 2017-05-16 20:43:17
Encryption key....: E1B6 7009 907D 1D94 B200 37D7 8E40 7636 C9C3 2C38
 created ....: 2017-05-16 20:43:17
Authentication key: AAED AB8E E055 41B2 EFFF 62A4 164F 873A 75D2 AD6B
 created ....: 2017-05-16 20:43:17
General key info..: pub rsa4096/22A73AED8E766F01 2017-05-16 Ultrabug <ultrabug@nospam.com>
sec> rsa4096/22A73AED8E766F01 created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
ssb> rsa4096/164F873A75D2AD6B created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
ssb> rsa4096/8E407636C9C32C38 created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
gpg/card> admin

Make sure that the Signature PIN is forced, so that your PIN is requested whenever your key is used. If it is listed as “not forced”, you can enforce it by entering the following command:

gpg/card> forcesig

It is also good practice to set a few more settings on your key.

gpg/card> login
Login data (account name): ultrabug

gpg/card> lang
Language preferences: en

gpg/card> name 
Cardholder's surname: Bug
Cardholder's given name: Ultra

Now we need to setup the PIN and admin PIN on the card.

gpg/card> passwd 
gpg: OpenPGP card no. A7560001240102010006054351060000 detected

1 - change PIN
2 - unblock PIN
3 - change Admin PIN
4 - set the Reset Code
Q - quit

Your selection? 1
PIN changed.

1 - change PIN
2 - unblock PIN
3 - change Admin PIN
4 - set the Reset Code
Q - quit

Your selection? 3
PIN changed.

1 - change PIN
2 - unblock PIN
3 - change Admin PIN
4 - set the Reset Code
Q - quit

Your selection? Q

If you uploaded your public key on your web server or a public server, configure it on the key:

gpg/card> url
URL to retrieve public key: http://ultrabug.fr/keyserver/22A73AED8E766F01.asc

gpg/card> quit

Now we can quit the gpg card editing; we’re done on the Yubikey side!

gpg/card> quit

SSH client setup

This is the setup on the machine(s) where you will be using the GPG key. The idea is to import your key from the card to your local keyring so you can use it with gpg-agent (and its ssh support).

You can skip the fetch/import part below if you generated the key on the same machine as the one you are using it on. You should see it listed when executing gpg -k.

Plug-in your Yubikey and load the smartcard.

gpg --card-edit --expert

Then fetch the key from the URL to import it to your local keyring.

gpg/card> fetch

Then you’re done on this part, exit gpg and update/display& your card status.

gpg/card> quit

gpg --card-status

You can verify the presence of the key in your keyring:

gpg -K
sec>  rsa4096 2017-05-16 [SC] [expires: 2018-05-16]
      A1580FD98C0486D94C1BE63B22A73AED8E766F01
      Card serial no. = 0001 05435106
uid           [ultimate] Ultrabug <ultrabug@nospam.com>
ssb>  rsa4096 2017-05-16 [A] [expires: 2018-05-16]
ssb>  rsa4096 2017-05-16 [E] [expires: 2018-05-16]

Note the “Card serial no.” showing that the key is actually stored on a smartcard.

Now we need to configure gpg-agent to enable ssh support; edit your ~/.gnupg/gpg-agent.conf configuration file and make sure that the enable-ssh-support option is present:

default-cache-ttl 7200
max-cache-ttl 86400
enable-ssh-support

Then you will need to update your ~/.bashrc file to automatically start gpg-agent and override ssh-agent’s environment variables. Add this at the end of your ~/.bashrc file (or equivalent).

# start gpg-agent if it's not running
# then override SSH authentication socket to use gpg-agent
pgrep -l gpg-agent &>/dev/null
if [[ "$?" != "0" ]]; then
 gpg-agent --daemon &>/dev/null
fi
SSH_AUTH_SOCK=/run/user/$(id -u)/gnupg/S.gpg-agent.ssh
export SSH_AUTH_SOCK

To simulate a clean slate, unplug your card then kill any running gpg-agent:

killall gpg-agent

Then plug your card back in and source your ~/.bashrc file:

source ~/.bashrc

Your GPG key is now listed in your ssh identities!

ssh-add -l
4096 SHA256:a4vsJM6Sw1Rt8orvPnI8nvNUwHbRQ67ylnoTxruozK9 cardno:000105435106 (RSA)

You will now be able to get the SSH public key to copy to your remote servers using:

ssh-add -L
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCVDq24Ld/bOzc3yNnY6fF7FNfZnb6wRVdN2xMo1YiA5pz20y+2P1ynq0rb6l/wsSip0Owq4G6fzaJtT1pBUAkEJbuMvZBrbYokb2mZ78iFZyzAkCT+C9YQtvPClFxSnqSL38jBpunZuuiFfejM842dWdMNK3BTcjeTQdTbmY+VsVOw7ppepRh7BWslb52cVpg+bHhB/4H0BUUd/KHZ5sor0z6e1OkqIV8UTiLY2laCCL8iWepZcE6n7MH9KItqzX2I9HVIuLCzhIib35IRp6d3Whhw3hXlit4/irqkIw0g7s9jC8OwybEqXiaeQbmosLromY3X6H8++uLnk7eg9RtCwcWfDq0Qg2uNTEarVGgQ1HXFr8WsjWBIneC8iG09idnwZNbfV3ocY5+l1REZZDobo2BbhSZiS7hKRxzoSiwRvlWh9GuIv8RNCDss9yNFxNUiEIi7lyskSgQl3J8znDXHfNiOAA2X5kVH0s6AQx4hQP9Dl1X2Em4zOz+yJEPFnAvE+XvBys1yuUPq1c3WKMWzongZi8JNu51Yfj7Trm74hoFRn+CREUNpELD9JignxlvkoKAJpWVLdEu1bxJ7jh7kcMQfVEflLbfkEPLV4nZS4sC1FJR88DZwQvOudyS69wLrF3azC1Gc/fTgBiXVVQwuAXE7vozZk+K4hdrGq4u7Gw== cardno:000105435106

This is what ends up in ~/.ssh/authorized_keys on your servers.
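
One way to get it there (user@server is a placeholder) is to pipe the agent’s public key straight into the remote authorized_keys file:

ssh-add -L | ssh user@server 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'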

When connecting to your remote server, you will be prompted for the PIN!

Conclusion

Using the GPG feature of your Yubikey is very convenient and versatile. Even if it is not that hard after all, it is interesting and fair to note that the PIV method is indeed simpler to implement.

When you need to maintain a large number of security keys in an organization and their usage is limited to SSH, you will be inclined to stick with PIV if 2048 bit keys are acceptable for you.

However, for power users and developers, usage of GPG is definitely something you need to consider for its versatility and enhanced security.

Useful links

You may find those articles useful to setup your GPG key differently and avoid having the secret key tied to your Yubikey.

Sebastian Pipping a.k.a. sping (homepage, bugs)

Hi!

When I started fetchcommandwrapper about 6 years ago it was a proof of concept: it plugged into portage, replacing wget for downloads, and leveraged ${GENTOO_MIRRORS} and aria2 to both download faster and distribute load across mirrors. A hack for sure, but with some potential.

Back then public interest was non-existent, fetchcommandwrapper had some issues — e.g. metadata.xsd downloads failed and some sites rejected downloads before it made aria2 dress like wget — and I eventually stopped using it myself.

With the latest bug reports, bugfixes and release of version 0.8 in Gentoo, fetchcommandwrapper is ready for general use now. To give it a shot, you emerge app-portage/fetchcommandwrapper and append source /usr/share/fetchcommandwrapper/make.conf to /etc/portage/make.conf. Done.
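
In shell terms that boils down to:

emerge --ask app-portage/fetchcommandwrapper
echo 'source /usr/share/fetchcommandwrapper/make.conf' >> /etc/portage/make.conf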

If you have extra options that you would like to pass to aria2c, put them in ${FETCHCOMMANDWRAPPER_EXTRA}, or in ${FETCHCOMMANDWRAPPER_OPTIONS} for fetchcommandwrapper itself; for example

FETCHCOMMANDWRAPPER_OPTIONS="--link-speed=600000"

tells fetchcommandwrapper that my download link is only 600KB/s and makes aria2 drop connections to mirrors that cannot keep up with at least a third of that, so that faster mirrors get a chance to take their place.

For non-ebuild bugs, feel free to use https://github.com/gentoo/fetchcommandwrapper/issues to report them.

Best, Sebastian