
Contributors:
. Aaron W. Swenson
. Agostino Sarubbo
. Alexey Shvetsov
. Alexis Ballier
. Alexys Jacob
. Alice Ferrazzi
. Andreas K. Hüttel
. Anthony Basile
. Arun Raghavan
. Bernard Cafarelli
. Brian Harring
. Christian Ruppert
. Chí-Thanh Christopher Nguyễn
. Denis Dupeyron
. Detlev Casanova
. Diego E. Pettenò
. Domen Kožar
. Doug Goldstein
. Eray Aslan
. Fabio Erculiani
. Gentoo Haskell Herd
. Gentoo Miniconf 2016
. Gentoo Monthly Newsletter
. Gentoo News
. Gilles Dartiguelongue
. Greg KH
. Göktürk Yüksek
. Hanno Böck
. Hans de Graaff
. Ian Whyman
. Jan Kundrát
. Jason A. Donenfeld
. Jeffrey Gardner
. Joachim Bartosik
. Johannes Huber
. Jonathan Callen
. Jorge Manuel B. S. Vicetto
. Kristian Fiskerstrand
. Lance Albertson
. Liam McLoughlin
. Luca Barbato
. Marek Szuba
. Mart Raudsepp
. Matt Turner
. Matthew Thode
. Michael Palimaka
. Michal Hrusecky
. Michał Górny
. Mike Doty
. Mike Gilbert
. Mike Pagano
. Nathan Zachary
. Pacho Ramos
. Patrick Kursawe
. Patrick Lauer
. Patrick McLean
. Paweł Hajdan, Jr.
. Piotr Jaroszyński
. Rafael G. Martins
. Remi Cardona
. Richard Freeman
. Robin Johnson
. Ryan Hill
. Sean Amoss
. Sebastian Pipping
. Sergei Trofimovich
. Steev Klimaszewski
. Stratos Psomadakis
. Sven Vermeulen
. Sven Wegener
. Tom Wijsman
. Yury German
. Zack Medico

Last updated:
January 23, 2018, 13:06 UTC

Disclaimer:
Views expressed in the content published here do not necessarily represent the views of Gentoo Linux or the Gentoo Foundation.



Powered by:
Planet Venus

Welcome to Gentoo Universe, an aggregation of weblog articles on all topics written by Gentoo developers. For a more refined aggregation of Gentoo-related topics only, you might be interested in Planet Gentoo.

January 19, 2018
Greg KH a.k.a. gregkh (homepage, bugs)

I keep getting a lot of private emails about my previous post about the latest status of the Linux kernel patches to resolve both the Meltdown and Spectre issues.

These questions all seem to break down into two different categories, “What is the state of the Spectre kernel patches?”, and “Is my machine vulnerable?”

State of the kernel patches

As always, lwn.net covers the technical details about the latest state of the kernel patches to resolve the Spectre issues, so please go read that to find out that type of information.

And yes, it is behind a paywall for a few more weeks. You should be buying a subscription to get this type of thing!

Is my machine vulnerable?

This question now has a very simple answer: you can check it yourself.

Just run the following command at a terminal window to determine what the state of your machine is:

$ grep . /sys/devices/system/cpu/vulnerabilities/*

On my laptop, right now, this shows:

$ grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Vulnerable
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable: Minimal generic ASM retpoline

This shows that my kernel is properly mitigating the Meltdown problem by implementing PTI (Page Table Isolation), that my system is still vulnerable to Spectre variant 1, and that it is trying really hard to resolve variant 2 but is not quite there (because I did not build my kernel with a compiler that properly supports the retpoline feature).

If your kernel does not have that sysfs directory or files, then obviously there is a problem and you need to upgrade your kernel!

Some “enterprise” distributions did not backport the changes for this reporting, so if you are running one of those types of kernels, go bug the vendor to fix that; you really want a unified way of knowing the state of your system.

Note that right now, these files are only valid for x86-64 based kernels; all other processor types will show something like “Not affected”. As everyone knows, that is not true for the Spectre issues, except for very old CPUs, so it’s a huge hint that your kernel really is not up to date yet. Give it a few months for all other processor types to catch up and implement the correct kernel hooks to properly report this.
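
If you want to script this check (across a fleet of machines, say), something like the following Python sketch works as well. This is my own illustration, not an official tool:

import glob
import os

VULN_DIR = "/sys/devices/system/cpu/vulnerabilities"

if not os.path.isdir(VULN_DIR):
    # No reporting interface at all: the kernel predates these patches.
    print("no vulnerabilities directory, upgrade your kernel!")
else:
    for path in sorted(glob.glob(VULN_DIR + "/*")):
        with open(path) as status:
            print(os.path.basename(path) + ": " + status.read().strip())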

And yes, I need to go find a working microcode update to fix my laptop’s CPU so that it is not vulnerable against Spectre…

January 18, 2018
Matthew Thode a.k.a. prometheanfire (homepage, bugs)
Undervolting your CPU for fun and profit (January 18, 2018, 06:00 UTC)

Disclaimer

I'm not responsible if you ruin your system; this guide functions as documentation for future me. While doing this wrong should just cause MCEs or system resets, it might also be able to ruin things in a more permanent way.

Why do this

It generally lowers temps, and if you hit thermal throttling (as happens with my 5th Generation X1 Carbon) you may throttle less often or at a higher frequency.

Prework

Repaste first; it's generally easier to do and can result in better gains (and it's all about them gains). For instance, my Skull Canyon NUC was resetting itself thermally when stress testing. Repasting lowered the max temps by 20°C.

Undervolting - The How

I based my info on https://github.com/mihic/linux-intel-undervolt which seems to work on my Intel Kaby Lake based laptop.

The values are written to the MSR registers using the wrmsr binary from msr-tools.

The following python3 snippet is how I got the values to write.

# for a -110mv offset I run the following
format(0xFFE00000&( (round(-110*1.024)&0xFFF) <<21), '08x')

What you are actually writing is as follows, with the plane index being 0, 1 or 2 for the cpu, gpu or cache respectively.

constant   plane index   constant   write/read   offset
80000      0             1          1            F1E00000

So we end up with 0x80000011F1E00000 to write to MSR register 0x150.
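
To generalize that computation, here is a small Python sketch of my own (following the bit layout above) that assembles the full 64-bit value for any plane and millivolt offset:

def undervolt_msr_value(plane, mv, write=True):
    # 12-bit offset in units of 1/1.024 mV, shifted into the top bits
    offset = 0xFFE00000 & ((round(mv * 1.024) & 0xFFF) << 21)
    # 0x80000 constant, plane index, constant 1, write/read flag
    header = (0x80000 << 12) | (plane << 8) | 0x10 | (1 if write else 0)
    return (header << 32) | offset

# The three values used in the script below:
assert undervolt_msr_value(0, -110) == 0x80000011F1E00000
assert undervolt_msr_value(2, -110) == 0x80000211F1E00000
assert undervolt_msr_value(1, -90) == 0x80000111F4800000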

Undervolting - The Integration

I made a script called undervolt.sh in /opt/bin/, which is where I place my custom system scripts (make sure it's executable).

#!/usr/bin/env sh
/usr/sbin/wrmsr 0x150 0x80000011F1E00000  # cpu core  -110
/usr/sbin/wrmsr 0x150 0x80000211F1E00000  # cpu cache -110
/usr/sbin/wrmsr 0x150 0x80000111F4800000  # gpu core  -90

I then made a custom systemd unit in /etc/systemd/system called undervolt.service with the following content.

[Unit]
Description=Undervolt Service

[Service]
ExecStart=/opt/bin/undervolt.sh

[Install]
WantedBy=multi-user.target

I then ran systemctl enable undervolt.service so I'll get my settings on boot.

The following script was also placed in /usr/lib/systemd/system-sleep/undervolt.sh to get the settings after recovering from sleep, as they are lost at that point.

#!/bin/sh
# Re-apply the undervolt settings after resuming from sleep.
if [ "${1}" = "post" ]; then
  /opt/bin/undervolt.sh
fi

That's it; everything after this point is what I went through in testing.

Testing for stability

I stress tested with mprime in stress mode for the CPU and glxmark or gputest for the GPU. I only recorded the results for the CPU, as that's what I care about.

No changes:

  • 15-17W package TDP
  • 15W core TDP
  • 3.06-3.22 GHz

50mV undervolt:

  • 15-16W package TDP
  • 14-15W core TDP
  • 3.22-3.43 GHz

70mV undervolt:

  • 15-16W package TDP
  • 14W core TDP
  • 3.22-3.43 GHz

90mV undervolt:

  • 15W package TDP
  • 13-14W core TDP
  • 3.32-3.54 GHz

100mV undervolt:

  • 15W package TDP
  • 13W core TDP
  • 3.42-3.67 GHz

110mV undervolt:

  • 15W package TDP
  • 13W core TDP
  • 3.42-3.67 GHz

I started getting gfx artifacts, so I switched the gfx undervolt to 100mV here.

115mV undervolt:

  • 15W package TDP
  • 13W core TDP
  • 3.48-3.72 GHz

115mV undervolt with repaste:

  • 15-16W package TDP
  • 14-15W core TDP
  • 3.63-3.81 GHz

120mV undervolt with repaste:

  • 15-16W package TDP
  • 14-15W core TDP
  • 3.63-3.81 GHz

I decided on a 110mV cpu and 90mV gpu undervolt for stability; with proper ventilation I get about 3.7-3.8 GHz out of a max of 3.9 GHz.

Other notes: undervolting made the cpu max speed less spiky.

January 17, 2018
Sven Vermeulen a.k.a. swift (homepage, bugs)
Structuring a configuration baseline (January 17, 2018, 08:10 UTC)

A good configuration baseline has a readable structure that allows all stakeholders to quickly see if the baseline is complete, as well as find a particular setting regardless of the technology. In this blog post, I'll cover a possible structure of the baseline which attempts to be sufficiently complete and technology agnostic.

If you haven't read the blog post on documenting configuration changes, it might be a good idea to do so as it declares the scope of configuration baselines and why I think XCCDF is a good match for this.

Chaptered documentation

As mentioned previously, a configuration baseline describes the configuration of a particular technological service (rather than a business service which is an integrated set of technologies and applications). To document and maintain the configuration state of the technology, I suggest the following eight chapters (to begin with):

  1. Architecture
  2. Operating system and services
  3. Software deployment and file system
  4. Technical service settings
  5. Authentication, authorization, access control and auditing
  6. Service specific settings
  7. Cryptographic services
  8. Data and information handling

Within each chapter, sections can be declared depending on how the technology works. For instance, for database technologies one can have a distinction between system-wide settings, instance-specific settings and even database-specific settings. Or, if the organization has specific standards on user definitions, a chapter on "User settings" can be used. The above is just a suggestion in an attempt to cover most aspects of a configuration baseline.

Within the sections of each chapter, rules are then defined which specify the actual configuration setting (or valid range) applicable to the technology. But a rule goes further than just a single-line configuration setting description.

Each rule should have a unique identifier so that other documents can reliably link to the rules in the document. Although XCCDF has a convention for this, I feel that the XCCDF way here is more useful for technical referencing while the organization is better off with a more human addressable approach. So while a rule in XCCDF has the identifier xccdf_com.example.postgresql_rule_selinux-enforcing the human addressable identifier would be postgresql_selinux-enforcing or even postgresql-00001. In the company that I work for, we already have a taxonomy for services and a decision to use numerical identifiers on the configuration baseline rules.
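
As an illustration of such a convention (hypothetical code, not part of XCCDF or any tooling), the mapping between the two identifier styles can even be done mechanically:

import re

# Hypothetical convention: xccdf_<namespace>.<technology>_rule_<name>
XCCDF_RULE = re.compile(r"^xccdf_[\w.]+\.(?P<tech>\w+)_rule_(?P<name>[\w-]+)$")

def human_id(xccdf_id):
    match = XCCDF_RULE.match(xccdf_id)
    if not match:
        raise ValueError("not an XCCDF rule identifier: " + xccdf_id)
    return match.group("tech") + "_" + match.group("name")

print(human_id("xccdf_com.example.postgresql_rule_selinux-enforcing"))
# prints: postgresql_selinux-enforcing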

Each rule should be properly described, documenting what the rule is for. In case of a ranged value, it should also document how this range can be properly applied. For instance, if the number of worker threads is based on the number of cores available in the system, document the formula.

Each rule should also document the risk that it wants to mitigate (be it a security risk, or a manageability aspect of the service, or a performance related tuning parameter). This aspect of the baseline is important whenever an implementation wants an exception to the rule (not follow it) or a deviation (different value). Personally, to make sure that the baseline is manageable, I don't expect engineers to immediately fill in the risk in great detail, but rather holistically. The actual risk determination is then only done when an implementation wants an exception or deviation, and then includes a list of potential mitigating actions to take. This way, a 300+ rule document does not require all 300+ rules to have a risk determination, especially if only a dozen or so rules have exceptions or deviations in the organization.

Each rule should have sources linked to it. These sources help the reader understand what the rule is based on, such as a publicly available secure configuration baseline, an audit recommendation, a specific incident, etc. If the rule is also controversial, it might benefit from links to meeting minutes.

Each rule might have consequences listed as well. These are known changes or behavior aspects that follow the implementation of the rule. For instance, a rule might state that TLS mutual authentication is mandatory, and the consequence is that all interacting clients must have a properly defined certificate (so proper PKI requirements) as well as client registration in the application.

Finally, and importantly as well, each rule identifies the scope at which exceptions or deviations can be granted. For smaller groups and organizations, this might not matter that much, but for larger organizations, some configuration baseline rules can be "approved" by a small team or by the application owner, while others need the formal advice of a security officer and approval by a decision body.

Finding a balanced approval hierarchy

The exception management for configuration baselines should not be underestimated. It is not viable to have all settings handled by top management decision bodies, but some configuration changes might result in such a huge impact that a formal decision needs to be taken somewhere, with proper accountability assigned (yes, this is the architect in me speaking).

Rather than attempting to create a match for all rules, I again like to keep the decision here in the middle, just like I do with the risk determination. The maintainer of the configuration baseline can leave the "scope" of a rule open, and have an intermediate decision body as the main decision body. Whenever an exception or deviation is requested, the risk determination is made and filled in, and with the documented rule now complete, a waiver is requested from the decision body. Together with the waiver request, the maintainer also asks this decision body if the rule in the future also needs to be granted on that decision body or elsewhere.

The scope is most likely tied to the impact of the rule towards other services. A performance specific rule that only affects the application hosted on the technology can be easily scoped as being application-only. This means that the application or service owner can decide to deviate from the baseline. A waiver for a rule that influences system behavior might need to be granted by the system administrator (or team) as well as application or service owners that use this system. Following this logic, I generally use the following scope terminology:

  • tbd (to be determined), meaning that there is no assessment done yet
  • application, meaning that the impact is only on a single application and thus can be taken by the application owner
  • instance, meaning that the impact is on an instance and thus might be broader than a single application, but is otherwise contained to the technology. Waivers are granted by the responsible system administrator and application owner(s)
  • system, meaning that the impact is on the entire system and thus goes beyond the technology. Waivers are granted by the responsible system administrator, application owner(s) and with advice from a security officer
  • network, meaning that the impact can spread to other systems or influence behavior of other systems, but remains technical in nature. Waivers are granted by an infrastructure architecture board with advice from a security officer
  • organization, meaning that the impact goes beyond technical influence but also impacts business processes. Waivers are granted by an architecture board with advice from a security officer and senior service owner, and might even be redirected to a higher management board.
  • group, meaning that the impact influences multiple businesses. Waivers are granted by a specific management board

Each scope also has a "-pending" value, so "network-pending". This means that the owner of the configuration baseline suggests that this is the scope on which waivers can be established, but still needs to receive formal validation.

The main decision body is then a particular infrastructure architecture board, which will redirect requests to other decision bodies if the scope goes beyond what that architecture board handles.

Architectural settings

The first chapter in a baseline is perhaps the most controversial one, as it does not cover technical settings and is hard to validate. However, in my experience, tying architectural constraints into a configuration baseline is much more efficient than having a separate track, for a number of reasons.

For one, I strongly believe that architecture deviations are like configuration deviations. They should be documented similarly, and follow the same path as configuration baseline deviations. The scope of architectural rules is also all over the place, from application-level impact up to organization-wide.

Furthermore, architectural positioning of services should not be solely an (infrastructure) architecture concern, but supported by the other stakeholders as well, and especially by the party responsible for the technology service.

For instance, a rule could be that no databases should be positioned within an organization's DeMilitarized Zone (DMZ), which is a network design that shields off internally positioned services from the outside world. Although this is not a configuration setting, it makes sense to place it in the configuration baseline of the database technology. There are several ways to validate automatically if this rule is followed, depending, for instance, on the organization's IP plan.

Another rule could be that web applications that host browser-based applications should only be linked through a reverse proxy, or that a load balancer must be put in front of an application server, etc. This might result in additional rules in the chapter that covers access control as well (such as having a particular IP filter in place), but these rules are the consequence of the architectural positioning of the service.

Operating system and services

The second chapter covers settings specific to the operating system on which the technology is deployed. Such settings can be system-wide settings like Linux' sysctl parameters, services which need to be enabled or disabled when the technology is deployed, and deviations from the configuration baseline of the operating system.

An example of the latter depends of course on the configuration baseline of the operating system (assuming this is a baseline for a technology deployed on top of an operating system; it could very well be a different platform). Suppose for instance that the baseline has the squashfs kernel module disabled, but the technology itself requires squashfs; then a waiver is needed. This is the level where this is documented.

Another setting could be an extension of the SSH configuration (the term "services" in the chapter title here focuses on system services, such as OpenSSH), or the implementation of additional audit rules on OS-level (although auditing can also be covered in a different section).

Software deployment and file system

The third chapter focuses on the installation of the technology itself, and the file system requirements related to the technology service.

Rules here look into file ownership and permissions, mount settings, and file system declarations. Some baselines might even define rules about integrity of certain files (the Open Vulnerability and Assessment Language (OVAL) supports checksum-based validations) although I think this is better tackled through a specific integrity process. Still, if such an integrity process does not exist and automated validation of baselines is implemented, then integrity validation of critical files could be in scope.

Technical service settings

In the fourth chapter, settings are declared regarding the service without being service-specific. A service-specific setting is one that requires functional knowledge of the service, whereas technical service settings can be interpreted without functionally understanding the technology at hand.

Let's take PostgreSQL as an example. A service-specific setting would be the maximum number of non-frozen transaction IDs before a VACUUM operation is triggered (the autovacuum_freeze_max_age parameter). If you are not working with PostgreSQL much, then this makes as much sense as Prisencolinensinainciusol. It sounds like English, but that's about as far as you get.

A technical service setting on PostgreSQL that is likely more understandable is the runtime account under which the database runs (you don't want it to run as root), or the TCP port on which it listens. Although both are technical in nature, they're much more understandable for others and, perhaps the most important reason of all, often more reusable in deployments across technologies.

This reusability is key for larger organizations as they will have numerous technologies to support, and the technical service settings offer a good baseline for initial secure setup. They focus on the runtime account of the service, the privileges of the runtime account (be it capability-based on Linux or account rights on Windows), the interfaces on which the service is reachable, the protocol or protocols it supports, etc.

Authentication, authorization, access control and auditing

The next chapter focuses on the Authentication, Authorization and Accounting (AAA) services, but worded slightly differently (AAA is commonly used in networking-related setups; I just borrow it and extend it). If the configuration baseline is extensive, then it might make sense to have separate sections for each of these security concepts.

Some technologies have a strong focus on user management as well. In that case, it might make sense to first describe the various types of users that the technology supports (like regular users, machine users, internal service users, shared users, etc.) and then, per user type, document how these security services act on it.

Service specific settings

The next chapter covers settings that are very specific to the service. These are often the settings that are found in the best practices documentation, secure deployment instructions of the vendor, performance tuning parameters, etc.

I tend to look first at the base configuration and administration guides for technologies, and see what the main structure is that those documents follow. Often, this can be borrowed for the configuration baseline. Next, consider performance related tuning, as that is often service specific and not related to the other chapters.

Cryptographic services

In this chapter, the focus is on the cryptographic services and configuration.

The most well-known example here is related to any TLS configuration and tuning. Whereas the location of the private key (used for TLS services) is generally mentioned in the third chapter (or at least the secure storage of the private key), this section will focus on using it properly. It looks at selecting a proper TLS version, making a decent and manageable set of ciphers to support, enabling Online Certificate Status Protocol (OCSP) on web servers, etc.
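
As a sketch of what such rules translate to in practice (assuming, purely for illustration, a Python-based service on a recent Python version, with hypothetical file paths):

import ssl

context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
# Rule: only accept TLS 1.2 or later
context.minimum_version = ssl.TLSVersion.TLSv1_2
# Rule: a small, auditable cipher set (illustrative, not a recommendation)
context.set_ciphers("ECDHE+AESGCM:!aNULL:!eNULL")
# The storage and permissions of the key itself belong in the third chapter
context.load_cert_chain("/etc/pki/service.crt", "/etc/pki/service.key")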

But services often use cryptographic related algorithms in various other places as well. Databases can provide transparent data file encryption to ensure that offline access to the database files does not result in data leakage for instance. Or they implement column-level encryption.

Application servers might provide crypto-related routines to the applications they host, and the configuration baseline can then identify which crypto modules are supported and which ones aren't.

Services might be using cryptographic hashes which are configurable, or could be storing user passwords in a database using configurable settings. OpenLDAP for instance supports multiple hashing methods (and also supports storing in plain-text if you want this), so it makes sense to select a hashing method that is hard to brute-force (slow to compute for instance) and is salted (to make certain types of attacks more challenging).
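
A minimal sketch of the property such a rule asks for, using Python's hashlib (Python 3.6 or later; the scrypt parameters are illustrative only, not a recommendation):

import hashlib
import hmac
import os

def hash_password(password):
    # A random salt per password, and a deliberately slow, memory-hard hash
    salt = os.urandom(16)
    return salt, hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)

def verify_password(password, salt, digest):
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)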

If the service makes use of stored credentials or keytabs, document how they are protected here as well.

Data and information handling

Information handling covers both the regular data management activities (like backup/restore, data retention, archival, etc.) as well as sensitive information handling (to comply with privacy rules).

The regular data management related settings look into both the end user data handling (as far as this is infrastructurally related - this isn't meant to become a secure development guide) as well as service-internal data handling. When the technology is meant to handle data (like a database or LDAP) then certain related settings could be both in the service specific settings chapter or in this one. Personally, I tend to prefer that technology-specific and non-reusable settings are in the former, while the data and information handling chapter covers the integration and technology-agnostic data handling.

If the service handles sensitive information, it is very likely that additional constraints or requirements were put in place beyond the "traditional" cryptographic requirements. Although such requirements are often implemented on the application level (like tagging the data properly and then, based on the tags, handle specific fine-grained access controls, archival and data retention), more and more technologies provide out-of-the-box (or at least reusable) methods that can be configured.

An XCCDF template

To support the above structure, I've made an XCCDF template that might be a good start for documenting the configuration baseline of a technology. It also structures the chapters a bit more with various sections, but those are definitely not mandatory to use as it strongly depends on the technology being documented, the maturity of the organization, etc.

January 08, 2018
Andreas K. Hüttel a.k.a. dilfridge (homepage, bugs)
FOSDEM 2018 talk: Perl in the Physics Lab (January 08, 2018, 17:07 UTC)

FOSDEM 2018, the "Free and Open Source Developers' European Meeting", takes place 3-4 February at Université libre de Bruxelles, Campus Solbosch, Brussels - and our measurement control software Lab::Measurement will be presented there in the Perl devroom! As with all of FOSDEM, the talk will also be streamed live and archived; more details on this will follow later. Here's the abstract:

Perl in the Physics Lab
Andreas K. Hüttel
    Track: Perl Programming Languages devroom
    Room: K.4.601
    Day: Sunday
    Start: 11:00
    End: 11:40 
Let's visit our university lab. We work on low-temperature nanophysics and transport spectroscopy, typically measuring current through experimental chip structures. That involves cooling and temperature control, dc voltage sources, multimeters, high-frequency sources, superconducting magnets, and a lot more fun equipment. A desktop computer controls the experiment and records and evaluates data.

Some people (like me) want to use Linux, some want to use Windows. Not everyone knows Perl, not everyone has equal programming skills, not everyone knows equally much about the measurement hardware. I'm going to present our solution for this, Lab::Measurement. We implement a layer structure of Perl modules, all the way from the hardware access and the implementation of device-specific command sets to high level measurement control with live plotting and metadata tracking. Current work focuses on a port of the entire stack to Moose, to simplify and improve the code.


January 07, 2018
Sven Vermeulen a.k.a. swift (homepage, bugs)
Documenting configuration changes (January 07, 2018, 20:20 UTC)

IT teams are continuously under pressure to set up and maintain infrastructure services quickly, efficiently and securely. As an infrastructure architect, my main concerns are related to the manageability of these services and the secure setup. And within those realms, a properly documented configuration setup is in my opinion very crucial.

In this blog post series, I'm going to look into using the Extensible Configuration Checklist Description Format (XCCDF) as the way to document these. This first post is an introduction to XCCDF functionally, and what I position it for.

Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)

You may remember that just last week I was excited to announce that I had more work planned and lined up for my glucometer utilities, one of which was supporting the OneTouch Verio IQ, which is a slightly older meter that is still sold and in use in many countries, but for which no protocol is released.

In the issue I linked above you can find an interesting problem: LifeScan discontinued their Diabetes Management Software, and removed it from their website. Instead, they suggest you get one of their Bluetooth meters and use that with their software. While in general the idea of upgrading a meter is sane, the fact that they decided to discontinue the old software without providing protocols is at the very least annoying.

This shows the importance of having open source tools that can be kept alive as long as needed, because there will be people out there that still rely on their OneTouch Verio IQ, or even on the OneTouch Ultra Easy, which was served by the same software, and is still being sold in the US. Luckily, they at least used to publish the Ultra Easy protocol specs, and they are still available on the Internet at large if you search for them (and I do have a copy, and I can rephrase that into a protocol specification if I find that’s needed).

On the bright side, the Tidepool project (of which I wrote before) has a driver for the Verio IQ. It’s not a particularly good driver, as I found out (I’ll get to that later), but it’s a starting point. It made me notice that the protocol was almost an in-between of the Ultra Easy and the Verio 2015, which I already reverse engineered before.

Of course I also managed to find a copy of the LifeScan software on a mostly shady website and a copy of the “cable drivers” package from the Middle East and Africa website of LifeScan, which still has the website design from five years ago. This is good because the latter package is the one that installs kernel drivers on Windows, while the former only contains userland software, which I can trust a little more.

Comparing the USB trace I got from the software with the commands implemented in the TidePool driver showed me a few interesting bits of information. The first is that the first byte of commands on the Verio devices is not actually fixed, but can be chosen among a few, as the Windows software and the TidePool driver used different bytes (and with this I managed to simplify one corner case in the Verio 2015!). The second is that the TidePool driver does not extract all the information it should! In particular the device allows before/after meal marking, but they discard the byte before getting to it. Of course they don’t seem to expose that data even from the Ultra 2 driver so it may be intentional.

A bit more concerning is that they don’t verify that the command returned a success status, but rather discard the first two bytes every time. Thankfully it’s very easy for me to check that.

On the other hand, reading through the TidePool driver (which I have to assume was developed with access to the LifeScan specifications, under NDA) I could identify two flaws in my own code. The first was not realizing that the packet format between the UltraEasy and the Verio 2015 was not subtly different as I had thought: it is almost identical, except that the link-control byte in both Verio models is not used, and is kept at 0. The second was that I’m not currently correctly dropping out control solutions from the readings of the Verio 2015! I should find a way to get a hold of the control solution for my models in the pharmacy and make sure I get this tested out.

Oh yeah, and the TidePool driver does not do anything to get or set date and time; thankfully the commands were literally the same as in the Verio 2015, so that part was an actual copy-paste of code. I should probably tidy up a bit, but now I would have a two-tier protocol system: the base packet structure is shared between the UltraEasy, Verio IQ and Verio 2015. Some of the commands are shared between UltraEasy and Verio IQ, more of them are shared between the Verio IQ and the Verio 2015.

You can see now why I’ve been disheartened to hear that the development of drivers, even for open source software, is done through closed protocol specifications that cannot be published (or the drivers thoroughly commented). Since TidePool is not actually using all of the information, there is no way to tell what certain bytes of the responses represent. And unless I get access to all the possible variants of the information, I can’t tell what some bytes that look constant to me actually represent. Indeed since the Verio 2015 does not have meal information, I assumed that the values were 32-bit until I got a report of invalid data on another model which shares the same protocol and driver. This is why I am tempted to build “virtual” fakes of these devices with Facedancer to feed variants of the data to the original software and see how it’s represented there.

On the bright side I feel proud of myself (maybe a little too much) for having spent the time to rewrite those two drivers with Construct while at 34C3 and afterwards. If I hadn’t refactored the code before looking at the Verio IQ, I wouldn’t have noticed the similarities so clearly and likely wouldn’t have come to the conclusion it’s a shared similar protocol. And no way I could have copy-pasted between the drivers so easily as I did.

January 06, 2018
Greg KH a.k.a. gregkh (homepage, bugs)
Meltdown and Spectre Linux kernel status (January 06, 2018, 12:36 UTC)

By now, everyone knows that something “big” just got announced regarding computer security. Heck, when the Daily Mail does a report on it, you know something is bad…

Anyway, I’m not going to go into the details about the problems being reported, other than to point you at the wonderfully written Project Zero paper on the issues involved here. They should just give out the 2018 Pwnie award right now, it’s that amazingly good.

If you do want technical details for how we are resolving those issues in the kernel, see the always awesome lwn.net writeup for the details.

Also, here’s a good summary of lots of other postings that includes announcements from various vendors.

As for how this was all handled by the companies involved, well this could be described as a textbook example of how NOT to interact with the Linux kernel community properly. The people and companies involved know what happened, and I’m sure it will all come out eventually, but right now we need to focus on fixing the issues involved, and not pointing blame, no matter how much we want to.

What you can do right now

If your Linux systems are running a normal Linux distribution, go update your kernel. They should all have the updates in them already. And then keep updating them over the next few weeks; we are still working out lots of corner-case bugs, as the testing involved here is complex given the huge variety of systems and workloads this affects. If your distro does not have kernel updates, then I strongly suggest changing distros right now.

However, there are lots of systems out there that are not running “normal” Linux distributions for various reasons (rumor has it that it is way more than the “traditional” corporate distros). They rely on the LTS kernel updates, or the normal stable kernel updates, or they are in-house franken-kernels. For those people, here’s the status of what is going on regarding all of this mess in the upstream kernels you can use.

Meltdown – x86

Right now, Linus’s kernel tree contains all of the fixes we currently know about to handle the Meltdown vulnerability for the x86 architecture. Go enable the CONFIG_PAGE_TABLE_ISOLATION kernel build option, and rebuild and reboot and all should be fine.

However, Linus’s tree is currently at 4.15-rc6 + some outstanding patches. 4.15-rc7 should be out tomorrow, with those outstanding patches to resolve some issues, but most people do not run a -rc kernel in a “normal” environment.

Because of this, the x86 kernel developers have done a wonderful job in their development of the page table isolation code, so much so that the backport to the latest stable kernel, 4.14, has been almost trivial for me to do. This means that the latest 4.14 release (4.14.12 at this moment in time) is what you should be running. 4.14.13 will be out in a few more days, with some additional fixes in it that are needed for some systems that have boot-time problems with 4.14.12 (it’s an obvious problem: if it does not boot, just add the patches now queued up).

I would personally like to thank Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra, Josh Poimboeuf, Juergen Gross, and Linus Torvalds for all of the work they have done in getting these fixes developed and merged upstream in a form that was so easy for me to consume to allow the stable releases to work properly. Without that effort, I don’t even want to think about what would have happened.

For the older long term stable (LTS) kernels, I have leaned heavily on the wonderful work of Hugh Dickins, Dave Hansen, Jiri Kosina and Borislav Petkov to bring the same functionality to the 4.4 and 4.9 stable kernel trees. I also had immense help from Guenter Roeck, Kees Cook, Jamie Iles, and many others in tracking down nasty bugs and missing patches. I want to also call out David Woodhouse, Eduardo Valentin, Laura Abbott, and Rik van Riel for their help with the backporting and integration as well; their help was essential in numerous tricky places.

These LTS kernels also have the CONFIG_PAGE_TABLE_ISOLATION build option that should be enabled to get complete protection.

As this backport is very different from the mainline version that is in 4.14 and 4.15, there are different bugs happening. Right now we know of some VDSO issues that are being worked on, and some odd virtual machine setups are reporting strange errors, but those are the minority at the moment, and should not stop you from upgrading at all right now. If you do run into problems with these releases, please let us know on the stable kernel mailing list.

If you rely on any other kernel tree other than 4.4, 4.9, or 4.14 right now, and you do not have a distribution supporting you, you are out of luck. The lack of patches to resolve the Meltdown problem is so minor compared to the hundreds of other known exploits and bugs that your kernel version currently contains. You need to worry about that more than anything else at this moment, and get your systems up to date first.

Also, go yell at the people who forced you to run an obsoleted and insecure kernel version, they are the ones that need to learn that doing so is a totally reckless act.

Meltdown – ARM64

Right now the ARM64 set of patches for the Meltdown issue are not merged into Linus’s tree. They are staged and ready to be merged into 4.16-rc1 once 4.15 is released in a few weeks. Because these patches are not in a released kernel from Linus yet, I can not backport them into the stable kernel releases (hey, we have rules for a reason…)

Due to them not being in a released kernel, if you rely on ARM64 for your systems (i.e. Android), I point you at the Android Common Kernel tree. All of the ARM64 fixes have been merged into the 3.18, 4.4, and 4.9 branches as of this point in time.

I would strongly recommend just tracking those branches as more fixes get added due to testing, and things catch up with what gets merged into the upstream kernel releases over time, especially as I do not know when these patches will land in the stable and LTS kernel releases at this point in time.

For the 4.4 and 4.9 LTS kernels, odds are these patches will never get merged into them, due to the large number of prerequisite patches required. All of those prerequisite patches have been long merged and tested in the android-common kernels, so I think it is a better idea to just rely on those kernel branches instead of the LTS release for ARM systems at this point in time.

Also note, I merge all of the LTS kernel updates into those branches usually within a day or so of being released, so you should be following those branches no matter what, to ensure your ARM systems are up to date and secure.

Spectre

Now things get “interesting”…

Again, if you are running a distro kernel, you might be covered as some of the distros have merged various patches into them that they claim mitigate most of the problems here. I suggest updating and testing for yourself to see if you are worried about this attack vector.

For upstream, well, the status is that there are no fixes merged into any upstream tree for these types of issues yet. There are numerous patches floating around on the different mailing lists that are proposing solutions for how to resolve them, but they are under heavy development, some of the patch series do not even build or apply to any known trees, the series conflict with each other, and it’s a general mess.

This is due to the fact that the Spectre issues were the last to be addressed by the kernel developers. All of us were working on the Meltdown issue, and we had no real information on exactly what the Spectre problem was at all, and what patches were floating around were in even worse shape than what has been publicly posted.

Because of all of this, it is going to take us in the kernel community a few weeks to resolve these issues and get them merged upstream. The fixes are coming in to various subsystems all over the kernel, and will be collected and released in the stable kernel updates as they are merged, so again, you are best off just staying up to date with either your distribution’s kernel releases, or the LTS and stable kernel releases.

It’s not the best news, I know, but it’s reality. If it’s any consolation, it does not seem that any other operating system has full solutions for these issues either, the whole industry is in the same boat right now, and we just need to wait and let the developers solve the problem as quickly as they can.

The proposed solutions are not trivial, but some of them are amazingly good. The Retpoline post from Paul Turner is an example of some of the new concepts being created to help resolve these issues. This is going to be an area of lots of research over the next years to come up with ways to mitigate the potential problems involved in hardware that wants to try to predict the future before it happens.

Other arches

Right now, I have not seen patches for any other architectures than x86 and arm64. There are rumors of patches floating around in some of the enterprise distributions for some of the other processor types, and hopefully they will surface in the weeks to come to get merged properly upstream. I have no idea when that will happen; if you are dependent on a specific architecture, I suggest asking on the arch-specific mailing list about this to get a straight answer.

Conclusion

Again, update your kernels, don’t delay, and don’t stop. The updates to resolve these problems will be continuing to come for a long period of time. Also, there are still lots of other bugs and security issues being resolved in the stable and LTS kernel releases that are totally independent of these types of issues, so keeping up to date is always a good idea.

Right now, there are a lot of very overworked, grumpy, sleepless, and just generally pissed off kernel developers working as hard as they can to resolve these issues that they themselves did not cause at all. Please be considerate of their situation right now. They need all the love and support and free supply of their favorite beverage that we can provide them to ensure that we all end up with fixed systems as soon as possible.

January 03, 2018
FOSDEM 2018 (January 03, 2018, 00:00 UTC)


Put on your cow bells and follow the herd of Gentoo developers to Université libre de Bruxelles in Brussels, Belgium. This year FOSDEM 2018 will be held on February 3rd and 4th.

Our developers will be ready to candidly greet all open source enthusiasts at the Gentoo stand in building K. Visit this year’s wiki page to see which developer will be running the stand during the different visitation time slots. So far seven developers have specified their attendance, with most likely more on the way!

Unlike past years, Gentoo will not be hosting a keysigning party, however participants are encouraged to exchange and verify OpenPGP key data to continue building the Gentoo Web of Trust. See the wiki article for more details.

January 02, 2018
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)

You may remember glucometerutils, my project of an open source Python tool to download glucometer data from meters that do not provide Linux support (as in, any of them).

While the tool started a few years ago, out of my personal need, this year there has been a bigger push than before, with more contributors trying the tool out, finding problems, fixing bugs. From my part, I managed to have a few fits of productivity on the tool, particularly this past week at 34C3, when I decided it was due time to start making the package shine a bit more.

So let’s see what are the more recent developments for the tool.

First of all, I decided to bring up the Python version requirement to Python 3.4 (previously, it was Python 3.2). The reason for this is that it gives access to the mock module for testing, and the enum module to write actual semantically-defined constants. While both of these could be provided as dependencies to support the older versions, I can’t think of any good reason not to upgrade from 3.2 to 3.4, and thus no need to support those versions. I mean, even Debian Stable has Python 3.5.
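
As a quick illustration of why the enum module matters (a made-up example, not the tool's actual code), semantically-defined constants fail loudly when misused, unlike bare strings:

import enum

class Unit(enum.Enum):  # hypothetical class, for illustration only
    MG_DL = "mg/dL"
    MMOL_L = "mmol/L"

Unit("mg/dL")  # fine
# Unit("mg/dl") would raise ValueError: typos can't silently slip through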

And while talking about Python versions, Hector pointed me at construct, which looked right away like an awesome library for dealing with binary data structures. It turned out to be a bit more rough around the edges than I expected from the docs, particularly because the docs do not contain enough information to actually use it with proper dynamic objects, but it does make a huge difference compared to dealing with bytestrings manually. I started already while in Leipzig to use it to parse the basic frames of the FreeStyle protocol, and then proceeded to rewrite the other binary-based protocols, between the airports and home.

This may sound like a minor detail, but I actually found this made a huge difference, as the library already provides proper support for validating expected constants, as well as dealing with checksums — although in some cases it’s a bit heavier-handed than I expected. Also, the library supports defining bit structures too, which simplified considerably the OneTouch Ultra Easy driver, which was building its own poor developer's version of the same idea. After rewriting the two binary LifeScan drivers I have (for the OneTouch Verio 2015 and Select Plus, and the Ultra Easy/Ultra Mini), the similarities between the two protocols are much easier to spot. Indeed, after porting the second driver, I also decided to refactor a bit on the first, to make the two look more alike.
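
To give an idea of the declarative style (a made-up frame layout, not the actual LifeScan framing):

from construct import Byte, Bytes, Const, Struct, this

Frame = Struct(
    "stx" / Const(b"\x02"),        # expected constant, validated on parse
    "length" / Byte,
    "payload" / Bytes(this.length),
    "etx" / Const(b"\x03"),
)

frame = Frame.parse(b"\x02\x03abc\x03")
assert frame.payload == b"abc"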

This is going to be useful soon again, because two people have asked for supporting the OneTouch Verio IQ (which, despite the name, shares nothing with the normal Verio — this one uses an on-board cp210x USB-to-serial adapter), and I somehow expect that while not compatible, the protocol is likely to be similar to the other two. I ended up finding one for cheap on Amazon Germany, and I ended up ordering it — it would be the easiest one to reverse engineer from my backlog, because it uses a driver I already know is easy to sniff (unlike other serial adapters that use strange framing, I’m looking at you FT232RL!), and the protocol is likely to not stray too far from the other LifeScan protocols, even though it’s not directly compatible.

I have also spent some time on the tests that are currently present. Unfortunately they don’t currently cover much of anything besides some common internal libraries. I have, though, decided to improve the situation, if a bit slowly. First of all, I picked up a few of the recommendations I give my peers at work during Python reviews, and started using the parameterized module that comes from Abseil, which was recently released as open source by Google. This reduces tedious repetition when building similar tests to exercise different paths in the code. Then, I’m very thankful to Muhammad for setting up Travis for me, as that now allows the tests to show breakage, if there is any coverage at all. I’ll try to write more tests this month to make sure to exercise more drivers.
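
The parameterized pattern looks roughly like this (a toy test of my own, assuming absl-py is installed; the real tests live in the project):

from absl.testing import parameterized

class ChecksumTest(parameterized.TestCase):
    @parameterized.parameters(
        (b"", 0),
        (b"\x01\x02", 3),
    )
    def test_sum(self, payload, expected):
        # One test method, run once per parameter tuple
        self.assertEqual(expected, sum(payload))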

I’ve also managed to make the setup.py in the project more useful. Indeed it now correctly lists dependencies for most of the drivers as extras, and I may even be ready to make a first release on PyPI, now that I tested most of the devices I have at home and they all work. Unfortunately this is currently partly blocked on Python SCSI not having a release on PyPI itself. I’ll get back to that possibly next month at this point. For now you can install it from GitHub and it should all work fine.

As for the future, there are two in-progress pull requests/branches from contributors to add support for graphing the results, one using rrdtool and one using gnuplot. These are particularly interesting for users of FreeStyle Libre (which is the only CGM I have a driver for), and someone expressed interest in adding a Prometheus export, because why not — this is not as silly as it may sound, the output graphs of the Libre’s own software look more like my work monitoring graphs than the usual glucometer graphs. Myself, I am now toying with the idea of mimicking the HTML output that the Accu-Check Mobile generate on its own. This would be the easiest to just send by email to a doctor, and probably can be used as a basis to extend things further, integrating the other graphs output.

So onwards and upwards, the tooling will continue being built on. And I’ll do my best to make sure that Linux users who have a need to download their meters’ readings have at least some tooling that they can use, and that does not require setting up unsafe MongoDB instances on cloud providers.

January 01, 2018
Sebastian Pipping a.k.a. sping (homepage, bugs)

Escaping Docker container using waitid() – CVE-2017-5123 (twistlock.com)

December 30, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
SMS for two-factors authentication (December 30, 2017, 01:04 UTC)

Having now spent a few days at 34C3 (I’ll leave comments on that for a discussion over a very warm coffee with a nice croissant on the side), I have already heard presenters reference the SS7 hack a few times as an everyday security threat. You probably are not surprised that I don’t agree with it being an everyday security issue, while still thinking this is a real security issue.

Indeed, while some websites refer to the SS7 hack as the reason not to use SMS for auth, at least you can find more reasonable articles talking about the updated NIST recommendation. My own preference for TOTP (as used in authenticator apps) is because I don’t need to be registered on a mobile network, and I can use it to log in while on flights.

I want to share some more thoughts that add up to the list of reasons to not use SMS as the second factor authentication. Because while I do not believe SS7 attack is something that all users should be accounting for in their day-to-day life, it is an addition to the list of reasons why SMS auth is not a good idea at all, and I don’t want people to think that, since they are unlikely to be attacked by someone leveraging SS7, they are okay with using SMS authentication instead.

The obvious first problem is reliability: as I said in the post linked above, SMS-based authentication requires you to have access to the phone with the SIM, and that it’s connected to the network and allowed to receive the SMS. This is fine if you never travel and you have good reception where you need to use this. But myself, I have had multiple situations in which I was unable to receive SMS (in flights as I said already, or while visiting China the first time, with 3 Ireland not giving access to any roaming partners to pay-as-you-go customers, or even at my own desk at my office with 3 UK after I moved to London).

This lack of reliability is unfortunate, but not by itself a security issue: it prevents you from accessing the account that the 2FA is set on, and that would sound like it’s doing what it’s designed to do, failing close. At the same time, this increases the friction of using 2FA, reducing usability, and pushing users, bit by bit, to stop using 2FA altogether. Which is a security problem.

The other problem is the ability to intercept or hijack those messages. As you can guess by what I wrote above, I’m not referring to SS7 hacks or equivalent. It’s all significantly more simple.

The first way to intercept the SMS auth messages is having access to the phone itself. Most phones, iOS and Android alike, are configured to show new text messages on the standby page. In some cases, only a part of the message is visible, and to have access to the rest of the message you’d have to unlock the phone – assuming the phone is locked at all – but 2FA authentication messages tend to be very short and to the point, showing the number in the preview, for ease of access. On Android, such a message can also be swiped away without unlocking the phone. A user who falls victim to this type of interception might have a very hard time noticing it, as nowadays it’s likely the SMS app is not opened very often, and a swiped-off notification would take time to be noticed.[1]

The hijacking I have in mind is a bit more complicated and (probably) noticeable. Instead of using the SS7 hack, you can just take over a phone number by leveraging the phone providers. And this can be done in (at least) two ways: you can convince the victim’s phone provider to reprovision the number to a new SIM card within the same operator, or you can port the number to a new operator. The complexity or easiness of these two processes change a lot between countries, some countries are safer than others.

For instance, the UK system for number portability is what I expect to be the most secure (if not the most user-friendly) I have seen. The first step is to get a Portability Authorization Code (PAC) for the number you want to port. You do that by calling the current provider. None of the three providers I have had up to now in the UK had any way to get this code online, which is a tad safer, as a misplaced password cannot bring full access to the account line. And while the code could be “intercepted” the same way as I pointed out above for authentication codes, the (new) operator does get in touch with you reminding you when the portability will take place, and giving you a chance to contact them if it doesn’t sound right. In the case of Vodafone, they also send you an email when you request the PAC, meaning just swiping away the notification is not enough to hide the fact that it was requested in the first place.

In Ireland, a portability request completes in the span of less than an hour, and only requires you to have (brief) access to the line you want to take over, as the new operator will send you a code to confirm the request. Which means the process, while being significantly easier for the customers, is also extremely insecure. In Italy, I actually went to the store with the line that I wanted to port, and I don’t remember if they asked anything but my IDs to open the new line. No authentication code is involved at all, so if you can fake enough documents, you likely can take over any lines. I do not remember if they notified my old SIM card before the move. I have not tried number portability in France, but it appears you can get the RIO (the equivalent transfer code) from the online system of Free at the very least.

The good thing about all the portability processes I’ve seen up to now is that at least they do not drop a new number on the old SIM card. I was told that this is (or at least was) actually common for US providers, where porting a number out just assigns a new number to the old SIM. In that case, it would probably take a while for a victim to notice they had their account taken over. And that would not be a surprise.

If you’re curious, you can probably try that by yourself. Call your phone provider from another number than your own, and see how many and which security questions they ask you to verify that it is actually their customer calling, instead of a random stranger. I think the funniest I’ve had was Three Ireland, which asked me for the number I “recently” sent a text message to, or a recent call made or received — you can imagine that it’s extremely easy to get someone to send you a text message, or have them call you, if you’re close enough to them, or even just have them pick up the phone and then report to the provider that you were last called by the number you used.

And then there is the other interesting point of SMS-based authentication: the codes stay valid for longer. A TOTP has a short lifetime by design, as it’s time-based. Add some fuzzing: most of the implementations I’ve seen allow a single code to be valid for twice the time it is displayed on the authenticator, by accepting a code from the past or from the future, generally for half the expected duration on each side. Since the normal case has the code lasting for 60 seconds, they would accept it from 30 seconds before it’s supposed to be used until 30 seconds after. But text messages can (and do) take much longer than that.
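
To make the window arithmetic concrete, here is a minimal sketch in C of the acceptance logic described above. totp_for_step() is a hypothetical stand-in for the actual RFC 6238 HMAC computation; the one-step-each-way window is my reading of the fuzzing described in this paragraph.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define TOTP_STEP 30 /* seconds per code */

/* hypothetical helper: the RFC 6238 HMAC-based code for a given step */
extern uint32_t totp_for_step(const uint8_t *key, size_t key_len, uint64_t step);

/* Accept the current step plus one step in the past and one in the
   future: a code is effectively valid for up to twice its displayed
   lifetime, exactly the fuzzing described above. */
bool totp_verify(const uint8_t *key, size_t key_len, uint32_t code)
{
    uint64_t step = (uint64_t)time(NULL) / TOTP_STEP;
    uint64_t candidates[3] = { step - 1, step, step + 1 };
    for (int i = 0; i < 3; i++)
        if (totp_for_step(key, key_len, candidates[i]) == code)
            return true;
    return false;
}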

And this is particularly useful when attacking systems that do not implement 2FA correctly. Entering the wrong OTP usually does not invalidate a login attempt, at least not on the first mistake, because users can mistype, or they can miss the window to send an OTP in. But sometimes there are egregious errors, such as the one made by N26, which neither rate-limited nor invalidated the OTP requests, allowing simple bruteforcing of a valid OTP. Since TOTP codes change without requesting a new one, bruteforcing them gives you a particularly short window of viability… SMS codes, on the other hand, leave a much larger window of opportunity open.
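
For contrast, here is a minimal sketch (my own illustration, not N26’s actual code) of the two missing mitigations: counting failed attempts and invalidating the code once too many have been made. Rate limiting the endpoint itself would come on top of this.

#include <stdbool.h>

#define MAX_OTP_ATTEMPTS 3 /* arbitrary illustrative limit */

struct otp_challenge {
    unsigned expected_code;
    unsigned failed_attempts;
    bool invalidated;
};

/* Too many wrong guesses invalidate the code, forcing the attacker to
   request a fresh one; combined with rate limiting this makes
   bruteforcing a 6-8 digit code impractical. */
bool otp_check(struct otp_challenge *c, unsigned code)
{
    if (c->invalidated)
        return false;
    if (code == c->expected_code)
        return true;
    if (++c->failed_attempts >= MAX_OTP_ATTEMPTS)
        c->invalidated = true;
    return false;
}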

Oh, and remember the Barclays single-factor authentication? How long do you think it would take to spoof the sending number of the SMS that needs to be sent (with the text “Y”) to authorize a transaction, even without having access to the text message that was sent?


  1. This is an argument for receiving your text messages in another of your messaging apps, whether it is Facebook Messenger, Hangouts or Signal. Assuming you use one of those day to day, you would then have an easy way to notice if you received messages you have not seen before.

December 28, 2017
Michael Palimaka a.k.a. kensington (homepage, bugs)
Helper scripts for CMake-based ebuilds (December 28, 2017, 14:13 UTC)

I recently pushed two new scripts to the KDE overlay to help with CMake-based ebuild creation and maintenance.

cmake_dep_check inspects all CMakeLists.txt files under the current directory and maps find_package calls to Gentoo packages.

cmake_ebuild_check builds on cmake_dep_check and produces a list of CMake dependencies that are not present in the ebuild (and optionally vice versa).

If you find them useful or run into any issues, feel free to drop me a line.

December 24, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
Barclays and the single factor authentication (December 24, 2017, 21:04 UTC)

In my previous post on the topic I barely touched on one of the important reasons why I did not like Barclays at all. That was because I still had money in my account with them, and I wanted to make sure it was taken care of before lamenting further on the state of their security. As I have now managed to close my account, I can go on and discuss this further, even though I have already touched upon the major points.

Barclays’ online banking system relies heavily on what I would define as “single factor authentication”.

Usually, you define authentication factors as things you have or things you know. In the case of Barclays, the only thing they effectively rely upon is “access to the debit card”. Okay, technically you could say that this by itself is a two-factor system, as it requires access to the debit card and to its PIN. And since the EMV-CAP protocol they use for this factor executes directly on the chipcard, it is not susceptible to the usual PIN-stripping attacks that most card fraud with chip-and-pin cards relies on.

But this does not count for much when the PIN of the card they issued me was 7766 — and being able to lament about that is why I waited until I had closed the account and given them back the card. There seems to be a pattern of banks issuing “easy to remember” 4-digit PINs: XYYX, XXYY, etc. One of my previous (again, cancelled) cards had a PIN terribly easy to remember for a computerist, though not for the average person: 0016.

Side note: I have read someone suggesting to badly scribble a wrong PIN on the back of a card as a theft deterrent. Though I like the idea, I’m afraid the banks won’t like it anyway. It would also take some work to make the scribble easy to misread as different digits, so that a thief would burn the three attempts needed to block the card.

You access the Barclays online banking account through the Identify method provided by CAP, which means you put the card into the reader, provide the PIN, and get an 8-digit identifier that can be used to log in on the website. Since I’m no expert on how CAP works internally, I will only venture a guess that this is similar to a counter-based OTP, as the card has no access to a real-time clock, and no challenge is provided for this operation.
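
For reference, a counter-based OTP in the RFC 4226 style looks roughly like the sketch below. CAP’s actual internals are not public, so this is only an illustration of the guess above; hmac_sha1() is a stand-in for a real HMAC implementation.

#include <stddef.h>
#include <stdint.h>

extern void hmac_sha1(const uint8_t *key, size_t key_len,
                      const uint8_t *msg, size_t msg_len,
                      uint8_t out[20]);

/* RFC 4226-style code generation: HMAC over a big-endian counter,
   then "dynamic truncation" down to a few digits. */
uint32_t hotp(const uint8_t *key, size_t key_len, uint64_t counter)
{
    uint8_t msg[8], digest[20];
    for (int i = 7; i >= 0; i--) {  /* big-endian counter */
        msg[i] = counter & 0xff;
        counter >>= 8;
    }
    hmac_sha1(key, key_len, msg, sizeof msg, digest);
    unsigned off = digest[19] & 0x0f;
    uint32_t code = ((uint32_t)(digest[off] & 0x7f) << 24)
                  | ((uint32_t)digest[off + 1] << 16)
                  | ((uint32_t)digest[off + 2] << 8)
                  |  (uint32_t)digest[off + 3];
    return code % 100000000;        /* 8 digits, like the Identify code */
}

A card doing something like this would keep the counter internally and bump it on every use, which fits the observation that no challenge and no clock are involved.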

This account access sounds secure, but it’s really no more secure than a username and password, at least when it comes to dealing with phishing. You may think that producing a façade that shows the full Barclays login and proxies the responses in real time is a lot of work, but phishing tools are known for being flexible, and they don’t really need to reproduce the whole website, just the parts they care about getting data from. The rest can easily be proxied as-is without any further change, of course.

So what can attackers do once they fool someone into logging in to the bank? Well, they can’t really do much, as most of the actions require further CAP confirmation: wires, new standing orders, and so on and so forth. They can, though, get a lot of information about the victim, including enough proofs of address or identity to really mess with their life. It also makes it possible to cancel things like standing orders to pay rent, which would be quite messy for most people to deal with — although most phishing is done not to mess with people, but to get their money.

As I said, to send money you need to have access to the CAP codes. That includes having access not only to the device itself, but also to the card and the PIN. To execute those transactions, Barclays will ask you to sign a transaction by providing the CAP device with the account number and the amount to wire. This is good and hopefully pretty hard to tamper with (I make no guarantees about the implementation of CAP), so even if you’re acting through a proxy-phishing site, your wires are probably safe.

I say probably because, the way the challenge-response is implemented, only the 8-digit account number is used in the signature. If the phishers are attacking a victim they have studied for long enough, which may be the case when attacking businesses, they could know which account the victim pays manually every month, and set up an account with the same number at a different bank (different sort code). The signature would be valid for both.
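
To illustrate the weakness, here is a sketch of the signing input as the post describes it: only the 8-digit account number and the amount feed the device, so the sort code never enters the signature. build_challenge() and its format are hypothetical, not Barclays’ actual encoding.

#include <stdio.h>

/* hypothetical: whatever string the CAP device ends up signing */
static void build_challenge(char *out, size_t n,
                            const char *account8, const char *amount)
{
    /* note: there is no sort code parameter at all */
    snprintf(out, n, "%s:%s", account8, amount);
}

int main(void)
{
    char legit[32], forged[32];
    /* victim's usual payee: sort code 11-22-33, account 12345678 */
    build_challenge(legit, sizeof legit, "12345678", "500.00");
    /* attacker's account at another bank: sort code 99-88-77, same number */
    build_challenge(forged, sizeof forged, "12345678", "500.00");
    /* identical signing input, so the same signature authorizes both */
    printf("%s\n%s\n", legit, forged);
    return 0;
}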

To be fair to Barclays, implementing CAP fully, the way they did here, is actually more secure than what Ulster Bank (and, I assume, the rest of RBS Group) does with an opaque “challenge” token. While that token may encode more information, the fact that it’s opaque means there is no way for users to know whether what they are signing is indeed what they meant to sign.

Now, these mitigations are actually good. They require continuous access to the card on request, and that makes it very hard for phishers to just keep using the site in the background after the user has logged in. But they still rely on effectively a single factor. If someone gets hold of the card and the PIN (and we know at least some people will write the real one on the back of the card), then it’s game over. It’s like the locks on my flat’s door: two independent locks… except they use the same key. Sure, it’s a waste of time to pick both, so it increases the chances a neighbour would walk in on wannabe burglars trying to open the apartment door. But there’s a single key; I can’t use two separate keychains to make sure a thief would only grab one of the two, and if anyone gets it from me, well, it’s game over.

Of course Barclays knows this is not enough, so they include a risk engine. If something in a transaction doesn’t fit their profile of your activity, it’s considered risky and requires additional verification. This verification happens to come in the form of text messages. I will not suggest that the problem with these is GSM-layer attacks, as those are not (yet) in the hands of the type of criminals aiming at personal bank accounts, but there is at the very least the risk that a thief gets hold of my bag with both my card and my phone in it, so the only “factors” that are still in my head, rather than tied to physical objects, are the (provided) PIN of the card and the PIN of the phone.

This profile fitting is actually the main reason I got frustrated with Barclays: since I had just opened the account, most of my transactions were “exceptional”, and that is extremely annoying. This was compounded by the fact that my phone provider didn’t even let me receive SMS from the office, due to lack of coverage (now fixed), and the fact that, at least for wires, the Barclays UI does not warn you to check your phone!

There is also a problem with the way Barclays handles these “exceptional transactions”: debit card transactions are out-and-out rejected. The Verified by Visa screen tells you to check your phone, but the phone will only ask you whether it was your transaction or not, and after you confirm it was, it’ll ask you to “retry in a couple of minutes” — retrying too quickly will lead to the transaction being blocked by the processor directly, with a temporary card lock. The wire transfer flow will unblock the execution of the wire, which is good, but it can also push the wire past the cut-off time for non-“Faster Payments” wires.

Update (2017-12-30): since I did not make this very clear, I have added a note about this at the bottom of my new post: confirming these transactions only requires spoofing the sender, since the content and destination of the text message to send are known (it only has to say “Y”, and it always goes to the same service number). So this validation should not really count as a second authentication factor against a skilled attacker.

These are all the reasons for which I abandoned Barclays as fast as I could. Some of those are actually decent mitigation strategies, but the fact that they do not really increase security, while increasing inconvenience, makes me doubt the validity of their concerns and threat models.

December 23, 2017
Sergei Trofimovich a.k.a. slyfox (homepage, bugs)
GCC bit shift handling on ia64 (December 23, 2017, 00:00 UTC)


Over the past week I’ve spent quite some time poking at toolchain packages in Gentoo and fixing a fresh batch of bugs. The bugs popped up after the recent introduction of gcc profiles that enable PIE (position independent executables) and SSP (stack smash protected executables) by default in Gentoo. One would imagine SSP to be the more invasive of the two, as it changes the on-stack layout of things. But somehow it is PIE that yields bugs in every direction you throw it at:

  • [not really from last week] i386 static binaries SIGSEGV at startup: bug (TL;DR: calling vdso-implemented SYSENTER syscall entry point needs TLS to be already initialised in glibc. In this case brk() syscall was called before TLS was initialised.)
  • ia64’s glibc miscompiles librt.so: bug (Tl;DR: pie-by-default gcc tricked glibc into believing binutils supports IFUNC on ia64. It does not, exactly as in sparc bug 7 years ago.)
  • sparc’s glibc produces broken static binaries on pie-by-default gcc: bug. (TL;DR: C startup files for sparc were PIC-unfriendly)
  • powerpc’s gcc-7.2.0 and older produces broken binaries: bug (TL;DR: two bugs: binutils generates relocations to non-existent symbols, gcc hardcoded wrong C startup files to be used)
  • mips64 gcc fails to produce glibc that binutils can link without assert() failures (TODO: debug and fix)
  • m68k glibc fails to compile (TODO: debug and fix)
  • sh4 gcc fails to compile by crashing with stack smash detection (TODO: debug and fix)
  • more I forgot about

These are only toolchain bugs, and I have not yet tried things on arm. I’m sure other packages will uncover more interesting corner cases once we fix the toolchain.

I won’t write about any of the above here, but it’s a nice reference point if you want to backport something to older toolchains to get PIE/SSP working better, and the bug reports may be a useful guide on how to debug this kind of problem.

Today’s story is about a gcc code generation bug on ia64 that has nothing to do with PIE or SSP.

Bug

On #gentoo-ia64 Jason Duerstock asked whether openssl fails its tests on modern gcc-7.2.0; it used to pass with gcc-6.4.0 for him.

Having successfully dealt with the IFUNC bug recently, I was confident things were mostly working in ia64 land, and gave it a try.

The openssl DES tests were indeed failing, claiming that the encryption/decryption roundtrip did not yield the original data, with errors like:

des_ede3_cbc_encrypt decrypt error ...

What does the DES test do?

The test lives at https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/crypto/des/destest.c#L421 and looks like this (simplified):

printf("Doing cbcm\n");
DES_set_key_checked(&cbc_key, &ks);
DES_set_key_checked(&cbc2_key, &ks2);
DES_set_key_checked(&cbc3_key, &ks3);
i = strlen((char *)cbc_data) + 1;
DES_ede3_cbcm_encrypt(cbc_data, cbc_out, 16L, &ks, &ks2, &ks3, &iv3, &iv2, DES_ENCRYPT);
DES_ede3_cbcm_encrypt(&cbc_data[16], &cbc_out[16], i - 16, &ks, &ks2, &ks3, &iv3, &iv2, DES_ENCRYPT);
DES_ede3_cbcm_encrypt(cbc_out, cbc_in, i, &ks, &ks2, &ks3, &iv3, &iv2, DES_DECRYPT);
if (memcmp(cbc_in, cbc_data, strlen((char *)cbc_data) + 1) != 0) {
    printf("des_ede3_cbcm_encrypt decrypt error\n");
    // ...
}

The test encrypts a few bytes of data in two takes (1: encrypt the first 16 bytes, 2: encrypt the rest) and then decrypts the result in one go. It works on static data. Very straightforward.

Fixing variables

My first suspect was a gcc change, as 7.2.0 fails where 6.4.0 works, so I tried building openssl with lower optimisation levels:

# CFLAGS=-O1 FEATURES=test emerge -1 =dev-libs/openssl-1.0.2n
...
des_ede3_cbc_encrypt decrypt error ...

# CFLAGS=-O0 FEATURES=test emerge -1 =dev-libs/openssl-1.0.2n
...
OK!

Hah! At least -O0 seems to work. That means it will be easier to cross-check which optimisation pass affects code generation, and to find out whether it’s an openssl bug or something else.

Setting up A/B test

Debugging cryptographic algorithms like DES is very easy from this standpoint because they don’t need any external state: no services running, no files created, and the input data is not randomized. They are a pure function of their input bit stream(s).

I unpacked a single openssl source tree and started building it in two directories: one with -O0 optimisations, another with -O1:

~/openssl/openssl-1.0.2n-O0/
    `openssl-1.0.2n-.ia64/
        `crypto/des/destest.c (file-1)
        ...
~/openssl/openssl-1.0.2n-O1/
    `openssl-1.0.2n-.ia64/
        `crypto/des/destest.c (symlink to file-1)
        ...

Then I started sprinkling printf() statements in destest.c and other crypto/des/ files, ran destest from both trees, and diffed the text outputs to find where exactly the difference first appears.

Relatively quickly I nailed it down to the following trace:

unsigned int u;
...
# define D_ENCRYPT(LL,R,S) {\
LOAD_DATA_tmp(R,S,u,t,E0,E1); \
t=ROTATE(t,4); \
LL^=\
DES_SPtrans[0][(u>> 2L)&0x3f]^ \
DES_SPtrans[2][(u>>10L)&0x3f]^ \
DES_SPtrans[4][(u>>18L)&0x3f]^ \
DES_SPtrans[6][(u>>26L)&0x3f]^ \
DES_SPtrans[1][(t>> 2L)&0x3f]^ \
DES_SPtrans[3][(t>>10L)&0x3f]^ \
DES_SPtrans[5][(t>>18L)&0x3f]^ \
DES_SPtrans[7][(t>>26L)&0x3f]; }

See an error? Me neither.

Minimizing test

printf() debugging suggested that DES_SPtrans[6][(u>>26L)&0x3f] returns different data in the -O0 and -O1 cases. Namely, the following expression changed:

  • -O0: (u>>26L)&0x3f yielded 0x33
  • -O1: (u>>26L)&0x3f yielded 0x173

Note how it’s logically infeasible to get anything more than 0x3f from that expression. And yet here we are with our 0x173 value.

I spent some time deleting lines one by one from all the macros, as long as the result kept producing a diff. Removing lines is safe in most of the DES code because all the test does is flip a few bits and rotate them within the single unsigned int u local variable.

After a while I came up with the following minimal reproducer:

#include <stdio.h>

typedef unsigned int u32;

u32 bug (u32 * result) __attribute__((noinline));
u32 bug (u32 * result)
{
    // non-static and volatile to inhibit constant propagation
    volatile u32 ss = 0xFFFFffff;
    volatile u32 d = 0xEEEEeeee;
    u32 tt = d & 0x00800000;
    u32 r = tt << 8;
    // rotate
    r = (r >> 31)
      | (r << 1);
    u32 u = r^ss;
    u32 off = u >> 1;
    // seemingly unrelated but bug-triggering side-effect
    *result = tt;
    return off;
}

int main() {
    u32 l;
    u32 off = bug(&l);
    printf ("off>>: %08x\n", off);
    return 0;
}
$ gcc -O0 a.c -o a-O0 && ./a-O0 > o0
$ gcc -O1 a.c -o a-O1 && ./a-O1 > o1
$ diff -U0 o0 o1

-off>>: 7fffffff
+off>>: ffffffff

The test itself is very straightforward: it does only 32-bit arithmetic on unsigned int values and prints the result. It is also a very fragile test: if you try to remove seemingly unrelated code like *result = tt;, the bug disappears.

What gcc actually does

I tried looking at the assembly code: if I could spot an obvious problem, I could then inspect intermediate gcc steps to get an idea of which pass precisely introduces the wrong code. Despite this being Itanium, the code is not that complicated (I added detailed comments):

Dump of assembler code for function bug:
mov r14=r12 ; r12: register holding stack pointer
; r14 = r12 (new temporary variable)
mov r15=-1 ; r15=0xFFFFffff ('ss' variable)
st4.rel [r14]=r15,4 ; write 'ss' on stack address 'r14 - 4'
movl r15=0xffffffffeeeeeeee ; r15=0xEEEEeeee ('d' variable)
st4.rel [r14]=r15 ; write 'd' on stack address 'r14'
ld4.acq r15=[r14] ; and quickly read 'd' back into 'r15' :)
movl r14=0x800000 ; r14=0x00800000
and r15=r14,r15 ; u32 tt = d & 0x00800000;
; doing u32 r = tt << 8;
dep.z r14=r15,8,24 ; "deposit" 24 bits from r15 into r14
; starting at offset 8 and zeroing the rest.
; Or in pictures (with offsets):
; r15 = 0xAABBCCDD11223344
; r14 = 0x0000000022334400
ld4.acq r8=[r12] ; read 'ss' into r8
st4 [r32]=r15 ; *result = tt
; // rotate
; r = (r >> 31)
; | (r << 1);
mix4.r r14=r14,r14 ; This one is tricky: mix4.r duplicates the lower
; 32 bits into both the lower and upper 32 bits of r14.
; Or in pictures:
; r14 = 0xAABBCCDDF1223344 ->
; r14 = 0xF1223344F1223344
shr.u r14=r14,31 ; And shift right by 31 bits (with zero padding)
; r14 = 0x00000001E2446688
; Note how '1' is in a position of 33-th bit
xor r8=r8,r14 ; u32 u = r^ss;
extr.u r8=r8,1,32 ; u32 off = u >> 1
; "extract" 32 bits at offset 1 from r8 and put them to r8
; Or in pictures:
; r8 = 0x0000000100000000 -> (note how all lower 32 bits are 0)
; r8 = 0x0000000080000000
br.ret.sptk.many b0 ; return r8 as a computation result

Tl;DR: extr.u r8=r8,1,32 extracts 32 bits at offset 1 from the 64-bit register r8 and puts them back into r8. The problem is that for u32 off = u >> 1 to work correctly it should extract only 31 bits, not 32.

If I patch the assembly file to contain extr.u r8=r8,1,31 instead, the sample works correctly.
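
To see why that one-bit difference matters, here is the extract modelled in C, with the value that ends up in r8 when you work the test’s constants through the instructions above by hand (the rotate trick leaves bit 32 set, so the upper half of the register is not zero):

#include <stdint.h>
#include <stdio.h>

/* C model of ia64 extr.u: take `len` bits of `src` starting at bit
   `pos`, zero-extended */
static uint64_t extr_u(uint64_t src, unsigned pos, unsigned len)
{
    return (src >> pos) & ((len >= 64) ? ~0ULL : ((1ULL << len) - 1));
}

int main(void)
{
    uint64_t r8 = 0x00000001FFFFFFFEULL; /* register value after the xor */

    printf("extr.u r8,1,31: %08llx\n",   /* 7fffffff: the correct -O0 answer */
           (unsigned long long)extr_u(r8, 1, 31));
    printf("extr.u r8,1,32: %08llx\n",   /* ffffffff: the -O1 bug */
           (unsigned long long)extr_u(r8, 1, 32));
    return 0;
}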

At this point I shared my sample with Jason and filed a gcc bug, as it was clear the compiler was doing something very unexpected. Jason pulled in James, and they produced a working gcc patch for me while I was having dinner(!) :)

gcc passes

What surprised me is that gcc recognised the “rotate” pattern and used the clever mix4.r/shr.u trick to achieve the bit rotation effect.
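
In C terms, my reading of that trick is the sketch below: duplicate the low 32 bits across the whole 64-bit register, then do a single unsigned right shift by 31.

#include <stdint.h>

/* 32-bit rotate-left-by-1 the way the generated ia64 code does it */
static uint64_t rotl1_via_mix(uint32_t r)
{
    uint64_t mixed = ((uint64_t)r << 32) | r; /* mix4.r: duplicate low 32 bits */
    return mixed >> 31;                       /* shr.u: rotated value in low half */
}

For r = 0x80000000 this returns 0x0000000100000001: the low 32 bits are the correct rotation, but bit 32 stays set, and that stray bit is exactly what the 32-bit extract later trips over.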

Internally gcc has a few frameworks to perform optimisations:

  • high-level, (relatively) target-independent “tree” optimisations (working on the GIMPLE representation)
  • low-level, target-specific “register” optimisations (working on the RTL representation)

GIMPLE passes run before RTL passes. Let’s check how many passes are run on our small sample by using -fdump-tree-all and -fdump-rtl-all:

$ ia64-unknown-linux-gnu-gcc -O1 -fdump-tree-all -fdump-rtl-all -S a.c && ls -1 | nl
1   a.c
2   a.c.001t.tu
3   a.c.002t.class
4   a.c.003t.original
5   a.c.004t.gimple
6   a.c.006t.omplower
7   a.c.007t.lower
8   a.c.010t.eh
9   a.c.011t.cfg
10  a.c.012t.ompexp
11  a.c.019t.fixup_cfg1
12  a.c.020t.ssa
13  a.c.022t.nothrow
14  a.c.027t.fixup_cfg3
15  a.c.028t.inline_param1
16  a.c.029t.einline
17  a.c.030t.early_optimizations
18  a.c.031t.objsz1
19  a.c.032t.ccp1
20  a.c.033t.forwprop1
21  a.c.034t.ethread
22  a.c.035t.esra
23  a.c.036t.ealias
24  a.c.037t.fre1
25  a.c.039t.mergephi1
26  a.c.040t.dse1
27  a.c.041t.cddce1
28  a.c.046t.profile_estimate
29  a.c.047t.local-pure-const1
30  a.c.049t.release_ssa
31  a.c.050t.inline_param2
32  a.c.086t.fixup_cfg4
33  a.c.091t.ccp2
34  a.c.094t.backprop
35  a.c.095t.phiprop
36  a.c.096t.forwprop2
37  a.c.097t.objsz2
38  a.c.098t.alias
39  a.c.099t.retslot
40  a.c.100t.fre3
41  a.c.101t.mergephi2
42  a.c.105t.dce2
43  a.c.106t.stdarg
44  a.c.107t.cdce
45  a.c.108t.cselim
46  a.c.109t.copyprop1
47  a.c.110t.ifcombine
48  a.c.111t.mergephi3
49  a.c.112t.phiopt1
50  a.c.114t.ch2
51  a.c.115t.cplxlower1
52  a.c.116t.sra
53  a.c.118t.dom2
54  a.c.120t.phicprop1
55  a.c.121t.dse2
56  a.c.122t.reassoc1
57  a.c.123t.dce3
58  a.c.124t.forwprop3
59  a.c.125t.phiopt2
60  a.c.126t.ccp3
61  a.c.127t.sincos
62  a.c.129t.laddress
63  a.c.130t.lim2
64  a.c.131t.crited1
65  a.c.134t.sink
66  a.c.138t.dce4
67  a.c.139t.fix_loops
68  a.c.167t.no_loop
69  a.c.170t.veclower21
70  a.c.172t.printf-return-value2
71  a.c.173t.reassoc2
72  a.c.174t.slsr
73  a.c.178t.dom3
74  a.c.182t.phicprop2
75  a.c.183t.dse3
76  a.c.184t.cddce3
77  a.c.185t.forwprop4
78  a.c.186t.phiopt3
79  a.c.187t.fab1
80  a.c.191t.dce7
81  a.c.192t.crited2
82  a.c.194t.uncprop1
83  a.c.195t.local-pure-const2
84  a.c.226t.nrv
85  a.c.227t.optimized
86  a.c.229r.expand
87  a.c.230r.vregs
88  a.c.231r.into_cfglayout
89  a.c.232r.jump
90  a.c.233r.subreg1
91  a.c.234r.dfinit
92  a.c.235r.cse1
93  a.c.236r.fwprop1
94  a.c.243r.ce1
95  a.c.244r.reginfo
96  a.c.245r.loop2
97  a.c.246r.loop2_init
98  a.c.247r.loop2_invariant
99  a.c.249r.loop2_doloop
100 a.c.250r.loop2_done
101 a.c.254r.dse1
102 a.c.255r.fwprop2
103 a.c.256r.auto_inc_dec
104 a.c.257r.init-regs
105 a.c.259r.combine
106 a.c.260r.ce2
107 a.c.262r.outof_cfglayout
108 a.c.263r.split1
109 a.c.264r.subreg2
110 a.c.267r.asmcons
111 a.c.271r.ira
112 a.c.272r.reload
113 a.c.273r.postreload
114 a.c.275r.split2
115 a.c.279r.pro_and_epilogue
116 a.c.280r.dse2
117 a.c.282r.jump2
118 a.c.286r.ce3
119 a.c.288r.cprop_hardreg
120 a.c.289r.rtl_dce
121 a.c.290r.bbro
122 a.c.296r.alignments
123 a.c.298r.mach
124 a.c.299r.barriers
125 a.c.303r.shorten
126 a.c.304r.nothrow
127 a.c.306r.final
128 a.c.307r.dfinish
129 a.c.308t.statistics
130 a.s

128 passes! Quite a few. The t letter after the pass number marks a “tree” pass, r an “rtl” pass. Note how all “tree” passes precede the “rtl” ones.

The last “tree” pass outputs the following:

;; cat a.c.227t.optimized
;; Function bug (bug, funcdef_no=23, decl_uid=2078, cgraph_uid=23, symbol_order=23)

__attribute__((noinline))
bug (u32 * result)
{
  u32 off;
  u32 u;
  u32 r;
  u32 tt;
  volatile u32 d;
  volatile u32 ss;
  unsigned int d.0_1;
  unsigned int ss.1_2;

  <bb 2> [100.00%]:
  ss ={v} 4294967295;
  d ={v} 4008636142;
  d.0_1 ={v} d;
  tt_6 = d.0_1 & 8388608;
  r_7 = tt_6 << 8;
  r_8 = r_7 r>> 31;
  ss.1_2 ={v} ss;
  u_9 = ss.1_2 ^ r_8;
  off_10 = u_9 >> 1;
  *result_11(D) = tt_6;
  return off_10;

}



;; Function main (main, funcdef_no=24, decl_uid=2088, cgraph_uid=24, symbol_order=24) (executed once)

main ()
{
  u32 off;
  u32 l;

  <bb 2> [100.00%]:
  off_3 = bug (&l);
  __printf_chk (1, "off>>: %08x\n", off_3);
  l ={v} {CLOBBER};
  return 0;

}

The code does not differ much from what was originally written, because I specifically tried to write something that won’t trigger any high-level transformations.

RTL dumps are even more verbose. I’ll show an incomplete snippet of the bug() function from the a.c.229r.expand file:

;;
;; Full RTL generated for this function:
;;
(note 1 0 4 NOTE_INSN_DELETED)
(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 4 3 2 (set (reg/v/f:DI 347 [ result ])
(reg:DI 112 in0 [ result ])) "a.c":5 -1
(nil))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI 348)
(const_int -1 [0xffffffffffffffff])) "a.c":7 -1
(nil))
(insn 7 6 8 2 (set (mem/v/c:SI (reg/f:DI 335 virtual-stack-vars) [1 ss+0 S4 A128])
(reg:SI 348)) "a.c":7 -1
(nil))
(insn 8 7 9 2 (set (reg:DI 349)
(reg/f:DI 335 virtual-stack-vars)) "a.c":8 -1
(nil))
(insn 9 8 10 2 (set (reg/f:DI 350)
(plus:DI (reg/f:DI 335 virtual-stack-vars)
(const_int 4 [0x4]))) "a.c":8 -1
(nil))
(insn 10 9 11 2 (set (reg:SI 351)
(const_int -286331154 [0xffffffffeeeeeeee])) "a.c":8 -1
(nil))
(insn 11 10 12 2 (set (mem/v/c:SI (reg/f:DI 350) [1 d+0 S4 A32])
(reg:SI 351)) "a.c":8 -1
(nil))
(insn 12 11 13 2 (set (reg:DI 352)
(reg/f:DI 335 virtual-stack-vars)) "a.c":9 -1
(nil))
(insn 13 12 14 2 (set (reg/f:DI 353)
(plus:DI (reg/f:DI 335 virtual-stack-vars)
(const_int 4 [0x4]))) "a.c":9 -1
(nil))
(insn 14 13 15 2 (set (reg:SI 340 [ d.0_1 ])
(mem/v/c:SI (reg/f:DI 353) [1 d+0 S4 A32])) "a.c":9 -1
(nil))
(insn 15 14 16 2 (set (reg:DI 355)
(const_int 8388608 [0x800000])) "a.c":9 -1
(nil))
(insn 16 15 17 2 (set (reg:DI 354)
(and:DI (subreg:DI (reg:SI 340 [ d.0_1 ]) 0)
(reg:DI 355))) "a.c":9 -1
(nil))
(insn 17 16 18 2 (set (reg/v:SI 342 [ tt ])
(subreg:SI (reg:DI 354) 0)) "a.c":9 -1
(nil))
(insn 18 17 19 2 (set (reg/v:SI 343 [ r ])
(ashift:SI (reg/v:SI 342 [ tt ])
(const_int 8 [0x8]))) "a.c":10 -1
(nil))
(insn 19 18 20 2 (set (reg:SI 341 [ ss.1_2 ])
(mem/v/c:SI (reg/f:DI 335 virtual-stack-vars) [1 ss+0 S4 A128])) "a.c":16 -1
(nil))
(insn 20 19 21 2 (set (mem:SI (reg/v/f:DI 347 [ result ]) [1 *result_11(D)+0 S4 A32])
(reg/v:SI 342 [ tt ])) "a.c":20 -1
(nil))
(insn 21 20 22 2 (set (reg:SI 357 [ r ])
(rotate:SI (reg/v:SI 343 [ r ])
(const_int 1 [0x1]))) "a.c":13 -1
(nil))
(insn 22 21 23 2 (set (reg:DI 358)
(xor:DI (subreg:DI (reg:SI 357 [ r ]) 0)
(subreg:DI (reg:SI 341 [ ss.1_2 ]) 0))) "a.c":16 -1
(nil))
(insn 23 22 24 2 (set (reg:DI 359)
(zero_extract:DI (reg:DI 358)
(const_int 31 [0x1f])
(const_int 1 [0x1]))) "a.c":17 -1
(nil))
(insn 24 23 25 2 (set (subreg:DI (reg:SI 356 [ off ]) 0)
(reg:DI 359)) "a.c":17 -1
(nil))
(insn 25 24 29 2 (set (reg/v:SI 346 [ <retval> ])
(reg:SI 356 [ off ])) "a.c":21 -1
(nil))
(insn 29 25 30 2 (set (reg/i:SI 8 r8)
(reg/v:SI 346 [ <retval> ])) "a.c":22 -1
(nil))
(insn 30 29 0 2 (use (reg/i:SI 8 r8)) "a.c":22 -1
(nil))

The above is RTL representation of our bug() function. S-expressions look like machine instructions but not quite.

For example, the following snippet (a single S-expression) introduces a new virtual register 351, which should receive the literal value 0xeeeeeeee. SI means SImode, a 32-bit integer mode (the gcc internals documentation has more on modes):

(insn 10 9 11 2 (set (reg:SI 351)
(const_int -286331154 [0xffffffffeeeeeeee])) "a.c":8 -1
(nil))

Or just r351 = 0xeeeeeeee :) Note it also mentions source file line numbers, which is useful when mapping RTL logs back to source files (and I guess gcc uses the same information to emit dwarf debugging sections).

Another example of RTL instruction:

(insn 22 21 23 2 (set (reg:DI 358)
(xor:DI (subreg:DI (reg:SI 357 [ r ]) 0)
(subreg:DI (reg:SI 341 [ ss.1_2 ]) 0))) "a.c":16 -1
(nil))

Here we see the RTL equivalent of the u_9 = ss.1_2 ^ r_8; GIMPLE code. There is a subtlety here: the xor itself operates on 64-bit integers (DI) that contain 32-bit values in their lower 32 bits (subregs), which I don’t fully understand.

The upstream bug tries to pin down which assumption is violated: one in a generic RTL optimisation pass, or one in the backend implementation. At the time of writing this post, a few candidate patches had been posted to address the bug.

gcc RTL optimisations

I was interested in the optimisation that converted extr.u r8=r8,1,31 (valid) to extr.u r8=r8,1,32 (invalid).

Its GIMPLE representation is:

// ...
off_10 = u_9 >> 1;
*result_11(D) = tt_6;
return off_10;

Let’s try to find the RTL representation of this construct in the very first RTL dump (the a.c.229r.expand file):

(insn 23 22 24 2 (set (reg:DI 359)
(zero_extract:DI (reg:DI 358)
(const_int 31 [0x1f])
(const_int 1 [0x1]))) "a.c":17 -1
(nil))
;; ...
(insn 24 23 25 2 (set (subreg:DI (reg:SI 356 [ off ]) 0)
(reg:DI 359)) "a.c":17 -1
(nil))
(insn 25 24 29 2 (set (reg/v:SI 346 [ <retval> ])
(reg:SI 356 [ off ])) "a.c":21 -1
(nil))
(insn 29 25 30 2 (set (reg/i:SI 8 r8)
(reg/v:SI 346 [ <retval> ])) "a.c":22 -1
(nil))
(insn 30 29 0 2 (use (reg/i:SI 8 r8)) "a.c":22 -1
(nil))

Or in a slightly more concise form:

reg359 = zero_extract(reg358, offset=1, length=31)
reg356 = (u32)reg359;
reg346 = (u32)reg356;
reg8   = (u32)reg346;

Very straightforward (and still correct). zero_extract is an RTL virtual operation that will have to be expressed as a real instruction at some point.

In the absence of other optimisations, that is done with the following rule (at https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/ia64/ia64.md;h=b7cd52ba366ba3d63e98479df5f4be44ffd17ca6;hb=HEAD#l1381):

(define_insn "extzv"
[(set (match_operand:DI 0 "gr_register_operand" "=r")
(zero_extract:DI (match_operand:DI 1 "gr_register_operand" "r")
(match_operand:DI 2 "extr_len_operand" "n")
(match_operand:DI 3 "shift_count_operand" "M")))]
""
"extr.u %0 = %1, %3, %2"
[(set_attr "itanium_class" "ishf")])

In the unoptimised case we can see all the intermediate assignments being stored through a stack address and loaded back:

;; gcc -O0 a.c
...
extr.u r15 = r15, 1, 31
;;
st4 [r14] = r15
mov r14 = r2
;;
ld8 r14 = [r14]
adds r16 = -32, r2
;;
ld4 r15 = [r16]
;;
st4 [r14] = r15
adds r14 = -20, r2
;;
ld4 r14 = [r14]
;;
mov r8 = r14
.restore sp
mov r12 = r2
br.ret.sptk.many b0

To get rid of all the needless operations, RTL applies an extensive list of optimisations. Each RTL dump contains a summary of the pass’s effect on the file. Let’s look at a.c.232r.jump:

Deleted 2 trivially dead insns
3 basic blocks, 2 edges.

Around the a.c.257r.init-regs pass our RTL representation has shrunk to the following:

(insn 23 22 24 2 (set (reg:DI 359)
(zero_extract:DI (reg:DI 358)
(const_int 31 [0x1f])
(const_int 1 [0x1]))) "a.c":17 159 {extzv}
(expr_list:REG_DEAD (reg:DI 358)
(nil)))
(insn 24 23 29 2 (set (subreg:DI (reg:SI 356 [ off ]) 0)
(reg:DI 359)) "a.c":17 6 {movdi_internal}
(expr_list:REG_DEAD (reg:DI 359)
(nil)))
(insn 29 24 30 2 (set (reg/i:SI 8 r8)
(reg:SI 356 [ off ])) "a.c":22 5 {movsi_internal}
(expr_list:REG_DEAD (reg:SI 356 [ off ])
(nil)))
(insn 30 29 0 2 (use (reg/i:SI 8 r8)) "a.c":22 -1
(nil))

Or in a slightly more concise form:

reg359 = zero_extract(reg358, offset=1, length=31)
reg356 = (u32)reg359;
reg8   = (u32)reg356;

Note how reg346 = (u32)reg356; is already gone.

Most of the magic happens in a single pass: a.c.259r.combine. Here is its report:

;; Function bug (bug, funcdef_no=23, decl_uid=2078, cgraph_uid=23, symbol_order=23)

starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
insn_cost 2: 4
insn_cost 6: 4
insn_cost 32: 4
insn_cost 7: 4
insn_cost 10: 4
insn_cost 11: 4
insn_cost 14: 4
insn_cost 15: 4
insn_cost 16: 4
insn_cost 18: 4
insn_cost 19: 4
insn_cost 20: 4
insn_cost 21: 4
insn_cost 22: 4
insn_cost 23: 4
insn_cost 24: 4
insn_cost 29: 4
insn_cost 30: 0
allowing combination of insns 2 and 20
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 2.
modifying insn i3    20: [in0:DI]=r354:DI#0
      REG_DEAD in0:DI
      REG_DEAD r354:DI
deferring rescan insn with uid = 20.
allowing combination of insns 23 and 24
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 23.
modifying insn i3    24: r356:SI#0=r358:DI 0>>0x1
  REG_DEAD r358:DI
deferring rescan insn with uid = 24.
allowing combination of insns 24 and 29
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 24.
modifying insn i3    29: r8:DI=zero_extract(r358:DI,0x20,0x1)
      REG_DEAD r358:DI
deferring rescan insn with uid = 29.
starting the processing of deferred insns
rescanning insn with uid = 20.
rescanning insn with uid = 29.
ending the processing of deferred insns

The essential output relevant to our instruction is:

allowing combination of insns 23 and 24
allowing combination of insns 24 and 29

This combines three instructions:

reg359 = zero_extract(reg358, offset=1, length=31)
reg356 = (u32)reg359;
reg8   = (u32)reg356;

Into one:

reg8 = zero_extract(reg358, offset=1, length=32)

The problem is in the combiner, which decided that the higher bit is zero anyway (a false assumption) and applied zero_extract to the full 32 bits. Why exactly it decided the upper 32 bits are zero (they are not) is not clear to me :) It happens somewhere in https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/combine.c;h=31e6a4f68254fab551300252688a52d8c3dcaaa4;hb=HEAD#l2628 where some of the if() statements take about a page. Hopefully the gcc RTL and ia64 maintainers will help and guide us here.

Parting words

  • Having good, robust tests in packages is always great, as it helps gain confidence that a package is not completely broken on new target platforms (or toolchain versions).
  • printf() is still a good enough tool to trace compiler bugs :)
  • -fdump-tree-all and -fdump-rtl-all are useful gcc flags to get an idea of why optimisations fire (or don’t).
  • gcc performs very unobvious optimisations!

Have fun!


December 22, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Talks page (December 22, 2017, 13:39 UTC)

I finally decided to put up a page on this website referencing the talks I’ve been giving over the last few years.

You’ll see that I’m quite obsessed with fault tolerance and scalability designs… I guess I can’t deny it any more 🙂

December 21, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
New, new gamestation (December 21, 2017, 00:04 UTC)

Full disclosure: the following post has a significantly higher number of Amazon Affiliates links than usual. That’s because I am talking about the hardware I just bought, and this post counts just as much as an update on my hardware as a recommendation of the stuff I bought. I have not been hacked or bought out by anyone.

As I noted in my previous quick update, my gamestation went missing in the move. I would even go as far as to say that it was stolen, but I have no way to prove whether it was stolen by the movers or during the move. This meant that I needed to get a new computer for my photo editing hobby, which meant more money spent, and still no news from the insurance. But oh well.

As I did two years ago, I want to post here the current hardware setup I have. You’ll notice a number of similarities with the previous configuration, because I decided to stick as much as possible to what I had before, which worked.

  • CPU: Intel Core i7 7820X, which still has a nice 3.6GHz base clock, and has more cores than I had before.
  • Motherboard: MSI X299 SLI PLUS. You may remember that I had problems with the ASUS motherboard.
  • Memory: 8×Crucial 16GB DDR4.
  • Case: Fractal Design Define S, as I really like the designs of Fractal Design (pun not intended), and I do not need the full cage or the optical disk bays for sure this time around.
  • CPU cooler: NZXT Kraken X52, because the 280mm version appears to be more aimed towards extreme overclockers than my normal usage; this way I had more leeway on how to mount the radiator in.
  • SSD: 2×Crucial MX300 M.2 SATA. While I liked the Samsung 850 EVO, the performance of the MX300 appears to be effectively the same, and this allowed me to get the M.2 version, leaving more space if I need to extend this further.
  • HDD: Toshiba X300 5TB because there is still need for spinning rust to archive data that is “at rest”.
  • GPU: Zotac GeForce GTX 1080Ti 11GB, because since I’m spending money I may just as well buy a top of the line card and be done with it.
  • PSU: Corsair RM850i, for the first time in years betraying beQuiet! as they didn’t have anything in stock at the time I ordered this.

This is the configuration in the chassis, but that ended up not being enough. In particular, because of my own stupidity, I ended up having to replace my beloved Dell U2711 monitor. I really liked my UltraSharp, but earlier this year I damaged the DisplayPort input on it — friends don’t let friends use DisplayPort cables with hooks on them! Get those without, for extra safety, particularly if you have monitor arms or standing desks! Because of this I had been using a DVI-D DualLink cable instead. Unfortunately my new video card (and most new video cards I have seen) no longer has DVI ports, preferring instead multiple DisplayPort and (not even always) HDMI outputs. The UltraSharp, unfortunately, does not support 2560x1440 output over HDMI, and the DisplayPort-to-DVI adapter in the box is only for SingleLink DVI, which is not fast enough for that resolution either. DualLink DVI adapters exist, but for the most part they are “active” converters that require a power supply and more cables, and are not cheap (I have seen a “cheap” one for £150!)

I ended up buying a new monitor too, and I settled for the BenQ BL2711U, a 27 inches, 4k, 10-bit monitor “for designers” that boasts a 100% sRGB coverage. This is not my first BenQ monitor; a few months ago I bought a BenQ BL2420PT, a 24 inches monitor “for designers” that I use for both my XPS and for my work laptop, switching one and the other as needed over USB-C, and I have been pretty happy with it altogether.

Unfortunately the monitor came with DisplayPort cables with hooks, once again, so at first I decided to connect it over HDMI instead. That was a big mistake, for multiple reasons. The first is that calibrating it with the ColorMunki showed a huge gap between the uncalibrated and calibrated colours. The second was that, when I went to look into it, I could not enable 10-bit (10 bpc) mode in the NVIDIA display settings.

Repeat after me: if you want to use a BL-series BenQ monitor for photography you should connect it using DisplayPort.

The two problems were solved by switching to DisplayPort (temporarily with hooks; a proper cable is already on order): 10 bpc mode is not available over HDMI when using 4k resolution at 60Hz. HDMI 2 can do 4k and 10-bit (HDR), but only at a lower framerate, which makes it fine for watching HDR movies and streaming, but not good for photo editing. The problem with the calibration was the same one I had noticed, but couldn’t be bothered to figure out how to fix, on my laptops: some of the gray highlighting of text would not actually be visible. For whatever reason, BenQ’s “designer” monitors ship with the HDMI colour range set to limited (16-235) rather than full (0-255). Why did they do that? I have no idea. Indeed, switching the monitor to sRGB mode, full range, made the calibration effectively unnecessary (I still calibrated it out of nitpickery), and switching to DisplayPort removes the whole question of whether it should use limited or full range.
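
For the curious, this is the standard limited-to-full expansion in a nutshell (my own illustration, nothing BenQ- or NVIDIA-specific): when one side applies it and the other does not, nearby levels at the extremes collapse together, which is how subtle gray highlights vanish.

#include <stdint.h>

/* expand a limited-range (16..235) video level to full range (0..255) */
static uint8_t limited_to_full(uint8_t v)
{
    int x = ((int)v - 16) * 255 / 219;
    if (x < 0)   x = 0;   /* everything below 16 is crushed to black */
    if (x > 255) x = 255; /* everything above 235 is clipped to white */
    return (uint8_t)x;
}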

While the BenQ monitors have fairly decent integrated speakers, which make a soundbar unnecessary for hearing system notifications or chatting with my mother, they are not the greatest option to play games on. So I ended up getting a pair of Bose Companion 2 speakers, which are more than enough for what I need them for.

Now I have an overly powerful computer, and a very nice looking monitor. How do I connect them to the Internet? Well, here’s the problem: the Hyperoptic socket is in the living room, way too far from my computer to be useful. I could have just put a random WiFi adapter on it, but I also needed a new router anyway, since the box with my fairly new Linksys also got lost in the moving process.

So upon suggestion from a friend, and a recommendation from Troy Hunt I ended up getting a UAP-AC-PRO for the living room, and a UAP-AC-LITE for the home office, topped it with an EdgeRouter X (the recommendation of which was rescinded afterwards, but it seems to do its job for now), and set them as a bridge between the two locations. I think I should write down networking notes later, but Troy did that already so why bother?

So at the end of this whole thing I spent way more money on hardware than I planned to, I got myself a very nice new computer, and I have way too many extra cables, plus the whole set of odds and ends of the old computer, router and scanner that are no longer useful (I still have the antennas for the router, and the power supply for the scanner). And I’m still short of the document scanner, which is a bit of a pain because I now have a collection of documents that need scanning. I could use the office’s scanners, but those don’t run OCR on the documents, and I have not seen anything decent to apply OCR to PDFs after the fact. I’m open to suggestions, as I’m not sure I’m keen on ending up buying something like the EPSON DS-310 just for the duplex scanning and the OCR software.

December 18, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
UK Banking, Attempt 2: Fineco Bank (December 18, 2017, 00:04 UTC)

So after a fairly negative experience with Barclays, I have been quickly looking for alternatives. Two acquaintances who don’t know each other both suggested I look into Fineco, an Italian bank that also operates in the United Kingdom. As you can tell from their website, their focus is on trading and traders, but it turns out they also make a fairly decent bank in and of themselves.

Indeed, opening the account with Fineco was fairly straightforward: a few online forms, uploading documents to their identity verification system (very similar to what Revolut does, except using an Italian company that I already knew and was a customer of), and then sending £1 from a bank account already open in your name. I also found the forms technically well-designed, particularly the fact that all the “I agree to” checkboxes automatically trigger JavaScript downloads of the PDFs with the terms agreed to, whether you clicked to read the agreement or not — I guess it’s a «No excuse, you have a copy of this» protection on their side, but it also made it very easy to archive all the needed information together with everything else I keep.

I should note here that it looks like Fineco’s target audience is Italian expats in the UK explicitly. It is common for most services to “special case” their local country as the first entry in the country drop-down, and then add the rest in alphabetical order. In the case of Fineco, the drop-down started with United Kingdom and Italy for all the options.

One of the good things about this bank being so focused on trading is that the account is by default a multicurrency one, similar to the TransferWise Borderless Account. Indeed, in addition to the primary Sterling account, Fineco sets you up right away with accounts in Euro, Swiss Francs, and US Dollars, all connected to the same login. In addition to this, they offer you the choice between a Sterling debit card, a Euro credit card, or both (for a reasonable fee of £10/yr). The two cards are connected to their respective currency accounts (no card is available for Francs or Dollars), and neither has foreign transaction fees. While Revolut mostly took care of my foreign transaction fees, it’s always good to have a local debit card with much higher availability, particularly as ATM access for Revolut has a relatively low monthly limit.

One of the interesting details of these currency accounts is that they all have Italian IBAN and BIC (with a separate SWIFT routing number, of its parent group UniCredit). For the main Sterling account, UK-style Sort Code and Account Number are available, which make it a proper local account.

This is actually very useful for me: for the past four years I have been keeping my old Italian account open, despite it costing me a fair bit of money just in service, because I have been paying the utilities for my mother’s house. And despite SEPA Direct Debit having been introduced over two years ago, the utilities I contacted failed to let me debit a foreign (Irish) account. Since I left Ireland, and the UK is not a Euro country, I was afraid I would have to keep my Italian account open even longer, but this actually solved the problem: for Italian utilities, the account is a perfectly valid Italian account, as for the most part they don’t even validate the billing address.

An aside: Vodafone Italy and Wind 3 Italy are still attached to my Tesco credit card, which Tesco Bank assures me I can keep using as long as I direct debit it into an Euro account anywhere. They even changed my mailing address to my new apartment in London. Those two companies insist that they only ever accept Italian credit cards, but they accepted my Irish credit card just fine before; in the case of Vodafone, they have an explicit whitelist of the BIN (for whatever reason), while Wind couldn’t get a hold of the concept that the card is Irish at all. Oh well.

Speaking of direct debits and odd combinations, while I should have now managed to switch all the utilities, including the council tax, to direct debit on this new account, I had some trouble doing the setup with Thameswater, the water provider in my area. If I tried setting up the direct debit online, it would report Fineco’s sort code (30-02-48) as invalid. The Sort Code Checker provided by the category association says it’s valid and it works for everything beside the cheque and credit clearing (which is unneeded). I ended up having to call them and ask them to override the warning, but they have not sent me confirmation that they managed. This appears to be a common “feature” of Thameswater — oh and by the way their paper form to request the direct debit was a 404 response on their website. Sigh.

The UI of the bank (and of their app) is much more information-dense than any other bank’s I’ve ever used. That is no surprise when you consider that their target audience is investors and traders. It works well for me, but I can see how this would not be the most pleasing interface for most home users. The only feature I have been unable to find in the interface yet is how to set up standing orders – I contacted them this weekend and will see what they say – so for the moment I just set up a few months’ worth of rent as scheduled payments, which works just as well for now.

The Android app supports fingerprint authentication (unlike Barclays’) and does not come with its own NFC payment system. Unfortunately the debit cards also appear not to be enabled for Android Pay, which is a bit of a shame. They also don’t leverage the app to send notifications, but they do send free SMS for new offline1 transactions happening on the debit card, which is great.

All in all, I may have found the bank I was looking for. It’s not a “cuddly” bank, but it appears to have what I need and to work for my needs. With a bit of luck, it means that by Q1 I’ll be done with all the other bank accounts in both Ireland and Italy, and it’ll finally be simpler to keep an eye on how much money I have and how much of it is spent around the place (although GnuCash does help a bit there). I’ll keep you all posted if this changes.


  1. Confusingly enough, a transaction happening over the Internet is an “offline” transaction. Online/offline refers to the chip of chip’n’pin cards: if the chip is connected to a terminal that is in turn connected to the bank, that’s an online transaction; otherwise it’s offline. If you read or type the number manually, it’s also offline.

December 16, 2017
Sergei Trofimovich a.k.a. slyfox (homepage, bugs)
Stack protection on mips64 (December 16, 2017, 00:00 UTC)


Bug

On #gentoo-toolchain Matt Turner reported an obscure SIGSEGV of top program on one of mips boxes:

07:01 <@mattst88> since upgrading to >=glibc-2.25 on mips:
07:01 <@mattst88> bcm91250a-be ~ # top
07:01 <@mattst88> *** stack smashing detected ***: <unknown> terminated
07:01 <@mattst88> Aborted

A simple (but widely used) tool seemingly managed to overwrite its own stack. Looks straightforward, right? Let’s see!

Reproducing

As I don’t have any mips hardware I will fake some:

# crossdev -t mips64-unknown-linux-gnu
# <set a few profile variables>
# mips64-unknown-linux-gnu-emerge -1 procps

We’ve built a toolchain that targets mips64 (N32 ABI). Checking if it crashes:

$ LD_LIBRARY_PATH=/lib64 qemu-mipsn32 -L /usr/mips64-unknown-linux-gnu /usr/mips64-unknown-linux-gnu/usr/bin/top
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

This is not exactly a stack smash, but who knows: maybe my set of tools is slightly different from Matt’s and the same bug manifests in a slightly different way.

First workaround

Gentoo recently enabled stack protection by default, and that already exposed a bug at least on powerpc32.

First, let’s check whether it’s related to the stack protector option:

# EXTRA_ECONF=--enable-stack-protector=no emerge -v1 cross-mips64-unknown-linux-gnu/glibc
$ LD_LIBRARY_PATH=/lib64 qemu-mipsn32 -L /usr/mips64-unknown-linux-gnu /usr/mips64-unknown-linux-gnu/usr/bin/top
<running top>

That worked! To unbreak mips users I disabled glibc stack self-protection by passing --enable-stack-protector=no.

Into the rabbit hole

Now let’s try to find out what is at fault here. In the case of powerpc it was bad gcc code generation. Could it be something similar here?

First, we need a backtrace from qemu. Unfortunately, qemu-generated .core files are truncated and not directly readable by gdb:

$ gdb --quiet /usr/mips64-unknown-linux-gnu/usr/bin/top qemu_top_20171216-220221_14697.core
Reading symbols from /usr/mips64-unknown-linux-gnu/usr/bin/top...done.
BFD: Warning: /home/slyfox/qemu_top_20171216-220221_14697.core is truncated: expected core file size >= 8867840, found: 1504.

Let’s try gdbserver mode instead, running qemu with -g 12345 to make it wait for a gdb session:

$ LD_LIBRARY_PATH=/lib64 qemu-mipsn32 -g 12345 -L /usr/mips64-unknown-linux-gnu /usr/mips64-unknown-linux-gnu/usr/bin/top

And in a second terminal fire up gdb:

$ gdb --quiet /usr/mips64-unknown-linux-gnu/usr/bin/top
Reading symbols from /usr/mips64-unknown-linux-gnu/usr/bin/top...done.
(gdb) set sysroot /usr/mips64-unknown-linux-gnu
(gdb) target remote :12345
Remote debugging using :12345
Reading symbols from /usr/mips64-unknown-linux-gnu/lib32/ld.so.1...
Reading symbols from /usr/lib64/debug//usr/mips64-unknown-linux-gnu/lib32/ld-2.26.so.debug...done.
done.
0x40801d10 in __start () from /usr/mips64-unknown-linux-gnu/lib32/ld.so.1
(gdb) continue 
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x408cb908 in _dlerror_run (operate=operate@entry=0x408cadf0 <dlopen_doit>, args=args@entry=0x407feb58)
    at dlerror.c:163
163       result->errcode = _dl_catch_error (&result->objname, &result->errstring,
(gdb) bt
#0  0x408cb908 in _dlerror_run (operate=operate@entry=0x408cadf0 <dlopen_doit>, args=args@entry=0x407feb58)
    at dlerror.c:163
#1  0x408caf4c in __dlopen (file=file@entry=0x10012d58 "libnuma.so", mode=mode@entry=1) at dlopen.c:87
#2  0x1000306c in before (me=0x407ff3b6 "/usr/mips64-unknown-linux-gnu/usr/bin/top") at top/top.c:3308
#3  0x10001a10 in main (dont_care_argc=<optimized out>, argv=0x407ff1d4) at top/top.c:5721

Woohoo! Nice backtrace! Matt confirmed he sees the same backtrace.

Curious fact: top tries to load libnuma.so opportunistically (top/top.c):

// ...
#ifndef NUMA_DISABLE
#if defined(PRETEND_NUMA) || defined(PRETEND8CPUS)
    Numa_node_tot = Numa_max_node() + 1;
#else
    // we'll try for the most recent version, then a version we know works...
    if ((Libnuma_handle = dlopen("libnuma.so", RTLD_LAZY))
        || (Libnuma_handle = dlopen("libnuma.so.1", RTLD_LAZY))) {
        Numa_max_node = dlsym(Libnuma_handle, "numa_max_node");
        Numa_node_of_cpu = dlsym(Libnuma_handle, "numa_node_of_cpu");
        if (Numa_max_node && Numa_node_of_cpu)
            Numa_node_tot = Numa_max_node() + 1;
        else {
            dlclose(Libnuma_handle);
            Libnuma_handle = NULL;
        }
    }
#endif
#endif
// ...

As I did not have libnuma installed, it should not matter which library we try to load. I wrote a minimal reproducer that only calls dlopen():

#include <dlfcn.h>

// mips64-unknown-linux-gnu-gcc dlopen-bug.c -o dlopen-bug -ldl
int main() {
    void * h = dlopen("libdoes-not-exist.so", RTLD_LAZY);
}
$ mips64-unknown-linux-gnu-gcc dlopen-bug.c -o dlopen-bug -ldl
$ qemu-mipsn32 -L /usr/mips64-unknown-linux-gnu ./dlopen-bug
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

The backtrace is the same. OK, that’s bad: if it’s a stack overflow, it definitely happens somewhere in glibc’s internals and not in the client.

Reproducing on master

Chances are we will need to fix glibc to get our stack protection back. I tried to reproduce the same failure on the master branch:

$ ../glibc/configure \
      --enable-stack-protector=all \
      --enable-stackguard-randomization \
      --enable-kernel=3.2.0 \
      --enable-add-ons=libidn \
      --without-selinux \
      --without-cvs \
      --disable-werror \
      --enable-bind-now \
      --build=x86_64-pc-linux-gnu \
      --host=mips64-unknown-linux-gnu \
      --disable-profile \
      --without-gd \
      --with-headers=/usr/mips64-unknown-linux-gnu/usr/include \
      --prefix=/usr \
      --sysconfdir=/etc \
      --localstatedir=/var \
      --libdir='$(prefix)/lib32' \
      --mandir='$(prefix)/share/man' \
      --infodir='$(prefix)/share/info' \
      --libexecdir='$(libdir)/misc/glibc' \
      --disable-multi-arch \
      --disable-systemtap \
      --disable-nscd \
      --disable-timezone-tools \
      CFLAGS="-O2 -ggdb3"
$ make
$ qemu-mipsn32 ./elf/ld.so --library-path .:dlfcn ~/tmp/dlopen-bug

No failure. That could mean the bug was fixed in master, or that something else introduces the error.

I checked out glibc-2.26 and retried the above:

$ qemu-mipsn32 ./elf/ld.so --library-path .:dlfcn ~/tmp/dlopen-bug
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

Aha, the problem is still there in the latest release.

I bisected glibc from 2.26 to master to find the commit that fixes the SIGSEGV. The bisection stopped at commit 2449ae7b:

commit 2449ae7b2da24c9940962304a3e44bc80e389265
Author: Florian Weimer <fweimer@redhat.com>
Date:   Thu Aug 10 13:40:22 2017 +0200

    ld.so: Introduce struct dl_exception

    This commit separates allocating and raising exceptions.  This
    simplifies catching and re-raising them because it is no longer
    necessary to make a temporary, on-stack copy of the exception message.

Looking hard at that commit, I found nothing that could fix such a bug. The change shuffles a few things around but does not change behaviour much.

I decided to fetch the parent commit f87cc2bfb and spend some time understanding the failure mode.

First, the crash happens in _dlerror_run() code:

0x40801c70 in __start () from /home/slyfox/tmp/lib32/ld.so.1
(gdb) continue 
Continuing.

Program received signal SIGSEGV, Segmentation fault.
_dlerror_run (operate=operate@entry=0x40838df0 <dlopen_doit>, args=args@entry=0x407ff058) at dlerror.c:163
163       result->errcode = _dl_catch_error (&result->objname, &result->errstring,
(gdb) bt
#0  _dlerror_run (operate=operate@entry=0x40838df0 <dlopen_doit>, args=args@entry=0x407ff058) at dlerror.c:163
#1  0x40838f4c in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#2  0x1000076c in main ()
(gdb) list
158           if (result->malloced)
159             free ((char *) result->errstring);
160           result->errstring = NULL;
161         }
162
163       result->errcode = _dl_catch_error (&result->objname, &result->errstring,
164                                          &result->malloced, operate, args);
165
166       /* If no error we mark that no error string is available.  */
167       result->returned = result->errstring == NULL;

A simplified version of _dlerror_run() looks like this:

static struct dl_action_result last_result;
static struct dl_action_result *static_buf = &last_result;
// ...
int
internal_function
_dlerror_run (void (*operate) (void *), void *args)
{
    struct dl_action_result *result;
    result = static_buf;
    if (result->errstring != NULL)
    {
        if (result->malloced)
            free ((char *) result->errstring);
        result->errstring = NULL;
    }
    result->errcode = _dl_catch_error (&result->objname, &result->errstring,
                                       &result->malloced, operate, args);
    /* If no error we mark that no error string is available. */
    result->returned = result->errstring == NULL;
    return result->errstring != NULL;
}

The SIGSEGV happens when we try to store the result of _dl_catch_error() into result->errcode.

I poked in gdb at the values of result right before the _dl_catch_error() call and after it. And the values are different! Time to look at where result is physically stored:

(gdb) disassemble /s _dlerror_run
163       result->errcode = _dl_catch_error (&result->objname, &result->errstring,
=> 0x40839918 <+184>:   sw      v0,0(s0)
(gdb) print (void*)$s0
$1 = (void *) 0x40834c44 <__stack_chk_guard>

The instruction stores a single word (32 bits) at the address held in the s0 register. But s0 no longer points to last_result (it did right before the call); it points at __stack_chk_guard.

How stack checks work on mips

What is __stack_chk_guard? It has to do with stack checks. Short answer: it's a read-only variable that holds the stack canary value. glibc is not supposed to write to it after it is initialized. Something leaked the address of that variable into s0 (a callee-saved register).

Let's familiarize ourselves with the mips ABI a bit and look at how stack checks appear in generated code, using a simple example:

void g(void) {}
; mips64-unknown-linux-gnu-gcc -S b.c -fno-stack-protector -O1
.file 1 "b.c"
.section .mdebug.abiN32
.previous
.nan legacy
.module fp=64
.module oddspreg
.abicalls
.text
.align 2
.globl g
.set nomips16
.set nomicromips
.ent g
.type g, @function
g:
.frame $sp,0,$31 # vars= 0, regs= 0/0, args= 0, gp= 0
.mask 0x00000000,0
.fmask 0x00000000,0
.set noreorder
.set nomacro
jr $31
nop
.set macro
.set reorder
.end g
.size g, .-g
.ident "GCC: (Gentoo 7.2.0 p1.1) 7.2.0"

Register 31 is also known as ra, the return address. The code has a lot of pragmas but they are needed only for debugging. The real code is two instructions: jr $31; nop.

Let's check what -fstack-protector-all does with our code:

; mips64-unknown-linux-gnu-gcc -S b.c -fstack-protector-all -O1
.file 1 "b.c"
.section .mdebug.abiN32
.previous
.nan legacy
.module fp=64
.module oddspreg
.abicalls
.text
.align 2
.globl g
.set nomips16
.set nomicromips
.ent g
.type g, @function
g:
.frame $sp,32,$31 # vars= 16, regs= 2/0, args= 0, gp= 0
.mask 0x90000000,-8
.fmask 0x00000000,0
.set noreorder
.set nomacro
addiu $sp,$sp,-32 ; allocate 32 bytes on stack
sd $31,24($sp) ; backup $31 (ra)
sd $28,16($sp) ; backup $28 (gp)
lui $28,%hi(__gnu_local_gp) ; compute address of GOT
addiu $28,$28,%lo(__gnu_local_gp) ; (requires two instructions
lw $2,%got_disp(__stack_chk_guard)($28) ; read offset of __stack_chk_guard in GOT
lw $3,0($2) ; read canary value of __stack_chk_guard
sw $3,12($sp) ; store canary on stack
; ... time to check our canary!
lw $3,12($sp) ; load canary from stack
lw $2,0($2) ; load canary from __stack_chk_guard
bne $3,$2,.L4 ; check canary value and crash the program
ld $31,24($sp) ; restore return address
ld $28,16($sp) ; restore gp
jr $31 ; (restore stack pointer and) return
addiu $sp,$sp,32
.L4:
lw $25,%call16(__stack_chk_fail)($28)
.reloc 1f,R_MIPS_JALR,__stack_chk_fail
1: jalr $25
nop
.set macro
.set reorder
.end g
.size g, .-g
.ident "GCC: (Gentoo 7.2.0 p1.1) 7.2.0"

Here is a quick table of mips registers.

These 15 instructions do the following: the canary value is read from __stack_chk_guard via the intermediate registers 2 (v0) and 3 (v1), written onto the stack (relative to the sp register), then read back on function exit and checked against the value stored in __stack_chk_guard.

Quite straightforward.
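
In C terms, the instrumentation is roughly equivalent to the following (a hypothetical sketch, not actual gcc or glibc code):

extern unsigned long __stack_chk_guard;  /* the global canary value */
extern void __stack_chk_fail (void);     /* aborts the program */

void g_instrumented (void)
{
  /* prologue: copy the global canary into this stack frame */
  volatile unsigned long canary = __stack_chk_guard;

  /* ... function body ... */

  /* epilogue: compare the on-stack copy against the global value */
  if (canary != __stack_chk_guard)
    __stack_chk_fail ();  /* mismatch means the frame was smashed */
}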

Who broke s0?

Back to our _dl_catch_error(): why did s0 change? The mips ABI says s0 is callee-saved, which means a called function must restore it before returning.

To get more clues we need to dive into _dl_catch_error():

struct catch
{
  const char **objname;    /* Object/File name.  */
  const char **errstring;  /* Error detail filled in here.  */
  bool *malloced;          /* Nonzero if the string is malloced
                              by the libc malloc.  */
  volatile int *errcode;   /* Return value of _dl_signal_error.  */
  jmp_buf env;             /* longjmp here on error.  */
};
// ...
int
internal_function
_dl_catch_error (const char **objname, const char **errstring,
                 bool *mallocedp, void (*operate) (void *), void *args)
{
  /* We need not handle `receiver' since setting a `catch' is handled
     before it.  */

  /* Only this needs to be marked volatile, because it is the only local
     variable that gets changed between the setjmp invocation and the
     longjmp call.  All others are just set here (before setjmp) and read
     in _dl_signal_error (before longjmp).  */
  volatile int errcode;

  struct catch c;
  /* Don't use an initializer since we don't need to clear C.env.  */
  c.objname = objname;
  c.errstring = errstring;
  c.malloced = mallocedp;
  c.errcode = &errcode;

  struct catch *const old = catch_hook;
  catch_hook = &c;

  /* Do not save the signal mask.  */
  if (__builtin_expect (__sigsetjmp (c.env, 0), 0) == 0)
    {
      (*operate) (args);
      catch_hook = old;
      *objname = NULL;
      *errstring = NULL;
      *mallocedp = false;
      return 0;
    }

  /* We get here only if we longjmp'd out of OPERATE.  _dl_signal_error has
     already stored values into *OBJNAME, *ERRSTRING, and *MALLOCEDP.  */
  catch_hook = old;
  return errcode;
}

This code is straightforward (but very scary): it wraps the call to the operate callback in __sigsetjmp() (really just a setjmp()).

setjmp() is a simple-ish function: it stores most of the current registers into c.env, and later longjmp() restores them. Caller-saved registers are not stored because, to the compiler, longjmp() looks like a normal C function call.

Normally longjmp() is called only when an error condition happens. In our case it's called when dlopen() fails (we are opening a non-existent file). longjmp() restores all registers stored by setjmp(), including the instruction pointer pc, the stack pointer sp, the callee-saved registers s0..s7 and others.

The question arises: why and where do we lose the s0 register? At save time (setjmp()) or at restore time (longjmp())?

As we can see, setjmp() and longjmp() are very ABI-sensitive functions. Let's check how __sigsetjmp() is implemented in sysdeps/mips/mips64/setjmp.S:

ENTRY (__sigsetjmp)
        SETUP_GP
        SETUP_GP64_REG (v0, C_SYMBOL_NAME (__sigsetjmp))
        move    a2, sp
        move    a3, fp
        PTR_LA  t9, __sigsetjmp_aux
        RESTORE_GP64_REG
        move    a4, gp
        jr      t9
END (__sigsetjmp)

Note how __sigsetjmp does almost nothing here: it only stashes gp, sp and fp into argument registers and defers everything to __sigsetjmp_aux. Let's peek at that in sysdeps/mips/mips64/setjmp_aux.c.

Suddenly, its implementation is in C:

int
__sigsetjmp_aux (jmp_buf env, int savemask, long long sp, long long fp,
                 long long gp)
{
  /* Store the floating point callee-saved registers...  */
  asm volatile ("s.d $f20, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[0]));
  asm volatile ("s.d $f22, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[1]));
  asm volatile ("s.d $f24, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[2]));
  asm volatile ("s.d $f26, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[3]));
  asm volatile ("s.d $f28, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[4]));
  asm volatile ("s.d $f30, %0" : : "m" (env[0].__jmpbuf[0].__fpregs[5]));

  /* .. and the PC;  */
  asm volatile ("sd $31, %0" : : "m" (env[0].__jmpbuf[0].__pc));

  /* .. and the stack pointer;  */
  env[0].__jmpbuf[0].__sp = sp;

  /* .. and the FP; it'll be in s8.  */
  env[0].__jmpbuf[0].__fp = fp;

  /* .. and the GP;  */
  env[0].__jmpbuf[0].__gp = gp;

  /* .. and the callee-saved registers;  */
  asm volatile ("sd $16, %0" : : "m" (env[0].__jmpbuf[0].__regs[0]));
  asm volatile ("sd $17, %0" : : "m" (env[0].__jmpbuf[0].__regs[1]));
  asm volatile ("sd $18, %0" : : "m" (env[0].__jmpbuf[0].__regs[2]));
  asm volatile ("sd $19, %0" : : "m" (env[0].__jmpbuf[0].__regs[3]));
  asm volatile ("sd $20, %0" : : "m" (env[0].__jmpbuf[0].__regs[4]));
  asm volatile ("sd $21, %0" : : "m" (env[0].__jmpbuf[0].__regs[5]));
  asm volatile ("sd $22, %0" : : "m" (env[0].__jmpbuf[0].__regs[6]));
  asm volatile ("sd $23, %0" : : "m" (env[0].__jmpbuf[0].__regs[7]));

  /* Save the signal mask if requested.  */
  return __sigjmp_save (env, savemask);
}

The function duly stores every callee-saved register (including $16, aka s0) and more into c.env. But what happens when this function itself is compiled with -fstack-protector-all? Does it preserve the original registers?

Unfortunately the answer is: it does not.

Let’s compare assembly output with and without -fstack-protector-all:

; **-fno-stack-protector**:
Dump of assembler code for function __sigsetjmp_aux:
addiu sp,sp,-16
sd gp,0(sp)
lui gp,0x16
addu gp,gp,t9
sd ra,8(sp)
addiu gp,gp,7248
sdc1 $f20,104(a0)
sdc1 $f22,112(a0)
sdc1 $f24,120(a0)
sdc1 $f26,128(a0)
sdc1 $f28,136(a0)
sdc1 $f30,144(a0)
sd ra,0(a0)
sd a2,8(a0)
sd a3,80(a0)
sd a4,88(a0)
sd s0,16(a0)
sd s1,24(a0)
sd s2,32(a0)
sd s3,40(a0)
sd s4,48(a0)
sd s5,56(a0)
sd s6,64(a0)
sd s7,72(a0)
lw t9,-32236(gp)
bal 0x30000 <__sigjmp_save>
nop
ld ra,8(sp)
ld gp,0(sp)
jr ra
addiu sp,sp,16
; **-fstack-protector-all**:
Dump of assembler code for function __sigsetjmp_aux:
addiu sp,sp,-48
sd gp,32(sp)
lui gp,0x18
addu gp,gp,t9
addiu gp,gp,-23968
sd s0,24(sp) ; here we backup s0
lw s0,-27824(gp) ; and load into s0 stack canary address
sd ra,40(sp)
lw v1,0(s0)
sw v1,12(sp)
sdc1 $f20,104(a0)
sdc1 $f22,112(a0)
sdc1 $f24,120(a0)
sdc1 $f26,128(a0)
sdc1 $f28,136(a0)
sdc1 $f30,144(a0)
sd ra,0(a0)
sd a2,8(a0)
sd a3,80(a0)
sd a4,88(a0)
sd s0,16(a0)
sd s1,24(a0)
sd s2,32(a0)
sd s3,40(a0)
sd s4,48(a0)
sd s5,56(a0)
sd s6,64(a0)
sd s7,72(a0)
lw t9,-32100(gp)
bal 0x31940 <__sigjmp_save>
nop
lw a0,12(sp)
lw v1,0(s0)
bne a0,v1,0x31c7c <__sigsetjmp_aux+156>
ld ra,40(sp)
ld gp,32(sp)
ld s0,24(sp)
jr ra
addiu sp,sp,48
lw t9,-32644(gp)
jalr t9
nop

Or in diff form:

--- no-sp 2017-12-16 23:53:51.591627849 +0000
+++ spa 2017-12-16 23:53:37.952647838 +0000
@@ -1 +1 @@
- ; **-fno-stack-protector**:
+ ; **-fstack-protector-all**:
@@ -3,3 +3,3 @@
- addiu sp,sp,-16
- sd gp,0(sp)
- lui gp,0x16
+ addiu sp,sp,-48
+ sd gp,32(sp)
+ lui gp,0x18
@@ -7,2 +7,6 @@
- sd ra,8(sp)
- addiu gp,gp,7248
+ addiu gp,gp,-23968
+ sd s0,24(sp) ; here we backup s0
+ lw s0,-27824(gp) ; and load into s0 stack canary address
+ sd ra,40(sp)
+ lw v1,0(s0)
+ sw v1,12(sp)
@@ -27,2 +31,2 @@
- lw t9,-32236(gp)
- bal 0x30000 <__sigjmp_save>
+ lw t9,-32100(gp)
+ bal 0x31940 <__sigjmp_save>
@@ -30,2 +34,6 @@
- ld ra,8(sp)
- ld gp,0(sp)
+ lw a0,12(sp)
+ lw v1,0(s0)
+ bne a0,v1,0x31c7c <__sigsetjmp_aux+156>
+ ld ra,40(sp)
+ ld gp,32(sp)
+ ld s0,24(sp)
@@ -33 +41,4 @@
- addiu sp,sp,16
+ addiu sp,sp,48
+ lw t9,-32644(gp)
+ jalr t9
+ nop

The minimal reproducer and fix

To fully nail down the problem I’d like to have something nice for upstream. Here is the minimal reproducer:

#include <setjmp.h>
#include <stdio.h>

int main() {
    jmp_buf jb;
    volatile register long s0 asm ("$s0");

    s0 = 1234;

    if (setjmp(jb) == 0)
        longjmp(jb, 1);

    printf ("$s0 = %lu\n", s0);
}
$ qemu-mipsn32 -L ~/bad-libc ./mips-longjmp-bug
$s0 = 1082346564
$ qemu-mipsn32 -L ~/fixed-libc ./mips-longjmp-bug
$s0 = 1234

Note that the corrupted value 1082346564 is 0x40834c44, the address of __stack_chk_guard we saw earlier in gdb: longjmp() restores into the caller the canary address that the instrumented prologue had left in s0.

And the fix is to disable stack protection for __sigsetjmp_aux() in all build modes of glibc. This does work:

--- a/sysdeps/mips/mips64/setjmp_aux.c
+++ b/sysdeps/mips/mips64/setjmp_aux.c
@@ -25,6 +25,7 @@
access them in C. */
int
+inhibit_stack_protector
__sigsetjmp_aux (jmp_buf env, int savemask, long long sp, long long fp,
long long gp)
{
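
For context, inhibit_stack_protector is a glibc-internal attribute macro (defined in include/libc-symbols.h, if memory serves) along these lines:

/* Disable stack protection for the annotated function only.  */
#define inhibit_stack_protector \
  __attribute__ ((__optimize__ ("-fno-stack-protector")))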

Parting words

So it was not stack corruption after all, but register corruption triggered by a security feature.

What is more interesting is why Matt got "stack smashing detected" and not a SIGSEGV. It means that on his system the write into __stack_chk_guard actually succeeded and caused the canary check to fail, not because the on-stack canary copy changed but because the global canary changed. Here, __stack_chk_guard sits in the .data.rel.ro section; qemu maps it as read-only and crashes my process. How it behaves on a real target is left as an exercise to the reader with a mips device :)

Fun facts:

  • It took me 12 days to find the cause of the failure. A working gdb and qemu-user made the fix happen.
  • setjmp()/longjmp() are no longer so opaque to me, and hopefully not to you either. But make sure you have read all the volatility gotchas (see the Notes section of the man pages)
  • glibc uses setjmp()/longjmp() for error handling
  • mips ABI is very pleasant to work with and not as complicated as I thought :)
  • qemu-mipsn32 needs a fix to generate readable .core files

Have fun!

Posted on December 16, 2017

December 12, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.7 (December 12, 2017, 14:34 UTC)

This important release has been long awaited as it focused on improving overall performance of py3status as well as dramatically decreasing its memory footprint!

I want once again to salute the impressive work of @lasers, our amazing contributor from the USA, who has become the top contributor of 2017 in terms of commits and PRs.

Thanks to him, this release brings a whole batch of improvements and QA clean-ups on various modules. I encourage you to go through the changelog to see everything.

Highlights

A deep rework of how threads are used and scheduled to run modules has been done by @tobes.

  • py3status no longer keeps one thread per module running permanently; instead it uses a queue and spawns a thread to execute a module only when its cache expires
  • this new scheduling and usage of threads allows py3status to run under asynchronous event loops; gevent will be supported in the upcoming 3.8
  • the memory footprint of py3status got largely reduced, thanks to the threading changes and to a nice hunt for ever-growing and unused variables
  • module error reporting is now more detailed

Milestone 3.8

The next release will bring some awesome new features, such as gevent support, environment variable support in the config file, and per-module persistent data storage, as well as new modules!

Thanks contributors!

This release is their work, thanks a lot guys!

  • JohnAZoidberg
  • lasers
  • maximbaz
  • pcewing
  • tobes

December 06, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)

I have expressed some doubts about Keybase last year, but I kept mostly "using" the service, in the sense that I kept it up to date when they added new services and so on. I have honestly mostly used it to keep track of the address of the bitcoin wallet I've been using (more on that later).

A couple of weeks ago, I got an (unencrypted) email telling me that "chris" (the Keybase founder) "sent me a message" (more likely, they decided to do another of their rounds of adding invites to people's accounts to try to gather more people). Usually these messages, being completely non-private, would be sent in the clear as email, but since Keybase has been trying to sell itself as a secure messaging system, this time they were sent through their secure message system.

Unlike most other features of the service, which can be used by copy-pasting huge command lines that combine gpg and curl to signal the requests out of band, the secure messaging system depends on the actual keybase command line tool (written in NodeJS, if I got this right). So I installed it (from AUR, since my laptop is currently running Arch Linux), and tried accessing the message. This was already more complicated than it should have been, because the previous computer that was set up to access Keybase was the laptop I had replaced and gifted to a friend.

The second problem was that Keybase also decided to make encrypted file transfer easy by bundling it with messaging, effectively turning conversations into a shared encrypted directory, with messages being files. This may sound like a cool idea, particularly for those who like efforts like Plan9, but I think this is where my patience for the service ends. Indeed, while there are sub-commands in keybase for file access, they all depend on a FUSE (filesystem-in-userspace) module, and the same is true for the messages. So you're now down to having to run a GPG wrapper, a background NodeJS service on your computer, and a FUSE service, just to access files and messages.

I gave up. Even though trying to simplify the key exchange for GPG usage is definitely interesting, I have a feeling Keybase decided to feature-creep in the hope that it would attract funding so that they can keep operating. zCash first, and now this whole encrypted file-sharing and messaging system, appear to be overly complicated additions for the sake of having something to show. As such, I explicitly said in my profile that any message sent through this service is going to be a scream into the void, as I'm not going to pay attention to it at all.

The second GPG-related problem was with GreenAddress, the bitcoin wallet I signed up for some months ago. Let me start with a note that I do not agree with bitcoin’s mission statement, with its implementation, or with any of the other cryptocurrencies existence. I found that Thomas wrote this better than me so I’m not going to spend more words on this. The reason why I even have a wallet is that some time ago, I was contacted by one of my readers (who I won’t disclose), who was trying out the Brave browser, which at the time would compensate content creators with bitcoin (they now switched to their own cryptocurrency, making it even less appealing to me and probably most other content creators who have not gotten drunk with cryptocurrencies). For full disclosure, all the money I received this way (only twice since signing up) I donated back to VideoLAN since it was barely pocket money and in a format that is not useful to me.

So GreenAddress looked like a good idea at the time, and one of the things it lets you do is provide them with your public key, so that any communication from them to your mailbox is properly encrypted. This is good, particularly because you can get the one-time codes by mail encrypted, making them an actual second authentication factor, as decrypting them proves access to the GPG key (which in my case is actually a physical device). So of course I did that. And then forgot all about it until I decided to send that money to VideoLAN.

Here's where things get messy with my own mistakes. My GPG key is, following best practices, set to expire every year, and every year I renew it, extending the expiration date. Usually I notice an expired key pretty quickly, because it's also my SSH key and I use it nearly daily. But because of my move to London I had not used my SSH key for almost a month, and missed the expiration. So when I tried logging into GreenAddress and had them send me the OTP code… I'll get to that in a moment.

The expiration date on a GPG key is meant to ensure that the person you’re contacting has had recent access to the key, and it’s a form of self-revocation. It is a perfectly valid feature when you consider the web of trust that we’re used to. It is slightly less obvious as a feature when explicitly providing a public key for a service, be it GreenAddress, Bugzilla or Facebook. Indeed, what I would expect in these cases would be:

  • a notification that the provided public key is about to expire and you need to extend or replace it; or
  • a notification that the key has expired, and so further communication will happen in clear text; or
  • the service to ignore the expiration date altogether, just as they ignore whether the key was signed and counter-signed (preferred).

Since providing a public key directly to a service sidesteps most other trust options, including the signing and counter-signing, I would expect the expiration date to also be ignored. Indeed, if I provided a public key to a service, I expect the service to keep using it until I told it otherwise. But as I said there are other options. They are probably all bad in some ways, but ignoring the expiration date does not appear to be obviously wrong to me.

What does appear totally broken to me is sending messages that contain exactly this text:

Warning: Invalid pgp key detected

Which is exactly what GreenAddress did.

For reference, it appears that Facebook just silently disables encryption of email notifications when the key expires. I have thus uploaded the extended key now, and should probably set a reminder a month before the next expiration to keep securing my accounts. I find this particularly bothersome, because the expiration date is public information if you use the same key for all communication. Having a service-specific key is probably the most secure option, as long as the key is not available on public keyservers, but it also complicates securing your accounts.

What does this tell us? That once again, making things usable is an important step to make things really secure. If you have to complicate people’s lives for them to jump through hoops to be able to keep their accounts secure, they’re likely going to drop a number of security features that are meant to protect them.

December 02, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
UK Banking, Attempt 1: Barclays (December 02, 2017, 20:04 UTC)

You may remember that back in August, I tried opening a NatWest account while not living in the UK yet, and hit a stonewall of an impossible declaration being required by the NatWest employees. I gave up on setting up a remote account, and waited to open one once I got in the country. Since the Northern Irish account seemed to be good for all I needed to do (spoiler: it wasn’t), I decided to wait for the Barclays representative to show up on my official starting date, and set up a “Premier” account with them.

The procedure, which sounded very "special" beforehand, turned out to just be a "here is how you fill in the forms on the website". Then, instead of sending you to a local branch to get your documents copied and stamped (something that appears to be very common in the British Isles), they had three people doing the stamping on a pre-made copy of the passport. Not particularly special, but at least practical, right?

Except they also said it would take a few days for the card, but over a week to get access to the online banking, as they needed to "send me more stuff". The forms were filled in on Monday, set up by Tuesday, and the card arrived on Wednesday, with the PIN following on Thursday. At that point I guessed that what else they told me to wait for was a simple EMV CAP device (I did not realise that the Wikipedia page had a Barclays device as an example until I looked to link it over here), and decided not to wait, instead signing up for the online banking using my Ulster Bank CAP device, which worked perfectly fine.

On the Friday I also tried installing the Barclays app on my phone. As you probably all noticed by now, looking for a new app on the Play Store is risky, particularly when banking is involved, so I wanted to get a link to it from their website. Turns out that the Barclays website includes a link to the Apple App Store page for their app, but not to the Google Play one: the Play Store badge image is not clickable. The option they give you instead is to provide your phone number, and they will send you a link to the app as a text message. When I tried doing so, I got an error message suggesting I check my connection.

The reason for the error became apparent with developer tools open: the request to send the SMS is sent to a separate app running on a different hostname. And that host had a different certificate than their main website, which at that point had been expired for at least four days! Indeed, since then, the certificate has been replaced with a new one, an EV certificate signed by Entrust, rather than Symantec as they had before. I do find it slightly disconcerting that they have no monitoring of the validity of the certificates for all of their websites, as a bank. But let's move on.

The online banking relies heavily on "PINSentry" (that is, CAP), but by doing so it makes it fairly easy to set up most things, from standing orders to transfers and changes of address. Changing the address to my new apartment was quite straightforward, and it all seemed good. The mobile app on the other hand was less useful at first. The main problem is that the app will refuse to do much for the first ten days, while they "set it up" for you. I assume this is a security feature to prevent someone who gains access to your account from having the app execute transactions instead of the website. Unfortunately it also means that the app is useless if your phone dies and you need to get a new one.

Speaking of the mobile app, Barclays supports Apple Pay, but they don't support Android Pay, probably because they don't have to: on Android, you can have a replacement app providing NFC payment support, and so they decided to use their banking app for payments as well. Unfortunately the one time I tried using it, it kept throwing errors and asking me to log in again, despite a working network connection. I don't think I'll use this again and will rather look for a bank that supports Android Pay in the future.

Up to here everything sounds peachy, right? The card arrived, it worked, although I only used it a handful of times, to buy stuff at IKEA and to buy plane tickets where Revolut would cost an extra £5 due to running on the credit card circuit [1], rather than the debit card one.

Then the time came for me to buy a new computer, because of the one ““lost”” by the movers. Since Black Friday was around the corner, and with it my trip to Italy, I decided to wait for that and see if anything at all would come discounted. And indeed Crucial (Micron) had a discount on their SSDs, which is what I ended up ordering. Unfortunately, my first try to order ended up facing a Verified by Visa screen that, instead of trying to get more authentication factors for myself, just went on to tell me the transaction failed, and to check my phone for messages.

Indeed, my phone received two text messages: one telling me that a text message would be sent to confirm a transaction, and one asking me whether the transaction was intentional or not. After confirming it was me making the transaction, I was told to try the transaction again in a few minutes. Which I did, but even though this went through the Verified by Visa screen, PayPal refused the payment altogether. Trying to order directly through Crucial without using PayPal managed to get my order through… except it was cancelled half an hour later because Crucial could not confirm the details of the card.

At this point I tried topping up my Revolut account with the same card, and… it didn't go well either. I then tried calling them, and they could only tell me that the problem was not theirs, that they couldn't even see the requests from Revolut, and that they hadn't stopped any other transactions, putting the fault on the vendor. The vendor of course blamed the bank, and so I got stuck in between.

Upon a suggestion from Revolut on Twitter, I tried topping up by UK bank transfer. At first I got some silly "security questions" about the transfer ("Are you making this transfer to buy some goods? Is someone on the phone instructing you to make this payment?" and so on), but when it supposedly completed, I couldn't see it in the list of transactions, and trying again would lead to a "technical problem" message. Calling the bank again was even more frustrating, because the call dropped once, and as usual the IVR asked me three times for my date of birth and never managed to recognize it. It wasn't until I left the office, angry and disappointed, that the SMS arrived telling me to confirm whether it was really me requesting the transfer…

The end result looked like Barclays had put a stricter risk engine in place for Black Friday, which was causing my payments not to go through, particularly from the office. Trying later in the evening from my apartment (which has a much clearer UK-based geolocation) allowed the orders to go through. You could say that this is for my own protection, but I do find it particularly bothersome for one reason in particular: they have an app!

They could have just as easily sent a push notification to my phone to confirm or refuse the transaction, instead of requiring me to be able to receive text messages (which is not a given, as coverage is not perfect particularly in a city like London), in addition to me knowing my access code, having my bank card with me, and knowing its PIN.

At the end of the day I decided that Barclays is not the bank for me, and applied to open an account with Fineco, which is Italian and appears to have Italian expats in the UK as its target market. Will keep you posted about it.


  1. But I found out just the other day that the new virtual cards from Revolut are actually VISA Electron, rather than MasterCard. This makes a difference for many airlines, as VISA Electron cards are often considered debit cards, due to the "Electronic Use Only" limitation. I got myself a second virtual card for that and will see how that goes next time I book a flight.

November 30, 2017
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)

My love for Apple and especially MacOS does not run deep. Actually, it is essentially nonexistent. I was recently reminded of the myriad reasons why I don't like MacOS, and one of them is that the OS should never stand in the way of what the operator wants to do. In this case, I found that even the root account couldn't write to certain directories that MacOS deemed special. That "feature" is known as System Integrity Protection. I'm not going to rant about how absurd it is to deny the root account the ability to write, but instead I'd like to present the method of disabling System Integrity Protection.

First of all, one needs to get into the Recovery Mode of MacOS. Typically, this wouldn’t be all that difficult when following the instructions provided by Apple. Essentially, to get into Recovery Mode, one just has to hold Command+R when booting up the system. That’s all fine and dandy if it is a physical host and one has an Apple keyboard. However, my situation called for Recovery Mode from a virtual machine and using a non-Apple keyboard (so no Command key). Yes, yes, I know that MacOS offers the ability to set different key combinations, but then those would still have to be trapped by VMWare Fusion during boot. Instead, I figured that there had to be a way to do it from the MacOS terminal.

After digging through documentation and man pages (I’ll spare you the trials and tribulations of trying to find answers 😛 ), I finally found that, yes, one CAN reboot MacOS into Recovery Mode without the command key. To do so, open up the Terminal and type the following commands:

nvram "recovery-boot-mode=unused"
reboot recovery

The Apple host will reboot and the Recovery Mode screen will be presented:

MacOS Recovery Mode - Utilities - Terminal

Now, in the main window, there are plenty of tasks that can be launched. However, I needed a terminal, and it might not be readily apparent, but to get it, you click on the “Utilities” menu in the top menu bar (see the screenshot above), and then select “Terminal”. Thereafter, it is fairly simple to disable System Integrity Protection via the following command:

csrutil disable

All that’s left is to reboot by going to the Apple Menu and clicking on “Restart”.

Though the procedures of getting to the MacOS Recovery menu without using the Command key and disabling System Integrity Protection are not all that difficult, they were a pain to figure out. Furthermore, I’m not sure why SIP disallows root’s write permissions anyway. That seems absurd, especially in light of Apple’s most recent glaring security hole of allowing root access without a password. 😳

Cheers,
Zach

November 29, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
Are tutorials to blame for basic IT problems? (November 29, 2017, 12:04 UTC)

It's now effectively impossible to spend a month following IT (and not just IT) news and not hear of breaches, "hacks", or general security fiascos. Some of these are tracked down to very basic mistakes in configuration or coding of software, including the lack of hashing of passwords in databases. Everyone in the industry, including me, has at some point expressed the importance of proper QA and testing, and of budgeting for them in the development process. But what if the problem is much higher up the chain?

Falsehoods Programmers Believe About Names is now over seven years old, and yet my just barely complicated full name (first name with a space in it, surname with an accent) can't be easily used by most of the services I routinely use. Ireland was particularly interesting, as most services would support characters in the "Latin extended" alphabet, due to the Irish language's use of ó, but they wouldn't accept my surname, which uses ò. This is not a first: I had trouble getting invoices from French companies before, because they only know about ó as a character.

In a related, but not directly connected topic, there are the problems an acquaintance of mine keeps stumbling across. They don't want service providers to attach a title to their account, but it looks like most of the developers that implement account handling don't think about this option at all, and make it hard not to set an honorific. In particular, it appears that not only do UIs tend to include a mandatory drop-down list of titles, but the database schema (or whichever other model is used to store the information) also represents the title as an enumeration within a list; that much is apparent from the way my acquaintance has had their account reverted to a "default" value, likely the "zeroth" one in the enumeration.

And since most systems don't end up using British Airways's honorific list but are rather limited to the "usual" ones, that default appears to be "Ms" more often than not, as it sorts (lexicographically) before the others. I have had that happen to me a couple of times too, as I don't usually fill in the "title" field on paper forms (I never saw much of a point in it), and I guess somewhere in the pipeline a model really expects a person to have a title.

All of this has me wondering, oh-so-many times, why most systems appear to want to store a name in separate entries for first and last name (or variations thereof), and why they insist on having an honorific title that is "one of the list" rather than a freeform field (which would accept the empty string as a valid value). My theory on this is that it's the fault of the training, or of the documentation. Multiple tutorials I have read, and even followed, over the years defined a model for a "person" – whether it is a user, customer, or any other entity related to the service itself – and many of these use the most basic identifying information about a person as fields to show how the model works, which gives you "name", "surname", and "title" fields. Bonus points for using an enumeration for the title rather than a freeform field, or for validating that the title is one of the "admissible" ones.

You could call this a straw man argument, but the truth is that it didn’t take me any time at all to find an example tutorial (See also Archive.is, as I hope the live version can be fixed!) that did exactly that.

Similarly, I have seen sample tutorial code explaining how to write authentication primitives that oversimplify the procedure by either ignoring the salt-and-hashing or using obviously broken hashing functions such as crypt() rather than anything solid. Given many of us know all too well how even important jobs that are not flashy enough for a “rockstar” can be pushed into the hands of junior developers or even interns, I would not be surprised if a good chunk of these weak authentication problems that are now causing us so much pain are caused by simple bad practices that are (still) taught to those who join our profession.

I am afraid I don't have an answer for how to fix this situation. While musing, again on Twitter, the only suggestion I got for a good text on writing correct authentication code was the NIST recommendations, but these are, unsurprisingly, written in a tone that is not useful for teaching how to do things. They are standards first and foremost, and they are good, but that makes them extremely unsuitable for newcomers learning how to do things correctly. And while they do provide very solid ground for building formally correct implementations of common authentication libraries, I somehow doubt that most systems would care about the formal correctness of their login page, particularly given the stories we have seen up to now.

I have seen comments on social media (different people on different media) about how what makes a good source of documentation changes depending on your expertise, which is quite correct. Giving a long list of things that you should or should not do is probably a bad way to introduce newcomers to development in general. But maybe we should make sure that examples, samples, and documentation are updated so that they show current best practice rather than overly simplified, or artificially complicated (sometimes at the same time), examples.

If you’re writing documentation, or new libraries (because you’re writing documentation for new libraries you write, right?) you may want to make sure that the “minimal” example is actually the minimum you need to do, and not skip over things like error checks, or full initialisation. And please, take a look at the various “Falsehoods Programmers Believe About” lists — and see if your example implementation make those assumptions. And if so fix them, please. You’ll prevent many mistakes from happening in real world applications, simply because the next junior developer who gets hired to build a startup’s latest website will not be steered towards the wrong implementations.

November 26, 2017
Diego E. Pettenò a.k.a. flameeyes (homepage, bugs)
A quick London update (November 26, 2017, 13:04 UTC)

It's now nearly two months since I last posted something, and I guess I should at least break the silence to say that I'm alive and well. Although right now I'm spending a week in Italy – to see friends and family here, and take care of some remaining paperwork – my transfer to London is complete and I'm now living on the outskirts of the City.

I was almost writing “completed successfully”, but “success” is hard to measure. As I said in the post where I announced my move, a lot of the reason for me to leave Dublin was the lack of friends and a social circle, so you could say that the only way to measure success is seeing if I manage to build such a social circle in the new city (and country), and that will take a while of course.

In addition to this, more than a few things are not quite going in my favour. For instance, the removal companies that arranged and delivered my move managed to screw up, and two of my boxes are missing, which coincidentally were the ones with the most valuable goods: my gamestation, the router and the professional document scanner. And I'm still waiting to hear from the insurance to see whether they will at least pay the (depreciated) value of the goods, so I can get part of my money back on the new computer I'm trying to buy.

I say I’m trying to buy it, because I spent most of this past Friday fighting with my bank trying to have them let me make payments out of my account, possibly because their risk engine was turned up to eleven due to Black Friday, and it took me a significant amount of time to get in a position to actually dispose of my own money. You can bet that the first thing I’m doing after I’m back from this trip is finding a new bank.

Oh yes and somehow it looks like the company that owns and manages the building my apartment is in, is having a spat with the electricity provider. Last week every tenant in the building received a note from e-on (which is not my provider) saying that they are applying for a warrant to disconnect the landlord’s power supply.

So yeah expect more rants as soon as I manage to settle down, I have drafts.

November 25, 2017
Sebastian Pipping a.k.a. sping (homepage, bugs)

November 25: International Day for the Elimination of Violence against Women
(Wikipedia)

November 24, 2017
Sebastian Pipping a.k.a. sping (homepage, bugs)

Not new at all but was new to me, and was well worth my time:

DEFCON 19: Bit-squatting: DNS Hijacking Without Exploitation (w speaker)

Only somewhat related: https://www.pytosquatting.org/

November 20, 2017
Sven Vermeulen a.k.a. swift (homepage, bugs)
SELinux and extended permissions (November 20, 2017, 16:00 UTC)

One of the features present in the August release of the SELinux user space is its support for ioctl xperm rules in modular policies. In the past, this was only possible in monolithic ones (and CIL). Through this, allow rules can be extended to cover not only source (domain) and target (resource) identifiers, but also the specific ioctl request numbers to which they apply. And ioctls are the first (and currently only) permission for which this is implemented.

Note that ioctl-level permission control isn't a new feature by itself; what is new is that it can now be used in modular policies.
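
For illustration, an xperm rule in the policy source language looks roughly like this (the domain, type and ioctl numbers here are made up):

allowxperm mydomain_t mydevice_t:chr_file ioctl { 0x8910-0x8914 0x5401 };

This grants mydomain_t the ioctl permission on chr_file objects labeled mydevice_t, but only for the listed ioctl request numbers.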

November 18, 2017
Sergei Trofimovich a.k.a. slyfox (homepage, bugs)
Endianness of a single byte: big or little? (November 18, 2017, 00:00 UTC)

Bug

Rolf Eike Beer noticed two test failures while testing the radvd package (an IPv6 route advertiser daemon, and more) on sparc. Both tests failed, likely due to an endianness issue:

test/send.c:317:F:build:test_add_ra_option_lowpanco:0:
  Assertion '0 == memcmp(expected, sb.buffer, sizeof(expected))'
    failed: 0 == 0, memcmp(expected, sb.buffer, sizeof(expected)) == 1
test/send.c:342:F:build:test_add_ra_option_abro:0:
  Assertion '0 == memcmp(expected, sb.buffer, sizeof(expected))'
    failed: 0 == 0, memcmp(expected, sb.buffer, sizeof(expected)) == 1

I’ve confirmed the same failure on powerpc.

Eike noted that this is unusual because sparc is a big-endian architecture and network byte order is also big-endian (thus there should be no need to flip bytes). Something very specific must be lurking in the radvd code to break endianness in this case. Curiously, all the radvd tests were passing fine on amd64.

Two functions failed to produce expected results:

START_TEST(test_add_ra_option_lowpanco)
{
    ck_assert_ptr_ne(0, iface);

    struct safe_buffer sb = SAFE_BUFFER_INIT;
    add_ra_option_lowpanco(&sb, iface->AdvLowpanCoList);

    unsigned char expected[] = {
        0x22, 0x03, 0x32, 0x48, 0x00, 0x00, 0xe8, 0x03, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    };

    ck_assert_int_eq(sb.used, sizeof(expected));
    ck_assert_int_eq(0, memcmp(expected, sb.buffer, sizeof(expected)));

    safe_buffer_free(&sb);
}
END_TEST

START_TEST(test_add_ra_option_abro)
{
    ck_assert_ptr_ne(0, iface);

    struct safe_buffer sb = SAFE_BUFFER_INIT;
    add_ra_option_abro(&sb, iface->AdvAbroList);

    unsigned char expected[] = {
        0x23, 0x03, 0x0a, 0x00, 0x02, 0x00, 0x02, 0x00, 0xfe, 0x80, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0xa2, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01,
    };

    ck_assert_int_eq(sb.used, sizeof(expected));
    ck_assert_int_eq(0, memcmp(expected, sb.buffer, sizeof(expected)));

    safe_buffer_free(&sb);
}
END_TEST

Both tests are straightforward: they verify 6LoWPAN and ABRO extension handling (both are related to route announcement for Low Power devices). Does not look complicated.

I looked at the option-building code in send.c and noticed at least one bug, a missing endianness conversion:

// radvd.h
struct nd_opt_abro {
    uint8_t nd_opt_abro_type;
    uint8_t nd_opt_abro_len;
    uint16_t nd_opt_abro_ver_low;
    uint16_t nd_opt_abro_ver_high;
    uint16_t nd_opt_abro_valid_lifetime;
    struct in6_addr nd_opt_abro_6lbr_address;
};
// ...
// send.c
static void add_ra_option_mipv6_home_agent_info(struct safe_buffer *sb, struct mipv6 const *mipv6)
{
    struct HomeAgentInfo ha_info;
    memset(&ha_info, 0, sizeof(ha_info));
    ha_info.type = ND_OPT_HOME_AGENT_INFO;
    ha_info.length = 1;
    ha_info.flags_reserved = (mipv6->AdvMobRtrSupportFlag) ? ND_OPT_HAI_FLAG_SUPPORT_MR : 0;
    ha_info.preference = htons(mipv6->HomeAgentPreference);
    ha_info.lifetime = htons(mipv6->HomeAgentLifetime);
    safe_buffer_append(sb, &ha_info, sizeof(ha_info));
}

static void add_ra_option_abro(struct safe_buffer *sb, struct AdvAbro const *abroo)
{
    struct nd_opt_abro abro;
    memset(&abro, 0, sizeof(abro));
    abro.nd_opt_abro_type = ND_OPT_ABRO;
    abro.nd_opt_abro_len = 3;
    abro.nd_opt_abro_ver_low = abroo->Version[1];
    abro.nd_opt_abro_ver_high = abroo->Version[0];
    abro.nd_opt_abro_valid_lifetime = abroo->ValidLifeTime;
    abro.nd_opt_abro_6lbr_address = abroo->LBRaddress;
    safe_buffer_append(sb, &abro, sizeof(abro));
}

Note how add_ra_option_mipv6_home_agent_info() carefully flips bytes with htons() for all uint16_t fields but add_ra_option_abro() does not.

It means ABRO does not really work on little-endian (that is, most) systems in radvd, and the test checks for the wrong thing. I added the missing htons() calls and fixed the expected[] output in the tests by manually flipping two bytes in a few locations.
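
The fixed function then looks roughly like this (a sketch based on the code above; the actual radvd patch may differ in details):

static void add_ra_option_abro(struct safe_buffer *sb, struct AdvAbro const *abroo)
{
    struct nd_opt_abro abro;
    memset(&abro, 0, sizeof(abro));
    abro.nd_opt_abro_type = ND_OPT_ABRO;
    abro.nd_opt_abro_len = 3;
    /* Convert all multi-byte fields to network byte order, just like
       add_ra_option_mipv6_home_agent_info() already does.  */
    abro.nd_opt_abro_ver_low = htons(abroo->Version[1]);
    abro.nd_opt_abro_ver_high = htons(abroo->Version[0]);
    abro.nd_opt_abro_valid_lifetime = htons(abroo->ValidLifeTime);
    abro.nd_opt_abro_6lbr_address = abroo->LBRaddress;
    safe_buffer_append(sb, &abro, sizeof(abro));
}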

Plot twist

The effect was slightly unexpected: I fixed only the ABRO test, but not the 6LoWPAN one. This is where things became interesting. Let's look at the add_ra_option_lowpanco() implementation:

// radvd.h
struct nd_opt_6co {
    uint8_t nd_opt_6co_type;
    uint8_t nd_opt_6co_len;
    uint8_t nd_opt_6co_context_len;
    uint8_t nd_opt_6co_res : 3;
    uint8_t nd_opt_6co_c : 1;
    uint8_t nd_opt_6co_cid : 4;
    uint16_t nd_opt_6co_reserved;
    uint16_t nd_opt_6co_valid_lifetime;
    struct in6_addr nd_opt_6co_con_prefix;
};
// ...
// send.c
static void add_ra_option_lowpanco(struct safe_buffer *sb, struct AdvLowpanCo const *lowpanco)
{
    struct nd_opt_6co co;
    memset(&co, 0, sizeof(co));
    co.nd_opt_6co_type = ND_OPT_6CO;
    co.nd_opt_6co_len = 3;
    co.nd_opt_6co_context_len = lowpanco->ContextLength;
    co.nd_opt_6co_c = lowpanco->ContextCompressionFlag;
    co.nd_opt_6co_cid = lowpanco->AdvContextID;
    co.nd_opt_6co_valid_lifetime = lowpanco->AdvLifeTime;
    co.nd_opt_6co_con_prefix = lowpanco->AdvContextPrefix;
    safe_buffer_append(sb, &co, sizeof(co));
}

The test still failed to match one single byte: the one that spans three bitfields: nd_opt_6co_res, nd_opt_6co_c and nd_opt_6co_cid. But why? Does endianness really matter within a byte? Apparently gcc happens to group those three fields in different orders on x86_64 and powerpc!

Let's look at a smaller example:

#include <stdio.h>
#include <stdint.h>

struct s {
    uint8_t a : 3;
    uint8_t b : 1;
    uint8_t c : 4;
};

int main() {
    struct s v = { 0x00, 0x1, 0xF, };
    printf("v = %#02x\n", *(uint8_t*)&v);
    return 0;
}

Output difference:

$ x86_64-pc-linux-gnu-gcc a.c -o a && ./a
v = 0xf8
# (0xF << 5) | (0x1 << 4) | 0x00
$ powerpc-unknown-linux-gnu-gcc a.c -o a && ./a
v = 0x1f
# (0x0 << 5) | (0x1 << 4) | 0xF

The C standard does not specify the layout of bitfields, and this is a great illustration of how things break :)

An interesting observation: the bitfield order on powerpc happens to be the desired order (the one the 6LoWPAN RFC defines).

It means the radvd code indeed happened to generate a correct bitstream on big-endian platforms (as Eike predicted) but did not work on little-endian systems. Unfortunately, the golden expected[] output was generated on a little-endian system.
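
A portable way to sidestep the problem entirely is to avoid bitfields in wire formats and pack the byte by hand. A sketch (assuming the most-significant-bit-first layout of the 6CO flags byte, as the RFC defines it):

#include <stdint.h>

/* Pack Res (3 bits), C (1 bit) and CID (4 bits) into one byte in wire
   order, independent of the compiler's bitfield layout.  */
static uint8_t pack_6co_flags(uint8_t res, uint8_t c, uint8_t cid)
{
    return (uint8_t) (((res & 0x7) << 5) | ((c & 0x1) << 4) | (cid & 0xF));
}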

Thus the 3 fixes: the missing htons() conversions, the 6CO flags byte packing, and the regenerated expected[] test data.

That’s it :)

Posted on November 18, 2017

November 17, 2017
Nathan Zachary a.k.a. nathanzachary (homepage, bugs)
Python’s M2Crypto fails to compile (November 17, 2017, 21:29 UTC)

When updating my music server, I ran into a compilation error on Python’s M2Crypto. The error message was a little bit strange and not directly related to the package itself:

fatal error: openssl/ecdsa.h: No such file or directory

Obviously, that error is generated from OpenSSL and not directly within M2Crypto. Remembering that there are some known problems with the “bindist” USE flag, I took a look at OpenSSL and OpenSSH. Indeed, “bindist” was set. Simply removing the USE flag from those two packages took care of the problem:


# grep -i 'openssh\|openssl' /etc/portage/package.use
>=dev-libs/openssl-1.0.2m -bindist
net-misc/openssh -bindist
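
After changing the USE flags, the affected packages need to be rebuilt; something along these lines should do it (the m2crypto atom here is an assumption for illustration):

# emerge --ask --oneshot dev-libs/openssl net-misc/openssh
# emerge --ask --oneshot dev-python/m2crypto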

In this case, the problem makes sense based on the error message. The error indicated that the Elliptic Curve Digital Signature Algorithm (ECDSA) header was not found. The previously-linked page about the "bindist" USE flag clearly states that having bindist set will "Disable/Restrict EC algorithms (as they seem to be patented)".

Cheers,
Nathan Zachary

November 16, 2017
Hanno Böck a.k.a. hanno (homepage, bugs)
Some minor Security Quirks in Firefox (November 16, 2017, 16:24 UTC)

I discovered a couple of more or less minor security issues in Firefox lately. None of them is particularly scary, but they affect interesting corner cases or unusual behavior. I'm posting this mainly hoping that other people will find it inspiring to think about unusual security issues and maybe also come up with more realistic attack scenarios for these bugs.

I'd like to point out that Mozilla hasn't fixed most of those issues, despite all of them being reported several months ago.

Bypassing XSA warning via FTP

XSA or Cross-Site Authentication is an interesting and not very well known attack. It's been discovered by Joachim Breitner in 2005.

Some web pages, mostly forums, allow users to include third party images. This can be abused by an attacker to steal other users' credentials. An attacker first posts something with an image from a server he controls. He then switches on HTTP authentication for that image. All visitors of the page will now see a login dialog on that page. They may be tempted to type their login credentials into the HTTP authentication dialog, which seems to come from the page they trust.

The original XSA attack is, as said, quite old. As a countermeasure Firefox implements a warning in HTTP authentication dialogs that were created by a subresource like an image. However it only does that for HTTP, not for FTP.

So an attacker can run an FTP server and include an image from there. By then requiring an FTP login and logging all login attempts to the server he can gather credentials. The password dialog will show the host name of the attacker's FTP server, but he could choose one that looks close enough to the targeted web page to not raise suspicion.

Firefox FTP password dialog

I haven't found any popular site that allows embedding images from non-HTTP-protocols. The most popular page that allows embedding external images at all is Stack Overflow, but it only allows HTTPS. Generally embedding third party images is less common these days, most pages keep local copies if they embed external images.

This bug is yet unfixed.

Obviously one could fix it by showing the same warning for FTP that is shown for HTTP authentication. But I'd rather recommend to completely block authentication dialogs on third party content. This is also what Chrome is doing. Mozilla has been discussing this for several years with no result.

Firefox also has an open bug about disallowing FTP on subresources. This would obviously also fix this scenario.

Window-modal popup via FTP

In the early days of JavaScript web pages could annoy users with popups. Browsers have since changed the behavior of JavaScript popups. They are now tab-modal, which means they're not blocking the interaction with the whole browser, they're just part of one tab and will only block the interaction with the web page that created them.

So it is a goal of modern browsers to not allow web pages to create window-modal alerts that block the interaction with the whole browser. However I figured out FTP gives us a bypass of this restriction.

If Firefox receives some random garbage over an FTP connection that it cannot interpret as FTP commands it will open an alert window showing that garbage.

Window modal FTP alert

First we open up our fake "FTP-Server" that will simply send a message to all clients. We can just use netcat for this:

while true; do echo "Hello" | nc -l -p 21; done

Then we try to open a connection, e.g. by typing ftp://localhost in the address bar on the same system. Firefox will not show the alert immediately. However, if we then click on the URL bar and press enter again, it will show the alert window. I tried to replicate that behavior with JavaScript, which worked sometimes. I'm relatively sure this can be made reliable.

There are two problems here. One is that server-controlled content is shown to the user without any interpretation. This alert window seems to be intended as some kind of error message. However, it doesn't make a lot of sense like that. If at all, it should probably be prefixed by some message like "the server sent an invalid command". But ultimately, if the browser receives random garbage instead of protocol messages, it's probably not wise to display it at all. The second problem is that FTP error messages probably should be tab-modal as well.

This bug is also yet unfixed.

FTP considered dangerous

FTP is an old protocol with many problems. Some consider the fact that browsers still support it a problem. I tend to agree, ideally FTP should simply be removed from modern browsers.

FTP in browsers is insecure by design. While TLS-enabled FTP exists, browsers have never supported it. The FTP code is probably not well audited, as it's rarely used. And the fact that another protocol exists that can be used similarly to HTTP has the potential for surprises. For example I found it quite surprising to learn that it's possible to have unencrypted and unauthenticated FTP connections to hosts that enabled HSTS. (The lack of cookie support on FTP seems to avoid causing security issues, but it's still unexpected and feels dangerous.)

Self-XSS in bookmark manager export

The Firefox Bookmark manager allows exporting bookmarks to an HTML document. Before the current Firefox 57 it was possible to inject JavaScript into this exported HTML via the tags field.

I tried to come up with a plausible scenario where this could matter, however this turned out to be difficult. This would be a problematic behavior if there's a way for a web page to create such a bookmark. While it is possible to create a bookmark dialog with JavaScript, this doesn't allow us to prefill the tags field. Thus there is no way a web page can insert any content here.

One could come up with implausible social engineering scenarios (web page asks user to create a bookmark and insert some specific string into the tags field), but that seems very far fetched. A remotely plausible scenario would be a situation where a browser can be used by multiple people who are allowed to create bookmarks and the bookmarks are regularly exported and uploaded to a web page. However that also seems quite far fetched.

This was fixed in the latest Firefox release as CVE-2017-7840 and considered as low severity.

Crashing Firefox on Linux via notification API

The notification API allows browsers to send notification alerts that the operating system will show in small notification windows. A notification can contain a small message and an icon.

When playing with this, one of the very first things that came to my mind was to check what happens if one simply sends a very large icon. A user has to approve that a web page is allowed to use the notification API; however, if he does, the result is an immediate crash of the browser. This only "works" on Linux. The proof of concept is quite simple, we just embed a large black PNG via a data URI:

<script>Notification.requestPermission(function(status){
new Notification("",{icon: "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAE4gAABOIAQAAAAB147pmAAAL70lEQVR4Ae3BAQ0AAADCIPuntscHD" + "A".repeat(4043) + "yDjFUQABEK0vGQAAAABJRU5ErkJggg==",});
});</script>


I haven't fully tracked down what's causing this, but it seems that Firefox tries to send a message to the system's notification daemon with libnotify and if that's too large for the message size limit of dbus it will not properly handle the resulting error.

What I found quite frustrating is that when I reported it I learned that this was a duplicate of a bug that has already been reported more than a year ago. I feel having such a simple browser crash bug open for such a long time is not appropriate. It is still unfixed.

November 10, 2017
Alice Ferrazzi a.k.a. alicef (homepage, bugs)
Latex 001 (November 10, 2017, 06:03 UTC)

This is mainly a memo to remind myself each time which packages are needed for Japanese.

Usually I use xetex with cjk to write LaTeX documents. On Gentoo this is done by adding xetex and cjk support to texlive:

app-text/texlive cjk xetex
app-text/texlive-core cjk xetex

For platex, you need to install:

dev-texlive/texlive-langjapanese

Google Summer of Code day41-51 (November 10, 2017, 05:59 UTC)

Google Summer of Code day 41

What was my plan for today?

  • testing and improving elivepatch

What I did today?

elivepatch work:

  • Working on making the code for sending an unknown number of files with Werkzeug

Making and testing the patch manager.

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 42

What was my plan for today?

  • testing and improving elivepatch

What I did today?

elivepatch work: fixing the sending of multiple files using requests. I'm making the script for incrementally adding patches, but I'm stuck on a requests problem.

It looks like I cannot concatenate files like this:

files = {'patch': ('01.patch', patch_01, 'multipart/form-data', {'Expires': '0'}),
                  ('02.patch', patch_01, 'multipart/form-data', {'Expires': '0'}),
                  ('03.patch', patch_01, 'multipart/form-data', {'Expires': '0'}),
         'config': ('config', open(temporary_config.name, 'rb'), 'multipart/form-data', {'Expires': '0'})}

or like this:

files = {'patch': ('01.patch', patch_01, 'multipart/form-data', {'Expires': '0'}),
         'patch': ('02.patch', patch_01, 'multipart/form-data', {'Expires': '0'}),
         'patch': ('03.patch', patch_01, 'multipart/form-data', {'Expires': '0'}),
         'config': ('config', open(temporary_config.name, 'rb'), 'multipart/form-data', {'Expires': '0'})}

as I'm getting: AttributeError: 'tuple' object has no attribute 'read'

It looks like requests cannot send data with the same dict key, while the server side in flask-restful is using parser.add_argument('patch', action='append') to concatenate multiple arguments together, and it suggests sending the information like this: curl http://api.example.com -d "patch=bob" -d "patch=sue" -d "patch=joe"

Unfortunately, as of now I'm still trying to understand how I can do this with requests.
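
For what it's worth, requests also accepts a list of (field, file-tuple) pairs instead of a dict, which does allow repeating the 'patch' field. A minimal sketch of that approach (the file names and the URL are made up for illustration, not the real elivepatch API):

import requests

# A list of tuples, unlike a dict, can repeat the 'patch' field name,
# and requests will emit one multipart part per entry.
files = [
    ('patch', ('01.patch', open('01.patch', 'rb'), 'multipart/form-data')),
    ('patch', ('02.patch', open('02.patch', 'rb'), 'multipart/form-data')),
    ('patch', ('03.patch', open('03.patch', 'rb'), 'multipart/form-data')),
    ('config', ('config', open('config', 'rb'), 'multipart/form-data')),
]

response = requests.post('http://localhost:5000/upload', files=files)
print(response.status_code)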

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 43

What was my plan for today?

  • making the first draft of incremental patches feature

What I did today?

elivepatch work:

  • Changed send_file to manage more than one patch at a time
  • Renamed functions to reflect the incremental patches change
  • Made a function that lists the patches in the eapply_user portage user patches directory and in the temporary folder, reflecting the applied livepatches

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 44

What was my plan for today?

  • making the first draft of incremental patches feature

What I did today?

elivepatch work:

  • Renamed the command client function as an internal function
  • Put together the client incremental patch sender function

What I will do next time?

  • Make the server-side incremental patch part and test it

Google Summer of Code day 45

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Meeting with mentor summary

elivepatch work:

  • Working on the incremental patch feature's design and implementation
  • Putting patch files under /var/run/elivepatch
  • Ordering patches by number (see the sketch after this list)
  • Cleaning the folder when the machine is restarted
  • Sending the patches to the server in order
  • Cleaning the client terminal output by catching the exceptions
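
A minimal sketch of what ordering the stored patches by number could look like (the directory layout and helper name are assumptions for illustration, not the actual elivepatch code):

import os
import re

PATCH_DIR = '/var/run/elivepatch'  # as mentioned above

def ordered_patches(directory=PATCH_DIR):
    """Return patch file names sorted by their numeric prefix (01.patch, 02.patch, ...)."""
    def numeric_prefix(name):
        match = re.match(r'(\d+)', name)
        return int(match.group(1)) if match else 0
    return sorted((f for f in os.listdir(directory) if f.endswith('.patch')),
                  key=numeric_prefix)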

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 46

What was my plan for today?

  • making the first draft of incremental patches feature

What I did today?

elivepatch work:

  • Testing elivepatch with incremental patches
  • Code refactoring

What I will do next time?

  • Make the server-side incremental patch part and test it

Google Summer of Code day 47

What was my plan for today?

  • making the first draft of incremental patches feature

What I did today?

elivepatch work:

  • Testing elivepatch with incremental patches
  • Trying to find a way to make testing faster; I'm thinking of something like unit and integration testing
  • Merged build_livepatch with send_files, as it reduces the steps for making the livepatch and helps keep it more isolated
  • Going on writing the incremental patch feature

Building with kpatch-build usually takes too much time, which makes testing the parts that need a livepatch build long and complicated. It would probably be better to add some unit/integration tests to keep each task/function under control without needing to build livepatches every time. Any thoughts?

What I will do next time?

  • make the server side incremental patch part and test it

Google Summer of Code day 48

What was my plan for today?

  • making the first draft of incremental patches feature

What I did today?

elivepatch work:

  • Added PORTAGE_CONFIGROOT to set the path from which to get the incremental patches
  • Cannot download kernel sources to /usr/portage/distfiles/
  • Saving the created livepatch and patch on the client, but sending only the incremental patches and the patch to the elivepatch server
  • Cleaned the output from the bash command

What I will do next time?

  • make the server side incremental patch part and test it

Google Summer of Code day 49

What was my plan for today?

  • testing and improving elivepatch incremental patches feature

What I did today?

1) We need a way to get old genpatches. mpagano made a script that saves all the genpatches, and they are stored here: http://dev.gentoo.org/~mpagano/genpatches/tarballs/ I think we can point the ebuild in our overlay there to fetch the tarballs.

2) kpatch can work with an initrd.

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 50

What was my plan for today?

  • testing and improving elivepatch incremental patches feature

What I did today?

  • Refactoring code
  • Started writing the first draft of the automatic kernel livepatching system

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 51

What was my plan for today?

  • testing and improving elivepatch incremental patches feature

What I did today?

  • Checked eapply_user; eapply_user is using patch

  • Writing the elivepatch wiki page: https://wiki.gentoo.org/wiki/User:Aliceinwire/elivepatch

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day31-40 (November 10, 2017, 05:57 UTC)

Google Summer of Code day 31

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Dispatcher.py:

  • Fixed comments
  • Added a static method that returns the kernel path
  • Added a todo
  • Fixed how the directory is created with the uuid

livepatch.py:

  • Fixed docstring
  • Removed sudo on kpatch-build
  • Fixed comments
  • Merged the ebuild commands into one in build livepatch

restful.py:

  • Fixed comment

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 32

What was my plan for today?

  • testing and improving elivepatch

What I did today?

client checkers:

  • Used os.path.join in the ungz function
  • Implemented the temporary folder using Python's tempfile (see the sketch below)
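
A minimal sketch of that kind of tempfile-based helper (the name ungz comes from above; that it uncompresses a gzipped file such as a kernel config into a fresh temporary directory is my assumption, not the actual elivepatch code):

import gzip
import os
import shutil
import tempfile

def ungz(gz_path):
    """Uncompress a .gz file into a new temporary directory and return the new path."""
    tmp_dir = tempfile.mkdtemp(prefix='elivepatch-')
    out_path = os.path.join(tmp_dir, os.path.basename(gz_path)[:-len('.gz')])
    with gzip.open(gz_path, 'rb') as src, open(out_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    return out_path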

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 33

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Working with tempfile to keep the uncompressed configuration file, using the appropriate tempfile module.
  • Refactoring and code cleaning.
  • Checking if the ebuild is present in the overlay before trying to merge it.

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 34

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • lpatch added locally
  • Fixed the ebuild directory
  • Static patch and config filenames on send. This is useful for isolating the API requests, using only the uuid as session identifier
  • Removed the livepatchStatus and lpatch class configurations, because we need a request to be identified only by its own UUID, to isolate the transaction.
  • Return early in case a request with the same uuid is already present.

We still have that problem about "can't find special struct alt_instr size."

I will investigate it tomorrow.

What I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 35

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Kpatch needs some special section data to find where to inject the livepatch. kpatch-build checks that this special section data exists in the given vmlinux file. The vmlinux file needs CONFIG_DEBUG_INFO=y so that the debug symbols containing the special section data are generated. The special section data is found like this:

# Set state if name matches
a == 0 && /DW_AT_name.* alt_instr[[:space:]]*$/ {a = 1; next}
b == 0 && /DW_AT_name.* bug_entry[[:space:]]*$/ {b = 1; next}
p == 0 && /DW_AT_name.* paravirt_patch_site[[:space:]]*$/ {p = 1; next}
e == 0 && /DW_AT_name.* exception_table_entry[[:space:]]*$/ {e = 1; next}

What I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 36

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Fixed the livepatch output name to reflect the statically set patch name
  • Removed the module install step, as it is not needed
  • The debug option now also copies build.log to the uuid directory, for investigating failed attempts
  • Refactored uuid_dir to uuid in the cases where uuid_dir doesn't actually contain the full path
  • Adding CONFIG_DEBUG_INFO=y to the configuration file if not already present; however, this setting is only needed for the livepatch creation (not tested yet). See the sketch after this list.
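
A minimal sketch of forcing CONFIG_DEBUG_INFO=y into a kernel config (simplified; a real implementation would also have to handle the "# CONFIG_DEBUG_INFO is not set" comment form that kernel configs use):

def ensure_debug_info(config_path):
    """Append CONFIG_DEBUG_INFO=y to a kernel .config unless it is already set."""
    with open(config_path) as config:
        lines = config.read().splitlines()
    if any(line.startswith('CONFIG_DEBUG_INFO=') for line in lines):
        return
    with open(config_path, 'a') as config:
        config.write('\nCONFIG_DEBUG_INFO=y\n')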

What I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 37

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Fixed the return message for the cve option
  • Added different configuration examples [4.9.30, 4.10.17]
  • Catch the gentoo-sources-not-available error
  • Catch missing livepatch errors on the client

Tested kpatch with patches for kernels 4.9.30 and 4.10.17 and it worked without any problem. I checked with coverage to see which code is unused. I think we can remove the cve option for now, as it is not implemented yet, and use a modular implementation for it later, so the configuration could change. We need some documentation about elivepatch on the Gentoo wiki, and some unit tests to make development smoother and to make it simpler to check the working status with GitHub/Travis.

I talked with the kpatch creator and got some feedback:

“this project could also be used for kpatch testing :)
imagine instead of just loading the .ko, the client was to kick off a series of tests and report back.”

“why bother a production or tiny machine when you might have a patch-building server”

What I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 38

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Meeting with mentor summary

What we will do next:

  • Incremental patch tracking on the client side
  • CVE security vulnerability checker
  • Dividing the repository
  • Ebuild
  • Documentation [optional]
  • Modularity [optional]

Kpatch work:

  • Started making the incremental patch feature
  • Tested kpatch for a permission issue

What I will do next time?

  • testing and improving elivepatch
  • Investigating the missing information in the livepatch

Google Summer of Code day 39

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Meeting with mentor summary

elivepatch work:

  • Working on the incremental patch feature's design and implementation
  • Putting patch files under /var/run/elivepatch
  • Ordering patches by number
  • Cleaning the folder when the machine is restarted
  • Sending the patches to the server in order
  • Cleaning the client terminal output by catching the exceptions

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 40

What was my plan for today?

  • testing and improving elivepatch

What I did today?

Meeting with mentor

Summary of elivepatch work:

  • Working on the incremental patch manager
  • Cleaning the client terminal output by catching the exceptions

Making and testing the patch manager.

What I will do next time?

  • testing and improving elivepatch

November 09, 2017
Cigars and the Norwegian Government (November 09, 2017, 21:36 UTC)

[Updated 22nd November to add link to the response to proposed new regulations] As some of my readers know, I'm an aficionado of cigars, to the extent that I bought my own cigar importer and store. The picture below is from the 20th best bar in the world in 2017, and they sell our cigars of … Continue reading "Cigars and the Norwegian Government"

Alice Ferrazzi a.k.a. alicef (homepage, bugs)
Google Summer of Code day26-30 (November 09, 2017, 11:14 UTC)

Google Summer of Code day 26

What was my plan for today?

  • testing and improving elivepatch

What I did today?

After discussion with my mentor, I need:

  • Availability and replicability of downloading kernel sources [needed]
  • Reproducing the same kernel sources as the client's kernel sources (USE flags for now, user-added patches in the future) [needed]
  • Support for multiple requests at the same time [needed]
  • Modularity, for adding VM or container support (in the future)

Created an overlay for keeping old gentoo-sources ebuilds, where we will keep retrocompatibility for old kernels: https://github.com/aliceinwire/gentoo-sources_overlay

Made a function that downloads old kernel sources and installs them under the designated temporary folder (see the sketch below).
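
A minimal sketch of fetching that overlay into a temporary directory with git (the helper name and clone options are assumptions for illustration, not the actual elivepatch code):

import subprocess
import tempfile

OVERLAY_URL = 'https://github.com/aliceinwire/gentoo-sources_overlay.git'

def fetch_overlay():
    """Clone the gentoo-sources overlay into a fresh temporary directory."""
    tmp_dir = tempfile.mkdtemp(prefix='elivepatch-overlay-')
    subprocess.check_call(['git', 'clone', '--depth', '1', OVERLAY_URL, tmp_dir])
    return tmp_dir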

What I will do next time?

  • Keep improving elivepatch

Google Summer of Code day 27

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Fixed git clone of the gentoo-sources overlay under a temporary directory
  • Fixed download directories
  • Tested elivepatch

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 28

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Testing elivepatch with multithreading
  • Removed the check on the uname kernel version, as we now get the kernel version directly from the kernel configuration file header (see the sketch after this list)
  • Starting kpatch-build under the output folder. Because kpatch-build creates the livepatch under the $PWD folder, we start it under the uuid tmp folder and fetch the livepatch from there; this is useful for dealing with multithreading
  • Added some helper functions for code clarity [not finished yet]
  • Refactored for code clarity [not finished yet]
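
A minimal sketch of reading the version from the config header (the header format is an assumption based on common kernel configs, e.g. "# Linux/x86 4.10.17-gentoo Kernel Configuration"; not the actual elivepatch code):

import re

def kernel_version_from_config(config_path):
    """Extract the kernel version string from the comment header of a .config file."""
    with open(config_path) as config:
        for line in config:
            match = re.match(r'#\s*Linux\S*\s+(\S+)\s+Kernel Configuration', line)
            if match:
                return match.group(1)
    return None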

To make elivepatch multithreaded we can simply change flask's app.run() to app.run(threaded=True).
This makes flask spawn a thread for each request (using the SocketServer.ThreadingMixIn class and the base WSGI server)[1]; but even if this works pretty well, the suggested way of threading is probably gunicorn or uWSGI[2],
maybe something like flask-gunicorn: https://github.com/doobeh/flask-gunicorn
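
For illustration, the threaded variant looks like this (a minimal standalone sketch; the route is made up, not the real elivepatch API):

from flask import Flask

app = Flask(__name__)

@app.route('/ping')
def ping():
    return 'pong'

if __name__ == '__main__':
    # threaded=True makes the built-in server handle each request in its
    # own thread; for production, gunicorn or uWSGI is preferred.
    app.run(threaded=True)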

What I will do next time?

  • testing and improving elivepatch

[1] https://docs.python.org/2/library/socketserver.html#SocketServer.ThreadingMixIn [2] http://flask.pocoo.org/docs/0.12/deploying/

Google Summer of Code day 29

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Added some helper functions for code clarity [not finished yet]
  • Refactored for code clarity [not finished yet]
  • Sent a pull request to kpatch adding a dynamic output folder option: https://github.com/dynup/kpatch/pull/718
  • Sent a pull request to kpatch with a small style fix: https://github.com/dynup/kpatch/pull/719

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day 30

What was my plan for today?

  • testing and improving elivepatch

What I did today?

  • Fixing the uuid design

Moved the uuid generation to the client and read RFC 4122,

which states that "A UUID is 128 bits long, and requires no central registration process."

  • Made a regex function that checks that the format of the uuid is actually correct (see the sketch below).
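
A minimal sketch of client-side generation plus format checking (an illustration, not the actual elivepatch code; the regex only accepts the random version-4 form that uuid.uuid4() produces):

import re
import uuid

# Version-4 UUID, e.g. '5bc7a6f2-8d14-4be9-ae2c-0b1f9a2e6f31'
UUID_RE = re.compile(
    r'^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$')

def new_uuid():
    """Generate a random (version 4) UUID on the client."""
    return str(uuid.uuid4())

def is_valid_uuid(value):
    """Check that a string is a well-formed version-4 UUID."""
    return bool(UUID_RE.match(value.lower()))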

What I will do next time?

  • testing and improving elivepatch

Google Summer of Code day21-25 (November 09, 2017, 10:42 UTC)

Google Summer of Code day 21

What was my plan for today?

  • Testing elivepatch
  • Getting the kernel version dynamically
  • Updating kpatch-build for work with Gentoo better

What I did today?

Fixing the problems pointed out by my mentor.

  • Fixed software copyright header
  • Fixed popen call
  • Used regex for checking file extension
  • Added debug option for the server
  • Changed reserved word shadowing variables
  • Used os.path.join for the link path
  • Dynamically pass the kernel version

What I will do next time?

  • Testing elivepatch
  • Work on making elivepatch support multiple calls

Google Summer of Code day 22

What was my plan for today?

  • Testing elivepatch
  • work on making elivepatch support multiple calls

What I did today?

  • working on sending the kernel version
  • working on dividing users by UUID (Universally unique identifier)

With the first connection, because the UUID is not generated yet, it will be created and returned by the elivepatch server. The client stores the UUID with shelve so that it can authenticate the next time it queries the server. By using Python's uuid module I'm assigning a different UUID to each user, and by using UUIDs we can start supporting multiple calls. A sketch of the client side is below.
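
A minimal sketch of persisting the server-assigned UUID with shelve (the shelve path and the server call are stand-ins for illustration, not the actual elivepatch code):

import shelve

SHELVE_PATH = 'elivepatch_client_db'  # made-up path, just for illustration

def get_or_request_uuid(request_uuid_from_server):
    """Return the stored UUID, asking the server for one on first contact.

    request_uuid_from_server is a stand-in for whatever call returns
    a fresh UUID from the elivepatch server.
    """
    with shelve.open(SHELVE_PATH) as db:
        if 'uuid' not in db:
            db['uuid'] = request_uuid_from_server()
        return db['uuid']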

What I will do next time?

  • Testing elivepatch
  • Cleaning code

Google Summer of Code day 23

What was my plan for today?

  • Testing elivepatch
  • Cleaning code

What I did today?

  • Commented some parts of the code
  • First draft of the UUID handling working

The elivepatch server needs to generate a UUID for a client connection and assign the UUID to each client; this is needed for managing multiple requests. The livepatch and config files are generated in a different folder for each client request and returned using the UUID. Made a working draft. A sketch of the per-request folder handling is below.
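
A minimal sketch of the per-request folders (the base directory is an assumption for illustration, not the actual elivepatch layout):

import os

BASE_DIR = '/tmp/elivepatch'  # assumed base working directory

def request_dir(request_uuid):
    """Create (if needed) and return the working folder for one client request."""
    path = os.path.join(BASE_DIR, request_uuid)
    os.makedirs(path, exist_ok=True)
    return path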

What I will do next time?

  • Cleaning code
  • Testing it
  • Go on with programming and start implementing the CVE feature

Google Summer of Code day 24

What was my plan for today?

  • Cleaning code
  • Testing it
  • Go on with programming and start implementing the CVE feature

What I did today?

  • Working with getting the Linux kernel version from configuration file
  • Working with parsing CVE repository

I implemented getting the Linux kernel version from the configuration file, and I'm working on parsing the CVE repository. It would also be a nice idea to work on the kpatch-build script to make it more Gentoo compatible. But I also ran into one problem: we need a way to download old gentoo-sources ebuilds for building an old kernel when the server doesn't have the needed version of the kernel sources. How far back we go depends on how much backwards compatibility we want to provide. And with old gentoo-sources we cannot be sure everything still works, because old gentoo-sources used different versions of the Gentoo repository and eclasses. So this is a problem to discuss in the coming days.

What I will do next time?

  • work on the CVE repository
  • testing it

Google Summer of Code day 25

What was my plan for today?

  • Cleaning code
  • testing and improving elivepatch

What I did today?

As discussed with my mentor, I worked on unifying the patch and configuration file RESTful API calls, and made a function to standardize the subprocess commands (a sketch below).
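
A minimal sketch of what a standardized subprocess helper can look like (an illustration under assumptions, not the actual elivepatch helper):

import subprocess

def run_command(args, cwd=None):
    """Run a command, echo it, and return its output; raises CalledProcessError on failure."""
    print('Running: ' + ' '.join(args))
    return subprocess.check_output(args, cwd=cwd, stderr=subprocess.STDOUT)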

What I will do next time?

  • testing it and improving elivepatch

November 02, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Gentoo Linux on DELL XPS 13 9365 and 9360 (November 02, 2017, 17:33 UTC)

Since I received some positive feedback about my previous DELL XPS 9350 post, I am writing this summary about my recent experience in installing Gentoo Linux on a DELL XPS 13 9365.

The goals of these installation notes:

  • UEFI boot using Grub
  • Provide you with a complete and working kernel configuration
  • Encrypted disk root partition using LUKS
  • Be able to type your LUKS passphrase to decrypt your partition using your local keyboard layout

Grub & UEFI & Luks installation

This installation is a fully UEFI one using grub and booting an encrypted root partition (and home partition). I was happy to see that since my previous post, everything got smoother. So even if you can have this installation working using MBR, I don’t really see a point in avoiding UEFI now.

BIOS configuration

Just like with its ancestor, you should:

  • Turn off Secure Boot
  • Set SATA Operation to AHCI

Live CD

Once again, go for the latest SystemRescueCD (it’s Gentoo based, you won’t be lost) as it’s more up to date and supports booting on UEFI. Make it a Live USB, for example using unetbootin and the ISO on a vfat-formatted USB stick.

NVME SSD disk partitioning

We’ll obviously use GPT with UEFI. I found that using gdisk was the easiest. The disk itself is found at /dev/nvme0n1. Here is the partition table I used:

  • 10MB UEFI BIOS partition (type EF02)
  • 500MB UEFI boot partition (type EF00)
  • 2GB swap partition
  • 475GB Linux root partition

The corresponding gdisk commands:

# gdisk /dev/nvme0n1

Command: o ↵
This option deletes all partitions and creates a new protective MBR.
Proceed? (Y/N): y ↵

Command: n ↵
Partition Number: 1 ↵
First sector: ↵
Last sector: +10M ↵
Hex Code: EF02 ↵

Command: n ↵
Partition Number: 2 ↵
First sector: ↵
Last sector: +500M ↵
Hex Code: EF00 ↵

Command: n ↵
Partition Number: 3 ↵
First sector: ↵
Last sector: +2G ↵
Hex Code: 8200 ↵

Command: n ↵
Partition Number: 4 ↵
First sector: ↵
Last sector: ↵ (for rest of disk)
Hex Code: ↵

Command: p ↵
Disk /dev/nvme0n1: 1000215216 sectors, 476.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): A73970B7-FF37-4BA7-92BE-2EADE6DDB66E
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1000215182
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048           22527   10.0 MiB    EF02  BIOS boot partition
   2           22528         1046527   500.0 MiB   EF00  EFI System
   3         1046528         5240831   2.0 GiB     8200  Linux swap
   4         5240832      1000215182   474.4 GiB   8300  Linux filesystem

Command: w ↵
Do you want to proceed? (Y/N): Y ↵

No WiFi on Live CD ? no panic

Once again on my (old?) SystemRescueCD stick, the integrated Intel 8265/8275 wifi card is not detected.

So I used my old trick: my Android phone connected to my local WiFi and used as a USB modem, which was detected directly by the live CD.

  • get your Android phone connected on your local WiFi (unless you want to use your cellular data)
  • plug in your phone using USB to your XPS
  • on your phone, go to Settings / More / Tethering & portable hotspot
  • enable USB tethering

Running ip addr will show the network card enp0s20f0u2 (for me at least); if no IP address is set on the card, just ask for one:

# dhcpcd enp0s20f0u2

You have now access to the internet.

Proceed with installation

The only thing to prepare is to format the UEFI boot partition as FAT32. Do not worry about the UEFI BIOS partition (/dev/nvme0n1p1), grub will take care of it later.

# mkfs.vfat -F 32 /dev/nvme0n1p2

Do not forget to use cryptsetup to encrypt your /dev/nvme0n1p4 partition! In the rest of the article, I’ll be using its device mapper representation.

# cryptsetup luksFormat -s 512 /dev/nvme0n1p4
# cryptsetup luksOpen /dev/nvme0n1p4 root
# mkfs.ext4 /dev/mapper/root

Then follow the Gentoo handbook as usual for the stage3 related next steps. Make sure you mount and bind the following to your /mnt/gentoo LiveCD installation folder (the /sys binding is important for grub UEFI):

# mount -t proc none /mnt/gentoo/proc
# mount -o bind /dev /mnt/gentoo/dev
# mount -o bind /sys /mnt/gentoo/sys

make.conf settings

I strongly recommend using at least the following on your /etc/portage/make.conf :

GRUB_PLATFORM="efi-64"
INPUT_DEVICES="evdev synaptics"
VIDEO_CARDS="intel i965"

USE="bindist cryptsetup"

The GRUB_PLATFORM one is important for later grub setup and the cryptsetup USE flag will help you along the way.

fstab for SSD

Don’t forget to make sure the noatime option is used on your fstab for / and /home.

/dev/nvme0n1p2    /boot    vfat    noauto,noatime    1 2
/dev/nvme0n1p3    none     swap    sw                0 0
/dev/mapper/root  /        ext4    noatime   0 1

Kernel configuration and compilation

I suggest you use a recent >=sys-kernel/gentoo-sources-4.13.0 along with genkernel.

  • You can download my kernel configuration file (iptables, docker, luks & stuff included)
  • Put the kernel configuration file into the /etc/kernels/ directory (with a trailing s)
  • Rename the configuration file with the exact version of your kernel

Then you’ll need to configure genkernel to add luks support, firmware files support and keymap support if your keyboard layout is not QWERTY.

In your /etc/genkernel.conf, change the following options:

LUKS="yes"
FIRMWARE="yes"
KEYMAP="1"

Then run genkernel all to build your kernel and luks+firmware+keymap aware initramfs.

Grub UEFI bootloader with LUKS and custom keymap support

Now it’s time for the grub magic to happen so you can boot your wonderful Gentoo installation using UEFI and type your password using your favourite keyboard layout.

  • make sure your boot vfat partition is mounted on /boot
  • edit your /etc/default/grub configuration file with the following:
GRUB_CMDLINE_LINUX="crypt_root=/dev/nvme0n1p4 keymap=fr"

This will let your initramfs know that it has to read the encrypted root partition from the given partition and to prompt for its password using the given keyboard layout (French here).

Now let’s install the grub UEFI boot files and setup the UEFI BIOS partition.

# grub-install --efi-directory=/boot --target=x86_64-efi /dev/nvme0n1
Installing for x86_64-efi platform.
Installation finished. No error reported

It should report no error, then we can generate the grub boot config:

# grub-mkconfig -o /boot/grub/grub.cfg

You’re all set!

You will get a gentoo UEFI boot option; you can disable the Microsoft Windows one from your BIOS to get straight to the point.

Hope this helped!

November 01, 2017
Sebastian Pipping a.k.a. sping (homepage, bugs)
Expat 2.2.5 released (November 01, 2017, 18:45 UTC)

Expat 2.2.5 has recently been released. It fixes miscellaneous bugs. For more details, please check the changelog.

If you maintain Expat packaging or a bundled version of Expat somewhere, please update to 2.2.5. Thanks!

Sebastian Pipping

October 24, 2017

Description:
binutils is a set of tools necessary to build programs.

The complete ASan output of the issue:

# readelf -a -w -g -t -s -r -u -A -c -I -W $FILE
==92332==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x613000000f94 at pc 0x0000005b4c54 bp 0x7ffda4ab1e10 sp 0x7ffda4ab1e08
READ of size 1 at 0x613000000f94 thread T0
    #0 0x5b4c53 in get_line_filename_and_dirname /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/dwarf.c:4681:10
    #1 0x5b4c53 in display_debug_macro /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/dwarf.c:4839
    #2 0x536709 in display_debug_section /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:13371:16
    #3 0x536709 in process_section_contents /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:13457
    #4 0x536709 in process_object /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:18148
    #5 0x51099b in process_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:18540:13
    #6 0x51099b in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:18612
    #7 0x7f4febd95680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289
    #8 0x41a128 in getenv (/usr/x86_64-pc-linux-gnu/binutils-bin/git/readelf+0x41a128)

0x613000000f94 is located 0 bytes to the right of 340-byte region [0x613000000e40,0x613000000f94)
allocated by thread T0 here:
    #0 0x4d8318 in malloc /var/tmp/portage/sys-libs/compiler-rt-sanitizers-5.0.0/work/compiler-rt-5.0.0.src/lib/asan/asan_malloc_linux.cc:67
    #1 0x511975 in get_data /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:392:9
    #2 0x50eccf in load_specific_debug_section /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:13172:38
    #3 0x50e623 in load_debug_section /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:13300:10
    #4 0x5b1969 in display_debug_macro /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/dwarf.c:4717:3
    #5 0x536709 in display_debug_section /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:13371:16
    #6 0x536709 in process_section_contents /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:13457
    #7 0x536709 in process_object /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:18148
    #8 0x51099b in process_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:18540:13
    #9 0x51099b in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/readelf.c:18612
    #10 0x7f4febd95680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289

SUMMARY: AddressSanitizer: heap-buffer-overflow /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/dwarf.c:4681:10 in get_line_filename_and_dirname
Shadow bytes around the buggy address:
  0x0c267fff81a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c267fff81b0: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x0c267fff81c0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c267fff81d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c267fff81e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c267fff81f0: 00 00[04]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8210: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8220: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==92332==ABORTING

Affected version:
2.28 – 2.28.1 – 2.29.51.20170919 and maybe past releases

Fixed version:
N/A

Commit fix:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=5c1c468d0eddd0fda1ec8c5f33888657f94e3266

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
Waiting for a CVE assignment

Reproducer:
https://github.com/asarubbo/poc/blob/master/00382-binutils-heapoverflow-get_line_filename_and_dirname

Timeline:
2017-09-19: bug discovered and reported to upstream
2017-09-26: upstream released a patch
2017-10-24: blog post about the issue

Note:
This bug was found with American Fuzzy Lop.
This bug was identified with bare metal servers donated by Packet. This work is also supported by the Core Infrastructure Initiative.

Permalink:

binutils: heap-based buffer overflow in get_line_filename_and_dirname (dwarf.c)

Description:
binutils is a set of tools necessary to build programs.

The complete ASan output of the issue:

# nm -A -a -l -S -s --special-syms --synthetic --with-symbol-versions -D $FILE
==23816==ERROR: AddressSanitizer: SEGV on unknown address 0x4700004008d0 (pc 0x0000005427b6 bp 0x7ffd49033690 sp 0x7ffd49033680 T0)                                                                               
==23816==The signal is caused by a READ memory access.                                                                                                                                                            
    #0 0x5427b5 in _bfd_safe_read_leb128 /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/libbfd.c:1019:14                                                                                              
    #1 0x6a9b25 in find_abstract_instance_name /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:2918:19                                                                                        
    #2 0x69a3ff in scan_unit_for_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3168:10                                                                                              
    #3 0x6a2de6 in comp_unit_maybe_decode_line_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3660:9                                                                                    
    #4 0x6a2de6 in comp_unit_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3686                                                                                                   
    #5 0x6a0369 in _bfd_dwarf2_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4798:11                                                                                      
    #6 0x5f332e in _bfd_elf_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/elf.c:8695:10                                                                                                    
    #7 0x5176a3 in print_symbol /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1003:9                                                                                                       
    #8 0x514e4d in print_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1084:7                                                                                                      
    #9 0x514e4d in display_rel_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1200                                                                                                     
    #10 0x510976 in display_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1318:7                                                                                                      
    #11 0x50f4ce in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1792:12                                                                                                             
    #12 0x7f839bb03680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289                                                                                   
    #13 0x41a638 in chmod (/usr/x86_64-pc-linux-gnu/binutils-bin/git/nm+0x41a638)                                                                                                                                 
                                                                                                                                                                                                                  
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/libbfd.c:1019:14 in _bfd_safe_read_leb128
==23816==ABORTING

Affected version:
2.29.51.20170925 and maybe past releases

Fixed version:
N/A

Commit fix:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=1b86808a86077722ee4f42ff97f836b12420bb2a

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-15938

Reproducer:
https://github.com/asarubbo/poc/blob/master/00381-binutils-invalidread-find_abstract_instance_name

Timeline:
2017-09-26: bug discovered and reported to upstream
2017-09-26: upstream released a patch
2017-10-24: blog post about the issue
2017-10-27: CVE assigned

Note:
This bug was found with American Fuzzy Lop.
This bug was identified with bare metal servers donated by Packet. This work is also supported by the Core Infrastructure Initiative.

Permalink:

binutils: invalid memory read in find_abstract_instance_name (dwarf2.c)

Description:
binutils is a set of tools necessary to build programs.

The commit fix for this issue says:

The PR22200 fuzzer testcase found one way to put NULLs into .debug_line file tables. PR22205 finds another.
So MITRE considers this an incomplete fix.

The complete ASan output of the issue:

# nm -A -a -l -S -s --special-syms --synthetic --with-symbol-versions -D $FILE
==19042==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x0000006a76a6 bp 0x7ffde0afde30 sp 0x7ffde0afde00 T0)
==19042==The signal is caused by a READ memory access.
==19042==Hint: address points to the zero page.
    #0 0x6a76a5 in concat_filename /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:1601:8
    #1 0x696ff3 in decode_line_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:2265:44
    #2 0x6a2d36 in comp_unit_maybe_decode_line_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3651:26
    #3 0x6a2d36 in comp_unit_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3686
    #4 0x6a0369 in _bfd_dwarf2_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4798:11
    #5 0x5f332e in _bfd_elf_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/elf.c:8695:10
    #6 0x5176a3 in print_symbol /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1003:9
    #7 0x514e4d in print_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1084:7
    #8 0x514e4d in display_rel_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1200
    #9 0x510976 in display_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1318:7
    #10 0x50f4ce in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1792:12
    #11 0x7f6c6d793680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289
    #12 0x41a638 in chmod (/usr/x86_64-pc-linux-gnu/binutils-bin/git/nm+0x41a638)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:1601:8 in concat_filename
==19042==ABORTING

Affected version:
2.29.51.20170925 and maybe past releases

Fixed version:
N/A

Commit fix:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a54018b72d75abf2e74bf36016702da06399c1d9

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-15939

Reproducer:
https://github.com/asarubbo/poc/blob/master/00380-binutils-NULLptr-concat_filename

Timeline:
2017-09-25: bug discovered and reported to upstream
2017-09-26: upstream released a patch
2017-10-24: blog post about the issue
2017-10-27: CVE assigned

Note:
This bug was found with American Fuzzy Lop.
This bug was identified with bare metal servers donated by Packet. This work is also supported by the Core Infrastructure Initiative.

Permalink:

binutils: NULL pointer dereference in concat_filename (dwarf2.c) (INCOMPLETE FIX FOR CVE-2017-15023)

October 16, 2017
Greg KH a.k.a. gregkh (homepage, bugs)
Linux Kernel Community Enforcement Statement FAQ (October 16, 2017, 09:05 UTC)

Based on the recent Linux Kernel Community Enforcement Statement and the article describing the background and what it means , here are some Questions/Answers to help clear things up. These are based on questions that came up when the statement was discussed among the initial round of over 200 different kernel developers.

Q: Is this changing the license of the kernel?

A: No.

Q: Seriously? It really looks like a change to the license.

A: No, the license of the kernel is still GPLv2, as before. The kernel developers are providing certain additional promises that they encourage users and adopters to rely on. And by having a specific acking process it is clear that those who ack are making commitments personally (and perhaps, if authorized, on behalf of the companies that employ them). There is nothing that says those commitments are somehow binding on anyone else. This is exactly what we have done in the past when some but not all kernel developers signed off on the driver statement.

Q: Ok, but why have this “additional permissions” document?

A: In order to help address problems caused by current and potential future copyright “trolls” aka monetizers.

Q: Ok, but how will this help address the “troll” problem?

A: “Copyright trolls” use the GPL-2.0’s immediate termination and the threat of an immediate injunction to turn an alleged compliance concern into a contract claim that gives the troll an automatic claim for money damages. The article by Heather Meeker describes this quite well, please refer to that for more details. If even a short delay is inserted for coming into compliance, that delay disrupts this expedited legal process.

By simply saying, “We think you should have 30 days to come into compliance”, we undermine that “immediacy” which supports the request to the court for an immediate injunction. The threat of an immediate injunction was used to get the companies to sign contracts. Then the troll goes back after the same company for another known violation shortly after and claims they’re owed the financial penalty for breaking the contract. Signing contracts to pay damages to financially enrich one individual is completely at odds with our community’s enforcement goals.

We are showing that the community is not out for financial gain when it comes to license issues – though we do care about the company coming into compliance.  All we want is the modifications to our code to be released back to the public, and for the developers who created that code to become part of our community so that we can continue to create the best software that works well for everyone.

This is all still entirely focused on bringing the users into compliance. The 30 days can be used productively to determine exactly what is wrong, and how to resolve it.

Q: Ok, but why are we referencing GPL-3.0?

A: By using the terms from the GPLv3 for this, we use a very well-vetted and understood procedure for granting the opportunity to come fix the failure and come into compliance. We benefit from many months of work to reach agreement on a termination provision that worked in legal systems all around the world and was entirely consistent with Free Software principles.

Q: But what is the point of the “non-defensive assertion of rights” disclaimer?

A: If a copyright holder is attacked, we don’t want or need to require that copyright holder to give the party suing them an opportunity to cure. The “non-defensive assertion of rights” is just a way to leave everything unchanged for a copyright holder that gets sued.  This is no different a position than what they had before this statement.

Q: So you are ok with using Linux as a defensive copyright method?

A: There is a current copyright troll problem that is undermining confidence in our community – where a “bad actor” is attacking companies in a way to achieve personal gain. We are addressing that issue. No one has asked us to make changes to address other litigation.

Q: Ok, this document sounds like it was written by a bunch of big companies, who is behind the drafting of it and how did it all happen?

A: Grant Likely, the chairman at the time of the Linux Foundation’s Technical Advisory Board (TAB), wrote the first draft of this document when the first copyright troll issue happened a few years ago. He did this as numerous companies and developers approached the TAB asking that the Linux kernel community do something about this new attack on our community. He showed the document to a lot of kernel developers and a few company representatives in order to get feedback on how it should be worded. After the troll seemed to go away, this work got put on the back-burner. When the copyright troll showed back up, along with a few other “copycat” like individuals, the work on the document was started back up by Chris Mason, the current chairman of the TAB. He worked with the TAB members, other kernel developers, lawyers who have been trying to defend these claims in Germany, and the TAB members’ Linux Foundation’s lawyers, in order to rework the document so that it would actually achieve the intended benefits and be useful in stopping these new attacks. The document was then reviewed and revised with input from Linus Torvalds and finally a document that the TAB agreed would be sufficient was finished. That document was then sent to over 200 of the most active kernel developers for the past year by Greg Kroah-Hartman to see if they, or their company, wished to support the document. That produced the initial “signatures” on the document, and the acks of the patch that added it to the Linux kernel source tree.

Q: How do I add my name to the document?

A: If you are a developer of the Linux kernel, simply send Greg a patch adding your name to the proper location in the document (sorting the names by last name), and he will be glad to accept it.

Q: How can my company show its support of this document?

A: If you are a developer working for a company that wishes to show that they also agree with this document, have the developer put the company name in ‘(’ ‘)’ after the developer’s name. This shows that both the developer, and the company behind the developer are in agreement with this statement.

Q: How can a company or individual that is not part of the Linux kernel community show its support of the document?

A: Become part of our community! Send us patches, surely there is something that you want to see changed in the kernel. If not, wonderful, post something on your company web site, or personal blog in support of this statement, we don’t mind that at all.

Q: I’ve been approached by a copyright troll for Netfilter. What should I do?

A: Please see the Netfilter FAQ here for how to handle this

Q: I have another question, how do I ask it?

A: Email Greg or the TAB, and they will be glad to help answer them.

Linux Kernel Community Enforcement Statement (October 16, 2017, 09:00 UTC)

By Greg Kroah-Hartman, Chris Mason, Rik van Riel, Shuah Khan, and Grant Likely

The Linux kernel ecosystem of developers, companies and users has been wildly successful by any measure over the last couple decades. Even today, 26 years after the initial creation of the Linux kernel, the kernel developer community continues to grow, with more than 500 different companies and over 4,000 different developers getting changes merged into the tree during the past year. As Greg always says every year, the kernel continues to change faster this year than the last, this year we were running around 8.5 changes an hour, with 10,000 lines of code added, 2,000 modified, and 2,500 lines removed every hour of every day.

The stunning growth and widespread adoption of Linux, however, also requires ever evolving methods of achieving compliance with the terms of our community’s chosen license, the GPL-2.0. At this point, there is no lack of clarity on the base compliance expectations of our community. Our goals as an ecosystem are to make sure new participants are made aware of those expectations and the materials available to assist them, and to help them grow into our community.  Some of us spend a lot of time traveling to different companies all around the world doing this, and lots of other people and groups have been working tirelessly to create practical guides for everyone to learn how to use Linux in a way that is compliant with the license. Some of these activities include:

Unfortunately the same processes that we use to assure fulfillment of license obligations and availability of source code can also be used unjustly in trolling activities to extract personal monetary rewards. In particular, issues have arisen as a developer from the Netfilter community, Patrick McHardy, has sought to enforce his copyright claims in secret and for large sums of money by threatening or engaging in litigation. Some of his compliance claims are issues that should and could easily be resolved. However, he has also made claims based on ambiguities in the GPL-2.0 that no one in our community has ever considered part of compliance.  

Examples of these claims have been: distributing over-the-air firmware; requiring a cell phone maker to deliver a paper copy of the source code offer letter; claiming the source code server must be set up with a download speed as fast as the binary server, based on the “equivalent access” language of Section 3; requiring the GPL-2.0 to be delivered in a local language; and many others.

How he goes about this activity was recently documented very well by Heather Meeker.

Numerous active contributors to the kernel community have tried to reach out to Patrick to have a discussion about his activities, to no response. Further, the Netfilter community suspended Patrick from contributing for violations of their principles of enforcement. The Netfilter community also published their own FAQ on this matter.

While the kernel community has always supported enforcement efforts to bring companies into compliance, we have never even considered enforcement for the purpose of extracting monetary gain.  It is not possible to know an exact figure due to the secrecy of Patrick’s actions, but we are aware of activity that has resulted in payments of at least a few million Euros.  We are also aware that these actions, which have continued for at least four years, have threatened the confidence in our ecosystem.

Because of this, and to help clarify what the majority of Linux kernel community members feel is the correct way to enforce our license, the Technical Advisory Board of the Linux Foundation has worked together with lawyers in our community, individual developers, and many companies that participate in the development of, and rely on Linux, to draft a Kernel Enforcement Statement to help address both this specific issue we are facing today, and to help prevent any future issues like this from happening again.

A key goal of all enforcement of the GPL-2.0 license has and continues to be bringing companies into compliance with the terms of the license. The Kernel Enforcement Statement is designed to do just that.  It adopts the same termination provisions we are all familiar with from GPL-3.0 as an Additional Permission giving companies confidence that they will have time to come into compliance if a failure is identified. Their ability to rely on this Additional Permission will hopefully re-establish user confidence and help direct enforcement activity back to the original purpose we have all sought over the years – actual compliance.  

Kernel developers in our ecosystem may put their own acknowledgement to the Statement by sending a patch to Greg adding their name to the Statement, like any other kernel patch submission, and it will be gladly merged. Those authorized to ‘ack’ on behalf of their company may add their company name in (parenthesis) after their name as well.

Note, a number of questions did come up when this was discussed with the kernel developer community. Please see Greg’s FAQ post answering the most common ones if you have further questions about this topic.

October 13, 2017
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Gentoo Linux listed on RethinkDB’s website (October 13, 2017, 08:22 UTC)

The rethinkdb website has (finally) been updated and Gentoo Linux is now listed on the installation page!

Meanwhile, we have bumped the ebuild to version 2.3.6 with fixes for building on gcc-6, thanks to Peter Levine, who kindly proposed a nice PR on GitHub.

October 03, 2017

Description:
binutils is a set of tools necessary to build programs.

The complete ASan output of the issue:

# nm -A -a -l -S -s --special-syms --synthetic --with-symbol-versions -D $FILE
==26890==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6130000006d3 at pc 0x000000472115 bp 0x7ffdb7d8a0d0 sp 0x7ffdb7d89880                                                                         
READ of size 298 at 0x6130000006d3 thread T0                                                                                                                                                                      
    #0 0x472114 in __interceptor_strlen /var/tmp/portage/sys-libs/compiler-rt-sanitizers-5.0.0/work/compiler-rt-5.0.0.src/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:302                      
    #1 0x68fea5 in parse_die /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf1.c:254:12                                                                                                           
    #2 0x68ddda in _bfd_dwarf1_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf1.c:521:13                                                                                       
    #3 0x5f2f00 in _bfd_elf_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/elf.c:8659:10                                                                                            
    #4 0x517755 in print_symbol /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1004:12                                                                                                      
    #5 0x514e4d in print_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1084:7                                                                                                      
    #6 0x514e4d in display_rel_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1200                                                                                                     
    #7 0x510976 in display_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1318:7                                                                                                       
    #8 0x50f4ce in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1792:12                                                                                                              
    #9 0x7f3dea34e680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289                                                                                    
    #10 0x41a638 in chmod (/usr/x86_64-pc-linux-gnu/binutils-bin/git/nm+0x41a638)                                                                                                                                 
                                                                                                                                                                                                                  
0x6130000006d3 is located 0 bytes to the right of 339-byte region [0x613000000580,0x6130000006d3)                                                                                                                 
allocated by thread T0 here:                                                                                                                                                                                      
    #0 0x4d8828 in malloc /var/tmp/portage/sys-libs/compiler-rt-sanitizers-5.0.0/work/compiler-rt-5.0.0.src/lib/asan/asan_malloc_linux.cc:67                                                                      
    #1 0x53f138 in bfd_malloc /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/libbfd.c:193:9
    #2 0x799bc8 in bfd_get_full_section_contents /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/compress.c:248:21
    #3 0x7b8797 in bfd_simple_get_relocated_section_contents /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/simple.c:193:12
    #4 0x68e3b1 in _bfd_dwarf1_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf1.c:490:4
    #5 0x5f2f00 in _bfd_elf_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/elf.c:8659:10
    #6 0x517755 in print_symbol /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1004:12
    #7 0x514e4d in print_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1084:7
    #8 0x514e4d in display_rel_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1200
    #9 0x510976 in display_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1318:7
    #10 0x50f4ce in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1792:12
    #11 0x7f3dea34e680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289

SUMMARY: AddressSanitizer: heap-buffer-overflow /var/tmp/portage/sys-libs/compiler-rt-sanitizers-5.0.0/work/compiler-rt-5.0.0.src/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:302 in __interceptor_strlen
Shadow bytes around the buggy address:
  0x0c267fff8080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c267fff8090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c267fff80a0: 00 00 00 04 fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff80b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c267fff80c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c267fff80d0: 00 00 00 00 00 00 00 00 00 00[03]fa fa fa fa fa
  0x0c267fff80e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff80f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8110: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8120: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==26890==ABORTING
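
The crash is strlen() running off the end of the 339-byte malloc'd section buffer: parse_die takes a DW_AT_name pointer out of the raw debug-section data and assumes it is NUL-terminated. A minimal sketch of the kind of bounds check that prevents this class of bug (a hypothetical helper, not the upstream patch; see the fix commit below):

/* Hypothetical helper, not the upstream patch: verify that a string
   pulled out of a malloc'd section buffer is NUL-terminated before
   the end of the buffer, so that strlen() cannot read past it.  */
#include <stddef.h>
#include <string.h>

static const char *
safe_section_string (const char *buf, size_t buf_size, size_t offset)
{
  if (offset >= buf_size)
    return NULL;                /* string starts out of bounds */
  /* memchr stops at the buffer boundary; strlen would not.  */
  if (memchr (buf + offset, '\0', buf_size - offset) == NULL)
    return NULL;                /* no terminator inside the buffer */
  return buf + offset;
}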

Affected version:
2.29.51.20170924 and possibly earlier releases

Fixed version:
N/A

Commit fix:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=1da5c9a485f3dcac4c45e96ef4b7dae5948314b5

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-15020

Reproducer:
https://github.com/asarubbo/poc/blob/master/00376-binutils-heapoverflow-parse_die

Timeline:
2017-09-25: bug discovered and reported to upstream
2017-09-25: upstream released a patch
2017-10-03: blog post about the issue
2017-10-04: CVE assigned

Note:
This bug was found with American Fuzzy Lop.
This bug was identified with bare metal servers donated by Packet. This work is also supported by the Core Infrastructure Initiative.

Permalink: binutils: heap-based buffer overflow in parse_die (dwarf1.c)

Description:
binutils is a set of tools necessary to build programs.

The stack trace of this issue looks like a NULL pointer access. However, the upstream maintainer changed the summary of the bug report to “DW_AT_name with out of bounds reference”, and the fix commit likewise refers to “DW_AT_name with out of bounds reference”.
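
In other words, a corrupted DW_AT_name attribute yields a pointer outside the loaded debug section, which bfd_hash_lookup then dereferences while hashing. A minimal sketch of the defensive pattern, with made-up names (the real fix is in the commit linked below):

/* Hypothetical guard, not the upstream patch: reject a DW_AT_name
   value that is NULL or does not point inside the loaded debug
   section before handing it to the hash table.  */
#include <stdbool.h>
#include <stddef.h>

static bool
name_is_hashable (const char *name,
                  const char *sec_start, const char *sec_end)
{
  if (name == NULL)
    return false;
  /* An out-of-bounds DW_AT_name reference produces a pointer outside
     the section; refuse it instead of letting bfd_hash_hash walk it.  */
  return name >= sec_start && name < sec_end;
}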

The complete ASan output of the issue:

# nm -A -a -l -S -s --special-syms --synthetic --with-symbol-versions -D $FILE
==8739==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00000053bf16 bp 0x7ffcab59ee60 sp 0x7ffcab59ee20 T0)
==8739==The signal is caused by a READ memory access.
==8739==Hint: address points to the zero page.
    #0 0x53bf15 in bfd_hash_hash /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/hash.c:441:15
    #1 0x53bf15 in bfd_hash_lookup /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/hash.c:467
    #2 0x6a2049 in insert_info_hash_table /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:487:37
    #3 0x6a2049 in comp_unit_hash_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3776
    #4 0x6a2049 in stash_maybe_update_info_hash_tables /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4120
    #5 0x69cbbc in stash_maybe_enable_info_hash_tables /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4214:3
    #6 0x69cbbc in _bfd_dwarf2_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4613
    #7 0x5f330e in _bfd_elf_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/elf.c:8695:10
    #8 0x5176a3 in print_symbol /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1003:9
    #9 0x514e4d in print_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1084:7
    #10 0x514e4d in display_rel_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1200
    #11 0x510976 in display_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1318:7
    #12 0x50f4ce in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1792:12
    #13 0x7fd148c7b680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289
    #14 0x41a638 in chmod (/usr/x86_64-pc-linux-gnu/binutils-bin/git/nm+0x41a638)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/hash.c:441:15 in bfd_hash_hash
==8739==ABORTING

Affected version:
2.29.51.20170924 and possibly earlier releases

Fixed version:
N/A

Commit fix:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=11855d8a1f11b102a702ab76e95b22082cccf2f8

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-15022

Reproducer:
https://github.com/asarubbo/poc/blob/master/00375-binutils-NULLptr-bfd_hash_hash

Timeline:
2017-09-25: bug discovered and reported to upstream
2017-09-25: upstream released a patch
2017-10-03: blog post about the issue
2017-10-04: CVE assigned

Note:
This bug was found with American Fuzzy Lop.
This bug was identified with bare metal servers donated by Packet. This work is also supported by the Core Infrastructure Initiative.

Permalink: binutils: NULL pointer dereference in bfd_hash_hash (hash.c)

Description:
binutils is a set of tools necessary to build programs.

The complete ASan output of the issue:

# nm -A -a -l -S -s --special-syms --synthetic --with-symbol-versions -D $FILE
==3765==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x0000006a7376 bp 0x7ffd5f9a3d50 sp 0x7ffd5f9a3d20 T0)
==3765==The signal is caused by a READ memory access.
==3765==Hint: address points to the zero page.
    #0 0x6a7375 in concat_filename /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:1601:8
    #1 0x696e83 in decode_line_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:2258:44
    #2 0x6a2ab8 in comp_unit_maybe_decode_line_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3642:26
    #3 0x6a2ab8 in comp_unit_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3677
    #4 0x6a0104 in _bfd_dwarf2_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4789:11
    #5 0x5f330e in _bfd_elf_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/elf.c:8695:10
    #6 0x5176a3 in print_symbol /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1003:9
    #7 0x514e4d in print_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1084:7
    #8 0x514e4d in display_rel_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1200
    #9 0x510976 in display_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1318:7
    #10 0x50f4ce in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1792:12
    #11 0x7f0f4a74b680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289
    #12 0x41a638 in chmod (/usr/x86_64-pc-linux-gnu/binutils-bin/git/nm+0x41a638)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:1601:8 in concat_filename
==3765==ABORTING
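
Here decode_line_info hands concat_filename a file index for which no valid name exists, and a NULL name is dereferenced while building the path. A sketch of the defensive pattern, using an assumed filename-table layout rather than the real bfd structures:

/* Sketch with an assumed minimal table layout, not the real bfd
   types: validate the DWARF file index and the stored name, and fall
   back to a placeholder instead of dereferencing NULL.  */
#include <stdlib.h>
#include <string.h>

struct file_entry { char *name; };   /* assumed minimal layout */

static char *
lookup_filename (const struct file_entry *files,
                 unsigned int num_files, unsigned int file)
{
  /* DWARF 2-4 file numbers are 1-based; 0 means "unknown".  */
  if (files == NULL || file == 0 || file > num_files
      || files[file - 1].name == NULL)
    return strdup ("<unknown>");
  return strdup (files[file - 1].name);
}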

Affected version:
2.29.51.20170924 and possibly earlier releases

Fixed version:
N/A

Commit fix:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=c361faae8d964db951b7100cada4dcdc983df1bf

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-15023

Reproducer:
https://github.com/asarubbo/poc/blob/master/00374-binutils-NULLptr-concat_filename

Timeline:
2017-09-25: bug discovered and reported to upstream
2017-09-25: upstream released a patch
2017-10-03: blog post about the issue
2017-10-04: CVE assigned

Note:
This bug was found with American Fuzzy Lop.
This bug was identified with bare metal servers donated by Packet. This work is also supported by the Core Infrastructure Initiative.

Permalink: binutils: NULL pointer dereference in concat_filename (dwarf2.c)

Description:
binutils is a set of tools necessary to build programs.

The complete ASan output of the issue:

# nm -A -a -l -S -s --special-syms --synthetic --with-symbol-versions -D $FILE
==11994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000029e at pc 0x7f800af7095d bp 0x7ffeab0e5c90 sp 0x7ffeab0e5c88            
READ of size 1 at 0x60200000029e thread T0                                                                                                           
    #0 0x7f800af7095c in bfd_getl32 /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/libbfd.c:559:24                                       
    #1 0x7f800af91323 in bfd_get_debug_link_info_1 /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/opncls.c:1206:12                       
    #2 0x7f800af91b8a in find_separate_debug_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/opncls.c:1423:10                        
    #3 0x7f800af91a0f in bfd_follow_gnu_debuglink /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/opncls.c:1582:10                        
    #4 0x7f800b110614 in _bfd_dwarf2_slurp_debug_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4345:19                    
    #5 0x7f800b11bc67 in _bfd_dwarf2_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4538:9                    
    #6 0x7f800b05e38b in _bfd_elf_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/elf.c:8695:10                                 
    #7 0x517c83 in print_symbol /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1003:9                                          
    #8 0x51542d in print_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1084:7                                         
    #9 0x51542d in display_rel_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1200                                        
    #10 0x510f56 in display_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1318:7                                         
    #11 0x50faae in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1792:12                                                
    #12 0x7f8009fa3680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289                      
    #13 0x41ac18 in _init (/usr/x86_64-pc-linux-gnu/binutils-bin/git/nm+0x41ac18)                                                                    

0x60200000029e is located 0 bytes to the right of 14-byte region [0x602000000290,0x60200000029e)
allocated by thread T0 here:
    #0 0x4d8e08 in malloc /var/tmp/portage/sys-libs/compiler-rt-sanitizers-5.0.0/work/compiler-rt-5.0.0.src/lib/asan/asan_malloc_linux.cc:67
    #1 0x7f800af6f3fc in bfd_malloc /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/libbfd.c:193:9
    #2 0x7f800af64b9f in bfd_get_full_section_contents /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/compress.c:248:21
    #3 0x7f800af91230 in bfd_get_debug_link_info_1 /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/opncls.c:1191:8
    #4 0x7f800af91b8a in find_separate_debug_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/opncls.c:1423:10
    #5 0x7f800af91a0f in bfd_follow_gnu_debuglink /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/opncls.c:1582:10
    #6 0x7f800b110614 in _bfd_dwarf2_slurp_debug_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4345:19
    #7 0x7f800b11bc67 in _bfd_dwarf2_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4538:9
    #8 0x7f800b05e38b in _bfd_elf_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/elf.c:8695:10
    #9 0x517c83 in print_symbol /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1003:9
    #10 0x51542d in print_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1084:7
    #11 0x51542d in display_rel_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1200
    #12 0x510f56 in display_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1318:7
    #13 0x50faae in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1792:12
    #14 0x7f8009fa3680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289

SUMMARY: AddressSanitizer: heap-buffer-overflow /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/libbfd.c:559:24 in bfd_getl32
Shadow bytes around the buggy address:
  0x0c047fff8000: fa fa 00 01 fa fa 00 06 fa fa fd fa fa fa fd fa
  0x0c047fff8010: fa fa fd fd fa fa fd fa fa fa fd fa fa fa fd fa
  0x0c047fff8020: fa fa fd fa fa fa fd fd fa fa fd fa fa fa fd fa
  0x0c047fff8030: fa fa fd fa fa fa fd fd fa fa fd fa fa fa fd fa
  0x0c047fff8040: fa fa fd fa fa fa fd fd fa fa fd fa fa fa 00 fa
=>0x0c047fff8050: fa fa 00[06]fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==11994==ABORTING
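
The 14-byte region is the raw .gnu_debuglink section contents; bfd_getl32 then reads the trailing CRC32 from an offset past the end of the undersized section. A sketch of a sanity check for that layout (a hypothetical helper, not the upstream patch):

/* Hypothetical sanity check.  A .gnu_debuglink section holds a
   NUL-terminated file name, padding to a 4-byte boundary, then a
   4-byte CRC32; reading the CRC is only safe if all of that fits
   inside the section.  */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static bool
debuglink_is_sane (const uint8_t *data, size_t size)
{
  const void *nul = memchr (data, '\0', size);
  if (nul == NULL)
    return false;                       /* unterminated file name */
  size_t name_len = (size_t) ((const uint8_t *) nul - data) + 1;
  size_t crc_off = (name_len + 3) & ~(size_t) 3;   /* round up to 4 */
  return crc_off + 4 <= size;           /* CRC32 must fit */
}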

Affected version:
2.29.51.20170924 and possibly earlier releases

Fixed version:
N/A

Commit fix:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=52b36c51e5bf6d7600fdc6ba115b170b0e78e31d

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-15021

Reproducer:
https://github.com/asarubbo/poc/blob/master/00373-binutils-heapoverflow-bfd_getl32

Timeline:
2017-09-24: bug discovered and reported to upstream
2017-09-24: upstream released a patch
2017-10-03: blog post about the issue
2017-10-04: CVE assigned

Note:
This bug was found with American Fuzzy Lop.
This bug was identified with bare metal servers donated by Packet. This work is also supported by the Core Infrastructure Initiative.

Permalink: binutils: heap-based buffer overflow in bfd_getl32 (opncls.c)

Description:
binutils is a set of tools necessary to build programs.

The complete ASan output of the issue:

# nm -A -a -l -S -s --special-syms --synthetic --with-symbol-versions -D $FILE
==11125==ERROR: AddressSanitizer: FPE on unknown address 0x7f5e01fd42e5 (pc 0x7f5e01fd42e5 bp 0x7ffdaa5de290 sp 0x7ffdaa5de0e0 T0)
    #0 0x7f5e01fd42e4 in decode_line_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c
    #1 0x7f5e01fe192b in comp_unit_maybe_decode_line_info /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3608:26
    #2 0x7f5e01fe192b in comp_unit_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:3643
    #3 0x7f5e01fde94f in _bfd_dwarf2_find_nearest_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:4755:11
    #4 0x7f5e01f1c20b in _bfd_elf_find_line /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/elf.c:8694:10
    #5 0x517c83 in print_symbol /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1003:9
    #6 0x51542d in print_symbols /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1084:7
    #7 0x51542d in display_rel_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1200
    #8 0x510f56 in display_file /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1318:7
    #9 0x50faae in main /var/tmp/portage/sys-devel/binutils-9999/work/binutils/binutils/nm.c:1792:12
    #10 0x7f5e00e61680 in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r4/work/glibc-2.23/csu/../csu/libc-start.c:289
    #11 0x41ac18 in _init (/usr/x86_64-pc-linux-gnu/binutils-bin/git/nm+0x41ac18)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: FPE /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c in decode_line_info
==11125==ABORTING
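
An FPE report from ASan here means an integer division by zero: decode_line_info divides by a line-program header field that a malformed file can set to zero. A sketch of the guard, assuming a stripped-down header struct (the real struct line_head in dwarf2.c has many more fields):

/* DWARF requires maximum_ops_per_insn to be at least 1, and it is
   later used as a divisor when advancing the line-table address.
   Field name and type are assumed for this sketch.  */
#include <stdbool.h>

struct line_head
{
  unsigned int maximum_ops_per_insn;
};

static bool
line_head_is_sane (const struct line_head *lh)
{
  /* Reject zero up front so adv / lh->maximum_ops_per_insn and
     adv % lh->maximum_ops_per_insn cannot fault with SIGFPE.  */
  return lh->maximum_ops_per_insn != 0;
}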

Affected version:
2.29.51.20170921 and possibly earlier releases

Fixed version:
N/A

Commit fix:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d8010d3e75ec7194a4703774090b27486b742d48

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-15025

Reproducer:
https://github.com/asarubbo/poc/blob/master/00372-binutils-FPE-decode_line_info

Timeline:
2017-09-22: bug discovered and reported to upstream
2017-09-24: upstream released a patch
2017-10-03: blog post about the issue
2017-10-04: CVE assigned

Note:
This bug was found with American Fuzzy Lop.
This bug was identified with bare metal servers donated by Packet. This work is also supported by the Core Infrastructure Initiative.

Permalink: binutils: divide-by-zero in decode_line_info (dwarf2.c)

Description:
binutils is a set of tools necessary to build programs.

The relevant ASan output of the issue:

# nm -A -a -l -S -s --special-syms --synthetic --with-symbol-versions -D $FILE
==22616==ERROR: AddressSanitizer: stack-overflow on address 0x7ffc2948efe8 (pc 0x0000004248eb bp 0x7ffc2948f8e0 sp 0x7ffc2948efe0 T0)
    #0 0x4248ea in __asan::Allocator::Allocate(unsigned long, unsigned long, __sanitizer::BufferedStackTrace*, __asan::AllocType, bool) /var/tmp/portage/sys-libs/compiler-rt-sanitizers-5.0.0/work/compiler-rt-5.0.0.src/lib/asan/asan_allocator.cc:381
    #1 0x41f8f3 in __asan::asan_malloc(unsigned long, __sanitizer::BufferedStackTrace*) /var/tmp/portage/sys-libs/compiler-rt-sanitizers-5.0.0/work/compiler-rt-5.0.0.src/lib/asan/asan_allocator.cc:814
    #2 0x4d8de4 in malloc /var/tmp/portage/sys-libs/compiler-rt-sanitizers-5.0.0/work/compiler-rt-5.0.0.src/lib/asan/asan_malloc_linux.cc:68
    #3 0x7ff17b5b237c in bfd_malloc /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/libbfd.c:193:9                                                                                                     
    #4 0x7ff17b5a7b2f in bfd_get_full_section_contents /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/compress.c:248:21                                                                               
    #5 0x7ff17b5e16d3 in bfd_simple_get_relocated_section_contents /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/simple.c:193:12                                                                     
    #6 0x7ff17b75626e in read_section /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:556:8                                                                                                   
    #7 0x7ff17b772053 in read_indirect_string /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:730:9                                                                                           
    #8 0x7ff17b772053 in read_attribute_value /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:1189                                                                                            
    #9 0x7ff17b76ebf4 in read_attribute /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:1306:14                                                                                               
    #10 0x7ff17b76ebf4 in find_abstract_instance_name /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:2913                                                                                    
    #11 0x7ff17b76ec98 in find_abstract_instance_name /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:2930:12                                                                                 
    #12 0x7ff17b76ec98 in find_abstract_instance_name /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:2930:12                                                                                 
    [..cut..]
    #252 0x7ff17b76ec98 in find_abstract_instance_name /var/tmp/portage/sys-devel/binutils-9999/work/binutils/bfd/dwarf2.c:2930:12
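
The [..cut..] frames repeat until the stack is exhausted: a DIE whose DW_AT_abstract_origin points back at itself, directly or through a cycle, makes find_abstract_instance_name recurse without bound. One common mitigation, sketched here with a toy DIE structure rather than the real bfd types and not necessarily the upstream approach, is an explicit depth cap:

/* Sketch: follow DW_AT_abstract_origin links with a depth cap so a
   self-referential DIE cannot recurse until the stack overflows.  */
#include <stddef.h>

struct die
{
  const char *name;             /* DW_AT_name, if present */
  struct die *abstract_origin;  /* DW_AT_abstract_origin target */
};

static const char *
resolve_abstract_name (const struct die *d, int depth)
{
  if (d == NULL || depth > 100)  /* 100: arbitrary sanity limit */
    return NULL;
  if (d->name != NULL)
    return d->name;
  return resolve_abstract_name (d->abstract_origin, depth + 1);
}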

Affected version:
2.29.51.20170921 and possibly earlier releases

Fixed version:
N/A

Commit fix:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=52a93b95ec0771c97e26f0bb28630a271a667bd2

Credit:
This bug was discovered by Agostino Sarubbo of Gentoo.

CVE:
CVE-2017-15024

Reproducer:
https://github.com/asarubbo/poc/blob/master/00371-binutils-infiniteloop-find_abstract_instance_name

Timeline:
2017-09-22: bug discovered and reported to upstream
2017-09-24: upstream released a patch
2017-10-03: blog post about the issue
2017-10-04: CVE assigned

Note:
This bug was found with American Fuzzy Lop.
This bug was identified with bare metal servers donated by Packet. This work is also supported by the Core Infrastructure Initiative.

Permalink: binutils: infinite loop in find_abstract_instance_name (dwarf2.c)