Welcome to Planet Gentoo, an aggregation of Gentoo-related weblog articles written by Gentoo developers. For a broader range of topics, you might be interested in Gentoo Universe.

Disclaimer:
Views expressed in the content published here do not necessarily represent the views of Gentoo Linux or the Gentoo Foundation.
   
April 29, 2019
Yury German a.k.a. blueknight (homepage, bugs)
Gentoo Blogs Update (April 29, 2019, 03:41 UTC)

This is just a notification that the Blogs and the appropriate plug-ins have been updated for WordPress release 5.1.1.

With the release of these updates, we (the Gentoo Blog Team) have also updated the themes that had updates available. If you have a blog on this site and your theme is based on one of the following themes, please consider switching, as these themes are no longer updated and things will break in your blogs.

  • KDE Breathe
  • KDE Graffiti
  • Oxygen
  • The following WordPress themes might stop working (simply because of age)
    • Twenty Fourteen
    • Twenty Fifteen
    • Twenty Sixteen

If you are using one of these themes, it is recommended that you switch to one of the other available themes. If you think there is an open source theme that you would like to have available, please contact the Blogs team by opening a Bugzilla bug with the pertinent information.

April 24, 2019
Matthew Thode a.k.a. prometheanfire (homepage, bugs)
Building Gentoo disk images (April 24, 2019, 05:00 UTC)

Disclaimer

I'm not responsible if you ruin your system; this guide functions as documentation for future me. Remember to back up your data.

Why this is useful / needed

It's useful to have a way of building a disk image for shipping, either for testing or production usage. The image output format can be qcow2, raw or a compressed tarball; it's up to you to make this what you want it to be.

Pre-work

Install diskimage-builder; on Gentoo you just have to 'emerge' the latest version. I personally keep one around in a virtual environment for testing (this also allows me to build musl images easily).

The actual setup

What diskimage-builder actually does is take elements and run them. Each element consists of a set of phases in which the element takes actions. All you are really doing is defining the elements, and they will insert themselves where needed. It also uses environment variables for tunables and other small tweaks.

This is how I build the images at http://distfiles.gentoo.org/experimental/amd64/openstack/

export GENTOO_PORTAGE_CLEANUP=True
export DIB_INSTALLTYPE_pip_and_virtualenv=package
export DIB_INSTALLTYPE_simple_init=repo
export GENTOO_PYTHON_TARGETS="python3_6"
export GENTOO_PYTHON_ACTIVE_VERSION="python3.6"
export ELEMENTS="gentoo simple-init growroot vm openssh-server block-device-mbr"
export COMMAND="disk-image-create -a amd64 -t qcow2 --image-size 3"
export DATE="$(date -u +%Y%m%d)"

GENTOO_PROFILE=default/linux/amd64/17.0/no-multilib/hardened ${COMMAND} -o "gentoo-openstack-amd64-hardened-nomultilib-${DATE}" ${ELEMENTS}
GENTOO_PROFILE=default/linux/amd64/17.0/no-multilib ${COMMAND} -o "gentoo-openstack-amd64-default-nomultilib-${DATE}" ${ELEMENTS}
GENTOO_PROFILE=default/linux/amd64/17.0/hardened ${COMMAND} -o "gentoo-openstack-amd64-hardened-${DATE}" ${ELEMENTS}
GENTOO_PROFILE=default/linux/amd64/17.0/systemd ${COMMAND} -o "gentoo-openstack-amd64-systemd-${DATE}" ${ELEMENTS}
${COMMAND} -o "gentoo-openstack-amd64-default-${DATE}" ${ELEMENTS}
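
The five invocations above differ only in the profile and the output name, so the same matrix can be generated programmatically. The following is a minimal Python sketch, not part of diskimage-builder: the helper name `build_commands` and the `PROFILES` table are illustrative assumptions, while the flags, profiles and element names are copied from the commands above. It only builds the argument lists and environment overrides; actually running them (for example with `subprocess.run`) is left to the caller.

```python
from datetime import datetime, timezone

# Output-name suffix -> GENTOO_PROFILE value (None means "leave it unset").
# Both columns are copied from the shell commands above.
PROFILES = {
    "hardened-nomultilib": "default/linux/amd64/17.0/no-multilib/hardened",
    "default-nomultilib": "default/linux/amd64/17.0/no-multilib",
    "hardened": "default/linux/amd64/17.0/hardened",
    "systemd": "default/linux/amd64/17.0/systemd",
    "default": None,
}

ELEMENTS = ["gentoo", "simple-init", "growroot", "vm", "openssh-server", "block-device-mbr"]
BASE_COMMAND = ["disk-image-create", "-a", "amd64", "-t", "qcow2", "--image-size", "3"]


def build_commands(date_stamp):
    """Return a list of (extra_env, argv) pairs, one per profile."""
    jobs = []
    for suffix, profile in PROFILES.items():
        extra_env = {} if profile is None else {"GENTOO_PROFILE": profile}
        output = f"gentoo-openstack-amd64-{suffix}-{date_stamp}"
        argv = BASE_COMMAND + ["-o", output] + ELEMENTS
        jobs.append((extra_env, argv))
    return jobs


# Same date format as `date -u +%Y%m%d` in the shell version.
today = datetime.now(timezone.utc).strftime("%Y%m%d")
for extra_env, argv in build_commands(today):
    print(extra_env, " ".join(argv))
```

Keeping the profile table in one place makes it harder for the output name and the profile to drift apart when a new variant is added.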

For musl I've had to do some custom work as I have to build the stage4s locally, but it's largely the same (with the additional need to define a musl overlay).

cd ~/diskimage-builder
cp ~/10-gentoo-image.musl diskimage_builder/elements/gentoo/root.d/10-gentoo-image
pip install -U .
cd ~/

export GENTOO_PORTAGE_CLEANUP=False
export DIB_INSTALLTYPE_pip_and_virtualenv=package
export DIB_INSTALLTYPE_simple_init=repo
export GENTOO_PYTHON_TARGETS="python3_6"
export GENTOO_PYTHON_ACTIVE_VERSION="python3.6"
DATE="$(date +%Y%m%d)"
export GENTOO_OVERLAYS="musl"
export GENTOO_PROFILE=default/linux/amd64/17.0/musl/hardened

disk-image-create -a amd64 -t qcow2 --image-size 3 -o gentoo-openstack-amd64-hardened-musl-"${DATE}" gentoo simple-init growroot vm

cd ~/diskimage-builder
git checkout diskimage_builder/elements/gentoo/root.d/10-gentoo-image
pip install -U .
cd ~/

Generic images

The elements I use are for an OpenStack image, meaning there is no default user/pass, those are set by cloud-init / glean. For a generic image you will want the following elements.

'gentoo growroot devuser vm'

The following environment variables are needed as well (changed to match your needs).

DIB_DEV_USER_PASSWORD=supersecrete DIB_DEV_USER_USERNAME=secrete DIB_DEV_USER_PWDLESS_SUDO=yes DIB_DEV_USER_AUTHORIZED_KEYS=/foo/bar/.ssh/authorized_keys

Fin

All this work was done upstream; if you have a question (or feature request), just ask. I'm on IRC (Freenode) as prometheanfire, or use the same nick at gentoo.org for email.

April 16, 2019

The Gentoo Foundation has partnered with Nitrokey to equip all Gentoo developers with free Nitrokey Pro 2 devices. Gentoo developers will use the Nitrokey devices to store cryptographic keys for signing of git commits and software packages, GnuPG keys, and SSH accounts.

Thanks to the Gentoo Foundation and Nitrokey’s discount, each Gentoo developer is eligible to receive one free Nitrokey Pro 2. To receive their Nitrokey, developers will need to register with their @gentoo.org email address at the dedicated order form.

A Nitrokey Pro 2 Guide is available on the Gentoo Wiki with FAQ & instructions for integrating Nitrokeys into developer workflow.

ABOUT NITROKEY PRO 2

Nitrokey Pro 2 has strong reliable hardware encryption, thanks to open source. It can help you to: sign Git commits; encrypt emails and files; secure server access; and protect accounts against identity theft via two-factor authentication (one-time passwords).

ABOUT GENTOO

Gentoo Linux is a free, source-based, rolling release meta distribution that features a high degree of flexibility and high performance. It empowers you to make your computer work for you, and offers a variety of choices at all levels of system configuration.

As a community, Gentoo consists of approximately two hundred developers and over fifty thousand users globally.

The Gentoo Foundation supports the development of Gentoo, protects Gentoo’s intellectual property, and oversees adherence to Gentoo’s Social Contract.

ABOUT NITROKEY

Nitrokey is a German IT security startup committed to open source hardware and software. Nitrokey develops and produces USB keys for data encryption, email encryption (PGP/GPG, S/MIME), and secure account logins (SSH, two-factor authentication via OTP and FIDO).

Nitrokey is proud to support the Gentoo Foundation in further securing the Gentoo infrastructure and contributing to a secure open source Linux ecosystem.

March 29, 2019
Alexys Jacob a.k.a. ultrabug (homepage, bugs)

We recently had to face free disk space outages on some of our scylla clusters, and we learnt some very interesting things while outlining to the ScyllaDB guys some improvements that could be made.

100% disk space usage?

First of all I wanted to give a bit of a heads up about what happened when some of our scylla nodes reached (almost) 100% disk space usage.

Basically they:

  • stopped listening to client requests
  • complained in the logs
  • wouldn’t flush commitlog (expected)
  • aborted their compaction work (which actually gave back a few GB of space)
  • stayed in a stuck / unable to stop state (unexpected, this has been reported)

After restarting your scylla server, the first and obvious thing you can try to do to get out of this situation is to run the nodetool clearsnapshot command which will remove any data snapshot that could be lying around. That’s a handy command to reclaim space usually.

Reminder: depending on your compaction strategy, it is usually not advised to allow your data to grow over 50% of disk space...

But that’s only a patch so let’s go down the rabbit hole and look at the optimization options we have.


Optimize your schemas

Schema design and the types you choose for your columns have a huge impact on disk space usage! In our case we indeed overlooked some optimizations that we could have made from the start, and that cost us a lot of wasted disk space. Fortunately it was easy and fast to change.

To illustrate this, I’ll take a sample of 100,000 rows of a simple and naive schema associating readings of 50 integers to a user ID:

Note: all those operations were done using Scylla 3.0.3 on Gentoo Linux.

CREATE TABLE IF NOT EXISTS test.not_optimized
(
uid text,
readings list<int>,
PRIMARY KEY(uid)
) WITH compression = {};

Once inserted on disk, this takes about 250MB of disk space:

250M    not_optimized-00cf1500520b11e9ae38000000000004

Now, depending on your use case (if those readings are not meant to be updated, for example), you could use a frozen list instead, which allows a huge storage optimization:

CREATE TABLE IF NOT EXISTS test.mid_optimized
(
uid text,
readings frozen<list<int>>,
PRIMARY KEY(uid)
) WITH compression = {};

With this frozen list we now consume 54MB of disk space for the same data!

54M     mid_optimized-011bae60520b11e9ae38000000000004

There’s another optimization that we could do since our user IDs are UUIDs. Let’s switch to the uuid type instead of text:

CREATE TABLE IF NOT EXISTS test.optimized
(
uid uuid,
readings frozen<list<int>>,
PRIMARY KEY(uid)
) WITH compression = {};

By switching to uuid, we now consume 50MB of disk space: that’s an 80% reduction in disk space consumption compared to the naive schema for the same data!

50M     optimized-01f74150520b11e9ae38000000000004
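
Part of the gain from this last step comes from the key encoding itself. As a rough illustration using only Python's standard `uuid` module: the canonical text form of a UUID is 36 characters, while its binary form is 16 bytes. The on-disk figure also depends on sstable internals, so the numbers below describe the raw key payload only and are not meant to account exactly for the measured 4MB difference. The sample UUID is arbitrary.

```python
import uuid

# A fixed, arbitrary UUID so the example is reproducible.
user_id = uuid.UUID("12345678-1234-5678-1234-567812345678")

text_size = len(str(user_id).encode("utf-8"))  # payload size in a text column
binary_size = len(user_id.bytes)               # payload size in a uuid column

print(text_size, binary_size)  # 36 16

rows = 100_000
saved_bytes = rows * (text_size - binary_size)
print(saved_bytes)  # 2000000
```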

Enable compression

All those examples were not using compression. If your workload latencies allow it, you should probably enable compression on your sstables.

Let’s see its impact on our tables:

ALTER TABLE test.not_optimized WITH compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'};
ALTER TABLE test.mid_optimized WITH compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'};
ALTER TABLE test.optimized WITH compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'};

Then we run a nodetool compact test to force a (re)compaction of all the sstables and we get:

63M     not_optimized-00cf1500520b11e9ae38000000000004
28M     mid_optimized-011bae60520b11e9ae38000000000004
24M     optimized-01f74150520b11e9ae38000000000004

Compression is really a great gain here, allowing another 50% reduction in disk space usage on our optimized table!

Switch to the new “mc” sstable format

Since the Scylla 3.0 release you can use the latest “mc” sstable storage format on your scylla clusters. It promises greater efficiency and, usually, a much lower disk space consumption!

It is not enabled by default: you have to add the enable_sstables_mc_format: true parameter to your scylla.yaml for it to be taken into account.

Since it’s backward compatible, you have nothing else to do as new compactions will start being made using the “mc” storage format and the scylla server will seamlessly read from old sstables as well.

But in our case of immediate disk space outage, we switched to the new format one node at a time, dropped the data from it and ran a nodetool rebuild to reconstruct the whole node using the new sstable format.

Let’s demonstrate its impact on our test tables: we add the option to the scylla.yaml file, restart scylla-server and run nodetool compact test again:

49M     not_optimized-00cf1500520b11e9ae38000000000004
26M     mid_optimized-011bae60520b11e9ae38000000000004
22M     optimized-01f74150520b11e9ae38000000000004

That’s a pretty cool gain of disk space, even more for the not optimized version of our schema!
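
To put the three rounds of measurements side by side, here is a small Python sketch that computes the relative savings from the `du` figures quoted in this post. The helper name `reduction_percent` and the stage labels are illustrative; the sizes (in MB) are the ones listed above.

```python
# Directory sizes in MB as reported above, per table and per stage.
SIZES_MB = {
    "not_optimized": {"uncompressed": 250, "lz4": 63, "lz4_mc": 49},
    "mid_optimized": {"uncompressed": 54, "lz4": 28, "lz4_mc": 26},
    "optimized": {"uncompressed": 50, "lz4": 24, "lz4_mc": 22},
}


def reduction_percent(before, after):
    """Percentage of space saved when going from `before` to `after`."""
    return round(100 * (before - after) / before, 1)


for table, sizes in SIZES_MB.items():
    print(
        table,
        reduction_percent(sizes["uncompressed"], sizes["lz4"]),  # gain from LZ4
        reduction_percent(sizes["lz4"], sizes["lz4_mc"]),        # extra gain from "mc"
    )
```

Run as-is, it shows that the “mc” format saves roughly 22% on the naive schema against about 7 to 8% on the two optimized ones.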

So if you’re in great need of disk space or it is hard for you to change your schemas, switching to the new “mc” sstable format is a simple and efficient way to free up some space without effort.

Consider using secondary indexes

While denormalization is the norm (yep.. legitimate pun) in the NoSQL world, this does not mean we have to duplicate everything all the time. A good example lies in the internals of secondary indexes, if your workload can tolerate their moderate impact on latency.

Secondary indexes on scylla are built on top of Materialized Views that basically store an up-to-date pointer from your indexed column to your main table partition key. That means that secondary index MVs do not duplicate all the columns (and thus the data) from your main table, as you would have to do when denormalizing a table to query by another column: this saves disk space!

This of course comes with a latency drawback: if your workload is interested in columns other than the partition key of the main table, the coordinator node will actually issue two queries to get all your data:

  1. query the secondary index MV to get the pointer to the partition key of the main table
  2. query the main table with the partition key to get the rest of the columns you asked for
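
The two-step read path can be pictured with a toy in-memory model. This is only a sketch to illustrate the idea, not how Scylla implements it: the dictionaries, the `email` column and the function name are all made up for the example.

```python
# Main table: partition key -> full row (all columns live only here).
main_table = {
    "user-1": {"email": "a@example.org", "name": "Alice", "city": "Paris"},
    "user-2": {"email": "b@example.org", "name": "Bob", "city": "Lyon"},
}

# Index view: indexed column value -> partition keys of the main table.
# It stores pointers only, not the other columns.
index_on_email = {
    "a@example.org": ["user-1"],
    "b@example.org": ["user-2"],
}


def query_by_email(email):
    """Return (rows, number_of_lookups) for a query on the indexed column."""
    lookups = 1  # step 1: ask the index view for the partition keys
    keys = index_on_email.get(email, [])
    rows = []
    for key in keys:
        lookups += 1  # step 2: fetch the remaining columns from the main table
        rows.append(main_table[key])
    return rows, lookups


rows, lookups = query_by_email("a@example.org")
print(rows[0]["name"], lookups)  # Alice 2
```

The index holds one small entry per row instead of a full copy of it, which is where the disk space saving comes from; the price is the second lookup.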

This has been an effective trick to avoid duplicating a table and save disk space for some of our workloads!

(not a tip) Move the commitlog to another disk / partition?

This should only be considered as a sort of emergency procedure or for cost efficiency (cheap disk tiering) on non critical clusters.

While this is possible even if the disk is not formatted using XFS, it is not advised to separate the commitlog from data on modern SSD/NVMe disks but… you technically can do it (as we did) on non production clusters.

Switching is simple: you just need to change the commitlog_directory parameter in your scylla.yaml file.

March 27, 2019
Gentoo GNOME 3.30 for all init systems (March 27, 2019, 00:00 UTC)

GNOME 3.30 is now available in Gentoo Linux testing branch. Starting with this release, GNOME on Gentoo once again works with OpenRC, in addition to the usual systemd option. This is achieved through the elogind project, a standalone logind implementation based on systemd code, which is currently maintained by a fellow Gentoo user. Gentoo would like to thank Mart Raudsepp (leio), Gavin Ferris, and all others working on this for their contributions. More information can be found in Mart’s blog post.

March 26, 2019
Mart Raudsepp a.k.a. leio (homepage, bugs)
Gentoo GNOME 3.30 for all init systems (March 26, 2019, 16:51 UTC)

GNOME 3.30 is now available in Gentoo Linux testing branch.
Starting with this release, GNOME on Gentoo once again works with OpenRC, in addition to the usual systemd option. This is achieved through the elogind project, a standalone logind implementation based on systemd code, which is currently maintained by a fellow Gentoo user. It provides the missing logind interfaces currently required by GNOME without booting with systemd.

For easier GNOME install, the desktop/gnome profiles now set up default USE flags with elogind for OpenRC systems, while the desktop/gnome/systemd profiles continue to do that for systemd systems. Both have been updated to provide a better initial GNOME install experience. After profile selection, a full install should be simply a matter of `emerge gnome` for testing branch users. Don’t forget to adapt your system to any changed USE flags on previously installed packages too.

GNOME 3.32 is expected to be made available in testing branch soon as well, followed by introducing all this for stable branch users. This is hoped to complete within 6-8 weeks.

If you encounter issues, don’t hesitate to file bug reports or, if necessary, contact me via e-mail or IRC. You can also discuss the elogind aspects on the Gentoo Forums.

Acknowledgements

I’d like to thank Gavin Ferris, for kindly agreeing to sponsor my work on the above (upgrading GNOME on Gentoo from 3.26 to 3.30 and introducing Gentoo GNOME elogind support); and dantrell, for his pioneering overlay work integrating GNOME 3 with OpenRC on Gentoo, and also the GNOME and elogind projects.

March 25, 2019
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.17 (March 25, 2019, 14:12 UTC)

I’m glad to announce a new (awaited) release of py3status featuring support for the sway window manager which allows py3status to enter the wayland environment!

Updated configuration and custom modules paths detection

The configuration section of the documentation explains the updated detection of the py3status configuration file (with respect to the XDG_CONFIG environment variables):

  • ~/.config/py3status/config
  • ~/.config/i3status/config
  • ~/.config/i3/i3status.conf
  • ~/.i3status.conf
  • ~/.i3/i3status.conf
  • /etc/xdg/i3status/config
  • /etc/i3status.conf
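
A lookup like this boils down to returning the first candidate that exists. The sketch below is not py3status' actual code: it assumes the list above is in search order and, for simplicity, hard-codes `~/.config` and `/etc/xdg` instead of reading the XDG environment variables. The `exists` parameter is there only to make the function easy to test.

```python
import os

# Candidate paths, in the order listed above (assumed to be the search order).
CANDIDATES = [
    "~/.config/py3status/config",
    "~/.config/i3status/config",
    "~/.config/i3/i3status.conf",
    "~/.i3status.conf",
    "~/.i3/i3status.conf",
    "/etc/xdg/i3status/config",
    "/etc/i3status.conf",
]


def find_config(candidates=CANDIDATES, exists=os.path.exists):
    """Return the first existing path from `candidates`, or None."""
    for candidate in candidates:
        path = os.path.expanduser(candidate)  # resolve the leading "~"
        if exists(path):
            return path
    return None


print(find_config())
```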

Regarding custom modules paths detection, py3status does as described in the documentation:

  • ~/.config/py3status/modules
  • ~/.config/i3status/py3status
  • ~/.config/i3/py3status
  • ~/.i3/py3status

Highlights

Lots of module improvements and clean-ups, see the changelog.

  • we worked on the documentation sections and content which allowed us to fix a bunch of typos
  • our magic @lasers has worked a lot on harmonizing thresholds across modules, along with a lot of code clean-ups
  • new module: scroll to scroll modules on your bar (#1748)
  • @lasers has worked a lot on a more granular pango support for modules output (still work to do as it breaks some composites)

Thanks contributors

  • Ajeet D’Souza
  • @boucman
  • Cody Hiar
  • @cyriunx
  • @duffydack
  • @lasers
  • Maxim Baz
  • Thiago Kenji Okada
  • Yaroslav Dronskii

March 20, 2019
Install Gentoo in less than one minute (March 20, 2019, 18:35 UTC)

I’m pretty sure that the title of this post will catch your attention…and/or maybe your curiosity.

Well, this is something I have been doing for years, and since it did not cost too much to bring it to a public and usable state, I decided to share my work, to help people avoid wasting time and getting angry when their cloud provider does not offer a Gentoo image.

So what are the goals of this project?

  1. Install gentoo on cloud providers that do not offer a Gentoo image (e.g. Hetzner)
  2. Install gentoo everywhere in a few seconds.

To do a fast installation, we need a stage4, but what exactly is a stage4? In this case the stage4 is composed of the official gentoo stage3 plus grub, some more utilities and some files already configured.

Since the stage4 already has everything needed to complete the installation, we just need to make some replacements (fstab, grub and so on) and install grub on the disk, and it’s done (by the auto-installer script)!

At this point I’d expect some people to say: “yeah, it’s so simple and logical, why didn’t I think of that?” Well, I guess not every gentoo user discovers that right after their first installation, so you don’t need to blame yourself 🙂

The technical details are covered by the README in the gentoo-stage4 git repository

As said in the README:

  • If you have any request, feel free to contact me
  • A star on the project will give me an idea of how much it is used, and therefore of how much effort to put in here.

So what’s more? Just a screenshot of the script in action 🙂


February 22, 2019
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)
Postgresql major version upgrade (gentoo) (February 22, 2019, 10:37 UTC)

Just did an upgrade from postgres 10.x to 11.x on a test machine.

The guide on the Gentoo Wiki is pretty good, but a few things I forgot at first:

First off, when initializing the new cluster with "emerge --config =dev-db/postgresql-11.1", make sure the DB init options are the same as for the old cluster. They are stored in /etc/conf.d/postgresql-XX.Y, so just make sure PG_INITDB_OPTS, collation, etc. match; if not, delete the new cluster and re-run emerge --config ;)

The second thing was pg_hba.conf: make sure to re-add any extra user/db/connection permissions (in my case I ran diff and then just copied the old config file, as the only difference was the extra permissions I had added).

The third thing was postgresql.conf: here I forgot to make sure listen_addresses and port are the same as in the old config (I did not copy this one as there are a lot more differences here). And of course check the rest of the config file too (diff is your friend ;) )
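
For the postgresql.conf check, a small script can list the settings that differ between the old and new files. This is a rough sketch: the parser handles only simple `name = value` lines with `#` comments and optional single quotes, which covers `listen_addresses` and `port` but not every construct PostgreSQL accepts (for example `include` directives, `name value` without the equals sign, or a `#` inside a quoted value). The function names and the sample file contents are made up for illustration.

```python
def read_settings(text):
    """Parse simple `name = value` lines into a dict; later lines win."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip("'")
    return settings


def differing(old_text, new_text, names):
    """Return {name: (old_value, new_value)} for the names that differ."""
    old, new = read_settings(old_text), read_settings(new_text)
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


# Hypothetical excerpts of an old and a freshly generated config file.
old_conf = """
listen_addresses = '*'   # what IP address(es) to listen on
port = 5433
"""
new_conf = """
#listen_addresses = 'localhost'
#port = 5432
"""

print(differing(old_conf, new_conf, ["listen_addresses", "port"]))
# {'listen_addresses': ('*', None), 'port': ('5433', None)}
```

A value of `None` means the setting is commented out or absent in that file, so the server default applies.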

Other than that, pg_upgrade worked really well for me and it is now up and running again.

February 20, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)

Traditionally, OpenPGP revocation certificates are used as a last resort. You are expected to generate one for your primary key and keep it in a secure location. If you ever lose the secret portion of the key and are unable to revoke it any other way, you import the revocation certificate and submit the updated key to keyservers. However, there is another interesting use for revocation certificates — revoking shared organization keys.

Let’s take Gentoo, for example. We are using a few keys needed to perform automated signatures on servers. For this reason, the key is especially exposed to attacks and we want to be able to revoke it quickly if the need arises. Now, we really do not want to have every single Infra member hold a copy of the secret primary key. However, we can give Infra members revocation certificates instead. This way, they maintain the possibility of revoking the key without unnecessarily increasing its exposure.

The problem with traditional revocation certificates is that they are supported for the purpose of revoking the primary key only. In our security model, the primary key is well protected, compared to subkeys that are totally exposed. Therefore, it is superfluous to revoke the complete key when only a subkey is compromised. To resolve this limitation, the gen-revoke tool was created; it can create exported revocation signatures for both the primary key and subkeys.

Technical background

The OpenPGP key (v4, as defined by RFC 4880) consists of a primary key, one or more UIDs and zero or more subkeys. Each of those keys and UIDs can include zero or more signature packets. Those packets bind information to the specific key or UID, and their authenticity is confirmed by a signature made using the secret portion of a primary key.

Signatures made by the key’s owner are called self-signatures. The most basic form of them serves as a binding between the primary key and its subkeys and UIDs. Since both those classes of objects are created independently of the primary key, self-signatures are necessary to distinguish authentic subkeys and UIDs created by the key owner from potential fakes. Appropriately, GnuPG will only accept subkeys and UIDs that have a valid self-signature.

One specific type of signature is the revocation signature. Those signatures indicate that the relevant key, subkey or UID has been revoked. If a revocation signature is found, it takes precedence over any other kinds of signatures and prevents the revoked object from being further used.

Key updates are a means of distributing new data associated with the key. What’s important is that during an update the key is not replaced by a new one. Instead, GnuPG collects all the new data (subkeys, UIDs, signatures) and adds it to the local copy of the key. The validity of this data is verified against appropriate signatures. Appropriately, anyone can submit a key update to the keyserver, provided that the new data includes valid signatures. Similarly to a local GnuPG instance, the keyserver is going to update its copy of the key rather than replacing it.

Revocation certificates specifically make use of this property. Technically, a revocation certificate is simply an exported form of a revocation signature, signed using the owner’s primary key. As long as it’s not on the key (i.e. GnuPG does not see it), it does not do anything. When it’s imported, GnuPG adds it to the key. Further submissions and exports include it, effectively distributing it to all copies of the key.
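
The merge-instead-of-replace behaviour, and the way a revocation certificate takes effect only once imported, can be mimicked with a toy model. This is a conceptual sketch only: it performs no cryptographic verification, and the class, field names and fingerprints are invented for the example rather than taken from GnuPG's data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Key:
    """Toy key: maps each (sub)key fingerprint to the signature types bound to it."""
    signatures: dict = field(default_factory=dict)

    def merge(self, update):
        """Add data from `update` to this copy; nothing is ever removed."""
        for fingerprint, sigs in update.signatures.items():
            self.signatures.setdefault(fingerprint, set()).update(sigs)

    def is_revoked(self, fingerprint):
        """A revocation signature overrides any other signature."""
        return "revocation" in self.signatures.get(fingerprint, set())


local = Key({"PRIMARY": {"self-sig"}, "SUBKEY1": {"binding"}})

# A revocation certificate kept offline: it has no effect until imported.
certificate = Key({"SUBKEY1": {"revocation"}})
print(local.is_revoked("SUBKEY1"))  # False

local.merge(certificate)
print(local.is_revoked("SUBKEY1"))  # True
print(local.is_revoked("PRIMARY"))  # False
```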

gen-revoke builds on this idea. It creates and exports revocation signatures for the primary key and subkeys. Due to implementation limitations (and for better compatibility), rather than exporting the signature alone it exports a minimal copy of the relevant key. This copy can be imported just like any other key export, and it causes the revocation signature to be added to the key. Afterwards, it can be exported and distributed just like a revocation done directly on the key.

Usage

To use the script, you need to have the secret portion of the primary key available, and public encryption keys for all the people who are supposed to obtain a copy of the revocation signatures (recipients).

The script takes at least two parameters: an identifier of the key for which revocation signatures should be created, followed by one or more e-mail addresses of signature recipients. It creates revocation signatures both for the primary key and for all valid subkeys, for all the people specified.

The signatures are written into the current directory as key exports and are encrypted to each specified person. They should be distributed afterwards, and kept securely by all the individuals. If a need to revoke either a subkey or the primary key arises, the first person available can decrypt the signature, import it and send the resulting key to keyservers.

Additionally, each signature includes a comment specifying the person it was created for. This comment will afterwards be displayed by GnuPG if one of the revocation signatures is imported. This provides a clear audit trail as to who revoked the key.

Security considerations

Each of the revocation signatures can be used by an attacker to disable the key in question. The signatures are protected through encryption. Therefore, the system is vulnerable to the key of a single signature owner being compromised.

However, this is considerably safer than the equivalent option of distributing the secret portion of the primary key. In the latter case, the attacker would be able to completely compromise the key and use it for malicious purposes; in the former, they are only capable of revoking the key and therefore causing some frustration. Furthermore, the revocation comment helps identify the compromised user.

The tradeoff between reliability and security can be adjusted by changing the number of revocation signature holders.
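
To make this tradeoff concrete, assume (purely for illustration) that each holder's encryption key is compromised independently with probability p, and that each holder is independently unreachable in an emergency with probability q. With n holders, the chance that an attacker obtains at least one revocation signature is 1 - (1 - p)^n, while the chance that nobody is available to revoke is q^n. The probabilities used below are made-up values, not measurements.

```python
def attacker_can_revoke(n, p):
    """Probability that at least one of n holders is compromised."""
    return 1 - (1 - p) ** n


def nobody_available(n, q):
    """Probability that all n holders are unreachable at the same time."""
    return q ** n


# Hypothetical values: p = 1% compromise risk, q = 50% unavailability.
for n in (1, 3, 5, 10):
    print(n, round(attacker_can_revoke(n, 0.01), 4), round(nobody_available(n, 0.5), 4))
```

Adding holders drives the second number down quickly while the first grows roughly linearly for small p.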

January 31, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)

This article describes the UI deficiency of Evolution mail client that extrapolates the trust of one of OpenPGP key UIDs into the key itself, and reports it along with the (potentially untrusted) primary UID. This creates the possibility of tricking the user into trusting a phished mail via adding a forged UID to a key that has a previously trusted UID.


January 29, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)
Identity with OpenPGP trust model (January 29, 2019, 13:50 UTC)

Let’s say you want to send a confidential message to me, and possibly receive a reply. Through employing asymmetric encryption, you can prevent a third party from reading its contents, even if it can intercept the ciphertext. Through signatures, you can verify the authenticity of the message, and therefore detect any possible tampering. But for all this to work, you need to be able to verify the authenticity of the public keys first. In other words, we need to be able to prevent the aforementioned third party — possibly capable of intercepting your communications and publishing a forged key with my credentials on it — from tricking you into using the wrong key.

This renders key authenticity the fundamental problem of asymmetric cryptography. But before we start discussing how key certification is implemented, we need to cover another fundamental issue — identity. After all, who am I — who is the person you are writing to? Are you writing to a person you’ve met? Or to a specific Gentoo developer? Author of some project? Before you can distinguish my authentic key from a forged key, you need to be able to clearly distinguish me from an impostor.

Forms of identity

Identity via e-mail address

If your primary goal is to communicate with the owner of the particular e-mail address, it seems obvious to associate the identity with the owner of the e-mail address. However, how in reality would you distinguish a ‘rightful owner’ of the e-mail address from a cracker who managed to obtain access to it, or to intercept your network communications and inject forged mails?

The truth is, the best you can certify is that the owner of a particular key is able to read and/or send mails from a particular e-mail address, at a particular point in time. Then, if you can certify the same for a long enough period of time, you may reasonably assume the address is continuously used by the same identity (which may qualify as a legitimate owner or a cracker with a lot of patience).

Of course, all this relies on your trust in mail infrastructure not being compromised.

Identity via personal data

A stronger protection against crackers may be provided by associating the identity with personal data, as confirmed by government-issued documents. In case of OpenPGP, this is just the real name; X.509 certificates also provide fields for street address, phone number, etc.

The use of real names seems to be based on two assumptions: that your real name is reasonably well-known (e.g. it can be established with little risk of being replaced by a third party), and that the attacker does not wish to disclose his own name. Besides that, using real names meets with some additional criticism.

Firstly, requiring one to use his real name may be considered an invasion of privacy. Most notably, some people wish not to disclose or use their real names, and this effectively prevents them from ever being certified.

Secondly, real names are not unique. After all, the naming systems developed from the necessity of distinguishing individuals in comparatively small groups, and they simply don’t scale to the size of the Internet. Therefore, name collisions are entirely possible and we are relying on sheer luck that the attacker wouldn’t happen to have the same name as you do.

Thirdly and most importantly, verifying identity documents is non-trivial and untrained individuals are likely to fall victim to mediocre quality fakes. After all, we’re talking about people who hopefully read some article on verifying a particular kind of document but have no experience recognizing forgery, no specialized hardware (I suppose most of you don’t carry a magnifying glass and a UV light on you) and who may lack skills in comparing signatures or photographs (not to mention some people have really old photographs in documents). Some countries don’t even issue any official documentation for document verification in English!

Finally, even besides the point of forged documents, this relies on trust in administration.

Identity via photographs

This one I’m mentioning merely for completeness. OpenPGP keys allow adding a photo as one of your UIDs. However, this is rather rarely used (out of the keys my GnuPG fetched so far, less than 10% have photographs). The concerns are similar to those for personal data: it assumes that others reliably know what you look like, and that they are capable of reliably comparing faces.

Online identity

An interesting concept is to use your public online activity to prove your identity — such as websites or social media. This is generally based on cross-referencing multiple resources with cryptographically proven publishing access, and assuming that an attacker would not be able to compromise all of them simultaneously.

A form of this concept is utilized by keybase.io. This service builds trust in user profiles via cryptographically cross-linking your profiles on some external sites and/or your websites. Furthermore, it actively encourages other users to verify those external proofs as well.

This identity model relies entirely on trust in network infrastructure and external sites. The likelihood of it being compromised is reduced by (potentially) relying on multiple independent sites.

Web of Trust model

Most of the time, you won’t be able to directly verify the identity of everyone you’d like to communicate with. This creates a necessity of obtaining indirect proof of authenticity, and the model normally used for that purpose in OpenPGP is the Web of Trust. I won’t be getting into the fine details; you can find them e.g. in the GNU Privacy Handbook. For our purposes, it suffices to say that in WoT the authenticity of keys you haven’t verified may be assessed by people whose keys you trust already, or people they know, with a limited level of recursion.

The more key holders you can trust, the more keys you can have verified indirectly and the more likely it is that your future recipient will be in that group. Or that you will be able to get someone from across the world into your WoT by meeting someone residing much closer to yourself. Therefore, you’d naturally want the WoT to grow fast and include more individuals. You’d want to preach OpenPGP onto non-crypto-aware people. However, this comes with inherent danger: can you really trust that they will properly verify the identity of the keys they sign?

I believe this is the most fundamental issue with the WoT model: for it to work outside of small specialized circles, it has to include more and more individuals across the world. But this growth inevitably makes it easier for a malicious third party to find people that can be tricked into certifying keys with forged identities.

Conclusion

The fundamental problem in OpenPGP usage is finding the correct key and verifying its authenticity. This becomes especially complex given there is no single clear way of determining one’s identity on the Internet. Normally, OpenPGP uses a combination of real name and e-mail address, optionally combined with a photograph. However, all of them have their weaknesses.

Direct identity verification for all recipients is impractical, so indirect certification solutions are required. While the WoT model used by OpenPGP attempts to avoid the centralized trust specific to PKI, it is not clear whether it’s practically manageable. On one hand, it requires trusting more people in order to improve coverage; on the other, trusting more people makes it more vulnerable to fraud.

Given all the above, the trust-via-online-presence concept may be of some interest. Most importantly, it establishes a closer relationship between the identity you actually need and the identity you verify: e.g. you want to mail a person who is an open source developer and the author of some specific projects, rather than an arbitrary person with a common enough name. However, this concept is not established broadly yet.

January 26, 2019
Michał Górny a.k.a. mgorny (homepage, bugs)

This article briefly explains the historical git weakness in handling commits with multiple OpenPGP signatures in git older than v2.20. The method of creating such commits is presented, and the results of using them are described and analyzed.

Continue reading

January 20, 2019
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.16 (January 20, 2019, 21:10 UTC)

Two py3status versions in less than a month? That’s the holidays effect but not only!

Our community has been busy discussing our way forward to 4.0 (see below) and organization so it was time I wrote a bit about that.

Community

A new collaborator

First of all we have the great pleasure and honor to welcome Maxim Baz @maximbaz as a new collaborator on the project!

His engagement, numerous contributions and insightful reviews to py3status have made him a well-known community member, not to mention his IRC support 🙂

Once again, thank you for being there Maxim!

Zen of py3status

As a result of an interesting discussion, we worked on defining better how to contribute to py3status as well as a set of guidelines we agree on to get the project moving on smoothly.

Thus was born the zen of py3status, which extends the philosophy from the user point of view to the contributor point of view!

This allowed us to handle the numerous open pull requests and get their number down to 5 at the time of writing this post!

Even our dear @lasers doesn’t have any open PRs anymore 🙂

3.15 + 3.16 versions

Our magic @lasers has worked a lot on general modules options as well as adding support for i3-gaps added features such as border coloring and fine tuning.

Also interesting is the work of Thiago Kenji Okada @m45t3r around NixOS packaging of py3status. Thanks a lot for this work and for sharing Thiago!

I also liked the question from Andreas Lundblad @aioobe, who asked whether we could have a feature for displaying custom graphical output, such as a small PNG, upon clicking on the i3bar. You might be interested in following the i3 issue he opened.

Make sure to read the amazing changelog for details, a lot of modules have been enhanced!

Highlights

  • You can now set a background, border colors and their urgent counterparts on a global scale or per module
  • CI now checks for black format on modules, so now the whole code base obeys the black format style!
  • All HTTP requests based modules now have a standard way to define HTTP timeout as well as the same 10 seconds default timeout
  • py3-cmd now allows sending click events with modifiers
  • The py3status -n / --interval command line argument has been removed as it was obsolete. We will ignore it if you have set it, but it is better to remove it to keep things clean
  • You can specify your own i3status binary path using the new -u, --i3status command line argument thanks to @Dettorer and @lasers
  • Since Yahoo! decided to retire its public & free weather API, the weather_yahoo module has been removed

New modules

  • new conky module: display conky system monitoring (#1664), by lasers
  • new module emerge_status: display information about running gentoo emerge (#1275), by AnwariasEu
  • new module hueshift: change your screen color temperature (#1142), by lasers
  • new module mega_sync: to check for MEGA service synchronization (#1458), by Maxim Baz
  • new module speedtest: to check your internet bandwidth (#1435), by cyrinux
  • new module usbguard: control usbguard from your bar (#1376), by cyrinux
  • new module velib_metropole: display velib metropole stations and (e)bikes (#1515), by cyrinux

A word on 4.0

Do you wonder what’s gonna be in the 4.0 release?
Do you have ideas that you’d like to share?
Do you have dreams that you’d love to become true?

Then make sure to read and participate in the open RFC on 4.0 version!

Development has not started yet; we really want to hear from you.

Thank you contributors!

There would be no py3status release without our amazing contributors, so thank you guys!

  • AnwariasEu
  • cyrinux
  • Dettorer
  • ecks
  • flyingapfopenguin
  • girst
  • Jack Doan
  • justin j lin
  • Keith Hughitt
  • L0ric0
  • lasers
  • Maxim Baz
  • oceyral
  • Simon Legner
  • sridhars
  • Thiago Kenji Okada
  • Thomas F. Duellmann
  • Till Backhaus

January 09, 2019
FOSDEM 2019 (January 09, 2019, 00:00 UTC)

It’s FOSDEM time again! Join us at Université libre de Bruxelles, Campus du Solbosch, in Brussels, Belgium. This year’s FOSDEM 2019 will be held on February 2nd and 3rd.

Our developers will be happy to greet all open source enthusiasts at our Gentoo stand in building K. Visit this year’s wiki page to see who’s coming. So far eight developers have specified their attendance, with most likely many more on the way!

December 06, 2018
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Scylla Summit 2018 write-up (December 06, 2018, 22:53 UTC)

It’s been almost one month since I had the chance to attend and speak at Scylla Summit 2018 so I’m relieved to finally publish a short write-up on the key things I wanted to share about this wonderful event!

Make Scylla boring

This statement of Glauber Costa sums up what looked to me to be the main driver of the engineering efforts put into Scylla lately: making it work so consistently well on any kind of workload that it’s boring to operate 🙂

I will follow up on this statement to highlight the things I heard and (hopefully) understood during the summit. I hope you’ll find it insightful.

Reduced operational efforts

The thread-per-core and queues design still has a lot of possibilities to be leveraged.

The recent addition of RPC streaming capabilities to seastar allows a drastic reduction in the time it takes the cluster to grow or shrink (data rebalancing / resynchronization).

Incremental compaction is also very promising as this background process is one of the most expensive there is in the database’s design.

I was happy to hear that scylla-manager will soon be made available and free to use with basic features while retaining more advanced ones for enterprise version (like backup/restore).
I also noticed that the current version did not support SSL-enabled clusters for storing its configuration. So I asked Michał directly about it and I’m glad that it will be released in version 1.3.1.

Performant multi-tenancy

Why choose between real-time OLTP & analytics OLAP workloads?

The goal here is to be able to run both on the same cluster by giving users the ability to assign “SLA” shares to ROLES. That’s basically like pools on Hadoop at a much finer grain since it will create dedicated queues that will be weighted by their share.

Having one queue per usage and full accounting will make it possible to limit resources efficiently and will give users a say in their latency SLAs.

But Scylla also has a lot to do in the background to run smoothly. So while this design pattern was already applied to temper compactions, a lot of work has also been done on automatic flow control and back pressure.

For instance, Materialized Views are updated asynchronously, which means that while we can interact with and put a lot of pressure on the table they are based on (called the Main Table), we could overwhelm the background work that’s needed to keep the MVs’ View Tables in sync. To mitigate this, a smart back pressure approach was developed that throttles the clients to make sure that Scylla can manage to do everything at the best performance the hardware allows!

I was happy to hear that work on tiered storage is also planned to better optimize disk space costs for certain workloads.

Last but not least, columnar storage optimized for time series and analytics workloads is also something the developers are looking at.

Latency is expensive

If you care for latency, you might be happy to hear that a new polling API (named IOCB_CMD_POLL) has been contributed by Christoph Hellwig and Avi Kivity to the 4.19 Linux kernel which avoids context switching I/O by using a shared ring between kernel and userspace. Scylla will be using it by default if the kernel supports it.

The iotune utility has been upgraded since 2.3 to generate an enhanced I/O configuration.

Also, persistent (disk backed) in-memory tables are getting ready and are very promising for latency sensitive workloads!

A word on drivers

ScyllaDB has been relying on the Datastax drivers since the start. While it’s a good thing for the whole community, it’s important to note that the shard-per-CPU approach on data that Scylla is using is not known and leveraged by the current drivers.

Discussions took place and it seems that Datastax will not allow the protocol to evolve so that drivers could discover if the connected cluster is shard aware or not and then use this information to be more clever in which write/read path to use.

So for now ScyllaDB has been forking and developing their shard aware drivers for Java and Go (no Python yet… I was disappointed).

Kubernetes & containers

The ScyllaDB guys of course couldn’t avoid the Kubernetes frenzy so Moreno Garcia gave a lot of feedback and tips on how to operate Scylla on docker with minimal performance degradation.

Kubernetes has been designed for stateless applications, not stateful ones, and Docker does some automatic magic that has rather big performance hits on Scylla. You will basically have to play with affinities to dedicate one Scylla instance to run on one server with a “retain” reclaim policy.

Remember that the official Scylla docker image runs with dev-mode enabled by default which turns off all performance checks on start. So start by disabling that and look at all the tips and literature that Moreno has put online!

Scylla 3.0

A lot has been written on it already, so I will just briefly cover the things that are important to understand from my point of view.

  • Materialized Views do back fill the whole data set
    • this job is done by the view building process
    • you can watch its progress in the system_distributed.view_build_status table
  • Secondary Indexes are Materialized Views under the hood
    • it’s like a reverse pointer to the primary key of the Main Table
    • so if you read the whole row by selecting on the indexed column, two reads will be issued under the hood: one on the indexed MV view table to get the primary key and one on the main table to get the rest of the columns
    • so if your workload is mostly interested in the whole row, you’re better off creating a complete MV to read from than using a SI
    • this is even more true if you plan to do range scans as this double query could lead you to read from multiple nodes instead of one
  • Range scan is way more performant
    • ALLOW FILTERING finally allows a great flexibility by providing server-side filtering!

Random notes

Support for LWT (lightweight transactions) will be relying on a future implementation of the Raft consensus algorithm inside Scylla. This work will also benefit Materialized Views consistency. Duarte Nunes will be the one working on this and I envy him very much!

Support for search workloads is high in the ScyllaDB devs priorities so we should definitely hear about it in the coming months.

Support for “mc” sstables (new generation format) is done and will reduce storage requirements thanks to metadata / data compression. Migration will be transparent because Scylla can read previous formats as well so it will upgrade your sstables as it compacts them.

ScyllaDB developers have not settled on how to best implement CDC. I hope they do rather soon because it is crucial in their ability to integrate well with Kafka!

Materialized Views, Secondary Indexes and filtering will benefit from the work on partition key and indexes intersections to avoid server side filtering on the coordinator. That’s an important optimization to come!

Last but not least, I’ve had the pleasure to discuss with Takuya Asada who is the packager of Scylla for RedHat/CentOS & Debian/Ubuntu. We discussed Gentoo Linux packaging requirements as well as the recent and promising work on a relocatable package. We will collaborate more closely in the future!

November 25, 2018
Michał Górny a.k.a. mgorny (homepage, bugs)
Portability of tar features (November 25, 2018, 14:26 UTC)

The tar format is one of the oldest archive formats in use. It comes as no surprise that it is ugly, built as layers of hacks on the older format versions to overcome their limitations. However, given the POSIX standardization in the late 80s and the popularity of GNU tar, you would expect the interoperability problems to be mostly resolved nowadays.

This article is directly inspired by my proof-of-concept work on a new binary package format for Gentoo. My original proposal used the volume label to provide a user- and file(1)-friendly way of distinguishing our binary packages. While it is a GNU tar extension, it falls within the POSIX ustar implementation-defined file format and you would expect that non-compliant implementations would extract it as regular files. What I did not anticipate is that some implementations reject the whole archive instead.

This naturally raised more questions on how portable various tar formats actually are. To verify that, I have decided to analyze the standards for possible incompatibility dangers and build a suite of test inputs that could be used to check how various implementations cope with that. This article describes those points and provides test results for a number of implementations.

Please note that this article is focused merely on read-wise format compatibility. In other words, it establishes how tar files should be written in order to achieve the best probability that they will be read correctly afterwards. It does not investigate what formats the listed tools can write and whether they can correctly create archives using specific features.

Continue reading

November 10, 2018
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.14 (November 10, 2018, 21:08 UTC)

I’m happy to announce this release as it contains some very interesting developments in the project. This release was focused on core changes.

IMPORTANT notice

There are now two optional dependencies to py3status:

  • gevent
    • will monkey patch the code to make it concurrent
    • the main benefit is to use an asynchronous loop instead of threads
  • pyudev
    • will enable a udev monitor if a module asks for it (only xrandr so far)
    • the benefit is described below

To install them all using pip, simply do:

pip install py3status[all]

Modules can now react/refresh on udev events

When pyudev is available, py3status will allow modules to subscribe and react to udev events!

The xrandr module uses this feature by default, which allows the module to refresh instantly when you plug in or unplug a secondary monitor. This also means the xrandr command no longer has to run in the background, which saves a lot of CPU!

Highlights

  • py3status core uses black formatter
  • fix default i3status.conf detection
    • add ~/.config/i3 as a default config directory, closes #1548
    • add .config/i3/py3status in default user modules include directories
  • add markup (pango) support for modules (#1408), by @MikaYuoadas
  • py3: notify_user module name in the title (#1556), by @lasers
  • print module information to stdout instead of stderr (#1565), by @robertnf
  • battery_level module: default to using sys instead of acpi (#1562), by @eddie-dunn
  • imap module: fix output formatting issue (#1559), by @girst

Thank you contributors!

  • eddie-dunn
  • girst
  • MikaYuoadas
  • robertnf
  • lasers
  • maximbaz
  • tobes

October 04, 2018

ANSSI, the National Cybersecurity Agency of France, has released the sources of CLIP OS, which aims to build a hardened, multi-level operating system based on the Linux kernel and a lot of free and open source software. We are happy to hear that it is based on Gentoo Hardened!

September 28, 2018
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.13 (September 28, 2018, 11:56 UTC)

I am once again lagging behind the release blog posts but this one is an important one.

I’m proud to announce that our long time contributor @lasers has become an official collaborator of the py3status project!

Dear @lasers, your amazing energy and overwhelming ideas have served our little community for a while. I’m sure we’ll have a great way forward as we learn to work together with @tobes 🙂 Thank you again very much for everything you do!

This release is as much dedicated to you as it is yours 🙂

IMPORTANT notice

After this release, py3status coding style CI will enforce the ‘black’ formatter style.

Highlights

Needless to say, the changelog is huge as usual. Here is a very condensed view:

  • documentation updates, especially on the formatter (thanks @L0ric0)
  • py3 storage: use $XDG_CACHE_HOME or ~/.cache
  • formatter: multiple variable and feature fixes and enhancements
  • better config parser
  • new modules: lm_sensors, loadavg, mail, nvidia_smi, sql, timewarrior, wanda_the_fish

Thank you contributors!

  • lasers
  • tobes
  • maximbaz
  • cyrinux
  • Lorenz Steinert @L0ric0
  • wojtex
  • horgix
  • su8
  • Maikel Punie

September 27, 2018
Michał Górny a.k.a. mgorny (homepage, bugs)
New copyright policy explained (September 27, 2018, 06:47 UTC)

At their 2018-09-15 meeting, the Trustees gave the final stamp of approval to the new Gentoo copyright policy outlined in GLEP 76. This policy is the result of work that has been slowly progressing since 2005, and that gained considerable speed by the end of 2017. It is a major step forward from the status quo that has been used since the forming of the Gentoo Foundation, and that has mostly been inherited from the earlier Gentoo Technologies.

The policy aims to cover all copyright-related aspects, bringing Gentoo in line with the practices used in many other large open source projects. Most notably, it introduces a concept of Gentoo Certificate of Origin that requires all contributors to confirm that they are entitled to submit their contributions to Gentoo, and corrects the copyright attribution policy to be viable under more jurisdictions.

This article aims to briefly reiterate the most important points of the new copyright policy, and to provide a detailed guide on following it in Q&A form.

Continue reading

September 15, 2018
Michał Górny a.k.a. mgorny (homepage, bugs)

With Qt5 gaining support for high-DPI displays, and applications starting to exercise that support, it’s easy for applications to suddenly become unusable with some screens. For example, my old Samsung TV reported itself as a 7″ screen. This used not to really matter, with websites forcing you to force the resolution to 96 DPI anyway, but the high-DPI applications started scaling themselves to occupy most of my screen, with elements becoming really huge (and ugly, apparently due to some poor scaling).

It turns out that it is really hard to find a solution for this. Most of the guides and tips are focused either on proprietary drivers or on getting custom resolutions. The DisplaySize specification in xorg.conf apparently did not change anything either. Finally, I was able to resolve the issue by overriding the EDID data for my screen. This guide explains how I did it.

Step 1: dump EDID data

Firstly, you need to get the EDID data from your monitor. Supposedly the read-edid tool could be used for this purpose, but it did not work for me. With only a little bit more effort, you can get it e.g. from xrandr:

$ xrandr --verbose
[...]
HDMI-0 connected primary 1920x1080+0+0 (0x57) normal (normal left inverted right x axis y axis) 708mm x 398mm
[...]
  EDID:
    00ffffffffffff004c2dfb0400000000
    2f120103804728780aee91a3544c9926
    0f5054bdef80714f8100814081809500
    950fb300a940023a801871382d40582c
    4500c48e2100001e662150b051001b30
    40703600c48e2100001e000000fd0018
    4b1a5117000a2020202020200000000a
    0053414d53554e470a20202020200143
    020323f14b901f041305140312202122
    2309070783010000e2000f67030c0010
    00b82d011d007251d01e206e285500c4
    8e2100001e011d00bc52d01e20b82855
    40c48e2100001e011d8018711c162058
    2c2500c48e2100009e011d80d0721c16
    20102c2580c48e2100009e0000000000
    00000000000000000000000000000029
[...]

If you have multiple displays connected, make sure to use the EDID for the one you’re overriding. Copy the hexdump and convert it to a binary blob. You can do this by passing it through xxd -p -r (installed by vim).
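
If you would rather not copy the hexdump by hand, the extraction can be scripted. The following is only a sketch: the function name edid_hex is made up for illustration, and it assumes the xrandr --verbose layout shown above (output names starting in the first column, an indented EDID: line followed by lines consisting of hex digits only).

```shell
#!/bin/sh
# Hypothetical helper: print the EDID hexdump of one output, reading
# `xrandr --verbose` text from stdin. $1 is the output name (e.g. HDMI-0).
edid_hex() {
    awk -v out="$1" '
        # A line starting in column 0 begins a new output (or screen) section.
        /^[^ \t]/ { in_out = ($1 == out); in_edid = 0; next }
        # The EDID property header of the selected output.
        in_out && /^[ \t]*EDID:/ { in_edid = 1; next }
        # EDID data lines consist solely of hex digits; strip the indentation.
        in_edid && /^[ \t]*[0-9a-fA-F]+[ \t]*$/ { gsub(/[ \t]/, ""); print; next }
        # Anything else ends the EDID property.
        { in_edid = 0 }
    '
}
```

Assuming HDMI-0 is the output in question, running xrandr --verbose | edid_hex HDMI-0 | xxd -r -p > edid.bin should leave you with a blob whose size is a multiple of 128 bytes (256 bytes for the dump above).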

Step 2: fix screen dimensions

Once you have the EDID blob ready, you need to update the screen dimensions inside it. Initially, I did it using a hex editor, which involved finding all the occurrences, updating them (and manually encoding them into the weird split integers) and correcting the checksums. Then, I wrote edid-fixdim so you wouldn’t have to repeat that experience.
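
To give an idea of what the split integers look like: assuming the usual EDID 1.3 layout (this detail is an assumption here, not something taken from the tool), each 18-byte detailed timing descriptor keeps the image size in millimetres as two 12-bit values, with the low 8 bits of width and height in bytes 12 and 13 and the two upper nibbles packed together in byte 14. A minimal sketch of the decoding and encoding, with hypothetical function names:

```shell
#!/bin/sh
# Hypothetical helper: decode the image size from bytes 12, 13 and 14 of a
# detailed timing descriptor. Prints "<width> <height>" in millimetres.
dtd_size_mm() {
    lo_w=$(( $1 ))
    lo_h=$(( $2 ))
    hi=$(( $3 ))
    # Upper nibble of byte 14 extends the width, lower nibble the height.
    echo "$(( (($hi >> 4) << 8) | $lo_w )) $(( (($hi & 15) << 8) | $lo_h ))"
}

# Hypothetical inverse: encode width and height (mm, 0..4095) into the
# three bytes, printed as hex.
dtd_size_bytes() {
    printf '%02x %02x %02x\n' \
        "$(( $1 & 255 ))" \
        "$(( $2 & 255 ))" \
        "$(( (($1 >> 8) << 4) | ($2 >> 8) ))"
}
```

In the dump above, the first descriptor holds the bytes c4 8e 21 at that position, so dtd_size_mm 0xc4 0x8e 0x21 prints 708 398, matching the xrandr output, while dtd_size_bytes 1600 900 prints 40 84 63.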

First, use --get option to verify that your EDID is supported correctly:

$ edid-fixdim -g edid.bin
EDID structure: 71 cm x 40 cm
Detailed timing desc: 708 mm x 398 mm
Detailed timing desc: 708 mm x 398 mm
CEA EDID found
Detailed timing desc: 708 mm x 398 mm
Detailed timing desc: 708 mm x 398 mm
Detailed timing desc: 708 mm x 398 mm
Detailed timing desc: 708 mm x 398 mm

So your EDID consists of basic EDID structure, followed by one extension block. The screen dimensions are stored in 7 different blocks you’d have to update, and referenced in two checksums. The tool will take care of updating it all for you, so just pass the correct dimensions to --set:

$ edid-fixdim -s 1600x900 edid.bin
EDID structure updated to 160 cm x 90 cm
Detailed timing desc updated to 1600 mm x 900 mm
Detailed timing desc updated to 1600 mm x 900 mm
CEA EDID found
Detailed timing desc updated to 1600 mm x 900 mm
Detailed timing desc updated to 1600 mm x 900 mm
Detailed timing desc updated to 1600 mm x 900 mm
Detailed timing desc updated to 1600 mm x 900 mm

Afterwards, you can use --get again to verify that the changes were made correctly.
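
If you want an independent check of the two checksums, the common description of the EDID format is that every 128-byte block must sum to 0 modulo 256, with the last byte of the block chosen to make that true. The sketch below relies on that assumption; edid_block_sum is a hypothetical helper, not part of edid-fixdim.

```shell
#!/bin/sh
# Hypothetical helper: print the sum (mod 256) of the bytes of 128-byte
# block number $2 (0-based) in file $1. A block with a valid checksum
# yields 0. Note that a missing (empty) block also yields 0.
edid_block_sum() {
    dd if="$1" bs=128 skip="$2" count=1 2>/dev/null |
        od -An -v -tu1 |
        awk '{ for (i = 1; i <= NF; i++) s += $i } END { print s % 256 }'
}
```

For the two-block file used here, edid_block_sum edid.bin 0 and edid_block_sum edid.bin 1 should both print 0, for the base block and the extension block respectively.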

Step 3: overriding EDID data

Now it’s just a matter of putting the override in motion. First, make sure to enable CONFIG_DRM_LOAD_EDID_FIRMWARE in your kernel:

Device Drivers  --->
  Graphics support  --->
    Direct Rendering Manager (XFree86 4.1.0 and higher DRI support)  --->
      [*] Allow to specify an EDID data set instead of probing for it

Then, determine the correct connector name. You can find it in dmesg output:

$ dmesg | grep -C 1 Connector
[   15.192088] [drm] ib test on ring 5 succeeded
[   15.193461] [drm] Radeon Display Connectors
[   15.193524] [drm] Connector 0:
[   15.193580] [drm]   HDMI-A-1
--
[   15.193800] [drm]     DFP1: INTERNAL_UNIPHY1
[   15.193857] [drm] Connector 1:
[   15.193911] [drm]   DVI-I-1
--
[   15.194210] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[   15.194267] [drm] Connector 2:
[   15.194322] [drm]   VGA-1

Copy the new EDID blob into a location of your choice inside /lib/firmware:

$ mkdir /lib/firmware/edid
$ cp edid.bin /lib/firmware/edid/samsung.bin

Finally, add the override to your kernel command-line:

drm.edid_firmware=HDMI-A-1:edid/samsung.bin
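
How the option gets onto the command line depends on your bootloader. As an example only, assuming GRUB 2 with its usual /etc/default/grub, the fragment could look like this (the connector name and path are the ones used above):

```shell
# /etc/default/grub (fragment); assumes GRUB 2 with grub-mkconfig.
# Append the override to whatever options are already set.
GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX} drm.edid_firmware=HDMI-A-1:edid/samsung.bin"
```

Then regenerate the configuration with grub-mkconfig -o /boot/grub/grub.cfg. If the graphics driver is built into the kernel or loaded from an initramfs, the blob presumably has to be available there as well, since /lib/firmware may not be mounted yet at that point.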

If everything went fine, xrandr should report the correct screen dimensions after the next reboot, and dmesg should report that the EDID override has been loaded:

$ dmesg | grep EDID
[   15.549063] [drm] Got external EDID base block and 1 extension from "edid/samsung.bin" for connector "HDMI-A-1"

If it didn't, check dmesg for error messages.

September 07, 2018
Gentoo congratulates our GSoC participants (September 07, 2018, 00:00 UTC)

Gentoo would like to congratulate Gibix and JSteward for finishing and passing Google’s Summer of Code for the 2018 calendar year. Gibix contributed by enhancing Rust (programming language) support within Gentoo. JSteward contributed by making a full Gentoo GNU/Linux distribution, managed by Portage, run on devices which use the original Android-customized kernel.

The final reports of their projects can be reviewed on their personal blogs:

August 24, 2018
Michał Górny a.k.a. mgorny (homepage, bugs)

I have recently worked on enabling 2-step authentication via SSH on the Gentoo developer machine. I have selected google-authenticator-libpam amongst different available implementations as it seemed the best maintained and having all the necessary features, including a friendly tool for users to configure it. However, its design has a weakness: it stores the secret unprotected in user’s home directory.

This means that if an attacker manages to gain at least temporary access to the filesystem with user’s privileges — through a malicious process, vulnerability or simply because someone left the computer unattended for a minute — he can trivially read the secret and therefore clone the token source without leaving a trace. It would completely defeat the purpose of the second step, and the user may not even notice until the attacker makes real use of the stolen secret.

In order to protect against this, I’ve created google-authenticator-wrappers (as upstream decided to ignore the problem). This package provides a rather trivial setuid wrapper that manages a write-only, authentication-protected secret store for the PAM module. Additionally, it comes with a test program (so you can test the OTP setup without jumping through the hoops or risking losing access) and friendly wrappers for the default setup, as used on Gentoo Infra.

The recommended setup (as utilized by the sys-auth/google-authenticator-wrappers package) is to use a dedicated user for the password store. In this scenario, the users are unable to read their secrets, and all secret operations (including authentication via the PAM module) are done using an unprivileged user. Furthermore, any operation regarding the configuration (either updating it or removing the second step) requires regular PAM authentication (e.g. typing your own password).

This is consistent with e.g. how shadow operates (users can’t read their passwords, nor update them without authenticating first), how most sites using 2-factor authentication operate (again, users can’t read their secrets) and follows the RFC 6238 recommendation (that keys […] SHOULD be protected against unauthorized access and usage). It solves the aforementioned issue by preventing user-privileged processes from reading the secrets and recovery codes. Furthermore, it prevents the attacker with this particular level of access from disabling 2-step authentication, changing the secret or even weakening the configuration.

August 17, 2018
Luca Barbato a.k.a. lu_zero (homepage, bugs)
Gentoo on Integricloud (August 17, 2018, 22:44 UTC)

Integricloud gave me access to their infrastructure to track some issues on ppc64 and ppc64le.

Since some of the issues are related to the compilers, I obviously installed Gentoo on it and in the process I started to fix some issues with catalyst to get a working install media, but that’s for another blogpost.

Today I’m just giving a walk-through on how to get a ppc64le (and ppc64 soon) VM up and running.

Preparation

Read this and get your install media available to your instance.

Install Media

I’m using the Gentoo installcd I’m currently refining.

Booting

You have to append console=hvc0 to your boot command. The boot process might figure it out for you on newer install media (I still have to send patches to update livecd-tools).

Network configuration

You have to set up the network manually.
You can use ifconfig and route or ip as you like, refer to your instance setup for the parameters.

# Either with ifconfig and route:
ifconfig enp0s0 ${ip}/16
route add -net default gw ${gw}
echo "nameserver 8.8.8.8" > /etc/resolv.conf

# Or with ip:
ip a add ${ip}/16 dev enp0s0
ip l set enp0s0 up
ip r add default via ${gw}
echo "nameserver 8.8.8.8" > /etc/resolv.conf

Disk Setup

OpenFirmware seems to like gpt much better:

parted /dev/sda mklabel gpt

You may use fdisk to create:
– a PowerPC PReP boot partition of 8M
– root partition with the remaining space

Device     Start      End  Sectors Size Type
/dev/sda1   2048    18431    16384   8M PowerPC PReP boot
/dev/sda2  18432 33554654 33536223  16G Linux filesystem

I’m using btrfs and zstd-compress /usr/portage and /usr/src/.

mkfs.btrfs /dev/sda2

Initial setup

It is pretty much the usual.

mount /dev/sda2 /mnt/gentoo
cd /mnt/gentoo
wget https://dev.gentoo.org/~mattst88/ppc-stages/stage3-ppc64le-20180810.tar.xz
tar -xpf stage3-ppc64le-20180810.tar.xz
mount -o bind /dev dev
mount -t devpts devpts dev/pts
mount -t proc proc proc
mount -t sysfs sys sys
cp /etc/resolv.conf etc
chroot .

You just have to emerge grub and gentoo-sources, I diverge from the defconfig by making btrfs builtin.

My /etc/portage/make.conf:

CFLAGS="-O3 -mcpu=power9 -pipe"
# WARNING: Changing your CHOST is not something that should be done lightly.
# Please consult https://wiki.gentoo.org/wiki/Changing_the_CHOST_variable before changing.
CHOST="powerpc64le-unknown-linux-gnu"

# NOTE: This stage was built with the bindist Use flag enabled
PORTDIR="/usr/portage"
DISTDIR="/usr/portage/distfiles"
PKGDIR="/usr/portage/packages"

USE="ibm altivec vsx"

# This sets the language of build output to English.
# Please keep this setting intact when reporting bugs.
LC_MESSAGES=C
ACCEPT_KEYWORDS=~ppc64

MAKEOPTS="-j4 -l6"
EMERGE_DEFAULT_OPTS="--jobs 10 --load-average 6 "

My minimal set of packages I need before booting:

emerge grub gentoo-sources vim btrfs-progs openssh

NOTE: You want to re-emerge openssh afterwards and make sure bindist is not in your USE.

Kernel & Bootloader

cd /usr/src/linux
make defconfig
make menuconfig # I want btrfs builtin so I can avoid an initrd
make -j 10 all && make install && make modules_install
grub-install /dev/sda1
grub-mkconfig -o /boot/grub/grub.cfg

NOTE: make sure you pass /dev/sda1 otherwise grub will happily assume OpenFirmware knows about btrfs and just point it to your directory.
That’s not the case unfortunately.

Networking

I’m using netifrc and I’m using the eth0-naming-convention.

touch /etc/udev/rules.d/80-net-name-slot.rules
ln -sf /etc/init.d/net.{lo,eth0}
echo -e "config_eth0=\"${ip}/16\"\nroutes_eth0=\"default via ${gw}\"\ndns_servers_eth0=\"8.8.8.8\"" > /etc/conf.d/net
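
The nested quoting in that echo line is easy to get wrong, so here is a minimal Python sketch that renders the same three netifrc settings. The function name and the example addresses are placeholders of mine, not part of the original setup:

```python
def render_netifrc(ip: str, gw: str, dns: str = "8.8.8.8", prefix: int = 16) -> str:
    """Return /etc/conf.d/net content for a static eth0 setup (illustrative helper)."""
    lines = [
        f'config_eth0="{ip}/{prefix}"',
        f'routes_eth0="default via {gw}"',
        f'dns_servers_eth0="{dns}"',
    ]
    return "\n".join(lines) + "\n"


# Placeholder addresses; use the values from your instance setup instead.
print(render_netifrc("10.0.0.10", "10.0.0.1"), end="")
```

Redirecting the output to /etc/conf.d/net inside the chroot should give the same file as the echo command.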

Password and SSH

Even if the mticlient is quite nice, you would rather use ssh as much as you could.

passwd 
rc-update add sshd default

Finishing touches

Right now sysvinit does not add the hvc0 console as it should due to a profile quirk. For now, check /etc/inittab and, if the entry is missing, add it:

echo 'hvc0:2345:respawn:/sbin/agetty -L 9600 hvc0' >> /etc/inittab

Add your user and add your ssh key and you are ready to use your new system!

August 15, 2018
Michał Górny a.k.a. mgorny (homepage, bugs)
new* helpers can read from stdin (August 15, 2018, 09:21 UTC)

Did you know that new* helpers can read from stdin? Well, now you know! So instead of writing to a temporary file you can install your inline text straight to the destination:

src_install() {
  # old code
  # (<<- strips leading tabs only, so indent the here-documents with tabs)
  cat <<-EOF >"${T}"/mywrapper || die
    #!/bin/sh
    exec do-something --with-some-argument
  EOF
  dobin "${T}"/mywrapper

  # replacement
  newbin - mywrapper <<-EOF
    #!/bin/sh
    exec do-something --with-some-argument
  EOF
}

August 13, 2018
Michał Górny a.k.a. mgorny (homepage, bugs)

The recent efforts on improving the security of different areas of Gentoo have brought some arguments. Some time ago one of the developers considered whether he would withstand physical violence if an attacker used it in order to compromise Gentoo. A few days later another developer suggested that an attacker could pay Gentoo developers to compromise the distribution. Is this a real threat to Gentoo? Are we all doomed?

Before I answer this question, let me make an important presumption. Gentoo is a community-driven open source project. As such, it has certain inherent weaknesses and there is no way around them without changing what Gentoo fundamentally is. Those weaknesses are common to all projects of the same nature.

Gentoo could indeed be compromised if developers are subject to the threat of violence to themselves or their families. As for money, I don’t want to insult anyone and I don’t think it really matters. The fact is, Gentoo is vulnerable to any adversary resourceful enough, and there are certainly both easier and cheaper ways than the two mentioned. For example, the adversary could get a new developer recruited, or simply trick one of the existing developers into compromising the distribution. It just takes one developer out of ~150.

As I said, there is no way around that without making major changes to the organizational structure of Gentoo. Those changes would probably do more harm to Gentoo than good. We can just admit that we can’t fully protect Gentoo from a focused attack by a resourceful adversary, and all we can do is to limit the potential damage, detect it quickly and counteract the best we can. However, in reality random probes and script kiddie attacks that focus on trivial technical vulnerabilities are more likely, and that’s what the security efforts end up focusing on.

August 12, 2018


Congratulations to security researcher and Gentoo developer Hanno Böck and his co-authors Juraj Somorovsky and Craig Young for winning one of this year’s coveted Pwnie awards!

The award is for their work on the Return Of Bleichenbacher’s Oracle Threat or ROBOT vulnerability, which at the time of discovery affected such illustrious sites as Facebook and PayPal. Technical details can be found in the full paper published at the Cryptology ePrint Archive.


As last year, there will be a Gentoo booth again at the upcoming FrOSCon “Free and Open Source Conference” in St. Augustin near Bonn! Visitors can meet Gentoo developers to ask any question, get Gentoo swag, and prepare, configure, and compile their own Gentoo buttons.

The conference is 25th and 26th of August 2018, and there is no entry fee. See you there!

July 19, 2018
Alexys Jacob a.k.a. ultrabug (homepage, bugs)

This quick article is a wrap up for reference on how to connect to ScyllaDB using Spark 2 when authentication and SSL are enforced for the clients on the Scylla cluster.

We encountered multiple problems, even more since we distribute our workload using a YARN cluster so that our worker nodes should have everything they need to connect properly to Scylla.

We found very little help online so I hope it will serve anyone facing similar issues (that’s also why I copy/pasted them here).

The authentication part is easy going by itself and was not the source of our problems, SSL on the client side was.

Environment

  • (py)spark: 2.1.0.cloudera2
  • spark-cassandra-connector: datastax:spark-cassandra-connector: 2.0.1-s_2.11
  • python: 3.5.5
  • java: 1.8.0_144
  • scylladb: 2.1.5

SSL cipher setup

The Datastax spark cassandra driver uses the TLS_RSA_WITH_AES_256_CBC_SHA cipher by default, which the JVM does not support out of the box. This raises the following error when connecting to Scylla:

18/07/18 13:13:41 WARN channel.ChannelInitializer: Failed to initialize a channel. Closing: [id: 0x8d6f78a7]
java.lang.IllegalArgumentException: Cannot support TLS_RSA_WITH_AES_256_CBC_SHA with currently installed providers

According to the ssl documentation we have two ciphers available:

  1. TLS_RSA_WITH_AES_256_CBC_SHA
  2. TLS_RSA_WITH_AES_128_CBC_SHA

We can get rid of the error by lowering the cipher to TLS_RSA_WITH_AES_128_CBC_SHA using the following configuration:

.config("spark.cassandra.connection.ssl.enabledAlgorithms", "TLS_RSA_WITH_AES_128_CBC_SHA")\

However, this is not really a good solution and instead we’d be inclined to use the TLS_RSA_WITH_AES_256_CBC_SHA version. For this we need to follow Datastax’s procedure.

Then we need to deploy the JCE security jars on all our client nodes. If you are using YARN like us, this means that you have to deploy these jars to all your NodeManager nodes.

For example by hand:

# unzip jce_policy-8.zip
# cp UnlimitedJCEPolicyJDK8/*.jar /opt/oracle-jdk-bin-1.8.0.144/jre/lib/security/

Java trust store

When connecting, the clients need to be able to validate the Scylla cluster’s self-signed CA. This is done by setting up a trustStore JKS file and providing it to the spark connector configuration (note that you protect this file with a password).

keyStore vs trustStore

In an SSL handshake, the purpose of the trustStore is to verify credentials and the purpose of the keyStore is to provide credentials. The keyStore in Java stores private keys and the certificates corresponding to their public keys, and is required if you are an SSL server or if SSL requires client authentication. The trustStore stores certificates from third parties or your own self-signed certificates; your application identifies and validates them using this trustStore.
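
For readers more at home in Python than in Java, the same split can be sketched with the standard ssl module. This is only an analogy of mine and not what the Spark connector does internally; the helper name and its parameters are hypothetical:

```python
import ssl


def make_client_context(ca_file=None, cert_file=None, key_file=None):
    """Build a client-side TLS context (illustrative analogy to the JKS stores)."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    if ca_file:
        # "trustStore" role: CA certificates used to verify the server.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # "keyStore" role: our own certificate and private key, only
        # needed when the server asks for client authentication.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


ctx = make_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)
```

Passing the self-signed CA file as ca_file would play the part that COMPANY_TRUSTSTORE.jks plays for the JVM.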

The spark-cassandra-connector documentation has two options to handle keyStore and trustStore.

When we did not use the trustStore option, we would get some obscure error when connecting to Scylla:

com.datastax.driver.core.exceptions.TransportException: [node/1.1.1.1:9042] Channel has been closed

After enabling DEBUG logging, we got a clearer error which indicated a failure in validating the SSL certificate provided by the Scylla server node:

Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

setting up the trustStore JKS

You need to have the self-signed CA public certificate file, then issue the following command:

# keytool -importcert -file /usr/local/share/ca-certificates/MY_SELF_SIGNED_CA.crt -keystore COMPANY_TRUSTSTORE.jks -noprompt
Enter keystore password:  
Re-enter new password: 
Certificate was added to keystore

using the trustStore

Now you need to configure spark to use the trustStore like this:

.config("spark.cassandra.connection.ssl.trustStore.password", "PASSWORD")\
.config("spark.cassandra.connection.ssl.trustStore.path", "COMPANY_TRUSTSTORE.jks")\

Spark SSL configuration example

This wraps up the SSL connection configuration used for spark.

This example uses pyspark2 and reads a table in Scylla from a YARN cluster:

$ pyspark2 --packages datastax:spark-cassandra-connector:2.0.1-s_2.11 --files COMPANY_TRUSTSTORE.jks

>>> spark = SparkSession.builder.appName("scylla_app")\
.config("spark.cassandra.auth.password", "test")\
.config("spark.cassandra.auth.username", "test")\
.config("spark.cassandra.connection.host", "node1,node2,node3")\
.config("spark.cassandra.connection.ssl.clientAuth.enabled", True)\
.config("spark.cassandra.connection.ssl.enabled", True)\
.config("spark.cassandra.connection.ssl.trustStore.password", "PASSWORD")\
.config("spark.cassandra.connection.ssl.trustStore.path", "COMPANY_TRUSTSTORE.jks")\
.config("spark.cassandra.input.split.size_in_mb", 1)\
.config("spark.yarn.queue", "scylla_queue").getOrCreate()

>>> df = spark.read.format("org.apache.spark.sql.cassandra").options(table="my_table", keyspace="test").load()
>>> df.show()

July 06, 2018
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
A botspot story (July 06, 2018, 14:50 UTC)

I felt like sharing a recent story that allowed us to identify a bot in a haystack thanks to Scylla.

 

The scenario

While working on loading 2B+ rows into Scylla from Hive (using Spark), we noticed a strange behaviour in the performance of one of our nodes:

 

So we started wondering why that server in blue was having those peaks of load and was clearly diverging from the two others… As we obviously expect the three nodes to behave the same, there were two options on the table:

  1. hardware problem on the node
  2. bad data distribution (bad schema design? consistent hash problem?)

We shared this with our pals from ScyllaDB and started working on finding out what was going on.

The investigation

Hardware?

A hardware problem was ruled out pretty quickly: nothing showed up in the monitoring or in the kernel logs. I/O queues and throughput were good:

Data distribution?

Avi Kivity (ScyllaDB’s CTO) quickly got the feeling that something was wrong with the data distribution and that we could be facing a hotspot situation. He quickly nailed it down to shard 44 thanks to the scylla-grafana-monitoring platform.

Data is distributed between shards that are stored on nodes (consistent hash ring). This distribution is done by hashing the primary key of your data which dictates the shard it belongs to (and thus the node(s) where the shard is stored).

If one of your keys is over represented in your original data set, then the shard it belongs to can be overly populated and the related node overloaded. This is called a hotspot situation.
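
To make the hotspot effect concrete, here is a small simulation sketch. It is a deliberate simplification: it hashes keys with SHA-256 and takes a modulo, which is not Scylla's actual partitioner, and the shard count and key names are made up for the illustration:

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count, chosen only for the illustration


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard by hashing it (simplified stand-in)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def rows_per_shard(keys, num_shards: int = NUM_SHARDS) -> Counter:
    """Count how many rows land on each shard."""
    return Counter(shard_for(k, num_shards) for k in keys)


# 1000 distinct keys with one row each, plus one key repeated 5000 times.
keys = [f"user-{i}" for i in range(1000)] + ["noisy-key"] * 5000
load = rows_per_shard(keys)
hot_shard, hot_rows = load.most_common(1)[0]
print(f"hottest shard: {hot_shard} with {hot_rows} rows out of {len(keys)}")
```

The distinct keys spread out over all shards, while every row of the repeated key lands on the same one, which is exactly the imbalance described above.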

tracing queries

The first step was to trace queries in Scylla to try to get deeper into the hotspot analysis. So we enabled tracing using the following formula to get about 1 trace per second in the system_traces namespace.

tracing probability = 1 / expected requests per second throughput

In our case, we were doing between 90K req/s and 150K req/s so we settled for 100K req/s to be safe and enabled tracing on our nodes like this:

# nodetool settraceprobability 0.00001
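
The arithmetic behind that value, as a tiny sketch (the function name is mine; only the formula comes from above):

```python
def trace_probability(expected_rps: float, traces_per_second: float = 1.0) -> float:
    """Probability to pass to settraceprobability for a target trace rate."""
    if expected_rps <= 0:
        raise ValueError("expected_rps must be positive")
    # Never return more than 1.0: a probability above 100% makes no sense.
    return min(1.0, traces_per_second / expected_rps)


p = trace_probability(100_000)
print(f"nodetool settraceprobability {p:.5f}")
```

With this probability, the real throughput of 90K to 150K req/s yields roughly 0.9 to 1.5 traces per second.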

Turns out tracing didn’t help very much in our case because the traces do not include the query parameters in Scylla 2.1; this is becoming available in the soon to be released 2.2 version.

NOTE: traces expire on the tables, so make sure you TRUNCATE the events and sessions tables while iterating. Otherwise you will have to wait for the next gc_grace_seconds period (10 days by default) before they are actually removed. If you do not do that and generate millions of traces like we did, querying the mentioned tables will likely time out because of the “tombstoned” rows even if there is no trace inside any more.

looking at cfhistograms

Glauber Costa was also helping on the case and got us looking at the cfhistograms of the tables we were pushing data to. That proved to be clearly highlighting a hotspot problem:

histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                             (micros)          (micros)           (bytes)                  
50%             0,00              6,00              0,00               258                 2
75%             0,00              6,00              0,00               535                 5
95%             0,00              8,00              0,00              1916                24
98%             0,00             11,72              0,00              3311                50
99%             0,00             28,46              0,00              5722                72
Min             0,00              2,00              0,00               104                 0
Max             0,00          45359,00              0,00          14530764            182785

What this basically means is that the 99th percentile of our partition sizes is small (about 5KB) while the biggest partition is 14MB! That’s a huge difference and clearly shows that we have a hotspot on a partition somewhere.

So now we know for sure that we have an over represented key in our data set, but what key is it and why?

The culprit

So we looked at the cardinality of our data set keys which are SHA256 hashes and found out that indeed we had one with more than 1M occurrences while the second highest one was around 100K!…
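
A cardinality check like this can be sketched in a few lines. The sample data below is made up and much smaller than our real data set; on billions of rows you would run the equivalent aggregation in Spark or Hive rather than in plain Python:

```python
from collections import Counter


def key_cardinality(keys, top=2):
    """Return the `top` most frequent keys with their occurrence counts."""
    return Counter(keys).most_common(top)


# Made-up sample: one dominant key, then ordinary ones.
sample = ["bot-hash"] * 1000 + ["user-a"] * 100 + ["user-b"] * 40
(first_key, first_count), (second_key, second_count) = key_cardinality(sample)
print(f"{first_key}: {first_count} rows, {first_count / second_count:.0f}x the runner-up")
```

A large gap between the first and the second key, like the 1M versus 100K we saw, is the signature of an over represented key.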

Now that we had the main culprit hash, we turned to our data streaming pipeline to figure out what kind of event was generating the data associated to the given SHA256 hash… and surprise! It was a client’s quality assurance bot that was constantly browsing their own website with legitimate behaviour and identity credentials associated to it.

So we modified our pipeline to detect this bot and discard its events so that it stops polluting our databases with fake data. Then we cleaned up the million events’ worth of mess and traces we had stored about the bot.

The aftermath

Finally, we cleared out the data in Scylla and tried again from scratch. Needless to say that the curves got way better and are exactly what we should expect from a well balanced cluster:

Thanks a lot to the ScyllaDB team for their thorough help and high spirited support!

I’ll quote them to conclude this quick blog post:

July 03, 2018
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)

Turns out I should read things like the gentoo wiki upgrade guide *first* to avoid issues ..

After installing php 7.1 to replace the (quite old) php 5.6 I did check php.ini ,.. but forgot to check for compiled modules and setting PHP_TARGETS .. then wondered why I just got Internal Server Error messages..

Thanks to the php team for writing this nice guide to remind people like me of what to do:

https://wiki.gentoo.org/wiki/PHP/Upgrading_to_PHP_7.1

As always the Gentoo Wiki is a great source of information, and I like to use it as a reminder of the things needed when installing/upgrading,.. ;)

June 29, 2018
My comments on the Gentoo Github hack (June 29, 2018, 16:00 UTC)

Several news outlets are reporting on the takeover of the Gentoo GitHub organization that was announced recently. Today 28 June at approximately 20:20 UTC unknown individuals have gained control of the Github Gentoo organization, and modified the content of repositories as well as pages there. We are still working to determine the exact extent and …

June 28, 2018

2018-07-04 14:00 UTC

We believe this incident is now resolved. Please see the incident report for details about the incident, its impact, and resolution.

2018-06-29 15:15 UTC

The community raised questions about the provenance of Gentoo packages. Gentoo development is performed on hardware run by the Gentoo Infrastructure team (not GitHub). The Gentoo hardware was unaffected by this incident. Users using the default Gentoo mirroring infrastructure should not be affected.

If you are still concerned about provenance or are unsure what solution you are using, please consult https://wiki.gentoo.org/wiki/Project:Portage/Repository_Verification. This will instruct you on how to verify your repository.

2018-06-29 06:45 UTC

The gentoo GitHub organization remains temporarily locked down by GitHub support, pending fixes to pull-request content.

For ongoing status, please see the Gentoo infra-status incident page.

For later followup, please see the Gentoo Wiki page for GitHub 2018-06-28. An incident post-mortem will follow on the wiki.

June 03, 2018
Sebastian Pipping a.k.a. sping (homepage, bugs)

Repology is monitoring package repositories across Linux distributions. By now, Atom feeds of per-maintainer outdated packages that I was waiting for have been implemented.

So I subscribed to my own Gentoo feed using net-mail/rss2email and now Repology notifies me via e-mail of new upstream releases that other Linux distros have packaged that I still need to bump in Gentoo. In my case, it brought an update of dev-vcs/svn2git to my attention that I would have missed (or heard about later), otherwise.

Based on this comment, Repology may soon do release detection upstream similar to what euscan does, as well.

April 21, 2018
Rafael G. Martins a.k.a. rafaelmartins (homepage, bugs)
Updates (April 21, 2018, 14:35 UTC)

Since I haven't written anything here for almost 2 years, I think it is time for some short updates:

  • I left Red Hat and moved to Berlin, Germany, in March 2017.
  • The series of posts about balde was stopped. The first post got a lot of Hacker News attention, and I will come back with it as soon as I can implement the required changes in the framework. Not going to happen very soon, though.
  • I've been spending most of my free time with flight simulation. You can expect a few related posts soon.
  • I left the Gentoo GSoC administration this year.
  • blogc is the only project that is currently getting some frequent attention from me, as I use it for most of my websites. Check it out! ;-)

That's all for now.

April 18, 2018
Zack Medico a.k.a. zmedico (homepage, bugs)

In portage-2.3.30, portage’s python API provides an asyncio event loop policy via a DefaultEventLoopPolicy class. For example, here’s a little program that uses portage’s DefaultEventLoopPolicy to do the same thing as emerge --regen, using an async_iter_completed function to implement the --jobs and --load-average options:

#!/usr/bin/env python

from __future__ import print_function

import argparse
import functools
import multiprocessing
import operator

import portage
from portage.util.futures.iter_completed import (
    async_iter_completed,
)
from portage.util.futures.unix_events import (
    DefaultEventLoopPolicy,
)


def handle_result(cpv, future):
    metadata = dict(zip(portage.auxdbkeys, future.result()))
    print(cpv)
    for k, v in sorted(metadata.items(),
        key=operator.itemgetter(0)):
        if v:
            print('\t{}: {}'.format(k, v))
    print()


def future_generator(repo_location, loop=None):

    portdb = portage.portdb

    for cp in portdb.cp_all(trees=[repo_location]):
        for cpv in portdb.cp_list(cp, mytree=repo_location):
            future = portdb.async_aux_get(
                cpv,
                portage.auxdbkeys,
                mytree=repo_location,
                loop=loop,
            )

            future.add_done_callback(
                functools.partial(handle_result, cpv))

            yield future


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--repo',
        action='store',
        default='gentoo',
    )
    parser.add_argument(
        '--jobs',
        action='store',
        type=int,
        default=multiprocessing.cpu_count(),
    )
    parser.add_argument(
        '--load-average',
        action='store',
        type=float,
        default=multiprocessing.cpu_count(),
    )
    args = parser.parse_args()

    try:
        repo_location = portage.settings.repositories.\
            get_location_for_name(args.repo)
    except KeyError:
        parser.error('unknown repo: {}\navailable repos: {}'.\
            format(args.repo, ' '.join(sorted(
            repo.name for repo in
            portage.settings.repositories))))

    policy = DefaultEventLoopPolicy()
    loop = policy.get_event_loop()

    try:
        for future_done_set in async_iter_completed(
            future_generator(repo_location, loop=loop),
            max_jobs=args.jobs,
            max_load=args.load_average,
            loop=loop):
            loop.run_until_complete(future_done_set)
    finally:
        loop.close()



if __name__ == '__main__':
    main()
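
If you do not have portage at hand, the --jobs idea can be illustrated with the standard library alone. This is only a sketch of bounded concurrency using asyncio.Semaphore; it is not how async_iter_completed is implemented, it ignores the load average limit entirely, and all names in it are made up:

```python
import asyncio
import functools

running = 0  # jobs currently in flight
peak = 0     # highest number of jobs seen in flight at once


async def fake_job(n):
    """Stand-in for real work such as a metadata lookup."""
    global running, peak
    running += 1
    peak = max(peak, running)
    await asyncio.sleep(0.01)
    running -= 1
    return n * n


async def run_bounded(factories, max_jobs):
    """Run coroutine factories with at most max_jobs of them active."""
    semaphore = asyncio.Semaphore(max_jobs)

    async def guarded(factory):
        async with semaphore:
            return await factory()

    # gather() returns the results in the order the factories were given.
    return await asyncio.gather(*(guarded(f) for f in factories))


jobs = [functools.partial(fake_job, n) for n in range(10)]
results = asyncio.run(run_bounded(jobs, max_jobs=3))
print(results, "peak concurrency:", peak)
```

The semaphore plays the role of the max_jobs argument above: no more than three of the ten fake jobs are ever active at the same time.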

April 03, 2018
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
py3status v3.8 (April 03, 2018, 12:06 UTC)

Another long awaited release has come true thanks to our community!

The changelog is so huge that I had to open an issue and cry for help to make it happen… thanks again @lasers for stepping up once again 🙂

Highlights

  • gevent support (-g option) to switch from threads scheduling to greenlets and reduce resource consumption
  • environment variables support in i3status.conf to remove sensitive information from your config
  • modules can now leverage a persistent data store
  • hundreds of improvements for various modules
  • we now have an official debian package
  • we reached 500 stars on github #vanity

Milestone 3.9

  • try to release a version faster than every 4 months (j/k) 😉

The next release will focus on bugs and modules improvements / standardization.

Thanks contributors!

This release is their work, thanks a lot guys!

  • alex o’neill
  • anubiann00b
  • cypher1
  • daniel foerster
  • daniel schaefer
  • girst
  • igor grebenkov
  • james curtis
  • lasers
  • maxim baz
  • nollain
  • raspbeguy
  • regnat
  • robert ricci
  • sébastien delafond
  • themistokle benetatos
  • tobes
  • woland

April 02, 2018
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)

Thanks to the enlightenment devs for fixing this ;) no lock screen sucks :D

https://www.enlightenment.org/news/e0.22.3_release

also it is in my Gentoo dev overlay as of now.

So a while ago I cleaned out my dev overlay and added dev-libs/efl-1.20.7 and x11-wm/enlightenment-0.22.1 (and 0.22.2)

Works for me at the moment (except the screen (un-)lock) but not sure if that has to do with my box. any testers welcome

Here's the link: https://gitweb.gentoo.org/dev/lordvan.git/

Oh and I added it to layman's repo list again, so gentoo users can easily just "layman -a lordvan" to test it.

On a side note: 0.22.1 gave me trouble with a 2nd screen plugged in, which seems fixed in 0.22.2, but that has (pam related) problems with the lock screen ..

March 17, 2018
Sebastian Pipping a.k.a. sping (homepage, bugs)
Holy cow! Larry the cow Gentoo tattoo (March 17, 2018, 14:53 UTC)

Probably not new but was new to me: Just ran into this Larry the Cow tattoo online: http://www.geekytattoos.com/larry-the-gender-challenged-cow/

March 01, 2018
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)

Just a quick post about how to run UCS (Core Edition in my case) with KVM on gentoo.

First off I go with the assumption that

  • KVM is working (kernel, config,..)
  • qemu is installed (+ init scripts)
  • bridge networking is set up and working

If any of the above are not yet set up: https://wiki.gentoo.org/wiki/QEMU

First download the Virtualbox Image from https://www.univention.de/download/ .

Further for the kvm name I use ucs-dc

next we convert the image to qcow2:

qemu-img convert -f vmdk -O qcow2 UCS-DC/UCS-DC-virtualbox-disk1.vmdk  UCS-DC_disk1.qcow2

create your init script link:

cd /etc/init.d; ln -s qemu kvm.ucs-dc

Then in /etc/conf.d copy qemu.conf.example to kvm.ucs-dc

Check / change the following:

  1. change the MACADDR (it includes a command line to generate one) -- The reason this is first is, that if you forget you might spend hours - like me -  trying to find out why your network is not working ..
  2. QEMU_TYPE="x86_64"
  3. NIC_TYPE=br
  4. point DISKIMAGE=  to your qcow2 file
  5. ENABLE_KVM=1 (believe me disabling kvm is noticeable)
  6. adjust MEMORY (I set it to 2GB for the DC) and SMP (i set that to 2)
  7. FOREGROUND="vnc=:<port>" - so you can connect to your console using VNC
  8. check the other stuff if it applies to you (OTHER_ARGS is quite useful for example to also add a CD/usb emulation of a rescue disk,..)
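
The example config ships its own command for generating a MAC address. As an alternative, here is a small Python sketch that produces a random address with the 52:54:00 prefix conventionally used for QEMU/KVM guests (the function name is mine):

```python
import random


def random_qemu_mac(rng=None):
    """Return a random MAC address with the 52:54:00 prefix."""
    rng = rng or random.SystemRandom()
    tail = [rng.randrange(256) for _ in range(3)]
    return ":".join(f"{octet:02x}" for octet in [0x52, 0x54, 0x00, *tail])


print(f'MACADDR="{random_qemu_mac()}"')
```

Paste the printed line into /etc/conf.d/kvm.ucs-dc, and use a different address for every VM on the same bridge.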

run it with

/etc/init.d/kvm.ucs-dc start

connect with your favourite VNC client and set up your UCS Server.

One thing I did on the fileserver instance (I run 3 UCS kvms at the moment - DC, Backup-DC and File Server):

I created a LVM Volume for the file share on the Host, and mapped it to the KVM - here's the config line:

OTHER_ARGS="-drive format=raw,file=/dev/mapper/<your volume device>,if=virtio,aio=native,cache.direct=on"

works great for me, and I will also add another one for other shares later I think. but this way if i really have any VM problems my files are just on the lvm device and i can get to it easily (also there are lvm snapshots,.. that could be useful eventually)

Tryton setup & config (March 01, 2018, 08:10 UTC)

Because I keep forgetting stuff I need to do (or the order) here is a very quick overview:

Install trytond, modules + deps (on gentoo add the tryton overlay and just emerge)

If you don't use sqlite create a user (and database) for tryton.

Gentoo Init scripts use /etc/conf.d/trytond (here's mine):

# Location of the configuration file
CONFIG=/etc/tryton/trytond.conf
# Location of the logging configuration file
LOGCONF=/etc/tryton/logging.conf
# The database names to load (space separated)
DATABASES=tryton

since it took me a while to find a working logging.conf example here's my working one:

[formatters]
keys=simple

[handlers]
keys=rotate,console

[loggers]
keys=root

[formatter_simple]
format=[%(asctime)s] %(levelname)s:%(name)s:%(message)s
datefmt=%a %b %d %H:%M:%S %Y

[handler_rotate]
class=handlers.TimedRotatingFileHandler
args=('/var/log/trytond/trytond.log', 'D', 1, 120)
formatter=simple

[handler_console]
class=StreamHandler
formatter=simple
args=(sys.stdout,)

[logger_root]
level=INFO
handlers=rotate,console
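
A quick way to check such a file before restarting trytond is to load it with Python's own logging.config.fileConfig. The sketch below embeds the config with two changes of mine: the log file goes to a temporary directory instead of /var/log/trytond, so it can run without root, and the format line has a balanced opening bracket:

```python
import logging
import logging.config
import os
import tempfile

CONF = """\
[formatters]
keys=simple

[handlers]
keys=rotate,console

[loggers]
keys=root

[formatter_simple]
format=[%(asctime)s] %(levelname)s:%(name)s:%(message)s
datefmt=%a %b %d %H:%M:%S %Y

[handler_rotate]
class=handlers.TimedRotatingFileHandler
args=({logfile!r}, 'D', 1, 120)
formatter=simple

[handler_console]
class=StreamHandler
formatter=simple
args=(sys.stdout,)

[logger_root]
level=INFO
handlers=rotate,console
"""

workdir = tempfile.mkdtemp()  # scratch directory standing in for /var/log/trytond
logfile = os.path.join(workdir, "trytond.log")
conf_path = os.path.join(workdir, "logging.conf")
with open(conf_path, "w") as fh:
    fh.write(CONF.format(logfile=logfile))

# Raises if a section, handler class or args tuple is malformed.
logging.config.fileConfig(conf_path, disable_existing_loggers=False)
logging.getLogger("trytond.test").info("logging.conf parsed fine")

with open(logfile) as fh:
    logged = fh.read()
handler_names = sorted(type(h).__name__ for h in logging.getLogger().handlers)
```

If the file loads and the test message shows up both on the console and in the log file, the rotation arguments ('D', 1, 120: rotate daily, keep 120 files) were accepted.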

(Not going into details here, if you want to know more there are plenty of resources online)

As for config I went and got an example online (from openSUSE) and modified it:

# /etc/tryton/trytond.conf - Configuration file for Tryton Server (trytond)
#
# This file contains the most common settings for trytond (Defaults
# are commented).
# For more information read
# /usr/share/doc/trytond-<version>/

[database]
# Database related settings

# The URI to connect to the SQL database (following RFC-3986)
# uri = database://username:password@host:port/
# (Internal default: sqlite:// (i.e. a local SQLite database))
#
# PostgreSQL via Unix domain sockets
# (e.g. PostgreSQL database running on the same machine (localhost))
#uri = postgresql://tryton:tryton@/
#
#Default setting for a local postgres database
#uri = postgresql:///

#
# PostgreSQL via TCP/IP
# (e.g. connecting to a PostgreSQL database running on a remote machine or
# by means of md5 authentication. Needs PostgreSQL to be configured to accept
# those connections (pg_hba.conf).)
#uri = postgresql://tryton:tryton@localhost:5432/
uri = postgresql://tryton:mypassword@localhost:5432/

# The path to the directory where the Tryton Server stores files.
# The server must have write permissions to this directory.
# (Internal default: /var/lib/trytond)
path = /var/lib/tryton

# Shall available databases be listed in the client?
#list = True

# The number of retries of the Tryton Server when there are errors
# in a request to the database
#retry = 5

# The primary language, that is used to store entries in translatable
# fields into the database.
#language = en_US
language = de_AT

[ssl]
# SSL settings
# Activation of SSL for all available protocols.
# Uncomment the following settings for key and certificate
# to enable SSL.

# The path to the private key
#privatekey = /etc/ssl/private/ssl-cert-snakeoil.key

# The path to the certificate
#certificate = /etc/ssl/certs/ssl-cert-snakeoil.pem

[jsonrpc]
# Settings for the JSON-RPC network interface

# The IP/host and port number of the interface
# (Internal default: localhost:8000)
#
# Listen on all interfaces (IPv4)

listen = 0.0.0.0:8000

#
# Listen on all interfaces (IPv4 and IPv6)
#listen = [::]:8000

# The hostname for this interface
#hostname =

# The root path to retrieve data for GET requests
#data = jsondata

[xmlrpc]
# Settings for the XML-RPC network interface

# The IP/host and port number of the interface
#listen = localhost:8069

[webdav]
# Settings for the WebDAV network interface

# The IP/host and port number of the interface
#listen = localhost:8080
listen = 0.0.0.0:8080

[session]
# Session settings

# The time (in seconds) until an inactive session expires
timeout = 3600

# The server administration password used by the client for
# the execution of database management tasks. It is encrypted
# using the Unix crypt(3) routine. A password can be
# generated using the following command line (on one line):
# $ python -c 'import getpass,crypt,random,string; \
# print crypt.crypt(getpass.getpass(), \
# "".join(random.sample(string.ascii_letters + string.digits, 8)))'
# Example password with 'admin'
#super_pwd = jkUbZGvFNeugk
super_pwd = <your pwd>


[email]
# Mail settings

# The URI to connect to the SMTP server.
# Available protocols are:
# - smtp: simple SMTP
# - smtp+tls: SMTP with STARTTLS
# - smtps: SMTP with SSL
#uri = smtp://localhost:25
uri = smtp://localhost:25

# The From address used by the Tryton Server to send emails.
from = tryton@<your-domain.tld>

[report]
# Report settings

# Unoconv parameters for connection to the unoconv service.
#unoconv = pipe,name=trytond;urp;StarOffice.ComponentContext

# Module settings
#
# Some modules are reading configuration parameters from this
# configuration file. These settings only apply when those modules
# are installed.
#
#[ldap_authentication]
# The URI to connect to the LDAP server.
#uri = ldap://host:port/dn?attributes?scope?filter?extensions
# A basic default URL could look like
#uri = ldap://localhost:389/

[web]
# Path for the web-frontend
#root = /usr/lib/node-modules/tryton-sao
listen = 0.0.0.0:8000
root = /usr/share/sao
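
Since the file is INI-style, a quick sanity check of the values that matter can be sketched with Python's configparser. This assumes the standard library parser reads the file the same way trytond does, which I have not verified; the excerpt and the helper name are only for illustration:

```python
import configparser
from urllib.parse import urlsplit

# Excerpt of the settings that are actually uncommented above.
SAMPLE = """\
[database]
uri = postgresql://tryton:mypassword@localhost:5432/
path = /var/lib/tryton
language = de_AT

[web]
listen = 0.0.0.0:8000
root = /usr/share/sao
"""


def summarize(conf_text):
    """Extract the database and web settings from trytond.conf text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    uri = urlsplit(parser.get("database", "uri"))
    return {
        "db_scheme": uri.scheme,
        "db_host": uri.hostname,
        "db_port": uri.port,
        "db_user": uri.username,
        "web_listen": parser.get("web", "listen", fallback=None),
    }


print(summarize(SAMPLE))
```

To check the real file, read /etc/tryton/trytond.conf and pass its content to summarize().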

Set up the database tables, modules, superuser

trytond-admin -c /etc/tryton/trytond.conf -d tryton --all

Should you forget to set your superuser password (or need to change it later):

trytond-admin -c /etc/tryton/trytond.conf -d tryton -p

It's now time to connect a client to it and enable & configure the modules. Make sure to finish the basic configuration (including accounts,..), otherwise you have to either restart or know exactly what needs to be set up accounting-wise!

  • configure user(s)
  • enable account_eu (config and setup take a while)
  • set up company
    • create a party for it
    • assign currency and timezone
  • set up chart of accounts from the template (only do this manually if you really know what you - and tryton - need !!)
    • choose company & pick the template (e.g. "Minimaler Kontenplan" (if using German) )
      • set the defaults (only one per option usually)
  • after applying the above activate and configure whatever else you need (sale, timesheet, ...)

During this you can watch trytond.log to see what happens behind the scenes (e.g. the country module takes a while, ...).

How to add languages:

  • Administration -> Localization -> Languages -> add the language and set it to
    active and translatable 

If you install new modules or languages run trytond-admin ... --all again (see above) 

February 28, 2018
Alexys Jacob a.k.a. ultrabug (homepage, bugs)
Evaluating ScyllaDB for production 2/2 (February 28, 2018, 10:32 UTC)

In my previous blog post, I shared 7 lessons on our experience in evaluating Scylla for production.

Those lessons were focused on the setup and execution of the POC and I promised a more technical blog post with technical details and lessons learned from the POC, here it is!

Before you read on, be mindful that our POC was set up to test workloads and workflows, not to benchmark technologies. So even if the Scylla figures are great, they have not been the main drivers of the actual conclusion of the POC.

Business context

As a data driven company working in the Marketing and Advertising industry, we help our clients make sense of multiple sources of data to build and improve their relationship with their customers and prospects.

Dealing with multiple sources of data is nothing new but their volume has dramatically changed during the past decade. I will spare you the Big-Data-means-nothing term and the technical challenges that come with it, as you have already heard enough of it.

Still, it is clear that our line of business is tied to our capacity at mixing and correlating a massive amount of different types of events (data sources/types) coming from various sources which all have their own identifiers (think primary keys):

  • Web navigation tracking: identifier is a cookie that’s tied to the tracking domain (we have our own)
  • CRM databases: usually the email address or an internal account ID serve as an identifier
  • Partners’ digital platform: identifier is also a cookie tied to their tracking domain

To try to make things simple, let’s take a concrete example:

You work for UNICEF and want to optimize their banner ads budget by targeting the donors of their last fundraising campaign.

  • Your reference user database is composed of the donors who registered with their email address on the last campaign: main identifier is the email address.
  • To buy web display ads, you use an Ad Exchange partner such as AppNexus or DoubleClick (Google). From their point of view, users are seen as cookie IDs which are tied to their own domain.

So you basically need to be able to translate an email address to a cookie ID for every partner you work with.

Use case: ID matching tables

We operate and maintain huge ID matching tables for every partner and a great deal of our time is spent translating those IDs from one to another. In SQL terms, we are basically doing JOINs between a dataset and those ID matching tables.

  • You select your reference population
  • You JOIN it with the corresponding ID matching table
  • You get a matched population that your partner can recognize and interact with

Those ID matching tables have a pretty high read AND write throughput because they’re updated and queried all the time.

Usual figures are JOINs between a 10+ Million dataset and 1.5+ Billion ID matching tables.

The reference query basically looks like this:

SELECT count(m.partnerid)
FROM population_10M_rows AS p JOIN partner_id_match_400M_rows AS m
ON p.id = m.id
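
To make the JOIN concrete, here is a minimal Python sketch of what the reference query computes. It uses a plain dict as a stand-in for the ID matching table and assumes a single partner ID per ID; all identifiers are made up for illustration.

```python
# Toy stand-in for an ID matching table: our id -> partner id.
# All identifiers below are made up for illustration.
partner_id_match = {
    "alice@example.org": "cookie-123",
    "bob@example.org": "cookie-456",
    "carol@example.org": "cookie-789",
}

# Reference population (e.g. the donors of the last campaign).
population = ["alice@example.org", "carol@example.org", "dave@example.org"]


def count_matches(population, match_table):
    """Rough equivalent of SELECT count(m.partnerid) ... JOIN ... ON p.id = m.id."""
    return sum(1 for pid in population if pid in match_table)


def matched_population(population, match_table):
    """Translate every known id into the partner's id, skipping unknown ones."""
    return [match_table[pid] for pid in population if pid in match_table]


print(count_matches(population, partner_id_match))
print(matched_population(population, partner_id_match))
```

At production scale the dict obviously does not fit on one machine, which is why the lookups have to be served by a datastore.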

Current implementations

We operate a lambda architecture where we handle real time ID matching using MongoDB and batch ones using Hive (Apache Hadoop).

The first downside to note is that it requires us to maintain two copies of every ID matching table. We also couldn’t choose one over the other because neither MongoDB nor Hive can sustain both the read/write lookup/update ratio while performing within the low latencies that we need.

This is an operational burden and requires quite a bunch of engineering to ensure data consistency between different data stores.

Production hardware overview:

  • MongoDB is running on a 15 nodes (5 shards) cluster
    • 64GB RAM, 2 sockets, RAID10 SAS spinning disks, 10Gbps dual NIC
  • Hive is running on 50+ YARN NodeManager instances
    • 128GB RAM, 2 sockets, JBOD SAS spinning disks, 10Gbps dual NIC

Target implementation

The key question is simple: is there a technology out there that can sustain our ID matching tables workloads while maintaining consistently low upsert/write and lookup/read latencies?

Having one technology to handle both use cases would allow:

  • Simpler data consistency
  • Operational simplicity and efficiency
  • Reduced costs

POC hardware overview:

So we decided to find out if Scylla could be that technology. For this, we used three decommissioned machines that we had in the basement of our Paris office.

  • 2 DELL R510
    • 19GB RAM, 2 socket 8 cores, RAID0 SAS spinning disks, 1Gbps NIC
  • 1 DELL R710
    • 19GB RAM, 2 socket 4 cores, RAID0 SAS spinning disks, 1Gbps NIC

I know, these are not glamorous machines and they are even inconsistent in specs, but we still set up a 3 node Scylla cluster running Gentoo Linux with them.

Our take? If those three lousy machines can challenge or beat the production machines on our current workloads, then Scylla can seriously be considered for production.

Step 1: Validate a schema model

Once the POC document was complete and the ScyllaDB team understood what we were trying to do, we started iterating on the schema model using a query based modeling strategy.

So we wrote down and rated the questions that our model(s) should answer. They included things like:

  • What are all our cookie IDs associated to the given partner ID ?
  • What are all the cookie IDs associated to the given partner ID over the last N months ?
  • What is the last cookie ID/date for the given partner ID ?
  • What is the last date we have seen the given cookie ID / partner ID couple ?

As you can imagine, the reverse questions are also to be answered so ID translations can be done both ways (ouch!).

Prototyping

It is no news that I’m a Python addict so I did all my prototyping using Python and the cassandra-driver.

I ended up using a test-driven data modelling strategy using pytest. I wrote tests on my dataset so I could concentrate on the model while making sure that all my questions were being answered correctly and consistently.
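
As an illustration of the idea (not the actual test suite), a test-driven model can look like the sketch below. The in-memory ROWS list and the helper functions are hypothetical stand-ins; in the real tests the data would be written to and read from the cluster through the cassandra-driver.

```python
from datetime import datetime

# In-memory stand-in for the real table: rows are (partnerid, id, date).
# In real tests these rows would be inserted through a Scylla session.
ROWS = [
    ("p1", "cookie-a", datetime(2018, 1, 5)),
    ("p1", "cookie-b", datetime(2018, 2, 1)),
    ("p2", "cookie-c", datetime(2017, 12, 24)),
]


def ids_for_partner(rows, partnerid):
    """Q: what are all our cookie IDs associated to the given partner ID?"""
    return sorted(cid for pid, cid, _ in rows if pid == partnerid)


def last_id_for_partner(rows, partnerid):
    """Q: what is the last cookie ID/date for the given partner ID?"""
    candidates = [(date, cid) for pid, cid, date in rows if pid == partnerid]
    if not candidates:
        return None
    date, cid = max(candidates)
    return cid, date


# pytest collects functions named test_*; each test encodes one question.
def test_ids_for_partner():
    assert ids_for_partner(ROWS, "p1") == ["cookie-a", "cookie-b"]


def test_last_id_for_partner():
    assert last_id_for_partner(ROWS, "p1") == ("cookie-b", datetime(2018, 2, 1))
    assert last_id_for_partner(ROWS, "unknown") is None
```

Each question from the list above becomes one test, so a schema change that breaks an answer is caught immediately.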

Schema

In our case, we ended up with three denormalized tables to answer all the questions we had. To answer the first three questions above, you could use the schema below:

CREATE TABLE IF NOT EXISTS ids_by_partnerid(
 partnerid text,
 id text,
 date timestamp,
 PRIMARY KEY ((partnerid), date, id)
 )
 WITH CLUSTERING ORDER BY (date DESC)

Note on clustering key ordering

One important learning I got in the process of validating the model is about the internals of Cassandra’s file format that resulted in the choice of using a descending order DESC on the date clustering key as you can see above.

If your main use case of querying is to look for the latest value of a history-like table design like ours, then make sure to change the default ASC order of your clustering key to DESC. This will ensure that the latest values (rows) are stored at the beginning of the sstable file, effectively reducing the read latency when the row is not in cache!

Let me quote Glauber Costa’s detailed explanation on this:

Basically in Cassandra’s file format, the index points to an entire partition (for very large partitions there is a hack to avoid that, but the logic is mostly the same). So if you want to read the first row, that’s easy you get the index to the partition and read the first row. If you want to read the last row, then you get the index to the partition and do a linear scan to the next.

This is the kind of learning you can only get from experts like Glauber and that can justify the whole POC on its own!
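
A toy model of that explanation, under the simplifying assumption that a partition is just a sorted list read from its beginning (the real sstable format is of course more involved):

```python
from datetime import date


def rows_scanned_for_latest(partition, descending):
    """Count the rows read before reaching the latest row of a partition.

    The partition is stored sorted by its clustering key (the date) and is
    read from its beginning, as the index only points to the partition start.
    """
    stored = sorted(partition, reverse=descending)
    latest = max(partition)
    return stored.index(latest) + 1


# One partition holding a year of monthly observations.
history = [date(2017, month, 1) for month in range(1, 13)]

print(rows_scanned_for_latest(history, descending=False))  # ASC order: 12
print(rows_scanned_for_latest(history, descending=True))   # DESC order: 1
```

With ASC ordering the cost of reaching the latest row grows with the length of the history; with DESC it stays constant.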

Step 2: Set up scylla-grafana-monitoring

As I said before, make sure to set up and run the scylla-grafana-monitoring project before running your test workloads. This easy to run solution will be of great help to understand the performance of your cluster and to tune your workload for optimal performances.

If you can, also discuss with the ScyllaDB team to allow them to access the Grafana dashboard. This will be very valuable since they know where to look better than we usually do… I gained a lot of understanding thanks to this!

Note on scrape interval

I advise you to lower the Prometheus scrape interval to have a shorter and finer sampling of your metrics. This will allow your dashboard to be more reactive when you start your test workloads.

For this, change the prometheus/prometheus.yml file like this:

scrape_interval: 2s # Scrape targets every 2 seconds (5s default)
scrape_timeout: 1s # Timeout before trying to scrape a target again (4s default)

Test your monitoring

Before going any further, I strongly advise you to run a stress test on your POC cluster using the cassandra-stress tool and share the results and their monitoring graphs with the ScyllaDB team.

This will give you a common understanding of the general performances of your cluster as well as help in outlining any obvious misconfiguration or hardware problem.

Key graphs to look at

There are a lot of interesting graphs so I’d like to share the ones that I have been mainly looking at. Remember that depending on your test workloads, some other graphs may be more relevant for you.

  • number of open connections

You’ll want to see a steady and high enough number of open connections which will prove that your clients are pushed at their maximum (at the time of testing this graph was not on Grafana and you had to add it yourself)

  • cache hits / misses

Depending on your reference dataset, you’ll obviously see that cache hits and misses will have a direct correlation with disk I/O and overall performances. Running your test workloads multiple times should trigger higher cache hits if your RAM is big enough.

  • per shard/node distribution

The Requests Served per shard graph should display a nicely distributed load between your shards and nodes so that you’re sure that you’re getting the best out of your cluster.

The same is true for almost every other “per shard/node” graph: you’re looking for evenly distributed load.

  • sstable reads

Directly linked with your disk performances, you’ll be trying to make sure that you have almost no queued sstable reads.

Step 3: Get your reference data and metrics

We obviously need to have some reference metrics on our current production stack so we can compare them with the results on our POC Scylla cluster.

Whether you choose to use your current production machines or set up a similar stack on the side to run your test workloads is up to you. We chose to run the vast majority of our tests on our current production machines to be as close to our real workloads as possible.

Prepare a reference dataset

During your work on the POC document, you should have detailed the usual data cardinality and volume you work with. Use this information to set up a reference dataset that you can use on all of the platforms that you plan to compare.

In our case, we chose a 10 Million reference dataset that we JOINed with a 400+ Million extract of an ID matching table. Those volumes seemed easy enough to work with and allowed some nice ratio for memory bound workloads.

Measure on your current stack

Then it’s time to load this reference dataset on your current platforms.

  • If you run a MongoDB cluster like we do, make sure to shard and index the dataset just like you do on the production collections.
  • On Hive, make sure to respect the storage file format of your current implementations as well as their partitioning.

If you chose to run your test workloads on your production machines, make sure to run them multiple times and at different hours of the day and night so you can correlate the measures with the load on the cluster at the time of the tests.

Reference metrics

For the sake of simplicity I’ll focus on the Hive-only batch workloads. I performed a count on the JOIN of the dataset and the ID matching table using Spark 2 and then I also ran the JOIN using a simple Hive query through Beeline.

I gave the following definitions on the reference load:

  • IDLE: YARN available containers and free resources are optimal, parallelism is very limited
  • NORMAL: YARN sustains some casual load, parallelism exists but we are not bound by anything still
  • HIGH: YARN has pending containers, parallelism is high and applications have to wait for containers before executing

There’s always an error margin on the results you get and I found that there were no significant differences between the results using Spark 2 and Beeline, so I stuck with a simple set of results:

  • IDLE: 2 minutes, 15 seconds
  • NORMAL: 4 minutes
  • HIGH: 15 minutes

Step 4: Get Scylla in the mix

It’s finally time to do your best to break Scylla or at least to push it to its limits on your hardware… But most importantly, you’ll be looking to understand what those limits are depending on your test workloads, as well as outlining all the tuning that you will be required to do on the client side to reach those limits.

Speaking about the results, we will have to differentiate two cases:

  1. The Scylla cluster is fresh and its cache is empty (cold start): performance is mostly Disk I/O bound
  2. The Scylla cluster has been running some test workload already and its cache is hot: performance is mostly Memory bound with some Disk I/O depending on the size of your RAM

Spark 2 / Scala test workload

Here I used Scala (yes, I did) and DataStax’s spark-cassandra-connector so I could use the magic joinWithCassandraTable function.

  • spark-cassandra-connector-2.0.1-s_2.11.jar
  • Java 7

I had to stick with the 2.0.1 version of the spark-cassandra-connector because newer versions (2.0.5 at the time of testing) were performing badly for no apparent reason. The ScyllaDB team couldn’t help on this.

You can interact with your test workload using the spark2-shell:

spark2-shell --jars jars/commons-beanutils_commons-beanutils-1.9.3.jar,jars/com.twitter_jsr166e-1.1.0.jar,jars/io.netty_netty-all-4.0.33.Final.jar,jars/org.joda_joda-convert-1.2.jar,jars/commons-collections_commons-collections-3.2.2.jar,jars/joda-time_joda-time-2.3.jar,jars/org.scala-lang_scala-reflect-2.11.8.jar,jars/spark-cassandra-connector-2.0.1-s_2.11.jar

Then use the following Scala imports:

// main connector import
import com.datastax.spark.connector._

// the joinWithCassandraTable failed without this (dunno why, I'm no Scala guy)
import com.datastax.spark.connector.writer._
implicit val rowWriter = SqlRowWriter.Factory

Finally I could run my test workload to select the data from Hive and JOIN it with Scylla easily:

val df_population = spark.sql("SELECT id FROM population_10M_rows")
val join_rdd = df_population.rdd.repartitionByCassandraReplica("test_keyspace", "partner_id_match_400M_rows").joinWithCassandraTable("test_keyspace", "partner_id_match_400M_rows")
val joined_count = join_rdd.count()

Notes on tuning spark-cassandra-connector

I experienced pretty crappy performances at first. Thanks to the easy Grafana monitoring, I could see that Scylla was not being the bottleneck at all and that I instead had trouble getting some real load on it.

So I engaged in a thorough tuning of the spark-cassandra-connector with the help of Glauber… and it was pretty painful but we finally made it and got the best parameters to get the load on the Scylla cluster close to 100% when running the test workloads.

This tuning was done in the spark-defaults.conf file:

  • have a fixed set of executors and boost their overhead memory

This will increase test results reliability by making sure you always have a reserved number of available workers at your disposal.

spark.dynamicAllocation.enabled=false
spark.executor.instances=30
spark.yarn.executor.memoryOverhead=1024

  • set the split size to 1MB

Default is 8MB but Scylla uses a split size of 1MB so you’ll see a great boost of performance and stability by setting this to the right number.

spark.cassandra.input.split.size_in_mb=1

  • align driver timeouts with server timeouts

It is advised to make sure that your read request timeouts are the same on the driver and the server so you do not get stalled states waiting for a timeout to happen on one side. You can do the same with write timeouts if your test workloads are write intensive.

/etc/scylla/scylla.yaml

read_request_timeout_in_ms: 150000

spark-defaults.conf

spark.cassandra.connection.timeout_ms=150000
spark.cassandra.read.timeout_ms=150000

# optional if you want to fail / retry faster for HA scenarios
spark.cassandra.connection.reconnection_delay_ms.max=5000
spark.cassandra.connection.reconnection_delay_ms.min=1000
spark.cassandra.query.retry.count=100

  • adjust your reads per second rate

Last but surely not least, you will need to experiment to find the best value for this setting yourself, since it has a direct impact on the load on your Scylla cluster. You will be looking at pushing your POC cluster to almost 100% load.

spark.cassandra.input.reads_per_sec=6666
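
Since mismatched timeouts are easy to introduce when tuning by hand, a small sanity check can help. The sketch below is only an illustration: the parser is a naive line splitter (not a real YAML or properties parser) and the helper names are hypothetical.

```python
def parse_settings(text, separator):
    """Parse 'key<separator>value' lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(separator)
        settings[key.strip()] = value.strip()
    return settings


def timeouts_aligned(scylla_yaml, spark_conf):
    """True when the server and connector read timeouts are identical."""
    server = parse_settings(scylla_yaml, ":")
    client = parse_settings(spark_conf, "=")
    return (
        int(server["read_request_timeout_in_ms"])
        == int(client["spark.cassandra.read.timeout_ms"])
    )


# Same values as in the configuration snippets above.
scylla_yaml = "read_request_timeout_in_ms: 150000\n"
spark_conf = (
    "spark.cassandra.connection.timeout_ms=150000\n"
    "spark.cassandra.read.timeout_ms=150000\n"
)
print(timeouts_aligned(scylla_yaml, spark_conf))
```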

As I said before, I could only get this to work perfectly using the 2.0.1 version of the spark-cassandra-connector driver. But then it worked very well and with great speed.

Spark 2 results

Once tuned, the best results I was able to reach on this hardware are listed below. It’s interesting to see that with spinning disks, the cold start result can compete with the results of a heavily loaded Hadoop cluster where pending containers and parallelism are knocking down its performances.

  • hot cache: 2min
  • cold cache: 12min

Wow! Those three refurbished machines can compete with our current production machines and implementations, they can even match an idle Hive cluster of a medium size!

Python test workload

I couldn’t conclude on a Scala/Spark 2 only test workload. So I obviously went back to my language of choice, Python, only to discover to my disappointment that there is no joinWithCassandraTable equivalent available in pyspark.

I tried with some projects claiming otherwise with no success until I changed my mind and decided that I probably didn’t need Spark 2 at all. So I went into the crazy quest of beating Spark 2 performances using a pure Python implementation.

This basically means that instead of having a JOIN like helper, I had to do a massive amount of single “id -> partnerid” lookups. Simple but greatly inefficient you say? Really?

When I broke down the pieces, I was left with the following steps to implement and optimize:

  • Load the 10M rows worth of population data from Hive
  • For every row, lookup the corresponding partnerid in the ID matching table from Scylla
  • Count the resulting number of matches

The main problem to compete with Spark 2 is that it is a distributed framework and Python by itself is not. So you can’t possibly imagine outperforming Spark 2 with your single machine.

However, let’s remember that Spark 2 is shipped and ran on executors using YARN so we are firing up JVMs and dispatching containers all the time. This is a quite expensive process that we have a chance to avoid using Python!

So what I needed was a distributed computation framework that would allow to load data in a partitioned way and run the lookups on all the partitions in parallel before merging the results. In Python, this framework exists and is named Dask!

You will obviously need to deploy a dask topology (that’s easy and well documented) to have a number of dask workers comparable to the number of Spark 2 executors (30 in my case).

The corresponding Python code samples are here.

Hive + Scylla results

Reading the population id’s from Hive, the workload can be split and executed concurrently on multiple dask workers.

  • read the 10M population rows from Hive in a partitioned manner
  • for each partition (slice of 10M), query Scylla to lookup the possibly matching partnerid
  • create a dataframe from the resulting matches
  • gather back all the dataframes and merge them
  • count the number of matches
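
The steps above can be sketched as follows. This is not the Dask code used for the POC: a thread pool from the standard library stands in for the dask workers, a list stands in for Hive and a dict for Scylla, all of which are assumptions made to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins (assumptions): a list instead of Hive, a dict instead of Scylla.
population = ["id-%d" % i for i in range(1000)]
id_matching_table = {"id-%d" % i: "partner-%d" % i for i in range(0, 1000, 4)}


def split(rows, n_partitions):
    """Slice the population into n roughly equal partitions."""
    size = -(-len(rows) // n_partitions)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def lookup_partition(partition):
    """Look up every id of one partition, keeping only the matches."""
    return [
        (pid, id_matching_table[pid])
        for pid in partition
        if pid in id_matching_table
    ]


def count_matches(rows, n_workers=30):
    """Run the lookups in parallel, then merge and count the results."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(lookup_partition, split(rows, n_workers)))
    merged = [match for part in partial_results for match in part]
    return len(merged)


print(count_matches(population))
```

The shape is the same as with a real cluster: partition, look up in parallel, merge, count.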

The results showed that it is possible to compete with Spark 2 with Dask:

  • hot cache: 2min (rounded up)
  • cold cache: 6min

Interestingly, those almost two minutes can be broken down like this:

  • distributed read data from Hive: 50s
  • distributed lookup from Scylla: 60s
  • merge + count: 10s

This meant that if I could cut down the reading of data from Hive I could go even faster!

Parquet + Scylla results

Going further on my previous remark I decided to get rid of Hive and put the 10M rows population data in a parquet file instead. I ended up trying to find out the most efficient way to read and load a parquet file from HDFS.

My conclusion so far is that you can’t beat the amazing libhdfs3 + pyarrow combo. It is faster to load everything on a single machine than loading from Hive on multiple ones!

The results showed that I could almost get rid of a whole minute in the total process, effectively and easily beating Spark 2!

  • hot cache: 1min 5s
  • cold cache: 5min
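
To put the hot cache figures side by side, here is a small recap computed from the durations reported in this post (the Dask Hive + Scylla figure is the rounded 2 minutes, so its ratio is slightly pessimistic):

```python
# Durations reported in this post, in seconds.
hive_reference = {"IDLE": 2 * 60 + 15, "NORMAL": 4 * 60, "HIGH": 15 * 60}
scylla_hot_cache = {
    "Spark 2 + Scylla": 2 * 60,
    "Dask, Hive + Scylla": 2 * 60,
    "Dask, Parquet + Scylla": 60 + 5,
}

# Speedup of each hot cache run compared to an idle Hive cluster.
for name, seconds in scylla_hot_cache.items():
    speedup = hive_reference["IDLE"] / seconds
    print("%-24s %4ds  x%.2f vs idle Hive" % (name, seconds, speedup))
```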

Notes on the Python cassandra-driver

Tests using Python showed robust queries experiencing far fewer failures than the spark-cassandra-connector, even more so during the cold start scenario.

  • The usage of execute_concurrent() provides a clean and linear interface to submit a large number of queries while providing some level of concurrency control
  • Increasing the concurrency parameter from 100 to 512 provided additional throughput, but increasing it more looked useless
  • Protocol version 4 forbids hand tuning the number of connections / requests in favor of some sort of auto configuration. All attempts to hand tune it (by lowering the protocol version to 2) failed to achieve higher throughput
  • Installation of libev on the system allows the cassandra-driver to use it to handle concurrency instead of asyncore with a somewhat lower load footprint on the worker node but no noticeable change on the throughput
  • When reading a parquet file stored on HDFS, the hdfs3 + pyarrow combo provides an insane speed (less than 10s to fully load 10M rows of a single column)
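
To illustrate the idea behind the concurrency parameter without needing a cluster, here is a standard-library sketch that keeps a bounded number of lookups in flight and reports each result as a (success, result or exception) pair in submission order. It is a stand-in for the concept only, not the driver's implementation; run_concurrent and fake_lookup are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrent(query, parameters, concurrency=100):
    """Run query(param) for every param with a bounded number in flight.

    Returns (success, result_or_exception) tuples in submission order.
    """
    def safe_query(param):
        try:
            return True, query(param)
        except Exception as exc:  # report the failure instead of aborting
            return False, exc

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(safe_query, parameters))


# Hypothetical lookup standing in for a prepared SELECT against Scylla.
table = {"id-1": "partner-1", "id-2": "partner-2"}


def fake_lookup(key):
    return table[key]  # raises KeyError for unknown ids


results = run_concurrent(fake_lookup, ["id-1", "id-404", "id-2"], concurrency=2)
matches = [value for ok, value in results if ok]
print(matches)
```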

Step 5: Play with High Availability

I was quite disappointed and surprised by the lack of maturity of the Cassandra community on this critical topic. Maybe the main reason is that the cassandra-driver allows for too many levels of configuration and strategies.

I wrote this simple bash script to allow me to simulate node failures. Then I could play with handling those failures and retries on the Python client code.

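#!/bin/bash
# Simulate a node failure by dropping the node's traffic until Ctrl-C.
# WARNING: this flushes ALL existing rules of the filter table.

# flush the rules first, then delete the (now empty) user-defined chains
iptables -t filter -F
iptables -t filter -X

ip="0.0.0.0/0"
for port in 9042 9160 9180 10000 7000; do
	iptables -t filter -A INPUT -p tcp --dport "${port}" -s "${ip}" -j DROP
	iptables -t filter -A OUTPUT -p tcp --sport "${port}" -d "${ip}" -j DROP
done

# Ctrl-C ends the display loop so that the rules get cleaned up below
running=1
trap 'running=0' INT
while [ "${running}" -eq 1 ]; do
	clear
	iptables -t filter -vnL
	sleep 1
done

iptables -t filter -F
iptables -t filter -X
iptables -t filter -vnL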
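
On the client side, handling such failures boils down to trying another node and backing off between rounds. The sketch below only illustrates that idea with hypothetical names (NodeDown, query_with_retry, fake_query); it is not the code used during the POC, and the cassandra-driver ships its own retry and reconnection policies that should be preferred in real code.

```python
import time


class NodeDown(Exception):
    """Hypothetical error raised when a node does not answer."""


def query_with_retry(query, nodes, retries=3, delay=0.1, sleep=time.sleep):
    """Try each node in turn, backing off between full rounds of failures."""
    last_error = None
    for attempt in range(retries):
        for node in nodes:
            try:
                return query(node)
            except NodeDown as exc:
                last_error = exc  # this node is unreachable, try the next one
        sleep(delay * (2 ** attempt))  # exponential backoff before next round
    if last_error is None:
        raise ValueError("no nodes to query")
    raise last_error


# Simulated cluster: node "a" is blocked by the firewall script, "b" answers.
def fake_query(node):
    if node == "a":
        raise NodeDown(node)
    return "answer from %s" % node


print(query_with_retry(fake_query, ["a", "b"]))
```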

This topic is worth covering in more detail in a dedicated blog post that I shall write later on, with code samples.

Concluding the evaluation

I’m happy to say that Scylla passed our production evaluation and will soon go live on our infrastructure!

As I said at the beginning of this post, the conclusion of the evaluation has not been driven by the good figures we got out of our test workloads. Those are no benchmarks and never pretended to be but we could still prove that performances were solid enough to not be a blocker in the adoption of Scylla.

Instead we decided on the following points of interest (in no particular order):

  • data consistency
  • production reliability
  • datacenter awareness
  • ease of operation
  • infrastructure rationalisation
  • developer friendliness
  • costs

On the side, I tried Scylla on two other different use cases which proved interesting to follow later on to displace MongoDB again…

Moving to production

Since our relationship was great we also decided to partner with ScyllaDB and support them by subscribing to their enterprise offerings. They also agreed to support us running Gentoo Linux!

We are starting with a three nodes heavy duty cluster:

  • DELL R640
    • dual socket 2,6GHz 14C, 512GB RAM, Samsung 17xxx NVMe 3,2 TB

I’m eager to see ScyllaDB building up and will continue to help with my modest contributions. Thanks again to the ScyllaDB team for their patience and support during the POC!

February 27, 2018
Thomas Raschbacher a.k.a. lordvan (homepage, bugs)

So .. since this page/blog is running on Mezzanine I thought I'd share what I had to do to get it to work.

First off I did this on Gentoo, but in general stuff should apply to most other distributions anyway.

Versions I used:

  • Apache 2.4.27
  • python 3.6.3
  • mod_wsgi 4.5.13
  • postgresql 9.4

Don't forget to enable apache loading mod_wsgi (on Gentoo add "-D WSGI " to APACHE2_OPTS in /etc/conf.d/apache).

I am running Mezzanine on its own virtualhost.

For the rest of this I go with the assumption that the above is installed and configured correctly.

Quick & dirty Mezzanine install:

python3.6 -m venv myenv
source myenv/bin/activate
pip install mezzanine south psycopg2

Of course replace psycopg2 with whatever Database driver you intend to use.

Here is what I installed (pip freeze output):

beautifulsoup4==4.6.0
bleach==2.1.2
certifi==2018.1.18
chardet==3.0.4
Django==1.10.8
django-contrib-comments==1.8.0
filebrowser-safe==0.4.7
future==0.16.0
grappelli-safe==0.4.7
html5lib==1.0.1
idna==2.6
Mezzanine==4.2.3
oauthlib==2.0.6
Pillow==5.0.0
psycopg2==2.7.4
pytz==2018.3
requests==2.18.4
requests-oauthlib==0.8.0
six==1.11.0
South==1.0.2
tzlocal==1.5.1
urllib3==1.22
webencodings==0.5.1

Then create your project:

mezzanine-project <projectname>

Then edit your <project>/local_settings.py to use the correct database settings - and don't forget to set ALLOWED_HOSTS (like I did at first).
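
For reference, a minimal local_settings.py could look like the sketch below. Every value is a placeholder (database name, user, password and host name are assumptions to replace with your own); only the setting names are Django's.

```python
# local_settings.py (sketch); every value below is a placeholder.
DEBUG = False

# Hosts this site may be served as; with DEBUG off, Django rejects
# requests whose Host header is not listed here.
ALLOWED_HOSTS = ["your.domain.name"]

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mezzanine_db",
        "USER": "mezzanine_user",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "5432",
    }
}
```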

chmod +x <project>/manage.py
./<project>/manage.py createdb

This should create the DB, superuser, .. then I ran

./<project>/manage.py collectstatic

you can test it with

./<project>/manage.py runserver 

If you need to change the ip/port:

./<project>/manage.py runserver <ip>:<port>

Make sure stuff works ok, set up what you want to set up first or do your development.

For deploying with mod_wsgi .. here's a snippet from my apache config (I run it on https *only* and just redirect http to the https version):

<VirtualHost <your_ip>:443>
ServerName your.domain.name
ServerAdmin your_email@your_domain
ErrorLog /path/to/your/logs/your.domain.name_error.log
CustomLog /path/to/your/logs/your.domain.name_access.log combined

LogLevel Info

SSLEngine on
SSLCertificateFile /path/to/your/sslcerts/cert.pem
SSLCertificateKeyFile /path/to/your/sslcerts/privkey.pem
SSLCertificateChainFile /path/to/your/sslcerts/fullchain.pem

WSGIDaemonProcess mymezz home=/path/to/your/MezzanineInstall/Mezzanine/myenv processes=1 threads=15 display-name=[wsgi-mymezz]httpd python-path=/path/to/your/MezzanineInstall/Mezzanine/mezzproject:/path/to/your/MezzanineInstall/Mezzanine/myenv/lib64/python3.6/site-packages

WSGIProcessGroup mymezz
WSGIApplicationGroup %{GLOBAL}

WSGIScriptAlias / /path/to/your/MezzanineInstall/Mezzanine/mezzproject/apache.wsgi
Alias /static /path/to/your/MezzanineInstall/Mezzanine/mezzproject/static
Alias /robots.txt /path/to/your/MezzanineInstall/Mezzanine/htdocs_static/robots.txt
Alias /favicon.ico /path/to/your/MezzanineInstall/Mezzanine/htdocs_static/favicon.ico

<Directory /path/to/your/MezzanineInstall/Mezzanine/mezzproject>
  Options -Indexes +FollowSymLinks +MultiViews
  php_flag engine off
  <IfModule mod_authz_host.c>
    Require all granted
  </IfModule>
</Directory>

<Directory /path/to/your/MezzanineInstall/Mezzanine/mezzproject/static>
   Options -Indexes +FollowSymLinks +MultiViews -ExecCGI
   php_flag engine off
   RemoveHandler .cgi .php .php3 .php4 .phtml .pl .py .pyc .pyo
   AllowOverride None
   <IfModule mod_authz_host.c>
      Require all granted
   </IfModule>
</Directory>

</VirtualHost>

Should all be pretty self-explanatory (maybe I'll elaborate at a later point, but I don't have that much time now and I'd rather get it finished).

Here's the apache.wsgi file:

from __future__ import unicode_literals
import os, sys, site

# Adjust both paths to wherever your virtualenv lives (same myenv as in the Apache config)
site.addsitedir('/path/to/your/MezzanineInstall/Mezzanine/myenv/lib64/python3.6/site-packages')
activate_this = '/path/to/your/MezzanineInstall/Mezzanine/myenv/bin/activate_this.py'
with open(activate_this, 'r') as f:
    exec(f.read(), dict(__file__=activate_this))

PROJECT_ROOT = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.join(PROJECT_ROOT, ".."))
settings_module = "%s.settings" % PROJECT_ROOT.split(os.sep)[-1]
os.environ["DJANGO_SETTINGS_MODULE"] = settings_module

from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()
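
The settings module name is derived from the directory that contains apache.wsgi, which is why the project's parent directory has to be on sys.path. A tiny sketch of that derivation, using a made-up path and a hypothetical helper name:

```python
import posixpath


def settings_module_for(wsgi_file):
    """Mimic apache.wsgi: '<project dir name>.settings' from the file path."""
    project_root = posixpath.dirname(wsgi_file)
    return "%s.settings" % posixpath.basename(project_root)


# Made-up location, for illustration only.
print(settings_module_for("/srv/www/mezzproject/apache.wsgi"))
```

If you rename the project directory, the settings module name changes with it.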

I found activate_this.py at https://github.com/pypa/virtualenv/blob/master/virtualenv_embedded/activate_this.py (since with python3 execfile wasn't really working for me):

"""By using execfile(this_file, dict(__file__=this_file)) you will
activate this virtualenv environment.
This can be used when you must use an existing Python interpreter, not
the virtualenv bin/python
"""

try:
    __file__
except NameError:
    raise AssertionError(
        "You must run this like execfile('path/to/activate_this.py', dict(__file__='path/to/activate_this.py'))")
import sys
import os

old_os_path = os.environ.get('PATH', '')
os.environ['PATH'] = os.path.dirname(os.path.abspath(__file__)) + os.pathsep + old_os_path
base = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
if sys.platform == 'win32':
    site_packages = os.path.join(base, 'Lib', 'site-packages')
else:
    site_packages = os.path.join(base, 'lib', 'python%d.%d' % sys.version_info[:2], 'site-packages')
prev_sys_path = list(sys.path)
import site
site.addsitedir(site_packages)
sys.real_prefix = sys.prefix
sys.prefix = base
# Move the added items to the front of the path:
new_sys_path = []
for item in list(sys.path):
    if item not in prev_sys_path:
        new_sys_path.append(item)
        sys.path.remove(item)
sys.path[:0] = new_sys_path

That would be the Apache + mod_wsgi config (make sure to replace the Python version number if you don't use 3.6).

Make sure that apache has the correct permissions to all the files too btw ;)

February 19, 2018
Jason A. Donenfeld a.k.a. zx2c4 (homepage, bugs)
WireGuard in Google Summer of Code (February 19, 2018, 14:55 UTC)

WireGuard is participating in Google Summer of Code 2018. If you're a student — bachelors, masters, PhD, or otherwise — who would like to be funded this summer for writing interesting kernel code, studying cryptography, building networks, making mobile apps, contributing to the larger open source ecosystem, doing web development, writing documentation, or working on a wide variety of interesting problems, then this may be appealing. You'll be mentored by world-class experts, and the summer will certainly boost your skills. Details are on this page — simply contact the WireGuard team to get a proposal into the pipeline.

Gentoo accepted into Google Summer of Code 2018 (February 19, 2018, 00:00 UTC)

Students who want to spend their summer having fun and writing code can do so now for Gentoo. Gentoo has been accepted as a mentoring organization for this year’s Google Summer of Code.

The GSoC is an excellent opportunity for gaining real-world experience in software design and making one’s self known in the broader open source community. It also looks great on a resume.

Initial project ideas can be found here, although new project ideas are welcome. For new projects time is of the essence: there is typically some idea-polishing which must occur before the March 27th deadline. Because of this it is strongly recommended that students refine new project ideas with a mentor before proposing the idea formally.

GSoC students are encouraged to begin discussing ideas in the #gentoo-soc IRC channel on the Freenode network.

Further information can be found on the Gentoo GSoC 2018 wiki page. Those with unanswered questions should not hesitate to contact the Summer of Code mentors via the mailing list.

February 14, 2018
Sebastian Pipping a.k.a. sping (homepage, bugs)
I love free software... and Gentoo does! #ilovefs (February 14, 2018, 14:25 UTC)

Some people care if software is free of cost or if it has the best features, above everything else. I don't. I care that I can legally inspect its inner workings, modify and share modified versions. That's why I happily avoid macOS, Windows, Skype, Photoshop. I ran into these two pieces involving Gentoo in the Gallery of Free Software lovers and would like to share them with you:

Images are licensed under CC BY-SA 4.0 (with attribution going to Free Software Foundation Europe) as confirmed by Max Mehl.

February 10, 2018
Matthew Thode a.k.a. prometheanfire (homepage, bugs)
Native ZFS encryption for your rootfs (February 10, 2018, 06:00 UTC)

Disclaimer

I'm not responsible if you ruin your system, this guide functions as documentation for future me. Remember to back up your data.

Why do this instead of LUKS

I wanted to remove a layer from the file-to-disk layering: before it was ZFS -> LUKS -> disk, now it's ZFS -> disk.

Prework

I just got a new laptop and wanted to migrate the data. Luckily the old laptop was using ZFS as well, so the data could be sent/received through native ZFS means.

The actual setup

Set up your root pool with the encryption key; it will be inherited by all child datasets, and no child dataset will be allowed to be unencrypted.

In my case the pool name was slaanesh-zp00, so I ran the following to create the fresh pool.

zpool create -O encryption=on -O keyformat=passphrase zfstest /dev/zvol/slaanesh-zp00/zfstest

After that just go on and create your datasets as normal, transfer old data as needed (it'll be encrypted as it's written). See https://wiki.gentoo.org/wiki/ZFS for a good general guide on setting up your datasets.
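
After transferring data it can be reassuring to confirm that nothing ended up unencrypted. The following is only a sketch under a few assumptions: the helper names are invented, the dataset names `zfstest/home` and `zfstest/scratch` and the cipher string are illustrative sample data rather than output from a real system, and the parser expects the tab-separated text printed by `zfs get -r -H -o name,value encryption <pool>`.

```python
import subprocess


def unencrypted_datasets(zfs_get_output):
    """Return dataset names whose `encryption` property is reported as off.

    Expects one "name<TAB>value" pair per line, as printed by
    `zfs get -r -H -o name,value encryption <pool>`.
    """
    names = []
    for line in zfs_get_output.splitlines():
        if not line.strip():
            continue
        name, value = line.split('\t', 1)
        if value.strip() == 'off':
            names.append(name)
    return names


def check_pool(pool):
    """Query a live pool (needs ZFS installed and sufficient privileges)."""
    out = subprocess.run(
        ['zfs', 'get', '-r', '-H', '-o', 'name,value', 'encryption', pool],
        check=True, capture_output=True, text=True,
    ).stdout
    return unencrypted_datasets(out)


# Illustrative sample only, not captured from a real system.
sample = "zfstest\taes-256-ccm\nzfstest/home\taes-256-ccm\nzfstest/scratch\toff\n"
print(unencrypted_datasets(sample))
```

On a real system you would call `check_pool('zfstest')` instead of feeding in the sample text; an empty list means every dataset reports an encryption cipher.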

Decrypting at boot

If you are using dracut it should just work. No changes to what you pass on the kernel command line are needed. The code is upstream in https://github.com/zfsonlinux/zfs/blob/master/contrib/dracut/90zfs/zfs-load-key.sh.in

Notes

Make sure you install from git master; a disk format change for encrypted datasets went in just a week or so ago.

February 06, 2018
Gentoo at FOSDEM 2018 (February 06, 2018, 19:49 UTC)

Gentoo Linux participated with a stand during this year's FOSDEM 2018, as has been the case for the past several years. Three Gentoo developers had talks this year: Haubi was back with a Gentoo-related talk on Unix? Windows? Gentoo! - POSIX? Win32? Native Portability to the max!, dilfridge talked about Perl in the Physics Lab … Continue reading "Gentoo at FOSDEM 2018"