December 07 2023

A format that does one thing well or one-size-fits-all?

Michał Górny (mgorny) December 07, 2023, 15:58

The Unix philosophy states that we ought to design programs that “do one thing well”. Nevertheless, the current trend is to design huge monoliths with multiple unrelated functions, with web browsers at the peak of that horrifying journey. However, let’s consider something else.

Does the same philosophy hold for algorithms and file formats? Is it better to design formats that suit a single use case well, and swap between different formats as the need arises? Or is it better to design them so that they can fit different needs?

Let’s consider this by exploring three areas: hash algorithms, compressed file formats and image formats.

Hash algorithms

Hash, digest, checksum — they have many names, and many uses. To list a few uses of hash functions and their derivatives:

  • verifying file integrity
  • verifying file authenticity
  • generating derived keys
  • generating unique keys for fast data access and comparison

Different use cases imply different requirements. The simple CRC algorithms were good enough to check files for random damage but they aren’t suitable for cryptographic purposes. The SHA hashes provide good resistance to attacks but they are too slow to speed up data lookups. That role is much better served by dedicated fast hashes such as xxHash. In my opinion, these are all examples of “do one thing well”.
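
As an illustration, each of these trade-offs is visible in the corresponding command-line tools (a quick sketch; cksum and sha256sum ship with coreutils, while xxhsum comes with the xxHash package):

# CRC (POSIX cksum): fine for detecting random damage
cksum distfile.tar.gz

# SHA-2: cryptographically strong, but comparatively slow
sha256sum distfile.tar.gz

# xxHash: non-cryptographic, built for raw lookup speed
xxhsum distfile.tar.gz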

On the other hand, there is some overlap. More often than not, cryptographic hash functions are used to verify integrity. Then we have modern hashes like BLAKE2 that are both fast and secure (though not as fast as dedicated fast hashes). The Argon2 key derivation function builds upon BLAKE2 to improve its security even further, rather than inventing a new hash. These are examples of how a single tool can serve different purposes.

Compressed file formats

The purpose of compression, of course, is to reduce file size. However, individual algorithms may be optimized for different kinds of data and different goals.

Probably the oldest category are “archiving” algorithms that focus on providing strong compression and reasonably fast decompression. Back in the day, they were used to compress files in “cold storage” and for transfer; nowadays, they can be used for basically anything that you don’t modify very frequently. Common algorithms from this category include deflate (used by gzip and zip) and LZMA (used by 7z, lzip and xz).
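
For instance, creating a compressed archive for cold storage is a one-liner with either family of tools (a sketch; the file names are placeholders):

# deflate via gzip
tar -czf backups.tar.gz ./backups

# LZMA via xz
tar -cJf backups.tar.xz ./backups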

Then, we have very strong algorithms that achieve remarkable compression at the cost of very slow compression and decompression. These are sometimes (but rarely) used for data distribution. An example of such algorithms is the PAQ family.

Then, we have very fast algorithms such as LZ4. They provide worse compression ratios than other algorithms, but they are so fast that they can be used to compress data on the fly. They can be used to speed up data access and transmission by reducing its size with no noticeable overhead.

Of course, many algorithms have different presets. You can run lz4 -9 to get stronger compression with LZ4, or xz -1 to get faster compression with XZ. However, the former will not excel at compression ratio, nor the latter at speed.

Again, we are seeing different algorithms that “do one thing well”. However, ZSTD is nowadays gaining popularity, and it spans a wider spectrum: it is capable of providing both very fast compression (though not as fast as LZ4) and quite strong compression. What’s really important is that it’s capable of adaptive compression — that is, dynamically adjusting the compression level to provide the best throughput. It switches to a faster preset if the current one is slowing the transmission down, and to a stronger one if there is a potential speedup in doing so.
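
With the reference zstd tool, adaptive compression is exposed as a flag, which is handy when piping data over a network of varying speed (a sketch; the host and paths are placeholders):

# compression level adapts to how fast the pipe drains
tar -cf - ./data | zstd --adapt | ssh backup-host 'cat > data.tar.zst'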

Image formats

Let’s discuss image formats now. If we look back far enough, we’d arrive at a time when two image formats dominated the web. On one hand, we had GIF — with lossless compression, a limited color palette, transparency and animations — which made it a good choice for computer-generated images. On the other, we had JPEG — with efficient lossy compression and high color depth suitable for photography. We could see these two as “doing one thing well”.

Then came PNG. PNG is also lossless but provides much higher color depth and improved support for transparency via an alpha channel. While it’s still the format of choice for computer-generated images, it’s also somewhat suitable for photography (but with less efficient compression). With APNG around, it effectively replaces GIF but it also partially overlaps with the use cases for JPEG.

Modern image formats go even further. WebP, AVIF and JPEG XL all support both lossless and lossy compression, high color depths, alpha channels and animation. Therefore, they are suitable both for computer-generated images and for photography. Effectively, they can replace all their predecessors with a “one size fits all” format.
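
For example, the WebP reference encoder covers both roles with a single tool (a sketch; cwebp comes with libwebp, and the file names are placeholders):

# lossless mode, e.g. for a screenshot
cwebp -lossless screenshot.png -o screenshot.webp

# lossy mode at quality 80, e.g. for a photo
cwebp -q 80 photo.png -o photo.webp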

Conclusion

I’ve asked whether it is better to design formats that focus on one specific use case, or whether formats that try to cover a whole spectrum of use cases are better. I’m afraid there’s no easy answer to this question.

We can clearly see that “one-size-fits-all” solutions are gaining popularity — BLAKE2 among hashes, ZSTD in compressed file formats, WebP, AVIF and JPEG XL among image formats. They have a single clear advantage — you need just one tool, one implementation.

Your web browser needs to support only one format that covers both computer-generated graphics using lossless compression and photographs using lossy compression. Different tools can reuse the same BLAKE2 implementation that’s well tested and audited. A single ZSTD library can serve different applications in their distinct use cases.

However, there is still a clear edge to algorithms that are focused on a single use case. xxHash is still faster than any hashes that could be remotely suitable for cryptographic purposes. LZ4 is still faster than ZSTD can be in its lowest compression mode.

The only reasonable conclusion seems to be: there are use cases for both. There are use cases that are best satisfied by a dedicated algorithm, and there are use cases where a more generic solution is better. There are also use cases where integrating two different hash algorithms or two different compression libraries into your program, with all the overhead involved, is a better choice than using a single algorithm that fits neither of your two distinct use cases well.

Once again, it feels that a reference to XKCD#927 is appropriate. However, in this particular instance this isn’t a bad thing.

September 05 2023

My thin wrapper for emerge(1)

Michał Górny (mgorny) September 05, 2023, 17:04

I’ve recently written a thin wrapper over emerge that I use in my development environment. It does the following:

  1. set the tmux window title to the first package argument (so you can roughly see what’s emerging in every pane),
  2. beep meaningfully when emerge finishes (two beeps for success, three for failure),
  3. run pip check after a successful run to check for mismatched Python dependencies.

Here’s the code:

#!/bin/sh

# Use the first non-option argument (the package) as the tmux
# window title, so every pane shows what it is emerging.
for arg; do
	case ${arg} in
		-*)
			# skip options such as --ask
			;;
		*)
			tmux rename-window "${arg}"
			break
			;;
	esac
done

# Run the real emerge, preserving its exit status.
/usr/bin/emerge "${@}"
ret=${?}

if [ "${ret}" -eq 0 ]; then
	# success: check for mismatched Python dependencies
	# (lines mentioning certifi are filtered out)
	python3.11 -m pip check | grep -v certifi
else
	# failure: one extra beep, for three in total
	tput bel
	sleep 0.1
fi

# two beeps emitted in both cases
tput bel
sleep 0.1
tput bel

exit "${ret}"

August 27 2023

Final Report, Automated Gentoo System Updater

Gentoo Google Summer of Code (GSoC) August 27, 2023, 16:28

Project Goals

The main goal of the project was to write an app that will automatically handle updates on Gentoo Linux systems and send notifications with update summaries. More specifically, I wanted to:

  1. Simplify the update process for beginners, offering a one-click method.
  2. Minimize time experienced users spend on routine update tasks, decreasing their workload.
  3. Ensure systems remain secure and regularly updated with minimal manual intervention.
  4. Keep users informed of the updates and changes.
  5. Improve the overall Gentoo Linux user experience.

Progress

Here is a summary of what was done every week with links to my blog posts.

Week 1

The basic system updater is ready. I also prepared a Docker Compose file to run tests in containers. Available functionality:

  • update security patches
  • update @world
  • merge changed configuration files
  • restart updated services
  • do a post-update clean up
  • read elogs
  • read news

Links:

  • Pull requests: #2, #3, #4
  • Docker tests

Week 2

Packaged the Python code, created an ebuild and a GitHub Actions workflow that publishes the package to PyPI when a commit is tagged.

Links:

  • Pull requests: #5
  • ebuild commit
  • GitHub Actions workflow

Week 3

Fixed issue #7, answered issue #8 and fixed bug 908308. Added USE flags to manage dependencies. Improved Bash code stability.

Links:

  • Issues: #7, #8
  • Bugs: 908308
  • Pull requests: #6, #9, #10

Week 4

Fixed errors in the ebuild, replaced USE flags with optfeature for dependency management. Wrote a blog post to introduce my app and posted it on the forums. Fixed a bug in the --args flag.

Links:

  • Pull requests: #11
  • request to fix ebuild
  • Blog post and Forum post

Week 5

Received some feedback from the forums. Coded much of the parser (--report). Improved the container testing environment.

Links:

  • Improved dockerfiles

Weeks 6 and 7

Completed the parser (--report). Also added disk usage calculation before and after the update. Available functionality:

  • If the update was successful, the report will show:
    • updated package names
    • package versions in the format “old -> new”
    • USE flags of those packages
    • disk usage before and after the update
  • If the emerge pretend run has failed, the report will show:
    • error type (for now only the ‘blocked packages’ error is supported)
    • error details (for blocked packages it will show the problematic packages)

Links:

  • Pull requests: #12, #13

Week 8

Added 2 notification methods (--send-reports) – an IRC bot and emails via SendGrid.

Links:

  • Pull requests: #14, #15

Week 9-10

Improved CLI argument handling. Experimented with different mobile app UI layouts and backend options. Fixed issue #17. Started working on mobile app UI, decided to use Firebase for backend.

Links:

  • Pull requests: #16
  • Issues: #17

Week 11-12

Completed mobile app (UI + backend). Created a plan to migrate to a custom self-hosted backend based on Django+MongoDB+Nginx in the future. Added --send-reports mobile option to CLI. Available functionality:

  • UI
    • Login screen: Anonymous login
    • Reports screen: Receive and view reports sent from the CLI app.
    • Profile screen: View token, user ID and Sign Out button.
  • Backend
    • Create anonymous users (Cloud Functions)
    • Create user tokens (Cloud Functions)
    • Receive tokens in HTTPS requests, verify them, and route them to users (Cloud Functions)
    • Send push notifications (FCM)
    • Secure database access with Firestore security rules

Links:

  • Pull requests: #18
  • Mobile app repository

Final week

Added token encryption with Cloud Functions. Packaged the mobile app with GitHub Actions and published it to the Google Play Store. Recorded a demo video and wrote the gentoo_update User Guide that covers both the CLI and the mobile app.

Links:

  • Demo video
  • gentoo_update User Guide
  • Packaging GitHub Actions workflow
  • Google Play link
  • Release page

Project Status

I would say I’m very satisfied with the current state of the project. Almost all tasks from the proposal were completed, and there is a product that can already be used. To summarize, here is a list of deliverables:

  1. Source code for gentoo_update CLI app
  2. gentoo_update CLI app ebuild in GURU repository
  3. gentoo_update CLI app package on PyPI
  4. Source code for mobile app
  5. Mobile app for Android as an APK
  6. Mobile app for Android in Google Play

Future Improvements

I plan to add a lot more features to both the CLI and mobile apps. Full feature lists can be found in the READMEs of both repositories:

  • CLI app upcoming features
  • mobile app upcoming features

Final Thoughts

These 12 weeks felt like a hackathon, where I had to learn new technologies very quickly and rapidly create something that works. I faced many challenges and acquired a range of new skills.

Over the course of this project, I coded both Linux CLI applications using Python and Bash, and mobile apps with Flutter and Firebase. To maintain the quality of my work, I tested the code in Docker containers, virtual machines and on physical hardware. Additionally, I built and deployed CI/CD pipelines with GitHub Actions to automate packaging. Beyond the technical side, I engaged actively with the Gentoo community, utilizing IRC chats and forums. Through these platforms, I addressed and resolved issues on both GitHub and Gentoo Bugs, enriching my understanding and refining my skills.

I would also like to thank my mentor, Andrey Falko, for all his help and support. I wouldn’t have been able to finish this project without his guidance.

In addition, I want to thank Google for providing such a generous opportunity for open source developers to work on bringing forth innovation.

Lastly, I am grateful to the Gentoo community for the feedback that’s helped me to improve the project immensely.

gentoo_update User Guide

Gentoo Google Summer of Code (GSoC) August 27, 2023, 15:15

Introduction

This article will go through the basic usage of the gentoo_update CLI tool and the mobile app.

But before that, here is a demo of this project:

gentoo_update CLI App

Installation

gentoo_update is available in the GURU overlay and on PyPI. Generally, installing the program from the GURU overlay is the preferred method, but PyPI will always have the most recent version.

Enable GURU and install with emerge:

eselect repository enable guru
emerge --ask app-admin/gentoo_update

Alternatively, install from PyPI with pip:

python -m venv .venv_gentoo_update
source .venv_gentoo_update/bin/activate
python -m pip install gentoo_update

Update

gentoo_update provides two update modes – full and security. Full mode updates @world, while security mode uses glsa-check to find security patches and installs them if something is found.

By default, when run without flags, security mode is selected:

gentoo-update

To update @world, run:

gentoo-update --update-mode full

The full list of available parameters and flags can be accessed with the --help flag. Further examples are detailed in the repository’s readme file.

Once the update concludes, a log file gets generated at /var/log/portage/gentoo_update/log_<date> (or whatever $PORTAGE_LOGDIR is set to). This log becomes the basis for the update report when the --report flag is used, transforming the log details into a structured update report.
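
For example, to run a full update and then turn the resulting log into a report printed to the terminal (a minimal sequence; see the readme for more options):

gentoo-update --update-mode full
gentoo-update --report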

Send Report

The update report can be sent through three distinct methods: IRC bot, email, or mobile app.

IRC Bot Method
Begin by registering a user on an IRC server and setting a nickname as outlined in the documentation. After establishing a chat channel for notifications, define the necessary environment variables and execute the following commands:

export IRC_CHANNEL="#<irc_channel_name>"
export IRC_BOT_NICKNAME="<bot_name>"
export IRC_BOT_PASSWORD="<bot_password>"
gentoo-update --send-report irc

Email via SendGrid
To utilize SendGrid, register for an account and generate an API key. After installing the SendGrid Python library from GURU, save the API key in an environment variable and use the commands below:

emerge --ask dev-python/sendgrid
export SENDGRID_TO='recipient@email.com'
export SENDGRID_FROM='sender@email.com'
export SENDGRID_API_KEY='SG.****************'
gentoo-update --send-report email

Notifications can also be sent via the mobile app. Details on this method will be elaborated in the following section.

gentoo_update Mobile App

Installation

The mobile app can be installed either from GitHub or from the Google Play Store.

Play Store

The app can be found by searching ‘gentoo_update’ in the Play Store, or by using this link.

Manual Installation
For manual installation on an Android device, download the APK file from the Releases tab on GitHub. Ensure you’ve enabled installation from Unknown Sources before proceeding.

Usage

The mobile app consists of three screens: Login, Reports, and Profile.

Upon first use, users will see the Login screen. To proceed, select the Anonymous Login button. This action generates an account with a unique user ID and token, essential for the CLI to send reports.

The Reports screen displays all reports sent using a specific token. Each entry shows the update status and report ID. For an in-depth view of any report, simply tap on it.

On the Profile screen, users can find their 8-character token, which needs to be saved as the GU_TOKEN variable on the Gentoo instance. This screen also shows the AES key status, crucial for decrypting the client-side token as it’s encrypted in the database. To log out, tap the Sign Out button.
Note: Since only Anonymous Login is available, once logged out, returning to the same account isn’t possible.

Contacts

The preferred method for getting help or requesting a new feature for both the CLI and mobile apps is creating an issue on GitHub:

  • gentoo_update CLI issues page
  • Mobile app issues page

Or just contact me directly via labbrat_social@pm.me or IRC. I am in most of the #gentoo IRC channels and my nick is LabBrat.

Links
  • [Link] – gentoo_update CLI repository
  • [Link] – Mobile App repository

August 21 2023

Week 12 report on porting Gentoo packages to modern C

Gentoo Google Summer of Code (GSoC) August 21, 2023, 13:49

Hello all, hope you’re doing well. This is my week 12 report for my
project “Porting Gentoo’s packages to modern C”

Similar to last week, I took up bugs from the tracker randomly and
patched them, sending patches upstream whenever possible. Unfortunately,
nothing new or interesting.

I’ve also been working with Juippis on masking firefox-bin and rust-bin in
the glibc llvm profile. Juippis has for now reverted the commit masking
those bin packages, but a proper fix will likely be committed soon.

Just wrapping things up for the final review. I’m also in 1:1 contact with
Sam in case there is some major work needed on a particular section of
my project or a package.

And to be honest, it’s not much; I’ve been under the weather a bit
and busy with some IRL stuff.

This week I have some free time, which I plan on dedicating to the lapack and
fortran bugs on the llvm profile. With those solved, we will be able to close
a good number of sci package related bugs and also some qemu
related bugs (as qemu pulls in some packages like apack and lapack). I’ll
probably also sit with Sam for this one; hopefully we’ll be able to chalk
something out.

EDIT

I forgot to mention that this is going to be the last week, so I’ll wrap things up after talking with my mentors. I will also create a separate blog post that links all of my work throughout the weeks in brief and will be used as the final submission.

Till then see yah!

August 20 2023

Week 11+12 Report, Automated Gentoo System Updater

Gentoo Google Summer of Code (GSoC) August 20, 2023, 20:59

This article is a summary of all the changes made to the Automated Gentoo System Updater project during weeks 11 and 12 of GSoC.

The project is hosted on GitHub (gentoo_update and mobile app).

Progress on Weeks 11 and 12

During the last 2 weeks I’ve completed the app UI and the Firebase backend. Most of the work is done!

I’m not entirely pleased with how the backend works. In Firebase, I ended up using:

  • Firestore (security rules defined here)
  • Cloud Functions (defined here)
  • Cloud Messaging (FCM).

After a user authenticates using anonymous login, a token is automatically registered in Firestore. This token is later used by gentoo_update to send reports. Cloud Functions manage the token’s creation. In fact, all database write operations are handled by Cloud Functions, with users having read-only access to the data they’ve created. Here is how to send the report via token:

export GU_TOKEN="<token ID>"
gentoo-update --send-report mobile

Internally, gentoo-update talks to a Cloud Function. This function checks the token, then saves the report in Firestore for the user to access.

This differs from the original idea, where I didn’t intend to save reports in Firestore. The initial plan was to have the client side listen and let Firebase route report content from the Gentoo system to the app. But this method often missed reports or stored them incorrectly, causing them to vanish from the app. To solve this, I chose to save the reports and tokens, but with encryption.

I’ve come up with a plan to create a custom backend for the app, which users will be able to self-host; more about it in the Challenges section.

Apart from the app, I’ve fixed some minor issues in gentoo-update and pushed the latest ebuild version to the GURU repository (commit link).

Challenges

While Firebase offers a quick way to set up a backend, it has its drawbacks:

  • Not all its best features are free.
  • Some of its operations aren’t transparent.
  • It doesn’t offer self-hosting.
  • Its rate-limiting and security features aren’t as strong as needed.

To tackle these concerns, I’m considering a custom backend using this tech stack: Linux + Docker + Python/Django + MongoDB + Nginx.

Here’s a breakdown:

  • Django will serve as the backend, handling tasks similar to Cloud Functions.
  • MongoDB, a document database, will take Firestore’s place.
  • Nginx adds extra capabilities for rate-limiting, load balancing, and security checks.

If necessary, MongoDB can be swapped out for a relational database because the backend will heavily utilize ORM. The same flexibility applies to Nginx.

A highlight of this approach is that everything can be defined in a Docker Compose file, simplifying self-hosting for users.

Plans for Week 13 (final week 🎉)

Here is my plan for the final week of GSoC’2023:

  1. Add encryption to Firestore. I don’t want any user data to be stored in plain text.
  2. Improve some UI elements and add a pop-up with commands to copy/paste.
  3. Publish the mobile app to the Play Store.
  4. Write a detailed blog post on how to use the whole thing.
  5. Write a post on the forums.

August 14 2023

Week 11 report on porting Gentoo packages to modern C

Gentoo Google Summer of Code (GSoC) August 14, 2023, 5:30

Hello all, hope you’re doing well. This is my week 11 report for my
project “Porting Gentoo’s packages to modern C”

Similar to the last two weeks, I took up bugs from the tracker randomly and
patched them, sending patches upstream whenever possible. Unfortunately,
nothing new or interesting.

I have some open PRs at ::gentoo that I would like to work on and get
reviews on from my mentor(s).

This coming week is going to be the last week, so I would like to fix a few more bugs and
start wrapping things up. However, I don’t plan on abandoning
my patching work (not even after GSoC), as there are still
lots of interesting packages in the tracker.

Till then see yah!

August 07 2023

Week 9+10 Report, Automated Gentoo System Updater

Gentoo Google Summer of Code (GSoC) August 07, 2023, 18:24

This article is a summary of all the changes made to the Automated Gentoo System Updater project during weeks 9 and 10 of GSoC.

The project is hosted on GitHub (gentoo_update and mobile app).

Progress on Weeks 9 and 10

I have finalized the app architecture; here are the details:

The app’s main functionality is to receive notifications from the push server. For each user, it will create a unique API token after authentication (there is an Anonymous option). This token will be used by gentoo_update to send the encrypted report to the mobile device using a push server endpoint. Update reports will be kept only on the mobile device, ensuring privacy.

After much discussion, I decided to implement the app’s backend in Firebase. Since GSoC is organized by Google, it seems appropriate to use their products for this project. However, future plans include the possibility of implementing a self-hosted web server, so that instead of authenticating, the user will just enter the server’s public IP and port.

Example usage will be something like:

  1. Download the app and sign in.
  2. The app will generate a token, one token per account.
  3. Save the token into an environment variable on Gentoo Linux.
  4. Run gentoo-update --send-report mobile
  5. Wait until the notification arrives on the mobile app.

I have also made some progress on the app’s code. I’ve decided to host it in another repository because it doesn’t require direct access to gentoo_update, and this way it will be easier to manage versions and set up CI/CD.

Splitting tasks for the app into UI and Backend categories was not very efficient in practice, since the two are very closely related. Here is what I have done so far:

  • Create an app layout
  • Set up Firebase backend for the app
  • Set up database structure for storing tokens
  • Configure anonymous authentication
  • UI elements for everything above

Challenges

I’m finding it somewhat challenging to get used to Flutter and design a modern-looking app. My comfort zone lies more in coding backend and automation tasks than in the intricacies of UI components. Despite these challenges, I am 60% sure that in the end the app will look half-decent.

Plans for Week 11

After week 11 I plan to have a mechanism to deliver update reports from a Gentoo Linux machine.

August 06 2023

Week 10 report on porting Gentoo packages to modern C

Gentoo Google Summer of Code (GSoC) August 06, 2023, 18:53

Hello all, I’m here with my week 10 report of my project “Porting
Gentoo’s packages to modern C”.

So apart from the usual patching of packages from the tracker, the most
significant work done this week is getting the GNOME desktop working on the
llvm profile. But it is to be noted that the packages gui-libs/libhandy,
dev-libs/libgee and sys-libs/libblockdev require a gcc fallback
environment. net-dialup/ppp was also on our list, but thanks to Sam it
has been patched [0] (and the fix sent upstream). I’m pretty sure that
the same workaround would work on the musl-llvm profile as well. The overall
point being, we now have two DEs on the llvm profile: GNOME and MATE.

Another thing to note: currently gui-libs/gtk-4.10.4 requires
overriding LD to bfd and OBJCOPY to GNU objcopy; it is a dependency
for GNOME 44.3.
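
For reference, one way to scope such an override to a single package is Portage’s package.env mechanism (a sketch; the env file name is made up):

# /etc/portage/env/ld-bfd.conf (hypothetical file name)
LD="ld.bfd"
OBJCOPY="objcopy"

# then map the package to that environment:
echo 'gui-libs/gtk ld-bfd.conf' >> /etc/portage/package.env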

Unfortunately, time is not my friend here and I’ve got only two weeks
left. I’ll try to fix as many packages as possible in the coming weeks,
starting with the GNOME dependencies.

Meanwhile, a lot of my upstream patches have been merged as well; hope the
remaining ones get merged too, [1][2] to name a few.

Till then, see ya!

[0]: https://github.com/gentoo/gentoo/pull/32198
[1]: https://github.com/CruiserOne/Astrolog/pull/20
[2]: https://github.com/cosmos72/detachtty/pull/6

August 02 2023

Weekly report 9, LLVM-libc

Gentoo Google Summer of Code (GSoC) August 02, 2023, 1:32

Hi! This week I’ve pretty much finished the work on LLVM/Clang support
for Crossdev and LLVM-libc ebuild(s). I have sent PRs for Crossdev and
related ebuild changes here:

https://github.com/gentoo/crossdev/pull/10
https://github.com/gentoo/gentoo/pull/32136
This PR includes changes for compiler-rt which are always needed for
Clang crossdev, regardless of libc. There are also changes to musl,
kernel-2.eclass (for linux-headers), and a new eclass, cross.eclass.

I made a gentoo.git branch that has LLVM-libc, libc-hdrgen ebuilds and a
gnuconfig patch to support
LLVM-libc: https://github.com/gentoo/gentoo/compare/master...alfredfo:gentoo:gentoo-llvm-libc. I
want to merge Crossdev changes and ebuilds before merging
this. Previously, all autotools based projects would fail to configure on
LLVM-libc because there was no gnuconfig entry for it.

I have also solved the problem from last week of not being able to compile SCUDO
into LLVM-libc directly. This was caused by two things: 1) LLVM-libc
only checked for compiler-rt in LLVM_ENABLE_PROJECTS, not
LLVM_ENABLE_RUNTIMES, which is needed for using “llvm-project/runtimes”
as the root source directory (a “runtimes build”).
Fix commit:
https://github.com/llvm/llvm-project/commit/fe9c3c786837de74dc936f8994cd5a53dd8ee708
2) Many compiler-rt configure tests would fail because of LLVM-libc not
supporting dynamic linking, and would therefore disable the build of
SCUDO. This was fixed by passing
-DCMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY. So now I no longer need
to manually compile the source files and append object files into
libc.a, yay!
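
For reference, a minimal configure sketch of such a runtimes build (the SCUDO toggle and the paths are assumptions based on the LLVM-libc build docs, not taken from this post):

# "runtimes build" rooted at llvm-project/runtimes
cmake -S llvm-project/runtimes -B build \
    -DLLVM_ENABLE_RUNTIMES="libc;compiler-rt" \
    -DLLVM_LIBC_INCLUDE_SCUDO=ON \
    -DCMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY
cmake --build build --target libc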

Now I will continue to fix packages for using LLVM-libc Crossdev, or
more likely, add needed functionality into LLVM-libc. I will of course
also fix any comments I get on my PRs.


catcream

July 30 2023

Week 9 report on porting Gentoo packages to modern C

Gentoo Google Summer of Code (GSoC) July 30, 2023, 18:31

Hello all, hope you’re doing well. This is my week 9 report for my
project “Porting Gentoo’s packages to modern C”

Similar to last week, I picked up bugs at random and started submitting
patches. But this time I made sure to check out the upstream and send in
patches whenever possible; if it turned out to be difficult or I
couldn’t find the upstream, I made sure to make a note about it in the PR,
either via the commit message or through a separate comment. This way it’ll
help Sam keep track of things and my progress.

Apart from that, nothing new or interesting, unfortunately.

Coming next week the plan is the same: pick up more bugs and send in
PRs, both in ::gentoo and upstream whenever possible. I also have some
free time in the coming week, so I plan to make up for the time lost
during my sick days, as there are still lots of packages that require
patching.

I would like to note here that I made an extra blog post last week
about setting up a testing environment using lxc, and about using
gentoo’s stage-3 tarballs to create custom lxc gentoo images. I
don’t really expect anyone to follow or use it; I mainly put it up
for my own future reference.

Till then, see ya!


July 29 2023

Genkernel in 2023

Maciej Barć (xgqt) July 29, 2023, 17:10

I really wanted to look into the new kernel building solutions for Gentoo and maybe migrate to dracut, but last time I tried, ~1.5 years ago, the initramfs was not working for me.

And now in 2023 I’m still running genkernel for my personal boxes as well as other servers running Gentoo.

I guess some short-term solutions really do become permanent tools :P

So this is how I rebuild my kernel nowadays:

  1. Copy old config

    cd /usr/src
    cp linux-6.1.38-gentoo/.config linux-6.1.41-gentoo/
    
  2. Remove old kernel build directories

    rm -r linux-6.1.31-gentoo
    
  3. Run initial preparation

    ( eselect kernel set 1 && cd /usr/src/linux && make olddefconfig )
    
  4. Call genkernel

    genkernel                                                       \
        --no-menuconfig                                             \
        --no-clean                                                  \
        --no-clear-cachedir                                         \
        --no-cleanup                                                \
        --no-mrproper                                               \
        --lvm                                                       \
        --luks                                                      \
        --mdadm                                                     \
        --nfs                                                       \
        --kernel-localversion="-$(hostname)-$(date '+%Y.%m.%d')"    \
        all
    
  5. Rebuild the modules

    If in your /etc/genkernel.conf you have MODULEREBUILD turned off, then also call emerge:

    emerge -1 @module-rebuild
    

July 24 2023

Weekly report 8, LLVM libc

Gentoo Google Summer of Code (GSoC) July 24, 2023, 23:21

Hi! This (and last) week I’ve spent my time polishing the LLVM/Clang
crossdev work. I have also created ebuilds for llvm-libc, libc-hdrgen
and the SCUDO allocator, but I will probably bake SCUDO into the
llvm-libc ebuild instead.

One thing I have also made is a cross eclass that handles cross
compilation, instead of having the same logic copy-pasted in all
ebuilds. To differentiate between a “normal” crossdev package and
LLVM/Clang crossdev I decided to use “cross_llvm-${CTARGET}” as the
package category name. This is necessary since you need some way to tell
the ebuild about using LLVM for cross. My initial idea was to handle all
this in the crossdev script, but crossdev ebuilds are self-contained,
and you can do something like “emerge cross_llvm-gentoo-linux-llvm/llvm-libc”
and it will do the right thing without running emerge from crossdev.
Hence I need to handle cross compilation in the ebuilds themselves,
using the eclass. Sam and I are not sure if a new eclass is the right
thing to do, but I will continue with it until I get more feedback, as
we can just inline everything later without wasting any work.

I feel pretty much done now except for baking SCUDO directly into the
llvm-libc ebuild. It is actually very simple to do, but I got some issues
with libstdc++ when using llvm/ as the root source directory for the libc
build, which is necessary when compiling SCUDO. Previously I used
runtimes/ as the root directory, and that worked without issue. Currently,
to work around this, you can just compile the source files in
llvm-project/compiler-rt/lib/scudo/standalone and append the object
files to libc.a. LLVM libc then just works with crossdev and you
can compile things with the emerge wrapper as usual, but currently a lot
of autotools builds break due to me not having specified gnuconfig for
llvm-libc yet.

I had a lot of trouble last week with sonames when doing an aarch64 musl
crossdev setup and running binaries with qemu-user; however, it turned
out it was just a warning, and it worked after setting LD_LIBRARY_PATH as
an envvar for qemu-user. I spent a loong time on this.
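
For reference, qemu-user can pass environment variables to the guest
with -E, so the working invocation presumably looked something like this
(paths and binary name are illustrative):

qemu-aarch64 -L /usr/aarch64-gentoo-linux-musl \
    -E LD_LIBRARY_PATH=/usr/aarch64-gentoo-linux-musl/usr/lib \
    ./some-binary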

Currently I will need to upstream changes to the compiler-rt ebuild, musl,
the llvm-libc ebuild, libc-hdrgen, cross.eclass, and of course crossdev.

Next week I will send the changes upstream for review and continue work
on LLVM libc, most likely simple packages like ed, and then try to get
the missing pieces upstreamed to LLVM libc. fileno() is definitely
needed for ed.

Last week I did not write a blog post as I was in “bug hell”, working
on a lot of small things at once and thinking “if I just finish this I
can write a good report”; then Wednesday came, and I decided to just
do an overview of all my work for this week’s blog instead 😀

catcream


Weekly report 6, LLVM libc

Gentoo Google Summer of Code (GSoC) July 24, 2023, 23:21

Hi! This week I have been working on LLVM/Clang support for
Crossdev. This is currently done by swapping out the different Crossdev
stages for ones that make sense for LLVM.

Currently it replaces stage0 with checking whether LLVM can target the
target triple’s architecture by checking the LLVM_TARGETS USE-flag.

Stage1, which normally installs libc headers and compiles a -stage1 C
compiler, is replaced by installing libc headers and compiling
compiler-rt.

Stage2 (kernel headers) is the same.

Stage3 (libc install) is the same.

Stage4, which compiles a full compiler, is skipped completely.

Another needed change was to make the compiler-rt ebuild cross-aware,
with changes like making the assembler and C compiler target the target triple, and
including headers from the crossdev /usr/${CTARGET}/usr/include
directory instead of using the host’s libc headers. I got some help from
wikky here, thanks!

Currently, running ‘crossdev --llvm -t riscv64-gentoo-linux-musl &&
riscv64-gentoo-linux-musl-emerge dash’ produces a working binary that
can be run using qemu-user like this: ‘qemu-riscv64 -L
/usr/riscv64-gentoo-linux-musl /usr/riscv64-gentoo-linux-musl/bin/dash’,
which I think is very cool! However, there are still some issues with
dynamically linking libraries built with the cross compiler. For example,
xz-utils installs liblzma.so, which fails with “exec format
error” (http://sprunge.us/HkSmms). I am currently looking into that.
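
The same commands, reformatted as a copy-pasteable block:

crossdev --llvm -t riscv64-gentoo-linux-musl
riscv64-gentoo-linux-musl-emerge dash
qemu-riscv64 -L /usr/riscv64-gentoo-linux-musl \
    /usr/riscv64-gentoo-linux-musl/bin/dash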

Another thing I’m still a little uncertain about is where to put all
the environment variables and compilation options, whether it’s crossdev’s
job or the ebuilds’. This is something I will come back to, and I have
some changes locally on my computer.
Crossdev patches:
https://github.com/alfredfo/crossdev/commit/ec65dee4b4c359bf3e0fc374d31e05b147fa3f0d
Compiler-rt patches:
https://github.com/alfredfo/catcream_repo/blob/master/sys-libs/compiler-rt/compiler-rt-17.0.0.9999.ebuild

Later during the week I made ebuilds for LLVM libc and
libc-hdrgen (which generates LLVM libc headers from TableGen specification
files). Normally you build LLVM libc together with libc-hdrgen, but when
cross compiling it’s a better idea to split these and keep libc-hdrgen a
tool installed on the build host. I have played with building only the
libc headers for bootstrapping with crossdev, but I haven’t figured it
all out yet. To install only the headers you can use the
install-libc-headers target, but it seems like CMake still wants to build
things. There’s also the scudo allocator, which needs to be statically
linked into LLVM libc. My idea is to add a USE=static-scudo flag to
compiler-rt that gets set by crossdev when compiling compiler-rt for an
LLVM libc target.
These are also kept locally until I’ve figured out how to cross compile
in stages.

Many “small random issues” and technicalities have also popped up during
this week. They have taken quite a long time, but they are not really
worth digging into here.

Next week I will continue with this until I can use it to work on LLVM
libc. Worst case, I could temporarily make a “franken LLVM libc ebuild” that
does everything (headers, compiler-rt, scudo, llvm libc) in one shot,
but it should definitely be possible to do it separately.

I also forgot to update my llvm-common changes with the new
elisp-site-file-install function that was inspired by my PR 🙁
… will fix that tomorrow.

Thanks for reading!


Creating custom lxd gentoo containers from stage-3 tarballs

Gentoo Google Summer of Code (GSoC) July 24, 2023, 18:57

Much of this is based on the incredible guide by user (and my mentor) Juippis and his work over at The ultimate testing system with lxd. In fact, most of what comes next comes directly from Juippis himself.

The reason for creating custom gentoo containers was purely for testing. I really need a way to seamlessly test my PRs against gcc-13, clang-16 and musl + clang-16, while also being lazy, because testing them manually meant introducing errors and missing test cases. So I asked Juippis for help, and I’ll try to write down what he taught me. I don’t really expect anyone else to read this; it mainly serves as a quick reference for myself on setting up an lxc container. Don’t take this as anything more than a quick guide.

First of all, you need to install lxd on your host system. For that I would recommend heading over to Gentoo’s lxd wiki.

Once that is done, you can follow Juippis’s guide for setting up an lxd container with glibc and gcc-13 or the newest gcc available. His guide is pretty detailed and straightforward, and I don’t really expect any hiccups.

Now onto the steps for creating an lxd container from a stage-3 llvm tarball (the same commands are collected into a single block after this list).

  • Install distrobuilder
  • create a sub-folder gentoo under the distrobuilder folder (example: mkdir -p ~/distrobuilder/gentoo)
  • Download the gentoo.yaml from https://raw.githubusercontent.com/lxc/lxc-ci/master/images/gentoo.yaml after cd-ing into ~/distrobuilder/gentoo
  • create another folder for this profile, let’s say llvm (example: mkdir llvm), inside the ~/distrobuilder/gentoo where you downloaded the gentoo.yaml
  • cd into the llvm folder
  • now using distrobuilder you can create the lxd image with the following command: sudo distrobuilder build-lxd ../gentoo.yaml -o image.architecture=amd64 -o image.variant=openrc -o source.variant=llvm-openrc
  • with the source.variant variable you can control what is being downloaded. So source.variant=llvm-openrc will download the llvm openrc stage-3 tarball for creating the image.
  • Once the download finishes, you can import the rootfs with the following command: lxc image import lxd.tar.xz rootfs.squashfs --alias gentoo-amd64-llvm
  • launch your lxc image with lxc launch gentoo-amd64-llvm gentoo-llvm-test and log into it the usual way.
  • That’s it, you now have a gentoo lxd image with the llvm openrc profile.
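
For convenience, here are the same steps as a single shell session
(a sketch only; paths and the image alias are as above):

mkdir -p ~/distrobuilder/gentoo && cd ~/distrobuilder/gentoo
wget https://raw.githubusercontent.com/lxc/lxc-ci/master/images/gentoo.yaml
mkdir llvm && cd llvm
sudo distrobuilder build-lxd ../gentoo.yaml \
    -o image.architecture=amd64 \
    -o image.variant=openrc \
    -o source.variant=llvm-openrc
lxc image import lxd.tar.xz rootfs.squashfs --alias gentoo-amd64-llvm
lxc launch gentoo-amd64-llvm gentoo-llvm-test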

Note:

For using the musl profile, you’ll need to modify the gentoo.yaml file a bit; specifically, comment out the following lines:

echo en_US.UTF-8 UTF-8 > /etc/locale.gen
locale-gen

This is because musl only supports the C.UTF-8 locale.
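
That is, in gentoo.yaml those two lines become (illustrative):

# echo en_US.UTF-8 UTF-8 > /etc/locale.gen
# locale-gen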

So now I have three test-pr scripts, test-pr-gcc, test-pr-clang16, and test-pr-mclang16, which I use to test my PRs against gcc-13, clang-16 on glibc, and clang-16 on musl libc respectively.


July 23 2023

Week 8 Report, Automated Gentoo System Updater

Gentoo Google Summer of Code (GSoC) July 23, 2023, 20:11

This article is a summary of all the changes made to the Automated Gentoo System Updater project during week 8 of GSoC.

Project is hosted on GitHub.

Progress on Week 8

Currently, the updater supports two methods of notifications: IRC bot and email.

The IRC bot was built using Python’s socket library with SSL support. Although functional, it remains quite basic and fails to send out the report properly in approximately 20% of cases. The issue seems to occur while connecting to the irc.libera.chat servers, though the exact cause remains unclear.

In addition, there’s an option to send the report via email using SendGrid. This service was selected for its free registration and simplicity of use: it only requires an API key.

Challenges

The initial challenge involved figuring out an effective way to send the report to the IRC chat. The program has a short 10-second buffer to ensure the message is sent properly. However, with reports that could be tens or hundreds of lines long, this process can take a bit longer. The current solution is to send a brief report that merely indicates if the update was successful. After this, the bot will ask if a more detailed report is needed.

Future plans also involve setting up a local email relay using sendmail and postfix. However, this method comes with several challenges. For instance, only one MTA (mail transfer agent) can be installed, which must be reflected in the ebuild. Also, configuring an email relay on Linux systems typically involves more steps, which requires writing comprehensive documentation.

Plans for Week 9

This week, I plan to start working on the web app’s design and architecture layout.

At the same time, there are several code enhancements that need to be implemented. For instance, the current logger only covers the updater script, neglecting the parser, reporter, and notifier. Thus, it needs to be extended to cover all components of the program.

In terms of report formatting, the report is currently structured as a dictionary. However, it would be more beneficial to refactor it into a Python object, such as a dataclass.

Lastly, the way gentoo_update accepts its CLI flags could be improved. Currently, either y or n must be passed to the CLI, as in:

gentoo_update --update-mode full --read-logs y --read-news y

It looks a bit cumbersome, since if the flag is present then y is already implied.
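
A hypothetical presence-implies-yes interface could instead look like
this (a sketch of the proposed change, not the current CLI):

gentoo_update --update-mode full --read-logs --read-news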


Week 8 report on porting Gentoo packages to modern C

Gentoo Google Summer of Code (GSoC) July 23, 2023, 15:08

Hello all,
I’m here with my week 8 report on Modern C porting of Gentoo’s packages.

Testing environments are set. I now have three environments to test my
PRs on.
– GCC 13 with glibc
– Clang-16 with llvm profile
– Clang-16 with musl-llvm profile

Much of the credit goes to juippis, who gave me the instructions for
creating custom lxc images using gentoo stage-3 tarballs. This has helped
me immensely; I can now have a testing environment ready in only a couple
of minutes and keep untouched, clean environments at the ready.

Coming to my work, it has remained the same: I’ve picked up various
random bugs from the tracker and worked on them. But I’ve come to
the realization that my work isn’t just limited to c99 or c11 porting.
It’s a mix of c99 porting, using Clang-16 as the default compiler,
and perhaps using lld as the system linker as well, which of course I’m
very happy about.

Another thing that Sam brought up is that it’s always best to inform
him whether or not I’m sending patches upstream, because sending patches
upstream is in my initial proposal, and it’s often important because
the developers of the packages know their codebase better and can offer
more insights about what would be the best practice.

Coming next week, I plan to work more on reducing the bug count on the
tracker, mainly picking up bugs and sending patches for them.

I’ll also work with Sam and Joonas on my already-submitted patches, as
they have started to review my PRs, and take care to send patches
upstream whenever possible, as Sam mentioned.

Till then, see ya!


Week 6+7 Report, Automated Gentoo System Updater

Gentoo Google Summer of Code (GSoC) July 16, 2023, 18:49
Progress on Weeks 6 + 7

These 2 weeks were spent on the parser and the reporter. During this time, I’ve added many features to them, but there is still much more left to be done. Due to the limited time of GSoC, I will implement additional features after the program ends.

Here is a list of features that were implemented so far:

  • If the update was successful, the report will show:
    • updated package names
    • package versions in the format “old -> new”
    • USE flags of those packages
    • disk usage before and after the update
  • If the emerge pretend run has failed, the report will show:
    • the error type (for now only the ‘blocked packages’ error is supported)
    • error details (for a blocked package it will show the problematic packages)

And here are the errors that I plan to add support for in the future:

  • Mutually exclusive USE flags
  • Errors due to licenses
  • OOM
  • Not enough disk space
  • Network issues during the update

I also had a good idea about how to go about testing gentoo_update. Basically, I can set up a CI/CD pipeline that detects newly published stage3 Docker containers and, whenever a new container is detected, runs gentoo_update on it and checks the output. Eventually, it will run into some errors that I can then use to improve gentoo_update. The pipeline itself can be set up with Jenkins, for example. This idea is a bit out of the scope of my proposal, so I will work on it after GSoC 🙂

Challenges

While trying to find ways to generate errors in Portage I realized how hard it is to break Portage intentionally, and it’s almost impossible without deliberately creating faulty ebuilds and USE flags, which of course is a good thing!

So far I have only managed to test out the ‘blocked package’ error; here is how it was done (a sketch of the blocking ebuild follows the list):

  • Create a simple Bash script that prints out some ASCII art (prints an owl, in my case);
  • Set up a local repository, and add an ebuild for this script;
  • Install the script on the system;
  • Version bump the script, for example 0.1 -> 0.2;
  • Then add RDEPEND="!net-print/cups" to the ebuild, which will raise an error if cups is installed. cups is just a package that was installed on my system, any other package will do;
  • Run an @world update and watch Portage start to throw errors 🙂
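
A minimal sketch of what the blocking ebuild could look like (the
package name and contents are hypothetical; the RDEPEND blocker is the
only essential part):

# owl-0.2.ebuild (hypothetical)
EAPI=8

DESCRIPTION="Prints an ASCII-art owl"
LICENSE="MIT"
SLOT="0"
KEYWORDS="~amd64"
S="${WORKDIR}"

# Deliberately block an installed package to trigger the
# 'blocked packages' error during the update.
RDEPEND="!net-print/cups"

src_install() {
    dobin "${FILESDIR}/owl.sh"
}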

What sounds like a couple of simple steps actually took me about 2 days to figure out… Although challenging, it is actually very fun to find ways to break things 🙂

Plans for Week 8

Week 8 will be dedicated to writing code to send reports via emails and IRC chats. But before that, I need to do some more work to improve integration between the updater, parser and reporter.

Ideally, I also need to spend some more time on error catching and improving overall stability of gentoo_update.


Week 7 report on porting Gentoo packages to modern C

Gentoo Google Summer of Code (GSoC) July 16, 2023, 15:14

Hello all.

First of all, I would like to share the good news of passing the mid-term
evaluation. My mentors have provided me some valuable advice that I
would like to incorporate into my work in the weeks ahead.

Coming to my work, as I said in my last report, I picked up where I left
off before week 6 and sent in some patches (none upstream, unfortunately).
This week I mainly worked with Juippis (my other mentor) on reviews of
my already submitted PRs. We came across some challenges while doing so,
namely reproducing a bug: juippis and sam_ were able to reproduce it,
but I couldn’t, most probably due to compiler-rt. I still have to send
in a proper fix for that bug. Which brings us to the second topic:
setting up a test environment. Juippis has an excellent guide on using
lxc containers for setting up a test environment.

So there’s that.

In the coming weeks, the priority will be setting up the test environment
with Juippis’s guide so we don’t have to face the aforementioned scenario
again.

Apart from that, I mainly want to do three things:
– stick to my proposal and work on -Wstrict-prototypes,
– work on bringing down the number of bugs on the tracker; there are
still quite a lot, and more keep getting added, and
– work more with my mentors on code/PR reviews.

Till then, see ya!


July 15 2023

ELisp ebuilds good practices

Maciej Barć (xgqt) July 15, 2023, 19:25
Check load path

Some ELisp package compilation failures are caused by not setting the load path correctly. This mostly happens when you compile source from a directory that is not the current working directory. For example:

elisp-compile elisp/*.el

In most cases you can cd or override the S variable to point it at the location where the ELisp source resides.

In other cases you can append the source directory to the load path:

BYTECOMPFLAGS="${BYTECOMPFLAGS} -L elisp" elisp-compile elisp/*.el

Do not rename auto-generated autoload file

elisp-make-autoload-file allows you to name the generated autoload file. For the sake of easier debugging and writing Gentoo SITEFILEs, please do not rename the generated file.

The name of that file should always be ${PN}-autoloads.el.

Use new elisp-enable-tests function

elisp-enable-tests sets up IUSE, RESTRICT, BDEPEND and the test runner function for running tests with the specified test runner.

The 1st (test-runner) argument must be one of:

  • buttercup — for buttercup provided via app-emacs/buttercup,
  • ert-runner — for ert-runner provided via app-emacs/ert-runner,
  • ert — for ERT, the built-in GNU Emacs test utility.

The 2nd argument is the directory where the tests are located; the leftover arguments are passed to the selected test runner.

Example:

EAPI=8

inherit elisp

# Other package settings ...

SITEFILE="50${PN}-gentoo.el"
DOCS=( README.md )

elisp-enable-tests buttercup test

Remove empty SITEFILEs

Recently a feature was added to elisp.eclass that causes the build process to generate the required SITEFILE with boilerplate code if it does not exist.

So if your SITEFILE looked like this:

(add-to-list 'load-path "@SITELISP@")

… then, you can just remove that file.

But remember to keep the SITEFILE variable inside your ebuild:

SITEFILE="50${PN}-gentoo.el"

Remove pkg.el files

The *-pkg.el files are useless to the Gentoo distribution model of Emacs Lisp packages and should be removed. It is as simple as adding this line to an ebuild:

ELISP_REMOVE="${PN}-pkg.el"

Beware that some packages will try to find their ${PN}-pkg.el file, but in most cases this will show up in failing package tests.

Use official repository

It is tedious to repackage Elpa tarballs, so use the official upstream even if you have to snapshot a specific commit.

To snapshot GitHub repos you would generally use this code:

# First check if we have the correct version to prevent
# autobumping package version without changing the commit.
[[ ${PV} == *_p20220325 ]] && COMMIT=65c496d3d1d1298345beb9845840067bffb2ffd8

# Use correct URL that supports snapshots.
SRC_URI="github.com/domtronn/${PN}/archive/${COMMIT}.tar.gz
    -> ${P}.tar.gz"

# Override the temporary build directory variable.
S="${WORKDIR}"/${PN}-${COMMIT}
Include live version support

We do not want to be worse than the Melpa unstable :D

So, why not allow the given package to be used live?

Even if you do not push the live package to the overlay, please include support for it.

if [[ ${PV} == *9999* ]] ; then
    inherit git-r3
    EGIT_REPO_URI="github.com/example/${PN}.git"
else
    SRC_URI="github.com/example/${PN}/archive/${PV}.tar.gz
        -> ${P}.tar.gz"
    KEYWORDS="~amd64 ~x86"
fi

Ask for tags

Git is good, and git tags are good. If upstream does not tag their package or just forgets to, kindly ask them to create a git tag when bumping Emacs package versions.


July 13 2023

Week 6 report on porting Gentoo packages to modern C

Gentoo Google Summer of Code (GSoC) July 13, 2023, 18:07

Hello all,

This week I couldn’t do much as I caught a bit of a cold and fell ill. But
I’m doing much better now and will begin working again starting this
week. I plan on making up for last week’s work in the coming week, and
in case there is still remaining work, I will make it up in the
extra/emergency week at the end. This was also the reason I could not update my blog last week.

So the plan for the coming week is to pick up where I left off and
finish any remaining work. Then I’ll start with
-Wdeprecated-non-prototype and -Wstrict-prototypes, which aligns with my
proposal timeline for weeks 7 to 10.

Our midterm evaluation opens this week (today, the 10th of July) and will
be open for the entire week. I’m super nervous and excited about it.
*fingers crossed*

Till then, see ya. Take care.


Week 5 report on porting Gentoo packages to modern C

Gentoo Google Summer of Code (GSoC) July 13, 2023, 18:06

I’m writing this report on the 13th of July, almost two weeks late. See the week 6 report for why; I had fallen a bit sick.

Hello all, this is my week 5’s report for my project “Porting Gentoo’s
packages to Modern C”.

First things first: we now have the MATE desktop and related packages
ported. Not only is it ported to modern C, it’s now compatible with
gettext-0.22, too [1]. So if you are using the llvm-musl or the llvm
profile you can use the MATE desktop.

While fixing the MATE settings-daemon I’ve learned two very valuable
lessons (thanks to my mentor Sam):
– Getting feedback from upstream devs is important.
– Casting variables in incompatible-function-pointer-type errors is
not always correct; it might only temporarily fix the problem or silence
the warning.
I’m going to keep these two points in mind for the next and upcoming
weeks.

Apart from the MATE work, I mostly adhered to my proposal timeline and
fixed more -Wimplicit-function-declaration bugs, [2][3] and more.

Strictly according to my proposal, the coming two weeks (weeks 6 and 7)
are to be focused on -Wdeprecated-non-prototype. But in my experience
so far there are not many bugs of this type. I’ll obviously keep an
eye out for them, but I’ll most likely be solving more
-Wimplicit-function-declaration or -Wincompatible-function-pointer-types
bugs, as they seem to dominate the bug list/tracker.

Our midterm evaluation is also coming up (it opens on the 10th of this
month), so I’m working towards that, mainly communicating with my mentors
on anything they expect of me or would like to see done before the
evaluation. Needless to say, I’m super excited about that.

Till then, see ya!

[1]: https://github.com/mate-desktop/mate-panel/pull/1375
[2]: https://github.com/gentoo/gentoo/pull/31671
[3]: https://github.com/gentoo/gentoo/pull/31670


July 04 2023

Week 5 – Modernization of Portage

Gentoo Google Summer of Code (GSoC) July 04, 2023, 3:56
Week 5 – Modernization of Portage

Hey everyone, this week was a fun and satisfying one. Let’s get into it.

Context

I wanted to work on the dependency resolution system of portage. It is a scary codebase, so Sam suggested I start with bugs related to the dependency resolution system. We decided on bug 528836. The bug is relatively simple (though it took me a relatively long time to understand). In gentoo, there are virtual packages: if multiple packages can provide the same functionality/library, then there is a virtual package that depends on either one of them. Any package needing that functionality/library can depend on the virtual package and not worry about the specifics. The problem in this bug is that a package has two dependencies (let’s say), and one depends on a package while the other depends on the corresponding virtual package. Portage then tries to emerge both sides of the virtual package, which leads to conflicts. Ideally, the ebuild maintainers should have made both dependencies depend on the virtual package rather than the actual one, but portage should nonetheless have been able to figure it out. The first task was to reproduce the bug in a gentoo system.

There were several hurdles along the way. The bug was very old (from 2014) and is not reproducible in the current state of portage or the ebuild repository. Luckily, we got an old stage3 from Andrey, one of the mentors. Gentoo moved from CVS to git only recently, so we had to graft the historical gentoo repo into the current repo to restore it to an older state. The major hurdle was my inexperience with ebuilds. Though I have been using gentoo for a few years, I never bothered to create ebuilds or study them, so when I had to look through the ebuilds to figure out what was going on, it was a bit overwhelming. After reading through pages and pages of man pages, the PMS and the gentoo wiki, and with a lot of help from Sam, we were able to reproduce the bug.

Writing a test for the bug

Sam suggested that I write a test for portage that would expose this behaviour. It had its own hurdles, but finally we were able to do it. It is not integrated into portage, but it can be found here. I really want to thank Sam again for his patience towards me. I sometimes ask the silliest of things, but he explains them with a smile. I could never be more thankful.

The next step will be trying to fix the bug in portage or declaring the ebuild to be invalid (which is reasonable). We will also work towards integrating the test into portage. Sam will have to decide on that. I will keep you posted on whatever happens.

Unreachable code

I also sent a pull request removing some unreachable/legacy code. At the time of submitting the pull request, GitHub’s pypy37 runner (one of the targets portage is tested against) had some issues. The tests will be rerun and the commit will be merged into master soon.

Next week

The mid-term evaluations are coming up. The next week will go towards getting ready for that, fixing the above bug, and maybe a few more type annotations. I’ll see you all next week.


Weekly report 5, LLVM libc

Gentoo Google Summer of Code (GSoC) July 04, 2023, 1:37

Hey! This week I’ve spent most of my time figuring out how to bootstrap
an LLVM cross compiler toolchain targeting a hosted Linux environment. I
have also resolved the wint_t issue from last week. Both of these things
took way longer than expected, but I also learned a lot more than
expected, so it was worth it.

I’ll start by discussing the LLVM cross compiler setup. My initial
idea for bootstrapping a toolchain was to simply specify LLVM_TARGETS
for the target architecture when building LLVM, then compile compiler-rt
for the target triple, and then the libc. That is indeed the right
order, but the official cross compilation instructions tell you to
specify a sysroot where the libc is already built, and that’s not
possible when bootstrapping from scratch.

As the compiler-rt cross compilation documentation only tells you to use
an already set up sysroot, which I didn’t have, I had to feel my way
forward. This actually took me a few days, and I tried things like
bootstrapping with a barebones build of compiler-rt, mixing in some GCC
pieces, and a lot of hacks. I then studied mussel for a while, until I
found out about headers-only “builds” for glibc and musl. It turns out
that the only thing compiler-rt needs the sysroot for is the libc
headers, and those can be generated without a functioning compiler for
both musl and glibc. This is done by setting CC=true to pass all the
configure tests, and then running ‘make install-headers’ (for musl) into
a temporary install directory to generate the headers needed for
bootstrapping compiler-rt:

# CC=true makes every configure compiler check trivially succeed
export CC=true
./configure \
--target=${CTARGET} \
--prefix="${MUSL_HEADERS}/usr" \
--syslibdir="${MUSL_HEADERS}/lib" \
--disable-gcc-wrapper
# install only the headers into the temporary prefix
make install-headers

After this is done, you can point the compiler-rt build at the temporary
musl install directory (with -nostdinc so the host headers are not
picked up) via the following CMake options:

-DCMAKE_ASM_COMPILER_TARGET="${CTARGET}"
-DCMAKE_C_COMPILER_TARGET="${CTARGET}"
-DCMAKE_C_COMPILER_WORKS=1
-DCMAKE_CXX_COMPILER_WORKS=1
-DCMAKE_C_FLAGS="--target=${CTARGET} -isystem ${MUSL_HEADERS}/usr/include -nostdinc -v"
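
Put together, the compiler-rt configure step looks roughly like this; the source and build paths, the Ninja generator, and the COMPILER_RT_DEFAULT_TARGET_ONLY option are my own sketch here rather than a verbatim quote of the script:

# configure compiler-rt against the temporary musl headers (paths illustrative)
cmake -S llvm-project/compiler-rt -B "${COMPILER_RT_BUILDDIR}" -G Ninja \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_ASM_COMPILER_TARGET="${CTARGET}" \
    -DCMAKE_C_COMPILER_TARGET="${CTARGET}" \
    -DCMAKE_C_COMPILER_WORKS=1 \
    -DCMAKE_CXX_COMPILER_WORKS=1 \
    -DCOMPILER_RT_DEFAULT_TARGET_ONLY=ON \
    -DCMAKE_C_FLAGS="--target=${CTARGET} -isystem ${MUSL_HEADERS}/usr/include -nostdinc"
ninja -C "${COMPILER_RT_BUILDDIR}"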

After this is done you can export
LIBCC="${COMPILER_RT_BUILDDIR}"/lib/linux/libclang_rt.builtins-aarch64.a
to the musl build to use the previously built compiler-rt builtins for
the actual libc build.
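
Concretely, the musl build step can then look something like this (a sketch rather than the exact invocation; musl’s configure honors the CC and LIBCC variables):

# use the freshly built compiler-rt builtins instead of libgcc
export LIBCC="${COMPILER_RT_BUILDDIR}/lib/linux/libclang_rt.builtins-aarch64.a"
CC="clang --target=${CTARGET}" ./configure --target="${CTARGET}" --prefix=/usr
make -j"$(nproc)"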

To then build actual binaries targeting the newly built libc you can do something like this:

clang --target="${CTARGET}" main.c -c -nostdinc -nostdlib -I"${MUSL_HEADERS}"/usr/include -v

ld.lld -static main.o \
"${COMPILER_RT_BUILDDIR}"/lib/linux/libclang_rt.builtins-aarch64.a \
"${MUSLLIB}"/crti.o "${MUSLLIB}"/crt1.o "${MUSLLIB}"/crtn.o "${MUSLLIB}"/libc.a

Running the binary with qemu-user:
$ cat /etc/portage/package.use/qemu
> app-emulation/qemu static-user QEMU_USER_TARGETS: aarch64
$ emerge qemu
$ qemu-aarch64 a.out
> hello, world

Afterwards it feels pretty obvious that the headers were needed, and I
could probably have figured it out a lot sooner by, for example,
examining crossdev a bit closer. But I am happy I played with this,
since I learned things like what the different runtime libraries do,
what’s needed to link a binary, and a lot more. Here’s a complete script
that does everything:
gist.
Next I will integrate this into crossdev. Another thing I need to think
about is how to do a headers-only install of LLVM libc. Currently the
headers get generated with libc-hdrgen and installed with the
install-libc target. This can probably be done by packaging a standalone
libc-hdrgen binary and using that for bootstrapping. I could also
temporarily “cheat” and do a compiler-rt+libc build to get going.

Next, I also figured out what the wint_t problem is that occurs when
building LLVM libc in fullbuild mode on a musl system (see last week’s
report), and why. The problem is that on a musl system, /usr/include
will be first in the include path, regardless of
CFLAGS="-ffreestanding" (for C++ the libc headers come after the
standard C++ headers and are #include_next’ed, so there is no difference
there). I thought at first that this was a bug, since you don’t want to
target an environment where the libc is available (a hosted environment)
when building in freestanding mode. However, after asking in the #musl
IRC channel, this is actually fine, since the musl headers respect the
__STDC_HOSTED__ macro that gets set when using -ffreestanding, and there
is a clear standard specifying what should be available in a
freestanding environment.
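
The include search order is easy to inspect, by the way: Clang prints it on stderr when given -v. A quick sketch:

# show the '#include <...> search starts here:' list Clang actually uses
echo | clang -ffreestanding -E -v -x c - >/dev/null

On the affected musl system, /usr/include shows up in that list before the Clang resource directory.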

The problem arises because LLVM libc assumes that the Clang headers will
be used when passing -ffreestanding, and therefore relies on Clang
header internals, specifically the __need_wint_t macro for stddef.h,
which is in no way standardized and is only an implementation
detail. My thought here was that, instead of relying on
CFLAGS="-ffreestanding" to pull in the Clang headers, we should figure
out another way to force the Clang headers through the build
system. Another way to solve this would be to rely on musl internals as
well (__NEED_wint_t for stddef.h).

After discussing this, we agreed to first actually get the libc built,
and then decide on a strategy once we know how many times similar issues
pop up. If there are only a few instances of this, then more #defines
are fine; otherwise we could do something like the gcc buildbot
target. My only worry with this is that it will keep biting us in the
ass as more things get added.
github.com/llvm/llvm-project/issues/63510

Another thing worth noting is that my ‘USE=emacs llvm-common’ PR
(github.com/gentoo/gentoo/pull/31635) inspired a new
elisp-common.eclass function called elisp-make-site-file
(github.com/gentoo/gentoo/commit/a4e8704d22916a96725e0ef819d912ae82270d28),
because mgorny thought that my sitefiles were a waste of inodes :D.
I also got my __unix__->__linux__ CL merged into LLVM. I do, however,
have some worries that this could’ve broken some things on macOS, as
seen in my comment:

> done! I think there should be something addressing pthread_once_t and
> once_flag for other Unix platforms though. Both of these would've
> previously, before this commit, been valid on macOS, as __unix__ is
> defined and __futex_word is just an aligned 32 bit int. No internal
> Linux headers were used here before that would've caused an error.

reviews.llvm.org/D153729

Next week I will try to make Crossdev able to use LLVM/Clang by
integrating the things I did this week.

July 02 2023

Week 5 Report, Automated Gentoo System Updater

Gentoo Google Summer of Code (GSoC) July 02, 2023, 21:51
Progress on Week 5

The week started off with some feedback from the community in the forums. Here are some nice ideas the community suggested implementing:

  1. Fall back to the latest version of a package if an error is encountered during an update;
  2. Add an option to control Portage niceness;
  3. Estimate update time;
  4. Notify users about obsolete USE flags;
  5. Think of a way to make the updater work on binpkg servers.

I will attempt to do 1-4 within the duration of Google Summer of Code.

There were also some suggestions on improving the workflow and many different opinions were voiced. The discussion is still ongoing, but it has already yielded some positive results.

I’ve made some progress on the parser: it can now detect whether an update ended in an error or not. The log format and general output flow were modified to simplify parsing. The most noticeable change was the way updater.sh is launched: before, the whole script (~250 lines of Bash) was launched all at once, and now each function from the script is launched separately. An additional flag (--report) was added to invoke the parser; it can now parse the last log from the log directory.
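
Roughly, the per-function launching looks like the following; the function and log names are illustrative, and it assumes updater.sh only defines functions at the top level:

# load the functions from updater.sh and run a single step,
# capturing that step's output into its own log file
bash -c 'source ./updater.sh && run_sync' 2>&1 | tee "${LOG_DIR}/run_sync.log"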

Furthermore, I spent some time organizing testing a bit better. I updated container versions and created a better naming convention for my containers, so as not to get lost in them. The 08-05-2023 desktop image on OpenRC is being used to test glsa-check, and the most recent basic OpenRC image is used to test the updating functionality.

Challenges

The parser has turned out to be much harder than I anticipated. First of all, I had to make changes to both the Python and the Bash code to produce simpler log output, which reduced the number of if/else statements in the parser.

Secondly, there were some motivation issues. It was a bit hard to focus on the parser, because a much better approach would be to add machine-readable output to Portage instead of parsing logs. I talked to my mentor about it, and we decided to continue working on the parser, mainly because modifying Portage in any significant way would take way too much time.

Plans for Week 6

In week 6 the plan is to add error parsing and comprehension to the parser. This means I will have to find different ways to make Portage break, and then try to make the parser understand the errors that have occurred. Should be really fun!

After that is done, I can focus on using this information to create nice-looking update reports.

Weekly report 4, LLVM libc

Gentoo Google Summer of Code (GSoC) June 27, 2023, 4:46

Hello! This is a combined report for weeks 3 and 4.

In these two weeks I’ve fixed several issues in LLVM libc, but quite a
lot of time has also been spent purely on learning things. I will start
by going over what I’ve learned, and then refer to the related issues.

To start with, I have gotten quite comfortable with CVise: how to use
it, and general tricks for writing the test script that determines
whether the issue is still there after reducing a source file. For
example, I had an issue where the print format macro PRId64 was not
defined in LLVM libc.

This caused an error that looked like this:

/home/cat/c/llvm-project/compiler-rt/lib/scudo/standalone/timing.h:182:22: error: expected ')'
Str.append("%14" PRId64 ".%" PRId64 "(ns) %-11s", Integral, Fraction, " ");

So my first attempt at reducing was to grep for “expected ‘)'”. This
went on to reduce the source file to simply: “(“. Maybe not the most
interesting result, but it was the “aha moment” for me with regards to
CVise, because what it did with the test script became clear.
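
For reference, a CVise “interestingness test” is just a script that exits 0 as long as the bug is still present. A minimal sketch matching the description above (flags simplified):

#!/bin/sh
# exit 0 iff the reduced file still triggers the same diagnostic
clang++ -c timing.cpp 2>&1 | grep -q "expected ')'"

CVise is then invoked as ‘cvise ./test.sh timing.cpp’ and keeps mutating the file for as long as the script keeps succeeding, which is exactly why grepping for a generic diagnostic reduced everything down to a lone “(”.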

To actually fix this issue I filed a bug
(github.com/llvm/llvm-project/issues/63317) and was told by a
compiler-rt developer that the timing.cpp file is only used for
performance evaluation. So the temporary fix I made was to exclude it
from the build, by checking for LLVM_LIBC_INCLUDE_SCUDO=ON in CMake,
until the print format macros are added to LLVM libc.

reviews.llvm.org/D152979
github.com/llvm/llvm-project/commit/63eb7c4e6620279c63bd42d13177a94928cabb3c

The next thing I’ve learned a lot about is C++ and standard C header
interoperability, or “include hell”. I learned about the differences
between C++ standard headers like “cwhatever” and “whatever.h”, what
#include_next does, and that compilers ship their own header files,
like stddef.h and inttypes.h.

I first ran into this when pulling new commits from master and
rebuilding LLVM libc, thinking the errors were related to that. Weirdly
enough, the original error just went away and I couldn’t reproduce it
at all. But I quickly ran into a similar issue when compiling LLVM
libc in fullbuild mode on an llvm/musl system.

This time it was an error about wint_t not being defined:

> /llvm-project/build-libc-full/projects/libc/include/wchar.h:21:11: error: unknown type name 'wint_t'
> int wctob(wint_t) __NOEXCEPT;

The issue here arises because LLVM libc’s llvm-libc-types/wint_t.h
gets the wint_t type using:

> #define __need_wint_t
> #include <stddef.h>
> #undef __need_wint_t

This depends on internal behaviour of the stddef.h header. Because
this is C++, libc++’s stddef.h gets included first, but that header
#include_next’s the second stddef.h in the include search path.

glibc uses __need_wint_t to make stddef.h define wint_t, while musl
uses __NEED_wint_t. No one is wrong here, as these are libc internals
that should not be used by end users; instead, something like wchar.h
should be included. However, as this is a libc implementation too, it
does not make sense to include all of that stuff, so something else
must be done. I then grepped the whole llvm checkout for stddef.h and
realized that Clang ships its own stddef.h too. This header, like
glibc’s, uses __need_wint_t to define wint_t, which is exactly what I
want. I posted a bug report and was told that the internal Clang
headers are to be used, not the system libc’s headers, because of
issues like these:
github.com/llvm/llvm-project/issues/63510

However, somehow /usr/include is higher up in the include order than
/usr/lib/clang/* even when using -ffreestanding, so I assume this is
actually a bug in the Gentoo llvm/musl stage.

Another thing I have worked on is replacing some __unix__ ifdef
checks in LLVM libc with __linux__. Looking through the source code,
there are quite a lot of places where these are mixed up, bizarrely
enough. The most obvious ones are a __unix__ check followed by an
#include of a Linux-only header. This is “cargo cult”: it was once done
that way and has stuck around for no reason.

I have fixed this here: reviews.llvm.org/D153729#4447435, but I will
revisit it, because there’s a chance that macOS users could have used
the typedefs pthread_once_t and once_flag, even though the underlying
type __futex_word was supposed to be Linux-only, futexes being Linux
kernel specific. This would previously not have errored out on macOS
(__APPLE__), since __unix__ is defined there and __futex_word is just
defined as an aligned 32-bit uint, with no unconditional kernel headers
used that would have broken the build.

I will therefore go back and define these for macOS later.

I have also done some work on upstreaming things needed for Python
into LLVM libc, instead of just mashing everything into my Python
source dir. The first one is the POSIX extension fdopen().

As fopen is already implemented, the hard part was not the function
itself but figuring out where everything in LLVM libc is
placed. Apart from the obvious declaration in the internal headers
and the corresponding source file, I also needed to make sure it was
usable in the libc. In total I needed to edit 7 files, among them the
TableGen specifications, config/$arch/entrypoints.txt, the libc’s
exposed internal header file, and of course CMakeLists.txt.
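
For example, exposing the new function as an entrypoint comes down to one extra line in the per-architecture list; the path and the exact spelling below are illustrative of the convention rather than a quote from my patch:

# config/$arch/entrypoints.txt (excerpt)
libc.src.stdio.fdopen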

This is not upstreamed yet, but I am working on it here:
reviews.llvm.org/D153396.

The other thing I want to get upstreamed is the limits.h header, in my
case needed for SSIZE_MAX. I have successfully made a tiny version in
my libc tree that exposes some macros, and I will try to upstream what
I have and then work on things one by one. Similarly, the hard part
here was actually getting the header and macros exposed in the libc by
editing build system code and specification files. I could have
temporarily just jammed in a limits.h header file, but I think it’s
important to get to know how LLVM libc does things / “how the
boilerplate works” early in my project.

That’s all the big things. I also continued work on Python and fixed
some small stuff, like a typo fix
(reviews.llvm.org/rGc32ba7d5e00869de05d798ec8eb791bd1d7fb585),
adding Emacs support in llvm-common
(github.com/gentoo/gentoo/pull/31635),
and other “Gentoo but not really GSoC” work:
github.com/gentoo/gentoo/pull/31560 (license fix, soju).
github.com/gentoo/gentoo/pull/30933 (new package, senpai).

Next week I will work on getting Clang/LLVM supported in
Crossdev. This will be done by first making sure that the host’s LLVM
toolchain supports the target architecture via the LLVM_TARGETS
USE flag. Currently Clang on Gentoo can compile things like the
kernel, but anything that relies on runtime libraries, like a libc,
fails due to compiler-rt not being compiled for the target triple, so
I will also make sure that Crossdev compiles compiler-rt for the
specified target triple.
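
A quick sanity check for the first part is to look at which LLVM_TARGETS the installed LLVM was built with; the command below uses gentoolkit’s equery and is illustrative:

# list the llvm_targets_* USE flags enabled on the host LLVM
equery uses sys-devel/llvm | grep -i llvm_targets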

--
catcream

Week 4 – Modernization of Portage

Gentoo Google Summer of Code (GSoC) June 27, 2023, 2:05

Another week of GSoC. The days run really fast. This again was a productive week: the first half went towards understanding portage’s unit tests, and the second half towards solving a bug.

Testing in portage

Tests are one of the most important components of any software, and portage, being no exception, employs unit tests. Until now, I had not bothered to look into them: we have a bash script, runtests; I run it and watch for things to succeed. Sam felt that I needed a bit more understanding of the tests, for various reasons, so I started looking into them.

Portage’s tests are single-threaded, and running all of them takes between 300 and 450 seconds, depending on the speed of the machine. It would be nice to have the unit tests run in parallel, but there are several caveats. For one, portage needs to virtualize various things, including runtime parameters and a filesystem (to test the changes portage makes). Sharing one virtualized environment among many threads did not seem practical, so a new virtual environment has to be created for each thread, and the threading has to happen outside the virtual environment creation phase.

So I added the functionality to start and stop testing at the nth test file. With this in place, the plan is to count the number of tests, split them into groups, and assign each group to a separate thread. This adds a bit of overhead, as a virtual environment has to be created for each thread, but it will make the tests faster. The implementation can be found in this pull request. It is not merged yet, because the long term goal is to get rid of runtests and exclusively use standardized Python tools like pytest-xdist for running tests in parallel (see the one-liner below). There is also ongoing work to make portage’s tests run properly with pytest-xdist. I am not sure whether my work will block that. It should not, but still, we are holding the merge.
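
For reference, once the tests cooperate with pytest, the pytest-xdist parallelism we are aiming for is a one-liner (the test path is illustrative):

# run the suite across four worker processes
python -m pytest -n 4 lib/portage/tests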

Bug 528836

From day one, I wanted to work on the dependency resolution system of portage. But that is obviously not a simple job, so Sam advised getting familiar with the algorithm by fixing bugs related to it. Sam picked a bug for me to fix: 528836. The problem is that two conflicting packages are pulled in when only one should have been. The bug was not reproducible with the current state of portage and the ebuild repository. There were a few hurdles along the way, but finally we were able to reproduce the bug by restoring portage and the ebuild repository to their 2017 state.

We are not yet sure whether the bug is due to portage or to some misconfiguration in the ebuild repository. We will continue to work on it, and I will keep you posted.

Next week’s plan

The next week’s plan is to write tests for this bug to make sure it doesn’t happen again. We will also try to squeeze in a few more quality-of-life changes if time permits.
