Create focal -> noble upgrade script #7332
Comments
One question I've been debating is whether to write the script in Rust or Python. The pros for Rust are pretty clear: this is a place where we want solid error handling and recovery. The cons are that I think it'll be a little harder to get review for and add a bit more friction for others to contribute. My plan is to start sketching out the script in Python and then see if it can do all the error handling in a not-crazy way. I expect any Python<-->Rust porting to be trivial. |
This is doable on the mon server but going to be an issue on the app server. |
Doesn't OSSEC have a message threshold anyway? Given this is a one-time event I think you could probably do without the alert level change. |
Yes, after it hits some limit it stops sending messages and just queues them up...which means there's a backlog of like 1000 emails and any new alerts (e.g. using the send OSSEC test button) don't get sent. When I upgraded the mon server I got ~27 emails before it stopped sending new ones. The queue seems to be stored in memory though, because I restarted the ossec-server service and then it dropped everything and started sending my newly triggered OSSEC test alerts. I'll put it into the nice to have bucket for now and when we do test upgrades we can see how bad the impact is and decide if we want something else. |
Now that I've started writing the code, there's actually a bigger issue: we are uninstalling Python 3.8 and installing Python 3.12 during the migration. If something fails midway through, we could easily have a broken Python installation. A statically compiled Rust binary will avoid all of that. |
For reference: https://github.com/freedomofpress/securedrop/blob/6f5ef9e69fb1ac87ce4414e92bd1481347f11795/securedrop/debian/config/usr/bin/securedrop-noble-migration.py is what I had sketched out in Python. I'm going to redo that all in Rust now. |
The script is split into various stages where progress is tracked on-disk. The script is able to resume where it was at any point, and needs to, given multiple reboots in the middle. Fixes #7332.
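A minimal sketch of how on-disk stage tracking with resume might look, using only the Rust standard library. The stage names, the plain-text state format, and the function names are all hypothetical and are not taken from upgrade.rs; the point is only that a stage is recorded after its work succeeds, so a run that is interrupted (or rebooted) picks up at the first unfinished stage.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stage names; the real script's stages may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    NotStarted,
    Prepared,
    Upgraded,
    Rebooted,
    Done,
}

impl Stage {
    fn as_str(self) -> &'static str {
        match self {
            Stage::NotStarted => "not-started",
            Stage::Prepared => "prepared",
            Stage::Upgraded => "upgraded",
            Stage::Rebooted => "rebooted",
            Stage::Done => "done",
        }
    }

    fn parse(text: &str) -> Option<Stage> {
        match text {
            "not-started" => Some(Stage::NotStarted),
            "prepared" => Some(Stage::Prepared),
            "upgraded" => Some(Stage::Upgraded),
            "rebooted" => Some(Stage::Rebooted),
            "done" => Some(Stage::Done),
            _ => None,
        }
    }

    /// The stage that follows this one, or None once everything is finished.
    fn next(self) -> Option<Stage> {
        match self {
            Stage::NotStarted => Some(Stage::Prepared),
            Stage::Prepared => Some(Stage::Upgraded),
            Stage::Upgraded => Some(Stage::Rebooted),
            Stage::Rebooted => Some(Stage::Done),
            Stage::Done => None,
        }
    }
}

/// Read the last completed stage; a missing file means nothing has run yet.
pub fn load_stage(path: &Path) -> io::Result<Stage> {
    match fs::read_to_string(path) {
        Ok(text) => Stage::parse(text.trim())
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "unknown stage")),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(Stage::NotStarted),
        Err(e) => Err(e),
    }
}

/// Write to a temporary file and rename it over the state file, so a crash
/// mid-write does not leave a truncated state file behind. A production
/// version would likely also fsync; this sketch does not.
pub fn save_stage(path: &Path, stage: Stage) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, stage.as_str())?;
    fs::rename(&tmp, path)
}

/// Do the work for the next stage and record it only if the work succeeded.
/// Returns Ok(None) when there is nothing left to do.
pub fn run_next<F>(path: &Path, mut work: F) -> io::Result<Option<Stage>>
where
    F: FnMut(Stage) -> io::Result<()>,
{
    let next = match load_stage(path)?.next() {
        Some(stage) => stage,
        None => return Ok(None),
    };
    work(next)?;
    save_stage(path, next)?;
    Ok(Some(next))
}
```

Calling `run_next` repeatedly (for example once per invocation of the script) walks through the stages one at a time. A failed stage leaves the state file untouched, so the same stage is retried on the next call.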
And here's the Rust port: https://github.com/freedomofpress/securedrop/blob/94b84b7894d1bb6f21e93cd61fda3f793363dba8/noble-migration/src/bin/upgrade.rs It's still very basic with lots of FIXMEs inline, but you can see the rough state machine and how it'll handle reboots, etc. Next is:
|
We should also shut down apache during the upgrade. |
The script is split into various stages where progress is tracked on-disk. The script is able to resume where it was at any point, and needs to, given multiple reboots in the middle. The new noble-upgrade.json file shipped in the securedrop-config package is used to control the upgrade process. Fixes #7332.
In the upgrade-script branch I've created a noble-migration.json file that contains the automated upgrade conditions:
{
  "app": {
    "enabled": false,
    "bucket": 0
  },
  "mon": {
    "enabled": false,
    "bucket": 0
  }
}
You can see at securedrop/noble-migration/src/bin/upgrade.rs Lines 421 to 444 in cb3731f
One thing I'm not sure of is how the script gets manually started by admins. My current thinking is that we have them edit (via ansible playbook) this JSON file. Then it'll get overridden by the new noble securedrop-config package. So maybe if we've already started the upgrade, we ignore this file and continue upgrading.
I have not tested the script yet but I think we're at the point where we're ready to do so. My plan is to add this on as an extra step to the existing focal staging job. This will mean writing an ansible playbook for molecule to execute, and it can be the same thing (hopefully) that
One "gotcha" I hit was that once we start the script, at some point it'll reboot of its own accord, and ansible will lose the connection and fail. I think we can do something like https://www.jeffgeerling.com/blog/2018/reboot-and-wait-reboot-complete-ansible-playbook to handle this. |
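To make the "ignore this file once the upgrade has started" idea concrete, here is a rough sketch of the decision in Rust. It is not the code from upgrade.rs. The `HostConfig` struct mirrors the `enabled` and `bucket` fields of one entry in the JSON above, but the meaning of `bucket` is an assumption made for illustration: each host is imagined to have its own bucket number and to be eligible once the configured bucket reaches it. The `already_started` flag stands in for whatever on-disk state records that an upgrade is in progress.

```rust
/// One entry ("app" or "mon") from the JSON file, already parsed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HostConfig {
    pub enabled: bool,
    pub bucket: u32,
}

/// Decide whether the upgrade should run on this host.
///
/// ASSUMPTION: `host_bucket` and the `<=` comparison are hypothetical; the
/// thread does not say how `bucket` is interpreted.
pub fn should_run(config: HostConfig, host_bucket: u32, already_started: bool) -> bool {
    // Once an upgrade is in progress the config is ignored, because the noble
    // securedrop-config package may have replaced the file mid-upgrade.
    if already_started {
        return true;
    }
    config.enabled && host_bucket <= config.bucket
}
```

With the values shown above (`enabled: false`), `should_run` returns false for every host that has not already started, which matches the intent that nothing upgrades until the file is changed.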
The script is split into various stages where progress is tracked on-disk. The script is able to resume where it was at any point, and needs to, given multiple reboots in the middle. Given that we want to invoke the check script during the upgrade path, most of the code is moved into a common lib.rs that can be imported by both check.rs and upgrade.rs. The new noble-upgrade.json file shipped in the securedrop-config package is used to control the upgrade process. Fixes #7332.
For a manual upgrade, we want to:
Ideally during the wait phase we'd show some progress output, but from what I can tell that seems to be the only thing we can't do using ansible. |
The script is split into various stages where progress is tracked on-disk. The script is able to resume where it was at any point, and needs to, given multiple reboots in the middle. Given that we want to invoke the check script during the upgrade path, most of the code is moved into a common lib.rs that can be imported by both check.rs and upgrade.rs. The new noble-upgrade.json file shipped in the securedrop-config package is used to control the upgrade process. A systemd timer runs every 3 minutes to trigger the upgrade script, which in most cases will do nothing. We need to run it so frequently since this is how the script will be restarted after it pauses for a reboot. Fixes #7332.
Description
Part of #7211.
The workflow of the script will be:
/etc/securedrop-upgraded-from-focal marker file
apt-get update
apt-get upgrade --without-new-pkgs
apt-get full-upgrade
Everything the script does should be aggressively logged and each major step should trigger an OSSEC notification.
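As a rough illustration of running the apt-get steps above with a log line around each one, here is a sketch in Rust using only the standard library. The command lines are the three listed above, with no extra flags added. The function names, the log message wording, and the idea of passing the runner and logger in as closures are assumptions made for this example; the `log` closure is a stand-in for whatever logging and OSSEC notification mechanism the real script uses.

```rust
use std::io;
use std::process::Command;

/// The apt-get steps from the workflow, in order.
pub const APT_STEPS: [&[&str]; 3] = [
    &["apt-get", "update"],
    &["apt-get", "upgrade", "--without-new-pkgs"],
    &["apt-get", "full-upgrade"],
];

/// Run each step in order, logging before and after, and stop at the first
/// failure. `run` returns Ok(true) when the command exited successfully.
pub fn run_steps<R, L>(steps: &[&[&str]], mut run: R, mut log: L) -> Result<(), String>
where
    R: FnMut(&[&str]) -> io::Result<bool>,
    L: FnMut(&str),
{
    for &argv in steps {
        let line = argv.join(" ");
        log(&format!("starting: {line}"));
        match run(argv) {
            Ok(true) => log(&format!("finished: {line}")),
            Ok(false) => {
                log(&format!("failed: {line}"));
                return Err(format!("{line} exited with a failure status"));
            }
            Err(e) => {
                log(&format!("failed: {line}"));
                return Err(format!("could not run {line}: {e}"));
            }
        }
    }
    Ok(())
}

/// A runner that actually executes the command and reports its exit status.
/// Expects a non-empty argv.
pub fn run_command(argv: &[&str]) -> io::Result<bool> {
    let status = Command::new(argv[0]).args(&argv[1..]).status()?;
    Ok(status.success())
}
```

In real use this might be called as `run_steps(&APT_STEPS, run_command, |msg| eprintln!("{msg}"))`. Keeping the runner as a parameter means the ordering and logging can be exercised in tests without touching the package manager.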
I think we can trigger the script from a systemd timer/service. Each step should be recorded in some kind of state file so it can recover and resume if interrupted.
We should also leave behind a marker file like /etc/securedrop-upgraded-from-focal so if we detect some bug in the future, we can conditionally apply logic based on whether it's a fresh noble install or an upgrade.
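A small sketch of what leaving and later checking the marker could look like in Rust. The path is the one proposed above; the function names are hypothetical, and the path is taken as a parameter so the logic can be tried against a temporary file.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Marker path proposed in the description.
pub const MARKER: &str = "/etc/securedrop-upgraded-from-focal";

/// Create the marker if it does not exist; an existing marker is left as-is.
pub fn leave_marker(path: &Path) -> io::Result<()> {
    OpenOptions::new()
        .create(true)
        .append(true)
        .open(path)
        .map(|_| ())
}

/// True if this system was upgraded from focal rather than freshly installed.
pub fn upgraded_from_focal(path: &Path) -> bool {
    path.exists()
}
```

Future fixes could then branch on `upgraded_from_focal(Path::new(MARKER))` to apply logic only to upgraded systems.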