Skip to content

Commit

Permalink
feat: distinguish failure to start during upgrade
Browse files Browse the repository at this point in the history
It's possible for a service to be upgraded but then not subsequently start. In this case, it has
still been upgraded to a new version. It's worth making a distinction to the user between an actual
error in the upgrade process, or a failure to start, because in the latter case, they do actually
have an upgrade. They can then take action to try and start their services again.

As part of this change, the start process attempts to find whether the service process did indeed
start, because you don't always seem to get errors back from the service infrastructure. We also
make sure that we return an error if there was a failure with the upgrade process for any services.
This is necessary for visibility on our own deploy process.
  • Loading branch information
jacderida authored and RolandSherwin committed Apr 11, 2024
1 parent 7c5d931 commit 592ac51
Show file tree
Hide file tree
Showing 9 changed files with 289 additions and 23 deletions.
4 changes: 2 additions & 2 deletions sn_node_manager/src/add_services/tests.rs
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ use sn_service_management::{
use std::{
ffi::OsString,
net::{IpAddr, Ipv4Addr, SocketAddr},
path::PathBuf,
path::{Path, PathBuf},
str::FromStr,
};

Expand All @@ -54,7 +54,7 @@ mock! {
fn create_service_user(&self, username: &str) -> ServiceControlResult<()>;
fn get_available_port(&self) -> ServiceControlResult<u16>;
fn install(&self, install_ctx: ServiceInstallCtx) -> ServiceControlResult<()>;
fn get_process_pid(&self, name: &str) -> ServiceControlResult<u32>;
fn get_process_pid(&self, bin_path: &Path) -> ServiceControlResult<u32>;
fn is_service_process_running(&self, pid: u32) -> bool;
fn start(&self, service_name: &str) -> ServiceControlResult<()>;
fn stop(&self, service_name: &str) -> ServiceControlResult<()>;
Expand Down
7 changes: 7 additions & 0 deletions sn_node_manager/src/cmd/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,13 @@ pub fn print_upgrade_summary(upgrade_summary: Vec<(String, UpgradeResult)>) {
service_name
);
}
UpgradeResult::UpgradedButNotStarted(previous_version, new_version, _) => {
println!(
"{} {} was upgraded from {previous_version} to {new_version} but it did not start",
"✕".red(),
service_name
);
}
UpgradeResult::Forced(previous_version, target_version) => {
println!(
"{} Forced {} version change from {previous_version} to {target_version}.",
Expand Down
15 changes: 12 additions & 3 deletions sn_node_manager/src/cmd/node.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ use crate::{
helpers::{download_and_extract_release, get_bin_version},
refresh_node_registry, status_report, ServiceManager, VerbosityLevel,
};
use color_eyre::{eyre::eyre, Result};
use color_eyre::{eyre::eyre, Help, Result};
use colored::Colorize;
use libp2p_identity::PeerId;
use semver::Version;
Expand Down Expand Up @@ -358,9 +358,18 @@ pub async fn upgrade(
}
}

print_upgrade_summary(upgrade_summary);

node_registry.save()?;
print_upgrade_summary(upgrade_summary.clone());

if upgrade_summary.iter().any(|(_, r)| {
matches!(r, UpgradeResult::Error(_))
|| matches!(r, UpgradeResult::UpgradedButNotStarted(_, _, _))
}) {
return Err(eyre!("There was a problem upgrading one or more nodes").suggestion(
"For any services that were upgraded but did not start, you can attempt to start them \
again using the 'start' command."));
}

Ok(())
}

Expand Down
Loading

0 comments on commit 592ac51

Please sign in to comment.