Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(package): detect dirty workspace manifest #15089

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/cargo/ops/cargo_package/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -387,7 +387,7 @@ fn prepare_archive(
let src_files = src.list_files(pkg)?;

// Check (git) repository state, getting the current commit hash.
let vcs_info = vcs::check_repo_state(pkg, &src_files, gctx, &opts)?;
let vcs_info = vcs::check_repo_state(pkg, &src_files, ws, &opts)?;

build_ar_list(ws, pkg, src_files, vcs_info)
}
Expand Down
161 changes: 154 additions & 7 deletions src/cargo/ops/cargo_package/vcs.rs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution here is

  • Retrieve workspace manifest from Git index.
  • Use the ws manifest from index to normalize package manifest.
  • Compare the difference between normalized tomls from Git index and
    from Git working directory.

This is a complex operation we are doing which I seriously question whether it is worth it vs something else like "its dirty if the workspace manifest is dirty" (without even bothering to check if the workspace manifest is different than the package manifest).

Similarly, we now include Cargo.lock and that could be dirty. Would we extend the same kind of "only if it affects this one package" style check for that case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I do think this fix is a bit too convoluted to merge.

"its dirty if the workspace manifest is dirty"

This is actually a good compromise as users always have --allow-dirty to bypass it. Guess we could start an FCP or discussion during the meeting.

Similarly, we now include Cargo.lock and that could be dirty. Would we extend the same kind of "only if it affects this one package" style check for that case?

Ooops. I would rather not. Also we have --lockfile-path, which makes the entire story worse 😞.

Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,13 @@ use std::path::PathBuf;

use anyhow::Context as _;
use cargo_util::paths;
use cargo_util_schemas::manifest::TomlManifest;
use serde::Serialize;
use tracing::debug;

use crate::core::Package;
use crate::core::Workspace;
use crate::core::WorkspaceRootConfig;
use crate::sources::PathEntry;
use crate::CargoResult;
use crate::GlobalContext;
Expand Down Expand Up @@ -44,9 +47,10 @@ pub struct GitVcsInfo {
pub fn check_repo_state(
p: &Package,
src_files: &[PathEntry],
gctx: &GlobalContext,
ws: &Workspace<'_>,
opts: &PackageOpts<'_>,
) -> CargoResult<Option<VcsInfo>> {
let gctx = ws.gctx();
let Ok(repo) = git2::Repository::discover(p.root()) else {
gctx.shell().verbose(|shell| {
shell.warn(format!("no (git) VCS found for `{}`", p.root().display()))
Expand Down Expand Up @@ -105,7 +109,7 @@ pub fn check_repo_state(
.and_then(|p| p.to_str())
.unwrap_or("")
.replace("\\", "/");
let Some(git) = git(p, gctx, src_files, &repo, &opts)? else {
let Some(git) = git(p, ws, src_files, &repo, &opts)? else {
// If the git repo lacks essensial field like `sha1`, and since this field exists from the beginning,
// then don't generate the corresponding file in order to maintain consistency with past behavior.
return Ok(None);
Expand Down Expand Up @@ -163,11 +167,12 @@ fn warn_symlink_checked_out_as_plain_text_file(
/// The real git status check starts from here.
fn git(
pkg: &Package,
gctx: &GlobalContext,
ws: &Workspace<'_>,
src_files: &[PathEntry],
repo: &git2::Repository,
opts: &PackageOpts<'_>,
) -> CargoResult<Option<GitVcsInfo>> {
let gctx = ws.gctx();
// This is a collection of any dirty or untracked files. This covers:
// - new/modified/deleted/renamed/type change (index or worktree)
// - untracked files (which are "new" worktree files)
Expand All @@ -189,7 +194,7 @@ fn git(
.iter()
.filter(|src_file| dirty_files.iter().any(|path| src_file.starts_with(path)))
.map(|p| p.as_ref())
.chain(dirty_files_outside_pkg_root(pkg, repo, src_files)?.iter())
.chain(dirty_files_outside_pkg_root(ws, pkg, repo, src_files)?.iter())
.map(|path| {
pathdiff::diff_paths(path, cwd)
.as_ref()
Expand Down Expand Up @@ -233,6 +238,7 @@ fn git(
/// current package root, but still under the git workdir, affecting the
/// final packaged `.crate` file.
fn dirty_files_outside_pkg_root(
ws: &Workspace<'_>,
pkg: &Package,
repo: &git2::Repository,
src_files: &[PathEntry],
Expand All @@ -247,7 +253,7 @@ fn dirty_files_outside_pkg_root(
.map(|path| paths::normalize_path(&pkg_root.join(path)))
.collect();

let mut dirty_symlinks = HashSet::new();
let mut dirty_files = HashSet::new();
for rel_path in src_files
.iter()
.filter(|p| p.is_symlink_or_under_symlink())
Expand All @@ -259,10 +265,151 @@ fn dirty_files_outside_pkg_root(
.filter_map(|p| paths::strip_prefix_canonical(p, workdir).ok())
{
if repo.status_file(&rel_path)? != git2::Status::CURRENT {
dirty_symlinks.insert(workdir.join(rel_path));
dirty_files.insert(workdir.join(rel_path));
}
}
Ok(dirty_symlinks)

if let Some(dirty_ws_manifest) = dirty_workspace_manifest(ws, pkg, repo)? {
dirty_files.insert(dirty_ws_manifest);
}
Ok(dirty_files)
}

fn dirty_workspace_manifest(
ws: &Workspace<'_>,
pkg: &Package,
repo: &git2::Repository,
) -> CargoResult<Option<PathBuf>> {
let workdir = repo.workdir().unwrap();
let ws_manifest_path = ws.root_manifest();
if pkg.manifest_path() == ws_manifest_path {
// The workspace manifest is also the primary package manifest.
// Normal file statuc check should have covered it.
return Ok(None);
}
if paths::strip_prefix_canonical(ws_manifest_path, pkg.root()).is_ok() {
// Inside package root. Don't bother checking git status.
return Ok(None);
}
let Ok(rel_path) = paths::strip_prefix_canonical(ws_manifest_path, workdir) else {
// Completely outside this git workdir.
return Ok(None);
};

// Outside package root but under git workdir.
if repo.status_file(&rel_path)? == git2::Status::CURRENT {
return Ok(None);
}

let from_index = ws_manifest_and_root_config_from_index(ws, repo, &rel_path);
// If there is no workable workspace manifest in Git index,
// create a default inheritable fields.
// With it, we can detect any member manifest has inherited fields,
// and then the workspace manifest should be considered dirty.
let inheritable = if let Some(fields) = from_index
.as_ref()
.map(|(_, root_config)| root_config.inheritable())
{
fields
} else {
&Default::default()
};

let empty = Vec::new();
let cargo_features = crate::core::Features::new(
from_index
.as_ref()
.and_then(|(manifest, _)| manifest.cargo_features.as_ref())
.unwrap_or(&empty),
ws.gctx(),
&mut Default::default(),
pkg.package_id().source_id().is_path(),
)
.unwrap_or_default();

let dirty_path = || Ok(Some(workdir.join(&rel_path)));
let dirty = |msg| {
debug!(
"{msg} for `{}` of repo at `{}`",
rel_path.display(),
workdir.display(),
);
dirty_path()
};

let Ok(normalized_toml) = crate::util::toml::normalize_toml(
pkg.manifest().original_toml(),
&cargo_features,
&|| Ok(inheritable),
pkg.manifest_path(),
ws.gctx(),
&mut Default::default(),
&mut Default::default(),
) else {
return dirty("failed to normalize pkg manifest from index");
};

let Ok(from_index) = toml::to_string_pretty(&normalized_toml) else {
return dirty("failed to serialize pkg manifest from index");
};

let Ok(from_working_dir) = toml::to_string_pretty(pkg.manifest().normalized_toml()) else {
return dirty("failed to serialize pkg manifest from working directory");
};

if from_index != from_working_dir {
tracing::trace!("--- from index ---\n{from_index}");
tracing::trace!("--- from working dir ---\n{from_working_dir}");
return dirty("normalized manifests from index and in working directory mismatched");
}

Ok(None)
}

/// Gets workspace manifest and workspace root config from Git index.
///
/// This returns an `Option` because workspace manifest might be broken or not
/// exist at all.
fn ws_manifest_and_root_config_from_index(
ws: &Workspace<'_>,
repo: &git2::Repository,
ws_manifest_rel_path: &Path,
) -> Option<(TomlManifest, WorkspaceRootConfig)> {
let workdir = repo.workdir().unwrap();
let dirty = |msg| {
debug!(
"{msg} for `{}` of repo at `{}`",
ws_manifest_rel_path.display(),
workdir.display(),
);
None
};
let Ok(index) = repo.index() else {
debug!("no index for repo at `{}`", workdir.display());
return None;
};
let Some(entry) = index.get_path(ws_manifest_rel_path, 0) else {
return dirty("workspace manifest not found");
};
let Ok(blob) = repo.find_blob(entry.id) else {
return dirty("failed to find manifest blob");
};
let Ok(contents) = String::from_utf8(blob.content().to_vec()) else {
return dirty("failed parse as UTF-8 encoding");
};
let Ok(document) = crate::util::toml::parse_document(&contents) else {
return dirty("failed to parse file");
};
let Ok(ws_manifest_from_index) = crate::util::toml::deserialize_toml(&document) else {
return dirty("failed to deserialize doc");
};
let Some(toml_workspace) = ws_manifest_from_index.workspace.as_ref() else {
return dirty("not a workspace manifest");
};

let ws_root_config =
crate::util::toml::to_workspace_root_config(toml_workspace, ws.root_manifest());
Some((ws_manifest_from_index, ws_root_config))
}

/// Helper to collect dirty statuses for a single repo.
Expand Down
24 changes: 12 additions & 12 deletions src/cargo/util/toml/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -84,10 +84,14 @@ pub fn read_manifest(
.borrow_mut()
.insert(package_root.to_owned(), ws_root_config.clone());
}
let inherit_cell = LazyCell::new();
let inherit = || {
inherit_cell.try_borrow_with(|| load_inheritable_fields(gctx, path, &workspace_config))
};
let normalized_toml = normalize_toml(
&original_toml,
&features,
&workspace_config,
&inherit,
path,
gctx,
&mut warnings,
Expand Down Expand Up @@ -158,12 +162,14 @@ fn read_toml_string(path: &Path, gctx: &GlobalContext) -> CargoResult<String> {
}

#[tracing::instrument(skip_all)]
fn parse_document(contents: &str) -> Result<toml_edit::ImDocument<String>, toml_edit::de::Error> {
pub(crate) fn parse_document(
contents: &str,
) -> Result<toml_edit::ImDocument<String>, toml_edit::de::Error> {
toml_edit::ImDocument::parse(contents.to_owned()).map_err(Into::into)
}

#[tracing::instrument(skip_all)]
fn deserialize_toml(
pub(crate) fn deserialize_toml(
document: &toml_edit::ImDocument<String>,
) -> Result<manifest::TomlManifest, toml_edit::de::Error> {
let mut unused = BTreeSet::new();
Expand Down Expand Up @@ -242,7 +248,7 @@ fn to_workspace_config(
Ok(workspace_config)
}

fn to_workspace_root_config(
pub(crate) fn to_workspace_root_config(
normalized_toml: &manifest::TomlWorkspace,
manifest_file: &Path,
) -> WorkspaceRootConfig {
Expand All @@ -266,22 +272,16 @@ fn to_workspace_root_config(

/// See [`Manifest::normalized_toml`] for more details
#[tracing::instrument(skip_all)]
fn normalize_toml(
pub(crate) fn normalize_toml<'a>(
original_toml: &manifest::TomlManifest,
features: &Features,
workspace_config: &WorkspaceConfig,
inherit: &dyn Fn() -> CargoResult<&'a InheritableFields>,
manifest_file: &Path,
gctx: &GlobalContext,
warnings: &mut Vec<String>,
errors: &mut Vec<String>,
) -> CargoResult<manifest::TomlManifest> {
let package_root = manifest_file.parent().unwrap();

let inherit_cell: LazyCell<InheritableFields> = LazyCell::new();
let inherit = || {
inherit_cell
.try_borrow_with(|| load_inheritable_fields(gctx, manifest_file, &workspace_config))
};
let workspace_root = || inherit().map(|fields| fields.ws_root().as_path());

let mut normalized_toml = manifest::TomlManifest {
Expand Down
Loading
Loading