Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(turborepo): Package Detection #7549

Merged
merged 15 commits into from Mar 5, 2024
Merged
Show file tree
Hide file tree
Changes from 11 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
146 changes: 146 additions & 0 deletions crates/turborepo-lib/src/global_deps_package_detector.rs
@@ -0,0 +1,146 @@
use thiserror::Error;
use turbopath::AnchoredSystemPath;
use turborepo_repository::{
change_mapper::{DefaultPackageDetector, PackageDetection, PackageDetector},
package_graph::{PackageGraph, WorkspacePackage},
};
use wax::{BuildError, Program};

#[derive(Error, Debug)]
pub enum Error {
#[error(transparent)]
InvalidFilter(#[from] BuildError),
}

/// A package detector that uses a global deps list to determine
/// if a file should cause all packages to be marked as changed.
/// This is less conservative than the `DefaultPackageDetector`
/// which assumes that any changed file that is not in a package
/// is a root dependency. Since we have a list of global deps,
NicholasLYang marked this conversation as resolved.
Show resolved Hide resolved
/// we can check against that and avoid invalidating in unnecessary cases.
pub struct GlobalDepsPackageDetector<'a> {
pkg_dep_graph: &'a PackageGraph,
global_deps_matcher: wax::Any<'a>,
}

impl<'a> GlobalDepsPackageDetector<'a> {
pub fn new<S: wax::Pattern<'a>, I: Iterator<Item = S>>(
pkg_dep_graph: &'a PackageGraph,
global_deps: I,
) -> Result<Self, Error> {
let global_deps_matcher = wax::any(global_deps)?;

Ok(Self {
pkg_dep_graph,
global_deps_matcher,
})
}
}

impl<'a> PackageDetector for GlobalDepsPackageDetector<'a> {
fn detect_package(&self, path: &AnchoredSystemPath) -> PackageDetection {
match DefaultPackageDetector::new(self.pkg_dep_graph).detect_package(path) {
// Since `DefaultPackageDetector` is overly conservative, we can check here if
// the path is actually in globalDeps and if not, return it as
// PackageDetection::Package(WorkspacePackage::root()).
PackageDetection::All => {
let cleaned_path = path.clean();
let in_global_deps = self.global_deps_matcher.is_match(cleaned_path.as_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble tracing this code. Does the global_deps_matcher always get the default global deps of package.json/turbo.json added to it? Otherwise we'll incorrectly map a package.json change just be a root package change instead of a all change.

Or I'm misunderstanding these changes completely and this codepath isn't hit in that scenario.

Edit: I was misunderstanding, default global deps are checked at ChangeMapper which uses a package detector if a default global change isn't detected


if in_global_deps {
PackageDetection::All
} else {
PackageDetection::Package(WorkspacePackage::root())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we return None here? This implementation implies that we're talking about globalDependencies in turbo.json. Is there a scenario in which it's valuable to know that the root package changed or would it just get ignored by the caller?

Higher level: I'm concerned about inventing a semantic meaning of WorkspacePackage::root(). So far, "root package changed" has roughly implied that "all packages changed". But in this PR, we're separating the concepts of "all" and "root package", so we can no longer ignore paper over what a "root package" is.

Our definition of package is roughly equivalent to the package managers definition, and there are no package managers that consider the root of a "workspace" a package itself.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think there's a few misunderstandings (not helped by the multiple iterations of this PR):

  • It is helpful to know that the root package changed, because you can have root tasks that can get run, say if you define a //#build entry in pipeline. We can't remove this root package return because that would be a breaking change for people with root package tasks.
  • Root package changed does not imply all packages changed. I thought it did but @chris-olszewski explained to me that it does not. I think it should in an ideal world (and I think a setting to enable that behavior is worth discussing but out of scope).
  • Therefore separating between the two concepts is fine because that is the existing behavior.
  • As for how our model relates to a package manager, I agree that we should stay closer to the package manager's semantics, but unfortunately we can't due to existing behavior. Like we should treat a root package change as a global change because that's what package managers effectively do, but we have to wait for a breaking change to make that possible.

}
}
result => result,
}
}
}

#[cfg(test)]
mod tests {
use tempfile::tempdir;
use turbopath::{AbsoluteSystemPath, AnchoredSystemPathBuf};
use turborepo_repository::{
change_mapper::{ChangeMapper, DefaultPackageDetector, PackageChanges},
discovery,
discovery::PackageDiscovery,
package_graph::{PackageGraphBuilder, WorkspacePackage},
package_json::PackageJson,
};

use super::GlobalDepsPackageDetector;
use crate::turbo_json::TurboJson;

#[allow(dead_code)]
pub struct MockDiscovery;

impl PackageDiscovery for MockDiscovery {
async fn discover_packages(
&self,
) -> Result<discovery::DiscoveryResponse, discovery::Error> {
Ok(discovery::DiscoveryResponse {
package_manager: turborepo_repository::package_manager::PackageManager::Npm,
workspaces: vec![],
})
}

async fn discover_packages_blocking(
&self,
) -> Result<discovery::DiscoveryResponse, discovery::Error> {
self.discover_packages().await
}
}

#[tokio::test]
async fn test_different_package_detectors() -> Result<(), anyhow::Error> {
let repo_root = tempdir()?;
let root_package_json = PackageJson::default();

let pkg_graph = PackageGraphBuilder::new(
AbsoluteSystemPath::from_std_path(repo_root.path())?,
root_package_json,
)
.with_package_discovery(MockDiscovery)
.build()
.await?;

let default_package_detector = DefaultPackageDetector::new(&pkg_graph);
let change_mapper = ChangeMapper::new(&pkg_graph, vec![], default_package_detector);

let package_changes = change_mapper.changed_packages(
[AnchoredSystemPathBuf::from_raw("README.md")?]
.into_iter()
.collect(),
None,
)?;

// We should return All because we don't have global deps and
// therefore must be conservative about changes
assert_eq!(package_changes, PackageChanges::All);

let turbo_json = TurboJson::default();
let turbo_package_detector = GlobalDepsPackageDetector::new(
&pkg_graph,
turbo_json.global_deps.iter().map(|s| s.as_str()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can we just the value here instead of depending on the default value of TurboJson?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm confused. In what part are we depending on the default value?

)?;
let change_mapper = ChangeMapper::new(&pkg_graph, vec![], turbo_package_detector);

let package_changes = change_mapper.changed_packages(
[AnchoredSystemPathBuf::from_raw("README.md")?]
.into_iter()
.collect(),
None,
)?;

// We only get a root workspace change since we have global deps specified and
// README.md is not one of them
assert_eq!(
package_changes,
PackageChanges::Some([WorkspacePackage::root()].into_iter().collect())
);

Ok(())
}
}
1 change: 1 addition & 0 deletions crates/turborepo-lib/src/lib.rs
Expand Up @@ -21,6 +21,7 @@ mod engine;

mod framework;
mod gitignore;
mod global_deps_package_detector;
pub(crate) mod globwatcher;
mod hash;
mod opts;
Expand Down
25 changes: 17 additions & 8 deletions crates/turborepo-lib/src/run/scope/change_detector.rs
Expand Up @@ -2,11 +2,15 @@ use std::collections::HashSet;

use turbopath::{AbsoluteSystemPath, AnchoredSystemPathBuf};
use turborepo_repository::{
change_mapper::{ChangeMapError, ChangeMapper, LockfileChange, PackageChanges},
change_mapper::{
ChangeMapError, ChangeMapper, DefaultPackageDetector, LockfileChange, PackageChanges,
},
package_graph::{PackageGraph, PackageName},
};
use turborepo_scm::SCM;

use crate::global_deps_package_detector::{Error, GlobalDepsPackageDetector};

/// Given two git refs, determine which packages have changed between them.
pub trait GitChangeDetector {
fn changed_packages(
Expand All @@ -18,7 +22,7 @@ pub trait GitChangeDetector {

pub struct ScopeChangeDetector<'a> {
turbo_root: &'a AbsoluteSystemPath,
change_mapper: ChangeMapper<'a>,
change_mapper: ChangeMapper<'a, GlobalDepsPackageDetector<'a>>,
scm: &'a SCM,
pkg_graph: &'a PackageGraph,
}
Expand All @@ -28,17 +32,18 @@ impl<'a> ScopeChangeDetector<'a> {
turbo_root: &'a AbsoluteSystemPath,
scm: &'a SCM,
pkg_graph: &'a PackageGraph,
global_deps: Vec<String>,
global_deps: impl Iterator<Item = &'a str>,
ignore_patterns: Vec<String>,
) -> Self {
let change_mapper = ChangeMapper::new(pkg_graph, global_deps, ignore_patterns);
) -> Result<Self, Error> {
let pkg_detector = GlobalDepsPackageDetector::new(pkg_graph, global_deps)?;
let change_mapper = ChangeMapper::new(pkg_graph, ignore_patterns, pkg_detector);

Self {
Ok(Self {
turbo_root,
change_mapper,
scm,
pkg_graph,
}
})
}

/// Gets the lockfile content from SCM if it has changed.
Expand All @@ -54,7 +59,11 @@ impl<'a> ScopeChangeDetector<'a> {
.package_manager()
.lockfile_path(self.turbo_root);

if !ChangeMapper::lockfile_changed(self.turbo_root, changed_files, &lockfile_path) {
if !ChangeMapper::<DefaultPackageDetector>::lockfile_changed(
self.turbo_root,
changed_files,
&lockfile_path,
) {
return None;
}

Expand Down
29 changes: 22 additions & 7 deletions crates/turborepo-lib/src/run/scope/filter.rs
Expand Up @@ -18,7 +18,10 @@ use super::{
simple_glob::{Match, SimpleGlob},
target_selector::{InvalidSelectorError, TargetSelector},
};
use crate::{run::scope::change_detector::ScopeChangeDetector, turbo_json::TurboJson};
use crate::{
global_deps_package_detector, run::scope::change_detector::ScopeChangeDetector,
turbo_json::TurboJson,
};

pub struct PackageInference {
package_name: Option<String>,
Expand Down Expand Up @@ -112,19 +115,29 @@ impl<'a> FilterResolver<'a, ScopeChangeDetector<'a>> {
turbo_root: &'a AbsoluteSystemPath,
inference: Option<PackageInference>,
scm: &'a SCM,
root_turbo_json: &TurboJson,
) -> Self {
let mut global_deps = opts.global_deps.clone();
global_deps.extend_from_slice(&root_turbo_json.global_deps);
root_turbo_json: &'a TurboJson,
) -> Result<Self, ResolutionError> {
let global_deps = opts
.global_deps
.iter()
.map(|s| s.as_str())
.chain(root_turbo_json.global_deps.iter().map(|s| s.as_str()));

let change_detector = ScopeChangeDetector::new(
turbo_root,
scm,
pkg_graph,
global_deps,
opts.ignore_patterns.clone(),
);
Self::new_with_change_detector(pkg_graph, turbo_root, inference, scm, change_detector)
)?;

Ok(Self::new_with_change_detector(
pkg_graph,
turbo_root,
inference,
scm,
change_detector,
))
}
}

Expand Down Expand Up @@ -585,6 +598,8 @@ pub enum ResolutionError {
glob: String,
err: Box<wax::BuildError>,
},
#[error("failed to construct glob for globalDependencies")]
GlobalDependenciesGlob(#[from] global_deps_package_detector::Error),
}

#[cfg(test)]
Expand Down
2 changes: 1 addition & 1 deletion crates/turborepo-lib/src/run/scope/mod.rs
Expand Up @@ -32,6 +32,6 @@ pub fn resolve_packages(
pkg_inference,
scm,
root_turbo_json,
)
)?
.resolve(&opts.get_filters())
}
4 changes: 4 additions & 0 deletions crates/turborepo-paths/src/anchored_system_path.rs
Expand Up @@ -114,4 +114,8 @@ impl AnchoredSystemPath {
.unwrap(),
)
}

pub fn clean(&self) -> AnchoredSystemPathBuf {
AnchoredSystemPathBuf(self.0.as_std_path().clean().try_into().unwrap())
}
}