Submitter | phabricator |
---|---|
Date | Jan. 14, 2020, 5:35 p.m. |
Message ID | <differential-rev-PHID-DREV-du2y5nvrvr43bahbwaia-req@mercurial-scm.org> |
Download | mbox | patch |
Permalink | /patch/44337/ |
State | Superseded |
Headers | show |
Comments
This revision now requires changes to proceed. martinvonz added inline comments. martinvonz requested changes to this revision. INLINE COMMENTS > files.rs:207-216 > + let name = if !name.is_absolute() { > + root.join(&cwd).join(&name) > + } else { > + name.to_owned() > + }; > + let mut auditor = PathAuditor::new(&root); > + if name != root && name.starts_with(&root) { Might be worth having a fast path for when `name` is not absolute (including "") and neither `name` nor `cwd` contains "../"so we don't do `let name = root.join(&cwd).join(&name)` immediately followed by `let name = name.strip_prefix(&root).unwrap()`. Especially since that seems like the common case. Or can that ever produce different results? Anyway, that can be done later. > files.rs:220-222 > + // Determine whether `name' is in the hierarchy at or beneath `root', > + // by iterating name=name.parent() until that causes no change (can't > + // check name == '/', because that doesn't work on windows). Is this accurate? It looks like we break when `name.parent() == None`. > files.rs:227-230 > + if name.components().next().is_none() { > + // `name` was actually the same as root (maybe a symlink) > + return Ok("".into()); > + } `name.components().next()` will be `None` only if `name` is something like "/", right? But then we shouldn't return "" (a repo-relative path), we should error out. REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7871/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7871 To: Alphare, #hg-reviewers, kevincox, martinvonz Cc: martinvonz, durin42, kevincox, mercurial-devel
Alphare added inline comments. INLINE COMMENTS > martinvonz wrote in files.rs:207-216 > Might be worth having a fast path for when `name` is not absolute (including "") and neither `name` nor `cwd` contains "../"so we don't do `let name = root.join(&cwd).join(&name)` immediately followed by `let name = name.strip_prefix(&root).unwrap()`. Especially since that seems like the common case. Or can that ever produce different results? Anyway, that can be done later. That might be worth investigating indeed. I will defer to a later time, though. :) > martinvonz wrote in files.rs:220-222 > Is this accurate? It looks like we break when `name.parent() == None`. I could reword this sentence if you feel that it does not convey the same meaning. It fits the Python implementation better, that's for sure. > martinvonz wrote in files.rs:227-230 > `name.components().next()` will be `None` only if `name` is something like "/", right? But then we shouldn't return "" (a repo-relative path), we should error out. For `b'/'` it would return `Some(std::Path::Components::RootDir)`. It only returns `None` if there are no components. REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7871/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7871 To: Alphare, #hg-reviewers, kevincox, martinvonz Cc: martinvonz, durin42, kevincox, mercurial-devel
martinvonz added inline comments. INLINE COMMENTS > Alphare wrote in files.rs:220-222 > I could reword this sentence if you feel that it does not convey the same meaning. It fits the Python implementation better, that's for sure. To me, the current sentence very much makes it sound like it iterates until `name != name.parent()`, which doesn't seem correct. So, yes, please reword it. (Unless I'm just misunderstanding something.) > Alphare wrote in files.rs:227-230 > For `b'/'` it would return `Some(std::Path::Components::RootDir)`. It only returns `None` if there are no components. Okay, but it seems that that doesn't address my concern. If I understand correctly (now that you've corrected me), we get here if `name == ""`. However, name is an absolute path (right?), so with `root = "/some/path/"` and `name = ""`, why should we return `Ok()`? If `name` is indeed an absolute path, I would interpret "" as the root (file system root, not repo root), which is not inside "/some/path". What am I missing? REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7871/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7871 To: Alphare, #hg-reviewers, kevincox, martinvonz Cc: martinvonz, durin42, kevincox, mercurial-devel
martinvonz added inline comments. INLINE COMMENTS > files.rs:231 > + } > + auditor.audit_path(path_to_hg_path_buf(name)?)?; > + return Ok(name.to_owned()); Related to the comment above, this seems to (generally) return an absolute parent directory instead of a repo-relative path. Try adding a `println!()` just inside the `if same` to make sure you have test coverage. REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7871/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7871 To: Alphare, #hg-reviewers, kevincox, martinvonz Cc: martinvonz, durin42, kevincox, mercurial-devel
This revision is now accepted and ready to land. martinvonz added inline comments. martinvonz accepted this revision. INLINE COMMENTS > files.rs:366 > + // TODO make portable > + std::os::unix::fs::symlink(&root, &out_of_repo).unwrap(); > + Can the test method be annotated so it's only run on unix? REPOSITORY rHG Mercurial BRANCH default CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7871/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7871 To: Alphare, #hg-reviewers, kevincox, martinvonz Cc: martinvonz, durin42, kevincox, mercurial-devel
Alphare added inline comments. INLINE COMMENTS > martinvonz wrote in files.rs:366 > Can the test method be annotated so it's only run on unix? Yes, but I think we need to have a more general chat about this subject. A non-negligible number of `hg-core`'s features do not work under non-UNIX systems. Our module policy system does not have the fine-grained control needed to map for a more complexe set of `features` (in Cargo terms). I think the more reasonable option right now is to have a compilation failure. Maybe we should have a big gated `compile_error!` at the top of `lib.rs` to inform users that there is work to be done to be cross-platform? REPOSITORY rHG Mercurial CHANGES SINCE LAST ACTION https://phab.mercurial-scm.org/D7871/new/ REVISION DETAIL https://phab.mercurial-scm.org/D7871 To: Alphare, #hg-reviewers, kevincox, martinvonz Cc: martinvonz, durin42, kevincox, mercurial-devel
Patch
diff --git a/rust/hg-core/src/utils/files.rs b/rust/hg-core/src/utils/files.rs --- a/rust/hg-core/src/utils/files.rs +++ b/rust/hg-core/src/utils/files.rs @@ -9,13 +9,17 @@ //! Functions for fiddling with files. -use crate::utils::hg_path::{HgPath, HgPathBuf}; - +use crate::utils::hg_path::{ + path_to_hg_path_buf, HgPath, HgPathBuf, HgPathError, +}; +use crate::utils::path_auditor::PathAuditor; use crate::utils::replace_slice; use lazy_static::lazy_static; use std::fs::Metadata; use std::iter::FusedIterator; -use std::path::Path; +use std::ops::Deref; +use same_file::is_same_file; +use std::path::{Path, PathBuf}; pub fn get_path_from_bytes(bytes: &[u8]) -> &Path { let os_str; @@ -272,9 +276,62 @@ } } +/// Returns the canonical path of `name`, given `cwd` and `root` +pub fn canonical_path( + root: impl AsRef<Path>, + cwd: impl AsRef<Path>, + name: impl AsRef<Path>, +) -> Result<PathBuf, HgPathError> { + // TODO add missing normalization for other platforms + let root = root.as_ref(); + let cwd = cwd.as_ref(); + let name = name.as_ref(); + + let name = if !name.is_absolute() { + root.join(&cwd).join(&name) + } else { + name.to_owned() + }; + let mut auditor = PathAuditor::new(&root); + if name != root && name.starts_with(&root) { + let name = name.strip_prefix(&root).unwrap(); + auditor.audit_path(path_to_hg_path_buf(name)?)?; + return Ok(name.to_owned()); + } else if name == root { + return Ok("".into()); + } else { + // Determine whether `name' is in the hierarchy at or beneath `root', + // by iterating name=name.parent() until that causes no change (can't + // check name == '/', because that doesn't work on windows). + let mut name = name.deref(); + loop { + let same = is_same_file(&name, &root).unwrap_or(false); + if same { + if name.components().next().is_none() { + // `name` was actually the same as root (maybe a symlink) + return Ok("".into()); + } + auditor.audit_path(path_to_hg_path_buf(name)?)?; + return Ok(name.to_owned()); + } + name = match name.parent() { + None => break, + Some(p) => p, + }; + } + // TODO hint to the user about using --cwd + // Bubble up the responsibility to Python for now + Err(HgPathError::NotUnderRoot { + path: name.to_owned(), + root: root.to_owned(), + }) + } +} + #[cfg(test)] mod tests { use super::*; + use pretty_assertions::assert_eq; #[test] fn find_dirs_some() { @@ -415,4 +472,53 @@ assert_eq!(dirs.next(), None); assert_eq!(dirs.next(), None); } + + #[test] + fn test_canonical_path() { + let root = Path::new("/repo"); + let cwd = Path::new("/dir"); + let name = Path::new("filename"); + assert_eq!( + canonical_path(root, cwd, name), + Err(HgPathError::NotUnderRoot { + path: PathBuf::from("/"), + root: root.to_path_buf() + }) + ); + + let root = Path::new("/repo"); + let cwd = Path::new("/"); + let name = Path::new("filename"); + assert_eq!( + canonical_path(root, cwd, name), + Err(HgPathError::NotUnderRoot { + path: PathBuf::from("/"), + root: root.to_path_buf() + }) + ); + + let root = Path::new("/repo"); + let cwd = Path::new("/"); + let name = Path::new("repo/filename"); + assert_eq!( + canonical_path(root, cwd, name), + Ok(PathBuf::from("filename")) + ); + + let root = Path::new("/repo"); + let cwd = Path::new("/repo"); + let name = Path::new("filename"); + assert_eq!( + canonical_path(root, cwd, name), + Ok(PathBuf::from("filename")) + ); + + let root = Path::new("/repo"); + let cwd = Path::new("/repo/subdir"); + let name = Path::new("filename"); + assert_eq!( + canonical_path(root, cwd, name), + Ok(PathBuf::from("subdir/filename")) + ); + } } diff --git a/rust/hg-core/Cargo.toml b/rust/hg-core/Cargo.toml --- a/rust/hg-core/Cargo.toml +++ b/rust/hg-core/Cargo.toml @@ -17,7 +17,8 @@ rayon = "1.2.0" regex = "1.1.0" twox-hash = "1.5.0" +same-file = "1.0.6" [dev-dependencies] tempfile = "3.1.0" pretty_assertions = "0.6.1" \ No newline at end of file diff --git a/rust/Cargo.lock b/rust/Cargo.lock --- a/rust/Cargo.lock +++ b/rust/Cargo.lock @@ -157,6 +157,7 @@ "rand_pcg 0.1.2 (registry+https://github.com/rust-lang/crates.io-index)", "rayon 1.2.0 (registry+https://github.com/rust-lang/crates.io-index)", "regex 1.3.1 (registry+https://github.com/rust-lang/crates.io-index)", + "same-file 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)", "tempfile 3.1.0 (registry+https://github.com/rust-lang/crates.io-index)", "twox-hash 1.5.0 (registry+https://github.com/rust-lang/crates.io-index)", ] @@ -481,6 +482,14 @@ ] [[package]] +name = "same-file" +version = "1.0.6" +source = "registry+https://github.com/rust-lang/crates.io-index" +dependencies = [ + "winapi-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)", +] + +[[package]] name = "scopeguard" version = "1.0.0" source = "registry+https://github.com/rust-lang/crates.io-index" @@ -562,6 +571,14 @@ source = "registry+https://github.com/rust-lang/crates.io-index" [[package]] +name = "winapi-util" +version = "0.1.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +dependencies = [ + "winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)", +] + +[[package]] name = "winapi-x86_64-pc-windows-gnu" version = "0.4.0" source = "registry+https://github.com/rust-lang/crates.io-index" @@ -622,6 +639,7 @@ "checksum regex-syntax 0.6.12 (registry+https://github.com/rust-lang/crates.io-index)" = "11a7e20d1cce64ef2fed88b66d347f88bd9babb82845b2b858f3edbf59a4f716" "checksum remove_dir_all 0.5.2 (registry+https://github.com/rust-lang/crates.io-index)" = "4a83fa3702a688b9359eccba92d153ac33fd2e8462f9e0e3fdf155239ea7792e" "checksum rustc_version 0.2.3 (registry+https://github.com/rust-lang/crates.io-index)" = "138e3e0acb6c9fb258b19b67cb8abd63c00679d2851805ea151465464fe9030a" +"checksum same-file 1.0.6 (registry+https://github.com/rust-lang/crates.io-index)" = "93fc1dc3aaa9bfed95e02e6eadabb4baf7e3078b0bd1b4d7b6b0b68378900502" "checksum scopeguard 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "b42e15e59b18a828bbf5c58ea01debb36b9b096346de35d941dcb89009f24a0d" "checksum semver 0.9.0 (registry+https://github.com/rust-lang/crates.io-index)" = "1d7eb9ef2c18661902cc47e535f9bc51b78acd254da71d375c2f6720d9a40403" "checksum semver-parser 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "388a1df253eca08550bef6c72392cfe7c30914bf41df5269b68cbd6ff8f570a3" @@ -633,4 +651,5 @@ "checksum wasi 0.7.0 (registry+https://github.com/rust-lang/crates.io-index)" = "b89c3ce4ce14bdc6fb6beaf9ec7928ca331de5df7e5ea278375642a2f478570d" "checksum winapi 0.3.8 (registry+https://github.com/rust-lang/crates.io-index)" = "8093091eeb260906a183e6ae1abdba2ef5ef2257a21801128899c3fc699229c6" "checksum winapi-i686-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6" +"checksum winapi-util 0.1.3 (registry+https://github.com/rust-lang/crates.io-index)" = "4ccfbf554c6ad11084fb7517daca16cfdcaccbdadba4fc336f032a8b12c2ad80" "checksum winapi-x86_64-pc-windows-gnu 0.4.0 (registry+https://github.com/rust-lang/crates.io-index)" = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"