Patchwork [09,of,11,sparse] dirstate: move validation of files in sparse checkout from extension

login
register
mail settings
Submitter Gregory Szorc
Date July 8, 2017, 11:29 p.m.
Message ID <151bee7bb0e2ea8e612b.1499556544@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/22134/
State Accepted
Headers show

Comments

Gregory Szorc - July 8, 2017, 11:29 p.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1499554173 25200
#      Sat Jul 08 15:49:33 2017 -0700
# Node ID 151bee7bb0e2ea8e612bbb0e16fa45d6e446ec05
# Parent  711945cedb514e81299059d69bfeb72da388fad0
dirstate: move validation of files in sparse checkout from extension

The sparse extension wraps various dirstate methods to validate that
the path being modified in dirstate exists in the sparse checkout.
This commit moves that code to dirstate itself.

As part of the move, the error messages were tweaked slightly.
And the code for the validation was reworked slightly. The most
notable change is the removal of the "f is not None" check.
This check has instead been inlined into copy(), which is the only
place where it should be relevant.

I'm not crazy about the references to sparse's UI adjustments
leaking into core. But what can you do.

I'm not a dirstate expert and am not sure if this is the ideal
way to hook in path validation into dirstate now that the code is in
core. But this does preserve the behavior of the extension. We can
always refactor later.

Unless the sparse feature is enabled via the sparse extension,
the matcher will be an alwaysmatcher and the new code will
essentially no-op. This change does add the overhead of calling
a few functions, however. If the modified methods are called in
tight loops (this can occur for working directory updates) this could
have a performance impact. I haven't explicitly measured the perf
impact because I view this code as temporary until it can be
refactored. The end state should essentially be an attribute
lookup, so I'm not too worried about long-term perf impact.
via Mercurial-devel - July 11, 2017, 5:46 a.m.
On Sat, Jul 8, 2017 at 4:29 PM, Gregory Szorc <gregory.szorc@gmail.com> wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1499554173 25200
> #      Sat Jul 08 15:49:33 2017 -0700
> # Node ID 151bee7bb0e2ea8e612bbb0e16fa45d6e446ec05
> # Parent  711945cedb514e81299059d69bfeb72da388fad0
> dirstate: move validation of files in sparse checkout from extension
>
> The sparse extension wraps various dirstate methods to validate that
> the path being modified in dirstate exists in the sparse checkout.
> This commit moves that code to dirstate itself.
>
> As part of the move, the error messages were tweaked slightly.
> And the code for the validation was reworked slightly. The most
> notable change is the removal of the "f is not None" check.
> This check has instead been inlined into copy(), which is the only
> place where it should be relevant.
>
> I'm not crazy about the references to sparse's UI adjustments
> leaking into core. But what can you do.

How about raising SparseError()? You can give it the file that was
outside if that's useful.

>
> I'm not a dirstate expert and am not sure if this is the ideal
> way to hook in path validation into dirstate now that the code is in
> core. But this does preserve the behavior of the extension. We can
> always refactor later.
>
> Unless the sparse feature is enabled via the sparse extension,
> the matcher will be an alwaysmatcher and the new code will
> essentially no-op. This change does add the overhead of calling
> a few functions, however. If the modified methods are called in
> tight loops (this can occur for working directory updates) this could
> have a performance impact. I haven't explicitly measured the perf
> impact because I view this code as temporary until it can be
> refactored. The end state should essentially be an attribute
> lookup, so I'm not too worried about long-term perf impact.

The worst case I could think of for this was debugrebuilddirstate,
because it writes the entire dirstate without reading the working
copy. That took 184ms before this patch and 195ms after (best of 10).
The increase remained (as expected) for the longer-running "hg
status": 591ms before and 599ms after. Also as expected, the relative
difference there is pretty small.

I'm fine with that slow-down. It seems like we should be able to
improve it later, but it's not a big deal to me even if we couldn't.

Patch

diff --git a/hgext/sparse.py b/hgext/sparse.py
--- a/hgext/sparse.py
+++ b/hgext/sparse.py
@@ -240,23 +240,6 @@  def _setupdirstate(ui):
         return orig(self, parent, allfiles, changedfiles)
     extensions.wrapfunction(dirstate.dirstate, 'rebuild', _rebuild)
 
-    # Prevent adding files that are outside the sparse checkout
-    editfuncs = ['normal', 'add', 'normallookup', 'copy', 'remove', 'merge']
-    hint = _('include file with `hg debugsparse --include <pattern>` or use ' +
-             '`hg add -s <file>` to include file directory while adding')
-    for func in editfuncs:
-        def _wrapper(orig, self, *args):
-            sparsematch = self._sparsematcher
-            if not sparsematch.always():
-                for f in args:
-                    if (f is not None and not sparsematch(f) and
-                        f not in self):
-                        raise error.Abort(_("cannot add '%s' - it is outside "
-                                            "the sparse checkout") % f,
-                                          hint=hint)
-            return orig(self, *args)
-        extensions.wrapfunction(dirstate.dirstate, func, _wrapper)
-
 @command('^debugsparse', [
     ('I', 'include', False, _('include files in the sparse checkout')),
     ('X', 'exclude', False, _('exclude files in the sparse checkout')),
diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -68,6 +68,10 @@  def nonnormalentries(dmap):
                 otherparent.add(fname)
         return nonnorm, otherparent
 
+ADD_SPARSE_HINT = _("include file with 'hg debugsparse --include <pattern>' "
+                    "or use 'hg add --sparse <file>' to include file "
+                    "directory while adding")
+
 class dirstate(object):
 
     def __init__(self, opener, ui, root, validate, sparsematchfn):
@@ -273,6 +277,18 @@  class dirstate(object):
         # it's safe because f is always a relative path
         return self._rootdir + f
 
+    def _ensuresparsematch(self, f):
+        """Verifies a path is part of the sparse config and aborts if not."""
+        matcher = self._sparsematcher
+
+        if matcher.always():
+            return
+
+        if not matcher(f) and f not in self._map:
+            raise error.Abort(_("cannot add '%s' because it is outside the "
+                                "sparse checkout") % f,
+                              hint=ADD_SPARSE_HINT)
+
     def flagfunc(self, buildfallback):
         if self._checklink and self._checkexec:
             def f(x):
@@ -514,6 +530,11 @@  class dirstate(object):
         """Mark dest as a copy of source. Unmark dest if source is None."""
         if source == dest:
             return
+
+        if source is not None:
+            self._ensuresparsematch(source)
+        self._ensuresparsematch(dest)
+
         self._dirty = True
         if source is not None:
             self._copymap[dest] = source
@@ -565,6 +586,7 @@  class dirstate(object):
 
     def normal(self, f):
         '''Mark a file normal and clean.'''
+        self._ensuresparsematch(f)
         s = os.lstat(self._join(f))
         mtime = s.st_mtime
         self._addpath(f, 'n', s.st_mode,
@@ -581,6 +603,8 @@  class dirstate(object):
 
     def normallookup(self, f):
         '''Mark a file normal, but possibly dirty.'''
+        self._ensuresparsematch(f)
+
         if self._pl[1] != nullid and f in self._map:
             # if there is a merge going on and the file was either
             # in state 'm' (-1) or coming from other parent (-2) before
@@ -620,12 +644,14 @@  class dirstate(object):
 
     def add(self, f):
         '''Mark a file added.'''
+        self._ensuresparsematch(f)
         self._addpath(f, 'a', 0, -1, -1)
         if f in self._copymap:
             del self._copymap[f]
 
     def remove(self, f):
         '''Mark a file removed.'''
+        self._ensuresparsematch(f)
         self._dirty = True
         self._droppath(f)
         size = 0
@@ -644,6 +670,8 @@  class dirstate(object):
 
     def merge(self, f):
         '''Mark a file merged.'''
+        self._ensuresparsematch(f)
+
         if self._pl[1] == nullid:
             return self.normallookup(f)
         return self.otherparent(f)
diff --git a/tests/test-sparse.t b/tests/test-sparse.t
--- a/tests/test-sparse.t
+++ b/tests/test-sparse.t
@@ -86,8 +86,8 @@  Verify status only shows included files
 Adding an excluded file should fail
 
   $ hg add hide3
-  abort: cannot add 'hide3' - it is outside the sparse checkout
-  (include file with `hg debugsparse --include <pattern>` or use `hg add -s <file>` to include file directory while adding)
+  abort: cannot add 'hide3' because it is outside the sparse checkout
+  (include file with 'hg debugsparse --include <pattern>' or use 'hg add --sparse <file>' to include file directory while adding)
   [255]
 
 Verify deleting sparseness while a file has changes fails
@@ -253,8 +253,8 @@  Test that add -s adds dirs to sparse pro
   $ touch add/foo
   $ touch add/bar
   $ hg add add/foo
-  abort: cannot add 'add/foo' - it is outside the sparse checkout
-  (include file with `hg debugsparse --include <pattern>` or use `hg add -s <file>` to include file directory while adding)
+  abort: cannot add 'add/foo' because it is outside the sparse checkout
+  (include file with 'hg debugsparse --include <pattern>' or use 'hg add --sparse <file>' to include file directory while adding)
   [255]
   $ hg add -s add/foo
   $ hg st