Patchwork [STABLE] verify: don't init a new subrepo when a missing one is referenced (issue5128)

Submitter Matt Harbison
Date April 3, 2016, 8:15 p.m.
Message ID <63ce31ee0e582bc25327.1459714557@Envy>
Permalink /patch/14293/
State Changes Requested

Comments

Matt Harbison - April 3, 2016, 8:15 p.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1459708994 14400
#      Sun Apr 03 14:43:14 2016 -0400
# Branch stable
# Node ID 63ce31ee0e582bc25327b27c0584445b28a90686
# Parent  2d39f987f0bacc39a8319cb6c84b2d38c1991028
verify: don't init a new subrepo when a missing one is referenced (issue5128)

Initializing a subrepo when one doesn't exist is the right thing to do when the
parent is being updated, but in few other cases.  Unfortunately, there isn't
enough context in the subrepo module to distinguish this case.  This same issue
can be caused with other subrepo aware commands, so there is a general issue
here beyond the scope of this fix.

A simpler attempt I tried was to add an '_updating' boolean to localrepo, and
set/clear it around the call to mergemod.update() in hg.updaterepo().  That
mostly worked, but doesn't handle the case where archive will clone the subrepo
if it is missing.  (I vaguely recall that there may be other commands that will
clone if needed like this, but certainly not all do.  It seems both handy, and a
bit surprising for what should be a read only operation.  It might be nice if
all commands did this consistently, but we probably need Angel's subrepo caching
first, to not make a mess of the working directory.)

It was suggested in the bug discussion to skip looking at the subrepo links
unless -S is specified.  I don't really like that idea because missing a subrepo
or (less likely, but worse) a corrupt .hgsubstate is a problem of the parent
repo when checking out a revision.  The -S option seems like a better fit for
functionality that would recurse into each subrepo and do a full verification.

Ultimately, the default value for 'allowcreate' should probably be flipped, but
since the default behavior was to allow creation, this is less risky for now.
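
The effect of the new flag can be sketched outside Mercurial with a small
stand-in. This is only an illustration under assumptions: open_subrepo() and
the LookupError it raises are hypothetical and are not part of Mercurial's
API; only the 'create = allowcreate and not exists' expression mirrors the
patch.

```python
import os
import tempfile


def open_subrepo(parent_root, path, allowcreate=True):
    """Hypothetical stand-in for hgsubrepo.__init__ (not Mercurial's API)."""
    root = os.path.join(parent_root, path)
    hgdir = os.path.join(root, ".hg")
    # Mirrors the patched expression: create only when the caller allows it.
    create = allowcreate and not os.path.exists(hgdir)
    if create:
        os.makedirs(hgdir)
    elif not os.path.exists(hgdir):
        # Assumption: a missing repo opened without create is an error.
        raise LookupError("subrepo %r is missing" % path)
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        try:
            open_subrepo(parent, "subrepo", allowcreate=False)
        except LookupError as exc:
            print("not created:", exc)
        # With the default, the missing subrepo is initialized as before.
        print("created:", open_subrepo(parent, "subrepo"))
```

With allowcreate=False the working directory is left untouched, which is the
behavior the new test checks with 'find subrepo'.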
Pierre-Yves David - April 4, 2016, 7:50 a.m.
On 04/03/2016 01:15 PM, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1459708994 14400
> #      Sun Apr 03 14:43:14 2016 -0400
> # Branch stable
> # Node ID 63ce31ee0e582bc25327b27c0584445b28a90686
> # Parent  2d39f987f0bacc39a8319cb6c84b2d38c1991028
> verify: don't init a new subrepo when a missing one is referenced (issue5128)
>
> Initializing a subrepo when one doesn't exist is the right thing to do when the
> parent is being updated, but in few other cases.  Unfortunately, there isn't
> enough context in the subrepo module to distinguish this case.  This same issue
> can be caused with other subrepo aware commands, so there is a general issue
> here beyond the scope of this fix.
>
> A simpler attempt I tried was to add an '_updating' boolean to localrepo, and
> set/clear it around the call to mergemod.update() in hg.updaterepo().  That
> mostly worked, but doesn't handle the case where archive will clone the subrepo
> if it is missing.  (I vaguely recall that there may be other commands that will
> clone if needed like this, but certainly not all do.  It seems both handy, and a
> bit surprising for what should be a read only operation.  It might be nice if
> all commands did this consistently, but we probably need Angel's subrepo caching
> first, to not make a mess of the working directory.)
>
> It was suggested in the bug discussion to skip looking at the subrepo links
> unless -S is specified.  I don't really like that idea because missing a subrepo
> or (less likely, but worse) a corrupt .hgsubstate is a problem of the parent
> repo when checking out a revision.  The -S option seems like a better fit for
> functionality that would recurse into each subrepo and do a full verification.
>
> Ultimately, the default value for 'allowcreate' should probably be flipped, but
> since the default behavior was to allow creation, this is less risky for now.

I think we need at least 2 extra pieces here:

First, I don't think we should silently skip the subrepo; we want at
least a warning to tell the user we did not verify one of them.

Second, we probably want a way to disable this and still check all
subrepositories (at the expense of creating the directory, for now).

(note: I think I preferred the -S approach, but I'm fine with this one).
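
A minimal sketch of those two pieces, assuming a simplified loop rather than
the real code in hg.py: verify_subrepos(), its callbacks, and the warning
text are all hypothetical names chosen for illustration. Only the
'ret = ... or ret' accumulation follows the patched loop.

```python
def verify_subrepos(subpaths, exists, verify, warn, allowcreate=False):
    """Hypothetical helper: warn about skipped subrepos instead of hiding them.

    exists(path) -> bool, verify(path) -> int, and warn(msg) are supplied by
    the caller; none of them are Mercurial functions.
    """
    ret = 0
    for subpath in subpaths:
        if not allowcreate and not exists(subpath):
            # First piece: tell the user this subrepo was not verified.
            warn("subrepo '%s' is missing and was not verified\n" % subpath)
            continue
        # Second piece: with allowcreate=True every subrepo is still checked.
        ret = verify(subpath) or ret
    return ret


if __name__ == "__main__":
    present = {"a"}
    result = verify_subrepos(
        ["a", "b"],
        exists=lambda p: p in present,
        verify=lambda p: 0,
        warn=lambda msg: print(msg, end=""),
    )
    print("result:", result)
```

Whether a skipped subrepo should also change the return code is left open
here; the sketch only makes the skip visible.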

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -275,9 +275,9 @@ 
         except error.LookupError:
             return ''
 
-    def sub(self, path):
+    def sub(self, path, allowcreate=True):
         '''return a subrepo for the stored revision of path, never wdir()'''
-        return subrepo.subrepo(self, path)
+        return subrepo.subrepo(self, path, allowcreate=allowcreate)
 
     def nullsub(self, path, pctx):
         return subrepo.nullsubrepo(self, path, pctx)
diff --git a/mercurial/hg.py b/mercurial/hg.py
--- a/mercurial/hg.py
+++ b/mercurial/hg.py
@@ -828,7 +828,7 @@ 
             ctx = repo[rev]
             try:
                 for subpath in ctx.substate:
-                    ret = ctx.sub(subpath).verify() or ret
+                    ret = ctx.sub(subpath, allowcreate=False).verify() or ret
             except Exception:
                 repo.ui.warn(_('.hgsubstate is corrupt in revision %s\n') %
                              node.short(ctx.node()))
diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
--- a/mercurial/subrepo.py
+++ b/mercurial/subrepo.py
@@ -340,7 +340,7 @@ 
                           "in '%s'\n") % vfs.join(dirname))
                 vfs.unlink(vfs.reljoin(dirname, f))
 
-def subrepo(ctx, path, allowwdir=False):
+def subrepo(ctx, path, allowwdir=False, allowcreate=True):
     """return instance of the right subrepo class for subrepo in path"""
     # subrepo inherently violates our import layering rules
     # because it wants to make repo objects from deep inside the stack
@@ -356,7 +356,7 @@ 
         raise error.Abort(_('unknown subrepo type %s') % state[2])
     if allowwdir:
         state = (state[0], ctx.subrev(path), state[2])
-    return types[state[2]](ctx, path, state[:2])
+    return types[state[2]](ctx, path, state[:2], allowcreate)
 
 def nullsubrepo(ctx, path, pctx):
     """return an empty subrepo in pctx for the extant subrepo in ctx"""
@@ -375,7 +375,7 @@ 
     subrev = ''
     if state[2] == 'hg':
         subrev = "0" * 40
-    return types[state[2]](pctx, path, (state[0], subrev))
+    return types[state[2]](pctx, path, (state[0], subrev), True)
 
 def newcommitphase(ui, ctx):
     commitphase = phases.newcommitphase(ui)
@@ -609,12 +609,12 @@ 
         return self.wvfs.reljoin(reporelpath(self._ctx.repo()), self._path)
 
 class hgsubrepo(abstractsubrepo):
-    def __init__(self, ctx, path, state):
+    def __init__(self, ctx, path, state, allowcreate):
         super(hgsubrepo, self).__init__(ctx, path)
         self._state = state
         r = ctx.repo()
         root = r.wjoin(path)
-        create = not r.wvfs.exists('%s/.hg' % path)
+        create = allowcreate and not r.wvfs.exists('%s/.hg' % path)
         self._repo = hg.repository(r.baseui, root, create=create)
 
         # Propagate the parent's --hidden option
@@ -1062,7 +1062,7 @@ 
         return reporelpath(self._repo)
 
 class svnsubrepo(abstractsubrepo):
-    def __init__(self, ctx, path, state):
+    def __init__(self, ctx, path, state, allowcreate):
         super(svnsubrepo, self).__init__(ctx, path)
         self._state = state
         self._exe = util.findexe('svn')
@@ -1282,7 +1282,7 @@ 
 
 
 class gitsubrepo(abstractsubrepo):
-    def __init__(self, ctx, path, state):
+    def __init__(self, ctx, path, state, allowcreate):
         super(gitsubrepo, self).__init__(ctx, path)
         self._state = state
         self._abspath = ctx.repo().wjoin(path)
diff --git a/tests/test-subrepo-missing.t b/tests/test-subrepo-missing.t
--- a/tests/test-subrepo-missing.t
+++ b/tests/test-subrepo-missing.t
@@ -121,4 +121,23 @@ 
   subrepo 'subrepo' is hidden in revision 674d05939c1e
   subrepo 'subrepo' not found in revision a7d05d9055a4
 
+verifying shouldn't init a new subrepo if the reference doesn't exist
+
+  $ mv subrepo b
+  $ hg verify
+  checking changesets
+  checking manifests
+  crosschecking files in changesets and manifests
+  checking files
+  2 files, 5 changesets, 5 total revisions
+  checking subrepo links
+  .hgsubstate is corrupt in revision ef278ff32036
+  .hgsubstate is corrupt in revision a66de08943b6
+  .hgsubstate is corrupt in revision 674d05939c1e
+  .hgsubstate is corrupt in revision a7d05d9055a4
+  $ find subrepo
+  find: `subrepo': * (glob)
+  [1]
+  $ mv b subrepo
+
   $ cd ..