Patchwork D7713: verify: allow the storage to signal when renames can be tested on `skipread`

login
register
mail settings
Submitter phabricator
Date Dec. 23, 2019, 7:48 a.m.
Message ID <differential-rev-PHID-DREV-el5nz3rcl5vxp2yenoz4-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/44022/
State Superseded
Headers show

Comments

phabricator - Dec. 23, 2019, 7:48 a.m.
mharbison72 created this revision.
Herald added a reviewer: indygreg.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This applies the new marker in the lfs handler to show it in action, and adds
  the test mentioned at the beginning of the series to show that fulltext isn't
  necessary in the LFS case.
  
  The existing `skipread` isn't enough, because it is also set if an error occurs
  reading the revlog data, or the data is censored.  It could probably be cleared,
  but then it technically violates the interface contract.  That wouldn't matter
  for the existing verify algorithm, but it isn't clear how that will change as
  alternate storage support is added.
  
  The flag is probably pretty revlog specific, given the comments in verify.py.
  But there's already filelog specific stuff in there and I'm not sure what future
  storage will bring, so I don't want to over-engineer this.  Likewise, I'm not
  sure that we want the verify method for each storage type to completely drive
  the bus when it comes to detecting renames, so I don't want to go down the
  rabbithole of having verifyintegrity() return metadata hints at this point.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D7713

AFFECTED FILES
  hgext/lfs/wrapper.py
  mercurial/interfaces/repository.py
  mercurial/revlog.py
  mercurial/verify.py
  tests/test-lfs.t

CHANGE DETAILS




To: mharbison72, indygreg, #hg-reviewers
Cc: mercurial-devel

Patch

diff --git a/tests/test-lfs.t b/tests/test-lfs.t
--- a/tests/test-lfs.t
+++ b/tests/test-lfs.t
@@ -218,6 +218,15 @@ 
   R large
   $ hg commit -m 'renames'
 
+  $ hg cat -r . l -T '{rawdata}\n'
+  version https://git-lfs.github.com/spec/v1
+  oid sha256:66100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e
+  size 39
+  x-hg-copy large
+  x-hg-copyrev 2c531e0992ff3107c511b53cb82a91b6436de8b2
+  x-is-binary 0
+  
+
   $ hg files -r . 'set:copied()'
   l
   s
@@ -796,27 +805,57 @@ 
   $ test -f fromcorrupt/.hg/store/lfs/objects/66/100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e
   [1]
 
-Verify will not try to download lfs blobs, if told not to process lfs content
+Verify will not try to download lfs blobs, if told not to process lfs content.
+The extension makes sure that the filelog.renamed() path is taken on a missing
+blob, and the output shows that it isn't fetched.
 
-  $ hg -R fromcorrupt --config lfs.usercache=emptycache verify -v --no-lfs
+  $ cat > $TESTTMP/lfsrename.py <<EOF
+  > from mercurial import (
+  >     exthelper,
+  > )
+  > 
+  > from hgext.lfs import (
+  >     pointer,
+  >     wrapper,
+  > )
+  > 
+  > eh = exthelper.exthelper()
+  > uisetup = eh.finaluisetup
+  > 
+  > @eh.wrapfunction(wrapper, b'filelogrenamed')
+  > def filelogrenamed(orig, orig1, self, node):
+  >     ret = orig(orig1, self, node)
+  >     if wrapper._islfs(self._revlog, node) and ret:
+  >         rawtext = self._revlog.rawdata(node)
+  >         metadata = pointer.deserialize(rawtext)
+  >         print('lfs blob %s renamed %s -> %s'
+  >               % (metadata[b'oid'], ret[0], self._revlog.filename))
+  >     return ret
+  > EOF
+
+  $ hg -R fromcorrupt --config lfs.usercache=emptycache verify -v --no-lfs \
+  >                   --config extensions.x=$TESTTMP/lfsrename.py
   repository uses revlog format 1
   checking changesets
   checking manifests
   crosschecking files in changesets and manifests
   checking files
   lfs: found 22f66a3fc0b9bf3f012c814303995ec07099b3a9ce02a7af84b5970811074a3b in the local lfs store
+  lfs blob sha256:66100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e renamed large -> l
   checked 5 changesets with 10 changes to 4 files
 
 Verify will not try to download lfs blobs, if told not to by the config option
 
   $ hg -R fromcorrupt --config lfs.usercache=emptycache verify -v \
-  >                   --config verify.skipflags=8192
+  >                   --config verify.skipflags=8192 \
+  >                   --config extensions.x=$TESTTMP/lfsrename.py
   repository uses revlog format 1
   checking changesets
   checking manifests
   crosschecking files in changesets and manifests
   checking files
   lfs: found 22f66a3fc0b9bf3f012c814303995ec07099b3a9ce02a7af84b5970811074a3b in the local lfs store
+  lfs blob sha256:66100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e renamed large -> l
   checked 5 changesets with 10 changes to 4 files
 
 Verify will copy/link all lfs objects into the local store that aren't already
diff --git a/mercurial/verify.py b/mercurial/verify.py
--- a/mercurial/verify.py
+++ b/mercurial/verify.py
@@ -529,6 +529,8 @@ 
             else:
                 # Guard against implementations not setting this.
                 state[b'skipread'] = set()
+                state[b'safe_renamed'] = set()
+
                 for problem in fl.verifyintegrity(state):
                     if problem.node is not None:
                         linkrev = fl.linkrev(fl.rev(problem.node))
@@ -560,7 +562,7 @@ 
                     else:
                         del filenodes[f][n]
 
-                if n in state[b'skipread']:
+                if n in state[b'skipread'] and n not in state[b'safe_renamed']:
                     continue
 
                 # check renames
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -2867,6 +2867,7 @@ 
             )
 
         state[b'skipread'] = set()
+        state[b'safe_renamed'] = set()
 
         for rev in self:
             node = self.node(rev)
diff --git a/mercurial/interfaces/repository.py b/mercurial/interfaces/repository.py
--- a/mercurial/interfaces/repository.py
+++ b/mercurial/interfaces/repository.py
@@ -878,7 +878,9 @@ 
 
         If individual revisions cannot have their revision content resolved,
         the method is expected to set the ``skipread`` key to a set of nodes
-        that encountered problems.
+        that encountered problems.  If set, the method can also add the node(s)
+        to ``safe_renamed`` in order to indicate nodes that may perform the
+        rename checks with currently accessible data.
 
         The method yields objects conforming to the ``iverifyproblem``
         interface.
diff --git a/hgext/lfs/wrapper.py b/hgext/lfs/wrapper.py
--- a/hgext/lfs/wrapper.py
+++ b/hgext/lfs/wrapper.py
@@ -236,6 +236,10 @@ 
         # the revlog.
         if rl.opener.lfslocalblobstore.has(metadata.oid()):
             skipflags &= ~revlog.REVIDX_EXTSTORED
+        elif skipflags & revlog.REVIDX_EXTSTORED:
+            # The wrapped method will set `skipread`, but there's enough local
+            # info to check renames.
+            state[b'safe_renamed'].add(node)
 
     orig(rl, skipflags, state, node)