Patchwork treemanifest: add walksubtrees api

login
register
mail settings
Submitter Durham Goode
Date April 10, 2017, 11:22 p.m.
Message ID <a273161ab99aeec1c1be.1491866526@dev111.prn1.facebook.com>
Download mbox | patch
Permalink /patch/20096/
State Accepted
Headers show

Comments

Durham Goode - April 10, 2017, 11:22 p.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1491854867 25200
#      Mon Apr 10 13:07:47 2017 -0700
# Node ID a273161ab99aeec1c1beb0da627be3aeebbb513e
# Parent  e0dc40530c5aa514feb6a09cf79ab6a3aa2ec331
treemanifest: add walksubtrees api

Adds a new function to treemanifest that allows walking over the directories in
the tree. Currently it only accepts a matcher to prune the walk, but in the
future it will also accept a list of trees and will only walk over subtrees that
differ from the versions in the list. This will be useful for identifying what
parts of the tree are new to this revision, which is useful when deciding the
minimal set of trees to send to a client given that they have a certain tree
already.

Since this is intended for an extension to use, the only current consumer is a
test. In the future this function may be useful for implementing other
algorithms like diff and changegroup generation.
Ryan McElroy - April 11, 2017, 9:15 a.m.
On 4/11/17 12:22 AM, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1491854867 25200
> #      Mon Apr 10 13:07:47 2017 -0700
> # Node ID a273161ab99aeec1c1beb0da627be3aeebbb513e
> # Parent  e0dc40530c5aa514feb6a09cf79ab6a3aa2ec331
> treemanifest: add walksubtrees api

This looks like a straightforward and useful addition. Marked as 
pre-reviewed in patchwork.

>
> Adds a new function to treemanifest that allows walking over the directories in
> the tree. Currently it only accepts a matcher to prune the walk, but in the
> future it will also accept a list of trees and will only walk over subtrees that
> differ from the versions in the list. This will be useful for identifying what
> parts of the tree are new to this revision, which is useful when deciding the
> minimal set of trees to send to a client given that they have a certain tree
> already.
>
> Since this is intended for an extension to use, the only current consumer is a
> test. In the future this function may be useful for implementing other
> algorithms like diff and changegroup generation.
>
> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
> --- a/mercurial/manifest.py
> +++ b/mercurial/manifest.py
> @@ -1155,6 +1155,22 @@ class treemanifest(object):
>                   subp1, subp2 = subp2, subp1
>               writesubtree(subm, subp1, subp2)
>   
> +    def walksubtrees(self, matcher=None):
> +        """Returns an iterator of the subtrees of this manifest, including this
> +        manifest itself.
> +
> +        If `matcher` is provided, it only returns subtrees that match.
> +        """
> +        if matcher and not matcher.visitdir(self._dir[:-1] or '.'):
> +            return
> +        if not matcher or matcher(self._dir[:-1]):
> +            yield self
> +
> +        self._load()
> +        for d, subm in self._dirs.iteritems():
> +            for subtree in subm.walksubtrees(matcher=matcher):
> +                yield subtree
> +
>   class manifestrevlog(revlog.revlog):
>       '''A revlog that stores manifest texts. This is responsible for caching the
>       full-text manifest contents.
> diff --git a/tests/test-manifest.py b/tests/test-manifest.py
> --- a/tests/test-manifest.py
> +++ b/tests/test-manifest.py
> @@ -467,5 +467,21 @@ class testtreemanifest(unittest.TestCase
>       def parsemanifest(self, text):
>           return manifestmod.treemanifest('', text)
>   
> +    def testWalkSubtrees(self):
> +        m = self.parsemanifest(A_DEEPER_MANIFEST)

/me inserts inception-themed meme here

> +
> +        dirs = [s._dir for s in m.walksubtrees()]
> +        self.assertEqual(
> +            sorted(['', 'a/', 'a/c/', 'a/d/', 'a/b/', 'a/b/c/', 'a/b/d/']),
> +            sorted(dirs)
> +        )
> +
> +        match = matchmod.match('/', '', ['path:a/b/'])
> +        dirs = [s._dir for s in m.walksubtrees(matcher=match)]
> +        self.assertEqual(
> +            sorted(['a/b/', 'a/b/c/', 'a/b/d/']),
> +            sorted(dirs)
> +        )
> +
>   if __name__ == '__main__':
>       silenttestrunner.main(__name__)
>
Augie Fackler - April 11, 2017, 2:53 p.m.
On Mon, Apr 10, 2017 at 04:22:06PM -0700, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1491854867 25200
> #      Mon Apr 10 13:07:47 2017 -0700
> # Node ID a273161ab99aeec1c1beb0da627be3aeebbb513e
> # Parent  e0dc40530c5aa514feb6a09cf79ab6a3aa2ec331
> treemanifest: add walksubtrees api

Queued this, thanks.

I wonder if we can use this to make diff faster in treemanifests.

>
> Adds a new function to treemanifest that allows walking over the directories in
> the tree. Currently it only accepts a matcher to prune the walk, but in the
> future it will also accept a list of trees and will only walk over subtrees that
> differ from the versions in the list. This will be useful for identifying what
> parts of the tree are new to this revision, which is useful when deciding the
> minimal set of trees to send to a client given that they have a certain tree
> already.
>
> Since this is intended for an extension to use, the only current consumer is a
> test. In the future this function may be useful for implementing other
> algorithms like diff and changegroup generation.
>
> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
> --- a/mercurial/manifest.py
> +++ b/mercurial/manifest.py
> @@ -1155,6 +1155,22 @@ class treemanifest(object):
>                  subp1, subp2 = subp2, subp1
>              writesubtree(subm, subp1, subp2)
>
> +    def walksubtrees(self, matcher=None):
> +        """Returns an iterator of the subtrees of this manifest, including this
> +        manifest itself.
> +
> +        If `matcher` is provided, it only returns subtrees that match.
> +        """
> +        if matcher and not matcher.visitdir(self._dir[:-1] or '.'):
> +            return
> +        if not matcher or matcher(self._dir[:-1]):
> +            yield self
> +
> +        self._load()
> +        for d, subm in self._dirs.iteritems():
> +            for subtree in subm.walksubtrees(matcher=matcher):
> +                yield subtree
> +
>  class manifestrevlog(revlog.revlog):
>      '''A revlog that stores manifest texts. This is responsible for caching the
>      full-text manifest contents.
> diff --git a/tests/test-manifest.py b/tests/test-manifest.py
> --- a/tests/test-manifest.py
> +++ b/tests/test-manifest.py
> @@ -467,5 +467,21 @@ class testtreemanifest(unittest.TestCase
>      def parsemanifest(self, text):
>          return manifestmod.treemanifest('', text)
>
> +    def testWalkSubtrees(self):
> +        m = self.parsemanifest(A_DEEPER_MANIFEST)
> +
> +        dirs = [s._dir for s in m.walksubtrees()]
> +        self.assertEqual(
> +            sorted(['', 'a/', 'a/c/', 'a/d/', 'a/b/', 'a/b/c/', 'a/b/d/']),
> +            sorted(dirs)
> +        )
> +
> +        match = matchmod.match('/', '', ['path:a/b/'])
> +        dirs = [s._dir for s in m.walksubtrees(matcher=match)]
> +        self.assertEqual(
> +            sorted(['a/b/', 'a/b/c/', 'a/b/d/']),
> +            sorted(dirs)
> +        )
> +
>  if __name__ == '__main__':
>      silenttestrunner.main(__name__)
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -1155,6 +1155,22 @@  class treemanifest(object):
                 subp1, subp2 = subp2, subp1
             writesubtree(subm, subp1, subp2)
 
+    def walksubtrees(self, matcher=None):
+        """Returns an iterator of the subtrees of this manifest, including this
+        manifest itself.
+
+        If `matcher` is provided, it only returns subtrees that match.
+        """
+        if matcher and not matcher.visitdir(self._dir[:-1] or '.'):
+            return
+        if not matcher or matcher(self._dir[:-1]):
+            yield self
+
+        self._load()
+        for d, subm in self._dirs.iteritems():
+            for subtree in subm.walksubtrees(matcher=matcher):
+                yield subtree
+
 class manifestrevlog(revlog.revlog):
     '''A revlog that stores manifest texts. This is responsible for caching the
     full-text manifest contents.
diff --git a/tests/test-manifest.py b/tests/test-manifest.py
--- a/tests/test-manifest.py
+++ b/tests/test-manifest.py
@@ -467,5 +467,21 @@  class testtreemanifest(unittest.TestCase
     def parsemanifest(self, text):
         return manifestmod.treemanifest('', text)
 
+    def testWalkSubtrees(self):
+        m = self.parsemanifest(A_DEEPER_MANIFEST)
+
+        dirs = [s._dir for s in m.walksubtrees()]
+        self.assertEqual(
+            sorted(['', 'a/', 'a/c/', 'a/d/', 'a/b/', 'a/b/c/', 'a/b/d/']),
+            sorted(dirs)
+        )
+
+        match = matchmod.match('/', '', ['path:a/b/'])
+        dirs = [s._dir for s in m.walksubtrees(matcher=match)]
+        self.assertEqual(
+            sorted(['a/b/', 'a/b/c/', 'a/b/d/']),
+            sorted(dirs)
+        )
+
 if __name__ == '__main__':
     silenttestrunner.main(__name__)