Patchwork treemanifest: allow manifestrevlog to take an explicit treemanifest arg

login
register
mail settings
Submitter Durham Goode
Date May 9, 2017, 1:23 a.m.
Message ID <6d02589fd6a76c354aff.1494293001@dev111.prn1.facebook.com>
Download mbox | patch
Permalink /patch/20537/
State Changes Requested
Headers show

Comments

Durham Goode - May 9, 2017, 1:23 a.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1494292924 25200
#      Mon May 08 18:22:04 2017 -0700
# Node ID 6d02589fd6a76c354aff165503cc72333b14307a
# Parent  8f1a2b848b52ea7bf3fe2404e3b62924c7aae93f
treemanifest: allow manifestrevlog to take an explicit treemanifest arg

Previously we relied on the opener options to tell the revlog to be a tree
manifest. This makes it complicated for extensions to create treemanifests and
normal manifests at the same time. Let's add a construtor argument to create a
treemanifest revlog as well.
Ryan McElroy - May 9, 2017, 1:29 p.m.
On 5/9/17 2:23 AM, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1494292924 25200
> #      Mon May 08 18:22:04 2017 -0700
> # Node ID 6d02589fd6a76c354aff165503cc72333b14307a
> # Parent  8f1a2b848b52ea7bf3fe2404e3b62924c7aae93f
> treemanifest: allow manifestrevlog to take an explicit treemanifest arg
>
> Previously we relied on the opener options to tell the revlog to be a tree
> manifest. This makes it complicated for extensions to create treemanifests and
> normal manifests at the same time. Let's add a construtor argument to create a
> treemanifest revlog as well.
>
> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
> --- a/mercurial/manifest.py
> +++ b/mercurial/manifest.py
> @@ -1175,25 +1175,29 @@ class manifestrevlog(revlog.revlog):
>       '''A revlog that stores manifest texts. This is responsible for caching the
>       full-text manifest contents.
>       '''
> -    def __init__(self, opener, dir='', dirlogcache=None, indexfile=None):
> +    def __init__(self, opener, dir='', dirlogcache=None, indexfile=None,
> +                 treemanifest=False):
>           """Constructs a new manifest revlog
>   
>           `indexfile` - used by extensions to have two manifests at once, like
>           when transitioning between flatmanifeset and treemanifests.
> +
> +        `treemanifest` - used to indicate this is a tree manifest revlog. If the
> +        treemanifest opener option is set to True, this argument is ignored.

This comment doesn't look quite accurate: from the code, it looks like 
if either treemanifest=True here, or opts.treemanifest is True, then you 
use Treemanifests, so "ignored" isn't quite the right term it seems to 
me. I'd like to see this clarified.

>           """
>           # During normal operations, we expect to deal with not more than four
>           # revs at a time (such as during commit --amend). When rebasing large
>           # stacks of commits, the number can go up, hence the config knob below.
>           cachesize = 4
> -        usetreemanifest = False
> +        optiontreemanifest = False

Not sure why the rename here -- usetreemanifest is used elsewhere in 
this file (down in the manifestlog constructor, around line 1311) and it 
still uses the "old" name and override mechanism, so it seems that both 
should be updated at the same time, or in the same series.

>           usemanifestv2 = False
>           opts = getattr(opener, 'options', None)
>           if opts is not None:
>               cachesize = opts.get('manifestcachesize', cachesize)
> -            usetreemanifest = opts.get('treemanifest', usetreemanifest)
> +            optiontreemanifest = opts.get('treemanifest', False)
>               usemanifestv2 = opts.get('manifestv2', usemanifestv2)
>   
> -        self._treeondisk = usetreemanifest
> +        self._treeondisk = optiontreemanifest or treemanifest
>           self._usemanifestv2 = usemanifestv2
>   
>           self._fulltextcache = util.lrucachedict(cachesize)
> @@ -1232,7 +1236,8 @@ class manifestrevlog(revlog.revlog):
>               assert self._treeondisk
>           if dir not in self._dirlogcache:
>               self._dirlogcache[dir] = manifestrevlog(self.opener, dir,
> -                                                    self._dirlogcache)
> +                                                    self._dirlogcache,
> +                                                    treemanifest=True)

It's not clear to me that this should have been added as True here. What 
am I missing that makes this correct?

>           return self._dirlogcache[dir]
>   
>       def add(self, m, transaction, link, p1, p2, added, removed, readtree=None):
>

I have enough questions about this I'd like to see a follow-up so I'll 
mark as "changes requested" in patchwork.
Durham Goode - May 9, 2017, 5:02 p.m.
On 5/9/17 6:29 AM, Ryan McElroy wrote:
> On 5/9/17 2:23 AM, Durham Goode wrote:
>> # HG changeset patch
>> # User Durham Goode <durham@fb.com>
>> # Date 1494292924 25200
>> #      Mon May 08 18:22:04 2017 -0700
>> # Node ID 6d02589fd6a76c354aff165503cc72333b14307a
>> # Parent  8f1a2b848b52ea7bf3fe2404e3b62924c7aae93f
>> treemanifest: allow manifestrevlog to take an explicit treemanifest arg
>>
>> Previously we relied on the opener options to tell the revlog to be a
>> tree
>> manifest. This makes it complicated for extensions to create
>> treemanifests and
>> normal manifests at the same time. Let's add a construtor argument to
>> create a
>> treemanifest revlog as well.
>>
>> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
>> --- a/mercurial/manifest.py
>> +++ b/mercurial/manifest.py
>> @@ -1175,25 +1175,29 @@ class manifestrevlog(revlog.revlog):
>>       '''A revlog that stores manifest texts. This is responsible for
>> caching the
>>       full-text manifest contents.
>>       '''
>> -    def __init__(self, opener, dir='', dirlogcache=None,
>> indexfile=None):
>> +    def __init__(self, opener, dir='', dirlogcache=None, indexfile=None,
>> +                 treemanifest=False):
>>           """Constructs a new manifest revlog
>>             `indexfile` - used by extensions to have two manifests at
>> once, like
>>           when transitioning between flatmanifeset and treemanifests.
>> +
>> +        `treemanifest` - used to indicate this is a tree manifest
>> revlog. If the
>> +        treemanifest opener option is set to True, this argument is
>> ignored.
>
> This comment doesn't look quite accurate: from the code, it looks like
> if either treemanifest=True here, or opts.treemanifest is True, then you
> use Treemanifests, so "ignored" isn't quite the right term it seems to
> me. I'd like to see this clarified.

If opts.treemanifest is True, then the value of the treemanifest arg 
isn't used. 'ignored' seems appropriate since it indicates that a False 
argument doesn't override the option setting.  I can make it more 
explicit I guess?

>
>>           """
>>           # During normal operations, we expect to deal with not more
>> than four
>>           # revs at a time (such as during commit --amend). When
>> rebasing large
>>           # stacks of commits, the number can go up, hence the config
>> knob below.
>>           cachesize = 4
>> -        usetreemanifest = False
>> +        optiontreemanifest = False
>
> Not sure why the rename here -- usetreemanifest is used elsewhere in
> this file (down in the manifestlog constructor, around line 1311) and it
> still uses the "old" name and override mechanism, so it seems that both
> should be updated at the same time, or in the same series.

This function now has an argument that indicates treemanifest and a 
option that indicates treemanifest.  'usetreemanifest' was only used for 
the option version, and therefore its name was too vague (it would make 
the line below be `self._treeondisk = usetreemanifest or treemanifest` 
which is confusing).

The manifestlog constructor doesn't have this problem because it only 
has the option, so usetreemanifest is appropriate there. I could rename 
it too, but it didn't seem worth the code churn for minor consistency 
for low use variable names.

>>           usemanifestv2 = False
>>           opts = getattr(opener, 'options', None)
>>           if opts is not None:
>>               cachesize = opts.get('manifestcachesize', cachesize)
>> -            usetreemanifest = opts.get('treemanifest', usetreemanifest)
>> +            optiontreemanifest = opts.get('treemanifest', False)
>>               usemanifestv2 = opts.get('manifestv2', usemanifestv2)
>>   -        self._treeondisk = usetreemanifest
>> +        self._treeondisk = optiontreemanifest or treemanifest
>>           self._usemanifestv2 = usemanifestv2
>>             self._fulltextcache = util.lrucachedict(cachesize)
>> @@ -1232,7 +1236,8 @@ class manifestrevlog(revlog.revlog):
>>               assert self._treeondisk
>>           if dir not in self._dirlogcache:
>>               self._dirlogcache[dir] = manifestrevlog(self.opener, dir,
>> -                                                    self._dirlogcache)
>> +                                                    self._dirlogcache,
>> +                                                    treemanifest=True)
>
> It's not clear to me that this should have been added as True here. What
> am I missing that makes this correct?

This constructor is only ever run for sub directories (since the root 
value is populated in the cache during the constructor), so it will 
always be a treemanifest here.  But you're right, it's too subtle.  I'll 
change this to `treemanifest=self._treeondisk`

Patch

diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -1175,25 +1175,29 @@  class manifestrevlog(revlog.revlog):
     '''A revlog that stores manifest texts. This is responsible for caching the
     full-text manifest contents.
     '''
-    def __init__(self, opener, dir='', dirlogcache=None, indexfile=None):
+    def __init__(self, opener, dir='', dirlogcache=None, indexfile=None,
+                 treemanifest=False):
         """Constructs a new manifest revlog
 
         `indexfile` - used by extensions to have two manifests at once, like
         when transitioning between flatmanifeset and treemanifests.
+
+        `treemanifest` - used to indicate this is a tree manifest revlog. If the
+        treemanifest opener option is set to True, this argument is ignored.
         """
         # During normal operations, we expect to deal with not more than four
         # revs at a time (such as during commit --amend). When rebasing large
         # stacks of commits, the number can go up, hence the config knob below.
         cachesize = 4
-        usetreemanifest = False
+        optiontreemanifest = False
         usemanifestv2 = False
         opts = getattr(opener, 'options', None)
         if opts is not None:
             cachesize = opts.get('manifestcachesize', cachesize)
-            usetreemanifest = opts.get('treemanifest', usetreemanifest)
+            optiontreemanifest = opts.get('treemanifest', False)
             usemanifestv2 = opts.get('manifestv2', usemanifestv2)
 
-        self._treeondisk = usetreemanifest
+        self._treeondisk = optiontreemanifest or treemanifest
         self._usemanifestv2 = usemanifestv2
 
         self._fulltextcache = util.lrucachedict(cachesize)
@@ -1232,7 +1236,8 @@  class manifestrevlog(revlog.revlog):
             assert self._treeondisk
         if dir not in self._dirlogcache:
             self._dirlogcache[dir] = manifestrevlog(self.opener, dir,
-                                                    self._dirlogcache)
+                                                    self._dirlogcache,
+                                                    treemanifest=True)
         return self._dirlogcache[dir]
 
     def add(self, m, transaction, link, p1, p2, added, removed, readtree=None):