Patchwork [2,of,2,V2,mainline] format: introduce 'format.generaldelta=accept'

login
register
mail settings
Submitter Pierre-Yves David
Date Nov. 8, 2015, 5 p.m.
Message ID <3feb562d4e83fcbb7c59.1447002058@marginatus.alto.octopoid.net>
Download mbox | patch
Permalink /patch/11329/
State Changes Requested
Headers show

Comments

Pierre-Yves David - Nov. 8, 2015, 5 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1446479952 0
#      Mon Nov 02 15:59:12 2015 +0000
# Node ID 3feb562d4e83fcbb7c59fa79b5749aae45de9fb1
# Parent  69c64edb25dd96a23564db5b23b8b432b343ec88
# EXP-Topic generaldelta
# Available At http://hg.netv6.net/marmoute-wip/mercurial/
#              hg pull http://hg.netv6.net/marmoute-wip/mercurial/ -r 3feb562d4e83
format: introduce 'format.generaldelta=accept'

This value will make repository created as general delta by default but will not
make Mercurial aggressively recompute delta for all incoming bundle. Instead,
the delta contained in the bundle will be used. This will allow use to start having
general delta repository created everywhere without triggering massive
re-computation cost for all new clients cloning from old server.
Matt Mackall - Nov. 9, 2015, 9:24 p.m.
On Sun, 2015-11-08 at 12:00 -0500, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1446479952 0
> #      Mon Nov 02 15:59:12 2015 +0000
> # Node ID 3feb562d4e83fcbb7c59fa79b5749aae45de9fb1
> # Parent  69c64edb25dd96a23564db5b23b8b432b343ec88
> # EXP-Topic generaldelta
> # Available At http://hg.netv6.net/marmoute-wip/mercurial/
> #              hg pull http://hg.netv6.net/marmoute-wip/mercurial/ -r
> 3feb562d4e83
> format: introduce 'format.generaldelta=accept'


>  def gdconfig(ui):
>      """helper function to retrieve general delta setting
>  
> -    The need for such function is triggered by a soon to come third
> possible
> -    value to the config option."""
> +    return (usegd, lazydelta) tuple
> +
> +    usegd: create new repository as general delta,
> +    lazydelta: do not redelta incoming bundle."""
> +    # the r'' is a trick to have checkconfig ignore the type
> mismatch
>      # experimental config: format.generaldelta
> -    return ui.configbool('format', 'generaldelta', False)
> +    if ui.config(r'format', 'generaldelta', '') == 'accept':
> +        return True, True
> +    # experimental config: format.generaldelta
> +    usegd = ui.configbool('format', 'generaldelta', False)
> +    return usegd, False

Also, I don't think the logic here is nearly complex enough to warrant
a multi-value return. Let's just have two (nearly trivial) functions
defined next to each other.

-- 
Mathematics is the supreme nostalgia of our time.
Pierre-Yves David - Nov. 10, 2015, 5:38 p.m.
On 11/09/2015 01:24 PM, Matt Mackall wrote:
> On Sun, 2015-11-08 at 12:00 -0500, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@fb.com>
>> # Date 1446479952 0
>> #      Mon Nov 02 15:59:12 2015 +0000
>> # Node ID 3feb562d4e83fcbb7c59fa79b5749aae45de9fb1
>> # Parent  69c64edb25dd96a23564db5b23b8b432b343ec88
>> # EXP-Topic generaldelta
>> # Available At http://hg.netv6.net/marmoute-wip/mercurial/
>> #              hg pull http://hg.netv6.net/marmoute-wip/mercurial/ -r
>> 3feb562d4e83
>> format: introduce 'format.generaldelta=accept'
>
>
>>   def gdconfig(ui):
>>       """helper function to retrieve general delta setting
>>
>> -    The need for such function is triggered by a soon to come third
>> possible
>> -    value to the config option."""
>> +    return (usegd, lazydelta) tuple
>> +
>> +    usegd: create new repository as general delta,
>> +    lazydelta: do not redelta incoming bundle."""
>> +    # the r'' is a trick to have checkconfig ignore the type
>> mismatch
>>       # experimental config: format.generaldelta
>> -    return ui.configbool('format', 'generaldelta', False)
>> +    if ui.config(r'format', 'generaldelta', '') == 'accept':
>> +        return True, True
>> +    # experimental config: format.generaldelta
>> +    usegd = ui.configbool('format', 'generaldelta', False)
>> +    return usegd, False
>
> Also, I don't think the logic here is nearly complex enough to warrant
> a multi-value return. Let's just have two (nearly trivial) functions
> defined next to each other.

Without this 1 config, 3 value approach, I;m not sure how to garantee 
proper transition to a new default without breaking people who 
explicitly opted in the existing option.

Currently there is three possibilitie

A) people who did not set the config
B) people who set the config to True
C) people who set the confid to False (huho, really?)

The goal here is to get (A) from "no general delta" to (B) "general 
delta lazy base". In parallel, (B) says "general delta, no lazy base".

If we move to one config, three values (no, accept, yes) we can change 
(A) without affecting (B) nor (C). If we move to two binary config, we 
get want we want for (A) but we regresse (B). They explicit config will 
match the new default, but they won't have anything set override the new 
lazybase one. Leading to delta not being recomputed (unlike before).


So we either:
1) go for the 1 config 3 value options
2) say screw this, it was experimental (for 3 years…) we have no BC
3) find a way I missed
Gregory Szorc - Nov. 10, 2015, 6:03 p.m.
On Tue, Nov 10, 2015 at 9:38 AM, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 11/09/2015 01:24 PM, Matt Mackall wrote:
>
>> On Sun, 2015-11-08 at 12:00 -0500, Pierre-Yves David wrote:
>>
>>> # HG changeset patch
>>> # User Pierre-Yves David <pierre-yves.david@fb.com>
>>> # Date 1446479952 0
>>> #      Mon Nov 02 15:59:12 2015 +0000
>>> # Node ID 3feb562d4e83fcbb7c59fa79b5749aae45de9fb1
>>> # Parent  69c64edb25dd96a23564db5b23b8b432b343ec88
>>> # EXP-Topic generaldelta
>>> # Available At http://hg.netv6.net/marmoute-wip/mercurial/
>>> #              hg pull http://hg.netv6.net/marmoute-wip/mercurial/ -r
>>> 3feb562d4e83
>>> format: introduce 'format.generaldelta=accept'
>>>
>>
>>
>>   def gdconfig(ui):
>>>       """helper function to retrieve general delta setting
>>>
>>> -    The need for such function is triggered by a soon to come third
>>> possible
>>> -    value to the config option."""
>>> +    return (usegd, lazydelta) tuple
>>> +
>>> +    usegd: create new repository as general delta,
>>> +    lazydelta: do not redelta incoming bundle."""
>>> +    # the r'' is a trick to have checkconfig ignore the type
>>> mismatch
>>>       # experimental config: format.generaldelta
>>> -    return ui.configbool('format', 'generaldelta', False)
>>> +    if ui.config(r'format', 'generaldelta', '') == 'accept':
>>> +        return True, True
>>> +    # experimental config: format.generaldelta
>>> +    usegd = ui.configbool('format', 'generaldelta', False)
>>> +    return usegd, False
>>>
>>
>> Also, I don't think the logic here is nearly complex enough to warrant
>> a multi-value return. Let's just have two (nearly trivial) functions
>> defined next to each other.
>>
>
> Without this 1 config, 3 value approach, I;m not sure how to garantee
> proper transition to a new default without breaking people who explicitly
> opted in the existing option.
>
> Currently there is three possibilitie
>
> A) people who did not set the config
> B) people who set the config to True
> C) people who set the confid to False (huho, really?)
>
> The goal here is to get (A) from "no general delta" to (B) "general delta
> lazy base". In parallel, (B) says "general delta, no lazy base".
>
> If we move to one config, three values (no, accept, yes) we can change (A)
> without affecting (B) nor (C). If we move to two binary config, we get want
> we want for (A) but we regresse (B). They explicit config will match the
> new default, but they won't have anything set override the new lazybase
> one. Leading to delta not being recomputed (unlike before).
>
>
> So we either:
> 1) go for the 1 config 3 value options
> 2) say screw this, it was experimental (for 3 years…) we have no BC
> 3) find a way I missed
>
>
How about "throwing away" format.generaldelta and reinventing it as 2 new,
non-experimental config options. We can have an explicit setting of
format.generaldelta infer whatever makes sense. format.generaldelta is
marked as experimental after all. And, that config option is ambiguously
named if there exists a separate general delta option: "general delta what?"
Pierre-Yves David - Nov. 10, 2015, 11:45 p.m.
On 11/10/2015 09:38 AM, Pierre-Yves David wrote:
>
>
> On 11/09/2015 01:24 PM, Matt Mackall wrote:
>> On Sun, 2015-11-08 at 12:00 -0500, Pierre-Yves David wrote:
>>> # HG changeset patch
>>> # User Pierre-Yves David <pierre-yves.david@fb.com>
>>> # Date 1446479952 0
>>> #      Mon Nov 02 15:59:12 2015 +0000
>>> # Node ID 3feb562d4e83fcbb7c59fa79b5749aae45de9fb1
>>> # Parent  69c64edb25dd96a23564db5b23b8b432b343ec88
>>> # EXP-Topic generaldelta
>>> # Available At http://hg.netv6.net/marmoute-wip/mercurial/
>>> #              hg pull http://hg.netv6.net/marmoute-wip/mercurial/ -r
>>> 3feb562d4e83
>>> format: introduce 'format.generaldelta=accept'
>>
>>
>>>   def gdconfig(ui):
>>>       """helper function to retrieve general delta setting
>>>
>>> -    The need for such function is triggered by a soon to come third
>>> possible
>>> -    value to the config option."""
>>> +    return (usegd, lazydelta) tuple
>>> +
>>> +    usegd: create new repository as general delta,
>>> +    lazydelta: do not redelta incoming bundle."""
>>> +    # the r'' is a trick to have checkconfig ignore the type
>>> mismatch
>>>       # experimental config: format.generaldelta
>>> -    return ui.configbool('format', 'generaldelta', False)
>>> +    if ui.config(r'format', 'generaldelta', '') == 'accept':
>>> +        return True, True
>>> +    # experimental config: format.generaldelta
>>> +    usegd = ui.configbool('format', 'generaldelta', False)
>>> +    return usegd, False
>>
>> Also, I don't think the logic here is nearly complex enough to warrant
>> a multi-value return. Let's just have two (nearly trivial) functions
>> defined next to each other.
>
> Without this 1 config, 3 value approach, I;m not sure how to garantee
> proper transition to a new default without breaking people who
> explicitly opted in the existing option.
>
> Currently there is three possibilitie
>
> A) people who did not set the config
> B) people who set the config to True
> C) people who set the confid to False (huho, really?)
>
> The goal here is to get (A) from "no general delta" to (B) "general
> delta lazy base". In parallel, (B) says "general delta, no lazy base".
>
> If we move to one config, three values (no, accept, yes) we can change
> (A) without affecting (B) nor (C). If we move to two binary config, we
> get want we want for (A) but we regresse (B). They explicit config will
> match the new default, but they won't have anything set override the new
> lazybase one. Leading to delta not being recomputed (unlike before).
>
>
> So we either:
> 1) go for the 1 config 3 value options
> 2) say screw this, it was experimental (for 3 years…) we have no BC
> 3) find a way I missed

Matt replied to V1 with a proposal that actually work (case 3). So I'm 
going for it.

Patch

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -256,11 +256,11 @@  class localrepository(object):
                     self.vfs.append(
                         "00changelog.i",
                         '\0\0\0\2' # represents revlogv2
                         ' dummy changelog to prevent using the old repo layout'
                     )
-                if scmutil.gdconfig(self.ui):
+                if scmutil.gdconfig(self.ui)[0]:
                     self.requirements.add("generaldelta")
                 if self.ui.configbool('experimental', 'treemanifest', False):
                     self.requirements.add("treemanifest")
                 if self.ui.configbool('experimental', 'manifestv2', False):
                     self.requirements.add("manifestv2")
@@ -356,10 +356,11 @@  class localrepository(object):
             self.svfs.options['manifestcachesize'] = manifestcachesize
         # experimental config: format.aggressivemergedeltas
         aggressivemergedeltas = self.ui.configbool('format',
             'aggressivemergedeltas', False)
         self.svfs.options['aggressivemergedeltas'] = aggressivemergedeltas
+        self.svfs.options['lazydeltabase'] = scmutil.gdconfig(self.ui)[1]
 
     def _writerequirements(self):
         scmutil.writerequires(self.vfs, self.requirements)
 
     def _checknested(self, path):
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -228,10 +228,11 @@  class revlog(object):
                 self._chunkcachesize = opts['chunkcachesize']
             if 'maxchainlen' in opts:
                 self._maxchainlen = opts['maxchainlen']
             if 'aggressivemergedeltas' in opts:
                 self._aggressivemergedeltas = opts['aggressivemergedeltas']
+            self._lazydeltabase = bool(opts.get('lazydeltabase', False))
 
         if self._chunkcachesize <= 0:
             raise RevlogError(_('revlog chunk cache size %r is not greater '
                                 'than 0') % self._chunkcachesize)
         elif self._chunkcachesize & (self._chunkcachesize - 1):
@@ -1368,11 +1369,15 @@  class revlog(object):
         else:
             textlen = len(text)
 
         # should we try to build a delta?
         if prev != nullrev:
-            if self._generaldelta:
+            if cachedelta and self._generaldelta and self._lazydeltabase:
+                # Assume what we received from the server is a good choice
+                # build delta will reuse the cache
+                d = builddelta(cachedelta[0])
+            elif self._generaldelta:
                 if p2r != nullrev and self._aggressivemergedeltas:
                     d = builddelta(p1r)
                     d2 = builddelta(p2r)
                     p1good = self._isgooddelta(d, textlen)
                     p2good = self._isgooddelta(d2, textlen)
diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -1172,9 +1172,16 @@  def wlocksub(repo, cmd, *args, **kwargs)
                     **kwargs)
 
 def gdconfig(ui):
     """helper function to retrieve general delta setting
 
-    The need for such function is triggered by a soon to come third possible
-    value to the config option."""
+    return (usegd, lazydelta) tuple
+
+    usegd: create new repository as general delta,
+    lazydelta: do not redelta incoming bundle."""
+    # the r'' is a trick to have checkconfig ignore the type mismatch
     # experimental config: format.generaldelta
-    return ui.configbool('format', 'generaldelta', False)
+    if ui.config(r'format', 'generaldelta', '') == 'accept':
+        return True, True
+    # experimental config: format.generaldelta
+    usegd = ui.configbool('format', 'generaldelta', False)
+    return usegd, False
diff --git a/tests/test-generaldelta.t b/tests/test-generaldelta.t
--- a/tests/test-generaldelta.t
+++ b/tests/test-generaldelta.t
@@ -69,10 +69,50 @@  commit.
      rev    offset  length   base linkrev nodeid       p1           p2
        0         0       3      0       1 1406e7411862 000000000000 000000000000
 
   $ cd ..
 
+Test "accept" as a value
+
+delta coming from the server base delta server are not recompressed.
+(also include the aggressive version for comparison)
+
+  $ hg clone repo --pull --config format.generaldelta=accept accept
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 4 changesets with 5 changes to 2 files (+2 heads)
+  updating to branch default
+  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg clone repo --pull --config format.generaldelta=1 full
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 4 changesets with 5 changes to 2 files (+2 heads)
+  updating to branch default
+  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ hg -R repo debugindex -m
+     rev    offset  length   base linkrev nodeid       p1           p2
+       0         0      77      0       0 0273e8a1b972 000000000000 000000000000
+       1        77      57      0       1 e0c49f5ef780 0273e8a1b972 000000000000
+       2       134      77      2       2 de950093e41b 0273e8a1b972 000000000000
+       3       211      57      2       3 db74c7cde4d0 0273e8a1b972 000000000000
+  $ hg -R accept debugindex -m
+     rev    offset  length  delta linkrev nodeid       p1           p2
+       0         0      77     -1       0 0273e8a1b972 000000000000 000000000000
+       1        77      57      0       1 e0c49f5ef780 0273e8a1b972 000000000000
+       2       134      77     -1       2 de950093e41b 0273e8a1b972 000000000000
+       3       211      57      2       3 db74c7cde4d0 0273e8a1b972 000000000000
+  $ hg -R full debugindex -m
+     rev    offset  length  delta linkrev nodeid       p1           p2
+       0         0      77     -1       0 0273e8a1b972 000000000000 000000000000
+       1        77      57      0       1 e0c49f5ef780 0273e8a1b972 000000000000
+       2       134      57      0       2 de950093e41b 0273e8a1b972 000000000000
+       3       191      57      0       3 db74c7cde4d0 0273e8a1b972 000000000000
+
 Test format.aggressivemergedeltas
 
   $ hg init --config format.generaldelta=1 aggressive
   $ cd aggressive
   $ touch a b c d e