Submitter | Pierre-Yves David |
---|---|
Date | Dec. 2, 2015, 5:32 p.m. |
Message ID | <64e48bdff2461e793c96.1449077570@marginatus.alto.octopoid.net> |
Download | mbox | patch |
Permalink | /patch/11763/ |
State | Accepted |
Headers | show |
Comments
On Wed, Dec 2, 2015 at 9:56 AM Pierre-Yves David < pierre-yves.david@ens-lyon.org> wrote: > # HG changeset patch > # User Pierre-Yves David <pierre-yves.david@fb.com> > # Date 1449015359 28800 > # Tue Dec 01 16:15:59 2015 -0800 > # Node ID 64e48bdff2461e793c963d677407d35fc32c30a9 > # Parent 49f5bfa7043ae0c797536dab93260aee92fcbaa8 > # EXP-Topic generaldelta > # Available At http://hg.netv6.net/marmoute-wip/mercurial/ > # hg pull http://hg.netv6.net/marmoute-wip/mercurial/ -r > 64e48bdff246 > addrevision: use general delta when the incoming base delta is bad > > We unify the delta selection process to be a simple three options process: > > - try to use the incoming delta (if lazydeltabase is on) > - try to find a suitable parents to delta against (if gd is on) > - try to delta against the tipmost revision > > The first of this option that yield a valid delta will be used. > > The test change in 'test-generaldelta.t' show this behavior as we use a > delta > against the parent instead of a full delta when the incoming delta is not > suitable. > > This as some impact on 'test-bundle.t' because a delta somewhere changes. > It > does not seems to change the test semantic and have been ignored. > > diff --git a/mercurial/revlog.py b/mercurial/revlog.py > --- a/mercurial/revlog.py > +++ b/mercurial/revlog.py > @@ -1422,38 +1422,37 @@ class revlog(object): > else: > textlen = len(text) > > # should we try to build a delta? > if prev != nullrev: > + tested = set() > if cachedelta and self._generaldelta and self._lazydeltabase: > # Assume what we received from the server is a good choice > # build delta will reuse the cache > candidatedelta = builddelta(cachedelta[0]) > + tested.add(candidatedelta[3]) > if self._isgooddelta(candidatedelta, textlen): > delta = candidatedelta > - elif prev != candidatedelta[3]: > - # Try against prev to hopefully save us a fulltext. > - delta = builddelta(prev) > - elif self._generaldelta: > + if delta is None and self._generaldelta: > parents = [p1r, p2r] > - if not self._aggressivemergedeltas: > + # exclude already lazy tested base if any > + parents = [p for p in parents if p not in tested] > + if parents and not self._aggressivemergedeltas: > # Pick whichever parent is closer to us (to minimize > the > - # chance of having to build a fulltext). Since > - # nullrev == -1, any non-merge commit will always > pick p1r. > + # chance of having to build a fulltext). > parents = [max(parents)] > Before this patch, there were always 2 revisions in parents before this line, so the nullrev would never be tested only for root commits. After this patch, nullrev will also be tested for non-merge commits where the first parent was the incoming delta. I suppose it's unlikely to make a difference, since those will almost always probably not be good deltas, but it also seems unnecessary to even check. Is the testing of nullrev even intentional? > + tested.update(parents) > pdeltas = [] > for p in parents: > pd = builddelta(p) > if self._isgooddelta(pd, textlen): > pdeltas.append(pd) > if pdeltas: > delta = min(pdeltas, key=lambda x: x[1]) > - elif prev not in parents: > - # Neither is good, try against prev to hopefully save > us > - # a fulltext. > - delta = builddelta(prev) > - else: > + if delta is None and prev not in tested: > + # other approach failed try against prev to hopefully > save us a > + # fulltext. > delta = builddelta(prev) > if delta is not None: > dist, l, data, base, chainbase, chainlen, compresseddeltalen > = delta > > if not self._isgooddelta(delta, textlen): > diff --git a/tests/test-bundle.t b/tests/test-bundle.t > --- a/tests/test-bundle.t > +++ b/tests/test-bundle.t > @@ -264,17 +264,17 @@ Cannot produce streaming clone bundles w > [255] > > packed1 is produced properly > > $ hg -R test debugcreatestreamclonebundle packed.hg > - writing 2663 bytes for 6 files > + writing 2667 bytes for 6 files > bundle requirements: generaldelta, revlogv1 > > $ f -B 64 --size --sha1 --hexdump packed.hg > - packed.hg: size=2826, sha1=e139f97692a142b19cdcff64a69697d5307ce6d4 > + packed.hg: size=2830, sha1=c28255110a88ffa52ddc44985cad295b1ab349bc > 0000: 48 47 53 31 55 4e 00 00 00 00 00 00 00 06 00 00 |HGS1UN..........| > - 0010: 00 00 00 00 0a 67 00 16 67 65 6e 65 72 61 6c 64 |.....g..generald| > + 0010: 00 00 00 00 0a 6b 00 16 67 65 6e 65 72 61 6c 64 |.....k..generald| > 0020: 65 6c 74 61 2c 72 65 76 6c 6f 67 76 31 00 64 61 |elta,revlogv1.da| > 0030: 74 61 2f 61 64 69 66 66 65 72 65 6e 74 66 69 6c |ta/adifferentfil| > > generaldelta requirement is listed in stream clone bundles > > diff --git a/tests/test-generaldelta.t b/tests/test-generaldelta.t > --- a/tests/test-generaldelta.t > +++ b/tests/test-generaldelta.t > @@ -103,11 +103,11 @@ delta coming from the server base delta > $ hg -R usegd debugindex -m > rev offset length delta linkrev nodeid p1 p2 > 0 0 104 -1 0 cef96823c800 000000000000 > 000000000000 > 1 104 57 0 1 58ab9a8d541d cef96823c800 > 000000000000 > 2 161 57 1 2 134fdc6fd680 cef96823c800 > 000000000000 > - 3 218 104 -1 3 723508934dad cef96823c800 > 000000000000 > + 3 218 57 0 3 723508934dad cef96823c800 > 000000000000 > $ hg -R full debugindex -m > rev offset length delta linkrev nodeid p1 p2 > 0 0 104 -1 0 cef96823c800 000000000000 > 000000000000 > 1 104 57 0 1 58ab9a8d541d cef96823c800 > 000000000000 > 2 161 57 0 2 134fdc6fd680 cef96823c800 > 000000000000 > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@selenic.com > https://selenic.com/mailman/listinfo/mercurial-devel >
On 12/02/2015 11:15 PM, Martin von Zweigbergk wrote: > > > On Wed, Dec 2, 2015 at 9:56 AM Pierre-Yves David > <pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>> > wrote: > > # HG changeset patch > # User Pierre-Yves David <pierre-yves.david@fb.com > <mailto:pierre-yves.david@fb.com>> > # Date 1449015359 28800 > # Tue Dec 01 16:15:59 2015 -0800 > # Node ID 64e48bdff2461e793c963d677407d35fc32c30a9 > # Parent 49f5bfa7043ae0c797536dab93260aee92fcbaa8 > # EXP-Topic generaldelta > # Available At http://hg.netv6.net/marmoute-wip/mercurial/ > # hg pull http://hg.netv6.net/marmoute-wip/mercurial/ > -r 64e48bdff246 > addrevision: use general delta when the incoming base delta is bad > > We unify the delta selection process to be a simple three options > process: > > - try to use the incoming delta (if lazydeltabase is on) > - try to find a suitable parents to delta against (if gd is on) > - try to delta against the tipmost revision > > The first of this option that yield a valid delta will be used. > > The test change in 'test-generaldelta.t' show this behavior as we > use a delta > against the parent instead of a full delta when the incoming delta > is not > suitable. > > This as some impact on 'test-bundle.t' because a delta somewhere > changes. It > does not seems to change the test semantic and have been ignored. > > diff --git a/mercurial/revlog.py b/mercurial/revlog.py > --- a/mercurial/revlog.py > +++ b/mercurial/revlog.py > @@ -1422,38 +1422,37 @@ class revlog(object): > else: > textlen = len(text) > > # should we try to build a delta? > if prev != nullrev: > + tested = set() > if cachedelta and self._generaldelta and > self._lazydeltabase: > # Assume what we received from the server is a > good choice > # build delta will reuse the cache > candidatedelta = builddelta(cachedelta[0]) > + tested.add(candidatedelta[3]) > if self._isgooddelta(candidatedelta, textlen): > delta = candidatedelta > - elif prev != candidatedelta[3]: > - # Try against prev to hopefully save us a fulltext. > - delta = builddelta(prev) > - elif self._generaldelta: > + if delta is None and self._generaldelta: > parents = [p1r, p2r] > - if not self._aggressivemergedeltas: > + # exclude already lazy tested base if any > + parents = [p for p in parents if p not in tested] > + if parents and not self._aggressivemergedeltas: > # Pick whichever parent is closer to us (to > minimize the > - # chance of having to build a fulltext). Since > - # nullrev == -1, any non-merge commit will > always pick p1r. > + # chance of having to build a fulltext). > parents = [max(parents)] > > > Before this patch, there were always 2 revisions in parents before this > line, so the nullrev would never be tested only for root commits. After > this patch, nullrev will also be tested for non-merge commits where the > first parent was the incoming delta. I suppose it's unlikely to make a > difference, since those will almost always probably not be good deltas, > but it also seems unnecessary to even check. Is the testing of nullrev > even intentional? earlier version of this patch was testing for nullrev explicitly, but I figured out that builddelta could also we feed nullrev (root commit) and therefore was able to handle it.
On Wed, Dec 2, 2015 at 11:19 PM Pierre-Yves David < pierre-yves.david@ens-lyon.org> wrote: > > > On 12/02/2015 11:15 PM, Martin von Zweigbergk wrote: > > > > > > On Wed, Dec 2, 2015 at 9:56 AM Pierre-Yves David > > <pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>> > > wrote: > > > > # HG changeset patch > > # User Pierre-Yves David <pierre-yves.david@fb.com > > <mailto:pierre-yves.david@fb.com>> > > # Date 1449015359 28800 > > # Tue Dec 01 16:15:59 2015 -0800 > > # Node ID 64e48bdff2461e793c963d677407d35fc32c30a9 > > # Parent 49f5bfa7043ae0c797536dab93260aee92fcbaa8 > > # EXP-Topic generaldelta > > # Available At http://hg.netv6.net/marmoute-wip/mercurial/ > > # hg pull http://hg.netv6.net/marmoute-wip/mercurial/ > > -r 64e48bdff246 > > addrevision: use general delta when the incoming base delta is bad > > > > We unify the delta selection process to be a simple three options > > process: > > > > - try to use the incoming delta (if lazydeltabase is on) > > - try to find a suitable parents to delta against (if gd is on) > > - try to delta against the tipmost revision > > > > The first of this option that yield a valid delta will be used. > > > > The test change in 'test-generaldelta.t' show this behavior as we > > use a delta > > against the parent instead of a full delta when the incoming delta > > is not > > suitable. > > > > This as some impact on 'test-bundle.t' because a delta somewhere > > changes. It > > does not seems to change the test semantic and have been ignored. > > > > diff --git a/mercurial/revlog.py b/mercurial/revlog.py > > --- a/mercurial/revlog.py > > +++ b/mercurial/revlog.py > > @@ -1422,38 +1422,37 @@ class revlog(object): > > else: > > textlen = len(text) > > > > # should we try to build a delta? > > if prev != nullrev: > > + tested = set() > > if cachedelta and self._generaldelta and > > self._lazydeltabase: > > # Assume what we received from the server is a > > good choice > > # build delta will reuse the cache > > candidatedelta = builddelta(cachedelta[0]) > > + tested.add(candidatedelta[3]) > > if self._isgooddelta(candidatedelta, textlen): > > delta = candidatedelta > > - elif prev != candidatedelta[3]: > > - # Try against prev to hopefully save us a > fulltext. > > - delta = builddelta(prev) > > - elif self._generaldelta: > > + if delta is None and self._generaldelta: > > parents = [p1r, p2r] > > - if not self._aggressivemergedeltas: > > + # exclude already lazy tested base if any > > + parents = [p for p in parents if p not in tested] > > + if parents and not self._aggressivemergedeltas: > > # Pick whichever parent is closer to us (to > > minimize the > > - # chance of having to build a fulltext). Since > > - # nullrev == -1, any non-merge commit will > > always pick p1r. > > + # chance of having to build a fulltext). > > parents = [max(parents)] > > > > > > Before this patch, there were always 2 revisions in parents before this > > line, so the nullrev would never be tested only for root commits. After > > this patch, nullrev will also be tested for non-merge commits where the > > first parent was the incoming delta. I suppose it's unlikely to make a > > difference, since those will almost always probably not be good deltas, > > but it also seems unnecessary to even check. Is the testing of nullrev > > even intentional? > > earlier version of this patch was testing for nullrev explicitly, but I > figured out that builddelta could also we feed nullrev (root commit) and > therefore was able to handle it. > I guess you're saying it's not wasteful either. Good, pushed to the clowncopter! > > -- > Pierre-Yves David >
Patch
diff --git a/mercurial/revlog.py b/mercurial/revlog.py --- a/mercurial/revlog.py +++ b/mercurial/revlog.py @@ -1422,38 +1422,37 @@ class revlog(object): else: textlen = len(text) # should we try to build a delta? if prev != nullrev: + tested = set() if cachedelta and self._generaldelta and self._lazydeltabase: # Assume what we received from the server is a good choice # build delta will reuse the cache candidatedelta = builddelta(cachedelta[0]) + tested.add(candidatedelta[3]) if self._isgooddelta(candidatedelta, textlen): delta = candidatedelta - elif prev != candidatedelta[3]: - # Try against prev to hopefully save us a fulltext. - delta = builddelta(prev) - elif self._generaldelta: + if delta is None and self._generaldelta: parents = [p1r, p2r] - if not self._aggressivemergedeltas: + # exclude already lazy tested base if any + parents = [p for p in parents if p not in tested] + if parents and not self._aggressivemergedeltas: # Pick whichever parent is closer to us (to minimize the - # chance of having to build a fulltext). Since - # nullrev == -1, any non-merge commit will always pick p1r. + # chance of having to build a fulltext). parents = [max(parents)] + tested.update(parents) pdeltas = [] for p in parents: pd = builddelta(p) if self._isgooddelta(pd, textlen): pdeltas.append(pd) if pdeltas: delta = min(pdeltas, key=lambda x: x[1]) - elif prev not in parents: - # Neither is good, try against prev to hopefully save us - # a fulltext. - delta = builddelta(prev) - else: + if delta is None and prev not in tested: + # other approach failed try against prev to hopefully save us a + # fulltext. delta = builddelta(prev) if delta is not None: dist, l, data, base, chainbase, chainlen, compresseddeltalen = delta if not self._isgooddelta(delta, textlen): diff --git a/tests/test-bundle.t b/tests/test-bundle.t --- a/tests/test-bundle.t +++ b/tests/test-bundle.t @@ -264,17 +264,17 @@ Cannot produce streaming clone bundles w [255] packed1 is produced properly $ hg -R test debugcreatestreamclonebundle packed.hg - writing 2663 bytes for 6 files + writing 2667 bytes for 6 files bundle requirements: generaldelta, revlogv1 $ f -B 64 --size --sha1 --hexdump packed.hg - packed.hg: size=2826, sha1=e139f97692a142b19cdcff64a69697d5307ce6d4 + packed.hg: size=2830, sha1=c28255110a88ffa52ddc44985cad295b1ab349bc 0000: 48 47 53 31 55 4e 00 00 00 00 00 00 00 06 00 00 |HGS1UN..........| - 0010: 00 00 00 00 0a 67 00 16 67 65 6e 65 72 61 6c 64 |.....g..generald| + 0010: 00 00 00 00 0a 6b 00 16 67 65 6e 65 72 61 6c 64 |.....k..generald| 0020: 65 6c 74 61 2c 72 65 76 6c 6f 67 76 31 00 64 61 |elta,revlogv1.da| 0030: 74 61 2f 61 64 69 66 66 65 72 65 6e 74 66 69 6c |ta/adifferentfil| generaldelta requirement is listed in stream clone bundles diff --git a/tests/test-generaldelta.t b/tests/test-generaldelta.t --- a/tests/test-generaldelta.t +++ b/tests/test-generaldelta.t @@ -103,11 +103,11 @@ delta coming from the server base delta $ hg -R usegd debugindex -m rev offset length delta linkrev nodeid p1 p2 0 0 104 -1 0 cef96823c800 000000000000 000000000000 1 104 57 0 1 58ab9a8d541d cef96823c800 000000000000 2 161 57 1 2 134fdc6fd680 cef96823c800 000000000000 - 3 218 104 -1 3 723508934dad cef96823c800 000000000000 + 3 218 57 0 3 723508934dad cef96823c800 000000000000 $ hg -R full debugindex -m rev offset length delta linkrev nodeid p1 p2 0 0 104 -1 0 cef96823c800 000000000000 000000000000 1 104 57 0 1 58ab9a8d541d cef96823c800 000000000000 2 161 57 0 2 134fdc6fd680 cef96823c800 000000000000