Patchwork [02,of,10,V2] revlog: fix pure python slicing code when chain contains nullrev

login
register
mail settings
Submitter Boris Feld
Date Dec. 21, 2018, 11:47 a.m.
Message ID <df9b79e863d6e5215175.1545392825@localhost.localdomain>
Download mbox | patch
Permalink /patch/37293/
State Accepted
Headers show

Comments

Boris Feld - Dec. 21, 2018, 11:47 a.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1545296356 -3600
#      Thu Dec 20 09:59:16 2018 +0100
# Node ID df9b79e863d6e5215175487330b1469067e20bbd
# Parent  d51d82a46d9545235be727b875deeffd9de324e9
# EXP-Topic sparse-revlog
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r df9b79e863d6
revlog: fix pure python slicing code when chain contains nullrev

We fixed the C code, but the python code still misbehaved.
Yuya Nishihara - Dec. 22, 2018, 3:20 a.m.
On Fri, 21 Dec 2018 12:47:05 +0100, Boris Feld wrote:
> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1545296356 -3600
> #      Thu Dec 20 09:59:16 2018 +0100
> # Node ID df9b79e863d6e5215175487330b1469067e20bbd
> # Parent  d51d82a46d9545235be727b875deeffd9de324e9
> # EXP-Topic sparse-revlog
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r df9b79e863d6
> revlog: fix pure python slicing code when chain contains nullrev
> 
> We fixed the C code, but the python code still misbehaved.
> 
> diff --git a/mercurial/revlogutils/deltas.py b/mercurial/revlogutils/deltas.py
> --- a/mercurial/revlogutils/deltas.py
> +++ b/mercurial/revlogutils/deltas.py
> @@ -116,6 +116,12 @@ def slicechunk(revlog, revs, targetsize=
>      [[0], [11], [13], [15]]
>      >>> list(slicechunk(revlog, [0, 11, 13, 15], targetsize=20))
>      [[0], [11], [13, 15]]
> +
> +    Slicing involving nullrev
> +    >>> list(slicechunk(revlog, [-1, 0, 11, 13, 15], targetsize=20))
> +    [[-1, 0], [11], [13, 15]]
> +    >>> list(slicechunk(revlog, [-1, 13, 15], targetsize=5))
> +    [[-1, 13, 15]]
>      """
>      if targetsize is not None:
>          targetsize = max(targetsize, revlog._srmingapsize)
> @@ -363,6 +369,8 @@ def _slicechunktodensity(revlog, revs, t
>      gaps = []
>      prevend = None
>      for i, rev in enumerate(revs):
> +        if rev == nullrev:
> +            continue
>          revstart = start(rev)
>          revlen = length(rev)
>  
> @@ -490,7 +498,11 @@ def segmentspan(revlog, revs):
>      if not revs:
>          return 0
>      end = revlog.end(revs[-1])
> -    return end - revlog.start(revs[0])
> +    i = 0
> +    while revs[i] == nullrev:
> +        i += 1
> +    startrev = revs[i]
> +    return end - revlog.start(startrev)

The C implementation appears not doing that IIUC. Doesn't it a bug of
the _testrevlog? I think it will misbehave if nullrev is passed in.

Can you turn these tests into a unittest to cover both pure and C
implementations?
Boris Feld - Dec. 27, 2018, 11:24 p.m.
On 22/12/2018 04:20, Yuya Nishihara wrote:
> On Fri, 21 Dec 2018 12:47:05 +0100, Boris Feld wrote:
>> # HG changeset patch
>> # User Boris Feld <boris.feld@octobus.net>
>> # Date 1545296356 -3600
>> #      Thu Dec 20 09:59:16 2018 +0100
>> # Node ID df9b79e863d6e5215175487330b1469067e20bbd
>> # Parent  d51d82a46d9545235be727b875deeffd9de324e9
>> # EXP-Topic sparse-revlog
>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r df9b79e863d6
>> revlog: fix pure python slicing code when chain contains nullrev
>>
>> We fixed the C code, but the python code still misbehaved.
>>
>> diff --git a/mercurial/revlogutils/deltas.py b/mercurial/revlogutils/deltas.py
>> --- a/mercurial/revlogutils/deltas.py
>> +++ b/mercurial/revlogutils/deltas.py
>> @@ -116,6 +116,12 @@ def slicechunk(revlog, revs, targetsize=
>>      [[0], [11], [13], [15]]
>>      >>> list(slicechunk(revlog, [0, 11, 13, 15], targetsize=20))
>>      [[0], [11], [13, 15]]
>> +
>> +    Slicing involving nullrev
>> +    >>> list(slicechunk(revlog, [-1, 0, 11, 13, 15], targetsize=20))
>> +    [[-1, 0], [11], [13, 15]]
>> +    >>> list(slicechunk(revlog, [-1, 13, 15], targetsize=5))
>> +    [[-1, 13, 15]]
>>      """
>>      if targetsize is not None:
>>          targetsize = max(targetsize, revlog._srmingapsize)
>> @@ -363,6 +369,8 @@ def _slicechunktodensity(revlog, revs, t
>>      gaps = []
>>      prevend = None
>>      for i, rev in enumerate(revs):
>> +        if rev == nullrev:
>> +            continue
>>          revstart = start(rev)
>>          revlen = length(rev)
>>  
>> @@ -490,7 +498,11 @@ def segmentspan(revlog, revs):
>>      if not revs:
>>          return 0
>>      end = revlog.end(revs[-1])
>> -    return end - revlog.start(revs[0])
>> +    i = 0
>> +    while revs[i] == nullrev:
>> +        i += 1
>> +    startrev = revs[i]
>> +    return end - revlog.start(startrev)
> The C implementation appears not doing that IIUC. Doesn't it a bug of
> the _testrevlog? I think it will misbehave if nullrev is passed in.
Ha! good catch, I'll fix _testrevlog instead.
>
> Can you turn these tests into a unittest to cover both pure and C
> implementations?
Turning this specific test is a bit more complicated than we would like
as we are taking advantage of _testrevlog flexibility to produce a
simple but clear test.

However, we've already added some unittest using real revlog in patches
you have already taken. Is that enough for you?
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - Dec. 28, 2018, 5:23 a.m.
On Fri, 28 Dec 2018 00:24:48 +0100, Boris FELD wrote:
> > Can you turn these tests into a unittest to cover both pure and C
> > implementations?
> Turning this specific test is a bit more complicated than we would like
> as we are taking advantage of _testrevlog flexibility to produce a
> simple but clear test.
> 
> However, we've already added some unittest using real revlog in patches
> you have already taken. Is that enough for you?

The unittest you wrote is nice, thanks, but it's even nicer if we had
a single test set that covers both pure and cext. It'll work as a source
of truth. At this point, we have to carefully compare both implementations
to be sure they are identical.
Boris Feld - Dec. 28, 2018, 6:07 p.m.
On 28/12/2018 06:23, Yuya Nishihara wrote:
> On Fri, 28 Dec 2018 00:24:48 +0100, Boris FELD wrote:
>>> Can you turn these tests into a unittest to cover both pure and C
>>> implementations?
>> Turning this specific test is a bit more complicated than we would like
>> as we are taking advantage of _testrevlog flexibility to produce a
>> simple but clear test.
>>
>> However, we've already added some unittest using real revlog in patches
>> you have already taken. Is that enough for you?
> The unittest you wrote is nice, thanks, but it's even nicer if we had
> a single test set that covers both pure and cext. It'll work as a source
> of truth. At this point, we have to carefully compare both implementations
> to be sure they are identical.
I'm not sure to understand. The unittest will use the pure code when run
with --pure so both the C and pure code is tested. Am I missing something?
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - Dec. 30, 2018, 8:06 a.m.
On Fri, 28 Dec 2018 19:07:04 +0100, Boris FELD wrote:
> On 28/12/2018 06:23, Yuya Nishihara wrote:
> > On Fri, 28 Dec 2018 00:24:48 +0100, Boris FELD wrote:
> >>> Can you turn these tests into a unittest to cover both pure and C
> >>> implementations?
> >> Turning this specific test is a bit more complicated than we would like
> >> as we are taking advantage of _testrevlog flexibility to produce a
> >> simple but clear test.
> >>
> >> However, we've already added some unittest using real revlog in patches
> >> you have already taken. Is that enough for you?
> > The unittest you wrote is nice, thanks, but it's even nicer if we had
> > a single test set that covers both pure and cext. It'll work as a source
> > of truth. At this point, we have to carefully compare both implementations
> > to be sure they are identical.
> I'm not sure to understand. The unittest will use the pure code when run
> with --pure so both the C and pure code is tested. Am I missing something?

If the unittest covers all scenarios in the doctest, perhaps the doctest can
be removed. I just guessed it wouldn't because you've sent this patch.
Boris FELD - Jan. 2, 2019, 8:58 a.m.
On 30/12/2018 09:06, Yuya Nishihara wrote:
> On Fri, 28 Dec 2018 19:07:04 +0100, Boris FELD wrote:
>> On 28/12/2018 06:23, Yuya Nishihara wrote:
>>> On Fri, 28 Dec 2018 00:24:48 +0100, Boris FELD wrote:
>>>>> Can you turn these tests into a unittest to cover both pure and C
>>>>> implementations?
>>>> Turning this specific test is a bit more complicated than we would like
>>>> as we are taking advantage of _testrevlog flexibility to produce a
>>>> simple but clear test.
>>>>
>>>> However, we've already added some unittest using real revlog in patches
>>>> you have already taken. Is that enough for you?
>>> The unittest you wrote is nice, thanks, but it's even nicer if we had
>>> a single test set that covers both pure and cext. It'll work as a source
>>> of truth. At this point, we have to carefully compare both implementations
>>> to be sure they are identical.
>> I'm not sure to understand. The unittest will use the pure code when run
>> with --pure so both the C and pure code is tested. Am I missing something?
> If the unittest covers all scenarios in the doctest, perhaps the doctest can
> be removed. I just guessed it wouldn't because you've sent this patch.
As you pointed out, the error was actually in _testrevlog not in the
implementation.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/revlogutils/deltas.py b/mercurial/revlogutils/deltas.py
--- a/mercurial/revlogutils/deltas.py
+++ b/mercurial/revlogutils/deltas.py
@@ -116,6 +116,12 @@  def slicechunk(revlog, revs, targetsize=
     [[0], [11], [13], [15]]
     >>> list(slicechunk(revlog, [0, 11, 13, 15], targetsize=20))
     [[0], [11], [13, 15]]
+
+    Slicing involving nullrev
+    >>> list(slicechunk(revlog, [-1, 0, 11, 13, 15], targetsize=20))
+    [[-1, 0], [11], [13, 15]]
+    >>> list(slicechunk(revlog, [-1, 13, 15], targetsize=5))
+    [[-1, 13, 15]]
     """
     if targetsize is not None:
         targetsize = max(targetsize, revlog._srmingapsize)
@@ -363,6 +369,8 @@  def _slicechunktodensity(revlog, revs, t
     gaps = []
     prevend = None
     for i, rev in enumerate(revs):
+        if rev == nullrev:
+            continue
         revstart = start(rev)
         revlen = length(rev)
 
@@ -490,7 +498,11 @@  def segmentspan(revlog, revs):
     if not revs:
         return 0
     end = revlog.end(revs[-1])
-    return end - revlog.start(revs[0])
+    i = 0
+    while revs[i] == nullrev:
+        i += 1
+    startrev = revs[i]
+    return end - revlog.start(startrev)
 
 def _textfromdelta(fh, revlog, baserev, delta, p1, p2, flags, expectednode):
     """build full text from a (base, delta) pair and other metadata"""