Patchwork [06,of,10,V2] context: remove duplicate manifest creation during _buildstatus

login
register
mail settings
Submitter Durham Goode
Date March 8, 2017, 3:22 a.m.
Message ID <d2850df6c891a20585a0.1488943358@dev111.prn1.facebook.com>
Download mbox | patch
Permalink /patch/18983/
State Accepted
Delegated to: Martin von Zweigbergk
Headers show

Comments

Durham Goode - March 8, 2017, 3:22 a.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1488937790 28800
#      Tue Mar 07 17:49:50 2017 -0800
# Node ID d2850df6c891a20585a0d5eac370c1c2f4463cad
# Parent  36bcc5d848c6bdf33d604999631a0708d1b7f067
context: remove duplicate manifest creation during _buildstatus

Previously we called self.manifest() in some cases to preload the first
manifest. It turns out that self.manifest() may do extra logic that
_manifestmatches() does not, so it may be causing us extra work. The fix is to
just only do the work once.
via Mercurial-devel - March 8, 2017, 6:21 p.m.
On Tue, Mar 7, 2017 at 7:22 PM, Durham Goode <durham@fb.com> wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1488937790 28800
> #      Tue Mar 07 17:49:50 2017 -0800
> # Node ID d2850df6c891a20585a0d5eac370c1c2f4463cad
> # Parent  36bcc5d848c6bdf33d604999631a0708d1b7f067
> context: remove duplicate manifest creation during _buildstatus
>
> Previously we called self.manifest() in some cases to preload the first
> manifest. It turns out that self.manifest() may do extra logic that
> _manifestmatches() does not, so it may be causing us extra work. The fix is to
> just only do the work once.
>
> diff --git a/mercurial/context.py b/mercurial/context.py
> --- a/mercurial/context.py
> +++ b/mercurial/context.py
> @@ -117,10 +117,12 @@ class basectx(object):
>          # 1000 and cache it so that when you read 1001, we just need to apply a
>          # delta to what's in the cache. So that's one full reconstruction + one
>          # delta application.
> +        mf2 = None
>          if self.rev() is not None and self.rev() < other.rev():
> -            self.manifest()
> +            mf2 = self._manifestmatches(match, s)
>          mf1 = other._manifestmatches(match, s)
> -        mf2 = self._manifestmatches(match, s)
> +        if mf2 is None:
> +            mf2 = self._manifestmatches(match, s)

There seems to be three cases here that are relevant to this patch:

1. self.rev() is None (i.e. this is a workingctx)
2. self.rev() < other.rev()
3. self.rev >= other.rev()

Which case is it that may do extra logic before? And in which ctx
class? I've spent some trying to find it without success.

>
>          modified, added = [], []
>          removed = []
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Durham Goode - March 8, 2017, 6:25 p.m.
On 3/8/17 10:21 AM, Martin von Zweigbergk wrote:
> On Tue, Mar 7, 2017 at 7:22 PM, Durham Goode <durham@fb.com> wrote:
>> # HG changeset patch
>> # User Durham Goode <durham@fb.com>
>> # Date 1488937790 28800
>> #      Tue Mar 07 17:49:50 2017 -0800
>> # Node ID d2850df6c891a20585a0d5eac370c1c2f4463cad
>> # Parent  36bcc5d848c6bdf33d604999631a0708d1b7f067
>> context: remove duplicate manifest creation during _buildstatus
>>
>> Previously we called self.manifest() in some cases to preload the first
>> manifest. It turns out that self.manifest() may do extra logic that
>> _manifestmatches() does not, so it may be causing us extra work. The fix is to
>> just only do the work once.
>>
>> diff --git a/mercurial/context.py b/mercurial/context.py
>> --- a/mercurial/context.py
>> +++ b/mercurial/context.py
>> @@ -117,10 +117,12 @@ class basectx(object):
>>          # 1000 and cache it so that when you read 1001, we just need to apply a
>>          # delta to what's in the cache. So that's one full reconstruction + one
>>          # delta application.
>> +        mf2 = None
>>          if self.rev() is not None and self.rev() < other.rev():
>> -            self.manifest()
>> +            mf2 = self._manifestmatches(match, s)
>>          mf1 = other._manifestmatches(match, s)
>> -        mf2 = self._manifestmatches(match, s)
>> +        if mf2 is None:
>> +            mf2 = self._manifestmatches(match, s)
>
> There seems to be three cases here that are relevant to this patch:
>
> 1. self.rev() is None (i.e. this is a workingctx)
> 2. self.rev() < other.rev()
> 3. self.rev >= other.rev()
>
> Which case is it that may do extra logic before? And in which ctx
> class? I've spent some trying to find it without success.

I guess the 'self.rev() is not None' would prevent it from hitting cases 
I was thinking of (workingctx and memctx, where self.manifest() is 
expensive and maybe not reused for self._manifestmatches).

I still think this change is worth it, since I don't think this code 
should be making assumptions about the internal reusage of manifests 
between self.manifest() and self._manifestmatches()
via Mercurial-devel - March 8, 2017, 6:28 p.m.
On Wed, Mar 8, 2017 at 10:25 AM, Durham Goode <durham@fb.com> wrote:
>
>
> On 3/8/17 10:21 AM, Martin von Zweigbergk wrote:
>>
>> On Tue, Mar 7, 2017 at 7:22 PM, Durham Goode <durham@fb.com> wrote:
>>>
>>> # HG changeset patch
>>> # User Durham Goode <durham@fb.com>
>>> # Date 1488937790 28800
>>> #      Tue Mar 07 17:49:50 2017 -0800
>>> # Node ID d2850df6c891a20585a0d5eac370c1c2f4463cad
>>> # Parent  36bcc5d848c6bdf33d604999631a0708d1b7f067
>>> context: remove duplicate manifest creation during _buildstatus
>>>
>>> Previously we called self.manifest() in some cases to preload the first
>>> manifest. It turns out that self.manifest() may do extra logic that
>>> _manifestmatches() does not, so it may be causing us extra work. The fix
>>> is to
>>> just only do the work once.
>>>
>>> diff --git a/mercurial/context.py b/mercurial/context.py
>>> --- a/mercurial/context.py
>>> +++ b/mercurial/context.py
>>> @@ -117,10 +117,12 @@ class basectx(object):
>>>          # 1000 and cache it so that when you read 1001, we just need to
>>> apply a
>>>          # delta to what's in the cache. So that's one full
>>> reconstruction + one
>>>          # delta application.
>>> +        mf2 = None
>>>          if self.rev() is not None and self.rev() < other.rev():
>>> -            self.manifest()
>>> +            mf2 = self._manifestmatches(match, s)
>>>          mf1 = other._manifestmatches(match, s)
>>> -        mf2 = self._manifestmatches(match, s)
>>> +        if mf2 is None:
>>> +            mf2 = self._manifestmatches(match, s)
>>
>>
>> There seems to be three cases here that are relevant to this patch:
>>
>> 1. self.rev() is None (i.e. this is a workingctx)
>> 2. self.rev() < other.rev()
>> 3. self.rev >= other.rev()
>>
>> Which case is it that may do extra logic before? And in which ctx
>> class? I've spent some trying to find it without success.
>
>
> I guess the 'self.rev() is not None' would prevent it from hitting cases I
> was thinking of (workingctx and memctx, where self.manifest() is expensive
> and maybe not reused for self._manifestmatches).

Okay.

>
> I still think this change is worth it, since I don't think this code should
> be making assumptions about the internal reusage of manifests between
> self.manifest() and self._manifestmatches()

Yeah, I agree with that, but could you provide a new commit message
that's correct, please? I can replace it in flight if you like.
Durham Goode - March 8, 2017, 6:31 p.m.
On 3/8/17 10:28 AM, Martin von Zweigbergk wrote:
> On Wed, Mar 8, 2017 at 10:25 AM, Durham Goode <durham@fb.com> wrote:
>>
>>
>> On 3/8/17 10:21 AM, Martin von Zweigbergk wrote:
>>>
>>> On Tue, Mar 7, 2017 at 7:22 PM, Durham Goode <durham@fb.com> wrote:
>>>>
>>>> # HG changeset patch
>>>> # User Durham Goode <durham@fb.com>
>>>> # Date 1488937790 28800
>>>> #      Tue Mar 07 17:49:50 2017 -0800
>>>> # Node ID d2850df6c891a20585a0d5eac370c1c2f4463cad
>>>> # Parent  36bcc5d848c6bdf33d604999631a0708d1b7f067
>>>> context: remove duplicate manifest creation during _buildstatus
>>>>
>>>> Previously we called self.manifest() in some cases to preload the first
>>>> manifest. It turns out that self.manifest() may do extra logic that
>>>> _manifestmatches() does not, so it may be causing us extra work. The fix
>>>> is to
>>>> just only do the work once.
>>>>
>>>> diff --git a/mercurial/context.py b/mercurial/context.py
>>>> --- a/mercurial/context.py
>>>> +++ b/mercurial/context.py
>>>> @@ -117,10 +117,12 @@ class basectx(object):
>>>>          # 1000 and cache it so that when you read 1001, we just need to
>>>> apply a
>>>>          # delta to what's in the cache. So that's one full
>>>> reconstruction + one
>>>>          # delta application.
>>>> +        mf2 = None
>>>>          if self.rev() is not None and self.rev() < other.rev():
>>>> -            self.manifest()
>>>> +            mf2 = self._manifestmatches(match, s)
>>>>          mf1 = other._manifestmatches(match, s)
>>>> -        mf2 = self._manifestmatches(match, s)
>>>> +        if mf2 is None:
>>>> +            mf2 = self._manifestmatches(match, s)
>>>
>>>
>>> There seems to be three cases here that are relevant to this patch:
>>>
>>> 1. self.rev() is None (i.e. this is a workingctx)
>>> 2. self.rev() < other.rev()
>>> 3. self.rev >= other.rev()
>>>
>>> Which case is it that may do extra logic before? And in which ctx
>>> class? I've spent some trying to find it without success.
>>
>>
>> I guess the 'self.rev() is not None' would prevent it from hitting cases I
>> was thinking of (workingctx and memctx, where self.manifest() is expensive
>> and maybe not reused for self._manifestmatches).
>
> Okay.
>
>>
>> I still think this change is worth it, since I don't think this code should
>> be making assumptions about the internal reusage of manifests between
>> self.manifest() and self._manifestmatches()
>
> Yeah, I agree with that, but could you provide a new commit message
> that's correct, please? I can replace it in flight if you like.

context: remove assumptions about manifest creation during _buildstatus

Previously we called self.manifest() in some cases to preload the first
manifest. This relied on the assumption that the later 
_manifestmatches() call did not duplicate any work the original 
self.manifest() call did. Let's remove that assumption, since it bit me 
during my refactors of this area and is easy to remove.

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -117,10 +117,12 @@  class basectx(object):
         # 1000 and cache it so that when you read 1001, we just need to apply a
         # delta to what's in the cache. So that's one full reconstruction + one
         # delta application.
+        mf2 = None
         if self.rev() is not None and self.rev() < other.rev():
-            self.manifest()
+            mf2 = self._manifestmatches(match, s)
         mf1 = other._manifestmatches(match, s)
-        mf2 = self._manifestmatches(match, s)
+        if mf2 is None:
+            mf2 = self._manifestmatches(match, s)
 
         modified, added = [], []
         removed = []