Patchwork [3,of,7] context: fix introrev to avoid computation as initially intended

login
register
mail settings
Submitter Boris Feld
Date Sept. 7, 2018, 3:04 p.m.
Message ID <a4c3eb6c1a36cbbf64fa.1536332647@localhost.localdomain>
Download mbox | patch
Permalink /patch/34394/
State Accepted
Headers show

Comments

Boris Feld - Sept. 7, 2018, 3:04 p.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1536254177 14400
#      Thu Sep 06 13:16:17 2018 -0400
# Node ID a4c3eb6c1a36cbbf64fa8930b173154b2e77ef2b
# Parent  9a18509c522deeb62a7b244dcf4c7b79a8dc1132
# EXP-Topic copy-perf
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r a4c3eb6c1a36
context: fix introrev to avoid computation as initially intended

The `filerev.introrev()` method has various logic be as efficient as possible.
In particular, it tries to restrict the range covered by the
`ctx._adjustlinkrev(...)` call. However, it does so using the value returned by
`ctx.rev()`. In some case (eg: copy tracing), that `ctx.rev()` call would do an
`_adjustlinkrev(...)` call on its own, defeating the optimization purpose and
doing the computation twice.

We are about to improve graph traversal associated with copy tracing using code
based on `ctx.introrev()`, so we need this fixed before proceeding further.
via Mercurial-devel - Sept. 10, 2018, 4:21 p.m.
On Fri, Sep 7, 2018 at 8:09 AM Boris Feld <boris.feld@octobus.net> wrote:

> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1536254177 14400
> #      Thu Sep 06 13:16:17 2018 -0400
> # Node ID a4c3eb6c1a36cbbf64fa8930b173154b2e77ef2b
> # Parent  9a18509c522deeb62a7b244dcf4c7b79a8dc1132
> # EXP-Topic copy-perf
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r
> a4c3eb6c1a36
> context: fix introrev to avoid computation as initially intended
>
> diff --git a/mercurial/context.py b/mercurial/context.py
> --- a/mercurial/context.py
> +++ b/mercurial/context.py
> @@ -829,6 +829,23 @@ class basefilectx(object):
>              # result is crash somewhere else at to some point.
>          return lkr
>
> +    def _lazyrev(self):
>

We usually try to separate refactoring (like extracting a method) from
functional (or performance-related) changes. Could you do that with this
patch or does it not make sense for some reason?
Boris Feld - Oct. 1, 2018, 4:11 p.m.
On 10/09/2018 18:21, Martin von Zweigbergk via Mercurial-devel wrote:
>
>
> On Fri, Sep 7, 2018 at 8:09 AM Boris Feld <boris.feld@octobus.net 
> <mailto:boris.feld@octobus.net>> wrote:
>
>     # HG changeset patch
>     # User Boris Feld <boris.feld@octobus.net
>     <mailto:boris.feld@octobus.net>>
>     # Date 1536254177 14400
>     #      Thu Sep 06 13:16:17 2018 -0400
>     # Node ID a4c3eb6c1a36cbbf64fa8930b173154b2e77ef2b
>     # Parent  9a18509c522deeb62a7b244dcf4c7b79a8dc1132
>     # EXP-Topic copy-perf
>     # Available At https://bitbucket.org/octobus/mercurial-devel/
>     #              hg pull
>     https://bitbucket.org/octobus/mercurial-devel/ -r a4c3eb6c1a36
>     context: fix introrev to avoid computation as initially intended
>
>     diff --git a/mercurial/context.py b/mercurial/context.py
>     --- a/mercurial/context.py
>     +++ b/mercurial/context.py
>     @@ -829,6 +829,23 @@ class basefilectx(object):
>                  # result is crash somewhere else at to some point.
>              return lkr
>
>     +    def _lazyrev(self):
>
>
> We usually try to separate refactoring (like extracting a method) from 
> functional (or performance-related) changes. Could you do that with 
> this patch or does it not make sense for some reason?
In this case, the two changes are a bit too interleaved to be easily 
split in two. We can't do the special casing until we have the new 
method and the new method can't have any caller without changing the 
conditionals to include the special casing.
>
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
via Mercurial-devel - Oct. 1, 2018, 4:43 p.m.
On Mon, Oct 1, 2018 at 9:11 AM Boris FELD <boris.feld@octobus.net> wrote:

> On 10/09/2018 18:21, Martin von Zweigbergk via Mercurial-devel wrote:
>
>
>
> On Fri, Sep 7, 2018 at 8:09 AM Boris Feld <boris.feld@octobus.net> wrote:
>
>> # HG changeset patch
>> # User Boris Feld <boris.feld@octobus.net>
>> # Date 1536254177 14400
>> #      Thu Sep 06 13:16:17 2018 -0400
>> # Node ID a4c3eb6c1a36cbbf64fa8930b173154b2e77ef2b
>> # Parent  9a18509c522deeb62a7b244dcf4c7b79a8dc1132
>> # EXP-Topic copy-perf
>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r
>> a4c3eb6c1a36
>> context: fix introrev to avoid computation as initially intended
>>
>> diff --git a/mercurial/context.py b/mercurial/context.py
>> --- a/mercurial/context.py
>> +++ b/mercurial/context.py
>> @@ -829,6 +829,23 @@ class basefilectx(object):
>>              # result is crash somewhere else at to some point.
>>          return lkr
>>
>> +    def _lazyrev(self):
>>
>
> We usually try to separate refactoring (like extracting a method) from
> functional (or performance-related) changes. Could you do that with this
> patch or does it not make sense for some reason?
>
> In this case, the two changes are a bit too interleaved to be easily split
> in two. We can't do the special casing until we have the new method and the
> new method can't have any caller without changing the conditionals to
> include the special casing.
>

Maybe I missed something, but it looks like _lazyrev() returns either
"None" or "self.rev()", even at the end of the series. Did I get that
right? Will that change later? If not, it seems like you could instead
extract a method that calculates what's currently called "noctx". Even if
that's going to change, it might make it easier to understand this patch if
you split out a patch that made the structure here more similar to your
goal, something like:


@@ -837,9 +837,13 @@ class basefilectx(object):
         lkr = self.linkrev()
         attrs = vars(self)
         noctx = not (r'_changeid' in attrs or r'_changectx' in attrs)
-        if noctx or self.rev() == lkr:
+        if noctx:
             return self.linkrev()
-        return self._adjustlinkrev(self.rev(), inclusive=True)
+        else:
+            if self.rev() == lkr:
+                return self.linkrev()
+            else:
+                return self._adjustlinkrev(self.rev(), inclusive=True)


>
> _______________________________________________
> Mercurial-devel mailing listMercurial-devel@mercurial-scm.orghttps://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
>
Boris Feld - Oct. 2, 2018, 8:58 a.m.
On 01/10/2018 18:43, Martin von Zweigbergk via Mercurial-devel wrote:
>
>
> On Mon, Oct 1, 2018 at 9:11 AM Boris FELD <boris.feld@octobus.net 
> <mailto:boris.feld@octobus.net>> wrote:
>
>     On 10/09/2018 18:21, Martin von Zweigbergk via Mercurial-devel wrote:
>>
>>
>>     On Fri, Sep 7, 2018 at 8:09 AM Boris Feld <boris.feld@octobus.net
>>     <mailto:boris.feld@octobus.net>> wrote:
>>
>>         # HG changeset patch
>>         # User Boris Feld <boris.feld@octobus.net
>>         <mailto:boris.feld@octobus.net>>
>>         # Date 1536254177 14400
>>         #      Thu Sep 06 13:16:17 2018 -0400
>>         # Node ID a4c3eb6c1a36cbbf64fa8930b173154b2e77ef2b
>>         # Parent 9a18509c522deeb62a7b244dcf4c7b79a8dc1132
>>         # EXP-Topic copy-perf
>>         # Available At https://bitbucket.org/octobus/mercurial-devel/
>>         #              hg pull
>>         https://bitbucket.org/octobus/mercurial-devel/ -r a4c3eb6c1a36
>>         context: fix introrev to avoid computation as initially intended
>>
>>         diff --git a/mercurial/context.py b/mercurial/context.py
>>         --- a/mercurial/context.py
>>         +++ b/mercurial/context.py
>>         @@ -829,6 +829,23 @@ class basefilectx(object):
>>                      # result is crash somewhere else at to some point.
>>                  return lkr
>>
>>         +    def _lazyrev(self):
>>
>>
>>     We usually try to separate refactoring (like extracting a method)
>>     from functional (or performance-related) changes. Could you do
>>     that with this patch or does it not make sense for some reason?
>     In this case, the two changes are a bit too interleaved to be
>     easily split in two. We can't do the special casing until we have
>     the new method and the new method can't have any caller without
>     changing the conditionals to include the special casing.
>
>
> Maybe I missed something, but it looks like _lazyrev() returns either 
> "None" or "self.rev()", even at the end of the series. Did I get that 
> right? Will that change later? If not, it seems like you could instead 
> extract a method that calculates what's currently called "noctx". Even 
> if that's going to change, it might make it easier to understand this 
> patch if you split out a patch that made the structure here more 
> similar to your goal, something like:
>
>
> @@ -837,9 +837,13 @@ class basefilectx(object):
>          lkr = self.linkrev()
>          attrs = vars(self)
>          noctx = not (r'_changeid' in attrs or r'_changectx' in attrs)
> -        if noctx or self.rev() == lkr:
> +        if noctx:
>              return self.linkrev()
> -        return self._adjustlinkrev(self.rev(), inclusive=True)
> +        else:
> +            if self.rev() == lkr:
> +                return self.linkrev()
> +            else:
> +                return self._adjustlinkrev(self.rev(), inclusive=True)
Yes, you are right, we can split the changeset in two.

I don't feel that it would help the readability of the series but I'm 
not the reviewer. Do you think it would help you review the patch?
>
>>
>>
>>     _______________________________________________
>>     Mercurial-devel mailing list
>>     Mercurial-devel@mercurial-scm.org  <mailto:Mercurial-devel@mercurial-scm.org>
>>     https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
via Mercurial-devel - Oct. 2, 2018, 12:46 p.m.
On Tue, Oct 2, 2018, 01:58 Boris FELD <boris.feld@octobus.net> wrote:

>
> On 01/10/2018 18:43, Martin von Zweigbergk via Mercurial-devel wrote:
>
>
>
> On Mon, Oct 1, 2018 at 9:11 AM Boris FELD <boris.feld@octobus.net> wrote:
>
>> On 10/09/2018 18:21, Martin von Zweigbergk via Mercurial-devel wrote:
>>
>>
>>
>> On Fri, Sep 7, 2018 at 8:09 AM Boris Feld <boris.feld@octobus.net> wrote:
>>
>>> # HG changeset patch
>>> # User Boris Feld <boris.feld@octobus.net>
>>> # Date 1536254177 14400
>>> #      Thu Sep 06 13:16:17 2018 -0400
>>> # Node ID a4c3eb6c1a36cbbf64fa8930b173154b2e77ef2b
>>> # Parent  9a18509c522deeb62a7b244dcf4c7b79a8dc1132
>>> # EXP-Topic copy-perf
>>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/
>>> -r a4c3eb6c1a36
>>> context: fix introrev to avoid computation as initially intended
>>>
>>> diff --git a/mercurial/context.py b/mercurial/context.py
>>> --- a/mercurial/context.py
>>> +++ b/mercurial/context.py
>>> @@ -829,6 +829,23 @@ class basefilectx(object):
>>>              # result is crash somewhere else at to some point.
>>>          return lkr
>>>
>>> +    def _lazyrev(self):
>>>
>>
>> We usually try to separate refactoring (like extracting a method) from
>> functional (or performance-related) changes. Could you do that with this
>> patch or does it not make sense for some reason?
>>
>> In this case, the two changes are a bit too interleaved to be easily
>> split in two. We can't do the special casing until we have the new method
>> and the new method can't have any caller without changing the conditionals
>> to include the special casing.
>>
>
> Maybe I missed something, but it looks like _lazyrev() returns either
> "None" or "self.rev()", even at the end of the series. Did I get that
> right? Will that change later? If not, it seems like you could instead
> extract a method that calculates what's currently called "noctx". Even if
> that's going to change, it might make it easier to understand this patch if
> you split out a patch that made the structure here more similar to your
> goal, something like:
>
>
> @@ -837,9 +837,13 @@ class basefilectx(object):
>          lkr = self.linkrev()
>          attrs = vars(self)
>          noctx = not (r'_changeid' in attrs or r'_changectx' in attrs)
> -        if noctx or self.rev() == lkr:
> +        if noctx:
>              return self.linkrev()
> -        return self._adjustlinkrev(self.rev(), inclusive=True)
> +        else:
> +            if self.rev() == lkr:
> +                return self.linkrev()
> +            else:
> +                return self._adjustlinkrev(self.rev(), inclusive=True)
>
> Yes, you are right, we can split the changeset in two.
>
> I don't feel that it would help the readability of the series but I'm not
> the reviewer. Do you think it would help you review the patch?
>
>
Yes, I think it would. I did it myself in order to understand this patch. I
think it would also help to make that return just a boolean value, assuming
that will still work with later patches. Thanks.



>
>>
>> _______________________________________________
>> Mercurial-devel mailing listMercurial-devel@mercurial-scm.orghttps://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>>
>>
> _______________________________________________
> Mercurial-devel mailing listMercurial-devel@mercurial-scm.orghttps://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
>
via Mercurial-devel - Oct. 3, 2018, 6:41 p.m.
Btw, just because I found this patch hard to follow doesn't necessarily
mean that others do. I won't mind if someone else understands what it does
and queues it (and a third person adds a second accept stamp).

On Tue, Oct 2, 2018 at 5:46 AM Martin von Zweigbergk <martinvonz@google.com>
wrote:

>
>
> On Tue, Oct 2, 2018, 01:58 Boris FELD <boris.feld@octobus.net> wrote:
>
>>
>> On 01/10/2018 18:43, Martin von Zweigbergk via Mercurial-devel wrote:
>>
>>
>>
>> On Mon, Oct 1, 2018 at 9:11 AM Boris FELD <boris.feld@octobus.net> wrote:
>>
>>> On 10/09/2018 18:21, Martin von Zweigbergk via Mercurial-devel wrote:
>>>
>>>
>>>
>>> On Fri, Sep 7, 2018 at 8:09 AM Boris Feld <boris.feld@octobus.net>
>>> wrote:
>>>
>>>> # HG changeset patch
>>>> # User Boris Feld <boris.feld@octobus.net>
>>>> # Date 1536254177 14400
>>>> #      Thu Sep 06 13:16:17 2018 -0400
>>>> # Node ID a4c3eb6c1a36cbbf64fa8930b173154b2e77ef2b
>>>> # Parent  9a18509c522deeb62a7b244dcf4c7b79a8dc1132
>>>> # EXP-Topic copy-perf
>>>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>>>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/
>>>> -r a4c3eb6c1a36
>>>> context: fix introrev to avoid computation as initially intended
>>>>
>>>> diff --git a/mercurial/context.py b/mercurial/context.py
>>>> --- a/mercurial/context.py
>>>> +++ b/mercurial/context.py
>>>> @@ -829,6 +829,23 @@ class basefilectx(object):
>>>>              # result is crash somewhere else at to some point.
>>>>          return lkr
>>>>
>>>> +    def _lazyrev(self):
>>>>
>>>
>>> We usually try to separate refactoring (like extracting a method) from
>>> functional (or performance-related) changes. Could you do that with this
>>> patch or does it not make sense for some reason?
>>>
>>> In this case, the two changes are a bit too interleaved to be easily
>>> split in two. We can't do the special casing until we have the new method
>>> and the new method can't have any caller without changing the conditionals
>>> to include the special casing.
>>>
>>
>> Maybe I missed something, but it looks like _lazyrev() returns either
>> "None" or "self.rev()", even at the end of the series. Did I get that
>> right? Will that change later? If not, it seems like you could instead
>> extract a method that calculates what's currently called "noctx". Even if
>> that's going to change, it might make it easier to understand this patch if
>> you split out a patch that made the structure here more similar to your
>> goal, something like:
>>
>>
>> @@ -837,9 +837,13 @@ class basefilectx(object):
>>          lkr = self.linkrev()
>>          attrs = vars(self)
>>          noctx = not (r'_changeid' in attrs or r'_changectx' in attrs)
>> -        if noctx or self.rev() == lkr:
>> +        if noctx:
>>              return self.linkrev()
>> -        return self._adjustlinkrev(self.rev(), inclusive=True)
>> +        else:
>> +            if self.rev() == lkr:
>> +                return self.linkrev()
>> +            else:
>> +                return self._adjustlinkrev(self.rev(), inclusive=True)
>>
>> Yes, you are right, we can split the changeset in two.
>>
>> I don't feel that it would help the readability of the series but I'm not
>> the reviewer. Do you think it would help you review the patch?
>>
>>
> Yes, I think it would. I did it myself in order to understand this patch.
> I think it would also help to make that return just a boolean value,
> assuming that will still work with later patches. Thanks.
>
>
>
>>
>>>
>>> _______________________________________________
>>> Mercurial-devel mailing listMercurial-devel@mercurial-scm.orghttps://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>>>
>>>
>> _______________________________________________
>> Mercurial-devel mailing listMercurial-devel@mercurial-scm.orghttps://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>>
>>

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -829,6 +829,23 @@  class basefilectx(object):
             # result is crash somewhere else at to some point.
         return lkr
 
+    def _lazyrev(self):
+        """return self.rev() if it is available without computation,
+
+        If finding the rev would trigger a possibly expensive computation, we
+        return None."""
+        attrs = vars(self)
+        if r'_changeid' in attrs:
+            # We have a cached value already
+            return self.rev()
+        elif r'_changectx' in attrs:
+            # We know which changelog entry we are coming from
+            return self.rev()
+        elif r'_descendantrev' not in attrs:
+            # we have no context, so linkrev will be used
+            return self.rev()
+        return None
+
     def introrev(self):
         """return the rev of the changeset which introduced this file revision
 
@@ -839,11 +856,14 @@  class basefilectx(object):
         changesets.
         """
         lkr = self.linkrev()
-        attrs = vars(self)
-        noctx = not (r'_changeid' in attrs or r'_changectx' in attrs)
-        if noctx or self.rev() == lkr:
-            return self.linkrev()
-        return self._adjustlinkrev(self.rev(), inclusive=True)
+        lazyrev = self._lazyrev()
+        if lazyrev is not None:
+            if lazyrev == lkr:
+                return lazyrev
+            else:
+                return self._adjustlinkrev(lazyrev, inclusive=True)
+        else:
+            return self.rev()
 
     def introfilectx(self):
         """Return filectx having identical contents, but pointing to the