Patchwork D7483: changectx: add a "maybe filtered" filtered attribute

login
register
mail settings
Submitter phabricator
Date Nov. 22, 2019, 9:21 a.m.
Message ID <differential-rev-PHID-DREV-l6hx2b4okfq4bpvlkz5y-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/43418/
State Superseded
Headers show

Comments

phabricator - Nov. 22, 2019, 9:21 a.m.
marmoute created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  There are changeset that we know not to be filtered (eg: `null`). In this case,
  we could access some information without triggering changelog filtering. We add
  an attribute to changectx that track this property.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D7483

AFFECTED FILES
  mercurial/context.py

CHANGE DETAILS




To: marmoute, #hg-reviewers
Cc: mercurial-devel
phabricator - Nov. 23, 2019, 4:04 a.m.
indygreg added inline comments.

INLINE COMMENTS

> context.py:503
> +            repo = self._repo.unfiltered()
> +        return repo.changelog.changelogrevision(self.rev())
>  

Before I accept more of this series, I'd like others to think about the potential alternative solution of passing in or storing a reference to the changelog on `basectx`/`changectx`. The reason is that a lot of methods are calling `repo.changelog` and this can be expensive. I suspect we'd get a handful of random performance wins if we cached the changelog instance on each `basectx`. We also wouldn't need to litter the code with a bunch of conditional `repo.unfiltered()` calls since we'd already have an appropriate changelog instance stored in the instance.

Thoughts?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7483/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7483

To: marmoute, #hg-reviewers
Cc: indygreg, mercurial-devel
phabricator - Nov. 23, 2019, 4:21 p.m.
martinvonz added inline comments.

INLINE COMMENTS

> context.py:484
>          self._node = node
> +        self._maybe_filtered = maybe_filtered
>  

Add a comment explaining what this attribute means, such as "indicates that this changeset is visible regardless of filtering". Actually, maybe it's better to name the property something like `filter_agnostic`?

> indygreg wrote in context.py:503
> Before I accept more of this series, I'd like others to think about the potential alternative solution of passing in or storing a reference to the changelog on `basectx`/`changectx`. The reason is that a lot of methods are calling `repo.changelog` and this can be expensive. I suspect we'd get a handful of random performance wins if we cached the changelog instance on each `basectx`. We also wouldn't need to litter the code with a bunch of conditional `repo.unfiltered()` calls since we'd already have an appropriate changelog instance stored in the instance.
> 
> Thoughts?

That would make `ctx.children()` incorrect, right?

I still wonder if the "visible" filter should be removed and (so there would be no `--hidden`, and no annoying message telling you use that flag). We would need to figure out what e.g. `hg log -r 'head()'` and `hg log -r x::` should do (to not include extinct heads) and how to make it easy for the user to get either behavior. Perhaps we would let `head()` be the visible heads and add a new `allheads()` to get all? That's obviously a much larger discussion and maybe a topic for the next sprint instead.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7483/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7483

To: marmoute, #hg-reviewers
Cc: martinvonz, indygreg, mercurial-devel
phabricator - Nov. 23, 2019, 4:57 p.m.
marmoute added a comment.


  (grmlml, realising that I forgot to do the dual "submit" for my comment. It is 2019, why is Phabricator still not fixing this…)

INLINE COMMENTS

> indygreg wrote in context.py:503
> Before I accept more of this series, I'd like others to think about the potential alternative solution of passing in or storing a reference to the changelog on `basectx`/`changectx`. The reason is that a lot of methods are calling `repo.changelog` and this can be expensive. I suspect we'd get a handful of random performance wins if we cached the changelog instance on each `basectx`. We also wouldn't need to litter the code with a bunch of conditional `repo.unfiltered()` calls since we'd already have an appropriate changelog instance stored in the instance.
> 
> Thoughts?

We woul still need the conditional instance since the maybe_filtered logic only apply to some of the changectx call. (eg: children need to go through filtering).

An overall better/more reobust solution to make the filtering lazier at the changelog level and to most the fastpath logic there. It would be more contained and more robust.

However, I decided I digged the rabbit hole enough for now and this other approach can be looked at later.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7483/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7483

To: marmoute, #hg-reviewers
Cc: martinvonz, indygreg, mercurial-devel
phabricator - Nov. 23, 2019, 5:01 p.m.
marmoute added inline comments.

INLINE COMMENTS

> martinvonz wrote in context.py:484
> Add a comment explaining what this attribute means, such as "indicates that this changeset is visible regardless of filtering". Actually, maybe it's better to name the property something like `filter_agnostic`?

I'll add a comment.

filter_agnostic is misleading. the revision might be filtered for another filter.  Is is also borderline double negative.

> martinvonz wrote in context.py:503
> That would make `ctx.children()` incorrect, right?
> 
> I still wonder if the "visible" filter should be removed and (so there would be no `--hidden`, and no annoying message telling you use that flag). We would need to figure out what e.g. `hg log -r 'head()'` and `hg log -r x::` should do (to not include extinct heads) and how to make it easy for the user to get either behavior. Perhaps we would let `head()` be the visible heads and add a new `allheads()` to get all? That's obviously a much larger discussion and maybe a topic for the next sprint instead.

The filtering is usefull at many level and removing it means dedicated code in many places (eg: discovery). That is error prone. I feel like effort would be better put to simply making it go fast (could be O(1) with proper indexing).
Which anoying message are you talking about ?

I agree that this is a larger discussion and I would rather not see the discussion around this series derails. +1 to have it independently later.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7483/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7483

To: marmoute, #hg-reviewers
Cc: martinvonz, indygreg, mercurial-devel
phabricator - Nov. 24, 2019, 12:44 a.m.
This revision is now accepted and ready to land.
indygreg added a comment.
indygreg accepted this revision.


  I'm satisfied with the discussion and will proceed with reviewing this series.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7483/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7483

To: marmoute, #hg-reviewers, indygreg
Cc: martinvonz, indygreg, mercurial-devel

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -477,10 +477,11 @@ 
     changeset convenient. It represents a read-only context already present in
     the repo."""
 
-    def __init__(self, repo, rev, node):
+    def __init__(self, repo, rev, node, maybe_filtered=True):
         super(changectx, self).__init__(repo)
         self._rev = rev
         self._node = node
+        self._maybe_filtered = maybe_filtered
 
     def __hash__(self):
         try:
@@ -495,7 +496,11 @@ 
 
     @propertycache
     def _changeset(self):
-        return self._repo.changelog.changelogrevision(self.rev())
+        if self._maybe_filtered:
+            repo = self._repo
+        else:
+            repo = self._repo.unfiltered()
+        return repo.changelog.changelogrevision(self.rev())
 
     @propertycache
     def _manifest(self):