Patchwork [6,of,6,V2] obscache: use the obscache to compute the obsolete set

login
register
mail settings
Submitter Pierre-Yves David
Date May 20, 2017, 3:30 p.m.
Message ID <eb7674b12d5a15fc53f1.1495294220@nodosa.octopoid.net>
Download mbox | patch
Permalink /patch/20776/
State Deferred
Headers show

Comments

Pierre-Yves David - May 20, 2017, 3:30 p.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@octobus.net>
# Date 1495198021 -7200
#      Fri May 19 14:47:01 2017 +0200
# Node ID eb7674b12d5a15fc53f10b075dcac7bee91379d2
# Parent  3c2a082a590aa8b57693c24b8461c2afdb8d5556
# EXP-Topic obscache
# Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
#              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r eb7674b12d5a
obscache: use the obscache to compute the obsolete set

Now that we have a cache and that the cache is kept up to date, we can use it to
speeds up the obsolete set computation. This way, we no longer need to load the
obsstore for most operation.

On the mercurial-core repository, this provide a significant speed up:

Running "hg  id -r ."
- before: 0.630 second (0.56s user 0.06s system 99% cpu 0.630)
- after:  0.129 second (0.11s user 0.02s system 98% cpu 0.129)

And the obsstore loading operation disappear from execution profile.

(note: time spent inside the command drop from 0.4 to 0.04s)

To keep the changeset simple it the handling of case were
the cache has not been kept up to date is pretty simple. That might introduce a
small performance impact during the transition in some case. This will get
improved in later changeset.

In addition the cache still needs to parse the full obsstore when updating.
There as known way to skip parsing the full obsstore for wrote operation too.
This will also get improved later.
Jun Wu - May 20, 2017, 5:37 p.m.
While this does speed up commands like "hg id", people more frequently use
"hg log [-G]" which will call "ctx.troubles" by default and that requires a
full "obsolete()" revset in multiple places (unstable, bumped and divergent
calculation), which won't be helped by this cache.

Do you plan to build similar bitmap cache for unstable, bumped, divergent
too?

Besides, even with this series, _computeobsoleteset is still O(N). The time
complexity remains unchanged. This is not a long-term solution.

I believe the long term solution is to make testing O(1) and remove all
enumeration for all revsets.

Excerpts from Pierre-Yves David's message of 2017-05-20 17:30:20 +0200:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@octobus.net>
> # Date 1495198021 -7200
> #      Fri May 19 14:47:01 2017 +0200
> # Node ID eb7674b12d5a15fc53f10b075dcac7bee91379d2
> # Parent  3c2a082a590aa8b57693c24b8461c2afdb8d5556
> # EXP-Topic obscache
> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ 
> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/  -r eb7674b12d5a
> obscache: use the obscache to compute the obsolete set
> 
> Now that we have a cache and that the cache is kept up to date, we can use it to
> speeds up the obsolete set computation. This way, we no longer need to load the
> obsstore for most operation.
> 
> On the mercurial-core repository, this provide a significant speed up:
> 
> Running "hg  id -r ."
> - before: 0.630 second (0.56s user 0.06s system 99% cpu 0.630)
> - after:  0.129 second (0.11s user 0.02s system 98% cpu 0.129)
> 
> And the obsstore loading operation disappear from execution profile.
> 
> (note: time spent inside the command drop from 0.4 to 0.04s)
> 
> To keep the changeset simple it the handling of case were
> the cache has not been kept up to date is pretty simple. That might introduce a
> small performance impact during the transition in some case. This will get
> improved in later changeset.
> 
> In addition the cache still needs to parse the full obsstore when updating.
> There as known way to skip parsing the full obsstore for wrote operation too.
> This will also get improved later.
> 
> diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
> --- a/mercurial/obsolete.py
> +++ b/mercurial/obsolete.py
> @@ -1546,10 +1546,26 @@ def clearobscaches(repo):
>  def _computeobsoleteset(repo):
>      """the set of obsolete revisions"""
>      obs = set()
> -    getnode = repo.changelog.node
>      notpublic = repo._phasecache.getrevset(repo, (phases.draft, phases.secret))
> +    if not notpublic:
> +        # all changeset are public, none are obsolete
> +        return obs
> +
> +    # XXX There are a couple of case where the cache could not be up to date:
> +    #
> +    # 1) no transaction happened in the repository since the upgrade,
> +    # 2) both old and new client touches that repository
> +    #
> +    # recomputing the whole cache in these case is a bit slower that using the
> +    # good old version (parsing markers and checking them). We could add some
> +    # logic to fall back to the old way in these cases.
> +    obscache = repo.obsstore.obscache
> +    obscache.update(repo) # ensure it is up to date:
> +    isobs = obscache.get
> +
> +    # actually compute the obsolete set
>      for r in notpublic:
> -        if getnode(r) in repo.obsstore.successors:
> +        if isobs(r):
>              obs.add(r)
>      return obs
>
Jun Wu - May 20, 2017, 5:43 p.m.
Excerpts from Jun Wu's message of 2017-05-20 10:37:23 -0700:
> While this does speed up commands like "hg id", people more frequently use
> "hg log [-G]" which will call "ctx.troubles" by default and that requires a
> full "obsolete()" revset in multiple places (unstable, bumped and divergent
> calculation), which won't be helped by this cache.

I take the above back. This is still helpful as it improves the constant of
time complexity. But the below text is still true - the time complexity is
unchanged and a complex caching layer got added. I think we'll have similar
perf win in a cleaner way (no cache invalidation trouble) by writing C code.

> Do you plan to build similar bitmap cache for unstable, bumped, divergent
> too?
> 
> Besides, even with this series, _computeobsoleteset is still O(N). The time
> complexity remains unchanged. This is not a long-term solution.
> 
> I believe the long term solution is to make testing O(1) and remove all
> enumeration for all revsets.
> 
> Excerpts from Pierre-Yves David's message of 2017-05-20 17:30:20 +0200:
> > # HG changeset patch
> > # User Pierre-Yves David <pierre-yves.david@octobus.net>
> > # Date 1495198021 -7200
> > #      Fri May 19 14:47:01 2017 +0200
> > # Node ID eb7674b12d5a15fc53f10b075dcac7bee91379d2
> > # Parent  3c2a082a590aa8b57693c24b8461c2afdb8d5556
> > # EXP-Topic obscache
> > # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ 
> > #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/  -r eb7674b12d5a
> > obscache: use the obscache to compute the obsolete set
> > 
> > Now that we have a cache and that the cache is kept up to date, we can use it to
> > speeds up the obsolete set computation. This way, we no longer need to load the
> > obsstore for most operation.
> > 
> > On the mercurial-core repository, this provide a significant speed up:
> > 
> > Running "hg  id -r ."
> > - before: 0.630 second (0.56s user 0.06s system 99% cpu 0.630)
> > - after:  0.129 second (0.11s user 0.02s system 98% cpu 0.129)
> > 
> > And the obsstore loading operation disappear from execution profile.
> > 
> > (note: time spent inside the command drop from 0.4 to 0.04s)
> > 
> > To keep the changeset simple it the handling of case were
> > the cache has not been kept up to date is pretty simple. That might introduce a
> > small performance impact during the transition in some case. This will get
> > improved in later changeset.
> > 
> > In addition the cache still needs to parse the full obsstore when updating.
> > There as known way to skip parsing the full obsstore for wrote operation too.
> > This will also get improved later.
> > 
> > diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
> > --- a/mercurial/obsolete.py
> > +++ b/mercurial/obsolete.py
> > @@ -1546,10 +1546,26 @@ def clearobscaches(repo):
> >  def _computeobsoleteset(repo):
> >      """the set of obsolete revisions"""
> >      obs = set()
> > -    getnode = repo.changelog.node
> >      notpublic = repo._phasecache.getrevset(repo, (phases.draft, phases.secret))
> > +    if not notpublic:
> > +        # all changeset are public, none are obsolete
> > +        return obs
> > +
> > +    # XXX There are a couple of case where the cache could not be up to date:
> > +    #
> > +    # 1) no transaction happened in the repository since the upgrade,
> > +    # 2) both old and new client touches that repository
> > +    #
> > +    # recomputing the whole cache in these case is a bit slower that using the
> > +    # good old version (parsing markers and checking them). We could add some
> > +    # logic to fall back to the old way in these cases.
> > +    obscache = repo.obsstore.obscache
> > +    obscache.update(repo) # ensure it is up to date:
> > +    isobs = obscache.get
> > +
> > +    # actually compute the obsolete set
> >      for r in notpublic:
> > -        if getnode(r) in repo.obsstore.successors:
> > +        if isobs(r):
> >              obs.add(r)
> >      return obs
> >
Jun Wu - May 20, 2017, 9 p.m.
After examining this area more carefully, my final conclusions are:

  - _fm1readmarkers, _addsuccessors, _addprecursors are painfully slow.
  - The above slow functions are NOT improved by obscache.
  - For "hg id", none of the above slow functions are called with obscache.
    So, "hg id" is faster.

  - "bumped" and "divergent" revset still require slow functions.
  - For "hg log" with default "troubles" template, slow functions are called
    and obscache won't help speed it up.

I believe a more general purposed, and better way to solve the perf issue is
to build indexes, so precursors[x], successors[x] are O(1) instead of
O(len(allmarkers)). If we go the indexing way, it will solve more problems
and the obscache approach becomes unnecessary.


Excerpts from Jun Wu's message of 2017-05-20 10:43:38 -0700:
> Excerpts from Jun Wu's message of 2017-05-20 10:37:23 -0700:
> > While this does speed up commands like "hg id", people more frequently use
> > "hg log [-G]" which will call "ctx.troubles" by default and that requires a
> > full "obsolete()" revset in multiple places (unstable, bumped and divergent
> > calculation), which won't be helped by this cache.
> 
> I take the above back. This is still helpful as it improves the constant of
> time complexity. But the below text is still true - the time complexity is
> unchanged and a complex caching layer got added. I think we'll have similar
> perf win in a cleaner way (no cache invalidation trouble) by writing C code.
> 
> > Do you plan to build similar bitmap cache for unstable, bumped, divergent
> > too?
> > 
> > Besides, even with this series, _computeobsoleteset is still O(N). The time
> > complexity remains unchanged. This is not a long-term solution.
> > 
> > I believe the long term solution is to make testing O(1) and remove all
> > enumeration for all revsets.
> > 
> > Excerpts from Pierre-Yves David's message of 2017-05-20 17:30:20 +0200:
> > > # HG changeset patch
> > > # User Pierre-Yves David <pierre-yves.david@octobus.net>
> > > # Date 1495198021 -7200
> > > #      Fri May 19 14:47:01 2017 +0200
> > > # Node ID eb7674b12d5a15fc53f10b075dcac7bee91379d2
> > > # Parent  3c2a082a590aa8b57693c24b8461c2afdb8d5556
> > > # EXP-Topic obscache
> > > # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ 
> > > #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/  -r eb7674b12d5a
> > > obscache: use the obscache to compute the obsolete set
> > > 
> > > Now that we have a cache and that the cache is kept up to date, we can use it to
> > > speeds up the obsolete set computation. This way, we no longer need to load the
> > > obsstore for most operation.
> > > 
> > > On the mercurial-core repository, this provide a significant speed up:
> > > 
> > > Running "hg  id -r ."
> > > - before: 0.630 second (0.56s user 0.06s system 99% cpu 0.630)
> > > - after:  0.129 second (0.11s user 0.02s system 98% cpu 0.129)
> > > 
> > > And the obsstore loading operation disappear from execution profile.
> > > 
> > > (note: time spent inside the command drop from 0.4 to 0.04s)
> > > 
> > > To keep the changeset simple it the handling of case were
> > > the cache has not been kept up to date is pretty simple. That might introduce a
> > > small performance impact during the transition in some case. This will get
> > > improved in later changeset.
> > > 
> > > In addition the cache still needs to parse the full obsstore when updating.
> > > There as known way to skip parsing the full obsstore for wrote operation too.
> > > This will also get improved later.
> > > 
> > > diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
> > > --- a/mercurial/obsolete.py
> > > +++ b/mercurial/obsolete.py
> > > @@ -1546,10 +1546,26 @@ def clearobscaches(repo):
> > >  def _computeobsoleteset(repo):
> > >      """the set of obsolete revisions"""
> > >      obs = set()
> > > -    getnode = repo.changelog.node
> > >      notpublic = repo._phasecache.getrevset(repo, (phases.draft, phases.secret))
> > > +    if not notpublic:
> > > +        # all changeset are public, none are obsolete
> > > +        return obs
> > > +
> > > +    # XXX There are a couple of case where the cache could not be up to date:
> > > +    #
> > > +    # 1) no transaction happened in the repository since the upgrade,
> > > +    # 2) both old and new client touches that repository
> > > +    #
> > > +    # recomputing the whole cache in these case is a bit slower that using the
> > > +    # good old version (parsing markers and checking them). We could add some
> > > +    # logic to fall back to the old way in these cases.
> > > +    obscache = repo.obsstore.obscache
> > > +    obscache.update(repo) # ensure it is up to date:
> > > +    isobs = obscache.get
> > > +
> > > +    # actually compute the obsolete set
> > >      for r in notpublic:
> > > -        if getnode(r) in repo.obsstore.successors:
> > > +        if isobs(r):
> > >              obs.add(r)
> > >      return obs
> > >
Pierre-Yves David - May 21, 2017, 11:46 a.m.
On 05/20/2017 05:30 PM, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@octobus.net>
> # Date 1495198021 -7200
> #      Fri May 19 14:47:01 2017 +0200
> # Node ID eb7674b12d5a15fc53f10b075dcac7bee91379d2
> # Parent  3c2a082a590aa8b57693c24b8461c2afdb8d5556
> # EXP-Topic obscache
> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r eb7674b12d5a
> obscache: use the obscache to compute the obsolete set
>
> Now that we have a cache and that the cache is kept up to date, we can use it to
> speeds up the obsolete set computation. This way, we no longer need to load the
> obsstore for most operation.
>
> On the mercurial-core repository, this provide a significant speed up:

More raw data from the perf extension (including obsstore loading):

Before:
! obsolete:
! wall 0.477085 comb 0.470000 user 0.460000 sys 0.010000 (best of 20)
! visible
! wall 0.474748 comb 0.480000 user 0.470000 sys 0.010000 (best of 20)

After:
! obsolete
! wall 0.001269 comb 0.010000 user 0.010000 sys 0.000000 (best of 2111)
! visible
! wall 0.003150 comb 0.000000 user 0.000000 sys 0.000000 (best of 869)
Pierre-Yves David - May 21, 2017, 9:52 p.m.
I chatted a bit with Jun on IRC, he told me to ignore the first two 
emails, so I did (I have not read them)

On 05/20/2017 11:00 PM, Jun Wu wrote:
> After examining this area more carefully, my final conclusions are:
>
>   - _fm1readmarkers, _addsuccessors, _addprecursors are painfully slow.
>   - The above slow functions are NOT improved by obscache.
>   - For "hg id", none of the above slow functions are called with obscache.
>     So, "hg id" is faster.
>
>   - "bumped" and "divergent" revset still require slow functions.
>   - For "hg log" with default "troubles" template, slow functions are called
>     and obscache won't help speed it up.

I was a bit confused by this part, so let us clarify for other readers. 
Yes, troubles computations currently requires to load the full history 
and doing so it slow. But this is unrelated to the current series.

Right now, obsolescence has a baseline impacts all mercurial commands. 
The cache in this series successfully remove that baseline impact.

Commands that needs to access the obsolescence history (for troubles or 
other) still has to pay the obsstore loading cost. We'll have to 
eventually improves that but its is not what this series is about.

> I believe a more general purposed, and better way to solve the perf issue is
> to build indexes, so precursors[x], successors[x] are O(1) instead of
> O(len(allmarkers)). If we go the indexing way, it will solve more problems
> and the obscache approach becomes unnecessary.

On disk index would be useful and we want to have them at some point. 
However, This is significantly more complex to build and not ready yet. 
According to our IRC discussion, Jun started poking at indexes but this 
is at an early stage. The cache in this series has been successfully 
deployed and used by many people for multiple weeks already. So moving 
forward with it for now seems better.

In addition, it is not clear indexes would remove the needs for other 
caches entirely:

* For example, the cache in this series use a straightforward bytearray 
to store a rev-indexable flag. This is extremely efficient to read and 
use. So that cache might still be useful after indexes land.

* Likewise, even if troubles computation get faster with indexes, some 
troubles are still pretty expensive to compute so we'll likely wants a 
cache for them too.

Finally half of this series is about introducing the cache base class. 
That class introduce a simple cache key logic (based on what we already 
did for changelog) and the associated incremental update capability. 
Newer caches in this area will likely reuse that logic. So alternative 
will be able to reuse it to build potential replacement for the cache in 
this series.

To conclude, I'll be happy to see some index for obsolescence markers, 
but the current caches as value on its own and is already completed

Cheers,

Patch

diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
--- a/mercurial/obsolete.py
+++ b/mercurial/obsolete.py
@@ -1546,10 +1546,26 @@  def clearobscaches(repo):
 def _computeobsoleteset(repo):
     """the set of obsolete revisions"""
     obs = set()
-    getnode = repo.changelog.node
     notpublic = repo._phasecache.getrevset(repo, (phases.draft, phases.secret))
+    if not notpublic:
+        # all changeset are public, none are obsolete
+        return obs
+
+    # XXX There are a couple of case where the cache could not be up to date:
+    #
+    # 1) no transaction happened in the repository since the upgrade,
+    # 2) both old and new client touches that repository
+    #
+    # recomputing the whole cache in these case is a bit slower that using the
+    # good old version (parsing markers and checking them). We could add some
+    # logic to fall back to the old way in these cases.
+    obscache = repo.obsstore.obscache
+    obscache.update(repo) # ensure it is up to date:
+    isobs = obscache.get
+
+    # actually compute the obsolete set
     for r in notpublic:
-        if getnode(r) in repo.obsstore.successors:
+        if isobs(r):
             obs.add(r)
     return obs