Submitter | Durham Goode |
---|---|
Date | Oct. 21, 2014, 2:11 a.m. |
Message ID | <1a9990727f5d75d70bb0.1413857472@dev2000.prn2.facebook.com> |
Download | mbox | patch |
Permalink | /patch/6432/ |
State | Accepted |
Headers | show |
Comments
On 10/21/2014 04:11 AM, Durham Goode wrote: > # HG changeset patch > # User Durham Goode <durham@fb.com> > # Date 1413856209 25200 > # Mon Oct 20 18:50:09 2014 -0700 > # Node ID 1a9990727f5d75d70bb0d00fdffac571449574b6 > # Parent 67cb1ab1ad1db6a66b96ac0c0333256b5383a3a2 > rebase: improve base revset performance > > The old revset had pretty terrible performance on large repositories (12+ > seconds). This new revset achieves the same result in only 0.7s. As we improve > the underlying revset APIs we can probably get this revset down to 'only(base, > dest)::', but at the moment that version still takes 2s. > > diff --git a/hgext/rebase.py b/hgext/rebase.py > --- a/hgext/rebase.py > +++ b/hgext/rebase.py > @@ -272,9 +272,9 @@ def rebase(ui, repo, **opts): > ui.status(_('empty "base" revision set - ' > "can't compute rebase set\n")) > return 1 > - rebaseset = repo.revs( > - '(children(ancestor(%ld, %d)) and ::(%ld))::', > - base, dest, base) > + commonanc = repo.revs('ancestor(%ld, %d)', base, dest).first() > + rebaseset = repo.revs('(%d::(%ld) - %d)::', > + commonanc, base, commonanc) Nice, but not really suitable for stable according to the normal policy. Let's see. I guess the trick is at a::b has very efficient pruning. Very clever. The () around %ld is just cargo cult, right? [Here I would have said something important. But it turned out that I didn't have anything important to say. Only that it is an "interesting" challenge to optimize the old expression and the proposed expressions to make them as fast as this rewrite. So I will just say:] LGTM. /Mads
On 10/20/2014 07:11 PM, Durham Goode wrote: > # HG changeset patch > # User Durham Goode <durham@fb.com> > # Date 1413856209 25200 > # Mon Oct 20 18:50:09 2014 -0700 > # Node ID 1a9990727f5d75d70bb0d00fdffac571449574b6 > # Parent 67cb1ab1ad1db6a66b96ac0c0333256b5383a3a2 > rebase: improve base revset performance > > The old revset had pretty terrible performance on large repositories (12+ > seconds). This new revset achieves the same result in only 0.7s. As we improve > the underlying revset APIs we can probably get this revset down to 'only(base, > dest)::', but at the moment that version still takes 2s. > > diff --git a/hgext/rebase.py b/hgext/rebase.py > --- a/hgext/rebase.py > +++ b/hgext/rebase.py > @@ -272,9 +272,9 @@ def rebase(ui, repo, **opts): > ui.status(_('empty "base" revision set - ' > "can't compute rebase set\n")) > return 1 > - rebaseset = repo.revs( > - '(children(ancestor(%ld, %d)) and ::(%ld))::', > - base, dest, base) > + commonanc = repo.revs('ancestor(%ld, %d)', base, dest).first() > + rebaseset = repo.revs('(%d::(%ld) - %d)::', > + commonanc, base, commonanc) So we suffer here from the cost of `::(X)` and the unability of storing the ancestors call in a variable. I wonder if if we can get similar performance by limiting the walk of `::(X)` with the min of the subset. There is probably no value for going through revset got the ancestors call. We could call the ancestors code directly. Not sure about stable suitability. > if not rebaseset: > # transform to list because smartsets are not comparable to > # lists. This should be improved to honor lazyness of
On 10/20/14 10:25 PM, Pierre-Yves David wrote: > > > On 10/20/2014 07:11 PM, Durham Goode wrote: >> # HG changeset patch >> # User Durham Goode <durham@fb.com> >> # Date 1413856209 25200 >> # Mon Oct 20 18:50:09 2014 -0700 >> # Node ID 1a9990727f5d75d70bb0d00fdffac571449574b6 >> # Parent 67cb1ab1ad1db6a66b96ac0c0333256b5383a3a2 >> rebase: improve base revset performance >> >> The old revset had pretty terrible performance on large repositories >> (12+ >> seconds). This new revset achieves the same result in only 0.7s. As >> we improve >> the underlying revset APIs we can probably get this revset down to >> 'only(base, >> dest)::', but at the moment that version still takes 2s. >> >> diff --git a/hgext/rebase.py b/hgext/rebase.py >> --- a/hgext/rebase.py >> +++ b/hgext/rebase.py >> @@ -272,9 +272,9 @@ def rebase(ui, repo, **opts): >> ui.status(_('empty "base" revision set - ' >> "can't compute rebase set\n")) >> return 1 >> - rebaseset = repo.revs( >> - '(children(ancestor(%ld, %d)) and ::(%ld))::', >> - base, dest, base) >> + commonanc = repo.revs('ancestor(%ld, %d)', base, >> dest).first() >> + rebaseset = repo.revs('(%d::(%ld) - %d)::', >> + commonanc, base, commonanc) > > So we suffer here from the cost of `::(X)` and the unability of > storing the ancestors call in a variable. > The common ancestor is very cheap to compute, so storing it in a variable was just so I didn't have to duplicate the ancestor() bit in the revset in an ugly way, not really for performance. > I wonder if if we can get similar performance by limiting the walk of > `::(X)` with the min of the subset. > In theory the revset code should be smart enough to do this efficiently, but that change is too risky for the freeze. > There is probably no value for going through revset got the ancestors > call. We could call the ancestors code directly. > I wanted to minimize the deviation from the old code. Using ancestor.whatever() might have subtle differences I'm not aware of. > Not sure about stable suitability. I'll let Matt decide. He seemed interested in taking my old original fix for stable before the freeze, so I figured this would qualify.
On Mon, 2014-10-20 at 19:11 -0700, Durham Goode wrote: > # HG changeset patch > # User Durham Goode <durham@fb.com> > # Date 1413856209 25200 > # Mon Oct 20 18:50:09 2014 -0700 > # Node ID 1a9990727f5d75d70bb0d00fdffac571449574b6 > # Parent 67cb1ab1ad1db6a66b96ac0c0333256b5383a3a2 > rebase: improve base revset performance Queued for stable, thanks.
Patch
diff --git a/hgext/rebase.py b/hgext/rebase.py --- a/hgext/rebase.py +++ b/hgext/rebase.py @@ -272,9 +272,9 @@ def rebase(ui, repo, **opts): ui.status(_('empty "base" revision set - ' "can't compute rebase set\n")) return 1 - rebaseset = repo.revs( - '(children(ancestor(%ld, %d)) and ::(%ld))::', - base, dest, base) + commonanc = repo.revs('ancestor(%ld, %d)', base, dest).first() + rebaseset = repo.revs('(%d::(%ld) - %d)::', + commonanc, base, commonanc) if not rebaseset: # transform to list because smartsets are not comparable to # lists. This should be improved to honor lazyness of