Submitter | Gregory Szorc |
---|---|
Date | Nov. 22, 2015, 1:14 a.m. |
Message ID | <f8ab893d8dac5246224f.1448154843@ubuntu-main> |
Download | mbox | patch |
Permalink | /patch/11570/ |
State | Deferred |
Delegated to: | Pierre-Yves David |
Headers | show |
Comments
On Sat, Nov 21, 2015 at 5:14 PM, Gregory Szorc <gregory.szorc@gmail.com> wrote: > # HG changeset patch > # User Gregory Szorc <gregory.szorc@gmail.com> > # Date 1448152280 28800 > # Sat Nov 21 16:31:20 2015 -0800 > # Node ID f8ab893d8dac5246224f330fafdea35c0bcef313 > # Parent ad891b362564b30ec69bc844a3fd98e73b6e032e > commands.log: operate on frozen repository > > `hg log` is a read-only command. As such, it can use a frozen > repository instance so it isn't constantly performing changelog > freshness checks. > > On a local Firefox repository with ~312,000 changesets, this > simple change results in `hg log -T '{rev}\n' > /dev/null` > with an empty/default hgrc becoming faster: > > before: 7.81s > after: 6.11s > delta: -1.70s (78.2% of original > > A full `hg log` shows similar percentage improvement: > > before: 31.14s > after: 24.95s > delta: -6.19s (80.1% of original) > > I'm actually surprised the full `hg log` output showed a similar > percentage speedup: I would have expected it to only decrease by > ~1.70s. But counting function calls does reveal that the full > template output incurs an extra changectx instantiation for each > revision as well as additional changelog accesses (for parents > I believe). > > For comparison purposes, git 2.6.3 for the same count of commits on > a Git conversion of the Firefox repo takes ~1.85s for > `git log --pretty=%H` and ~2.36s for `git log`. We're 3.3x slower > for printing revisions (4.2x before) and 10.6x slower for the > default template output (13.2x before). > So, I'm looking at further ways to optimize `hg log` performance. `hg log -q` is a fast path that doesn't involve the templating system. It's insanely fast. 2.79s, or 2.19x faster. For kicks, I experimented with PyPy 4.0.0. `hg log -q > /dev/null` is 0.50s (faster than Git!). `hg log` is 7.96s. That's 12.2x and 3.13x faster, respectively. Insane. I experimented with PyPy for `hg unbundle` performance analysis a few weeks ago. changelog reading was 2-3x faster. Manifest reading was slower. But apparently that is due to the manifest reading Python code doing stupid things compared to the C code. PyPy appears to have a ~40ms startup overhead compared to CPython. Throw in chg and that disappears and suddenly PyPy seems like a pretty compelling alternative to writing code in languages that aren't Python (C, Rust, etc). Anyway, it's pretty clear the templating code is a bit of a performance bottleneck. I'll see if I can work some more magic... > > diff --git a/mercurial/commands.py b/mercurial/commands.py > --- a/mercurial/commands.py > +++ b/mercurial/commands.py > @@ -4784,16 +4784,18 @@ def log(ui, repo, *pats, **opts): > specifying revisions. > > See :hg:`help templates` for more about pre-packaged styles and > specifying custom templates. > > Returns 0 on success. > > """ > + repo = repo.frozen() > + > if opts.get('follow') and opts.get('rev'): > opts['rev'] = [revset.formatspec('reverse(::%lr)', > opts.get('rev'))] > del opts['follow'] > > if opts.get('graph'): > return cmdutil.graphlog(ui, repo, *pats, **opts) > > revs, expr, filematcher = cmdutil.getlogrevs(repo, pats, opts) >
On Sat, 21 Nov 2015 18:48:10 -0800, Gregory Szorc wrote: > On Sat, Nov 21, 2015 at 5:14 PM, Gregory Szorc <gregory.szorc@gmail.com> > wrote: > > # HG changeset patch > > # User Gregory Szorc <gregory.szorc@gmail.com> > > # Date 1448152280 28800 > > # Sat Nov 21 16:31:20 2015 -0800 > > # Node ID f8ab893d8dac5246224f330fafdea35c0bcef313 > > # Parent ad891b362564b30ec69bc844a3fd98e73b6e032e > > commands.log: operate on frozen repository > > > > `hg log` is a read-only command. As such, it can use a frozen > > repository instance so it isn't constantly performing changelog > > freshness checks. > > > > On a local Firefox repository with ~312,000 changesets, this > > simple change results in `hg log -T '{rev}\n' > /dev/null` > > with an empty/default hgrc becoming faster: > > > > before: 7.81s > > after: 6.11s > > delta: -1.70s (78.2% of original > > > > A full `hg log` shows similar percentage improvement: > > > > before: 31.14s > > after: 24.95s > > delta: -6.19s (80.1% of original) > > > > I'm actually surprised the full `hg log` output showed a similar > > percentage speedup: I would have expected it to only decrease by > > ~1.70s. But counting function calls does reveal that the full > > template output incurs an extra changectx instantiation for each > > revision as well as additional changelog accesses (for parents > > I believe). > > > > For comparison purposes, git 2.6.3 for the same count of commits on > > a Git conversion of the Firefox repo takes ~1.85s for > > `git log --pretty=%H` and ~2.36s for `git log`. We're 3.3x slower > > for printing revisions (4.2x before) and 10.6x slower for the > > default template output (13.2x before). > > > > So, I'm looking at further ways to optimize `hg log` performance. > > `hg log -q` is a fast path that doesn't involve the templating system. It's > insanely fast. 2.79s, or 2.19x faster. "log -q" still involves the templater. % export HGRCPATH=/dev/null % export LANG=C % hg log --time -l20000 -Tdefault > /dev/null time: real 7.210 secs (user 7.190+0.000 sys 0.020+0.000) % hg log --time -l20000 > /dev/null time: real 3.260 secs (user 3.250+0.000 sys 0.010+0.000) % hg log --time -l20000 -Tdefault -q > /dev/null time: real 1.130 secs (user 1.120+0.000 sys 0.010+0.000) % hg log --time -l20000 -q > /dev/null time: real 0.370 secs (user 0.370+0.000 sys 0.020+0.000) > For kicks, I experimented with PyPy 4.0.0. `hg log -q > /dev/null` is 0.50s > (faster than Git!). `hg log` is 7.96s. That's 12.2x and 3.13x faster, > respectively. Insane. I experimented with PyPy for `hg unbundle` > performance analysis a few weeks ago. changelog reading was 2-3x faster. > Manifest reading was slower. But apparently that is due to the manifest > reading Python code doing stupid things compared to the C code. > > PyPy appears to have a ~40ms startup overhead compared to CPython. Throw in > chg and that disappears and suddenly PyPy seems like a pretty compelling > alternative to writing code in languages that aren't Python (C, Rust, etc). I should say that chg can't be run on Windows. > Anyway, it's pretty clear the templating code is a bit of a performance > bottleneck. I'll see if I can work some more magic... Yes, and PyPy can optimize it as templater would execute the same code path repeatedly.
Patch
diff --git a/mercurial/commands.py b/mercurial/commands.py --- a/mercurial/commands.py +++ b/mercurial/commands.py @@ -4784,16 +4784,18 @@ def log(ui, repo, *pats, **opts): specifying revisions. See :hg:`help templates` for more about pre-packaged styles and specifying custom templates. Returns 0 on success. """ + repo = repo.frozen() + if opts.get('follow') and opts.get('rev'): opts['rev'] = [revset.formatspec('reverse(::%lr)', opts.get('rev'))] del opts['follow'] if opts.get('graph'): return cmdutil.graphlog(ui, repo, *pats, **opts) revs, expr, filematcher = cmdutil.getlogrevs(repo, pats, opts)