Submitter | Augie Fackler |
---|---|
Date | June 10, 2016, 4:41 a.m. |
Message ID | <bd2caf6762219aba40ed.1465533675@imladris.local> |
Download | mbox | patch |
Permalink | /patch/15453/ |
State | Accepted |
Headers | show |
Comments
On Fri, 2016-06-10 at 00:41 -0400, Augie Fackler wrote: > # HG changeset patch > # User Augie Fackler <raf@durin42.com> > # Date 1465531834 14400 > # Fri Jun 10 00:10:34 2016 -0400 > # Node ID bd2caf6762219aba40edd4dd697fee70de306940 > # Parent 16f5b31b1228212dc175b5a0baf3601cca48330d > revlog: use hashlib.sha1 directly instead of through util > > Also remove module-local _sha alias, which was barely used. @@ -40,7 +41,6 @@ from . import ( > _unpack = struct.unpack > _compress = zlib.compress > _decompress = zlib.decompress > -_sha = util.sha1 These are each a micro-optimization on Python's abysmal lookup path. -- Mathematics is the supreme nostalgia of our time.
> On Jun 10, 2016, at 4:47 PM, Matt Mackall <mpm@selenic.com> wrote: > > On Fri, 2016-06-10 at 00:41 -0400, Augie Fackler wrote: >> # HG changeset patch >> # User Augie Fackler <raf@durin42.com> >> # Date 1465531834 14400 >> # Fri Jun 10 00:10:34 2016 -0400 >> # Node ID bd2caf6762219aba40edd4dd697fee70de306940 >> # Parent 16f5b31b1228212dc175b5a0baf3601cca48330d >> revlog: use hashlib.sha1 directly instead of through util >> >> Also remove module-local _sha alias, which was barely used. > > @@ -40,7 +41,6 @@ from . import ( >> _unpack = struct.unpack >> _compress = zlib.compress >> _decompress = zlib.decompress >> -_sha = util.sha1 > > These are each a micro-optimization on Python's abysmal lookup path. Ugh. Does it make a measurable difference on a perf command? > > -- > Mathematics is the supreme nostalgia of our time. > > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
On Fri, Jun 10, 2016 at 2:36 PM, Augie Fackler <raf@durin42.com> wrote: > > > On Jun 10, 2016, at 4:47 PM, Matt Mackall <mpm@selenic.com> wrote: > > > > On Fri, 2016-06-10 at 00:41 -0400, Augie Fackler wrote: > >> # HG changeset patch > >> # User Augie Fackler <raf@durin42.com> > >> # Date 1465531834 14400 > >> # Fri Jun 10 00:10:34 2016 -0400 > >> # Node ID bd2caf6762219aba40edd4dd697fee70de306940 > >> # Parent 16f5b31b1228212dc175b5a0baf3601cca48330d > >> revlog: use hashlib.sha1 directly instead of through util > >> > >> Also remove module-local _sha alias, which was barely used. > > > > @@ -40,7 +41,6 @@ from . import ( > >> _unpack = struct.unpack > >> _compress = zlib.compress > >> _decompress = zlib.decompress > >> -_sha = util.sha1 > > > > These are each a micro-optimization on Python's abysmal lookup path. > > Ugh. Does it make a measurable difference on a perf command? > I would go for a revset that accesses non-indexed changelog revision data. Something like author() or date(). That will perform a sha1 for every changeset without involving too many other operations. > > > > > -- > > Mathematics is the supreme nostalgia of our time. > > > > _______________________________________________ > > Mercurial-devel mailing list > > Mercurial-devel@mercurial-scm.org > > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel > > > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel > >
> On Jun 10, 2016, at 5:41 PM, Gregory Szorc <gregory.szorc@gmail.com> wrote: > > On Fri, Jun 10, 2016 at 2:36 PM, Augie Fackler <raf@durin42.com <mailto:raf@durin42.com>> wrote: > > > On Jun 10, 2016, at 4:47 PM, Matt Mackall <mpm@selenic.com <mailto:mpm@selenic.com>> wrote: > > > > On Fri, 2016-06-10 at 00:41 -0400, Augie Fackler wrote: > >> # HG changeset patch > >> # User Augie Fackler <raf@durin42.com <mailto:raf@durin42.com>> > >> # Date 1465531834 14400 > >> # Fri Jun 10 00:10:34 2016 -0400 > >> # Node ID bd2caf6762219aba40edd4dd697fee70de306940 > >> # Parent 16f5b31b1228212dc175b5a0baf3601cca48330d > >> revlog: use hashlib.sha1 directly instead of through util > >> > >> Also remove module-local _sha alias, which was barely used. > > > > @@ -40,7 +41,6 @@ from . import ( > >> _unpack = struct.unpack > >> _compress = zlib.compress > >> _decompress = zlib.decompress > >> -_sha = util.sha1 > > > > These are each a micro-optimization on Python's abysmal lookup path. > > Ugh. Does it make a measurable difference on a perf command? > > I would go for a revset that accesses non-indexed changelog revision data. Something like author() or date(). That will perform a sha1 for every changeset without involving too many other operations. Curiously, this appears to make things worse? See http://hg.durin42.com/hg-wip/rev/997c8cf8f586 for a patch and some perfrevset results. I’m a little suspicious of my results, but this result lines up with what I’ve seen elsewhere - it’s either a little worse or too close to be a meaningful result. > > > > > > -- > > Mathematics is the supreme nostalgia of our time. > > > > _______________________________________________ > > Mercurial-devel mailing list > > Mercurial-devel@mercurial-scm.org <mailto:Mercurial-devel@mercurial-scm.org> > > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel <https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel> > > > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org <mailto:Mercurial-devel@mercurial-scm.org> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel <https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel>
On Thu, 2016-06-23 at 21:51 -0400, Augie Fackler wrote: > > > > On Jun 10, 2016, at 5:41 PM, Gregory Szorc <gregory.szorc@gmail.com> wrote: > > > > On Fri, Jun 10, 2016 at 2:36 PM, Augie Fackler <raf@durin42.com <mailto:raf@ > > durin42.com>> wrote: > > > > > > > > On Jun 10, 2016, at 4:47 PM, Matt Mackall <mpm@selenic.com <mailto:mpm@sel > > > enic.com>> wrote: > > > > > > On Fri, 2016-06-10 at 00:41 -0400, Augie Fackler wrote: > > > > > > > > # HG changeset patch > > > > # User Augie Fackler <raf@durin42.com <mailto:raf@durin42.com>> > > > > # Date 1465531834 14400 > > > > # Fri Jun 10 00:10:34 2016 -0400 > > > > # Node ID bd2caf6762219aba40edd4dd697fee70de306940 > > > > # Parent 16f5b31b1228212dc175b5a0baf3601cca48330d > > > > revlog: use hashlib.sha1 directly instead of through util > > > > > > > > Also remove module-local _sha alias, which was barely used. > > > @@ -40,7 +41,6 @@ from . import ( > > > > > > > > _unpack = struct.unpack > > > > _compress = zlib.compress > > > > _decompress = zlib.decompress > > > > -_sha = util.sha1 > > > These are each a micro-optimization on Python's abysmal lookup path. > > Ugh. Does it make a measurable difference on a perf command? > > > > I would go for a revset that accesses non-indexed changelog revision data. > > Something like author() or date(). That will perform a sha1 for every > > changeset without involving too many other operations. > Curiously, this appears to make things worse? See http://hg.durin42.com/hg-wip > /rev/997c8cf8f586 for a patch and some perfrevset results. > > I’m a little suspicious of my results, but this result lines up with what I’ve > seen elsewhere - it’s either a little worse or too close to be a meaningful > result. I couldn't measure it last week. Power management effects seemed to dominate. Oh well. -- Mathematics is the supreme nostalgia of our time.
Patch
diff --git a/mercurial/revlog.py b/mercurial/revlog.py --- a/mercurial/revlog.py +++ b/mercurial/revlog.py @@ -15,6 +15,7 @@ from __future__ import absolute_import import collections import errno +import hashlib import os import struct import zlib @@ -40,7 +41,6 @@ from . import ( _unpack = struct.unpack _compress = zlib.compress _decompress = zlib.decompress -_sha = util.sha1 # revlog header flags REVLOGV0 = 0 @@ -74,7 +74,7 @@ def gettype(q): def offset_type(offset, type): return long(long(offset) << 16 | type) -_nullhash = _sha(nullid) +_nullhash = hashlib.sha1(nullid) def hash(text, p1, p2): """generate a hash from the given text and its parent hashes @@ -92,7 +92,7 @@ def hash(text, p1, p2): # none of the parent nodes are nullid l = [p1, p2] l.sort() - s = _sha(l[0]) + s = hashlib.sha1(l[0]) s.update(l[1]) s.update(text) return s.digest()