Submitter | Siddharth Agarwal |
---|---|
Date | May 28, 2014, 5:05 a.m. |
Message ID | <124fd417846a9105b62b.1401253521@dev1738.prn1.facebook.com> |
Download | mbox | patch |
Permalink | /patch/4879/ |
State | Superseded |
Headers | show |
Comments
Siddharth Agarwal <sid0@fb.com> writes: > # HG changeset patch > # User Siddharth Agarwal <sid0@fb.com> > # Date 1401249736 25200 > # Tue May 27 21:02:16 2014 -0700 > # Node ID 124fd417846a9105b62b067c2d21990a2e44467e > # Parent 9ed8b0fa79cfa5f220930c92ac79b93f6d6bd8ee > dirstate.status: assign members one by one instead of unpacking the tuple > > The CPython UNPACK_SEQUENCE opcode has fast paths when the value to be unpacked > is a tuple or a list, but falls back to creating a full-fledged iterator in > general. That is much slower than simply accessing and storing the tuple > members one by one. > > With this patch, hg status and hg diff regain their previous speed. > > The following tests are run against a working copy with over 270,000 files. > Here, 'before' means without this or the previous patch applied. > > Note that in this case `hg perfstatus` isn't representative since it doesn't > take dirstate parsing time into account. > > $ time hg status # best of 5 > before: 2.03s user 1.25s system 99% cpu 3.290 total > after: 2.01s user 1.25s system 99% cpu 3.261 total > > $ time hg diff # best of 5 > before: 1.32s user 0.78s system 99% cpu 2.105 total > after: 1.27s user 0.79s system 99% cpu 2.066 total > > diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py > --- a/mercurial/dirstate.py > +++ b/mercurial/dirstate.py > @@ -823,7 +823,11 @@ > uadd(fn) > continue > Perhaps it would make sense to add a comment here explaining why tuple unpacking is bad in this case? Otherwise someone might revert the change to avoid the "unnecessary" t variable. > - state, mode, size, time = dmap[fn] > + t = dmap[fn] > + state = t[0] > + mode = t[1] > + size = t[2] > + time = t[3] > > if not st and state in "nma": > dadd(fn) > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@selenic.com > http://selenic.com/mailman/listinfo/mercurial-devel
On 05/28/2014 12:26 AM, Martin Geisler wrote: > Siddharth Agarwal <sid0@fb.com> writes: > >> # HG changeset patch >> # User Siddharth Agarwal <sid0@fb.com> >> # Date 1401249736 25200 >> # Tue May 27 21:02:16 2014 -0700 >> # Node ID 124fd417846a9105b62b067c2d21990a2e44467e >> # Parent 9ed8b0fa79cfa5f220930c92ac79b93f6d6bd8ee >> dirstate.status: assign members one by one instead of unpacking the tuple >> >> The CPython UNPACK_SEQUENCE opcode has fast paths when the value to be unpacked >> is a tuple or a list, but falls back to creating a full-fledged iterator in >> general. That is much slower than simply accessing and storing the tuple >> members one by one. >> >> With this patch, hg status and hg diff regain their previous speed. >> >> The following tests are run against a working copy with over 270,000 files. >> Here, 'before' means without this or the previous patch applied. >> >> Note that in this case `hg perfstatus` isn't representative since it doesn't >> take dirstate parsing time into account. >> >> $ time hg status # best of 5 >> before: 2.03s user 1.25s system 99% cpu 3.290 total >> after: 2.01s user 1.25s system 99% cpu 3.261 total >> >> $ time hg diff # best of 5 >> before: 1.32s user 0.78s system 99% cpu 2.105 total >> after: 1.27s user 0.79s system 99% cpu 2.066 total >> >> diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py >> --- a/mercurial/dirstate.py >> +++ b/mercurial/dirstate.py >> @@ -823,7 +823,11 @@ >> uadd(fn) >> continue >> > > Perhaps it would make sense to add a comment here explaining why tuple > unpacking is bad in this case? Otherwise someone might revert the change > to avoid the "unnecessary" t variable. s/Perhaps it would make sense to/You must definitely/
On 05/28/2014 09:37 AM, Pierre-Yves David wrote: > > > On 05/28/2014 12:26 AM, Martin Geisler wrote: >> Siddharth Agarwal <sid0@fb.com> writes: >> >>> # HG changeset patch >>> # User Siddharth Agarwal <sid0@fb.com> >>> # Date 1401249736 25200 >>> # Tue May 27 21:02:16 2014 -0700 >>> # Node ID 124fd417846a9105b62b067c2d21990a2e44467e >>> # Parent 9ed8b0fa79cfa5f220930c92ac79b93f6d6bd8ee >>> dirstate.status: assign members one by one instead of unpacking the >>> tuple >>> >>> The CPython UNPACK_SEQUENCE opcode has fast paths when the value to >>> be unpacked >>> is a tuple or a list, but falls back to creating a full-fledged >>> iterator in >>> general. That is much slower than simply accessing and storing the >>> tuple >>> members one by one. >>> >>> With this patch, hg status and hg diff regain their previous speed. >>> >>> The following tests are run against a working copy with over 270,000 >>> files. >>> Here, 'before' means without this or the previous patch applied. >>> >>> Note that in this case `hg perfstatus` isn't representative since it >>> doesn't >>> take dirstate parsing time into account. >>> >>> $ time hg status # best of 5 >>> before: 2.03s user 1.25s system 99% cpu 3.290 total >>> after: 2.01s user 1.25s system 99% cpu 3.261 total >>> >>> $ time hg diff # best of 5 >>> before: 1.32s user 0.78s system 99% cpu 2.105 total >>> after: 1.27s user 0.79s system 99% cpu 2.066 total >>> >>> diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py >>> --- a/mercurial/dirstate.py >>> +++ b/mercurial/dirstate.py >>> @@ -823,7 +823,11 @@ >>> uadd(fn) >>> continue >>> >> >> Perhaps it would make sense to add a comment here explaining why tuple >> unpacking is bad in this case? Otherwise someone might revert the change >> to avoid the "unnecessary" t variable. > > s/Perhaps it would make sense to/You must definitely/ > Agreed. I'll wait for more feedback and then send a V2.
Patch
diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py --- a/mercurial/dirstate.py +++ b/mercurial/dirstate.py @@ -823,7 +823,11 @@ uadd(fn) continue - state, mode, size, time = dmap[fn] + t = dmap[fn] + state = t[0] + mode = t[1] + size = t[2] + time = t[3] if not st and state in "nma": dadd(fn)