Patchwork [5,of,5,V2] dirstate.status: assign members one by one instead of unpacking the tuple

login
register
mail settings
Submitter Siddharth Agarwal
Date May 29, 2014, 11:19 p.m.
Message ID <c862f5d80f1005d61025.1401405560@dev1738.prn1.facebook.com>
Download mbox | patch
Permalink /patch/4901/
State Accepted
Commit 4b2ebd3187a175417b10a37175eb7925e0bd85bd
Headers show

Comments

Siddharth Agarwal - May 29, 2014, 11:19 p.m.
# HG changeset patch
# User Siddharth Agarwal <sid0@fb.com>
# Date 1401249736 25200
#      Tue May 27 21:02:16 2014 -0700
# Node ID c862f5d80f1005d61025dc9bc76f4243744a01f1
# Parent  6bc322288121913762c877241a3a95ed4c1ada75
dirstate.status: assign members one by one instead of unpacking the tuple

With this patch, hg status and hg diff regain their previous speed.

The following tests are run against a working copy with over 270,000 files.
Here, 'before' means without this or the previous patch applied.

Note that in this case `hg perfstatus` isn't representative since it doesn't
take dirstate parsing time into account.

$ time hg status  # best of 5
before: 2.03s user 1.25s system 99% cpu 3.290 total
after:  2.01s user 1.25s system 99% cpu 3.261 total

$ time hg diff    # best of 5
before: 1.32s user 0.78s system 99% cpu 2.105 total
after:  1.27s user 0.79s system 99% cpu 2.066 total
Matt Mackall - June 26, 2014, 5:46 p.m.
On Thu, 2014-05-29 at 16:19 -0700, Siddharth Agarwal wrote:
> # HG changeset patch
> # User Siddharth Agarwal <sid0@fb.com>
> # Date 1401249736 25200
> #      Tue May 27 21:02:16 2014 -0700
> # Node ID c862f5d80f1005d61025dc9bc76f4243744a01f1
> # Parent  6bc322288121913762c877241a3a95ed4c1ada75
> dirstate.status: assign members one by one instead of unpacking the tuple

These are queued for default, thanks.

Patch

diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -823,7 +823,18 @@ 
                     uadd(fn)
                 continue
 
-            state, mode, size, time = dmap[fn]
+            # This is equivalent to 'state, mode, size, time = dmap[fn]' but not
+            # written like that for performance reasons. dmap[fn] is not a
+            # Python tuple in compiled builds. The CPython UNPACK_SEQUENCE
+            # opcode has fast paths when the value to be unpacked is a tuple or
+            # a list, but falls back to creating a full-fledged iterator in
+            # general. That is much slower than simply accessing and storing the
+            # tuple members one by one.
+            t = dmap[fn]
+            state = t[0]
+            mode = t[1]
+            size = t[2]
+            time = t[3]
 
             if not st and state in "nma":
                 dadd(fn)