Patchwork [8,of,9] manifestmerge: use dicthelpers.diff and join

login
register
mail settings
Submitter Siddharth Agarwal
Date March 25, 2013, 12:27 a.m.
Message ID <20a16b9bac50d7c66e59.1364171272@sid0x220>
Download mbox | patch
Permalink /patch/1184/
State Superseded
Commit 381c0ef72a56ac647a8d4de017a51fee7f80fec3
Headers show

Comments

Siddharth Agarwal - March 25, 2013, 12:27 a.m.
# HG changeset patch
# User Siddharth Agarwal <sid0@fb.com>
# Date 1364170637 25200
#      Sun Mar 24 17:17:17 2013 -0700
# Node ID 20a16b9bac50d7c66e599a2f5f0447764468bb7f
# Parent  97d7455d7bd827c8a3c20761b266eb9cfc11ed08
manifestmerge: use dicthelpers.diff and join

This patch improves manifestmerge performance significantly.

In a repository with 170,000 files, the following results were observed on a
clean working directory. Revision '.' adds one file.

hg perfmergecalculate -r .
- before this patch: 0.41 seconds
- Python dicthelpers: 0.13 seconds
- C dicthelpers: 0.04 seconds

hg perfmergecalculate -r .^
- before this patch: 0.53 seconds
- Python dicthelpers: 0.24 seconds
- C dicthelpers: 0.13 seconds

Comparing against '.' is much faster than comparing against '.^' because with
'.', the wctx and p2 manifest strings have the same identity, so comparisons
are simply pointer equality. With '.^', the strings have different identities
so we need to perform memcmps.

Any operation that uses manifestmerge benefits. With the C dicthelpers:
- hg update . goes from 2.04 seconds to 1.61
- hg update .^ goes from 2.52 seconds to 2.10
- hg rebase -r . -d .~6 (involves 4 merges) goes from 11.8 seconds to 10.2
Bryan O'Sullivan - March 25, 2013, 5:26 p.m.
On Sun, Mar 24, 2013 at 5:27 PM, Siddharth Agarwal <sid0@fb.com> wrote:

> Any operation that uses manifestmerge benefits. With the C dicthelpers:
> [...]
>

What is the difference if using the Python dicthelpers? From the rest of
this patch description, it looks like the C code is saving only 0.1
seconds, which would seem to put it squarely in the "not worth it" category.
Siddharth Agarwal - March 25, 2013, 6:27 p.m.
On 03/25/2013 10:26 AM, Bryan O'Sullivan wrote:
> On Sun, Mar 24, 2013 at 5:27 PM, Siddharth Agarwal <sid0@fb.com 
> <mailto:sid0@fb.com>> wrote:
>
>     Any operation that uses manifestmerge benefits. With the C
>     dicthelpers: [...]
>
>
> What is the difference if using the Python dicthelpers? From the rest 
> of this patch description, it looks like the C code is saving only 0.1 
> seconds, which would seem to put it squarely in the "not worth it" 
> category. 

Seems to be closer to 0.15 seconds (times 4 = 0.6 seconds for rebase) in 
practice. hg update . in 1.75 seconds, hg update .^ in 2.25 and hg 
rebase -r . -d .~6 in 10.8.

Whether it's worth it or not, I don't know. I myself was surprised at 
how relatively fast the Python dicthelpers turned out to be.

Patch

diff --git a/mercurial/merge.py b/mercurial/merge.py
--- a/mercurial/merge.py
+++ b/mercurial/merge.py
@@ -7,7 +7,7 @@ 
 
 from node import nullid, nullrev, hex, bin
 from i18n import _
-import error, util, filemerge, copies, subrepo, worker
+import error, util, filemerge, copies, subrepo, worker, dicthelpers
 import errno, os, shutil
 
 class mergestate(object):
@@ -238,17 +238,27 @@  def manifestmerge(repo, wctx, p2, pa, br
 
     aborts, prompts = [], []
     # Compare manifests
-    for f, n1 in m1.iteritems():
+    fdiff = dicthelpers.diff(m1, m2)
+    flagsdiff = m1.flagsdiff(m2)
+    diff12 = dicthelpers.join(fdiff, flagsdiff)
+
+    for f, (n12, fl12) in diff12.iteritems():
+        if n12:
+            n1, n2 = n12
+        else: # file contents didn't change, but flags did
+            n1 = n2 = m1[f]
+        if fl12:
+            fl1, fl2 = fl12
+        else: # flags didn't change, file contents did
+            fl1 = fl2 = m1.flags(f)
+
         if partial and not partial(f):
             continue
-        if f in m2:
-            n2 = m2[f]
-            fl1, fl2, fla = m1.flags(f), m2.flags(f), ma.flags(f)
+        if n1 and n2:
+            fla = ma.flags(f)
             nol = 'l' not in fl1 + fl2 + fla
             a = ma.get(f, nullid)
-            if n1 == n2 and fl1 == fl2:
-                pass # same - keep local
-            elif n2 == a and fl2 == fla:
+            if n2 == a and fl2 == fla:
                 pass # remote unchanged - keep local
             elif n1 == a and fl1 == fla: # local unchanged - use remote
                 if n1 == n2: # optimization: keep local content
@@ -263,32 +273,26 @@  def manifestmerge(repo, wctx, p2, pa, br
                 actions.append((f, "m", (f, f, False), "versions differ"))
         elif f in copied: # files we'll deal with on m2 side
             pass
-        elif f in movewithdir: # directory rename
+        elif n1 and f in movewithdir: # directory rename
             f2 = movewithdir[f]
             actions.append((f, "d", (None, f2, m1.flags(f)),
                             "remote renamed directory to " + f2))
-        elif f in copy:
+        elif n1 and f in copy:
             f2 = copy[f]
             actions.append((f, "m", (f2, f, False),
                             "local copied/moved to " + f2))
-        elif f in ma: # clean, a different, no remote
+        elif n1 and f in ma: # clean, a different, no remote
             if n1 != ma[f]:
                 prompts.append((f, "cd")) # prompt changed/deleted
             elif n1[20:] == "a": # added, no remote
                 actions.append((f, "f", None, "remote deleted"))
             else:
                 actions.append((f, "r", None, "other deleted"))
-
-    for f, n2 in m2.iteritems():
-        if partial and not partial(f):
-            continue
-        if f in m1 or f in copied: # files already visited
-            continue
-        if f in movewithdir:
+        elif n2 and f in movewithdir:
             f2 = movewithdir[f]
             actions.append((None, "d", (f, f2, m2.flags(f)),
                             "local renamed directory to " + f2))
-        elif f in copy:
+        elif n2 and f in copy:
             f2 = copy[f]
             if f2 in m2:
                 actions.append((f2, "m", (f, f, False),
@@ -296,7 +300,7 @@  def manifestmerge(repo, wctx, p2, pa, br
             else:
                 actions.append((f2, "m", (f, f, True),
                                 "remote moved to " + f))
-        elif f not in ma:
+        elif n2 and f not in ma:
             # local unknown, remote created: the logic is described by the
             # following table:
             #
@@ -320,7 +324,7 @@  def manifestmerge(repo, wctx, p2, pa, br
                     aborts.append((f, "ud"))
                 else:
                     actions.append((f, "g", (m2.flags(f),), "remote created"))
-        elif n2 != ma[f]:
+        elif n2 and n2 != ma[f]:
             prompts.append((f, "dc")) # prompt deleted/changed
 
     for f, m in sorted(aborts):