Patchwork [6,of,7,v2] bdiff: give slight preference to appending lines

login
register
mail settings
Submitter Mads Kiilerich
Date Nov. 15, 2016, 8:57 p.m.
Message ID <e022e995415bd48ffe6d.1479243427@madski>
Download mbox | patch
Permalink /patch/17593/
State Accepted
Headers show

Comments

Mads Kiilerich - Nov. 15, 2016, 8:57 p.m.
# HG changeset patch
# User Mads Kiilerich <madski@unity3d.com>
# Date 1479243409 -3600
#      Tue Nov 15 21:56:49 2016 +0100
# Node ID e022e995415bd48ffe6d4d03dc97b99dd547cde1
# Parent  5ae7e8061a9671e8941a7a17e316254d228acf59
bdiff: give slight preference to appending lines

[This change could be folded into the previous changeset to minimize the repo
churn ...]

The general preference to matches in the middle of bdiff ranges helps getting
balanced recursion and efficient computation. But, as previous changes have
shown, it might also give diffs that seems "obviously wrong".

To mitigate that: If the best match on the A side starts at the beginning of
the bdiff range, don't aim for the middle-most B side match but for the
earliest.

This will make the matches balanced (by both sides being "early") even though
the bisection will be less balanced. Still, this case only apply if the *best*
and middle-most match was fully unbalanced on the A side. Each recursion will
thus even in this worst case reduce the problem significantly and we are not
re-introducing the problem that was fixed in f1ca249696ed.

The bundle size for 4.0 (hg bundle --base null -r 4.0 x.hg) happens to go from
22806817 to 22807275 bytes - a 0.002% increase.

This make the recent test-bdiff.py changes give a more pretty output ... but
they no longer show that the recursion is around middle matches (because it in
these cases isn't).

Patch

diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c
--- a/mercurial/bdiff.c
+++ b/mercurial/bdiff.c
@@ -188,7 +188,7 @@  static int longest_match(struct bdiff_li
 					/* same match but closer to half */
 					mi = i;
 					mj = j;
-				} else if (i == mi && mj > bhalf) {
+				} else if (i == mi && (mj > bhalf || i == a1)) {
 					/* same i but best earlier j */
 					mj = j;
 				}
diff --git a/tests/test-annotate.t b/tests/test-annotate.t
--- a/tests/test-annotate.t
+++ b/tests/test-annotate.t
@@ -91,8 +91,8 @@  annotate (JSON)
 annotate -n b
 
   $ hg annotate -n b
+  0: a
   1: a
-  0: a
   1: a
   3: b4
   3: b5
@@ -111,8 +111,8 @@  annotate --no-follow b
 annotate -nl b
 
   $ hg annotate -nl b
-  1:1: a
   0:1: a
+  1:2: a
   1:3: a
   3:4: b4
   3:5: b5
@@ -121,8 +121,8 @@  annotate -nl b
 annotate -nf b
 
   $ hg annotate -nf b
+  0 a: a
   1 a: a
-  0 a: a
   1 a: a
   3 b: b4
   3 b: b5
@@ -131,8 +131,8 @@  annotate -nf b
 annotate -nlf b
 
   $ hg annotate -nlf b
-  1 a:1: a
   0 a:1: a
+  1 a:2: a
   1 a:3: a
   3 b:4: b4
   3 b:5: b5
@@ -156,8 +156,8 @@  annotate -nlf b
 annotate after merge
 
   $ hg annotate -nf b
+  0 a: a
   1 a: a
-  0 a: a
   1 a: a
   3 b: b4
   4 b: c
@@ -166,8 +166,8 @@  annotate after merge
 annotate after merge with -l
 
   $ hg annotate -nlf b
-  1 a:1: a
   0 a:1: a
+  1 a:2: a
   1 a:3: a
   3 b:4: b4
   4 b:5: c
@@ -198,7 +198,7 @@  annotate after merge with -l
 annotate after rename merge
 
   $ hg annotate -nf b
-  1 a: a
+  0 a: a
   6 b: z
   1 a: a
   3 b: b4
@@ -209,7 +209,7 @@  annotate after rename merge
 annotate after rename merge with -l
 
   $ hg annotate -nlf b
-  1 a:1: a
+  0 a:1: a
   6 b:2: z
   1 a:3: a
   3 b:4: b4
@@ -226,7 +226,7 @@  Issue2807: alignment of line numbers wit
   $ echo more >> b
   $ hg ci -mmore -d '7 0'
   $ hg annotate -nlf b
-   1 a: 1: a
+   0 a: 1: a
    6 b: 2: z
    1 a: 3: a
    3 b: 4: b4
@@ -240,15 +240,15 @@  Issue2807: alignment of line numbers wit
 linkrev vs rev
 
   $ hg annotate -r tip -n a
+  0: a
   1: a
-  0: a
   1: a
 
 linkrev vs rev with -l
 
   $ hg annotate -r tip -nl a
-  1:1: a
   0:1: a
+  1:2: a
   1:3: a
 
 Issue589: "undelete" sequence leads to crash
diff --git a/tests/test-bdiff.py b/tests/test-bdiff.py
--- a/tests/test-bdiff.py
+++ b/tests/test-bdiff.py
@@ -84,9 +84,9 @@  showdiff(
     ''.join('<%s\n-\n' % i for i in range(5)),
     ''.join('>%s\n-\n' % i for i in range(5)))
 
-print("Diff 1 to 3 lines - preference for balanced recursion:")
+print("Diff 1 to 3 lines - preference for appending:")
 showdiff('a\n', 'a\n' * 3)
-print("Diff 1 to 5 lines - preference for balanced recursion:")
+print("Diff 1 to 5 lines - preference for appending:")
 showdiff('a\n', 'a\n' * 5)
 print("Diff 3 to 1 lines - preference for balanced recursion:")
 showdiff('a\n' * 3, 'a\n')
diff --git a/tests/test-bdiff.py.out b/tests/test-bdiff.py.out
--- a/tests/test-bdiff.py.out
+++ b/tests/test-bdiff.py.out
@@ -56,20 +56,18 @@  showdiff(
  '-\n'
  20 23 '<4\n' -> '>4\n'
  '-\n'
-Diff 1 to 3 lines - preference for balanced recursion:
+Diff 1 to 3 lines - preference for appending:
 showdiff(
   'a\n',
   'a\na\na\n'):
- 0 0 '' -> 'a\n'
  'a\n'
- 2 2 '' -> 'a\n'
-Diff 1 to 5 lines - preference for balanced recursion:
+ 2 2 '' -> 'a\na\n'
+Diff 1 to 5 lines - preference for appending:
 showdiff(
   'a\n',
   'a\na\na\na\na\n'):
- 0 0 '' -> 'a\na\n'
  'a\n'
- 2 2 '' -> 'a\na\n'
+ 2 2 '' -> 'a\na\na\na\n'
 Diff 3 to 1 lines - preference for balanced recursion:
 showdiff(
   'a\na\na\n',
diff --git a/tests/test-commit-amend.t b/tests/test-commit-amend.t
--- a/tests/test-commit-amend.t
+++ b/tests/test-commit-amend.t
@@ -47,8 +47,8 @@  Amending changeset with changes in worki
   --- a/a	Thu Jan 01 00:00:00 1970 +0000
   +++ b/a	Thu Jan 01 00:00:00 1970 +0000
   @@ -1,1 +1,3 @@
+   a
   +a
-   a
   +a
   $ hg log
   changeset:   1:43f1ba15f28a
@@ -122,13 +122,13 @@  No changes, just a different message:
   uncompressed size of bundle content:
        254 (changelog)
        163 (manifests)
-       141  a
+       129  a
   saved backup bundle to $TESTTMP/.hg/strip-backup/74609c7f506e-1bfde511-amend-backup.hg (glob)
   1 changesets found
   uncompressed size of bundle content:
        250 (changelog)
        163 (manifests)
-       141  a
+       129  a
   adding branch
   adding changesets
   adding manifests
@@ -140,8 +140,8 @@  No changes, just a different message:
   --- a/a	Thu Jan 01 00:00:00 1970 +0000
   +++ b/a	Thu Jan 01 00:00:00 1970 +0000
   @@ -1,1 +1,3 @@
+   a
   +a
-   a
   +a
   $ hg log
   changeset:   1:1cd866679df8
@@ -266,13 +266,13 @@  then, test editing custom commit message
   uncompressed size of bundle content:
        249 (changelog)
        163 (manifests)
-       143  a
+       131  a
   saved backup bundle to $TESTTMP/.hg/strip-backup/5f357c7560ab-e7c84ade-amend-backup.hg (glob)
   1 changesets found
   uncompressed size of bundle content:
        257 (changelog)
        163 (manifests)
-       143  a
+       131  a
   adding branch
   adding changesets
   adding manifests
@@ -309,13 +309,13 @@  Same, but with changes in working dir (d
   uncompressed size of bundle content:
        464 (changelog)
        322 (manifests)
-       261  a
+       249  a
   saved backup bundle to $TESTTMP/.hg/strip-backup/7ab3bf440b54-8e3b5088-amend-backup.hg (glob)
   1 changesets found
   uncompressed size of bundle content:
        257 (changelog)
        163 (manifests)
-       145  a
+       133  a
   adding branch
   adding changesets
   adding manifests
diff --git a/tests/test-mq-qfold.t b/tests/test-mq-qfold.t
--- a/tests/test-mq-qfold.t
+++ b/tests/test-mq-qfold.t
@@ -51,8 +51,8 @@  specified)
   --- a/a
   +++ b/a
   @@ -1,1 +1,3 @@
+   a
   +a
-   a
   +b
 
 Fold with local changes:
@@ -67,8 +67,8 @@  Fold with local changes:
   --- a/a
   +++ b/a
   @@ -1,1 +1,3 @@
+   a
   +a
-   a
   +b
 
   $ hg revert -a --no-backup