Patchwork [01,of,24] copies: make the loss in _backwardcopies more stable

login
register
mail settings
Submitter Mads Kiilerich
Date Dec. 16, 2012, 10:33 p.m.
Message ID <36ef75411e38e3cc4b60.1355697236@localhost6.localdomain6>
Download mbox | patch
Permalink /patch/125/
State Accepted
Commit 2330d97e7707c9b70b10ab6e3e000821ac098041
Headers show

Comments

Mads Kiilerich - Dec. 16, 2012, 10:33 p.m.
# HG changeset patch
# User Mads Kiilerich <mads at kiilerich.com>
# Date 1355687455 -3600
# Node ID 36ef75411e38e3cc4b60198ec20750b87b0a7545
# Parent  8c9a52492d426741ab24392d49f44a1d4f23613e
copies: make the loss in _backwardcopies more stable

A couple of tests shows slightly more correct output. That is pure coincidence.
Kevin Bullock - Dec. 17, 2012, 4:57 a.m.
On 16 Dec 2012, at 4:33 PM, Mads Kiilerich wrote:

> # HG changeset patch
> # User Mads Kiilerich <mads at kiilerich.com>
> # Date 1355687455 -3600
> # Node ID 36ef75411e38e3cc4b60198ec20750b87b0a7545
> # Parent  8c9a52492d426741ab24392d49f44a1d4f23613e
> copies: make the loss in _backwardcopies more stable
> 
> A couple of tests shows slightly more correct output. That is pure coincidence.
> 
> diff --git a/mercurial/copies.py b/mercurial/copies.py
> --- a/mercurial/copies.py
> +++ b/mercurial/copies.py
> @@ -150,7 +150,7 @@
>     # in particular, we find renames better than copies
>     f = _forwardcopies(b, a)
>     r = {}
> -    for k, v in f.iteritems():
> +    for k, v in sorted(f.iteritems()):

Right out of the gate here, we've got some potentially visible performance-impacting changes in this series. Can we see some numbers at least on the patches (like this one) that have this possibility?

pacem in terris / ??? / ?????? / ????????? / ??
Kevin R. Bullock
Mads Kiilerich - Dec. 18, 2012, 2:17 a.m.
Kevin Bullock wrote, On 12/17/2012 05:57 AM:
> On 16 Dec 2012, at 4:33 PM, Mads Kiilerich wrote:
>
>> # HG changeset patch
>> # User Mads Kiilerich <mads at kiilerich.com>
>> # Date 1355687455 -3600
>> # Node ID 36ef75411e38e3cc4b60198ec20750b87b0a7545
>> # Parent  8c9a52492d426741ab24392d49f44a1d4f23613e
>> copies: make the loss in _backwardcopies more stable
>>
>> A couple of tests shows slightly more correct output. That is pure coincidence.
>>
>> diff --git a/mercurial/copies.py b/mercurial/copies.py
>> --- a/mercurial/copies.py
>> +++ b/mercurial/copies.py
>> @@ -150,7 +150,7 @@
>>      # in particular, we find renames better than copies
>>      f = _forwardcopies(b, a)
>>      r = {}
>> -    for k, v in f.iteritems():
>> +    for k, v in sorted(f.iteritems()):
> Right out of the gate here, we've got some potentially visible performance-impacting changes in this series. Can we see some numbers at least on the patches (like this one) that have this possibility?

Yes, I share your concern. But what kind of numbers would you like to see?

Adding a sort will obviously not make it faster. (Except for the cases 
where a consistent disk access pattern happens to give better utilization.)

The cost will of course depend on the size of the sorted list, but such 
lists are usually already both sorted and iterated with non-trivial 
constant overhead. Scaling to huge amounts of files would need a general 
solution to that anyway.

And in this case the patch introduces a sort of the list of copied files 
between two revisions when making a backward diff/status. A codepath 
that rarely is hit. It might be possible to make a test case and come up 
with some numbers, but I can't imagine how they could be used and which 
conclusion could be made.

/Mads

Patch

diff --git a/mercurial/copies.py b/mercurial/copies.py
--- a/mercurial/copies.py
+++ b/mercurial/copies.py
@@ -150,7 +150,7 @@ 
     # in particular, we find renames better than copies
     f = _forwardcopies(b, a)
     r = {}
-    for k, v in f.iteritems():
+    for k, v in sorted(f.iteritems()):
         r[v] = k
     return r
 
diff --git a/tests/test-git-export.t b/tests/test-git-export.t
--- a/tests/test-git-export.t
+++ b/tests/test-git-export.t
@@ -337,12 +337,12 @@ 
 Reversed:
 
   $ hg diff --git -r -1 -r -2
-  diff --git a/brand-new3 b/brand-new2
-  rename from brand-new3
+  diff --git a/brand-new3-2 b/brand-new2
+  rename from brand-new3-2
   rename to brand-new2
-  diff --git a/brand-new3-2 b/brand-new3-2
+  diff --git a/brand-new3 b/brand-new3
   deleted file mode 100644
-  --- a/brand-new3-2
+  --- a/brand-new3
   +++ /dev/null
   @@ -1,1 +0,0 @@
   -
diff --git a/tests/test-mv-cp-st-diff.t b/tests/test-mv-cp-st-diff.t
--- a/tests/test-mv-cp-st-diff.t
+++ b/tests/test-mv-cp-st-diff.t
@@ -1072,7 +1072,7 @@ 
   
   % hg st -C --rev . --rev 0
   M a
-    b
+    c
   R b
   R c
   
@@ -1148,7 +1148,7 @@ 
   
   % hg st -C --rev . --rev 2
   M a
-    b
+    c
   A x/y
   R b
   R c