Patchwork [1,of,6,py3,v2] similar: sort workingfilectx instances by .path()

Submitter Augie Fackler
Date March 23, 2017, 3:11 p.m.
Message ID <1f45b1946c0cb34f84f9.1490281894@arthedain.pit.corp.google.com>
Permalink /patch/19606/
State Accepted

Comments

Augie Fackler - March 23, 2017, 3:11 p.m.
# HG changeset patch
# User Augie Fackler <augie@google.com>
# Date 1490280492 14400
#      Thu Mar 23 10:48:12 2017 -0400
# Node ID 1f45b1946c0cb34f84f93f8a4931ab5ea5114b1a
# Parent  66c3ae6d886cae0e3a3cff6a0058e2d2a866fd9d
similar: sort workingfilectx instances by .path()

We'd have to define rich comparison operators for workingfilectx in
Python 3, but in this case we can just get away with sorting the
filenames. Since we're cleaning this up anyway, also avoid sorting
both sets twice. We still sort addedfiles twice because we mutate it
in the exact-matches block, and removals from a set are cheaper than
removals from a sorted list.
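
To illustrate the Python 3 behavior described above, here is a minimal
sketch. FakeFileCtx is a hypothetical stand-in for workingfilectx that
models only path(); it is not Mercurial's class. It assumes the real
objects, like this one, define no ordering operators.

```python
class FakeFileCtx:
    """Hypothetical stand-in for workingfilectx; only path() is modelled."""

    def __init__(self, path):
        self._path = path

    def path(self):
        return self._path


def pathsorted(s):
    # Sort by the filename instead of comparing the objects themselves.
    return sorted(s, key=lambda x: x.path())


added = {FakeFileCtx("b.txt"), FakeFileCtx("a.txt"), FakeFileCtx("c.txt")}

# On Python 3, objects without rich comparison operators cannot be ordered,
# so a plain sorted() raises TypeError.
try:
    sorted(added)
    plain_sort_failed = False
except TypeError:
    plain_sort_failed = True

# Sorting with a key function only compares the path strings.
sorted_paths = [f.path() for f in pathsorted(added)]
```

The key function avoids adding __lt__ and friends to the class just to
support this one call site.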

Note that the code works if we don't sort the set before passing it
along to the other functions, but the result can be unpredictable
depending on the set iteration order, which we'd like to avoid.
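
The pattern of keeping a mutable set and re-sorting it after removals
can be sketched as follows. Plain strings stand in for the file
contexts here, and the names are illustrative assumptions rather than
the code in similar.py.

```python
# Hypothetical stand-ins: plain path strings instead of workingfilectx objects.
addedfiles = {"d.txt", "b.txt", "a.txt", "c.txt"}

# First pass: hand a deterministic, sorted snapshot to the consumer.
first_pass = sorted(addedfiles)

# Removing from the set is cheap; the snapshot taken above is unaffected.
addedfiles.remove("b.txt")

# Second pass: sort again so the consumer sees the mutated contents,
# still in a predictable order regardless of set iteration order.
second_pass = sorted(addedfiles)
```

Sorting at each hand-off keeps the output stable while the set remains
the single mutable source of truth.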

Patch

diff --git a/mercurial/similar.py b/mercurial/similar.py
--- a/mercurial/similar.py
+++ b/mercurial/similar.py
@@ -106,14 +106,17 @@  def findrenames(repo, added, removed, th
     removedfiles = set([parentctx[fp] for fp in removed
             if fp in parentctx and parentctx[fp].size() > 0])
 
+    pathsorted = lambda s: sorted(s, key=lambda x: x.path())
+    sremovedfiles = pathsorted(removedfiles)
+
     # Find exact matches.
     for (a, b) in _findexactmatches(repo,
-            sorted(addedfiles), sorted(removedfiles)):
+            pathsorted(addedfiles), sremovedfiles):
         addedfiles.remove(b)
         yield (a.path(), b.path(), 1.0)
 
     # If the user requested similar files to be matched, search for them also.
     if threshold < 1.0:
         for (a, b, score) in _findsimilarmatches(repo,
-                sorted(addedfiles), sorted(removedfiles), threshold):
+                pathsorted(addedfiles), sremovedfiles, threshold):
             yield (a.path(), b.path(), score)