Patchwork [6,of,6] convert: add support to find git copies from all files in the working copy

Submitter Siddharth Agarwal
Date Sept. 12, 2014, 7:48 p.m.
Message ID <7d8e4019d1bb23c30276.1410551317@devbig136.prn2.facebook.com>
Permalink /patch/5809/
State Superseded
Commit cc5f94db672bd8ec3cb0b61236ec4b774b4c67ee

Comments

Siddharth Agarwal - Sept. 12, 2014, 7:48 p.m.
# HG changeset patch
# User Siddharth Agarwal <sid0@fb.com>
# Date 1410550110 25200
#      Fri Sep 12 12:28:30 2014 -0700
# Node ID 7d8e4019d1bb23c302764026ff0232afce6685b2
# Parent  fd79e986ea948d2eae9f532afa309094fe91398e
convert: add support to find git copies from all files in the working copy

I couldn't think of a better name for this option, so I stole the Git one in
the hope that anyone converting a Git repo knows what it means.
Pierre-Yves David - Sept. 18, 2014, 12:36 a.m.
On 09/12/2014 12:48 PM, Siddharth Agarwal wrote:
> # HG changeset patch
> # User Siddharth Agarwal <sid0@fb.com>
> # Date 1410550110 25200
> #      Fri Sep 12 12:28:30 2014 -0700
> # Node ID 7d8e4019d1bb23c302764026ff0232afce6685b2
> # Parent  fd79e986ea948d2eae9f532afa309094fe91398e
> convert: add support to find git copies from all files in the working copy

Series looks good to me and is pushed to the clowncopter.

What about setting the default for rename detection to the same as git
default? (even if git is inconsistent from one command to another)

--
Pierre-Yves David
Augie Fackler - Sept. 23, 2014, 6:13 p.m.
On Wed, Sep 17, 2014 at 05:36:24PM -0700, Pierre-Yves David wrote:
>
>
> On 09/12/2014 12:48 PM, Siddharth Agarwal wrote:
> ># HG changeset patch
> ># User Siddharth Agarwal <sid0@fb.com>
> ># Date 1410550110 25200
> >#      Fri Sep 12 12:28:30 2014 -0700
> ># Node ID 7d8e4019d1bb23c302764026ff0232afce6685b2
> ># Parent  fd79e986ea948d2eae9f532afa309094fe91398e
> >convert: add support to find git copies from all files in the working copy
>
> Series looks good to me and is pushed to the clowncopter.
>
> What about setting the default for rename detection to the same as git
> default? (even if git is inconsistent from one command to another)

Doing rename detection by default during an import from git seems like
a totally reasonable thing to do. I say do it.

> --
> Pierre-Yves David

Patch

diff --git a/hgext/convert/__init__.py b/hgext/convert/__init__.py
--- a/hgext/convert/__init__.py
+++ b/hgext/convert/__init__.py
@@ -300,6 +300,11 @@ 
         be imported as a rename if more than 90% of the file hasn't
         changed. The default is ``0``.
 
+    :convert.git.findcopiesharder: while detecting copies, look at all
+        files in the working copy instead of just changed ones. This
+        is very expensive for large projects, and is only effective when
+        ``convert.git.similarity`` is greater than 0. The default is False.
+
     Perforce Source
     ###############
 
diff --git a/hgext/convert/git.py b/hgext/convert/git.py
--- a/hgext/convert/git.py
+++ b/hgext/convert/git.py
@@ -102,6 +102,10 @@ 
             raise util.Abort(_('similarity must be between 0 and 100'))
         if similarity > 0:
             self.simopt = '--find-copies=%d%%' % similarity
+            findcopiesharder = ui.configbool('convert', 'git.findcopiesharder',
+                                             False)
+            if findcopiesharder:
+                self.simopt += ' --find-copies-harder'
         else:
             self.simopt = ''
 
diff --git a/tests/test-convert-git.t b/tests/test-convert-git.t
--- a/tests/test-convert-git.t
+++ b/tests/test-convert-git.t
@@ -277,6 +277,18 @@ 
     foo
   R foo
 
+  $ cd git-repo2
+  $ cp bar bar-copied2
+  $ git add bar-copied2
+  $ commit -a -m 'copy with no changes'
+  $ cd ..
+
+  $ hg -q convert --config convert.git.similarity=100 \
+  > --config convert.git.findcopiesharder=1 --datesort git-repo2 fullrepo
+  $ hg -R fullrepo status -C --change master
+  A bar-copied2
+    bar
+
 test binary conversion (issue1359)
 
   $ count=19
diff --git a/tests/test-convert.t b/tests/test-convert.t
--- a/tests/test-convert.t
+++ b/tests/test-convert.t
@@ -253,6 +253,12 @@ 
                     "90" means that a delete/add pair will be imported as a
                     rename if more than 90% of the file hasn't changed. The
                     default is "0".
+      convert.git.findcopiesharder
+                    while detecting copies, look at all files in the working
+                    copy instead of just changed ones. This is very expensive
+                    for large projects, and is only effective when
+                    "convert.git.similarity" is greater than 0. The default is
+                    False.
   
       Perforce Source
       ###############
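
For readers who want to see how the two settings interact, the hunk in hgext/convert/git.py can be sketched as a standalone function. This is only an illustration: `build_simopt` is a hypothetical name, and `ValueError` stands in for the `util.Abort` raised in the patch. It shows that `convert.git.findcopiesharder` has no effect unless `convert.git.similarity` is greater than 0.

```python
def build_simopt(similarity, findcopiesharder=False):
    """Return the extra copy-detection options as a single string.

    Hypothetical helper mirroring the patched logic; the real code
    stores the result on self.simopt and aborts via util.Abort.
    """
    if similarity < 0 or similarity > 100:
        raise ValueError('similarity must be between 0 and 100')
    if similarity > 0:
        # Copy detection is enabled; optionally widen the search to
        # every file rather than only the changed ones.
        simopt = '--find-copies=%d%%' % similarity
        if findcopiesharder:
            simopt += ' --find-copies-harder'
    else:
        # Similarity of 0 disables detection, so the "harder" flag
        # is ignored entirely.
        simopt = ''
    return simopt


if __name__ == '__main__':
    print(build_simopt(100, findcopiesharder=True))
    print(repr(build_simopt(0, findcopiesharder=True)))
```

With a similarity of 100 and the flag set, the sketch yields both options; with a similarity of 0 it yields an empty string regardless of the flag, which matches the note in the help text.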