Patchwork [10,of,11,V2] filemerge: use 'util.ellipsis' to trim custom conflict markers correctly

login
register
mail settings
Submitter Katsunori FUJIWARA
Date July 5, 2014, 6 p.m.
Message ID <21eb3ac84075d72de5ad.1404583209@feefifofum>
Download mbox | patch
Permalink /patch/5114/
State Accepted
Commit 78e56e70c70a870f996c835e4b195b1263e3237e
Headers show

Comments

Katsunori FUJIWARA - July 5, 2014, 6 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1404583001 -32400
#      Sun Jul 06 02:56:41 2014 +0900
# Node ID 21eb3ac84075d72de5adc097f8fe4bfe1f66c1b8
# Parent  23280cda5ee6b9ee8840d8f487e36d29efb7c95c
filemerge: use 'util.ellipsis' to trim custom conflict markers correctly

Before this patch, filemerge slices byte sequence directly to trim
conflict markers, but this may cause:

  - splitting at intermediate multi-byte sequence

  - incorrect calculation of column width (length of byte sequence is
    different from columns in display in many cases)

This patch uses 'util.ellipsis' to trim custom conflict markers
correctly, even if multi-byte characters are used in them.
Siddharth Agarwal - July 12, 2014, 8:22 a.m.
On 07/05/2014 11:00 AM, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1404583001 -32400
> #      Sun Jul 06 02:56:41 2014 +0900
> # Node ID 21eb3ac84075d72de5adc097f8fe4bfe1f66c1b8
> # Parent  23280cda5ee6b9ee8840d8f487e36d29efb7c95c
> filemerge: use 'util.ellipsis' to trim custom conflict markers correctly

I've queued the first 10 of these to the clowncopter, thanks.

I'm going to leave patch 11 to someone more knowledgeable about the 
tradeoffs involved.


>
> Before this patch, filemerge slices byte sequence directly to trim
> conflict markers, but this may cause:
>
>    - splitting at intermediate multi-byte sequence
>
>    - incorrect calculation of column width (length of byte sequence is
>      different from columns in display in many cases)
>
> This patch uses 'util.ellipsis' to trim custom conflict markers
> correctly, even if multi-byte characters are used in them.
>
> diff --git a/mercurial/filemerge.py b/mercurial/filemerge.py
> --- a/mercurial/filemerge.py
> +++ b/mercurial/filemerge.py
> @@ -290,12 +290,8 @@
>       if mark:
>           mark = mark.splitlines()[0] # split for safety
>   
> -    # The <<< marks add 8 to the length, and '...' adds three, so max
> -    # length of the actual marker is 69.
> -    maxlength = 80 - 8 - 3
> -    if len(mark) > maxlength:
> -        mark = mark[:maxlength] + '...'
> -    return mark
> +    # 8 for the prefix of conflict marker lines (e.g. '<<<<<<< ')
> +    return util.ellipsis(mark, 80 - 8)
>   
>   _defaultconflictmarker = ('{node|short} ' +
>       '{ifeq(tags, "tip", "", "{tags} ")}' +
> diff --git a/tests/test-conflict.t b/tests/test-conflict.t
> --- a/tests/test-conflict.t
> +++ b/tests/test-conflict.t
> @@ -72,9 +72,42 @@
>     something
>     >>>>>>> other: test 1
>   
> +Verify line trimming of custom conflict marker using multi-byte characters
> +
> +  $ hg up -q --clean .
> +  $ python <<EOF
> +  > fp = open('logfile', 'w')
> +  > fp.write('12345678901234567890123456789012345678901234567890' +
> +  >          '1234567890') # there are 5 more columns for 80 columns
> +  >
> +  > # 2 x 4 = 8 columns, but 3 x 4 = 12 bytes
> +  > fp.write(u'\u3042\u3044\u3046\u3048'.encode('utf-8'))
> +  >
> +  > fp.close()
> +  > EOF
> +  $ hg add logfile
> +  $ hg --encoding utf-8 commit --logfile logfile
> +
> +  $ cat >> .hg/hgrc <<EOF
> +  > [ui]
> +  > mergemarkertemplate={desc|firstline}
> +  > EOF
> +
> +  $ hg -q --encoding utf-8 merge 1
> +  warning: conflicts during merge.
> +  merging a incomplete! (edit conflicts, then use 'hg resolve --mark')
> +  [1]
> +
> +  $ cat a
> +  <<<<<<< local: 123456789012345678901234567890123456789012345678901234567890\xe3\x81\x82... (esc)
> +  something else
> +  =======
> +  something
> +  >>>>>>> other: branch1
> +
>   Verify basic conflict markers
>   
> -  $ hg up -q --clean .
> +  $ hg up -q --clean 2
>     $ printf "\n[ui]\nmergemarkers=basic\n" >> .hg/hgrc
>   
>     $ hg merge 1
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Katsunori FUJIWARA - July 12, 2014, 9:41 a.m.
At Sat, 12 Jul 2014 01:22:20 -0700,
Siddharth Agarwal wrote:
> 
> On 07/05/2014 11:00 AM, FUJIWARA Katsunori wrote:
> > # HG changeset patch
> > # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> > # Date 1404583001 -32400
> > #      Sun Jul 06 02:56:41 2014 +0900
> > # Node ID 21eb3ac84075d72de5adc097f8fe4bfe1f66c1b8
> > # Parent  23280cda5ee6b9ee8840d8f487e36d29efb7c95c
> > filemerge: use 'util.ellipsis' to trim custom conflict markers correctly
> 
> I've queued the first 10 of these to the clowncopter, thanks.
> 
> I'm going to leave patch 11 to someone more knowledgeable about the 
> tradeoffs involved.

I hope that patch #11 (or another better solution for safety) will be
also queued, because encoding problems occur suddenly and easily.

In fact, I know that even the original author of "win32mbcs" extension
once fell into this "detailed conflict marker" trap :-)


> > Before this patch, filemerge slices byte sequence directly to trim
> > conflict markers, but this may cause:
> >
> >    - splitting at intermediate multi-byte sequence
> >
> >    - incorrect calculation of column width (length of byte sequence is
> >      different from columns in display in many cases)
> >
> > This patch uses 'util.ellipsis' to trim custom conflict markers
> > correctly, even if multi-byte characters are used in them.
> >
> > diff --git a/mercurial/filemerge.py b/mercurial/filemerge.py
> > --- a/mercurial/filemerge.py
> > +++ b/mercurial/filemerge.py
> > @@ -290,12 +290,8 @@
> >       if mark:
> >           mark = mark.splitlines()[0] # split for safety
> >   
> > -    # The <<< marks add 8 to the length, and '...' adds three, so max
> > -    # length of the actual marker is 69.
> > -    maxlength = 80 - 8 - 3
> > -    if len(mark) > maxlength:
> > -        mark = mark[:maxlength] + '...'
> > -    return mark
> > +    # 8 for the prefix of conflict marker lines (e.g. '<<<<<<< ')
> > +    return util.ellipsis(mark, 80 - 8)
> >   
> >   _defaultconflictmarker = ('{node|short} ' +
> >       '{ifeq(tags, "tip", "", "{tags} ")}' +
> > diff --git a/tests/test-conflict.t b/tests/test-conflict.t
> > --- a/tests/test-conflict.t
> > +++ b/tests/test-conflict.t
> > @@ -72,9 +72,42 @@
> >     something
> >     >>>>>>> other: test 1
> >   
> > +Verify line trimming of custom conflict marker using multi-byte characters
> > +
> > +  $ hg up -q --clean .
> > +  $ python <<EOF
> > +  > fp = open('logfile', 'w')
> > +  > fp.write('12345678901234567890123456789012345678901234567890' +
> > +  >          '1234567890') # there are 5 more columns for 80 columns
> > +  >
> > +  > # 2 x 4 = 8 columns, but 3 x 4 = 12 bytes
> > +  > fp.write(u'\u3042\u3044\u3046\u3048'.encode('utf-8'))
> > +  >
> > +  > fp.close()
> > +  > EOF
> > +  $ hg add logfile
> > +  $ hg --encoding utf-8 commit --logfile logfile
> > +
> > +  $ cat >> .hg/hgrc <<EOF
> > +  > [ui]
> > +  > mergemarkertemplate={desc|firstline}
> > +  > EOF
> > +
> > +  $ hg -q --encoding utf-8 merge 1
> > +  warning: conflicts during merge.
> > +  merging a incomplete! (edit conflicts, then use 'hg resolve --mark')
> > +  [1]
> > +
> > +  $ cat a
> > +  <<<<<<< local: 123456789012345678901234567890123456789012345678901234567890\xe3\x81\x82... (esc)
> > +  something else
> > +  =======
> > +  something
> > +  >>>>>>> other: branch1
> > +
> >   Verify basic conflict markers
> >   
> > -  $ hg up -q --clean .
> > +  $ hg up -q --clean 2
> >     $ printf "\n[ui]\nmergemarkers=basic\n" >> .hg/hgrc
> >   
> >     $ hg merge 1
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel@selenic.com
> > http://selenic.com/mailman/listinfo/mercurial-devel
> 
> 

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy@lares.dti.ne.jp

Patch

diff --git a/mercurial/filemerge.py b/mercurial/filemerge.py
--- a/mercurial/filemerge.py
+++ b/mercurial/filemerge.py
@@ -290,12 +290,8 @@ 
     if mark:
         mark = mark.splitlines()[0] # split for safety
 
-    # The <<< marks add 8 to the length, and '...' adds three, so max
-    # length of the actual marker is 69.
-    maxlength = 80 - 8 - 3
-    if len(mark) > maxlength:
-        mark = mark[:maxlength] + '...'
-    return mark
+    # 8 for the prefix of conflict marker lines (e.g. '<<<<<<< ')
+    return util.ellipsis(mark, 80 - 8)
 
 _defaultconflictmarker = ('{node|short} ' +
     '{ifeq(tags, "tip", "", "{tags} ")}' +
diff --git a/tests/test-conflict.t b/tests/test-conflict.t
--- a/tests/test-conflict.t
+++ b/tests/test-conflict.t
@@ -72,9 +72,42 @@ 
   something
   >>>>>>> other: test 1
 
+Verify line trimming of custom conflict marker using multi-byte characters
+
+  $ hg up -q --clean .
+  $ python <<EOF
+  > fp = open('logfile', 'w')
+  > fp.write('12345678901234567890123456789012345678901234567890' +
+  >          '1234567890') # there are 5 more columns for 80 columns
+  > 
+  > # 2 x 4 = 8 columns, but 3 x 4 = 12 bytes
+  > fp.write(u'\u3042\u3044\u3046\u3048'.encode('utf-8'))
+  > 
+  > fp.close()
+  > EOF
+  $ hg add logfile
+  $ hg --encoding utf-8 commit --logfile logfile
+
+  $ cat >> .hg/hgrc <<EOF
+  > [ui]
+  > mergemarkertemplate={desc|firstline}
+  > EOF
+
+  $ hg -q --encoding utf-8 merge 1
+  warning: conflicts during merge.
+  merging a incomplete! (edit conflicts, then use 'hg resolve --mark')
+  [1]
+
+  $ cat a
+  <<<<<<< local: 123456789012345678901234567890123456789012345678901234567890\xe3\x81\x82... (esc)
+  something else
+  =======
+  something
+  >>>>>>> other: branch1
+
 Verify basic conflict markers
 
-  $ hg up -q --clean .
+  $ hg up -q --clean 2
   $ printf "\n[ui]\nmergemarkers=basic\n" >> .hg/hgrc
 
   $ hg merge 1