Patchwork [v7] patch: add within-line color diff capacity

login
register
mail settings
Submitter matthieu.laneuville@octobus.net
Date Dec. 7, 2017, 9:26 a.m.
Message ID <6a7378611408a4bb77ad.1512638799@carbon>
Download mbox | patch
Permalink /patch/25990/
State Accepted
Headers show

Comments

matthieu.laneuville@octobus.net - Dec. 7, 2017, 9:26 a.m.
# HG changeset patch
# User Matthieu Laneuville <matthieu.laneuville@octobus.net>
# Date 1508944418 -32400
#      Thu Oct 26 00:13:38 2017 +0900
# Node ID 6a7378611408a4bb77ad2aca8c8d0ccfe697ea26
# Parent  fcc96cf0983d01c542a9b5e529b434b98941371d
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 6a7378611408
# EXP-Topic inline-diff
patch: add within-line color diff capacity

The `diff' command usually writes deletion in red and insertions in green. This
patch adds within-line colors, to highlight which part of the lines differ.
Lines to compare are decided based on their similarity ratio, as computed by
difflib SequenceMatcher, with an arbitrary threshold (0.7) to decide at which
point two lines are considered entirely different (therefore no inline-diff
required).

The current implementation is kept behind an experimental flag in order to test
the effect on performance. In order to activate it, set inline-color-diff to
true in [experimental].
Yuya Nishihara - Dec. 7, 2017, 1:07 p.m.
On Thu, 07 Dec 2017 18:26:39 +0900, matthieu.laneuville@octobus.net wrote:
> # HG changeset patch
> # User Matthieu Laneuville <matthieu.laneuville@octobus.net>
> # Date 1508944418 -32400
> #      Thu Oct 26 00:13:38 2017 +0900
> # Node ID 6a7378611408a4bb77ad2aca8c8d0ccfe697ea26
> # Parent  fcc96cf0983d01c542a9b5e529b434b98941371d
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 6a7378611408
> # EXP-Topic inline-diff
> patch: add within-line color diff capacity

Queued, thanks.

There are some nits and minor bugs. Can you send follow-up patches?

>  def difflabel(func, *args, **kw):
>      '''yields 2-tuples of (output, label) based on the output of func()'''
> +    inlinecolor = False
> +    if 'opts' in kw.keys():

Dropped '.keys()'.

> +        inlinecolor = kw['opts'].worddiff
>      headprefixes = [('diff', 'diff.diffline'),
>                      ('copy', 'diff.extended'),
>                      ('rename', 'diff.extended'),
> @@ -2479,6 +2484,9 @@ def difflabel(func, *args, **kw):
>      head = False
>      for chunk in func(*args, **kw):
>          lines = chunk.split('\n')
> +        matches = {}
> +        if inlinecolor:
> +            matches = _findmatches(lines)
>          for i, line in enumerate(lines):
>              if i != 0:
>                  yield ('\n', '')
> @@ -2506,7 +2514,14 @@ def difflabel(func, *args, **kw):
>                              if '\t' == token[0]:
>                                  yield (token, 'diff.tab')
>                              else:
> -                                yield (token, label)
> +                                if i in matches:
> +                                    for l, t in _inlinediff(

Nit: I prefer (token, label) order for consistency.

> +                                                  lines[i].rstrip(),
> +                                                  lines[matches[i]].rstrip(),
> +                                                  label):
> +                                        yield (t, l)

This doesn't work if '\t' is in stripline. See 'hg log -p Makefile' as an
example. Perhaps we'll have to move 'diff.tab' handling to _inlinediff().

> +def _inlinediff(s1, s2, operation):
> +    '''Perform string diff to highlight specific changes.'''
> +    operation_skip = '+?' if operation == 'diff.deleted' else '-?'
> +    if operation == 'diff.deleted':
> +        s2, s1 = s1, s2
> +
> +    buff = []
> +    # we never want to higlight the leading +-
> +    if operation == 'diff.deleted' and s2.startswith('-'):
> +        label = operation
> +        token = '-'
> +        s2 = s2[1:]
> +        s1 = s1[1:]
> +    elif operation == 'diff.inserted' and s1.startswith('+'):
> +        label = operation
> +        token = '+'
> +        s2 = s2[1:]
> +        s1 = s1[1:]

else:
    raise ProgrammingError?

Otherwise label and token would be unassigned.

> +    s = difflib.ndiff(re.split(br'(\W)', s2), re.split(br'(\W)', s1))
> +    for part in s:
> +        if part[0] in operation_skip:
> +            continue
> +        l = operation + '.highlight'
> +        if part[0] in ' ':
> +            l = operation
> +        if l == label: # contiguous token with same label
> +            token += part[2:]
> +            continue
> +        else:
> +            buff.append((label, token))
> +            label = l
> +            token = part[2:]
> +    buff.append((label, token))
Yuya Nishihara - Dec. 7, 2017, 1:44 p.m.
On Thu, 7 Dec 2017 22:07:30 +0900, Yuya Nishihara wrote:
> On Thu, 07 Dec 2017 18:26:39 +0900, matthieu.laneuville@octobus.net wrote:
> > # HG changeset patch
> > # User Matthieu Laneuville <matthieu.laneuville@octobus.net>
> > # Date 1508944418 -32400
> > #      Thu Oct 26 00:13:38 2017 +0900
> > # Node ID 6a7378611408a4bb77ad2aca8c8d0ccfe697ea26
> > # Parent  fcc96cf0983d01c542a9b5e529b434b98941371d
> > # Available At https://bitbucket.org/octobus/mercurial-devel/
> > #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 6a7378611408
> > # EXP-Topic inline-diff
> > patch: add within-line color diff capacity
> 
> Queued, thanks.
> 
> There are some nits and minor bugs. Can you send follow-up patches?
> 
> >  def difflabel(func, *args, **kw):
> >      '''yields 2-tuples of (output, label) based on the output of func()'''
> > +    inlinecolor = False
> > +    if 'opts' in kw.keys():
> 
> Dropped '.keys()'.

Actually changed to kw.gets('opts') as it may be None.
Next time, please make sure all tests pass.

Patch

diff -r fcc96cf0983d -r 6a7378611408 mercurial/cmdutil.py
--- a/mercurial/cmdutil.py	Fri Dec 01 14:17:20 2017 +0800
+++ b/mercurial/cmdutil.py	Thu Oct 26 00:13:38 2017 +0900
@@ -1518,7 +1518,7 @@  def diffordiffstat(ui, repo, diffopts, n
         width = 80
         if not ui.plain():
             width = ui.termwidth()
-        chunks = patch.diff(repo, node1, node2, match, changes, diffopts,
+        chunks = patch.diff(repo, node1, node2, match, changes, opts=diffopts,
                             prefix=prefix, relroot=relroot,
                             hunksfilterfn=hunksfilterfn)
         for chunk, label in patch.diffstatui(util.iterlines(chunks),
@@ -1526,7 +1526,7 @@  def diffordiffstat(ui, repo, diffopts, n
             write(chunk, label=label)
     else:
         for chunk, label in patch.diffui(repo, node1, node2, match,
-                                         changes, diffopts, prefix=prefix,
+                                         changes, opts=diffopts, prefix=prefix,
                                          relroot=relroot,
                                          hunksfilterfn=hunksfilterfn):
             write(chunk, label=label)
diff -r fcc96cf0983d -r 6a7378611408 mercurial/color.py
--- a/mercurial/color.py	Fri Dec 01 14:17:20 2017 +0800
+++ b/mercurial/color.py	Thu Oct 26 00:13:38 2017 +0900
@@ -87,12 +87,14 @@  except ImportError:
     'branches.inactive': 'none',
     'diff.changed': 'white',
     'diff.deleted': 'red',
+    'diff.deleted.highlight': 'red bold underline',
     'diff.diffline': 'bold',
     'diff.extended': 'cyan bold',
     'diff.file_a': 'red bold',
     'diff.file_b': 'green bold',
     'diff.hunk': 'magenta',
     'diff.inserted': 'green',
+    'diff.inserted.highlight': 'green bold underline',
     'diff.tab': '',
     'diff.trailingwhitespace': 'bold red_background',
     'changeset.public': '',
diff -r fcc96cf0983d -r 6a7378611408 mercurial/configitems.py
--- a/mercurial/configitems.py	Fri Dec 01 14:17:20 2017 +0800
+++ b/mercurial/configitems.py	Thu Oct 26 00:13:38 2017 +0900
@@ -481,6 +481,9 @@  coreconfigitem('experimental', 'evolutio
 coreconfigitem('experimental', 'evolution.track-operation',
     default=True,
 )
+coreconfigitem('experimental', 'worddiff',
+    default=False,
+)
 coreconfigitem('experimental', 'maxdeltachainspan',
     default=-1,
 )
diff -r fcc96cf0983d -r 6a7378611408 mercurial/mdiff.py
--- a/mercurial/mdiff.py	Fri Dec 01 14:17:20 2017 +0800
+++ b/mercurial/mdiff.py	Thu Oct 26 00:13:38 2017 +0900
@@ -67,6 +67,7 @@  class diffopts(object):
         'ignoreblanklines': False,
         'upgrade': False,
         'showsimilarity': False,
+        'worddiff': False,
         }
 
     def __init__(self, **opts):
diff -r fcc96cf0983d -r 6a7378611408 mercurial/patch.py
--- a/mercurial/patch.py	Fri Dec 01 14:17:20 2017 +0800
+++ b/mercurial/patch.py	Thu Oct 26 00:13:38 2017 +0900
@@ -10,6 +10,7 @@  from __future__ import absolute_import, 
 
 import collections
 import copy
+import difflib
 import email
 import errno
 import hashlib
@@ -2252,6 +2253,7 @@  def difffeatureopts(ui, opts=None, untru
         'showfunc': get('show_function', 'showfunc'),
         'context': get('unified', getter=ui.config),
     }
+    buildopts['worddiff'] = ui.configbool('experimental', 'worddiff')
 
     if git:
         buildopts['git'] = get('git')
@@ -2463,6 +2465,9 @@  def diffhunks(repo, node1=None, node2=No
 
 def difflabel(func, *args, **kw):
     '''yields 2-tuples of (output, label) based on the output of func()'''
+    inlinecolor = False
+    if 'opts' in kw.keys():
+        inlinecolor = kw['opts'].worddiff
     headprefixes = [('diff', 'diff.diffline'),
                     ('copy', 'diff.extended'),
                     ('rename', 'diff.extended'),
@@ -2479,6 +2484,9 @@  def difflabel(func, *args, **kw):
     head = False
     for chunk in func(*args, **kw):
         lines = chunk.split('\n')
+        matches = {}
+        if inlinecolor:
+            matches = _findmatches(lines)
         for i, line in enumerate(lines):
             if i != 0:
                 yield ('\n', '')
@@ -2506,7 +2514,14 @@  def difflabel(func, *args, **kw):
                             if '\t' == token[0]:
                                 yield (token, 'diff.tab')
                             else:
-                                yield (token, label)
+                                if i in matches:
+                                    for l, t in _inlinediff(
+                                                  lines[i].rstrip(),
+                                                  lines[matches[i]].rstrip(),
+                                                  label):
+                                        yield (t, l)
+                                else:
+                                    yield (token, label)
                     else:
                         yield (stripline, label)
                     break
@@ -2515,6 +2530,70 @@  def difflabel(func, *args, **kw):
             if line != stripline:
                 yield (line[len(stripline):], 'diff.trailingwhitespace')
 
+def _findmatches(slist):
+    '''Look for insertion matches to deletion and returns a dict of
+       correspondances.
+    '''
+    lastmatch = 0
+    matches = {}
+    for i, line in enumerate(slist):
+        if line == '':
+            continue
+        if line[0] == '-':
+            lastmatch = max(lastmatch, i)
+            newgroup = False
+            for j, newline in enumerate(slist[lastmatch + 1:]):
+                if newline == '':
+                    continue
+                if newline[0] == '-' and newgroup: # too far, no match
+                    break
+                if newline[0] == '+': # potential match
+                    newgroup = True
+                    sim = difflib.SequenceMatcher(None, line, newline).ratio()
+                    if sim > 0.7:
+                        lastmatch = lastmatch + 1 + j
+                        matches[i] = lastmatch
+                        matches[lastmatch] = i
+                        break
+    return matches
+
+def _inlinediff(s1, s2, operation):
+    '''Perform string diff to highlight specific changes.'''
+    operation_skip = '+?' if operation == 'diff.deleted' else '-?'
+    if operation == 'diff.deleted':
+        s2, s1 = s1, s2
+
+    buff = []
+    # we never want to higlight the leading +-
+    if operation == 'diff.deleted' and s2.startswith('-'):
+        label = operation
+        token = '-'
+        s2 = s2[1:]
+        s1 = s1[1:]
+    elif operation == 'diff.inserted' and s1.startswith('+'):
+        label = operation
+        token = '+'
+        s2 = s2[1:]
+        s1 = s1[1:]
+
+    s = difflib.ndiff(re.split(br'(\W)', s2), re.split(br'(\W)', s1))
+    for part in s:
+        if part[0] in operation_skip:
+            continue
+        l = operation + '.highlight'
+        if part[0] in ' ':
+            l = operation
+        if l == label: # contiguous token with same label
+            token += part[2:]
+            continue
+        else:
+            buff.append((label, token))
+            label = l
+            token = part[2:]
+    buff.append((label, token))
+
+    return buff
+
 def diffui(*args, **kw):
     '''like diff(), but yields 2-tuples of (output, label) for ui.write()'''
     return difflabel(diff, *args, **kw)
diff -r fcc96cf0983d -r 6a7378611408 tests/test-diff-color.t
--- a/tests/test-diff-color.t	Fri Dec 01 14:17:20 2017 +0800
+++ b/tests/test-diff-color.t	Thu Oct 26 00:13:38 2017 +0900
@@ -259,3 +259,95 @@  test tabs
   \x1b[0;32m+\x1b[0m\x1b[0;1;35m	\x1b[0m\x1b[0;32mall\x1b[0m\x1b[0;1;35m		\x1b[0m\x1b[0;32mtabs\x1b[0m\x1b[0;1;41m	\x1b[0m (esc)
 
   $ cd ..
+
+test inline color diff
+
+  $ hg init inline
+  $ cd inline
+  $ cat > file1 << EOF
+  > this is the first line
+  > this is the second line
+  >     third line starts with space
+  > + starts with a plus sign
+  > 
+  > this line won't change
+  > 
+  > two lines are going to
+  > be changed into three!
+  > 
+  > three of those lines will
+  > collapse onto one
+  > (to see if it works)
+  > EOF
+  $ hg add file1
+  $ hg ci -m 'commit'
+  $ cat > file1 << EOF
+  > that is the first paragraph
+  >     this is the second line
+  > third line starts with space
+  > - starts with a minus sign
+  > 
+  > this line won't change
+  > 
+  > two lines are going to
+  > (entirely magically,
+  >  assuming this works)
+  > be changed into four!
+  > 
+  > three of those lines have
+  > collapsed onto one
+  > EOF
+  $ hg diff --config experimental.worddiff=False --color=debug
+  [diff.diffline|diff --git a/file1 b/file1]
+  [diff.file_a|--- a/file1]
+  [diff.file_b|+++ b/file1]
+  [diff.hunk|@@ -1,13 +1,14 @@]
+  [diff.deleted|-this is the first line]
+  [diff.deleted|-this is the second line]
+  [diff.deleted|-    third line starts with space]
+  [diff.deleted|-+ starts with a plus sign]
+  [diff.inserted|+that is the first paragraph]
+  [diff.inserted|+    this is the second line]
+  [diff.inserted|+third line starts with space]
+  [diff.inserted|+- starts with a minus sign]
+   
+   this line won't change
+   
+   two lines are going to
+  [diff.deleted|-be changed into three!]
+  [diff.inserted|+(entirely magically,]
+  [diff.inserted|+ assuming this works)]
+  [diff.inserted|+be changed into four!]
+   
+  [diff.deleted|-three of those lines will]
+  [diff.deleted|-collapse onto one]
+  [diff.deleted|-(to see if it works)]
+  [diff.inserted|+three of those lines have]
+  [diff.inserted|+collapsed onto one]
+  $ hg diff --config experimental.worddiff=True --color=debug
+  [diff.diffline|diff --git a/file1 b/file1]
+  [diff.file_a|--- a/file1]
+  [diff.file_b|+++ b/file1]
+  [diff.hunk|@@ -1,13 +1,14 @@]
+  [diff.deleted|-this is the ][diff.deleted.highlight|first][diff.deleted| line]
+  [diff.deleted|-this is the second line]
+  [diff.deleted|-][diff.deleted.highlight|    ][diff.deleted|third line starts with space]
+  [diff.deleted|-][diff.deleted.highlight|+][diff.deleted| starts with a ][diff.deleted.highlight|plus][diff.deleted| sign]
+  [diff.inserted|+that is the first paragraph]
+  [diff.inserted|+][diff.inserted.highlight|    ][diff.inserted|this is the ][diff.inserted.highlight|second][diff.inserted| line]
+  [diff.inserted|+third line starts with space]
+  [diff.inserted|+][diff.inserted.highlight|-][diff.inserted| starts with a ][diff.inserted.highlight|minus][diff.inserted| sign]
+   
+   this line won't change
+   
+   two lines are going to
+  [diff.deleted|-be changed into ][diff.deleted.highlight|three][diff.deleted|!]
+  [diff.inserted|+(entirely magically,]
+  [diff.inserted|+ assuming this works)]
+  [diff.inserted|+be changed into ][diff.inserted.highlight|four][diff.inserted|!]
+   
+  [diff.deleted|-three of those lines ][diff.deleted.highlight|will]
+  [diff.deleted|-][diff.deleted.highlight|collapse][diff.deleted| onto one]
+  [diff.deleted|-(to see if it works)]
+  [diff.inserted|+three of those lines ][diff.inserted.highlight|have]
+  [diff.inserted|+][diff.inserted.highlight|collapsed][diff.inserted| onto one]