Patchwork [3,of,3] diff: add a `--tool` flag to use external diffing tools

login
register
mail settings
Submitter Pulkit Goyal
Date Sept. 2, 2020, 4:02 p.m.
Message ID <9a4f8b4d94c42b49f956.1599062559@workspace>
Download mbox | patch
Permalink /patch/47087/
State Accepted
Headers show

Comments

Pulkit Goyal - Sept. 2, 2020, 4:02 p.m.
# HG changeset patch
# User Pulkit Goyal <7895pulkit@gmail.com>
# Date 1598625380 -19800
#      Fri Aug 28 20:06:20 2020 +0530
# Node ID 9a4f8b4d94c42b49f956886a9222b1d88a78580e
# Parent  f41847eb060d7807c6dad5276c9f13e461ce14db
# EXP-Topic extdiff-refactor
diff: add a `--tool` flag to use external diffing tools

Diffing using an external tool is a very important use case and used a lot.
Currently in mercurial the only way to do so is to use the extdiff extension.
This approach is not extendable as extdiff kind of rewrites all the diff logic
and plug in it's specific things. Also that can be re-used in other commands.

This patch introduces a `--tool` flag to diff command which we be added by
extdiff extension. It wraps the core diff code and is extensible.

Also having `diff --tool` is better UI than `extdiff`, something which I like in
git.

In future, I will like to refactor existing extdiff logic to use `diff --tool`
logic. But before that, I am more interested in adding external tool
functionality to all places where we show a diff, for example: export, log etc.

Differential Revision: https://phab.mercurial-scm.org/D8972
Yuya Nishihara - Sept. 3, 2020, 12:18 p.m.
On Wed, 02 Sep 2020 21:32:39 +0530, Pulkit Goyal wrote:
> # HG changeset patch
> # User Pulkit Goyal <7895pulkit@gmail.com>
> # Date 1598625380 -19800
> #      Fri Aug 28 20:06:20 2020 +0530
> # Node ID 9a4f8b4d94c42b49f956886a9222b1d88a78580e
> # Parent  f41847eb060d7807c6dad5276c9f13e461ce14db
> # EXP-Topic extdiff-refactor
> diff: add a `--tool` flag to use external diffing tools

> +    extensions.wrapfunction(patch, b'diffhunks', _extdiffhunks)
> +    extensions.wrapfunction(patch, b'diffcontent', _extdiffcontent)

I'm pretty sure they are wrong functions to hook. Extdiff is fundamentally
different in that we won't do nothing about "diff"-ing internally, but
patch.*() are low-level routines to generate/parse unified diff.

Instead, I think "diff --tool" will have to be dispatched at
commands/(log)?cmdutil layer.

> +# stores temporary roots where content will be written for external tools
> +_temproots = {}

Using global to pass intermediate data around functions,

> +        # return value of patch.diffhunks is then iterated over in patch.diff
> +        # which then returns something which is read on more top level callers
> +        # We return an empty list to trick rest of the callers that there is
> +        # no diff to be processed
> +        return []

and returning fake data instead,

> +    # just return the name of paths as we won't need any other information
> +    # in extdiffhunks()
> +    return path1, path2, None, None

and returning tuple of different type?

They smell bad, and made me think that we're doing it wrong.

Patch

diff --git a/hgext/extdiff.py b/hgext/extdiff.py
--- a/hgext/extdiff.py
+++ b/hgext/extdiff.py
@@ -97,10 +97,13 @@  from mercurial.node import (
 from mercurial import (
     archival,
     cmdutil,
+    commands,
     encoding,
     error,
+    extensions,
     filemerge,
     formatter,
+    patch,
     pycompat,
     registrar,
     scmutil,
@@ -118,6 +121,10 @@  configtable = {}
 configitem = registrar.configitem(configtable)
 
 configitem(
+    b'extdiff', br'^(?!(opts|gui)).*$', default=None, generic=True,
+)
+
+configitem(
     b'extdiff', br'opts\..*', default=b'', generic=True,
 )
 
@@ -781,3 +788,190 @@  def uisetup(ui):
 
 # tell hggettext to extract docstrings from these functions:
 i18nfunctions = [savedcmd]
+
+# stores temporary roots where content will be written for external tools
+_temproots = {}
+
+
+def _gettemproot(repo, node, tmproot):
+    global _temproots
+
+    if node not in _temproots:
+        dirname = os.path.basename(repo.root)
+        if dirname == b"":
+            dirname = b"root"
+        if node is not None:
+            dirname = b'%s.%s' % (dirname, short(node))
+            base = os.path.join(tmproot, dirname)
+        else:
+            base = repo.root
+        _temproots[node] = base
+        return base
+
+    return _temproots[node]
+
+
+def _extdiffhunks(
+    orig,
+    repo,
+    ctx1,
+    ctx2,
+    match=None,
+    changes=None,
+    opts=None,
+    losedatafn=None,
+    pathfn=None,
+    copy=None,
+    copysourcematch=None,
+):
+    """ Wraps patch.diffhunks to show diff using external diff tools.
+
+    Does following things in order:
+      * Checks if we are diffing externally or not, if not call orig()
+      * Creates temporary directories where temporary files will be written
+        for external tools
+      * Calls orig(), we are wrapping `patch.diffcontent()` to write content
+        of both diff sides to files instead of producing diffs
+      * Gets the difftool to call from config and build the command
+        which needs to be run
+      * Once all diff sides are written to temp files (if required), runs
+        difftool for each file
+      * Deletes the temporary directory created
+    """
+    if opts is None or opts.tool is None:
+        # mdiffopts does not have tool set, means
+        # we are not diffing externally
+        return orig(
+            repo,
+            ctx1,
+            ctx2,
+            match,
+            changes,
+            opts,
+            losedatafn,
+            pathfn,
+            copy,
+            copysourcematch,
+        )
+
+    # create the base paths for each changesets
+    tmproot = pycompat.mkdtemp(prefix=b'extdiff.')
+    try:
+        # creates the required temporary folders where content
+        # can be written for external tools to read
+        node1 = ctx1.node()
+        node2 = ctx2.node()
+        root1 = _gettemproot(repo, node1, tmproot)
+        root2 = _gettemproot(repo, node2, tmproot)
+        if node1 is not None:
+            os.makedirs(root1)
+        if node2 is not None:
+            os.makedirs(root2)
+
+        # get the list of files which have changed
+        changes = []
+        for c in orig(
+            repo,
+            ctx1,
+            ctx2,
+            match,
+            changes,
+            opts,
+            losedatafn,
+            pathfn,
+            copy,
+            copysourcematch,
+        ):
+            changes.append(c[0])
+
+        # get the command to run
+        # TODO: tool can be configured using cmd.<tool>, we should check for
+        # that too
+        path = repo.ui.config(b'extdiff', opts.tool) or b''
+        cmdline, isgui = _gettooldetails(repo.ui, opts.tool, path)[2:]
+
+        # run the tool for each file
+        _runperfilediff(
+            cmdline,
+            repo.root,
+            repo.ui,
+            isgui,
+            False,
+            False,
+            changes,
+            tmproot,
+            root1,
+            None,
+            root2,
+            node1 if node1 else b'',
+            None,
+            node2 if node2 else b'',
+        )
+
+        # return value of patch.diffhunks is then iterated over in patch.diff
+        # which then returns something which is read on more top level callers
+        # We return an empty list to trick rest of the callers that there is
+        # no diff to be processed
+        return []
+    finally:
+        repo.ui.note(_(b'cleaning up temp directory\n'))
+        shutil.rmtree(tmproot)
+
+
+def _extdiffcontent(orig, data1, data2, header, binary, opts):
+    """ Wraps patch.diffcontent to write file contents to temporary files
+    instead of calling mdiff to produce diffs.
+
+    This is done only when we are using external tools to diff
+    """
+    if opts.tool is None:
+        # not diffing externally, go back to original way
+        return orig(data1, data2, header, binary, opts)
+
+    ctx1, path1, content1 = data1[0], data1[2], data1[4]
+    ctx2, path2, content2 = data2[0], data2[2], data2[4]
+
+    # Write content to temporary files instead of calling mdiff
+    # If node is None, means we need to diff with working directory, hence
+    # no need to write the file
+    # If content is empty, we can skip writing the file and _runperfilediff()
+    # will use /dev/null as the file is missing
+    for node, content, path in (
+        (ctx1.node(), content1, path1),
+        (ctx2.node(), content2, path2),
+    ):
+        if node is not None and content:
+            dirpath = _gettemproot(None, node, None)
+            fpath = os.path.join(dirpath, path)
+            dirfpath = os.path.dirname(fpath)
+            if not os.path.exists(dirfpath):
+                os.makedirs(dirfpath)
+
+            with open(fpath, 'wb') as fp:
+                fp.write(content)
+
+    # just return the name of paths as we won't need any other information
+    # in extdiffhunks()
+    return path1, path2, None, None
+
+
+def _diff(orig, ui, repo, *pats, **opts):
+    overrides = {}
+    if opts.get('tool'):
+        # stat cannot be shown using an external tool
+        cmdutil.check_at_most_one_arg(opts, 'tool', 'stat')
+        # if we will be diffing using external tool, turn off the pager
+        overrides[(b'ui', b'paginate')] = False
+
+    with ui.configoverride(overrides, b'extdiff'):
+        orig(ui, repo, *pats, **opts)
+
+
+def extsetup(ui):
+    diffentry = extensions.wrapcommand(commands.table, b'diff', _diff)
+    diffentry[1].append(
+        (b'', b'tool', '', _(b'show diff using external tool'),)
+    )
+
+    extensions.wrapfunction(patch, b'diffhunks', _extdiffhunks)
+    extensions.wrapfunction(patch, b'diffcontent', _extdiffcontent)
diff --git a/mercurial/configitems.py b/mercurial/configitems.py
--- a/mercurial/configitems.py
+++ b/mercurial/configitems.py
@@ -136,6 +136,9 @@  def _registerdiffopts(section, configpre
         section, configprefix + b'nodates', default=False,
     )
     coreconfigitem(
+        section, configprefix + b'tool', default=None,
+    )
+    coreconfigitem(
         section, configprefix + b'showfunc', default=False,
     )
     coreconfigitem(
diff --git a/mercurial/diffutil.py b/mercurial/diffutil.py
--- a/mercurial/diffutil.py
+++ b/mercurial/diffutil.py
@@ -77,6 +77,7 @@  def difffeatureopts(
         b'context': get(b'unified', getter=ui.config),
     }
     buildopts[b'xdiff'] = ui.configbool(b'experimental', b'xdiff')
+    buildopts[b'tool'] = get(b'tool')
 
     if git:
         buildopts[b'git'] = get(b'git')
diff --git a/mercurial/logcmdutil.py b/mercurial/logcmdutil.py
--- a/mercurial/logcmdutil.py
+++ b/mercurial/logcmdutil.py
@@ -91,6 +91,10 @@  def diffordiffstat(
         relroot = b''
     copysourcematch = None
 
+    if stat:
+        # explicitly set external tooling to None if we are processing stat
+        diffopts.tool = None
+
     def compose(f, g):
         return lambda x: f(g(x))
 
diff --git a/mercurial/mdiff.py b/mercurial/mdiff.py
--- a/mercurial/mdiff.py
+++ b/mercurial/mdiff.py
@@ -40,6 +40,8 @@  splitnewlines = bdiff.splitnewlines
 # TODO: this looks like it could be an attrs, which might help pytype
 class diffopts(object):
     '''context is the number of context lines
+    tool represents which external tool will be used for diff
+         None if no external tool is used
     text treats all files as text
     showfunc enables diff -p output
     git enables the git extended patch format
@@ -56,6 +58,7 @@  class diffopts(object):
 
     defaults = {
         b'context': 3,
+        b'tool': None,
         b'text': False,
         b'showfunc': False,
         b'git': False,
diff --git a/tests/test-extdiff.t b/tests/test-extdiff.t
--- a/tests/test-extdiff.t
+++ b/tests/test-extdiff.t
@@ -543,3 +543,54 @@  Test that diffing a single file works, e
   diffing "*\\a" "*\\a" (glob) (windows !)
   diffing */a */a (glob) (no-windows !)
   [1]
+
+
+Test that `diff --tool` is present and it works
+
+  $ hg help diff | grep 'tool'
+      --tool VALUE          show diff using external tool
+
+  $ hg diff -c . --tool
+  hg diff: option --tool requires argument
+  hg diff [OPTION]... ([-c REV] | [-r REV1 [-r REV2]]) [FILE]...
+  
+  diff repository (or selected files)
+  
+  options ([+] can be repeated):
+  
+   -r --rev REV [+]         revision
+   -c --change REV          change made by revision
+   -a --text                treat all files as text
+   -g --git                 use git extended diff format
+      --binary              generate binary diffs in git mode (default)
+      --nodates             omit dates from diff headers
+      --noprefix            omit a/ and b/ prefixes from filenames
+   -p --show-function       show which function each change is in
+      --reverse             produce a diff that undoes the changes
+   -w --ignore-all-space    ignore white space when comparing lines
+   -b --ignore-space-change ignore changes in the amount of white space
+   -B --ignore-blank-lines  ignore changes whose lines are all blank
+   -Z --ignore-space-at-eol ignore changes in whitespace at EOL
+   -U --unified NUM         number of lines of context to show
+      --stat                output diffstat-style summary of changes
+      --root DIR            produce diffs relative to subdirectory
+   -I --include PATTERN [+] include names matching the given patterns
+   -X --exclude PATTERN [+] exclude names matching the given patterns
+   -S --subrepos            recurse into subrepositories
+      --tool VALUE          show diff using external tool
+  
+  (use 'hg diff -h' to show more help)
+  [255]
+
+  $ hg diff -c . --tool fancyDiffTool
+  /bin/sh: 1: fancyDiffTool: not found (no-chg !)
+  sh: 1: fancyDiffTool: not found (chg !)
+
+  $ hg diff -c . --tool 4463b2
+  b2-naked single quoted double quoted */extdiff.*/testsinglefile.cb9a9f314b8b/a */extdiff.*/testsinglefile.2c635a92f535/a (glob)
+
+  $ hg diff --tool odd -c . --debug | grep "^running"
+  running "/bin/echo --foo='a,cZ\x92\xf55w,\xf8\xc1nfp\x8c\t\xf5Tq\xbe)' 'a,cZ\x92\xf55w,\xf8\xc1nfp\x8c\t\xf5Tq\xbe)' --bar='a,cZ\x92\xf55w,\xf8\xc1nfp\x8c\t\xf5Tq\xbe)' 'a,cZ\x92\xf55w,\xf8\xc1nfp\x8c\t\xf5Tq\xbe)'" in */extdiff.* (glob)
+
+  $ hg diff --debug --tool kdiff3 -c . | grep "^running"
+  running "echo --L1 'a\xcb\x9a\x9f1K\x8b\x07\xbaq\x01/\xcd\xbcTKZM\x82\xff[' --L2 'a,cZ\x92\xf55w,\xf8\xc1nfp\x8c\t\xf5Tq\xbe)' */extdiff.*/testsinglefile.cb9a9f314b8b/a */extdiff.*/testsinglefile.2c635a92f535/a" in */extdiff.* (glob)