Patchwork [2,of,2,STABLE,V3] extdiff: quote only options specified at runtime (issue4463)

login
register
mail settings
Submitter Katsunori FUJIWARA
Date Dec. 5, 2014, 12:14 p.m.
Message ID <1bae72e4e9fc1ad7abda.1417781677@juju>
Download mbox | patch
Permalink /patch/7011/
State Superseded
Headers show

Comments

Katsunori FUJIWARA - Dec. 5, 2014, 12:14 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1417781468 -32400
#      Fri Dec 05 21:11:08 2014 +0900
# Branch stable
# Node ID 1bae72e4e9fc1ad7abdadc848adb9b97d75beddc
# Parent  73b764f8fdf2c33ad56795ef2513d9687bd14734
extdiff: quote only options specified at runtime (issue4463)

For options including space characters, changeset 72a89cf86fcd
introduced fully quoting on all options for external diff command.

But this causes unexpected behavior of extdiff with WinMerge, because
WinMerge can't work correctly, when command line options in Windows
standard style are quoted: for example, 'WinMerge /r ....' is OK, but
'WinMerge "/r" ....' is NG.

"contrib/mergetools.hgrc" file also specifies some options in Windows
standard style for WinMerge.

See also https://bitbucket.org/tortoisehg/thg/issue/3978/ for detail
about this problem.

On the other hand, without such quoting in extdiff, users can't
specify options including space characters.

The root cause of this issue is that "shlex.split + util.shellquote"
combination loses whether users really want to quote each options or
not, even though these can be quoted arbitrarily in configuration
files.

To resolve this problem, this patch introduces mixed quoting policy:

  - for options specified at runtime by "-o" of extdiff

    These should be quoted in extdiff, because quoting in the command
    line is stripped by the command shell. Without quoting in extdiff,
    this stripping breaks options including space characters.

    This patch explicitly quotes them at "extdiff" and "mydiff",
    instead of quoting in "dodiff".

  - for options specified in configuration files

    To keep quoting in configuration files, this patch uses
    "util.shellsplit" instead of "shlex.split" (or avoid "shlex.split"
    invocation).

BTW, this patch may save some complicated external diff configurations
like below:

  - configuration using command separations and redirections

    "shlex.split" doesn't suppose them, and quoting by
    "util.shellquote" prevents such separators from working correctly.

    For example, "shlex.split" splits "echo foo;echo bar" into
    ["echo", "foo;echo", "bar"].

  - configuration using environment variable containing space characters

    Environment variables are expanded before splitting command lines
    into each components.

    Without quoting, such environment variable containing space
    characters is treated as multiple options. But forcible quoting
    like 72a89cf86fcd prevents it from being treated as multiple
    options.

    For example, when CONCAT="foo bar baz':

      - mydiff $CONCAT   => mydiff foo bar baz   (taking 3 arguments)
      - mydiff "$CONCAT" => mydiff "foo bar baz" (taking only 1 argument)

    This should be decided not by Mercurial implementation but by
    users.
Mads Kiilerich - Dec. 5, 2014, 2:57 p.m.
On 12/05/2014 01:14 PM, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1417781468 -32400
> #      Fri Dec 05 21:11:08 2014 +0900
> # Branch stable
> # Node ID 1bae72e4e9fc1ad7abdadc848adb9b97d75beddc
> # Parent  73b764f8fdf2c33ad56795ef2513d9687bd14734
> extdiff: quote only options specified at runtime (issue4463)

Instead, couldn't we just let shellquote (on all platforms) just return 
the string itself if it is "safe"? That would also give more clean 
output when inspecting what actually is being run.

/Mads
Yuya Nishihara - Dec. 6, 2014, 9:21 a.m.
On Fri, 05 Dec 2014 15:57:59 +0100, Mads Kiilerich wrote:
> On 12/05/2014 01:14 PM, FUJIWARA Katsunori wrote:
> > # HG changeset patch
> > # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> > # Date 1417781468 -32400
> > #      Fri Dec 05 21:11:08 2014 +0900
> > # Branch stable
> > # Node ID 1bae72e4e9fc1ad7abdadc848adb9b97d75beddc
> > # Parent  73b764f8fdf2c33ad56795ef2513d9687bd14734
> > extdiff: quote only options specified at runtime (issue4463)
> 
> Instead, couldn't we just let shellquote (on all platforms) just return
> the string itself if it is "safe"? That would also give more clean
> output when inspecting what actually is being run.

If I understand it, the problem begins when extdiff parses shell command
by shlex.split().  shlex.split() discards the information how each argument
is quoted, but it is necessary to process environment variables properly,
for example.

filemerge and shell alias pass the given command string as-is, so only extdiff
behaves differently.

For example,

  % cat .hgrc
  [ui]
  merge = echo

  [extdiff]
  echo =

  [merge-tools]
  echo.args = --opt $OPT '$LIT' $local $base $other
  echo.diffargs = --opt $OPT '$LIT' $parent $child

  % hg merge
  # runs '/bin/echo' --opt $OPT '$LIT' 'foo' 'foo~base' 'foo~other'

  % hg-3.2/hg echo -r0
  # runs 'echo' --opt $OPT $LIT '.../foo' 'foo'

  % hg-3.2.1/hg echo -r0
  # runs 'echo' '--opt' '$OPT' '$LIT' ''.../foo'' ''foo''

Oh, I see another issue. extdiff of hg 3.2.1 can't handle file name with
white spaces.

Maybe extdiff.dodiff() shouldn't accept diffcmd and diffopts separately
so that the caller can avoid unnecessary shlex.split().

Regards,

Patch

diff --git a/hgext/extdiff.py b/hgext/extdiff.py
--- a/hgext/extdiff.py
+++ b/hgext/extdiff.py
@@ -64,7 +64,7 @@  pretty fast (at least faster than having
 from mercurial.i18n import _
 from mercurial.node import short, nullid
 from mercurial import cmdutil, scmutil, util, commands, encoding
-import os, shlex, shutil, tempfile, re
+import os, shutil, tempfile, re
 
 cmdtable = {}
 command = cmdutil.command(cmdtable)
@@ -121,7 +121,7 @@  def dodiff(ui, repo, diffcmd, diffopts, 
 
     revs = opts.get('rev')
     change = opts.get('change')
-    args = ' '.join(map(util.shellquote, diffopts))
+    args = ' '.join(diffopts)
     do3way = '$parent2' in args
 
     if revs and change:
@@ -268,7 +268,9 @@  def extdiff(ui, repo, *pats, **opts):
     revisions are specified, the working directory files are compared
     to its parent.'''
     program = opts.get('program')
-    option = opts.get('option')
+    # they should be quoted explicitly, because quotations
+    # for them are already stripped by shell
+    option = map(util.shellquote, opts['option'])
     if not program:
         program = 'diff'
         option = option or ['-Npru']
@@ -280,26 +282,28 @@  def uisetup(ui):
             cmd = cmd[4:]
             if not path:
                 path = cmd
-            diffopts = shlex.split(ui.config('extdiff', 'opts.' + cmd, ''))
+            diffopts = ui.config('extdiff', 'opts.' + cmd, '')
         elif cmd.startswith('opts.'):
             continue
         else:
             # command = path opts
             if path:
-                diffopts = shlex.split(path)
-                path = diffopts.pop(0)
+                path, diffopts = util.shellsplit(path, all=False)
             else:
-                path, diffopts = cmd, []
+                path, diffopts = cmd, ''
         # look for diff arguments in [diff-tools] then [merge-tools]
-        if diffopts == []:
-            args = ui.config('diff-tools', cmd+'.diffargs') or \
+        if not diffopts:
+            diffopts = ui.config('diff-tools', cmd+'.diffargs') or \
                    ui.config('merge-tools', cmd+'.diffargs')
-            if args:
-                diffopts = shlex.split(args)
         def save(cmd, path, diffopts):
             '''use closure to save diff command to use'''
             def mydiff(ui, repo, *pats, **opts):
-                return dodiff(ui, repo, path, diffopts + opts['option'],
+                # they should be quoted explicitly, because quotations
+                # for them are already stripped by shell
+                runtimeopts = map(util.shellquote, opts['option'])
+                if diffopts:
+                    runtimeopts.insert(0, diffopts)
+                return dodiff(ui, repo, path, runtimeopts,
                               pats, opts)
             doc = _('''\
 use %(path)s to diff repository (or selected files)