Patchwork [2,of,7] merge: use no-minimal for premerge too

Submitter Pierre-Yves David
Date Aug. 5, 2014, 12:28 a.m.
Message ID <8917721c8a9310b77b25.1407198488@marginatus.alto.octopoid.net>
Permalink /patch/5258/
State Accepted
Commit 2ea6d906cf9b7b7338594bff33dffb7d6a43384f

Comments

Pierre-Yves David - Aug. 5, 2014, 12:28 a.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1406660101 25200
#      Tue Jul 29 11:55:01 2014 -0700
# Node ID 8917721c8a9310b77b257b7ce920d682f6ea09b5
# Parent  43d2c747287871d320bf740ff279991fb9971c5e
merge: use no-minimal for premerge too

ecc1387138ba disabled minimal for `internal:merge` but forgot to also disable
it for premerge. This is now done.

This gives me an occasion to shamelessly include my explanation of why this
minimisation feature must disappear:


Detailed explanation
Matt Mackall - Aug. 5, 2014, 9:28 p.m.
On Mon, 2014-08-04 at 17:28 -0700, pierre-yves.david@ens-lyon.org wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1406660101 25200
> #      Tue Jul 29 11:55:01 2014 -0700
> # Node ID 8917721c8a9310b77b257b7ce920d682f6ea09b5
> # Parent  43d2c747287871d320bf740ff279991fb9971c5e
> merge: use no-minimal for premerge too

First two are queued for default, thanks.

Patch

=====================


The ``simplemerge`` code used in ``internal:merge`` has a feature called
"minimization". It reprocesses conflicting chunks to find common changes
inside them and excludes such common sections from the markers.

This approach seems a significant win at first glance but produces very
confusing results in some other cases.
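
As a rough illustration of the idea (not the actual ``simplemerge``
implementation), the sketch below uses Python's ``difflib`` to split one
conflicting region into common and conflicting pieces. The ``minimize`` helper
and its return format are hypothetical names chosen for this example.

```python
import difflib


def minimize(local, other):
    """Split one conflict region into (kind, local_lines, other_lines) pieces.

    Hypothetical helper: difflib stands in for the matching code that
    simplemerge really uses.
    """
    # autojunk=False stops difflib from ignoring very frequent lines.
    matcher = difflib.SequenceMatcher(None, local, other, autojunk=False)
    pieces = []
    i = j = 0
    for block in matcher.get_matching_blocks():
        # Anything skipped on either side before this match still conflicts.
        if block.a > i or block.b > j:
            pieces.append(("conflict", local[i:block.a], other[j:block.b]))
        # The matching lines themselves are pulled out of the markers.
        if block.size:
            common = local[block.a:block.a + block.size]
            pieces.append(("same", common, common))
        i, j = block.a + block.size, block.b + block.size
    return pieces


local = ["1", "2", "3", "4", "5"]
other = ["1", "2", "3", "6", "8"]
for kind, mine, theirs in minimize(local, other):
    print(kind, mine, theirs)
```

With the two sides of the simple example below as input, this yields one
common piece (1, 2, 3) followed by one conflicting piece (4, 5 versus 6, 8).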

Simple example
--------------

A simple example is enough to show the benefit of this feature.  In this merge,
both sides change all numbers from words to digits, but one side also changes
some values.

  $ cat << EOF > base
  > Small Mathematical Series.
  > One
  > Two
  > Three
  > Four
  > Five
  > Hop we are done.
  > EOF

  $ cat << EOF > local
  > Small Mathematical Series.
  > 1
  > 2
  > 3
  > 4
  > 5
  > Hop we are done.
  > EOF

  $ cat << EOF > other
  > Small Mathematical Series.
  > 1
  > 2
  > 3
  > 6
  > 8
  > Hop we are done.
  > EOF

In the minimalist case, the markers focus on the disagreement between the two
sides.

  $ $TESTDIR/../contrib/simplemerge --print local base other
  Small Mathematical Series.
  1
  2
  3
  <<<<<<< local
  4
  5
  =======
  6
  8
  >>>>>>> other
  Hop we are done.
  warning: conflicts during merge.
  [1]

In the non-minimalist case, the whole chunk is included in the conflict
markers, making it harder to spot the actual differences.

  $ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
  Small Mathematical Series.
  <<<<<<< local
  1
  2
  3
  4
  5
  =======
  1
  2
  3
  6
  8
  >>>>>>> other
  Hop we are done.
  warning: conflicts during merge.
  [1]

Practical Advantages of minimalisation: merge of grafted change
---------------------------------------------------------------

This feature can be very useful when a change has been grafted onto another
branch and then some changes have been made to the grafted code.

  $ cat << EOF > base
  > # empty file
  > EOF

  $ cat << EOF > local
  > def somefunction(one, two):
  >     some = one
  >     stuff = two
  >     are(happening)
  >     here()
  > EOF

  $ cat << EOF > other
  > def somefunction(one, two):
  >     some = one
  >     change = two
  >     are(happening)
  >     here()
  > EOF

The minimalist case recognises the grafted content as similar and highlights
the actual change.


  $ $TESTDIR/../contrib/simplemerge --print local base other
  def somefunction(one, two):
      some = one
  <<<<<<< local
      stuff = two
  =======
      change = two
  >>>>>>> other
      are(happening)
      here()
  warning: conflicts during merge.
  [1]

Again, the non-minimalist case produces a larger conflict, making it harder to
spot the actual conflict.

  $ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
  <<<<<<< local
  def somefunction(one, two):
      some = one
      stuff = two
      are(happening)
      here()
  =======
  def somefunction(one, two):
      some = one
      change = two
      are(happening)
      here()
  >>>>>>> other
  warning: conflicts during merge.
  [1]


Practical disadvantage: multiple functions on each side
---------------------------------------------------------------

So, if this "minimalist" approach helps so much, why introduce a setting to
disable it?

The issue is that this minimisation will grab any common line to break up
chunks. This may result in partial context when solving a merge. The simplest
example is a merge where both sides added some (different) functions
separated by blank lines. The "minimalist" approach will recognise the blank
line as "common" and over-slice the chunks, turning a simple conflict case into
multiple pairs of conflicting functions.

  $ cat << EOF > base
  > # empty file
  > EOF

  $ cat << EOF > local
  > def function1():
  >     bla()
  >     bla()
  >     bla()
  >
  > def function2():
  >     ble()
  >     ble()
  >     ble()
  > EOF

  $ cat << EOF > other
  > def function3():
  >     bli()
  >     bli()
  >     bli()
  >
  > def function4():
  >     blo()
  >     blo()
  >     blo()
  > EOF

The minimal case presents each function as a separate context.

  $ $TESTDIR/../contrib/simplemerge --print local base other
  <<<<<<< local
  def function1():
      bla()
      bla()
      bla()
  =======
  def function3():
      bli()
      bli()
      bli()
  >>>>>>> other

  <<<<<<< local
  def function2():
      ble()
      ble()
      ble()
  =======
  def function4():
      blo()
      blo()
      blo()
  >>>>>>> other
  warning: conflicts during merge.
  [1]
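
The split above comes from the single blank line that both sides share. A
minimal sketch, assuming ``difflib`` as a stand-in for the matching code that
``simplemerge`` really uses, shows that it is the only common line between the
two sides:

```python
import difflib

local = [
    "def function1():", "    bla()", "    bla()", "    bla()",
    "",
    "def function2():", "    ble()", "    ble()", "    ble()",
]
other = [
    "def function3():", "    bli()", "    bli()", "    bli()",
    "",
    "def function4():", "    blo()", "    blo()", "    blo()",
]

# autojunk=False stops difflib from ignoring very frequent lines.
matcher = difflib.SequenceMatcher(None, local, other, autojunk=False)

# Collect every run of lines that appears on both sides.
common = [
    local[m.a:m.a + m.size]
    for m in matcher.get_matching_blocks()
    if m.size
]
print(common)  # prints [['']]
```

One empty line carries no meaning on its own, yet it is enough to cut the
conflict in two.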

The non-minimalist approach produces a simpler version with more context in
each block. Solving such conflicts is usually as simple as dropping the 3 lines
dedicated to markers.

  $ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
  <<<<<<< local
  def function1():
      bla()
      bla()
      bla()

  def function2():
      ble()
      ble()
      ble()
  =======
  def function3():
      bli()
      bli()
      bli()

  def function4():
      blo()
      blo()
      blo()
  >>>>>>> other
  warning: conflicts during merge.
  [1]
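
The resolution described above, dropping the three marker lines and keeping
both sides, can be sketched as follows. ``keep_both`` is a hypothetical helper
written for this example, not part of Mercurial, and it assumes the marker
layout shown in the outputs here.

```python
def keep_both(lines):
    """Drop conflict marker lines, keeping the content of both sides."""
    kept = []
    for line in lines:
        # Opening and closing markers carry a label ("local", "other").
        if line.startswith("<<<<<<< ") or line.startswith(">>>>>>> "):
            continue
        # The separator is matched exactly so that longer runs of "="
        # (for example a heading underline) are left alone.
        if line == "=======":
            continue
        kept.append(line)
    return kept


merged = [
    "<<<<<<< local",
    "def function1():",
    "    bla()",
    "=======",
    "def function3():",
    "    bli()",
    ">>>>>>> other",
]
print("\n".join(keep_both(merged)))
```

This only gives a sensible result when there is a single set of markers around
two unrelated additions, which is exactly what the non-minimalist output
provides.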

Practical disaster: programming languages have a lot of common lines
--------------------------------------------------------------------

If only blank lines between functions were the only frequent content of a code
file. But programming languages tend to repeat themselves much more often. In
that case, the minimalist approach turns a simple conflict into a massive mess.

Consider this example where unrelated functions are added on each side.
Those functions share common programming constructs by chance.

  $ cat << EOF > base
  > # empty file
  > EOF

  $ cat << EOF > local
  > def longfunction():
  >     if bla:
  >        foo
  >     else:
  >        bar
  >     try:
  >        ret = some stuff
  >     except Exception:
  >        ret = None
  >     if ret is not None:
  >         return ret
  >     return 0
  >
  > def shortfunction(foo):
  >     goo()
  >     ret = foo + 5
  >     return ret
  > EOF

  $ cat << EOF > other
  > def otherlongfunction():
  >     for x in xxx:
  >        if coin:
  >            break
  >        tutu
  >     else:
  >        bar()
  >     baz()
  >     ret = week()
  >     try:
  >        groumpf = tutu
  >        fool()
  >     except Exception:
  >        zoo()
  >     pool()
  >     if cond:
  >         return ret
  >
  >     # some big block
  >     ret ** 6
  >     koin()
  >     return ret
  > EOF

The minimalist approach will hash the whole conflict into small chunks that
do not match any meaningful semantics and are impossible to solve.

  $ $TESTDIR/../contrib/simplemerge --print local base other
  <<<<<<< local
  def longfunction():
      if bla:
         foo
  =======
  def otherlongfunction():
      for x in xxx:
         if coin:
             break
         tutu
  >>>>>>> other
      else:
  <<<<<<< local
         bar
  =======
         bar()
      baz()
      ret = week()
  >>>>>>> other
      try:
  <<<<<<< local
         ret = some stuff
  =======
         groumpf = tutu
         fool()
  >>>>>>> other
      except Exception:
  <<<<<<< local
         ret = None
      if ret is not None:
  =======
         zoo()
      pool()
      if cond:
  >>>>>>> other
          return ret
  <<<<<<< local
      return 0
  =======
  >>>>>>> other

  <<<<<<< local
  def shortfunction(foo):
      goo()
      ret = foo + 5
  =======
      # some big block
      ret ** 6
      koin()
  >>>>>>> other
      return ret
  warning: conflicts during merge.
  [1]

The non-minimalist approach will properly produce a single set of conflict
markers, highlighting that the two chunks are unrelated. Such a conflict from
unrelated content added at the same place is usually solved by dropping the
markers and keeping both contents, something impossible with minimised markers.


  $ $TESTDIR/../contrib/simplemerge --print --no-minimal local base other
  <<<<<<< local
  def longfunction():
      if bla:
         foo
      else:
         bar
      try:
         ret = some stuff
      except Exception:
         ret = None
      if ret is not None:
          return ret
      return 0

  def shortfunction(foo):
      goo()
      ret = foo + 5
      return ret
  =======
  def otherlongfunction():
      for x in xxx:
         if coin:
             break
         tutu
      else:
         bar()
      baz()
      ret = week()
      try:
         groumpf = tutu
         fool()
      except Exception:
         zoo()
      pool()
      if cond:
          return ret

      # some big block
      ret ** 6
      koin()
      return ret
  >>>>>>> other
  warning: conflicts during merge.
  [1]

diff --git a/mercurial/filemerge.py b/mercurial/filemerge.py
--- a/mercurial/filemerge.py
+++ b/mercurial/filemerge.py
@@ -189,11 +189,12 @@  def _premerge(repo, toolconf, files, lab
             raise error.ConfigError(_("%s.premerge not valid "
                                       "('%s' is neither boolean nor %s)") %
                                     (tool, premerge, _valid))
 
     if premerge:
-        r = simplemerge.simplemerge(ui, a, b, c, quiet=True, label=labels)
+        r = simplemerge.simplemerge(ui, a, b, c, quiet=True, label=labels,
+                                    no_minimal=True)
         if not r:
             ui.debug(" premerge successful\n")
             return 0
         if premerge != 'keep':
             util.copyfile(back, a) # restore from backup and try again