Patchwork D5514: test: change test's diff generation to use mdiff for nicer output

login
register
mail settings
Submitter phabricator
Date Jan. 7, 2019, 2:54 p.m.
Message ID <differential-rev-PHID-DREV-ll3xv4mpxxrkf6pg3cdp-req@phab.mercurial-scm.org>
Download mbox | patch
Permalink /patch/37531/
State New
Headers show

Comments

phabricator - Jan. 7, 2019, 2:54 p.m.
sangeet259 created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The current diff being used by tests upon failing is not very good. Sometimes there is a lot of redundancy.
  
  To see for yourself, follow these steps:
  
  1. Download this bundle : http://bit.ly/2DuJjsS
  
    $ hg clone https://www.mercurial-scm.org/hg
    $ cd hg/
    $ hg unbundle af8efb83e1d6.hg
    $ hg up a1e70c1dbec0^
    $ hg revert --all --rev a1e70c1dbec0
    $ hg revert -i tests/test-bookmarks-pushpull.t # revert only the last three hunks (press 'n' for the first 8, and 'y' for the last three)
    $ cd tests
    $ hg diff --rev a1e70c1dbec0 # nice output from Mercurial
    $ ./run-test.py test-bookmarks-pushpull.t # poor output from the test runner
  
  If you couldn't follow those steps, you can simply check the diffs here:
  
  - test's diff: https://pastebin.com/Z4LRg4vx
  - diff used by hg diff: https://pastebin.com/qanQEsiA
  
  Tests uses _unified_diff based on difflib.py . The output isn't great!
  
  Mercurial's diff uses mdiff which gives better diffs.
  This patch uses mdiff's unidiff function to generate the diff for the failed tests.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

AFFECTED FILES
  mercurial/mdiff.py
  tests/run-tests.py

CHANGE DETAILS




To: sangeet259, #hg-reviewers
Cc: mercurial-devel
phabricator - Jan. 7, 2019, 3:06 p.m.
sangeet259 added a comment.


  Note: To see this in action, one needs to add the changes from this `mdiff.py` to the mdiff.py of their installed mercurial.
  This is because unless you do that, you cannot import `mdiff.new_patch` as that function won't be there in your installed version, and hence it will keep falling back to the `unified_diff`.
  
  Else, can it be implemented in any other way.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers
Cc: mercurial-devel
phabricator - Jan. 8, 2019, 4:48 p.m.
durin42 requested changes to this revision.
durin42 added inline comments.
This revision now requires changes to proceed.

INLINE COMMENTS

> run-tests.py:72
>  import xml.dom.minidom as minidom
> +from mercurial import mdiff, patch
>  

This will fail if Mercurial isn't already globally installed on the machine, as would be the case in at least some packaging circumstances when building from a source tarball.

Probably need a try/except, and _maybe_ a sanity check that mdiff's API matches what we expect?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: durin42, mercurial-devel
phabricator - Jan. 8, 2019, 5:27 p.m.
pulkit added a comment.


  > If you couldn't follow those steps, you can simply check the diffs here:
  > 
  > test's diff: https://pastebin.com/Z4LRg4vx
  >  diff used by hg diff: https://pastebin.com/qanQEsiA
  
  Can you make the diffs a part of commit message since pastebin's pastes can be removed in future?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: pulkit, durin42, mercurial-devel
phabricator - Jan. 9, 2019, 8:05 a.m.
sangeet259 added a comment.


  @durin42 So the `try/except` will fall back to `unified diff`?
  Is there a way we can enforce this on system's that don't have mercurial installed globally and not have to fall back on the earlier practice.
  
  Also, I didn't get your comment on checking the API of `mdiff` :/

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: pulkit, durin42, mercurial-devel
phabricator - Jan. 9, 2019, 8:47 p.m.
durin42 added a comment.


  In https://phab.mercurial-scm.org/D5514#81800, @sangeet259 wrote:
  
  > @durin42 So the `try/except` will fall back to `unified diff`?
  
  
  Correct.
  
  > Is there a way we can enforce this on system's that don't have mercurial installed globally and not have to fall back on the earlier practice.
  
  I think we don't have any choice: if hg isn't installed, we have to fall back to the old codepath, since we won't know if mdiff is present.
  
  > Also, I didn't get your comment on checking the API of `mdiff` :/
  
  Well, what if we iterate on the API of mdiff.new_diff? then again, if the point of it is to have the same API as difflib.unified_diff, probably don't need to worry (and so we only have to fall back to difflib if mdiff isn't available or is too old).

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: pulkit, durin42, mercurial-devel
phabricator - Jan. 10, 2019, 3:19 p.m.
durin42 added inline comments.

INLINE COMMENTS

> mdiff.py:554
> +
> +def new_diff(expected, output, ref, err):
> +    exp, date1, out, date2, opts = prepare_mdiff(expected, output)

At least add a docstring taht explains this is designed to be a similar API to `difflib.unified_diff`?

> run-tests.py:50
>  import collections
> +import datetime
>  import difflib

unused?

> run-tests.py:54
>  import errno
> +import itertools
>  import json

unused?

> run-tests.py:81
> +    _diff = mdiff.new_diff
> +except Exception:
> +    print("Falling back to unified_diff")

Nit: `except (ImportError, AttributeError):` instead of just catching `Exception`.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: pulkit, durin42, mercurial-devel
phabricator - Feb. 14, 2019, 4:17 a.m.
sangeet259 added a comment.


  ping @durin42

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: pulkit, durin42, mercurial-devel
phabricator - Feb. 19, 2019, 9:19 a.m.
sangeet259 added a comment.


  Hey @durin42 , take a look at the changes when you get time.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: pulkit, durin42, mercurial-devel
phabricator - Feb. 19, 2019, 3 p.m.
durin42 added a comment.


  I've done some cleanups (dropped an unused import, copyedited the commit message a bit) and should push this soon. Thanks!

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: pulkit, durin42, mercurial-devel
phabricator - Feb. 19, 2019, 3:08 p.m.
durin42 added a comment.


  Even with my in-flight fixes, I can't get this to use the non-unified_diff codepath. Can you test again?
  
  https://bitbucket.org/snippets/durin42/nejj94 has my in-flight fixes, but you may find it challenging to apply (the test output in the commit message confuses the patch parser). Hopefully that helps...

INLINE COMMENTS

> run-tests.py:80
> +    _diff = mdiff.new_diff
> +except except (ImportError, AttributeError):
> +    print("Falling back to unified_diff")

In the future please *run tests* before requesting another round of review: this is a trivial syntax error and proves you didn't try running tests.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: pulkit, durin42, mercurial-devel
phabricator - Feb. 20, 2019, 8:21 p.m.
durin42 requested changes to this revision.
durin42 added a comment.
This revision now requires changes to proceed.


  (moving this off the dashboard until it gets updated)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5514

To: sangeet259, #hg-reviewers, durin42
Cc: pulkit, durin42, mercurial-devel

Patch

diff --git a/tests/run-tests.py b/tests/run-tests.py
--- a/tests/run-tests.py
+++ b/tests/run-tests.py
@@ -48,8 +48,10 @@ 
 import argparse
 import collections
 import difflib
+import datetime
 import distutils.version as version
 import errno
+import itertools
 import json
 import multiprocessing
 import os
@@ -67,6 +69,7 @@ 
 import unittest
 import uuid
 import xml.dom.minidom as minidom
+from mercurial import mdiff, patch
 
 try:
     import Queue as queue
@@ -603,10 +606,16 @@ 
     import functools
     _unified_diff = functools.partial(difflib.diff_bytes, difflib.unified_diff)
 
+try:
+    _diff = mdiff.new_diff
+except Exception:
+    print("Falling back to unified_diff")
+    _diff = _unified_diff
+
 def getdiff(expected, output, ref, err):
     servefail = False
     lines = []
-    for line in _unified_diff(expected, output, ref, err):
+    for line in _diff(expected, output, ref, err):
         if line.startswith(b'+++') or line.startswith(b'---'):
             line = line.replace(b'\\', b'/')
             if line.endswith(b' \n'):
diff --git a/mercurial/mdiff.py b/mercurial/mdiff.py
--- a/mercurial/mdiff.py
+++ b/mercurial/mdiff.py
@@ -7,6 +7,8 @@ 
 
 from __future__ import absolute_import
 
+import datetime
+import itertools
 import re
 import struct
 import zlib
@@ -525,3 +527,33 @@ 
 
 def replacediffheader(oldlen, newlen):
     return struct.pack(">lll", 0, oldlen, newlen)
+
+def prepare_mdiff(expected, output):
+    """Prepare the inputs for the mdiff.unidiff function"""
+    date1 = datetime.datetime.now().strftime("%a %b %d %y %H:%M:%S %Y %z")
+    date2 = date1
+    # join all the different elements into a single string
+    # to regenerate that file
+    exp = "".join(expected)
+    out = "".join(output)
+    opts = diffopts(noprefix=True)
+    return exp, date1, out, date2, opts
+
+def process_mdiff(uheaders, hunks):
+    """Process the output of mdiff into a list of lines,
+    to be used by getdiff"""
+    # the hunklines are in the hunks generator
+    # get the hunklines from the (hunkrange,hunklines) tuple
+    hunklines = [x[1] for x in hunks]
+    # extract and insert the headers at the beginnng of the hunklines list
+    headers = [uheaders[0].split("\t")[0]+"\n", uheaders[1].split("\t")[0]+"\n"]
+    hunklines.insert(0, headers)
+    difflines = itertools.chain.from_iterable(hunklines)
+    return difflines
+
+def new_diff(expected, output, ref, err):
+    exp, date1, out, date2, opts = prepare_mdiff(expected, output)
+    uheaders, hunks = unidiff(exp, date1, out, date2,
+                            ref, err, binary=False, opts=opts)
+    difflines = process_mdiff(uheaders, hunks)
+    return difflines