From patchwork Thu Apr 30 13:23:11 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: diff performance: re-establish linear runtime performance
From: Elmar Bartel
X-Patchwork-Id: 46252
Message-Id:
To: mercurial-devel@mercurial-scm.org
Cc: elb_hg@leo.org
Date: Thu, 30 Apr 2020 15:23:11 +0200

# HG changeset patch
# User Elmar Bartel
# Date 1588252205 -7200
#      Thu Apr 30 15:10:05 2020 +0200
# Node ID fd68848f8d5515f9a1afe288787f27fcdb130d42
# Parent  5c1f356b108e07782e07b824c206a87d3a9abcff
diff performance: re-establish linear runtime performance

The previous method with sum() and list() creates a new list object for
every hunk. Then sum() is used to flatten out this sequence of lists. The
sum() function is not "lazy", but creates a new list object for every "+"
operation and so this code had quadratic runtime behaviour.

diff -r 5c1f356b108e -r fd68848f8d55 mercurial/patch.py
--- a/mercurial/patch.py	Fri Apr 24 12:37:43 2020 -0700
+++ b/mercurial/patch.py	Thu Apr 30 15:10:05 2020 +0200
@@ -2558,7 +2558,7 @@ def diff(
                     fctx2 is not None
                 ), b'fctx2 unexpectly None in diff hunks filtering'
                 hunks = hunksfilterfn(fctx2, hunks)
-            text = b''.join(sum((list(hlines) for hrange, hlines in hunks), []))
+            text = b''.join((b''.join(hlines) for hrange, hlines in hunks))
             if hdr and (text or len(hdr) > 1):
                 yield b'\n'.join(hdr) + b'\n'
             if text:
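
The runtime argument in the commit message can be checked with a small
standalone sketch. It is not part of the patch: make_hunks() and the three
flatten_* helpers are hypothetical names, and the (hrange, hlines) pairs
only imitate the shape used in mercurial/patch.py. flatten_quadratic()
mirrors the removed line, flatten_linear() mirrors the added line, and
flatten_chain() is one possible alternative using itertools.

```python
import itertools


def make_hunks(n):
    # Hypothetical test data: n hunks, each a (hrange, hlines) pair
    # holding two byte-string lines.
    return [((i, i + 1), [b'-old %d\n' % i, b'+new %d\n' % i]) for i in range(n)]


def flatten_quadratic(hunks):
    # Same shape as the removed line: sum() evaluates [] + l1 + l2 + ...,
    # and every "+" copies the whole accumulated list into a new one.
    return b''.join(sum((list(hlines) for hrange, hlines in hunks), []))


def flatten_linear(hunks):
    # Same shape as the added line: join each hunk once, then join the
    # per-hunk results. Each byte is copied a constant number of times.
    return b''.join(b''.join(hlines) for hrange, hlines in hunks)


def flatten_chain(hunks):
    # Alternative, not from the patch: iterate over all lines lazily
    # without building any intermediate list.
    lines = itertools.chain.from_iterable(hlines for hrange, hlines in hunks)
    return b''.join(lines)


if __name__ == '__main__':
    hunks = make_hunks(3)
    # All three variants produce the same text; only the cost differs.
    assert flatten_quadratic(hunks) == flatten_linear(hunks) == flatten_chain(hunks)
```

With k hunks, the sum() variant copies on the order of k*k list entries in
total, because the accumulated list grows by one hunk per "+" and is copied
in full each time. The two other variants touch each line a fixed number of
times.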