Patchwork D12413: stringutil: try to avoid running `splitlines()` only to get first line

Submitter phabricator
Date March 25, 2022, 4:55 p.m.
Message ID <differential-rev-PHID-DREV-vy5vfy26asstk73siob6-req@mercurial-scm.org>
Permalink /patch/50758/
State New

Comments

phabricator - March 25, 2022, 4:55 p.m.
martinvonz created this revision.
Herald added a reviewer: hg-reviewers.
Herald added a subscriber: mercurial-patches.

REVISION SUMMARY
  It's wasteful to call `splitlines()` and only get the first line from
  it. However, Python doesn't seem to provide a built-in way of doing
  just one split based on the set of bytes used by `splitlines()`. As a
  workaround, we do an initial split on just LF and then call
  `splitlines()` on the result. Thanks to Joerg for this suggestion. I
  didn't bother to also split on CR, so users with old Windows editors
  (or repos created by such editors) will not get this performance
  improvement.
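
The workaround described above can be sketched as a standalone function. This is a minimal illustration rather than the exact patched code: the type annotations and the empty-input fallback (`b''`) are assumptions, since the `except` branch is cut off in the diff below.

```python
def firstline(text: bytes) -> bytes:
    """Return the first line of ``text`` without splitting the whole input."""
    # Cut at the first LF so splitlines() only has to scan a short prefix.
    i = text.find(b'\n')
    if i != -1:
        text = text[:i]
    # splitlines() still handles a CR or a trailing CR (from CRLF) in the prefix.
    lines = text.splitlines()
    # Assumption: empty input yields an empty bytes object.
    return lines[0] if lines else b''
```

In this sketch, CRLF input still takes the short path because it contains an LF, and the trailing CR is dropped by `splitlines()`. Only input with no LF at all falls through to a `splitlines()` call over the whole string.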

REPOSITORY
  rHG Mercurial

BRANCH
  default

REVISION DETAIL
  https://phab.mercurial-scm.org/D12413

AFFECTED FILES
  mercurial/utils/stringutil.py

CHANGE DETAILS




To: martinvonz, #hg-reviewers
Cc: mercurial-patches, mercurial-devel

Patch

diff --git a/mercurial/utils/stringutil.py b/mercurial/utils/stringutil.py
--- a/mercurial/utils/stringutil.py
+++ b/mercurial/utils/stringutil.py
@@ -687,6 +687,10 @@ 
 
 def firstline(text):
     """Return the first line of the input"""
+    # Try to avoid running splitlines() on the whole string
+    i = text.find(b'\n')
+    if i != -1:
+        text = text[:i]
     try:
         return text.splitlines()[0]
     except IndexError: