Patchwork [3,of,4] changelog: lazy decode description (API)

Submitter Gregory Szorc
Date Feb. 28, 2016, 7:27 a.m.
Message ID <8427442ba08dd8dc324e.1456644452@ubuntu-vm-main>
Permalink /patch/13454/
State Accepted

Comments

Gregory Szorc - Feb. 28, 2016, 7:27 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1456640714 28800
#      Sat Feb 27 22:25:14 2016 -0800
# Node ID 8427442ba08dd8dc324ea9e1fd30f65c89b2b753
# Parent  2d5c509edd9094ebc26b177db85842a46caf9c38
changelog: lazy decode description (API)

Currently, reading a changelog entry eagerly decodes the values it
reads. This is wasteful because consumers often aren't interested in
some of these values.

This patch moves description decoding into changectx.description(),
so the description is decoded only when it is requested.
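
As a rough illustration of the idea (not the code in this patch), the
sketch below keeps the description as raw bytes and decodes it only
when an accessor is called. LazyEntry, authors_matching and the
decode_calls counter are hypothetical names made up for this example,
and bytes.decode() stands in for Mercurial's encoding.tolocal().

```python
# Hypothetical names throughout; this is not Mercurial's changelog or
# changectx, only a sketch of decoding on access.
class LazyEntry(object):
    """Stores the description undecoded and decodes it on request."""

    def __init__(self, user, raw_description):
        self.user = user
        self._raw_description = raw_description  # bytes, not yet decoded
        self.decode_calls = 0  # instrumentation for the example only

    def description(self):
        # Decoding happens here, not when the entry is constructed.
        # bytes.decode() is a stand-in for encoding.tolocal().
        self.decode_calls += 1
        return self._raw_description.decode("utf-8", "replace")


def authors_matching(entries, needle):
    """Filter by user without ever touching the description."""
    return [entry for entry in entries if needle in entry.user]


entries = [
    LazyEntry("mpm", b"revlog: fix index parsing"),
    LazyEntry("lmoscovicz", b"revset: add a benchmark"),
]
matches = authors_matching(entries, "mpm")
```

A consumer that only filters on the user, as the author() revset does,
never pays for decoding the description in this arrangement.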

revsets reading changelog entries appear to speed up slightly (roughly
2-4% between run 0) and run 1) in the timings below):

revset #7: author(lmoscovicz)
   plain
0) 0.906329
1) 0.872653

revset #8: author(mpm)
   plain
0) 0.903478
1) 0.878037

revset #9: author(lmoscovicz) or author(mpm)
   plain
0) 1.817855
1) 1.778680

revset #10: author(mpm) or author(lmoscovicz)
   plain
0) 1.837052
1) 1.764568
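
To put the timings above in relative terms, the following sketch
computes the percentage change between the two runs of each revset.
It assumes that 0) is the run without the patch and 1) the run with
it, which the output does not state explicitly; relative_change is a
helper name invented for this example.

```python
# Timings copied from the benchmark output above as (run 0, run 1).
# Assumption: run 0 is without the patch and run 1 is with it.
timings = {
    "author(lmoscovicz)": (0.906329, 0.872653),
    "author(mpm)": (0.903478, 0.878037),
    "author(lmoscovicz) or author(mpm)": (1.817855, 1.778680),
    "author(mpm) or author(lmoscovicz)": (1.837052, 1.764568),
}


def relative_change(before, after):
    """Return the reduction from before to after, in percent."""
    return (before - after) / before * 100.0


changes = {
    revset: relative_change(run0, run1)
    for revset, (run0, run1) in timings.items()
}

for revset, percent in changes.items():
    print("%-36s %.1f%% faster" % (revset, percent))
```

All four revsets come out between 2% and 4%, which matches the
description of a slight speedup.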

Patch

diff --git a/mercurial/changelog.py b/mercurial/changelog.py
--- a/mercurial/changelog.py
+++ b/mercurial/changelog.py
@@ -329,22 +329,30 @@  class changelog(revlog.revlog):
         user\n          : user, no \n or \r allowed
         time tz extra\n : date (time is int or float, timezone is int)
                         : extra is metadata, encoded and separated by '\0'
                         : older versions ignore it
         files\n\n       : files modified by the cset, no \n or \r allowed
         (.*)            : comment (free text, ideally utf-8)
 
         changelog v0 doesn't use extra
+
+        Returns a 6-tuple consisting of the following:
+          - manifest node (binary)
+          - user (encoding.localstr)
+          - (time, timezone) 2-tuple of a float and int offset
+          - list of files modified by the cset
+          - commit message / description (binary)
+          - dict of extra entries
         """
         text = self.revision(node)
         if not text:
             return nullid, "", (0, 0), [], "", _defaultextra
         last = text.index("\n\n")
-        desc = encoding.tolocal(text[last + 2:])
+        desc = text[last + 2:]
         l = text[:last].split('\n')
         manifest = bin(l[0])
         user = encoding.tolocal(l[1])
 
         tdata = l[2].split(' ', 2)
         if len(tdata) != 3:
             time = float(tdata[0])
             try:
diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -549,17 +549,17 @@  class changectx(basectx):
 
     def user(self):
         return self._changeset[1]
     def date(self):
         return self._changeset[2]
     def files(self):
         return self._changeset[3]
     def description(self):
-        return self._changeset[4]
+        return encoding.tolocal(self._changeset[4])
     def branch(self):
         return encoding.tolocal(self._changeset[5].get("branch"))
     def closesbranch(self):
         return 'close' in self._changeset[5]
     def extra(self):
         return self._changeset[5]
     def tags(self):
         return self._repo.nodetags(self._node)