Patchwork [1,of,3] flagprocessor: simplify flag processor interface

mail settings
Submitter Jun Wu
Date May 11, 2017, 1:47 a.m.
Message ID <1b4a17fefa2e67b6bf92.1494467253@x1c>
Download mbox | patch
Permalink /patch/20563/
State Superseded
Headers show


Jun Wu - May 11, 2017, 1:47 a.m.
# HG changeset patch
# User Jun Wu <>
# Date 1494455614 25200
#      Wed May 10 15:33:34 2017 -0700
# Node ID 1b4a17fefa2e67b6bf9294c9fbce586a6646bdaa
# Parent  1ada3d18e7fbc9069910f2c036992d2f2b28e058
# Available At
#              hg pull -r 1b4a17fefa2e
flagprocessor: simplify flag processor interface

This is a series of cleaning up flag processor to make it more solid and
easier to use.

The current flag processor API allows different "validatehash" result per
revision, per read, write, raw "operation". That looks flexible but could
cause confusing - read and write operations should not both return True for
validatehash, if rawtext != text.

"validatehash" is conceptually a static setting - pick one from ['text',
'rawtext'] and let mercurial hash that.

Practically, there are no real users wanting to hash "rawtext" for now.

And we don't really support hashing "rawtext" at present - it breaks "cmp"
test which assumes "text" is used in hash.

Instead of supporting the choice of hashing "rawtext" and adding overhead to
"cmp" for non-existed yet usecase, I'd prefer dropping "rawtext" hashing for
now - i.e. assume all flag processors want to hash "text" for now. If it's
really useful, we can add it back as an optional argument in the future.

This patch deprecates "addflagprocessor" and adds a new, simplified
"registerflagprocessor" method in


diff --git a/mercurial/ b/mercurial/
--- a/mercurial/
+++ b/mercurial/
@@ -81,27 +81,13 @@  ProgrammingError = error.ProgrammingErro
-def addflagprocessor(flag, processor):
+def registerflagprocessor(flag, read, write):
     """Register a flag processor on a revision data flag.
-    Invariant:
-    - Flags need to be defined in REVIDX_KNOWN_FLAGS and REVIDX_FLAGS_ORDER.
-    - Only one flag processor can be registered on a specific flag.
-    - flagprocessors must be 3-tuples of functions (read, write, raw) with the
-      following signatures:
-          - (read)  f(self, rawtext) -> text, bool
-          - (write) f(self, text) -> rawtext, bool
-          - (raw)   f(self, rawtext) -> bool
-      "text" is presented to the user. "rawtext" is stored in revlog data, not
-      directly visible to the user.
-      The boolean returned by these transforms is used to determine whether
-      the returned text can be used for hash integrity checking. For example,
-      if "write" returns False, then "text" is used to generate hash. If
-      "write" returns True, that basically means "rawtext" returned by "write"
-      should be used to generate hash. Usually, "write" and "read" return
-      different booleans. And "raw" returns a same boolean as "write".
+    flag:  int, should be defined in REVIDX_KNOWN_FLAGS, and REVIDX_FLAGS_ORDER
+    read:  function, (revlog, rawtext) -> text
+    write: function, (revlog, text) -> rawtext
-      Note: The 'raw' transform is used for changegroup generation and in some
-      debug commands. In this case the transform only indicates whether the
-      contents can be used for hash integrity checks.
+    "text" is presented to the user (ex. working copy) and decides hash.
+    "rawtext" is stored in revlog data, used for exchange and bundle.
     if not flag & REVIDX_KNOWN_FLAGS:
@@ -114,5 +100,12 @@  def addflagprocessor(flag, processor):
         msg = _("cannot register multiple processors on flag '%#x'.") % (flag)
         raise error.Abort(msg)
-    _flagprocessors[flag] = processor
+    _flagprocessors[flag] = (read, write)
+def addflagprocessor(flag, processor):
+    """deprecated. please use registerflagprocessor instead"""
+    readtransform, writetransform, rawtransform = processor
+    read = lambda rlog, rawtext: readtransform(rlog, rawtext)[0]
+    write = lambda rlog, text: writetransform(rlog, text)[0]
+    registerflagprocessor(flag, read, write)
 def getoffset(q):
@@ -1377,6 +1370,4 @@  class revlog(object):
             # related operation transform and update result tuple.
             if flag & flags:
-                vhash = True
                 if flag not in _flagprocessors:
                     message = _("missing processor for flag '%#x'") % (flag)
@@ -1385,13 +1376,13 @@  class revlog(object):
                 processor = _flagprocessors[flag]
                 if processor is not None:
-                    readtransform, writetransform, rawtransform = processor
+                    readtransform, writetransform = processor
                     if raw:
-                        vhash = rawtransform(self, text)
+                        validatehash = False
                     elif operation == 'read':
-                        text, vhash = readtransform(self, text)
+                        text = readtransform(self, text)
                     else: # write operation
-                        text, vhash = writetransform(self, text)
-                validatehash = validatehash and vhash
+                        text = writetransform(self, text)
+                        validatehash = False
         return text, validatehash