Patchwork [02,of,14] util: implement varint functions

Submitter Boris Feld
Date Jan. 18, 2018, 11:21 a.m.
Message ID <1262be5f656bb0597b59.1516274488@FB>
Permalink /patch/26844/
State Superseded

Comments

Boris Feld - Jan. 18, 2018, 11:21 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1474779566 25200
#      Sat Sep 24 21:59:26 2016 -0700
# Node ID 1262be5f656bb0597b59a7dd2139b81922829fae
# Parent  939c242897c47ede8cda0b0a0149e15b74803402
# EXP-Topic b2-stream
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 1262be5f656b
util: implement varint functions

This will be useful in an upcoming version 2 of the stream format.
Yuya Nishihara - Jan. 19, 2018, 2 p.m.
On Thu, 18 Jan 2018 12:21:28 +0100, Boris Feld wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1474779566 25200
> #      Sat Sep 24 21:59:26 2016 -0700
> # Node ID 1262be5f656bb0597b59a7dd2139b81922829fae
> # Parent  939c242897c47ede8cda0b0a0149e15b74803402
> # EXP-Topic b2-stream
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 1262be5f656b
> util: implement varint functions

(I stopped here. This series will need to be reviewed by someone having more
expertise, and I don't think I can finish reviewing today.)

> +def uvarintencode(value):
> +    """Encode an unsigned integer value to a varint.
> +
> +    A varint is a variable length integer of 1 or more bytes. Each byte
> +    except the last has the most significant bit set. The lower 7 bits of
> +    each byte store the 2's complement representation, least significant group
> +    first.
> +    """
> +    bits = value & 0x7f
> +    value >>= 7
> +    bytes = []
> +    while value:
> +        bytes.append(chr(0x80 | bits))
> +        bits = value & 0x7f
> +        value >>= 7
> +    bytes.append(chr(bits))

Nit: use pycompat.bytechr() instead of chr() so the result is bytes on Python 3.
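
For illustration, here is a rough sketch of how the encoder might look with that nit applied. It is not the patch itself: `bytechr` below is a local stand-in assumed to behave like pycompat.bytechr (one integer in 0..255 packed into one byte), the list is renamed to `chunks` to avoid shadowing the `bytes` builtin, and the negative-value check is an addition that the patch does not have (without it, a negative input never reaches zero under `>>=` and the loop does not terminate).

```python
import struct

# Assumption: stand-in for pycompat.bytechr, packing one int (0..255) into one byte.
bytechr = struct.Struct('>B').pack

def uvarintencode(value):
    """Encode a non-negative integer as a varint, least significant group first."""
    if value < 0:
        # Not in the original patch: reject input that would loop forever.
        raise ValueError('negative value for uvarint: %d' % value)
    bits = value & 0x7f
    value >>= 7
    chunks = []
    while value:
        # More 7-bit groups follow, so set the continuation (most significant) bit.
        chunks.append(bytechr(0x80 | bits))
        bits = value & 0x7f
        value >>= 7
    # Last group: continuation bit left clear.
    chunks.append(bytechr(bits))
    return b''.join(chunks)
```

With this sketch, 300 (binary 10 0101100) splits into the groups 0101100 and 10, so it encodes as the two bytes 0xac 0x02.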

> +    return ''.join(bytes)
> +
> +def uvarintdecodestream(fh):
> +    """Decode an unsigned variable length integer from a stream.
> +
> +    The passed argument is anything that has a ``.read(N)`` method.
> +    """
> +    result = 0
> +    shift = 0
> +    while True:
> +        byte = ord(fh.read(1))

Need to test EOF? fh.read(1) returns an empty string at end of stream, and
ord('') raises TypeError.

Alternatively, we could split this function into
 a) read bytes while MSB is set
 b) decode bytes into integer (in reverse order)
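
A minimal sketch of what that split could look like, with the EOF check folded into step a). The helper names `readuvarintbytes` and `uvarintdecode` are hypothetical and not part of the patch, and raising EOFError is only one possible choice for signalling a truncated stream.

```python
import io

def readuvarintbytes(fh):
    """a) Read raw varint bytes from fh, stopping at the first byte with MSB clear."""
    chunks = bytearray()
    while True:
        c = fh.read(1)
        if not c:
            # Stream ended before the terminating byte was seen.
            raise EOFError('stream ended inside a varint')
        chunks += c
        if not (chunks[-1] & 0x80):
            return bytes(chunks)

def uvarintdecode(data):
    """b) Decode varint bytes into an integer, walking the groups in reverse order."""
    result = 0
    for byte in reversed(bytearray(data)):
        # The most significant group comes last on the wire, so it is folded in first.
        result = (result << 7) | (byte & 0x7f)
    return result

def uvarintdecodestream(fh):
    """Decode one unsigned varint from anything with a ``.read(N)`` method."""
    return uvarintdecode(readuvarintbytes(fh))

# Usage: the two bytes 0xac 0x02 decode to 300.
value = uvarintdecodestream(io.BytesIO(b'\xac\x02'))
```

One consequence of the split is that the pure decoding step can be tested on byte strings without a file object, while the reader alone decides how a short read is reported.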

> +        result |= ((byte & 0x7f) << shift)
> +        if not (byte & 0x80):
> +            return result
> +        shift += 7

Patch

diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -3865,3 +3865,36 @@  def safename(f, tag, ctx, others=None):
         fn = '%s~%s~%s' % (f, tag, n)
         if fn not in ctx and fn not in others:
             return fn
+
+def uvarintencode(value):
+    """Encode an unsigned integer value to a varint.
+
+    A varint is a variable length integer of 1 or more bytes. Each byte
+    except the last has the most significant bit set. The lower 7 bits of
+    each byte store the 2's complement representation, least significant group
+    first.
+    """
+    bits = value & 0x7f
+    value >>= 7
+    bytes = []
+    while value:
+        bytes.append(chr(0x80 | bits))
+        bits = value & 0x7f
+        value >>= 7
+    bytes.append(chr(bits))
+
+    return ''.join(bytes)
+
+def uvarintdecodestream(fh):
+    """Decode an unsigned variable length integer from a stream.
+
+    The passed argument is anything that has a ``.read(N)`` method.
+    """
+    result = 0
+    shift = 0
+    while True:
+        byte = ord(fh.read(1))
+        result |= ((byte & 0x7f) << shift)
+        if not (byte & 0x80):
+            return result
+        shift += 7