Patchwork [1,of,3] dirstate: change placeholder hash length to 20 bytes

login
register
mail settings
Submitter Durham Goode
Date Nov. 10, 2016, 11:01 a.m.
Message ID <84d7edbc0712c8c71b5e.1478775664@dev111.prn1.facebook.com>
Download mbox | patch
Permalink /patch/17435/
State Accepted
Headers show

Comments

Durham Goode - Nov. 10, 2016, 11:01 a.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1478773042 28800
#      Thu Nov 10 02:17:22 2016 -0800
# Node ID 84d7edbc0712c8c71b5e32c9cc1ee04032b9eb2d
# Parent  3fd53cc1aad882ac9191d7388885acdbbc2d7103
dirstate: change placeholder hash length to 20 bytes

Previously the new-node placeholder hash for manifests generated from the
dirstate was a 21byte long string of "!" characters. Normal hashes are only 20
bytes long.  This makes it complicated to implement more efficient manifest
implementations which rely on the hashes being fixed length.

Let's change this hash to just be 20 bytes long, and rely on the astronomical
improbability of an actual hash being 20 "!" bytes in a row (just like we rely
on no hash ever being the nullid).

A future diff will do this for added and modified dirstate markers as well, so
we're putting the new newnodeid in node.py so there's a common place for these
placeholders.
Sean Farley - Nov. 10, 2016, 6:07 p.m.
Durham Goode <durham@fb.com> writes:

> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1478773042 28800
> #      Thu Nov 10 02:17:22 2016 -0800
> # Node ID 84d7edbc0712c8c71b5e32c9cc1ee04032b9eb2d
> # Parent  3fd53cc1aad882ac9191d7388885acdbbc2d7103
> dirstate: change placeholder hash length to 20 bytes
>
> Previously the new-node placeholder hash for manifests generated from the
> dirstate was a 21byte long string of "!" characters. Normal hashes are only 20
> bytes long.  This makes it complicated to implement more efficient manifest
> implementations which rely on the hashes being fixed length.
>
> Let's change this hash to just be 20 bytes long, and rely on the astronomical
> improbability of an actual hash being 20 "!" bytes in a row (just like we rely
> on no hash ever being the nullid).

Will this cause a problem for a user that has a 21-length dirstate entry
then upgrades Mercurial?

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -16,6 +16,7 @@  from .i18n import _
 from .node import (
     bin,
     hex,
+    newnodeid,
     nullid,
     nullrev,
     short,
@@ -39,11 +40,6 @@  from . import (
 
 propertycache = util.propertycache
 
-# Phony node value to stand-in for new files in some uses of
-# manifests. Manifests support 21-byte hashes for nodes which are
-# dirty in the working copy.
-_newnode = '!' * 21
-
 nonascii = re.compile(r'[^\x21-\x7f]').search
 
 class basectx(object):
@@ -142,7 +138,7 @@  class basectx(object):
                 removed.append(fn)
             elif flag1 != flag2:
                 modified.append(fn)
-            elif node2 != _newnode:
+            elif node2 != newnodeid:
                 # When comparing files between two commits, we save time by
                 # not comparing the file contents when the nodeids differ.
                 # Note that this means we incorrectly report a reverted change
@@ -1580,7 +1576,7 @@  class workingctx(committablectx):
         """
         mf = self._repo['.']._manifestmatches(match, s)
         for f in s.modified + s.added:
-            mf[f] = _newnode
+            mf[f] = newnodeid
             mf.setflag(f, self.flags(f))
         for f in s.removed:
             if f in mf:
diff --git a/mercurial/copies.py b/mercurial/copies.py
--- a/mercurial/copies.py
+++ b/mercurial/copies.py
@@ -278,7 +278,7 @@  def _makegetfctx(ctx):
         ac = repo.changelog.ancestors(revs, inclusive=True)
         ctx._ancestrycontext = ac
     def makectx(f, n):
-        if len(n) != 20:  # in a working context?
+        if len(n) != 20 or n in node.wdirnodes:  # in a working context?
             if ctx.rev() is None:
                 return ctx.filectx(f)
             return repo[None][f]
diff --git a/mercurial/node.py b/mercurial/node.py
--- a/mercurial/node.py
+++ b/mercurial/node.py
@@ -17,6 +17,12 @@  nullrev = -1
 nullid = b"\0" * 20
 nullhex = hex(nullid)
 
+# Phony node value to stand-in for new files in some uses of
+# manifests.
+newnodeid = '!' * 20
+
+wdirnodes = set((newnodeid,))
+
 # pseudo identifiers for working directory
 # (they are experimental, so don't add too many dependencies on them)
 wdirrev = 0x7fffffff