Patchwork [7,of,7] largefiles: use readasstandin() to read hex hash directly from filectx

login
register
mail settings
Submitter Katsunori FUJIWARA
Date March 31, 2017, 5:41 p.m.
Message ID <bd7e74372d5d3763b387.1490982112@speaknoevil>
Download mbox | patch
Permalink /patch/19879/
State Accepted
Headers show

Comments

Katsunori FUJIWARA - March 31, 2017, 5:41 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1490981569 -32400
#      Sat Apr 01 02:32:49 2017 +0900
# Node ID bd7e74372d5d3763b38717159a805cf557b07604
# Parent  89507622f5b4cb8b8e9f99a02db43b64f10d95f3
largefiles: use readasstandin() to read hex hash directly from filectx

BTW, C implementation of hexdigest() for SHA-1/256/512 returns hex
hash in lower case, and doctest in Python standard hashlib assumes
that, too. But it isn't explicitly described in API document or so.

Therefore, we can't assume that hexdigest() always returns hex hash in
lower case, for any hash algorithms, on any Python runtimes and
versions.

From point of view of that, it is reasonable for portability that
40800668e019 applies lower() on hex hash in overridefilemerge().

But on the other hand, in largefiles extension, there are still many
code paths comparing between hex hashes or storing hex hash into
standin file, without lower().

Switching to hash algorithm other than SHA-1 may be good chance to
clarify our policy about hexdigest()-ed hash value string.

  - assume that hexdigest() always returns hex hash in lower case, or

  - apply lower() on hex hash in appropriate layers to ensure
    lower-case-ness of it for portability
Yuya Nishihara - April 2, 2017, 4:44 a.m.
On Sat, 01 Apr 2017 02:41:52 +0900, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1490981569 -32400
> #      Sat Apr 01 02:32:49 2017 +0900
> # Node ID bd7e74372d5d3763b38717159a805cf557b07604
> # Parent  89507622f5b4cb8b8e9f99a02db43b64f10d95f3
> largefiles: use readasstandin() to read hex hash directly from filectx

Queued, thanks.

> BTW, C implementation of hexdigest() for SHA-1/256/512 returns hex
> hash in lower case, and doctest in Python standard hashlib assumes
> that, too. But it isn't explicitly described in API document or so.
> 
> Therefore, we can't assume that hexdigest() always returns hex hash in
> lower case, for any hash algorithms, on any Python runtimes and
> versions.

[...]

We'll need to stop using .hexdigest() for Py3 compatibility anyway, and
use node.hex() instead. node.hex() isn't documented to return lowercase
hex, but everything would be broken if it doesn't.

Patch

diff --git a/hgext/largefiles/basestore.py b/hgext/largefiles/basestore.py
--- a/hgext/largefiles/basestore.py
+++ b/hgext/largefiles/basestore.py
@@ -130,7 +130,7 @@  class basestore(object):
                     key = (filename, fctx.filenode())
                     if key not in verified:
                         verified.add(key)
-                        expectedhash = fctx.data().strip()
+                        expectedhash = lfutil.readasstandin(fctx)
                         filestocheck.append((cset, filename, expectedhash))
 
         failed = self._verifyfiles(contents, filestocheck)
diff --git a/hgext/largefiles/lfcommands.py b/hgext/largefiles/lfcommands.py
--- a/hgext/largefiles/lfcommands.py
+++ b/hgext/largefiles/lfcommands.py
@@ -406,7 +406,7 @@  def cachelfiles(ui, repo, node, filelist
     ctx = repo[node]
     for lfile in lfiles:
         try:
-            expectedhash = ctx[lfutil.standin(lfile)].data().strip()
+            expectedhash = lfutil.readasstandin(ctx[lfutil.standin(lfile)])
         except IOError as err:
             if err.errno == errno.ENOENT:
                 continue # node must be None and standin wasn't found in wctx
diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
--- a/hgext/largefiles/lfutil.py
+++ b/hgext/largefiles/lfutil.py
@@ -174,7 +174,7 @@  def lfdirstatestatus(lfdirstate, repo):
             fctx = pctx[standin(lfile)]
         except LookupError:
             fctx = None
-        if not fctx or fctx.data().strip() != hashfile(repo.wjoin(lfile)):
+        if not fctx or readasstandin(fctx) != hashfile(repo.wjoin(lfile)):
             modified.append(lfile)
         else:
             clean.append(lfile)
@@ -528,7 +528,7 @@  def getlfilestoupload(repo, missing, add
                     files.add(f)
         for fn in files:
             if isstandin(fn) and fn in ctx:
-                addfunc(fn, ctx[fn].data().strip())
+                addfunc(fn, readasstandin(ctx[fn]))
     repo.ui.progress(_('finding outgoing largefiles'), None)
 
 def updatestandinsbymatch(repo, match):
diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
--- a/hgext/largefiles/overrides.py
+++ b/hgext/largefiles/overrides.py
@@ -553,9 +553,9 @@  def overridefilemerge(origfn, premerge, 
         return origfn(premerge, repo, mynode, orig, fcd, fco, fca,
                       labels=labels)
 
-    ahash = fca.data().strip().lower()
-    dhash = fcd.data().strip().lower()
-    ohash = fco.data().strip().lower()
+    ahash = lfutil.readasstandin(fca).lower()
+    dhash = lfutil.readasstandin(fcd).lower()
+    ohash = lfutil.readasstandin(fco).lower()
     if (ohash != ahash and
         ohash != dhash and
         (dhash == ahash or
diff --git a/hgext/largefiles/reposetup.py b/hgext/largefiles/reposetup.py
--- a/hgext/largefiles/reposetup.py
+++ b/hgext/largefiles/reposetup.py
@@ -172,7 +172,7 @@  def reposetup(ui, repo):
                             if standin not in ctx1:
                                 # from second parent
                                 modified.append(lfile)
-                            elif ctx1[standin].data().strip() \
+                            elif lfutil.readasstandin(ctx1[standin]) \
                                     != lfutil.hashfile(self.wjoin(lfile)):
                                 modified.append(lfile)
                             else:
@@ -188,7 +188,7 @@  def reposetup(ui, repo):
                             standin = lfutil.standin(lfile)
                             if standin in ctx1:
                                 abslfile = self.wjoin(lfile)
-                                if ((ctx1[standin].data().strip() !=
+                                if ((lfutil.readasstandin(ctx1[standin]) !=
                                      lfutil.hashfile(abslfile)) or
                                     (checkexec and
                                      ('x' in ctx1.flags(standin)) !=