Patchwork [1,of,8] largefiles: reuse hexsha1() to centralize hash calculation logic into it

Submitter Katsunori FUJIWARA
Date March 27, 2017, 1:53 a.m.
Message ID <936283df3680c7106951.1490579621@speaknoevil>
# HG changeset patch
# User FUJIWARA Katsunori <>
# Date 1490575474 -32400
#      Mon Mar 27 09:44:34 2017 +0900
# Node ID 936283df3680c7106951d752c9055f438d667411
# Parent  e86eb75e74ce1b0803c26d86a229b9b711f6d76a
largefiles: reuse hexsha1() to centralize hash calculation logic into it

This patch also renames the argument of hexsha1(), not only for
readability ("data" isn't a good name for a file-like object), but
also for reviewability (including the hexsha1() code in the diff helps
reviewers confirm how similar these functions are).

BTW, copyandhash() has similar logic, but it can't reuse hexsha1(),
because it also writes the data it reads into the specified fileobj.


diff --git a/hgext/largefiles/ b/hgext/largefiles/
--- a/hgext/largefiles/
+++ b/hgext/largefiles/
@@ -373,11 +373,8 @@  def copyandhash(instream, outfile):
 def hashfile(file):
     if not os.path.exists(file):
         return ''
-    hasher = hashlib.sha1('')
     with open(file, 'rb') as fd:
-        for data in util.filechunkiter(fd):
-            hasher.update(data)
-    return hasher.hexdigest()
+        return hexsha1(fd)
 def getexecutable(filename):
     mode = os.stat(filename).st_mode
@@ -398,11 +395,11 @@  def urljoin(first, second, *arg):
         url = join(url, a)
     return url
-def hexsha1(data):
+def hexsha1(fileobj):
     """hexsha1 returns the hex-encoded sha1 sum of the data in the file-like
     object data"""
     h = hashlib.sha1()
-    for chunk in util.filechunkiter(data):
+    for chunk in util.filechunkiter(fileobj):
         h.update(chunk)
     return h.hexdigest()
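
For reference, the two hashing patterns discussed in the commit message can be sketched in standalone Python 3. This is only an illustration, not the Mercurial code: chunks() is a hypothetical stand-in for util.filechunkiter(), and hexsha1_sketch() and copyandhash_sketch() are made-up names that mirror the shape of hexsha1() and copyandhash() under the assumption that both consume data chunk by chunk.

```python
import hashlib
import io


def chunks(fileobj, size=131072):
    """Yield successive blocks read from fileobj.

    Hypothetical stand-in for util.filechunkiter(); the block size is
    an arbitrary choice for this sketch.
    """
    while True:
        block = fileobj.read(size)
        if not block:
            break
        yield block


def hexsha1_sketch(fileobj):
    """Return the hex-encoded SHA-1 of everything readable from fileobj."""
    h = hashlib.sha1()
    for chunk in chunks(fileobj):
        h.update(chunk)
    return h.hexdigest()


def copyandhash_sketch(instream, outfile):
    """Hash instream while also copying every chunk to outfile.

    The extra write inside the loop is why this variant cannot simply
    delegate to hexsha1_sketch().
    """
    h = hashlib.sha1()
    for chunk in chunks(instream):
        h.update(chunk)
        outfile.write(chunk)
    return h.hexdigest()
```

With this shape, a hashfile()-style caller only needs to open the file in binary mode and pass the file object to the hashing helper, which is the simplification the first hunk makes.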