Patchwork [1,of,3] lfs: introduce a localstore method for downloading from remote stores

Submitter Matt Harbison
Date Jan. 7, 2018, 7:22 a.m.
Message ID <32b06a5033f350fe73ea.1515309753@Envy>
Permalink /patch/26601/
State Accepted

Comments

Matt Harbison - Jan. 7, 2018, 7:22 a.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1513900564 18000
#      Thu Dec 21 18:56:04 2017 -0500
# Node ID 32b06a5033f350fe73eaac8d27fd153df302257d
# Parent  58fda95a0202fc6327d1f5d9df26f7ff16538d57
lfs: introduce a localstore method for downloading from remote stores

The current local.write() method requires the full data, which means
concatenating file chunks in memory when downloading from a git server.  The
dedicated method downloads in chunks, verifies the content on the fly, and
creates the usercache hardlink if successful.  It can also be used for the file
system based remotestore.
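
The flow described above (stream in chunks, hash as the data arrives, commit
the file only if the digest matches, then link it into the user cache) can be
sketched outside of Mercurial with the standard library alone.  This is a
minimal illustration, not the patch's implementation: download_verified,
store_dir and cache_dir are hypothetical names, and a temporary file plus
os.replace() stands in for Mercurial's vfs atomictemp handling.

```python
import hashlib
import io
import os
import shutil
import tempfile


def download_verified(oid, src, store_dir, cache_dir, chunk_size=1048576):
    """Stream src into store_dir/oid, verifying the SHA-256 on the fly.

    Hypothetical helper: names and layout are assumptions for illustration.
    """
    sha256 = hashlib.sha256()
    dest = os.path.join(store_dir, oid)
    # Write to a temp file in the same directory so the final rename is atomic.
    fd, tmp = tempfile.mkstemp(dir=store_dir)
    try:
        with os.fdopen(fd, 'wb') as fp:
            # Read fixed-size chunks until the source is exhausted.
            for chunk in iter(lambda: src.read(chunk_size), b''):
                fp.write(chunk)
                sha256.update(chunk)
        if sha256.hexdigest() != oid:
            raise ValueError('corrupt remote lfs object: %s' % oid)
        os.replace(tmp, dest)
    except BaseException:
        # Discard the partial or corrupt download; nothing reaches dest.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

    cached = os.path.join(cache_dir, oid)
    if not os.path.exists(cached):
        try:
            os.link(dest, cached)  # hardlink when the filesystem allows it
        except OSError:
            shutil.copyfile(dest, cached)  # otherwise fall back to a copy
    return dest


if __name__ == '__main__':
    store, cache = tempfile.mkdtemp(), tempfile.mkdtemp()
    payload = b'example blob contents'
    blob_oid = hashlib.sha256(payload).hexdigest()
    download_verified(blob_oid, io.BytesIO(payload), store, cache)
```

Because the digest is updated per chunk, memory use is bounded by the chunk
size rather than the blob size, and a mismatch leaves neither the store nor
the cache modified.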

An explicit division of labor between downloading from a remote store (which
should be verified) and writing to the store because of a commit or similar
(which doesn't need verification) seems clearer.  I can't figure out how to
make a similar function for upload, because the two cases differ: for a file
remote store, it's a simple open/read/write operation, while for a gitremote
store, it means opening the file, issuing a urlreq.request(), and processing
the response.

Yuya Nishihara - Jan. 7, 2018, 8:28 a.m.
On Sun, 07 Jan 2018 02:22:33 -0500, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1513900564 18000
> #      Thu Dec 21 18:56:04 2017 -0500
> # Node ID 32b06a5033f350fe73eaac8d27fd153df302257d
> # Parent  58fda95a0202fc6327d1f5d9df26f7ff16538d57
> lfs: introduce a localstore method for downloading from remote stores

Looks good. Queued, thanks.

Patch

diff --git a/hgext/lfs/blobstore.py b/hgext/lfs/blobstore.py
--- a/hgext/lfs/blobstore.py
+++ b/hgext/lfs/blobstore.py
@@ -114,6 +114,26 @@ 
 
         return self.vfs(oid, 'rb')
 
+    def download(self, oid, src):
+        """Read the blob from the remote source in chunks, verify the content,
+        and write to this local blobstore."""
+        sha256 = hashlib.sha256()
+
+        with self.vfs(oid, 'wb', atomictemp=True) as fp:
+            for chunk in util.filechunkiter(src, size=1048576):
+                fp.write(chunk)
+                sha256.update(chunk)
+
+            realoid = sha256.hexdigest()
+            if realoid != oid:
+                raise error.Abort(_('corrupt remote lfs object: %s') % oid)
+
+        # XXX: should we verify the content of the cache, and hardlink back to
+        # the local store on success, but truncate, write and link on failure?
+        if not self.cachevfs.exists(oid):
+            self.ui.note(_('lfs: adding %s to the usercache\n') % oid)
+            lfutil.link(self.vfs.join(oid), self.cachevfs.join(oid))
+
     def write(self, oid, data, verify=True):
         """Write blob to local blobstore."""
         if verify: