Patchwork [1,of,2] cat: don't prefetch files unless the output requires it

login
register
mail settings
Submitter Matt Harbison
Date June 15, 2019, 12:10 a.m.
Message ID <6ad49ba335f3aa429321.1560557405@Envy>
Download mbox | patch
Permalink /patch/40526/
State Accepted
Headers show

Comments

Matt Harbison - June 15, 2019, 12:10 a.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1538629031 14400
#      Thu Oct 04 00:57:11 2018 -0400
# Node ID 6ad49ba335f3aa429321727edca21f378e4fc606
# Parent  87a34c76738492397abae40a27b6337f8be1b90d
cat: don't prefetch files unless the output requires it

It's a waste to cache lfs blobs when cat'ing the raw data at best, but a hassle
debugging when the blob is missing.  I'm not sure if there are other commands
that have '{data}' for output, and if there's a general way to prefetch on that
keyword.

It's interesting that the verbose output seems to leak into the JSON output, but
that seems like an existing bug.
Yuya Nishihara - June 15, 2019, 12:43 a.m.
On Fri, 14 Jun 2019 20:10:05 -0400, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1538629031 14400
> #      Thu Oct 04 00:57:11 2018 -0400
> # Node ID 6ad49ba335f3aa429321727edca21f378e4fc606
> # Parent  87a34c76738492397abae40a27b6337f8be1b90d
> cat: don't prefetch files unless the output requires it

Queued, thanks.

Patch

diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -2345,14 +2345,22 @@  def remove(ui, repo, m, prefix, uipathfn
 
     return ret
 
+def _catfmtneedsdata(fm):
+    return not fm.datahint() or 'data' in fm.datahint()
+
 def _updatecatformatter(fm, ctx, matcher, path, decode):
     """Hook for adding data to the formatter used by ``hg cat``.
 
     Extensions (e.g., lfs) can wrap this to inject keywords/data, but must call
     this method first."""
-    data = ctx[path].data()
-    if decode:
-        data = ctx.repo().wwritedata(path, data)
+
+    # data() can be expensive to fetch (e.g. lfs), so don't fetch it if it
+    # wasn't requested.
+    data = b''
+    if _catfmtneedsdata(fm):
+        data = ctx[path].data()
+        if decode:
+            data = ctx.repo().wwritedata(path, data)
     fm.startitem()
     fm.context(ctx=ctx)
     fm.write('data', '%s', data)
@@ -2383,13 +2391,15 @@  def cat(ui, repo, ctx, matcher, basefm, 
         mfnode = ctx.manifestnode()
         try:
             if mfnode and mfl[mfnode].find(file)[0]:
-                scmutil.prefetchfiles(repo, [ctx.rev()], matcher)
+                if _catfmtneedsdata(basefm):
+                    scmutil.prefetchfiles(repo, [ctx.rev()], matcher)
                 write(file)
                 return 0
         except KeyError:
             pass
 
-    scmutil.prefetchfiles(repo, [ctx.rev()], matcher)
+    if _catfmtneedsdata(basefm):
+        scmutil.prefetchfiles(repo, [ctx.rev()], matcher)
 
     for abs in ctx.walk(matcher):
         write(abs)
diff --git a/tests/test-lfs-serve.t b/tests/test-lfs-serve.t
--- a/tests/test-lfs-serve.t
+++ b/tests/test-lfs-serve.t
@@ -537,8 +537,55 @@  Misc: process dies early if a requiremen
 
   $ hg clone -qU http://localhost:$HGPORT $TESTTMP/bulkfetch
 
+Cat doesn't prefetch unless data is needed (e.g. '-T {rawdata}' doesn't need it)
+
+  $ hg --cwd $TESTTMP/bulkfetch cat -vr tip lfspair1.bin -T '{rawdata}\n{path}\n'
+  lfs: assuming remote store: http://localhost:$HGPORT/.git/info/lfs
+  version https://git-lfs.github.com/spec/v1
+  oid sha256:cf1b2787b74e66547d931b6ebe28ff63303e803cb2baa14a8f57c4383d875782
+  size 20
+  x-is-binary 0
+  
+  lfspair1.bin
+
+  $ hg --cwd $TESTTMP/bulkfetch cat -vr tip lfspair1.bin -T json
+  lfs: assuming remote store: http://localhost:$HGPORT/.git/info/lfs
+  [lfs: assuming remote store: http://localhost:$HGPORT/.git/info/lfs
+  lfs: downloading cf1b2787b74e66547d931b6ebe28ff63303e803cb2baa14a8f57c4383d875782 (20 bytes)
+  lfs: processed: cf1b2787b74e66547d931b6ebe28ff63303e803cb2baa14a8f57c4383d875782
+  lfs: downloaded 1 files (20 bytes)
+  lfs: found cf1b2787b74e66547d931b6ebe28ff63303e803cb2baa14a8f57c4383d875782 in the local lfs store
+  
+   {
+    "data": "this is an lfs file\n",
+    "path": "lfspair1.bin",
+    "rawdata": "version https://git-lfs.github.com/spec/v1\noid sha256:cf1b2787b74e66547d931b6ebe28ff63303e803cb2baa14a8f57c4383d875782\nsize 20\nx-is-binary 0\n"
+   }
+  ]
+
+  $ rm -r $TESTTMP/bulkfetch/.hg/store/lfs
+
+  $ hg --cwd $TESTTMP/bulkfetch cat -vr tip lfspair1.bin -T '{data}\n'
+  lfs: assuming remote store: http://localhost:$HGPORT/.git/info/lfs
+  lfs: assuming remote store: http://localhost:$HGPORT/.git/info/lfs
+  lfs: downloading cf1b2787b74e66547d931b6ebe28ff63303e803cb2baa14a8f57c4383d875782 (20 bytes)
+  lfs: processed: cf1b2787b74e66547d931b6ebe28ff63303e803cb2baa14a8f57c4383d875782
+  lfs: downloaded 1 files (20 bytes)
+  lfs: found cf1b2787b74e66547d931b6ebe28ff63303e803cb2baa14a8f57c4383d875782 in the local lfs store
+  this is an lfs file
+  
+  $ hg --cwd $TESTTMP/bulkfetch cat -vr tip lfspair2.bin
+  lfs: assuming remote store: http://localhost:$HGPORT/.git/info/lfs
+  lfs: assuming remote store: http://localhost:$HGPORT/.git/info/lfs
+  lfs: downloading d96eda2c74b56e95cfb5ffb66b6503e198cc6fc4a09dc877de925feebc65786e (24 bytes)
+  lfs: processed: d96eda2c74b56e95cfb5ffb66b6503e198cc6fc4a09dc877de925feebc65786e
+  lfs: downloaded 1 files (24 bytes)
+  lfs: found d96eda2c74b56e95cfb5ffb66b6503e198cc6fc4a09dc877de925feebc65786e in the local lfs store
+  this is an lfs file too
+
 Export will prefetch all needed files across all needed revisions
 
+  $ rm -r $TESTTMP/bulkfetch/.hg/store/lfs
   $ hg -R $TESTTMP/bulkfetch -v export -r 0:tip -o all.export
   lfs: assuming remote store: http://localhost:$HGPORT/.git/info/lfs
   exporting patches: