Patchwork [2,of,4,RESEND] largefiles: show also how many data entities are outgoing at "hg summary"

Submitter Katsunori FUJIWARA
Date July 8, 2014, 2:40 a.m.
Message ID <f0329c068eaacd80e4fd.1404787224@juju>
Permalink /patch/5129/
State Accepted
Commit 12019e6aa8a21613f0ba413048b77a231c5ac7e3

Comments

Katsunori FUJIWARA - July 8, 2014, 2:40 a.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1404726346 -32400
#      Mon Jul 07 18:45:46 2014 +0900
# Node ID f0329c068eaacd80e4fd2a0aeac2b9c70799517d
# Parent  6f3a1b7ead9e3f3bf371bec2fee34d605edb72a8
largefiles: show also how many data entities are outgoing at "hg summary"

Before this patch, "hg summary --large" shows how many largefiles are
changed or added in outgoing revisions only in terms of the number of
filenames.

For example, judging by the number of outgoing largefiles shown in
"hg summary" output, users would expect the former case below to cost
much more to upload than the latter.

  - outgoing revisions add a hundred largefiles, but all of them refer
    to the same data entity

    in this case, only one data entity is outgoing, even though "hg
    summary" says that a hundred largefiles are outgoing.

  - a hundred outgoing revisions change only one largefile with
    distinct data

    in this case, a hundred data entities are outgoing, even though
    "hg summary" says that only one largefile is outgoing.

In fact, however, the latter costs much more than the former.

This patch also shows how many data entities are outgoing in "hg
summary" output, by counting the number of unique hash values of
outgoing largefiles.
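
The difference between the two counts can be sketched as follows. This
is a minimal illustration, not code from the patch: "summarize" and the
sample (filename, hash) pairs are hypothetical, and only the message
format is taken from the patch.

```python
def summarize(pairs):
    """Build the summary line from (filename, hash) pairs.

    Hypothetical helper for illustration; the patch itself collects the
    pairs through a callback instead of taking a list.
    """
    toupload = set()   # unique filenames
    lfhashes = set()   # unique data entities (hash values)
    for fn, lfhash in pairs:
        toupload.add(fn)
        lfhashes.add(lfhash)
    if not toupload:
        return 'largefiles: (no files to upload)'
    return ('largefiles: %d entities for %d files to upload'
            % (len(lfhashes), len(toupload)))

# a hundred largefiles that all refer to the same data entity
same = [('large%d' % i, 'hash0') for i in range(100)]
# one largefile changed with distinct data in a hundred revisions
distinct = [('large', 'hash%d' % i) for i in range(100)]

print(summarize(same))      # largefiles: 1 entities for 100 files to upload
print(summarize(distinct))  # largefiles: 100 entities for 1 files to upload
```

The second case uploads a hundred times more data entities, which the
filename count alone does not reveal.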

This patch introduces "_getoutgoings" to centralize the logic
(including de-duplication) for the convenience of subsequent patches,
even though this is not strictly required in the "hg summary" case.
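
The de-duplicating callback wrapper can be sketched in isolation as
below. This is a simplified sketch under stated assumptions:
"make_dedup" and "fake_getlfilestoupload" are hypothetical names, and
the latter merely stands in for "lfutil.getlfilestoupload" by invoking
the callback once per pair.

```python
def make_dedup(addfunc):
    """Wrap 'addfunc' so it sees each (filename, hash) pair only once."""
    knowns = set()
    def dedup(fn, lfhash):
        k = (fn, lfhash)
        if k not in knowns:
            knowns.add(k)
            addfunc(fn, lfhash)
    return dedup

def fake_getlfilestoupload(pairs, addfunc):
    # Hypothetical stand-in: the real function walks outgoing revisions
    # and may report the same pair more than once.
    for fn, lfhash in pairs:
        addfunc(fn, lfhash)

seen = []
fake_getlfilestoupload(
    [('a', 'h1'), ('a', 'h1'), ('b', 'h1'), ('a', 'h2')],
    make_dedup(lambda fn, lfhash: seen.append((fn, lfhash))))
print(seen)  # [('a', 'h1'), ('b', 'h1'), ('a', 'h2')]
```

A callback that only adds to sets, as in the "hg summary" case, gets the
same result without the wrapper; the wrapper matters for callbacks that
append to a list or print each pair.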

Patch

diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
--- a/hgext/largefiles/overrides.py
+++ b/hgext/largefiles/overrides.py
@@ -992,6 +992,21 @@  def overrideforget(orig, ui, repo, *pats
 
     return result
 
+def _getoutgoings(repo, missing, addfunc):
+    """get pairs of filename and largefile hash in outgoing revisions
+    in 'missing'.
+
+    'addfunc' is invoked with each unique pairs of filename and
+    largefile hash value.
+    """
+    knowns = set()
+    def dedup(fn, lfhash):
+        k = (fn, lfhash)
+        if k not in knowns:
+            knowns.add(k)
+            addfunc(fn, lfhash)
+    lfutil.getlfilestoupload(repo, missing, dedup)
+
 def outgoinghook(ui, repo, other, opts, missing):
     if opts.pop('large', None):
         toupload = set()
@@ -1020,14 +1035,19 @@  def summaryremotehook(ui, repo, opts, ch
             return
 
         toupload = set()
-        lfutil.getlfilestoupload(repo, outgoing.missing,
-                                 lambda fn, lfhash: toupload.add(fn))
+        lfhashes = set()
+        def addfunc(fn, lfhash):
+            toupload.add(fn)
+            lfhashes.add(lfhash)
+        _getoutgoings(repo, outgoing.missing, addfunc)
+
         if not toupload:
             # i18n: column positioning for "hg summary"
             ui.status(_('largefiles: (no files to upload)\n'))
         else:
             # i18n: column positioning for "hg summary"
-            ui.status(_('largefiles: %d to upload\n') % len(toupload))
+            ui.status(_('largefiles: %d entities for %d files to upload\n')
+                      % (len(lfhashes), len(toupload)))
 
 def overridesummary(orig, ui, repo, *pats, **opts):
     try:
diff --git a/tests/test-largefiles-misc.t b/tests/test-largefiles-misc.t
--- a/tests/test-largefiles-misc.t
+++ b/tests/test-largefiles-misc.t
@@ -468,7 +468,7 @@  check messages when there are files to u
   branch: default
   commit: (clean)
   update: (current)
-  largefiles: 1 to upload
+  largefiles: 1 entities for 1 files to upload
   $ hg -R clone2 outgoing --large
   comparing with $TESTTMP/issue3651/src (glob)
   searching for changes
@@ -503,7 +503,7 @@  check messages when there are files to u
   branch: default
   commit: (clean)
   update: (current)
-  largefiles: 3 to upload
+  largefiles: 1 entities for 3 files to upload
   $ hg -R clone2 outgoing --large -T "{rev}:{node|short}\n"
   comparing with $TESTTMP/issue3651/src (glob)
   searching for changes
@@ -533,7 +533,7 @@  check messages when there are files to u
   branch: default
   commit: (clean)
   update: (current)
-  largefiles: 3 to upload
+  largefiles: 3 entities for 3 files to upload
   $ hg -R clone2 outgoing --large -T "{rev}:{node|short}\n"
   comparing with $TESTTMP/issue3651/src (glob)
   searching for changes