Patchwork [rfc] bundle: when verbose, show what takes up the space in the uncompressed bundle

Submitter Mads Kiilerich
Date Aug. 29, 2014, 9:45 a.m.
Message ID <c6f7a80ae75f32f5d876.1409305524@localhost.localdomain>
Permalink /patch/5619/
State Accepted

Comments

Mads Kiilerich - Aug. 29, 2014, 9:45 a.m.
# HG changeset patch
# User Mads Kiilerich <madski@unity3d.com>
# Date 1408124612 -7200
#      Fri Aug 15 19:43:32 2014 +0200
# Node ID c6f7a80ae75f32f5d876fbdf3a4e1b8a6a389128
# Parent  926bc0d3b595caf37c5d70833a347eb43285de2f
bundle: when verbose, show what takes up the space in the uncompressed bundle

This is kind of similar to the debugbundle command but gives summarized actual
numbers when creating the bundle.

Useful before pushing stuff from others to assess whether it makes sense to
increase the repo size that much or if large files accidentally have been
committed.
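
As an illustration of that use case, here is a minimal sketch that scans
captured verbose output for entries above a size threshold. It assumes the
output format shown in the test expectations of this patch (a "bundle content
and size:" header followed by right-aligned byte counts and labels). The
function name large_entries and the threshold parameter are hypothetical and
not part of Mercurial.

```python
import re

# Matches report lines such as "     250 (changelog)" or "     109  a":
# a right-aligned byte count followed by a label.
_REPORT_LINE = re.compile(r'^\s*(\d+)\s+(.+?)\s*$')

HEADER = 'bundle content and size:'


def large_entries(output, threshold):
    """Return (size, label) pairs with size >= threshold bytes.

    Only lines that follow the report header are considered, so unrelated
    lines such as "2 changesets found" are ignored.
    """
    hits = []
    in_report = False
    for line in output.splitlines():
        if line.strip() == HEADER:
            in_report = True
            continue
        if not in_report:
            continue
        match = _REPORT_LINE.match(line)
        if match is None:
            # The first line that is not "<count> <label>" ends the report.
            in_report = False
            continue
        size = int(match.group(1))
        if size >= threshold:
            hits.append((size, match.group(2)))
    return hits


# Sample taken from the test-debugbundle.t expectations in this patch.
sample = """2 changesets found
bundle content and size:
     332 (changelog)
     282 (manifests)
     105  b
     105  c
"""

for size, label in large_entries(sample, 200):
    print('%8d %s' % (size, label))
```

With a threshold of 200 bytes, only the changelog and manifest entries of the
sample are reported; a real check would presumably use a much larger
threshold.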

This output doesn't combine well with debug output so we only enable it when
verbose without debug.
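
The approach can be sketched outside Mercurial as a generator that passes
chunks through unchanged while summing their lengths, and reports the total
only when verbose is set and debug is not. This is a simplified stand-in, not
the patch itself: tally, make_note and the sink list are hypothetical names,
and the real code calls self._repo.ui.note and checks ui.debugflag instead.

```python
def make_note(verbose, debug, sink):
    """Return a reporter that emits only when verbose is set and debug is not.

    `sink` is a plain list standing in for the ui object (an assumption made
    for this sketch).
    """
    def note(msg):
        if verbose and not debug:
            sink.append(msg)
    return note


def tally(chunks, label, note):
    """Yield every chunk unchanged, then report the total size in bytes."""
    size = 0
    for chunk in chunks:
        size += len(chunk)
        yield chunk
    # Runs once the consumer has exhausted the chunks.
    note('%8d %s\n' % (size, label))


out = []
note = make_note(verbose=True, debug=False, sink=out)
data = list(tally([b'abc', b'de'], '(changelog)', note))

quiet = []
silent = make_note(verbose=True, debug=True, sink=quiet)
list(tally([b'abc', b'de'], '(changelog)', silent))
```

Because the total is reported after the last chunk has been yielded, the line
only appears once the consumer has drained the generator, which matches how
the sizes are printed after each group in the patch.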
Matt Mackall - Aug. 29, 2014, 3:06 p.m.
On Fri, 2014-08-29 at 14:23 +0200, Kevin Bullock wrote:
> On Aug 29, 2014, at 11:45 AM, Mads Kiilerich <mads@kiilerich.com> wrote:
> 
> > # HG changeset patch
> > # User Mads Kiilerich <madski@unity3d.com>
> > # Date 1408124612 -7200
> > #      Fri Aug 15 19:43:32 2014 +0200
> > # Node ID c6f7a80ae75f32f5d876fbdf3a4e1b8a6a389128
> > # Parent  926bc0d3b595caf37c5d70833a347eb43285de2f
> > bundle: when verbose, show what takes up the space in the uncompressed bundle
> > 
> > This is kind of similar to the debugbundle command but gives summarized actual
> > numbers when creating the bundle.
> > 
> > Useful before pushing stuff from others to assess whether it makes sense to
> > increase the repo size that much or if large files accidentally have been
> > committed.
> > 
> > This output doesn't combine well with debug output so we only enable it when
> > verbose without debug.
> 
> Not a bad idea, but I'd like to see units on those numbers.

Let's put this under --debug?
Mads Kiilerich - Aug. 29, 2014, 9:59 p.m.
On 08/29/2014 05:06 PM, Matt Mackall wrote:
> On Fri, 2014-08-29 at 14:23 +0200, Kevin Bullock wrote:
>> On Aug 29, 2014, at 11:45 AM, Mads Kiilerich <mads@kiilerich.com> wrote:
>>
>>> # HG changeset patch
>>> # User Mads Kiilerich <madski@unity3d.com>
>>> # Date 1408124612 -7200
>>> #      Fri Aug 15 19:43:32 2014 +0200
>>> # Node ID c6f7a80ae75f32f5d876fbdf3a4e1b8a6a389128
>>> # Parent  926bc0d3b595caf37c5d70833a347eb43285de2f
>>> bundle: when verbose, show what takes up the space in the uncompressed bundle
>>>
>>> This is kind of similar to the debugbundle command but gives summarized actual
>>> numbers when creating the bundle.
>>>
>>> Useful before pushing stuff from others to assess whether it makes sense to
>>> increase the repo size that much or if large files accidentally have been
>>> committed.
>>>
>>> This output doesn't combine well with debug output so we only enable it when
>>> verbose without debug.
>> Not a bad idea, but I'd like to see units on those numbers.

Adding "bytes" to every line would be weird. And a "bundle content and size in bytes" preamble seems even more weird than the existing one.
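
For comparison, a third option would be a scaled unit suffix per line. The
sketch below is only an illustration of that alternative and is not what the
patch does; format_size is a hypothetical helper, and the choice of 1024-based
units is an assumption.

```python
def format_size(nbytes):
    """Format a byte count with a unit suffix, scaling by factors of 1024."""
    units = ['bytes', 'KB', 'MB', 'GB']
    value = float(nbytes)
    for unit in units:
        # Stop at the first unit that fits, or at the largest unit available.
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    if unit == 'bytes':
        return '%d bytes' % nbytes
    return '%.1f %s' % (value, unit)


for size, label in [(250, '(changelog)'), (1536, 'big.bin')]:
    print('%12s %s' % (format_size(size), label))
```

The trade-off is that scaled values lose the exact byte counts that the plain
integers in the patch preserve.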


> Let's put this under --debug?

That would interleave it with "progress" info like on 
http://selenic.com/repo/hg/file/bdc0e04df243/tests/test-bundle.t#l623 
and I don't think that would be useful ... unless the old progress info 
could be removed. I doubt it would be feasible to collect these numbers 
at the level where the progress info is collected.

Anyway, this is just a very rough approximation of the amount of 
information (a la Nyquist) added to the repository. Both the bundle 
format and the repo storage format are however far from that so the 
approximation might be way off and irrelevant ... but it seems to be the 
best we have.

Now, enter the brave new world of clever deltas and bundle2 ... That 
will make these numbers more relevant.

/Mads
Pierre-Yves David - Sept. 18, 2014, 12:02 a.m.
So what's the status of this patch?

On 08/29/2014 02:59 PM, Mads Kiilerich wrote:
> On 08/29/2014 05:06 PM, Matt Mackall wrote:
>> On Fri, 2014-08-29 at 14:23 +0200, Kevin Bullock wrote:
>>> On Aug 29, 2014, at 11:45 AM, Mads Kiilerich <mads@kiilerich.com> wrote:
>>>
>>>> # HG changeset patch
>>>> # User Mads Kiilerich <madski@unity3d.com>
>>>> # Date 1408124612 -7200
>>>> #      Fri Aug 15 19:43:32 2014 +0200
>>>> # Node ID c6f7a80ae75f32f5d876fbdf3a4e1b8a6a389128
>>>> # Parent  926bc0d3b595caf37c5d70833a347eb43285de2f
>>>> bundle: when verbose, show what takes up the space in the
>>>> uncompressed bundle
>>>>
>>>> This is kind of similar to the debugbundle command but gives
>>>> summarized actual
>>>> numbers when creating the bundle.
>>>>
>>>> Useful before pushing stuff from others to assess whether it makes
>>>> sense to
>>>> increase the repo size that much or if large files accidentally have
>>>> been
>>>> committed.
>>>>
>>>> This output doesn't combine well with debug output so we only enable
>>>> it when
>>>> verbose without debug.
>>> Not a bad idea, but I'd like to see units on those numbers.
>
> Adding "bytes" to every line would be weird. And a "bundle content and
> size in bytes" preamble seems even more weird than the existing one.
>
>
>> Let's put this under --debug?
>
> That would interleave it with "progress" info like on
> http://selenic.com/repo/hg/file/bdc0e04df243/tests/test-bundle.t#l623
> and I don't think that would be useful ... unless the old progress info
> could be removed. I doubt it would be feasible to collect these numbers
> at the level where the progress info is collected.
>
> Anyway, this is just a very rough approximation of the amount of
> information (a la Nyquist) added to the repository. Both the bundle
> format and the repo storage format are however far from that so the
> approximation might be way off and irrelevant ... but it seems to be the
> best we have.
>
> Now, enter the brave new world of clever deltas and bundle2 ... That
> will make these numbers more relevant.
>
> /Mads

Patch

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -344,17 +344,28 @@  class bundle10(object):
                         fnodes[f].setdefault(n, clnode)
             return clnode
 
+        if not self._repo.ui.debugflag:
+            self._repo.ui.note('bundle content and size:\n')
+
+        size = 0
         for chunk in self.group(clnodes, cl, lookupcl, units=_('changesets'),
                                 reorder=reorder):
+            size += len(chunk)
             yield chunk
+        if not self._repo.ui.debugflag:
+            self._repo.ui.note('%8.i (changelog)\n' % size)
         progress(msgbundling, None)
 
         for f in changedfiles:
             fnodes[f] = {}
         mfnodes = self.prune(mf, mfs, commonrevs, source)
+        size = 0
         for chunk in self.group(mfnodes, mf, lookupmf, units=_('manifests'),
                                 reorder=reorder):
+            size += len(chunk)
             yield chunk
+        if not self._repo.ui.debugflag:
+            self._repo.ui.note('%8.i (manifests)\n' % size)
         progress(msgbundling, None)
 
         mfs.clear()
@@ -405,10 +416,15 @@  class bundle10(object):
             if filenodes:
                 progress(msgbundling, i + 1, item=fname, unit=msgfiles,
                          total=total)
-                yield self.fileheader(fname)
+                h = self.fileheader(fname)
+                size = len(h)
+                yield h
                 for chunk in self.group(filenodes, filerevlog, lookupfilelog,
                                         reorder=reorder):
+                    size += len(chunk)
                     yield chunk
+                if not self._repo.ui.debugflag:
+                    self._repo.ui.note('%8.i  %s\n' % (size, fname))
 
     def revchunk(self, revlog, rev, prev, linknode):
         node = revlog.node(rev)
diff --git a/tests/test-commit-amend.t b/tests/test-commit-amend.t
--- a/tests/test-commit-amend.t
+++ b/tests/test-commit-amend.t
@@ -109,8 +109,16 @@  No changes, just a different message:
   a
   stripping amended changeset 74609c7f506e
   1 changesets found
+  bundle content and size:
+       250 (changelog)
+       143 (manifests)
+       109  a
   saved backup bundle to $TESTTMP/.hg/strip-backup/74609c7f506e-amend-backup.hg (glob)
   1 changesets found
+  bundle content and size:
+       246 (changelog)
+       143 (manifests)
+       109  a
   adding branch
   adding changesets
   adding manifests
@@ -236,8 +244,16 @@  then, test editing custom commit message
   a
   stripping amended changeset 5f357c7560ab
   1 changesets found
+  bundle content and size:
+       238 (changelog)
+       143 (manifests)
+       111  a
   saved backup bundle to $TESTTMP/.hg/strip-backup/5f357c7560ab-amend-backup.hg (glob)
   1 changesets found
+  bundle content and size:
+       246 (changelog)
+       143 (manifests)
+       111  a
   adding branch
   adding changesets
   adding manifests
@@ -265,8 +281,16 @@  Same, but with changes in working dir (d
   stripping intermediate changeset a0ea9b1a4c8c
   stripping amended changeset 7ab3bf440b54
   2 changesets found
+  bundle content and size:
+       450 (changelog)
+       282 (manifests)
+       209  a
   saved backup bundle to $TESTTMP/.hg/strip-backup/7ab3bf440b54-amend-backup.hg (glob)
   1 changesets found
+  bundle content and size:
+       246 (changelog)
+       143 (manifests)
+       113  a
   adding branch
   adding changesets
   adding manifests
diff --git a/tests/test-debugbundle.t b/tests/test-debugbundle.t
--- a/tests/test-debugbundle.t
+++ b/tests/test-debugbundle.t
@@ -6,8 +6,13 @@  Create a test repository:
   $ touch a ; hg add a ; hg ci -ma
   $ touch b ; hg add b ; hg ci -mb
   $ touch c ; hg add c ; hg ci -mc
-  $ hg bundle --base 0 --rev tip bundle.hg
+  $ hg bundle --base 0 --rev tip bundle.hg -v
   2 changesets found
+  bundle content and size:
+       332 (changelog)
+       282 (manifests)
+       105  b
+       105  c
 
 Terse output:
 
diff --git a/tests/test-phases-exchange.t b/tests/test-phases-exchange.t
--- a/tests/test-phases-exchange.t
+++ b/tests/test-phases-exchange.t
@@ -764,6 +764,10 @@  Bare push with next changeset and common
   pushing to ../alpha
   searching for changes
   1 changesets found
+  bundle content and size:
+       172 (changelog)
+       145 (manifests)
+       111  a-H
   adding changesets
   adding manifests
   adding file changes
diff --git a/tests/test-push-warn.t b/tests/test-push-warn.t
--- a/tests/test-push-warn.t
+++ b/tests/test-push-warn.t
@@ -142,6 +142,10 @@ 
   pushing to ../c
   searching for changes
   2 changesets found
+  bundle content and size:
+       308 (changelog)
+       286 (manifests)
+       213  foo
   adding changesets
   adding manifests
   adding file changes