Patchwork [V2] commands: add debugdeltachain command

Submitter Gregory Szorc
Date Dec. 6, 2015, 11:20 p.m.
Message ID <01581986b2c1426c790f.1449444008@ubuntu-main>
Permalink /patch/11878/
State Accepted

Comments

Gregory Szorc - Dec. 6, 2015, 11:20 p.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1449387466 28800
#      Sat Dec 05 23:37:46 2015 -0800
# Node ID 01581986b2c1426c790f7728f932ee2a725581a8
# Parent  094073353710dad068f30a8e4d1d2524e5dc34ca
commands: add debugdeltachain command

We have debug commands for displaying overall revlog statistics
(debugrevlog) and for dumping a revlog index (debugindex). As part
of investigating various aspects of revlog behavior and performance,
I found it important to have an understanding of how revlog
delta chains behave in practice.

This patch implements a "debugdeltachain" command. For each revision
in a revlog, it dumps information about the delta chain: which delta
chain the revision is part of, the length of the delta chain, the
distance from the base revision, info about the base revision, the
size of the delta chain, etc. The generic formatting facility is used,
which means we can templatize output and get machine-readable output
like JSON.

This command has already uncovered some weird history in
mozilla-central I didn't know about. So I think it's valuable.
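
The per-revision numbers described above can be illustrated with a
small standalone sketch. It is not the patch's code: the Entry tuple,
the toy index, and the helper names are hypothetical, the data file is
assumed to be contiguous, and every entry is assumed to record its
delta parent (generaldelta style), with a full snapshot pointing at
itself. Unlike the patch, the sketch also guards the two ratios against
zero sizes.

```python
from collections import namedtuple

# Hypothetical, simplified index entry. The real revlog index tuple
# carries more fields (offsets, link revision, parents, node id).
Entry = namedtuple("Entry", "compsize uncompsize deltaparent")


def chain_for(index, rev):
    """Return the revisions in the delta chain for rev, base first."""
    chain = []
    while True:
        chain.append(rev)
        parent = index[rev].deltaparent
        if parent == rev:  # a full snapshot is its own delta parent
            break
        rev = parent
    chain.reverse()
    return chain


def chain_metrics(index, rev):
    """Compute chain length, sizes, distances and ratios for rev."""
    # Start offset of each revision in a toy, contiguous data file.
    starts = []
    offset = 0
    for entry in index:
        starts.append(offset)
        offset += entry.compsize

    chain = chain_for(index, rev)
    chainsize = sum(index[r].compsize for r in chain)
    # Bytes spanned from the start of the base to the end of this revision.
    lindist = starts[rev] + index[rev].compsize - starts[chain[0]]
    # Bytes in that span that belong to other chains.
    extradist = lindist - chainsize
    uncomp = index[rev].uncompsize
    return {
        "chainlen": len(chain),
        "chainsize": chainsize,
        "lindist": lindist,
        "extradist": extradist,
        "chainratio": float(chainsize) / uncomp if uncomp else 0.0,
        "extraratio": float(extradist) / chainsize if chainsize else 0.0,
    }


# Toy data: rev 0 and rev 2 are full snapshots; rev 1 deltas against
# rev 0 and rev 3 deltas against rev 1.
toy_index = [
    Entry(44, 43, 0),
    Entry(10, 50, 0),
    Entry(30, 40, 2),
    Entry(12, 55, 1),
]

metrics = chain_metrics(toy_index, 3)
```

In this toy index, revision 3 sits on the chain 0, 1, 3, while
revision 2 belongs to a different chain but lies between them on disk,
so its 30 bytes count toward extradist for revision 3.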
Pierre-Yves David - Dec. 7, 2015, 4:09 a.m.
On 12/06/2015 03:20 PM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1449387466 28800
> #      Sat Dec 05 23:37:46 2015 -0800
> # Node ID 01581986b2c1426c790f7728f932ee2a725581a8
> # Parent  094073353710dad068f30a8e4d1d2524e5dc34ca
> commands: add debugdeltachain command

Pushed to the clowncopter, thanks for the upgrade to json.

I'm a bit surprised that the test (even before you touched it) uses a
single-entry manifest for its testing. Seems a bit light.

Cheers,
Gregory Szorc - Dec. 7, 2015, 4:25 a.m.
On Sun, Dec 6, 2015 at 8:09 PM, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 12/06/2015 03:20 PM, Gregory Szorc wrote:
>
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc@gmail.com>
>> # Date 1449387466 28800
>> #      Sat Dec 05 23:37:46 2015 -0800
>> # Node ID 01581986b2c1426c790f7728f932ee2a725581a8
>> # Parent  094073353710dad068f30a8e4d1d2524e5dc34ca
>> commands: add debugdeltachain command
>>
>
> Pushed to the clowncopter, thanks for the upgrade to json.
>
> I'm a bit surprised that the test (even before you touched it) uses a
> single-entry manifest for its testing. Seems a bit light.
>

Looking at test-debugcommands.t, I'd say our test coverage for debug*
commands is overall a bit light :/ But I don't like scope bloating when not
necessary.
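
Since the command goes through the generic formatter, its JSON output
can be post-processed with ordinary tooling. The following is a minimal
sketch, assuming the output of "hg debugdeltachain -m -Tjson" has
already been captured as a string. The sample is the single record from
the test added by this patch, and the helper name longest_chains is
hypothetical.

```python
import json

# Sample output copied from the new test in test-debugcommands.t.
SAMPLE = """
[
 {
  "chainid": 1,
  "chainlen": 1,
  "chainratio": 1.02325581395,
  "chainsize": 44,
  "compsize": 44,
  "deltatype": "base",
  "extradist": 0,
  "extraratio": 0.0,
  "lindist": 44,
  "prevrev": -1,
  "rev": 0,
  "uncompsize": 43
 }
]
"""


def longest_chains(records):
    """Map each chainid to the largest chainlen seen for that chain."""
    longest = {}
    for rec in records:
        cid = rec["chainid"]
        longest[cid] = max(longest.get(cid, 0), rec["chainlen"])
    return longest


records = json.loads(SAMPLE)
summary = longest_chains(records)
```

With a larger repository, the same loop could be extended to flag
revisions whose extraratio is high, which is the kind of oddity the
command is meant to surface.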

Patch

diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -2496,16 +2496,127 @@  def debugindexdot(ui, repo, file_=None, 
     for i in r:
         node = r.node(i)
         pp = r.parents(node)
         ui.write("\t%d -> %d\n" % (r.rev(pp[0]), i))
         if pp[1] != nullid:
             ui.write("\t%d -> %d\n" % (r.rev(pp[1]), i))
     ui.write("}\n")
 
+@command('debugdeltachain',
+    debugrevlogopts + formatteropts,
+    _('-c|-m|FILE'),
+    optionalrepo=True)
+def debugdeltachain(ui, repo, file_=None, **opts):
+    """dump information about delta chains in a revlog
+
+    Output can be templatized. Available template keywords are:
+
+       rev          revision number
+       chainid      delta chain identifier (numbered by unique base)
+       chainlen     delta chain length to this revision
+       prevrev      previous revision in delta chain
+       deltatype    role of delta / how it was computed
+       compsize     compressed size of revision
+       uncompsize   uncompressed size of revision
+       chainsize    total size of compressed revisions in chain
+       chainratio   total chain size divided by uncompressed revision size
+                    (new delta chains typically start at ratio 2.00)
+       lindist      linear distance from base revision in delta chain to end
+                    of this revision
+       extradist    total size of revisions not part of this delta chain from
+                    base of delta chain to end of this revision; a measurement
+                    of how much extra data we need to read/seek across to read
+                    the delta chain for this revision
+       extraratio   extradist divided by chainsize; another representation of
+                    how much unrelated data is needed to load this delta chain
+    """
+    r = cmdutil.openrevlog(repo, 'debugdeltachain', file_, opts)
+    index = r.index
+    generaldelta = r.version & revlog.REVLOGGENERALDELTA
+
+    def revinfo(rev):
+        iterrev = rev
+        e = index[iterrev]
+        chain = []
+        compsize = e[1]
+        uncompsize = e[2]
+        chainsize = 0
+
+        if generaldelta:
+            if e[3] == e[5]:
+                deltatype = 'p1'
+            elif e[3] == e[6]:
+                deltatype = 'p2'
+            elif e[3] == rev - 1:
+                deltatype = 'prev'
+            elif e[3] == rev:
+                deltatype = 'base'
+            else:
+                deltatype = 'other'
+        else:
+            if e[3] == rev:
+                deltatype = 'base'
+            else:
+                deltatype = 'prev'
+
+        while iterrev != e[3]:
+            chain.append(iterrev)
+            chainsize += e[1]
+            if generaldelta:
+                iterrev = e[3]
+            else:
+                iterrev -= 1
+            e = index[iterrev]
+        else:
+            chainsize += e[1]
+            chain.append(iterrev)
+
+        chain.reverse()
+        return compsize, uncompsize, deltatype, chain, chainsize
+
+    fm = ui.formatter('debugdeltachain', opts)
+
+    fm.plain('    rev  chain# chainlen     prev   delta       '
+             'size    rawsize  chainsize     ratio   lindist extradist '
+             'extraratio\n')
+
+    chainbases = {}
+    for rev in r:
+        comp, uncomp, deltatype, chain, chainsize = revinfo(rev)
+        chainbase = chain[0]
+        chainid = chainbases.setdefault(chainbase, len(chainbases) + 1)
+        basestart = r.start(chainbase)
+        revstart = r.start(rev)
+        lineardist = revstart + comp - basestart
+        extradist = lineardist - chainsize
+        try:
+            prevrev = chain[-2]
+        except IndexError:
+            prevrev = -1
+
+        chainratio = float(chainsize) / float(uncomp)
+        extraratio = float(extradist) / float(chainsize)
+
+        fm.startitem()
+        fm.write('rev chainid chainlen prevrev deltatype compsize '
+                 'uncompsize chainsize chainratio lindist extradist '
+                 'extraratio',
+                 '%7d %7d %8d %8d %7s %10d %10d %10d %9.5f %9d %9d %10.5f\n',
+                 rev, chainid, len(chain), prevrev, deltatype, comp,
+                 uncomp, chainsize, chainratio, lineardist, extradist,
+                 extraratio,
+                 rev=rev, chainid=chainid, chainlen=len(chain),
+                 prevrev=prevrev, deltatype=deltatype, compsize=comp,
+                 uncompsize=uncomp, chainsize=chainsize,
+                 chainratio=chainratio, lindist=lineardist,
+                 extradist=extradist, extraratio=extraratio)
+
+    fm.end()
+
 @command('debuginstall', [], '', norepo=True)
 def debuginstall(ui):
     '''test Mercurial installation
 
     Returns 0 on success.
     '''
 
     def writetemp(contents):
diff --git a/tests/test-completion.t b/tests/test-completion.t
--- a/tests/test-completion.t
+++ b/tests/test-completion.t
@@ -75,16 +75,17 @@  Show debug commands if there are no othe
   debugcheckstate
   debugcommands
   debugcomplete
   debugconfig
   debugcreatestreamclonebundle
   debugdag
   debugdata
   debugdate
+  debugdeltachain
   debugdirstate
   debugdiscovery
   debugextensions
   debugfileset
   debugfsinfo
   debuggetbundle
   debugignore
   debugindex
@@ -238,16 +239,17 @@  Show all commands + options
   debugbundle: all
   debugcheckstate: 
   debugcommands: 
   debugcomplete: options
   debugcreatestreamclonebundle: 
   debugdag: tags, branches, dots, spaces
   debugdata: changelog, manifest, dir
   debugdate: extended
+  debugdeltachain: changelog, manifest, dir, template
   debugdirstate: nodates, datesort
   debugdiscovery: old, nonheads, ssh, remotecmd, insecure
   debugextensions: template
   debugfileset: rev
   debugfsinfo: 
   debuggetbundle: head, common, type
   debugignore: 
   debugindex: changelog, manifest, dir, format
diff --git a/tests/test-debugcommands.t b/tests/test-debugcommands.t
--- a/tests/test-debugcommands.t
+++ b/tests/test-debugcommands.t
@@ -39,16 +39,42 @@  Test debugindex, with and without the --
        0         0       3   ....       0 b789fdd96dc2f3bd229c1dd8eedf0fc60e2b68e3 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000 (re)
   $ hg debugindex -f 1 a
      rev flag   offset   length     size  .....   link     p1     p2       nodeid (re)
        0 0000        0        3        2   ....      0     -1     -1 b789fdd96dc2 (re)
   $ hg --debug debugindex -f 1 a
      rev flag   offset   length     size  .....   link     p1     p2                                   nodeid (re)
        0 0000        0        3        2   ....      0     -1     -1 b789fdd96dc2f3bd229c1dd8eedf0fc60e2b68e3 (re)
 
+debugdeltachain basic output
+
+  $ hg debugdeltachain -m
+      rev  chain# chainlen     prev   delta       size    rawsize  chainsize     ratio   lindist extradist extraratio
+        0       1        1       -1    base         44         43         44   1.02326        44         0    0.00000
+
+  $ hg debugdeltachain -m -T '{rev} {chainid} {chainlen}\n'
+  0 1 1
+
+  $ hg debugdeltachain -m -Tjson
+  [
+   {
+    "chainid": 1,
+    "chainlen": 1,
+    "chainratio": 1.02325581395,
+    "chainsize": 44,
+    "compsize": 44,
+    "deltatype": "base",
+    "extradist": 0,
+    "extraratio": 0.0,
+    "lindist": 44,
+    "prevrev": -1,
+    "rev": 0,
+    "uncompsize": 43
+   }
+  ]
 
 Test max chain len
   $ cat >> $HGRCPATH << EOF
   > [format]
   > maxchainlen=4
   > EOF
 
   $ printf "This test checks if maxchainlen config value is respected also it can serve as basic test for debugrevlog -d <file>.\n" >> a
diff --git a/tests/test-help.t b/tests/test-help.t
--- a/tests/test-help.t
+++ b/tests/test-help.t
@@ -807,16 +807,18 @@  Test list of internal help commands
    debugcomplete
                  returns the completion list associated with the given command
    debugcreatestreamclonebundle
                  create a stream clone bundle file
    debugdag      format the changelog or an index DAG as a concise textual
                  description
    debugdata     dump the contents of a data file revision
    debugdate     parse and display a date
+   debugdeltachain
+                 dump information about delta chains in a revlog
    debugdirstate
                  show the contents of the current dirstate
    debugdiscovery
                  runs the changeset discovery protocol in isolation
    debugextensions
                  show information about active extensions
    debugfileset  parse and apply a fileset specification
    debugfsinfo   show information detected about current filesystem