Patchwork [V2] journal: new experimental extension

Submitter Martijn Pieters
Date June 20, 2016, 4:46 p.m.
Message ID <ce6a007db814d426159a.1466441161@mjpieters-mbp>
Permalink /patch/15555/
State Superseded

Comments

Martijn Pieters - June 20, 2016, 4:46 p.m.
# HG changeset patch
# User Martijn Pieters <mjpieters@fb.com>
# Date 1466440095 -3600
#      Mon Jun 20 17:28:15 2016 +0100
# Node ID ce6a007db814d426159adc01b999b22c1a2ced05
# Parent  fcaf20175b1b05aa753e1b9f65f10d35a86224df
journal: new experimental extension

Records bookmark locations and shows you where bookmarks were located in the
past.

This is the first in a planned series of locations to be recorded; a future
patch will add working copy (dirstate) tracking, and remote bookmarks will be
supported as well, so the journal storage format should be fairly generic to
support those use-cases.
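
As an illustration of the generic storage format mentioned above, here is a minimal standalone sketch of the on-disk layout that `record()` and `__iter__()` in the patch below appear to use: a version string, then NUL-terminated entries whose fields are separated by newlines. It is written for Python 3 and does not use any Mercurial APIs; the helper names `encode_entry`, `decode_journal` and `filtered`, and the UTF-8 encoding, are assumptions made for this example only.

```python
STORAGE_VERSION = b"0"  # mirrors storage_version = 0 in the patch


def encode_entry(date, user, command, namespace, name, oldhashes, newhashes):
    """Serialise one entry: newline-separated fields, NUL-terminated."""
    fields = (
        date, user, command, namespace, name,
        ",".join(oldhashes), ",".join(newhashes),
    )
    # UTF-8 is an assumption of this sketch, not something the patch states.
    return "\n".join(fields).encode("utf-8") + b"\0"


def decode_journal(raw):
    """Parse a whole journal file and return its entries, newest first."""
    version, _sep, body = raw.partition(b"\0")
    if version != STORAGE_VERSION:
        raise ValueError("unknown journal file version %r" % version)
    entries = []
    for chunk in reversed(body.split(b"\0")):
        if not chunk:
            continue  # the trailing NUL leaves one empty chunk
        parts = chunk.decode("utf-8").split("\n")
        timestamp, tz = parts[0].split()
        entries.append({
            "date": (float(timestamp), int(tz)),
            "user": parts[1],
            "command": parts[2],
            "namespace": parts[3],
            "name": parts[4],
            "oldhashes": parts[5].split(","),
            "newhashes": parts[6].split(","),
        })
    return entries


def filtered(entries, namespace=None, name=None):
    """Keep entries matching the namespace and the name, when given."""
    return [
        e for e in entries
        if (namespace is None or e["namespace"] == namespace)
        and (name is None or e["name"] == name)
    ]
```

Because the namespace and name are stored as opaque strings, other kinds of locations (working copy, remote bookmarks) could in principle be added without changing the layout, which seems to be the intent of keeping the format generic.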
Sean Farley - June 20, 2016, 9:54 p.m.
I've brought this up at the previous sprints but if we're going to be
making a generic storage format then is it worth it to combine this with
the blackbox format?

It seems to me that they will both have to solve the same problems
(locking, generic format, etc.). If we put some thought into this now,
we could perhaps get some server-side features and maybe extension
points.

My thinking is that by having a storage layer, extensions could do
queries like:

select * where type = 'remotebookmark'

and even add that data themselves, so that core doesn't have to have a
hard-coded list.

Perhaps it's too early for this, though.
Martijn Pieters - June 21, 2016, 12:34 p.m.
On 20 Jun 2016, at 22:54, Sean Farley <sean@farley.io> wrote:
> I've brought this up at the previous sprints but if we're going to be
> making a generic storage format then is it worth it to combine this with
> the blackbox format?
> 
> It seems to me that they will both have to solve the same problems
> (locking, generic format, etc.). If we put some thought into this now,
> we could perhaps get some server-side features and maybe extension
> points.
> 
> My thinking is that by having a storage layer, extensions could do
> queries like:
> 
> select * where type = 'remotebookmark'
> 
> and even add that data themselves, so that core doesn't have to have a
> hard-coded list.
> 
> Perhaps it's too early for this, though.

Because blackbox serves a different use case (post-mortem analysis) and records data in a very different manner (via ui.log hooks), I did not see an easy path to consolidating the two. journal needs different data (specifically, the old and new hashes) as well as the ability to filter quickly on a specific name, and blackbox can currently support neither without a major overhaul.

I could perhaps have used `ui.log()` calls to record the information, but blackbox doesn't currently lock, and because dirstate writing is done during unlock, adding locking brings with it the requirement to *not* lock when needed, complicating the API a little more. Listing journal entries would also have to take rotated files into account, and would require more scanning and parsing to pull out just the entries journal is interested in. That all felt like bending blackbox rather far away from its original purpose as a fire-and-forget log recorder.

That said, I see a future where Mercurial makes heavy use of SQLite to track many different kinds of extra information about your repository and working copy. In that scenario it would make sense for both blackbox and journal to be fed by the same tables (assuming that the previous locations of working copies and bookmarks can be queried easily).

Martijn
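
To make the SQLite idea discussed in this thread more concrete, here is a minimal sketch using Python's standard `sqlite3` module. It is not part of the patch and does not reflect any existing Mercurial storage layer: the table name, the column names, and the helpers `open_journal`, `record` and `entries` are all hypothetical. It only shows how a query in the spirit of `select * where type = 'remotebookmark'` could work against a shared table.

```python
import sqlite3

# Hypothetical schema: one row per recorded location change.
SCHEMA = """
CREATE TABLE IF NOT EXISTS journal (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp REAL    NOT NULL,
    tz        INTEGER NOT NULL,
    user      TEXT    NOT NULL,
    command   TEXT    NOT NULL,
    type      TEXT    NOT NULL,
    name      TEXT    NOT NULL,
    oldhashes TEXT    NOT NULL,
    newhashes TEXT    NOT NULL
);
CREATE INDEX IF NOT EXISTS journal_type_name ON journal (type, name);
"""


def open_journal(path=":memory:"):
    """Open (and create if needed) the hypothetical journal database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record(conn, timestamp, tz, user, command, type_, name,
           oldhashes, newhashes):
    """Insert one entry; hash lists are stored as comma-separated hex."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO journal (timestamp, tz, user, command, type, name,"
            " oldhashes, newhashes) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (timestamp, tz, user, command, type_, name,
             ",".join(oldhashes), ",".join(newhashes)),
        )


def entries(conn, type_=None, name=None):
    """Return (type, name, oldhashes, newhashes, command) rows, newest first."""
    clauses, params = [], []
    if type_ is not None:
        clauses.append("type = ?")
        params.append(type_)
    if name is not None:
        clauses.append("name = ?")
        params.append(name)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    sql = ("SELECT type, name, oldhashes, newhashes, command FROM journal"
           + where + " ORDER BY id DESC")
    return conn.execute(sql, params).fetchall()
```

With such a table, an extension could register a new `type` value simply by inserting rows, so core would not need a hard-coded list of entry types. Locking and schema migration are left out of this sketch.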

Patch

diff --git a/hgext/journal.py b/hgext/journal.py
new file mode 100644
--- /dev/null
+++ b/hgext/journal.py
@@ -0,0 +1,245 @@ 
+# journal.py
+#
+# Copyright 2014-2016 Facebook, Inc.
+#
+# This software may be used and distributed according to the terms of the
+# GNU General Public License version 2 or any later version.
+"""Track previous positions of bookmarks (EXPERIMENTAL)
+
+This extension adds a new command: `hg journal`, which shows you where
+bookmarks were previously located.
+
+"""
+
+from __future__ import absolute_import
+
+import os
+
+from mercurial.i18n import _
+
+from mercurial import (
+    bookmarks,
+    cmdutil,
+    commands,
+    dispatch,
+    error,
+    extensions,
+    node,
+    scmutil,
+    util,
+)
+
+cmdtable = {}
+command = cmdutil.command(cmdtable)
+
+# Note for extension authors: ONLY specify testedwith = 'internal' for
+# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
+# be specifying the version(s) of Mercurial they are tested with, or
+# leave the attribute unspecified.
+testedwith = 'internal'
+
+# storage format version; increment when the format changes
+storage_version = 0
+
+# namespaces
+bookmarktype = 'bookmark'
+
+# Journal recording, register hooks and storage object
+def extsetup(ui):
+    extensions.wrapfunction(dispatch, 'runcommand', runcommand)
+    extensions.wrapfunction(bookmarks.bmstore, '_write', recordbookmarks)
+
+def reposetup(ui, repo):
+    if repo.local():
+        repo.journal = journalstorage(repo)
+
+def runcommand(orig, lui, repo, cmd, fullargs, *args):
+    """Track the command line options for recording in the journal"""
+    journalstorage.recordcommand(*fullargs)
+    return orig(lui, repo, cmd, fullargs, *args)
+
+def recordbookmarks(orig, store, fp):
+    """Records all bookmark changes in the journal."""
+    repo = store._repo
+    if util.safehasattr(repo, 'journal'):
+        oldmarks = bookmarks.bmstore(repo)
+        for mark, value in store.iteritems():
+            oldvalue = oldmarks.get(mark, node.nullid)
+            if value != oldvalue:
+                repo.journal.record(bookmarktype, mark, oldvalue, value)
+    return orig(store, fp)
+
+class journalstorage(object):
+    _currentcommand = ()
+
+    def __init__(self, repo):
+        self.repo = repo
+        self.user = util.getuser()
+        self.vfs = repo.vfs
+
+    # track the current command for recording in journal entries
+    @property
+    def command(self):
+        commandstr = ' '.join(
+            map(util.shellquote, journalstorage._currentcommand))
+        if '\n' in commandstr:
+            # truncate multi-line commands
+            commandstr = commandstr.partition('\n')[0] + ' ...'
+        return commandstr
+
+    @classmethod
+    def recordcommand(cls, *fullargs):
+        """Set the current hg arguments, stored with recorded entries"""
+        # Set the current command on the class because we may have started
+        # with a non-local repo (cloning for example).
+        cls._currentcommand = fullargs
+
+    def record(self, namespace, name, oldhashes, newhashes):
+        """Record a new journal entry
+
+        * namespace: an opaque string; this can be used to filter on the type
+          of recorded entries.
+        * name: the name defining this entry; for bookmarks, this is the
+          bookmark name. Can be filtered on when retrieving entries.
+        * oldhashes and newhashes: each a single binary hash, or a list of
+          binary hashes. These represent the old and new position of the named
+          item.
+
+        """
+        if not isinstance(oldhashes, list):
+            oldhashes = [oldhashes]
+        if not isinstance(newhashes, list):
+            newhashes = [newhashes]
+
+        timestamp, tz = map(str, util.makedate())
+        date = ' '.join((timestamp, tz))
+        oldhashes = ','.join([node.hex(hash) for hash in oldhashes])
+        newhashes = ','.join([node.hex(hash) for hash in newhashes])
+        data = '\n'.join((
+            date, self.user, self.command, namespace, name, oldhashes,
+            newhashes))
+
+        with self.repo.wlock():
+            version = None
+            with self.vfs('journal', mode='a+b') as f:
+                f.seek(0, os.SEEK_SET)
+                version = f.read(5).partition('\0')[0]
+                if version and version != str(storage_version):
+                    # different version of the storage.  Since there have
+                    # been no previous versions, this can only mean the
+                    # file is corrupt; warn and skip recording the entry.
+                    self.repo.ui.warn(
+                        _("unknown journal file version '%s'\n") % version)
+                    return
+                if not version:
+                    # empty file, write version first
+                    f.write(str(storage_version) + '\0')
+                f.seek(0, os.SEEK_END)
+                f.write(data + '\0')
+
+    def filtered(self, namespace=None, name=None):
+        """Yield all journal entries matching the given namespace and name
+
+        Both the namespace and the name are optional; if neither is given all
+        entries in the journal are produced.
+
+        """
+        for entry in self:
+            entry_ns, entry_name = entry[3:5]
+            if namespace is not None and entry_ns != namespace:
+                continue
+            if name is not None and entry_name != name:
+                continue
+            yield entry
+
+    def __iter__(self):
+        if not self.vfs.exists('journal'):
+            return
+
+        with self.repo.wlock():
+            with self.vfs('journal') as f:
+                raw = f.read()
+
+        lines = raw.split('\0')
+        version = lines and lines[0]
+        if version != str(storage_version):
+            version = version or _('not available')
+            raise error.Abort(_("unknown journal file version '%s'") % version)
+
+        # Skip the first line, it's a version number. Reverse the rest.
+        lines = reversed(lines[1:])
+
+        for line in lines:
+            if not line:
+                continue
+            parts = tuple(line.split('\n'))
+            timestamp, tz = parts[0].split()
+            timestamp, tz = float(timestamp), int(tz)
+            oldhashes, newhashes = parts[-2:]
+            oldhashes = oldhashes.split(',')
+            newhashes = newhashes.split(',')
+            yield ((timestamp, tz),) + parts[1:-2] + (oldhashes, newhashes)
+
+# journal reading
+@command(
+    'journal', [
+        ('c', 'commits', None, _('show commit metadata')),
+    ] + commands.logopts, _('[OPTION]... [BOOKMARKNAME]'))
+def journal(ui, repo, *args, **opts):
+    """show the previous position of bookmarks
+
+    The journal is used to see the commits that bookmarks previously pointed
+    to. By default the previous locations for all bookmarks are shown.  Passing
+    a bookmark name will show all the previous positions of that bookmark.
+
+    By default hg journal only shows the commit hash and the command that was
+    running at that time. -v/--verbose will show the prior hash, the user, and
+    the time at which it happened.
+
+    Use -c/--commits to output log information on each commit hash.
+
+    `hg journal -T json` can be used to produce machine readable output.
+
+    """
+    bookmarkname = None
+    if args:
+        bookmarkname = args[0]
+
+    fm = ui.formatter('journal', opts)
+
+    if opts.get("template") != "json":
+        if bookmarkname is None:
+            name = _('all bookmarks')
+        else:
+            name = "'%s'" % bookmarkname
+        ui.status(_("Previous locations of %s:\n") % name)
+
+    entry = None
+    for entry in repo.journal.filtered(name=bookmarkname):
+        timestamp, user, command, namespace, name, oldhashes, newhashes = entry
+        newhashesstr = ','.join([hash[:12] for hash in newhashes])
+        oldhashesstr = ','.join([hash[:12] for hash in oldhashes])
+
+        fm.startitem()
+        fm.condwrite(ui.verbose, 'oldhashes', '%s -> ', oldhashesstr)
+        fm.write('newhashes', '%s', newhashesstr)
+        fm.condwrite(ui.verbose, 'user', ' %s', user.ljust(8))
+
+        timestring = util.datestr(timestamp, '%Y-%m-%d %H:%M %1%2')
+        fm.condwrite(ui.verbose, 'date', ' %s', timestring)
+        fm.write('command', '  %s\n', command)
+
+        if opts.get("commits"):
+            displayer = cmdutil.show_changeset(ui, repo, opts, buffered=False)
+            for hash in newhashes:
+                try:
+                    ctx = repo[hash]
+                    displayer.show(ctx)
+                except error.RepoLookupError as e:
+                    fm.write('repolookuperror', "%s\n\n", str(e))
+            displayer.close()
+
+    fm.end()
+
+    if entry is None:
+        ui.status(_("no recorded locations\n"))
diff --git a/tests/test-journal.t b/tests/test-journal.t
new file mode 100644
--- /dev/null
+++ b/tests/test-journal.t
@@ -0,0 +1,121 @@ 
+Tests for the journal extension; records bookmark locations.
+
+  $ cat >> $HGRCPATH << EOF
+  > [extensions]
+  > journal=
+  > EOF
+
+Setup repo
+
+  $ hg init repo
+  $ cd repo
+  $ echo a > a
+  $ hg commit -Aqm a
+  $ echo b > a
+  $ hg commit -Aqm b
+  $ hg up 0
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+Test empty journal
+
+  $ hg journal
+  Previous locations of all bookmarks:
+  no recorded locations
+  $ hg journal foo
+  Previous locations of 'foo':
+  no recorded locations
+
+Test that bookmarks are tracked
+
+  $ hg book -r tip bar
+  $ hg journal bar
+  Previous locations of 'bar':
+  1e6c11564562  book -r tip bar
+  $ hg book -f bar
+  $ hg journal bar
+  Previous locations of 'bar':
+  cb9a9f314b8b  book -f bar
+  1e6c11564562  book -r tip bar
+  $ hg up
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  updating bookmark bar
+  $ hg journal bar
+  Previous locations of 'bar':
+  1e6c11564562  up
+  cb9a9f314b8b  book -f bar
+  1e6c11564562  book -r tip bar
+
+Test that you can list all bookmarks as well as filter on them
+
+  $ hg book -r tip baz
+  $ hg journal
+  Previous locations of all bookmarks:
+  1e6c11564562  book -r tip baz
+  1e6c11564562  up
+  cb9a9f314b8b  book -f bar
+  1e6c11564562  book -r tip bar
+  $ hg journal baz
+  Previous locations of 'baz':
+  1e6c11564562  book -r tip baz
+  $ hg journal bar
+  Previous locations of 'bar':
+  1e6c11564562  up
+  cb9a9f314b8b  book -f bar
+  1e6c11564562  book -r tip bar
+  $ hg journal foo
+  Previous locations of 'foo':
+  no recorded locations
+
+Test that verbose and commit output work
+
+  $ hg journal --verbose
+  Previous locations of all bookmarks:
+  000000000000 -> 1e6c11564562 \w+ \d{4}-\d{2}-\d{2} \d{2}:\d{2} [+-]\d{4}  book -r tip baz (re)
+  cb9a9f314b8b -> 1e6c11564562 \w+ \d{4}-\d{2}-\d{2} \d{2}:\d{2} [+-]\d{4}  up (re)
+  1e6c11564562 -> cb9a9f314b8b \w+ \d{4}-\d{2}-\d{2} \d{2}:\d{2} [+-]\d{4}  book -f bar (re)
+  000000000000 -> 1e6c11564562 \w+ \d{4}-\d{2}-\d{2} \d{2}:\d{2} [+-]\d{4}  book -r tip bar (re)
+  $ hg journal --commit
+  Previous locations of all bookmarks:
+  1e6c11564562  book -r tip baz
+  changeset:   1:1e6c11564562
+  bookmark:    bar
+  bookmark:    baz
+  tag:         tip
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     b
+  
+  1e6c11564562  up
+  changeset:   1:1e6c11564562
+  bookmark:    bar
+  bookmark:    baz
+  tag:         tip
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     b
+  
+  cb9a9f314b8b  book -f bar
+  changeset:   0:cb9a9f314b8b
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     a
+  
+  1e6c11564562  book -r tip bar
+  changeset:   1:1e6c11564562
+  bookmark:    bar
+  bookmark:    baz
+  tag:         tip
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     b
+  
+
+Test for behaviour on unexpected storage version information
+
+  $ printf '42\0' > .hg/journal
+  $ hg journal
+  Previous locations of all bookmarks:
+  abort: unknown journal file version '42'
+  [255]
+  $ hg book -r tip doomed
+  unknown journal file version '42'