Patchwork [1,of,4] revlog: add a callback "tracking" duplicate node addition

Submitter Boris Feld
Date Sept. 27, 2018, 4:49 p.m.
Message ID <1bed338fee8612ca502b.1538066989@localhost.localdomain>
Permalink /patch/35136/
State Accepted

Comments

Boris Feld - Sept. 27, 2018, 4:49 p.m.
# HG changeset patch
# User Boris Feld <boris.feld@octobus.net>
# Date 1537383767 -7200
#      Wed Sep 19 21:02:47 2018 +0200
# Node ID 1bed338fee8612ca502b2ef462c8cd7a59efe0aa
# Parent  ddca38941b2b80124220646554bbc2a0af1aff21
# EXP-Topic revlog-duplicates
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 1bed338fee86
revlog: add a callback "tracking" duplicate node addition

If a changegroup contains nodes already added to the repository, they will be
skipped. Skipping them is the right behavior (we don't need to store things
twice), but it can hide some information from the code doing the unbundle (e.g.
shelve looking for the tip of the bundle).

The first step to improve this situation is to add a low-level callback. We do
not need this tracking on all revlogs, so the actual tracking will be added in
the next changeset.
Gregory Szorc - Sept. 27, 2018, 8:53 p.m.
On Thu, Sep 27, 2018 at 9:55 AM Boris Feld <boris.feld@octobus.net> wrote:

> # HG changeset patch
> # User Boris Feld <boris.feld@octobus.net>
> # Date 1537383767 -7200
> #      Wed Sep 19 21:02:47 2018 +0200
> # Node ID 1bed338fee8612ca502b2ef462c8cd7a59efe0aa
> # Parent  ddca38941b2b80124220646554bbc2a0af1aff21
> # EXP-Topic revlog-duplicates
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 1bed338fee86
> revlog: add a callback "tracking" duplicate node addition
>

I would prefer to handle this by passing in a callback function or by
returning the necessary data from somewhere. But we'll cross that bridge
when we divorce the changelog class from revlog in the future.

Queued, thanks.


>
> If a changegroup contains nodes already added to the repository, they will be
> skipped. Skipping them is the right behavior (we don't need to store things
> twice), but it can hide some information from the code doing the unbundle (e.g.
> shelve looking for the tip of the bundle).
>
> The first step to improve this situation is to add a low-level callback. We do
> not need this tracking on all revlogs, so the actual tracking will be added in
> the next changeset.
>
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -1797,6 +1797,10 @@ class revlog(object):
>          tr.replace(self.indexfile, trindex * self._io.size)
>          self._chunkclear()
>
> +    def _nodeduplicatecallback(self, transaction, node):
> +        """called when trying to add a node already stored.
> +        """
> +
>      def addrevision(self, text, transaction, link, p1, p2, cachedelta=None,
>                      node=None, flags=REVIDX_DEFAULT_FLAGS, deltacomputer=None):
>          """add a revision to the log
> @@ -2078,6 +2082,7 @@ class revlog(object):
>                  nodes.append(node)
>
>                  if node in self.nodemap:
> +                    self._nodeduplicatecallback(transaction, node)
>                      # this can happen if two branches make the same change
>                      continue
>

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -1797,6 +1797,10 @@  class revlog(object):
         tr.replace(self.indexfile, trindex * self._io.size)
         self._chunkclear()
 
+    def _nodeduplicatecallback(self, transaction, node):
+        """called when trying to add a node already stored.
+        """
+
     def addrevision(self, text, transaction, link, p1, p2, cachedelta=None,
                     node=None, flags=REVIDX_DEFAULT_FLAGS, deltacomputer=None):
         """add a revision to the log
@@ -2078,6 +2082,7 @@  class revlog(object):
                 nodes.append(node)
 
                 if node in self.nodemap:
+                    self._nodeduplicatecallback(transaction, node)
                     # this can happen if two branches make the same change
                     continue
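
The hook added above is a no-op by default, so it only becomes useful once a subclass overrides it. A minimal sketch of that pattern follows, using a toy stand-in rather than Mercurial's real classes: `ToyRevlog`, `TrackingRevlog`, the simplified `addgroup` signature, and the `duplicates` attribute are all hypothetical names for illustration. Only `_nodeduplicatecallback` and `nodemap` are taken from the patch, and the actual tracking added in the next changeset may look different.

```python
class ToyRevlog:
    """Hypothetical stand-in for a revlog; not Mercurial's real class."""

    def __init__(self):
        # Maps node -> revision number, loosely mirroring revlog.nodemap.
        self.nodemap = {}

    def _nodeduplicatecallback(self, transaction, node):
        """Called when trying to add a node already stored (no-op here)."""

    def addgroup(self, incoming, transaction):
        # Simplified loop modeled on the hunk above: every node is
        # reported back, but already-stored nodes are not stored again.
        nodes = []
        for node in incoming:
            nodes.append(node)
            if node in self.nodemap:
                self._nodeduplicatecallback(transaction, node)
                continue
            self.nodemap[node] = len(self.nodemap)
        return nodes


class TrackingRevlog(ToyRevlog):
    """Hypothetical subclass that records the duplicates it skips."""

    def __init__(self):
        super().__init__()
        self.duplicates = []

    def _nodeduplicatecallback(self, transaction, node):
        self.duplicates.append(node)


log = TrackingRevlog()
log.addgroup([b"a", b"b"], transaction=None)
seen = log.addgroup([b"b", b"c"], transaction=None)
print(seen, log.duplicates)
```

With this shape, a caller such as the unbundle code could inspect `log.duplicates` afterwards to learn which nodes were skipped, while revlogs that do not need the tracking keep the cheap no-op default.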