Patchwork [3,of,5,flagprocessor,v6] revlog: pass revlog flags to addrevision

login
register
mail settings
Submitter Remi Chaintron
Date Dec. 24, 2016, 4:36 p.m.
Message ID <b1d220e584e6fc861a40.1482597389@remi-mbp2>
Download mbox | patch
Permalink /patch/18024/
State Changes Requested
Delegated to: Pierre-Yves David
Headers show

Comments

Remi Chaintron - Dec. 24, 2016, 4:36 p.m.
# HG changeset patch
# User Remi Chaintron <remi@fb.com>
# Date 1482451731 18000
#      Thu Dec 22 19:08:51 2016 -0500
# Node ID b1d220e584e6fc861a40a5aeefa0c3b19ae09126
# Parent  d0476160913323da1dada49ae46e72d6a7c17d78
revlog: pass revlog flags to addrevision

Adding the ability to passing flags to addrevision instead of simply passing
default flags to _addrevision will allow extensions relying on flag transforms
to wrap around addrevision() in order to update revlog flags.

The first use case of this patch will be the lfs extension marking nodes as
stored externally when the contents are larger than the defined threshold.

One of the reasons leading to setting flags in addrevision() wrappers in the
flag processor design is that it allows to detect files larger than the 2GB
limit before the check is performed, which allows lfs to transform the contents
into metadata.
Pierre-Yves David - Dec. 30, 2016, 10:24 a.m.
On 12/24/2016 05:36 PM, Remi Chaintron wrote:
> # HG changeset patch
> # User Remi Chaintron <remi@fb.com>
> # Date 1482451731 18000
> #      Thu Dec 22 19:08:51 2016 -0500
> # Node ID b1d220e584e6fc861a40a5aeefa0c3b19ae09126
> # Parent  d0476160913323da1dada49ae46e72d6a7c17d78
> revlog: pass revlog flags to addrevision
>
> Adding the ability to passing flags to addrevision instead of simply passing
> default flags to _addrevision will allow extensions relying on flag transforms
> to wrap around addrevision() in order to update revlog flags.
>
> The first use case of this patch will be the lfs extension marking nodes as
> stored externally when the contents are larger than the defined threshold.
>
> One of the reasons leading to setting flags in addrevision() wrappers in the
> flag processor design is that it allows to detect files larger than the 2GB
> limit before the check is performed, which allows lfs to transform the contents
> into metadata.

As we discussed over voice, that approach seems fine. However, there is 
a strange, apparently unrelated code-drop in this patch. See below.

> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -1235,11 +1235,6 @@
>          if rev is None:
>              rev = self.rev(node)
>
> -        # check rev flags
> -        if self.flags(rev) & ~REVIDX_KNOWN_FLAGS:
> -            raise RevlogError(_('incompatible revision flag %x') %
> -                              (self.flags(rev) & ~REVIDX_KNOWN_FLAGS))
> -

That seems like an related code-drop in 'revision(…)'. I don't think you 
intended to do that. Can you drop that from this patch (and, if needed, 
reintroduce it on an appropriate patches documenting why it is something 
we need to do)

>          chain, stopped = self._deltachain(rev, stoprev=cachedrev)
>          if stopped:
>              text = self._cache[2]
> @@ -1332,7 +1327,7 @@
>          self._chunkclear()
>
>      def addrevision(self, text, transaction, link, p1, p2, cachedelta=None,
> -                    node=None):
> +                    node=None, flags=REVIDX_DEFAULT_FLAGS):
>          """add a revision to the log
>
>          text - the revision data to add
> @@ -1343,6 +1338,7 @@
>          node - nodeid of revision; typically node is not specified, and it is
>              computed by default as hash(text, p1, p2), however subclasses might
>              use different hashing method (and override checkhash() in such case)
> +        flags - the known flags to set on the revision
>          """
>          if link == nullrev:
>              raise RevlogError(_("attempted to add linkrev -1 to %s")
> @@ -1363,8 +1359,7 @@
>          ifh = self.opener(self.indexfile, "a+", checkambig=self._checkambig)
>          try:
>              return self._addrevision(node, text, transaction, link, p1, p2,
> -                                     REVIDX_DEFAULT_FLAGS, cachedelta, ifh, dfh,
> -                                     raw=False)
> +                                     flags, cachedelta, ifh, dfh, raw=False)
>          finally:
>              if dfh:
>                  dfh.close()
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Rémi Chaintron - Dec. 30, 2016, 4:38 p.m.
You're right, this is a mistake on my part. The reasoning behind dropping
this code is that in my current implementation, processflags() handles
checking whether the flags are known (both for revision() and
_addrevision()) so I moved this snippet to processflags().
Still makes sense to not spread the refactor across two patches, will
update.

On Fri, 30 Dec 2016 at 05:25 Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 12/24/2016 05:36 PM, Remi Chaintron wrote:
> > # HG changeset patch
> > # User Remi Chaintron <remi@fb.com>
> > # Date 1482451731 18000
> > #      Thu Dec 22 19:08:51 2016 -0500
> > # Node ID b1d220e584e6fc861a40a5aeefa0c3b19ae09126
> > # Parent  d0476160913323da1dada49ae46e72d6a7c17d78
> > revlog: pass revlog flags to addrevision
> >
> > Adding the ability to passing flags to addrevision instead of simply
> passing
> > default flags to _addrevision will allow extensions relying on flag
> transforms
> > to wrap around addrevision() in order to update revlog flags.
> >
> > The first use case of this patch will be the lfs extension marking nodes
> as
> > stored externally when the contents are larger than the defined
> threshold.
> >
> > One of the reasons leading to setting flags in addrevision() wrappers in
> the
> > flag processor design is that it allows to detect files larger than the
> 2GB
> > limit before the check is performed, which allows lfs to transform the
> contents
> > into metadata.
>
> As we discussed over voice, that approach seems fine. However, there is
> a strange, apparently unrelated code-drop in this patch. See below.
>
> > diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> > --- a/mercurial/revlog.py
> > +++ b/mercurial/revlog.py
> > @@ -1235,11 +1235,6 @@
> >          if rev is None:
> >              rev = self.rev(node)
> >
> > -        # check rev flags
> > -        if self.flags(rev) & ~REVIDX_KNOWN_FLAGS:
> > -            raise RevlogError(_('incompatible revision flag %x') %
> > -                              (self.flags(rev) & ~REVIDX_KNOWN_FLAGS))
> > -
>
> That seems like an related code-drop in 'revision(…)'. I don't think you
> intended to do that. Can you drop that from this patch (and, if needed,
> reintroduce it on an appropriate patches documenting why it is something
> we need to do)
>
> >          chain, stopped = self._deltachain(rev, stoprev=cachedrev)
> >          if stopped:
> >              text = self._cache[2]
> > @@ -1332,7 +1327,7 @@
> >          self._chunkclear()
> >
> >      def addrevision(self, text, transaction, link, p1, p2,
> cachedelta=None,
> > -                    node=None):
> > +                    node=None, flags=REVIDX_DEFAULT_FLAGS):
> >          """add a revision to the log
> >
> >          text - the revision data to add
> > @@ -1343,6 +1338,7 @@
> >          node - nodeid of revision; typically node is not specified, and
> it is
> >              computed by default as hash(text, p1, p2), however
> subclasses might
> >              use different hashing method (and override checkhash() in
> such case)
> > +        flags - the known flags to set on the revision
> >          """
> >          if link == nullrev:
> >              raise RevlogError(_("attempted to add linkrev -1 to %s")
> > @@ -1363,8 +1359,7 @@
> >          ifh = self.opener(self.indexfile, "a+",
> checkambig=self._checkambig)
> >          try:
> >              return self._addrevision(node, text, transaction, link, p1,
> p2,
> > -                                     REVIDX_DEFAULT_FLAGS, cachedelta,
> ifh, dfh,
> > -                                     raw=False)
> > +                                     flags, cachedelta, ifh, dfh,
> raw=False)
> >          finally:
> >              if dfh:
> >                  dfh.close()
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel@mercurial-scm.org
> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> >
>
> --
> Pierre-Yves David
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Pierre-Yves David - Dec. 31, 2016, 11:46 a.m.
On 12/30/2016 05:38 PM, Rémi Chaintron wrote:
> You're right, this is a mistake on my part. The reasoning behind
> dropping this code is that in my current implementation, processflags()
> handles checking whether the flags are known (both for revision() and
> _addrevision()) so I moved this snippet to processflags().
> Still makes sense to not spread the refactor across two patches, will
> update.

The issue with not grouping this is that this check is effectively gone 
between the commit that remove it and the commit that adds it back. This 
means we no longer have a "Code is correct at all commit" property. This 
can prevent us to take the first part of the series without the second 
part (which might need rework) and this make potential backout (if we 
find a regression for example) much more harder for use since we have do 
more extensive checks of potential other commit entangled with the one 
we would backout.

Cheers,
Rémi Chaintron - Dec. 31, 2016, 11:47 a.m.
I'm convinced already :)
On Sat, 31 Dec 2016 at 06:46, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 12/30/2016 05:38 PM, Rémi Chaintron wrote:
> > You're right, this is a mistake on my part. The reasoning behind
> > dropping this code is that in my current implementation, processflags()
> > handles checking whether the flags are known (both for revision() and
> > _addrevision()) so I moved this snippet to processflags().
> > Still makes sense to not spread the refactor across two patches, will
> > update.
>
> The issue with not grouping this is that this check is effectively gone
> between the commit that remove it and the commit that adds it back. This
> means we no longer have a "Code is correct at all commit" property. This
> can prevent us to take the first part of the series without the second
> part (which might need rework) and this make potential backout (if we
> find a regression for example) much more harder for use since we have do
> more extensive checks of potential other commit entangled with the one
> we would backout.
>
> Cheers,
>
> --
> Pierre-Yves David
>

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py

--- a/mercurial/revlog.py

+++ b/mercurial/revlog.py

@@ -1235,11 +1235,6 @@ 

         if rev is None:
             rev = self.rev(node)
 
-        # check rev flags

-        if self.flags(rev) & ~REVIDX_KNOWN_FLAGS:

-            raise RevlogError(_('incompatible revision flag %x') %

-                              (self.flags(rev) & ~REVIDX_KNOWN_FLAGS))

-

         chain, stopped = self._deltachain(rev, stoprev=cachedrev)
         if stopped:
             text = self._cache[2]
@@ -1332,7 +1327,7 @@ 

         self._chunkclear()
 
     def addrevision(self, text, transaction, link, p1, p2, cachedelta=None,
-                    node=None):

+                    node=None, flags=REVIDX_DEFAULT_FLAGS):

         """add a revision to the log
 
         text - the revision data to add
@@ -1343,6 +1338,7 @@ 

         node - nodeid of revision; typically node is not specified, and it is
             computed by default as hash(text, p1, p2), however subclasses might
             use different hashing method (and override checkhash() in such case)
+        flags - the known flags to set on the revision

         """
         if link == nullrev:
             raise RevlogError(_("attempted to add linkrev -1 to %s")
@@ -1363,8 +1359,7 @@ 

         ifh = self.opener(self.indexfile, "a+", checkambig=self._checkambig)
         try:
             return self._addrevision(node, text, transaction, link, p1, p2,
-                                     REVIDX_DEFAULT_FLAGS, cachedelta, ifh, dfh,

-                                     raw=False)

+                                     flags, cachedelta, ifh, dfh, raw=False)

         finally:
             if dfh:
                 dfh.close()