Patchwork [4,of,5,v4] revlog: REVIDX_ISLARGEFILE flag

login
register
mail settings
Submitter Remi Chaintron
Date Nov. 23, 2016, 5:39 p.m.
Message ID <75ee4746c198f039a394.1479922779@iphonesleepsort.dhcp.thefacebook.com>
Download mbox | patch
Permalink /patch/17727/
State Changes Requested
Headers show

Comments

Remi Chaintron - Nov. 23, 2016, 5:39 p.m.
# HG changeset patch
# User Remi Chaintron <remi@fb.com>
# Date 1479922644 0
#      Wed Nov 23 17:37:24 2016 +0000
# Branch stable
# Node ID 75ee4746c198f039a39400e855e9335afc34f1dd
# Parent  da91f91e979d6bf807912e956cf2f29573ede56f
revlog: REVIDX_ISLARGEFILE flag

Add the REVIDX_ISLARGEFILE flag for the `lfs` extension to interact with
revisions by registering transforms in the flagprocessor.
Pierre-Yves David - Nov. 29, 2016, 6:59 a.m.
On 11/23/2016 06:39 PM, Remi Chaintron wrote:
> # HG changeset patch
> # User Remi Chaintron <remi@fb.com>
> # Date 1479922644 0
> #      Wed Nov 23 17:37:24 2016 +0000
> # Branch stable
> # Node ID 75ee4746c198f039a39400e855e9335afc34f1dd
> # Parent  da91f91e979d6bf807912e956cf2f29573ede56f
> revlog: REVIDX_ISLARGEFILE flag
>
> Add the REVIDX_ISLARGEFILE flag for the `lfs` extension to interact with
> revisions by registering transforms in the flagprocessor.

small naming question: Should we actually call this 'lfs'/'LARGEFILE'. 
Of courses. the extension using it is intended for large file. However, 
the core semantic of this 'flag' seems to be "the content is stored 
outsided of revlog". This external storage could be applied to any use 
cases (eg, fetching sensitive content from a secure server on update).

What do you think about adapting the name to reflect this?


> diff --git a/mercurial/help/internals/revlogs.txt b/mercurial/help/internals/revlogs.txt
> --- a/mercurial/help/internals/revlogs.txt
> +++ b/mercurial/help/internals/revlogs.txt
> @@ -90,6 +90,7 @@
>  6-7 (2 bytes)
>     Bit flags impacting revision behavior. The following bit offsets define:
>     0: 'censor' extension flag.
> +   1: 'lfs' extension flag.
>  8-11 (4 bytes)
>     Compressed length of revision data / chunk as stored in revlog.
>  12-15 (4 bytes)
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -54,11 +54,13 @@
>
>  # revlog index flags
>  REVIDX_ISCENSORED = (1 << 15) # revision has censor metadata, must be verified
> +REVIDX_ISLARGEFILE = (1 << 14)
>  REVIDX_DEFAULT_FLAGS = 0
> -REVIDX_KNOWN_FLAGS = REVIDX_ISCENSORED
> +REVIDX_KNOWN_FLAGS = REVIDX_ISCENSORED | REVIDX_ISLARGEFILE
>  # stable order in which flags need to be processed by the flagprocessor
>  REVIDX_FLAGS_PROCESSING_ORDER = [
>      REVIDX_ISCENSORED,
> +    REVIDX_ISLARGEFILE,
>  ]
>
>  # max size of revlog with inline data
> @@ -1773,6 +1775,10 @@
>          """Check if a file revision is censored."""
>          return False
>
> +    def islargefile(self, rev):
> +        """Check if a file revision is flagged as large."""
> +        return self.flags(rev) & REVIDX_ISLARGEFILE
> +
>      def _peek_iscensored(self, baserev, delta, flush):
>          """Quickly check if a delta produces a censored revision."""
>          return False
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Rémi Chaintron - Nov. 29, 2016, 5:02 p.m.
On Tue, 29 Nov 2016 at 06:59 Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 11/23/2016 06:39 PM, Remi Chaintron wrote:
> > # HG changeset patch
> > # User Remi Chaintron <remi@fb.com>
> > # Date 1479922644 0
> > #      Wed Nov 23 17:37:24 2016 +0000
> > # Branch stable
> > # Node ID 75ee4746c198f039a39400e855e9335afc34f1dd
> > # Parent  da91f91e979d6bf807912e956cf2f29573ede56f
> > revlog: REVIDX_ISLARGEFILE flag
> >
> > Add the REVIDX_ISLARGEFILE flag for the `lfs` extension to interact with
> > revisions by registering transforms in the flagprocessor.
>
> small naming question: Should we actually call this 'lfs'/'LARGEFILE'.
> Of courses. the extension using it is intended for large file. However,
> the core semantic of this 'flag' seems to be "the content is stored
> outsided of revlog". This external storage could be applied to any use
> cases (eg, fetching sensitive content from a secure server on update).
>
> What do you think about adapting the name to reflect this?
>

 That's a good point, and it depends on what we want to do with these
flags. My understanding was that we wanted them to represent extension
flags (at least a few of them).

One aspect to consider is that if we were to rename this flag to something
more generic (such as "REVIDX_EXTERNAL_STORAGE" for example), we might end
up in a situation where this is used by different extensions with different
behaviors, that might then become incompatible.
One solution moving forward might be to allow extensions to register
several transforms on a single flag, but this will make handling
non-commutative operations more doable. This might be something worth
iterating over, though.

One aspect Augie and I discussed was to keep a few placeholder flags for
home brewed extensions to register transforms on.
Pierre-Yves David - Dec. 3, 2016, 2:51 a.m.
On 11/29/2016 06:02 PM, Rémi Chaintron wrote:
> On Tue, 29 Nov 2016 at 06:59 Pierre-Yves David <
> pierre-yves.david@ens-lyon.org> wrote:
>
>>
>>
>> On 11/23/2016 06:39 PM, Remi Chaintron wrote:
>>> # HG changeset patch
>>> # User Remi Chaintron <remi@fb.com>
>>> # Date 1479922644 0
>>> #      Wed Nov 23 17:37:24 2016 +0000
>>> # Branch stable
>>> # Node ID 75ee4746c198f039a39400e855e9335afc34f1dd
>>> # Parent  da91f91e979d6bf807912e956cf2f29573ede56f
>>> revlog: REVIDX_ISLARGEFILE flag
>>>
>>> Add the REVIDX_ISLARGEFILE flag for the `lfs` extension to interact with
>>> revisions by registering transforms in the flagprocessor.
>>
>> small naming question: Should we actually call this 'lfs'/'LARGEFILE'.
>> Of courses. the extension using it is intended for large file. However,
>> the core semantic of this 'flag' seems to be "the content is stored
>> outsided of revlog". This external storage could be applied to any use
>> cases (eg, fetching sensitive content from a secure server on update).
>>
>> What do you think about adapting the name to reflect this?
>>
>
>  That's a good point, and it depends on what we want to do with these
> flags. My understanding was that we wanted them to represent extension
> flags (at least a few of them).

Extensions are more volatile than usecase/behavior. For example you are 
introducing a LFS extensions to handle largefile, while we already had 
an extension to handle largefile. In addition, in a couple of year the 
LFS protocol might be extinct and someone else will use create a new 
extension to achieve the same effect using a node-rail.elm protocol on 
the uber-cloud platefor. Having these people able to reuse that flag 
would be great.

> One aspect to consider is that if we were to rename this flag to something
> more generic (such as "REVIDX_EXTERNAL_STORAGE" for example), we might end
> up in a situation where this is used by different extensions with different
> behaviors, that might then become incompatible.

So a flag should result in a similar class of behavior. In our case this 
is "Fetch the content from outside the revlog file".

> One solution moving forward might be to allow extensions to register
> several transforms on a single flag, but this will make handling
> non-commutative operations more doable. This might be something worth
> iterating over, though.

 From the top of my head, a simple idea here would be to have a 
'protocol key' defined as part of a basic format to be expected from the 
flag usage. In our case we would have a 'lfs' protocol for the 
EXT_STORAGE flag. The lfs extensions registed an handler for EXT_STORAGE 
and "pass" if the protocol is not 'lfs' (or anything else it handles). 
Extensions collaborate so that the one who can handle it get to do it, 
but only one 'protocol'/extension gets into play. There is not 
commutative operation involved.

> One aspect Augie and I discussed was to keep a few placeholder flags for
> home brewed extensions to register transforms on.

Do you have some specific example in mind, or just having some USR1-USR4 
reserve for in-house usage?

Cheers,
Augie Fackler - Dec. 3, 2016, 5:49 a.m.
> On Dec 2, 2016, at 9:51 PM, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
> 
>> One aspect Augie and I discussed was to keep a few placeholder flags for
>> home brewed extensions to register transforms on.
> 
> Do you have some specific example in mind, or just having some USR1-USR4 reserve for in-house usage?

It was more the USR1-USR4 idea, but I’m still not sure if it’s a good one. It feels risky on some level, but I can’t get beyond a bad gut feeling about it.

Patch

diff --git a/mercurial/help/internals/revlogs.txt b/mercurial/help/internals/revlogs.txt
--- a/mercurial/help/internals/revlogs.txt
+++ b/mercurial/help/internals/revlogs.txt
@@ -90,6 +90,7 @@ 
 6-7 (2 bytes)
    Bit flags impacting revision behavior. The following bit offsets define:
    0: 'censor' extension flag.
+   1: 'lfs' extension flag.
 8-11 (4 bytes)
    Compressed length of revision data / chunk as stored in revlog.
 12-15 (4 bytes)
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -54,11 +54,13 @@ 
 
 # revlog index flags
 REVIDX_ISCENSORED = (1 << 15) # revision has censor metadata, must be verified
+REVIDX_ISLARGEFILE = (1 << 14)
 REVIDX_DEFAULT_FLAGS = 0
-REVIDX_KNOWN_FLAGS = REVIDX_ISCENSORED
+REVIDX_KNOWN_FLAGS = REVIDX_ISCENSORED | REVIDX_ISLARGEFILE
 # stable order in which flags need to be processed by the flagprocessor
 REVIDX_FLAGS_PROCESSING_ORDER = [
     REVIDX_ISCENSORED,
+    REVIDX_ISLARGEFILE,
 ]
 
 # max size of revlog with inline data
@@ -1773,6 +1775,10 @@ 
         """Check if a file revision is censored."""
         return False
 
+    def islargefile(self, rev):
+        """Check if a file revision is flagged as large."""
+        return self.flags(rev) & REVIDX_ISLARGEFILE
+
     def _peek_iscensored(self, baserev, delta, flush):
         """Quickly check if a delta produces a censored revision."""
         return False