Patchwork largefiles: ignore hidden changesets with 'verify --large --lfa'

login
register
mail settings
Submitter Matt Harbison
Date June 9, 2015, 12:52 a.m.
Message ID <7beb2d6c7bf755f83293.1433811171@Envy>
Download mbox | patch
Permalink /patch/9567/
State Accepted
Delegated to: Pierre-Yves David
Headers show

Comments

Matt Harbison - June 9, 2015, 12:52 a.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1433643018 14400
#      Sat Jun 06 22:10:18 2015 -0400
# Node ID 7beb2d6c7bf755f83293e9780b82a2632fc2918a
# Parent  378a8e700e02794e991d3521423a4f581b635666
largefiles: ignore hidden changesets with 'verify --large --lfa'

Previously, if there were any hidden changesets, the --lfa argument would cause
the command to abort with a hint about using --hidden when it tripped over a
hidden changeset.
Pierre-Yves David - June 9, 2015, 2:16 a.m.
On 06/08/2015 05:52 PM, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1433643018 14400
> #      Sat Jun 06 22:10:18 2015 -0400
> # Node ID 7beb2d6c7bf755f83293e9780b82a2632fc2918a
> # Parent  378a8e700e02794e991d3521423a4f581b635666
> largefiles: ignore hidden changesets with 'verify --large --lfa'
>
> Previously, if there were any hidden changesets, the --lfa argument would cause
> the command to abort with a hint about using --hidden when it tripped over a
> hidden changeset.

do you want operation on an unfiltered repository instead?

>
> diff --git a/hgext/largefiles/lfcommands.py b/hgext/largefiles/lfcommands.py
> --- a/hgext/largefiles/lfcommands.py
> +++ b/hgext/largefiles/lfcommands.py
> @@ -364,9 +364,7 @@
>       matches the revision ID).  With --all, check every changeset in
>       this repository.'''
>       if all:
> -        # Pass a list to the function rather than an iterator because we know a
> -        # list will work.
> -        revs = range(len(repo))
> +        revs = repo.revs('all()')

(in all cases)
revs = iter(repo.changelog) ? (not convince it is so much better)
Matt Harbison - June 9, 2015, 2:55 a.m.
On Mon, 08 Jun 2015 22:16:20 -0400, Pierre-Yves David  
<pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 06/08/2015 05:52 PM, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1433643018 14400
>> #      Sat Jun 06 22:10:18 2015 -0400
>> # Node ID 7beb2d6c7bf755f83293e9780b82a2632fc2918a
>> # Parent  378a8e700e02794e991d3521423a4f581b635666
>> largefiles: ignore hidden changesets with 'verify --large --lfa'
>>
>> Previously, if there were any hidden changesets, the --lfa argument  
>> would cause
>> the command to abort with a hint about using --hidden when it tripped  
>> over a
>> hidden changeset.
>
> do you want operation on an unfiltered repository instead?

I thought about it, but I guess there's the potential for a largefile to  
be amended, and the original cset hidden.  So why bother verifying the  
hidden file if --hidden isn't specified?  It seems that this way, both  
visible and visible + --hidden are possible, depending on what the user  
wants.

>> diff --git a/hgext/largefiles/lfcommands.py  
>> b/hgext/largefiles/lfcommands.py
>> --- a/hgext/largefiles/lfcommands.py
>> +++ b/hgext/largefiles/lfcommands.py
>> @@ -364,9 +364,7 @@
>>       matches the revision ID).  With --all, check every changeset in
>>       this repository.'''
>>       if all:
>> -        # Pass a list to the function rather than an iterator because  
>> we know a
>> -        # list will work.
>> -        revs = range(len(repo))
>> +        revs = repo.revs('all()')
>
> (in all cases)
> revs = iter(repo.changelog) ? (not convince it is so much better)

The method 'revs' is passed to does 'len(revs)', which revset handles.   
But this seems to work:

    revs = list(repo.changelog)

I was thinking about a followup with repo.revs("filelog('path:.hglf/')")  
to drop some of this manual processing here, but couldn't get it to work.   
I can resend with the list(..) if there aren't any issues with that.
Pierre-Yves David - June 9, 2015, 6:04 p.m.
On 06/08/2015 07:55 PM, Matt Harbison wrote:
> On Mon, 08 Jun 2015 22:16:20 -0400, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
>
>>
>>
>> On 06/08/2015 05:52 PM, Matt Harbison wrote:
>>> # HG changeset patch
>>> # User Matt Harbison <matt_harbison@yahoo.com>
>>> # Date 1433643018 14400
>>> #      Sat Jun 06 22:10:18 2015 -0400
>>> # Node ID 7beb2d6c7bf755f83293e9780b82a2632fc2918a
>>> # Parent  378a8e700e02794e991d3521423a4f581b635666
>>> largefiles: ignore hidden changesets with 'verify --large --lfa'
>>>
>>> Previously, if there were any hidden changesets, the --lfa argument
>>> would cause
>>> the command to abort with a hint about using --hidden when it tripped
>>> over a
>>> hidden changeset.
>>
>> do you want operation on an unfiltered repository instead?
>
> I thought about it, but I guess there's the potential for a largefile to
> be amended, and the original cset hidden.  So why bother verifying the
> hidden file if --hidden isn't specified?  It seems that this way, both
> visible and visible + --hidden are possible, depending on what the user
> wants.

hg verify runs on an unfiltered repo, it would be consistent to run this 
one on unfiltered repository too. Why would we want something different?

What does --lfa and --large argument do exactly?

>
>>> diff --git a/hgext/largefiles/lfcommands.py
>>> b/hgext/largefiles/lfcommands.py
>>> --- a/hgext/largefiles/lfcommands.py
>>> +++ b/hgext/largefiles/lfcommands.py
>>> @@ -364,9 +364,7 @@
>>>       matches the revision ID).  With --all, check every changeset in
>>>       this repository.'''
>>>       if all:
>>> -        # Pass a list to the function rather than an iterator
>>> because we know a
>>> -        # list will work.
>>> -        revs = range(len(repo))
>>> +        revs = repo.revs('all()')
>>
>> (in all cases)
>> revs = iter(repo.changelog) ? (not convince it is so much better)
>
> The method 'revs' is passed to does 'len(revs)', which revset handles.
> But this seems to work:
>
>     revs = list(repo.changelog)

We could also use repo.changelog. The use of len will makes it a bit 
hard to be efficient, but 'all()' is probably you best bet here.

This does not need to changes.
Matt Harbison - June 10, 2015, 12:55 a.m.
On Tue, 09 Jun 2015 14:04:48 -0400, Pierre-Yves David  
<pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 06/08/2015 07:55 PM, Matt Harbison wrote:
>> On Mon, 08 Jun 2015 22:16:20 -0400, Pierre-Yves David
>> <pierre-yves.david@ens-lyon.org> wrote:
>>
>>>
>>>
>>> On 06/08/2015 05:52 PM, Matt Harbison wrote:
>>>> # HG changeset patch
>>>> # User Matt Harbison <matt_harbison@yahoo.com>
>>>> # Date 1433643018 14400
>>>> #      Sat Jun 06 22:10:18 2015 -0400
>>>> # Node ID 7beb2d6c7bf755f83293e9780b82a2632fc2918a
>>>> # Parent  378a8e700e02794e991d3521423a4f581b635666
>>>> largefiles: ignore hidden changesets with 'verify --large --lfa'
>>>>
>>>> Previously, if there were any hidden changesets, the --lfa argument
>>>> would cause
>>>> the command to abort with a hint about using --hidden when it tripped
>>>> over a
>>>> hidden changeset.
>>>
>>> do you want operation on an unfiltered repository instead?
>>
>> I thought about it, but I guess there's the potential for a largefile to
>> be amended, and the original cset hidden.  So why bother verifying the
>> hidden file if --hidden isn't specified?  It seems that this way, both
>> visible and visible + --hidden are possible, depending on what the user
>> wants.
>
> hg verify runs on an unfiltered repo, it would be consistent to run this  
> one on unfiltered repository too. Why would we want something different?
>
> What does --lfa and --large argument do exactly?

--large is actually implied by --lfa, so ignore that (it used to not be).

--lfa means "verify all largefiles", instead of verifying only the  
largefiles in rev '.'.

"Verification" is simply looking for each '.hglf/foo', and whatever hash  
it contains, make sure there is a matching file in the store.  With --lfc,  
it will then hash the file it finds, to make sure it matches its file name  
(i.e., the store file wasn't corrupted).

I'm _guessing_ that normal verify operates even on hidden csets due to the  
sequential nature of revlog, so you almost have to?  Aside from not  
wanting to waste time verifying largefile content that has been amended  
away, there's this that I just remembered:

http://bz.selenic.com/show_bug.cgi?id=4242

Basically, if there's a --config paths.default, verify makes sure that the  
largefiles are on the server store, instead of checking the local store.   
Clearly, a largefile that is amended won't be pushed to the server, so the  
verify will fail if we use an unfiltered repo and there's a path.  (For  
the record, I agree with the bug report that the behavior is wrong by  
default, but it helped me just last week, so I don't want to get rid of it  
completely).
Pierre-Yves David - June 10, 2015, 11:32 p.m.
On 06/09/2015 05:55 PM, Matt Harbison wrote:
> On Tue, 09 Jun 2015 14:04:48 -0400, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
>
>>
>>
>> On 06/08/2015 07:55 PM, Matt Harbison wrote:
>>> On Mon, 08 Jun 2015 22:16:20 -0400, Pierre-Yves David
>>> <pierre-yves.david@ens-lyon.org> wrote:
>>>
>>>>
>>>>
>>>> On 06/08/2015 05:52 PM, Matt Harbison wrote:
>>>>> # HG changeset patch
>>>>> # User Matt Harbison <matt_harbison@yahoo.com>
>>>>> # Date 1433643018 14400
>>>>> #      Sat Jun 06 22:10:18 2015 -0400
>>>>> # Node ID 7beb2d6c7bf755f83293e9780b82a2632fc2918a
>>>>> # Parent  378a8e700e02794e991d3521423a4f581b635666
>>>>> largefiles: ignore hidden changesets with 'verify --large --lfa'
>>>>>
>>>>> Previously, if there were any hidden changesets, the --lfa argument
>>>>> would cause
>>>>> the command to abort with a hint about using --hidden when it tripped
>>>>> over a
>>>>> hidden changeset.
>>>>
>>>> do you want operation on an unfiltered repository instead?
>>>
>>> I thought about it, but I guess there's the potential for a largefile to
>>> be amended, and the original cset hidden.  So why bother verifying the
>>> hidden file if --hidden isn't specified?  It seems that this way, both
>>> visible and visible + --hidden are possible, depending on what the user
>>> wants.
>>
>> hg verify runs on an unfiltered repo, it would be consistent to run
>> this one on unfiltered repository too. Why would we want something
>> different?
>>
>> What does --lfa and --large argument do exactly?
>
> --large is actually implied by --lfa, so ignore that (it used to not be).
>
> --lfa means "verify all largefiles", instead of verifying only the
> largefiles in rev '.'.
>
> "Verification" is simply looking for each '.hglf/foo', and whatever hash
> it contains, make sure there is a matching file in the store.  With
> --lfc, it will then hash the file it finds, to make sure it matches its
> file name (i.e., the store file wasn't corrupted).
>
> I'm _guessing_ that normal verify operates even on hidden csets due to
> the sequential nature of revlog, so you almost have to?  Aside from not
> wanting to waste time verifying largefile content that has been amended
> away, there's this that I just remembered:
>
> http://bz.selenic.com/show_bug.cgi?id=4242
>
> Basically, if there's a --config paths.default, verify makes sure that
> the largefiles are on the server store, instead of checking the local
> store.  Clearly, a largefile that is amended won't be pushed to the
> server, so the verify will fail if we use an unfiltered repo and there's
> a path.  (For the record, I agree with the bug report that the behavior
> is wrong by default, but it helped me just last week, so I don't want to
> get rid of it completely).

Sold, pushed to the clowncopter.

Patch

diff --git a/hgext/largefiles/lfcommands.py b/hgext/largefiles/lfcommands.py
--- a/hgext/largefiles/lfcommands.py
+++ b/hgext/largefiles/lfcommands.py
@@ -364,9 +364,7 @@ 
     matches the revision ID).  With --all, check every changeset in
     this repository.'''
     if all:
-        # Pass a list to the function rather than an iterator because we know a
-        # list will work.
-        revs = range(len(repo))
+        revs = repo.revs('all()')
     else:
         revs = ['.']
 
diff --git a/tests/test-lfconvert.t b/tests/test-lfconvert.t
--- a/tests/test-lfconvert.t
+++ b/tests/test-lfconvert.t
@@ -320,13 +320,21 @@ 
 Verify will fail (for now) if the usercache is purged before converting, since
 largefiles are not cached in the converted repo's local store by the conversion
 process.
+  $ cd largefiles-repo-hg
+  $ cat >> .hg/hgrc <<EOF
+  > [experimental]
+  > evolution=createmarkers
+  > EOF
+  $ hg debugobsolete `hg log -r tip -T "{node}"`
+  $ cd ..
+
   $ hg -R largefiles-repo-hg verify --large --lfa
   checking changesets
   checking manifests
   crosschecking files in changesets and manifests
   checking files
   9 files, 8 changesets, 13 total revisions
-  searching 8 changesets for largefiles
+  searching 7 changesets for largefiles
   changeset 0:d4892ec57ce2: large references missing $TESTTMP/largefiles-repo-hg/.hg/largefiles/2e000fa7e85759c7f4c254d4d9c33ef481e459a7 (glob)
   changeset 1:334e5237836d: sub/maybelarge.dat references missing $TESTTMP/largefiles-repo-hg/.hg/largefiles/34e163be8e43c5631d8b92e9c43ab0bf0fa62b9c (glob)
   changeset 2:261ad3f3f037: stuff/maybelarge.dat references missing $TESTTMP/largefiles-repo-hg/.hg/largefiles/34e163be8e43c5631d8b92e9c43ab0bf0fa62b9c (glob)