Patchwork [STABLE] largefiles: fix an explicit largefile commit after a remove (issue4969)

login
register
mail settings
Submitter Matt Harbison
Date Jan. 24, 2016, 1:58 a.m.
Message ID <d07901a07264cc495565.1453600680@Envy>
Download mbox | patch
Permalink /patch/12879/
State Accepted
Delegated to: Yuya Nishihara
Headers show

Comments

Matt Harbison - Jan. 24, 2016, 1:58 a.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1453600277 18000
#      Sat Jan 23 20:51:17 2016 -0500
# Branch stable
# Node ID d07901a07264cc495565bce44eccb7d80afb7a25
# Parent  4c6053a6b17d682b34fb88bbeb5e94ed9085d900
largefiles: fix an explicit largefile commit after a remove (issue4969)

The change in b68797f244e4 to handle a normal -> largefile switch was too
aggressive in preserving the original matcher names.  If a largefile is
explicitly provided by the user, but only the standin exists in dirstate, then
only the standin can be committed.

There's still maybe an issue when the largefile is deleted outside of Mercurial:

  $ rm large
  $ hg ci -m "oops" large
  large: The system cannot find the file specified
  nothing changed
  [1]

That may be OK, since the standin file could have been changed since the last
commit, and this seems safer since the largefile isn't present to regenerate it.
Matt Harbison - Jan. 24, 2016, 6:32 a.m.
On Sat, 23 Jan 2016 20:58:00 -0500, Matt Harbison <mharbison72@gmail.com>  
wrote:

> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1453600277 18000
> #      Sat Jan 23 20:51:17 2016 -0500
> # Branch stable
> # Node ID d07901a07264cc495565bce44eccb7d80afb7a25
> # Parent  4c6053a6b17d682b34fb88bbeb5e94ed9085d900
> largefiles: fix an explicit largefile commit after a remove (issue4969)
>

[snip]

>
> There's still maybe an issue when the largefile is deleted outside of  
> Mercurial:
>
>   $ rm large
>   $ hg ci -m "oops" large
>   large: The system cannot find the file specified
>   nothing changed
>   [1]
>
> That may be OK, since the standin file could have been changed since the  
> last
> commit, and this seems safer since the largefile isn't present to  
> regenerate it.

Actually, this isn't OK, and is fixed by the next series.  This patch  
isn't the cause of the issue though, so whoever queues this patch can drop  
this part.
Yuya Nishihara - Jan. 25, 2016, 3:07 p.m.
On Sat, 23 Jan 2016 20:58:00 -0500, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1453600277 18000
> #      Sat Jan 23 20:51:17 2016 -0500
> # Branch stable
> # Node ID d07901a07264cc495565bce44eccb7d80afb7a25
> # Parent  4c6053a6b17d682b34fb88bbeb5e94ed9085d900
> largefiles: fix an explicit largefile commit after a remove (issue4969)
> 
> The change in b68797f244e4 to handle a normal -> largefile switch was too
> aggressive in preserving the original matcher names.  If a largefile is
> explicitly provided by the user, but only the standin exists in dirstate, then
> only the standin can be committed.
> 
> There's still maybe an issue when the largefile is deleted outside of Mercurial:
> 
>   $ rm large
>   $ hg ci -m "oops" large
>   large: The system cannot find the file specified
>   nothing changed
>   [1]
> 
> That may be OK, since the standin file could have been changed since the last
> commit, and this seems safer since the largefile isn't present to regenerate it.
> 
> diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
> --- a/hgext/largefiles/lfutil.py
> +++ b/hgext/largefiles/lfutil.py
> @@ -575,11 +575,15 @@
>          fstandin = standin(f)
>  
>          # For largefiles, only one of the normal and standin should be
> -        # committed (except if one of them is a remove).
> +        # committed (except if one of them is a remove).  In the case of a
> +        # standin removal, drop the normal file if it is unknown to dirstate.
>          # Thus, skip plain largefile names but keep the standin.
> -        if (f in lfiles or fstandin in standins) and \
> -            repo.dirstate[f] != 'r' and repo.dirstate[fstandin] != 'r':
> -            continue
> +        if f in lfiles or fstandin in standins:
> +            if repo.dirstate[fstandin] != 'r':
> +                if repo.dirstate[f] != 'r':
> +                    continue
> +            elif repo.dirstate[f] == '?':
> +                continue

Looks good to me, but I'm not confident. I want to clarify my understanding:

  (dirstate[f], dirstate[fstandin])
    = ('?', 'r') => removal of 'f', exclude 'f' but not 'fstandin'
      ('r', _) | (_, 'r') => might be add/remove pair, include both

Right?
Matt Harbison - Jan. 26, 2016, 1:29 a.m.
On Mon, 25 Jan 2016 10:07:21 -0500, Yuya Nishihara <yuya@tcha.org> wrote:

> On Sat, 23 Jan 2016 20:58:00 -0500, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1453600277 18000
>> #      Sat Jan 23 20:51:17 2016 -0500
>> # Branch stable
>> # Node ID d07901a07264cc495565bce44eccb7d80afb7a25
>> # Parent  4c6053a6b17d682b34fb88bbeb5e94ed9085d900
>> largefiles: fix an explicit largefile commit after a remove (issue4969)
>>
>> The change in b68797f244e4 to handle a normal -> largefile switch was  
>> too
>> aggressive in preserving the original matcher names.  If a largefile is
>> explicitly provided by the user, but only the standin exists in  
>> dirstate, then
>> only the standin can be committed.
>>
>> There's still maybe an issue when the largefile is deleted outside of  
>> Mercurial:
>>
>>   $ rm large
>>   $ hg ci -m "oops" large
>>   large: The system cannot find the file specified
>>   nothing changed
>>   [1]
>>
>> That may be OK, since the standin file could have been changed since  
>> the last
>> commit, and this seems safer since the largefile isn't present to  
>> regenerate it.
>>
>> diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
>> --- a/hgext/largefiles/lfutil.py
>> +++ b/hgext/largefiles/lfutil.py
>> @@ -575,11 +575,15 @@
>>          fstandin = standin(f)
>>
>>          # For largefiles, only one of the normal and standin should be
>> -        # committed (except if one of them is a remove).
>> +        # committed (except if one of them is a remove).  In the case  
>> of a
>> +        # standin removal, drop the normal file if it is unknown to  
>> dirstate.
>>          # Thus, skip plain largefile names but keep the standin.
>> -        if (f in lfiles or fstandin in standins) and \
>> -            repo.dirstate[f] != 'r' and repo.dirstate[fstandin] != 'r':
>> -            continue
>> +        if f in lfiles or fstandin in standins:
>> +            if repo.dirstate[fstandin] != 'r':
>> +                if repo.dirstate[f] != 'r':
>> +                    continue
>> +            elif repo.dirstate[f] == '?':
>> +                continue
>
> Looks good to me, but I'm not confident. I want to clarify my  
> understanding:
>
>   (dirstate[f], dirstate[fstandin])
>     = ('?', 'r') => removal of 'f', exclude 'f' but not 'fstandin'

Yes, this is the case of removing only a largefile.  The tracked standin  
is committed, and the largefile is untracked in dirstate.

>       ('r', _) | (_, 'r') => might be add/remove pair, include both

I believe so.  This was the previous logic to handle changing between  
large and normal for 'f', and then it makes sense to commit both.  All I  
did was unroll the conditional to tack on the elif branch for the case  
above.

> Right?
Yuya Nishihara - Jan. 26, 2016, 2:38 p.m.
On Mon, 25 Jan 2016 20:29:09 -0500, Matt Harbison wrote:
> On Mon, 25 Jan 2016 10:07:21 -0500, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> > On Sat, 23 Jan 2016 20:58:00 -0500, Matt Harbison wrote:
> >> # HG changeset patch
> >> # User Matt Harbison <matt_harbison@yahoo.com>
> >> # Date 1453600277 18000
> >> #      Sat Jan 23 20:51:17 2016 -0500
> >> # Branch stable
> >> # Node ID d07901a07264cc495565bce44eccb7d80afb7a25
> >> # Parent  4c6053a6b17d682b34fb88bbeb5e94ed9085d900
> >> largefiles: fix an explicit largefile commit after a remove (issue4969)
> >>
> >> The change in b68797f244e4 to handle a normal -> largefile switch was  
> >> too
> >> aggressive in preserving the original matcher names.  If a largefile is
> >> explicitly provided by the user, but only the standin exists in  
> >> dirstate, then
> >> only the standin can be committed.
> >>
> >> There's still maybe an issue when the largefile is deleted outside of  
> >> Mercurial:
> >>
> >>   $ rm large
> >>   $ hg ci -m "oops" large
> >>   large: The system cannot find the file specified
> >>   nothing changed
> >>   [1]
> >>
> >> That may be OK, since the standin file could have been changed since  
> >> the last
> >> commit, and this seems safer since the largefile isn't present to  
> >> regenerate it.
> >>
> >> diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
> >> --- a/hgext/largefiles/lfutil.py
> >> +++ b/hgext/largefiles/lfutil.py
> >> @@ -575,11 +575,15 @@
> >>          fstandin = standin(f)
> >>
> >>          # For largefiles, only one of the normal and standin should be
> >> -        # committed (except if one of them is a remove).
> >> +        # committed (except if one of them is a remove).  In the case  
> >> of a
> >> +        # standin removal, drop the normal file if it is unknown to  
> >> dirstate.
> >>          # Thus, skip plain largefile names but keep the standin.
> >> -        if (f in lfiles or fstandin in standins) and \
> >> -            repo.dirstate[f] != 'r' and repo.dirstate[fstandin] != 'r':
> >> -            continue
> >> +        if f in lfiles or fstandin in standins:
> >> +            if repo.dirstate[fstandin] != 'r':
> >> +                if repo.dirstate[f] != 'r':
> >> +                    continue
> >> +            elif repo.dirstate[f] == '?':
> >> +                continue
> >
> > Looks good to me, but I'm not confident. I want to clarify my  
> > understanding:
> >
> >   (dirstate[f], dirstate[fstandin])
> >     = ('?', 'r') => removal of 'f', exclude 'f' but not 'fstandin'
> 
> Yes, this is the case of removing only a largefile.  The tracked standin  
> is committed, and the largefile is untracked in dirstate.
> 
> >       ('r', _) | (_, 'r') => might be add/remove pair, include both
> 
> I believe so.  This was the previous logic to handle changing between  
> large and normal for 'f', and then it makes sense to commit both.  All I  
> did was unroll the conditional to tack on the elif branch for the case  
> above.

Thanks, pushed to the clowncopter.

Patch

diff --git a/hgext/largefiles/lfutil.py b/hgext/largefiles/lfutil.py
--- a/hgext/largefiles/lfutil.py
+++ b/hgext/largefiles/lfutil.py
@@ -575,11 +575,15 @@ 
         fstandin = standin(f)
 
         # For largefiles, only one of the normal and standin should be
-        # committed (except if one of them is a remove).
+        # committed (except if one of them is a remove).  In the case of a
+        # standin removal, drop the normal file if it is unknown to dirstate.
         # Thus, skip plain largefile names but keep the standin.
-        if (f in lfiles or fstandin in standins) and \
-            repo.dirstate[f] != 'r' and repo.dirstate[fstandin] != 'r':
-            continue
+        if f in lfiles or fstandin in standins:
+            if repo.dirstate[fstandin] != 'r':
+                if repo.dirstate[f] != 'r':
+                    continue
+            elif repo.dirstate[f] == '?':
+                continue
 
         actualfiles.append(f)
     match._files = actualfiles
diff --git a/tests/test-largefiles-cache.t b/tests/test-largefiles-cache.t
--- a/tests/test-largefiles-cache.t
+++ b/tests/test-largefiles-cache.t
@@ -17,7 +17,7 @@ 
   $ hg add --large large
   $ hg commit -m 'add largefile'
   $ hg rm large
-  $ hg commit -m 'branchhead without largefile'
+  $ hg commit -m 'branchhead without largefile' large
   $ hg up -qr 0
   $ cd ..