Patchwork [2,of,2] forget: don't report rejected files as forgotten as well

login
register
mail settings
Submitter Matt Harbison
Date Jan. 14, 2015, 2:11 a.m.
Message ID <394c03620d245326b3fa.1421201484@Envy>
Download mbox | patch
Permalink /patch/7449/
State Accepted
Commit b95b9fd7ba29d0cb27538b1d7bffb1b2b4acf7ce
Headers show

Comments

Matt Harbison - Jan. 14, 2015, 2:11 a.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1421036723 18000
#      Sun Jan 11 23:25:23 2015 -0500
# Node ID 394c03620d245326b3fa8ac87e0c89c253df9cda
# Parent  9e24ea4fa4d77dc77375467a1a8263d5d7e1f4d3
forget: don't report rejected files as forgotten as well

It seems like a mistake to report a file as forgotten and rejected.  The
forgotten list doesn't seem to be used by anything in core, so no test changes.
Martin von Zweigbergk - Jan. 14, 2015, 2:29 a.m.
Looks like possible quadratic runtime. Can both rejected and forget be
large lists?

Not your fault, but do you think rejected and match.files() on the line
above can both be large?

On Tue, Jan 13, 2015, 18:11 Matt Harbison <mharbison72@gmail.com> wrote:

> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1421036723 18000
> #      Sun Jan 11 23:25:23 2015 -0500
> # Node ID 394c03620d245326b3fa8ac87e0c89c253df9cda
> # Parent  9e24ea4fa4d77dc77375467a1a8263d5d7e1f4d3
> forget: don't report rejected files as forgotten as well
>
> It seems like a mistake to report a file as forgotten and rejected.  The
> forgotten list doesn't seem to be used by anything in core, so no test
> changes.
>
> diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
> --- a/mercurial/cmdutil.py
> +++ b/mercurial/cmdutil.py
> @@ -2057,7 +2057,7 @@
>
>      rejected = wctx.forget(forget, prefix)
>      bad.extend(f for f in rejected if f in match.files())
> -    forgot.extend(forget)
> +    forgot.extend(f for f in forget if f not in rejected)
>      return bad, forgot
>
>  def remove(ui, repo, m, prefix, after, force, subrepos):
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>
Matt Harbison - Jan. 14, 2015, 3:16 a.m.
On Tue, 13 Jan 2015 21:29:39 -0500, Martin von Zweigbergk  
<martinvonz@google.com> wrote:

> Looks like possible quadratic runtime. Can both rejected and forget be
> large lists?
>
> Not your fault, but do you think rejected and match.files() on the line
> above can both be large?

'rejected' should only contain entries for each file given to  
wctx.forget() that is *not* in dirstate.  And since repo.status() is used  
to build that list, the reject list should be empty most of the time (I  
think).  Therefore, it doesn't look like bad/untracked files passed on the  
command line are ever passed to wctx.forget().  Of course the write lock  
isn't taken until inside wctx.forget(), so I suppose it is possible  
something else forgets/removes those files after the status has been  
calculated, but before making it to wctx.forget()?

The only optimization that I see here is converting rejected to a set, but  
I'm not sure if that matters much given its typical empty state.  I think  
match.files() is much more likely to have more items, and it looks like  
that is a list too.  Those are easy tweaks though if you think it's  
useful/necessary.

--Matt

> On Tue, Jan 13, 2015, 18:11 Matt Harbison <mharbison72@gmail.com> wrote:
>
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1421036723 18000
>> #      Sun Jan 11 23:25:23 2015 -0500
>> # Node ID 394c03620d245326b3fa8ac87e0c89c253df9cda
>> # Parent  9e24ea4fa4d77dc77375467a1a8263d5d7e1f4d3
>> forget: don't report rejected files as forgotten as well
>>
>> It seems like a mistake to report a file as forgotten and rejected.  The
>> forgotten list doesn't seem to be used by anything in core, so no test
>> changes.
>>
>> diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
>> --- a/mercurial/cmdutil.py
>> +++ b/mercurial/cmdutil.py
>> @@ -2057,7 +2057,7 @@
>>
>>      rejected = wctx.forget(forget, prefix)
>>      bad.extend(f for f in rejected if f in match.files())
>> -    forgot.extend(forget)
>> +    forgot.extend(f for f in forget if f not in rejected)
>>      return bad, forgot
>>
>>  def remove(ui, repo, m, prefix, after, force, subrepos):
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@selenic.com
>> http://selenic.com/mailman/listinfo/mercurial-devel
Martin von Zweigbergk - Jan. 14, 2015, 4:53 a.m.
On Tue Jan 13 2015 at 7:16:30 PM Matt Harbison <mharbison72@gmail.com>
wrote:

> On Tue, 13 Jan 2015 21:29:39 -0500, Martin von Zweigbergk
> <martinvonz@google.com> wrote:
>
> > Looks like possible quadratic runtime. Can both rejected and forget be
> > large lists?
> >
> > Not your fault, but do you think rejected and match.files() on the line
> > above can both be large?
>
> 'rejected' should only contain entries for each file given to
> wctx.forget() that is *not* in dirstate.  And since repo.status() is used
> to build that list, the reject list should be empty most of the time (I
> think).  Therefore, it doesn't look like bad/untracked files passed on the
> command line are ever passed to wctx.forget().  Of course the write lock
> isn't taken until inside wctx.forget(), so I suppose it is possible
> something else forgets/removes those files after the status has been
> calculated, but before making it to wctx.forget()?
>

In that case I wouldn't bother. Thanks for checking and explaining.


> The only optimization that I see here is converting rejected to a set, but
> I'm not sure if that matters much given its typical empty state.  I think
> match.files() is much more likely to have more items, and it looks like
> that is a list too.  Those are easy tweaks though if you think it's
> useful/necessary.
>

I've fixed a few of these "(for x in xs if x not in ys)" by converting "ys"
to a set. It's usually corner cases (naturally -- otherwise they would have
been fixed earlier), but on big repos like Mozilla it can be very
noticeable if one happens to run into one of these cases. In this
particular case ("in match.files()"), it's probably going to have to be
very contrived case like using a "set:" pattern to make match.files() large
and I don't even know how to make "rejected" large (bad permissions?), so I
would not fix it if I couldn't find a way to cause the slowness.


> --Matt
>
> > On Tue, Jan 13, 2015, 18:11 Matt Harbison <mharbison72@gmail.com> wrote:
> >
> >> # HG changeset patch
> >> # User Matt Harbison <matt_harbison@yahoo.com>
> >> # Date 1421036723 18000
> >> #      Sun Jan 11 23:25:23 2015 -0500
> >> # Node ID 394c03620d245326b3fa8ac87e0c89c253df9cda
> >> # Parent  9e24ea4fa4d77dc77375467a1a8263d5d7e1f4d3
> >> forget: don't report rejected files as forgotten as well
> >>
> >> It seems like a mistake to report a file as forgotten and rejected.  The
> >> forgotten list doesn't seem to be used by anything in core, so no test
> >> changes.
> >>
> >> diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
> >> --- a/mercurial/cmdutil.py
> >> +++ b/mercurial/cmdutil.py
> >> @@ -2057,7 +2057,7 @@
> >>
> >>      rejected = wctx.forget(forget, prefix)
> >>      bad.extend(f for f in rejected if f in match.files())
> >> -    forgot.extend(forget)
> >> +    forgot.extend(f for f in forget if f not in rejected)
> >>      return bad, forgot
> >>
> >>  def remove(ui, repo, m, prefix, after, force, subrepos):
> >> _______________________________________________
> >> Mercurial-devel mailing list
> >> Mercurial-devel@selenic.com
> >> http://selenic.com/mailman/listinfo/mercurial-devel
>
Matt Mackall - Jan. 14, 2015, 10:32 p.m.
On Tue, 2015-01-13 at 21:11 -0500, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1421036723 18000
> #      Sun Jan 11 23:25:23 2015 -0500
> # Node ID 394c03620d245326b3fa8ac87e0c89c253df9cda
> # Parent  9e24ea4fa4d77dc77375467a1a8263d5d7e1f4d3
> forget: don't report rejected files as forgotten as well

These are queued for default, thanks.

Patch

diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -2057,7 +2057,7 @@ 
 
     rejected = wctx.forget(forget, prefix)
     bad.extend(f for f in rejected if f in match.files())
-    forgot.extend(forget)
+    forgot.extend(f for f in forget if f not in rejected)
     return bad, forgot
 
 def remove(ui, repo, m, prefix, after, force, subrepos):