Patchwork dirstate: avoid invalidating every entries when list is empty

login
register
mail settings
Submitter Pierre-Yves David
Date June 5, 2015, 5:19 a.m.
Message ID <88a6501758d2779176b4.1433481581@marginatus.alto.octopoid.net>
Download mbox | patch
Permalink /patch/9509/
State Accepted
Commit 2bbfc2042d939972e51e2e549f5f48cdb4debba3
Headers show

Comments

Pierre-Yves David - June 5, 2015, 5:19 a.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1433481032 25200
#      Thu Jun 04 22:10:32 2015 -0700
# Node ID 88a6501758d2779176b47f23a0bca33f563881d8
# Parent  51e7acc34b0ab0e540dffdb22127914f2353d5e2
dirstate: avoid invalidating every entries when list is empty

Default value was not tested with 'is None', this made empty list seen as
default value and result the invalidation of every single entry in the
dirstate. On repos with hundred of thousand of files, this results in minutes
of lookup time instead nothing.

This is a text book example of why we should test 'is None' if this is what we
mean.
Martin von Zweigbergk - June 5, 2015, 5:24 a.m.
On Thu, Jun 4, 2015 at 10:20 PM Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1433481032 25200
> #      Thu Jun 04 22:10:32 2015 -0700
> # Node ID 88a6501758d2779176b47f23a0bca33f563881d8
> # Parent  51e7acc34b0ab0e540dffdb22127914f2353d5e2
> dirstate: avoid invalidating every entries when list is empty
>
> Default value was not tested with 'is None', this made empty list seen as
> default value and result the invalidation of every single entry in the
> dirstate. On repos with hundred of thousand of files, this results in
> minutes
> of lookup time instead nothing.
>

On what command?


>
> This is a text book example of why we should test 'is None' if this is
> what we
> mean.
>
> diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
> --- a/mercurial/dirstate.py
> +++ b/mercurial/dirstate.py
> @@ -570,11 +570,12 @@ class dirstate(object):
>          self._pl = [nullid, nullid]
>          self._lastnormaltime = 0
>          self._dirty = True
>
>      def rebuild(self, parent, allfiles, changedfiles=None):
> -        changedfiles = changedfiles or allfiles
> +        if changedfiles is None:
> +            changedfiles = allfiles
>          oldmap = self._map
>          self.clear()
>          for f in allfiles:
>              if f not in changedfiles:
>                  self._map[f] = oldmap[f]
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel
>
Pierre-Yves David - June 5, 2015, 5:35 a.m.
On 06/04/2015 10:24 PM, Martin von Zweigbergk wrote:
>
>
> On Thu, Jun 4, 2015 at 10:20 PM Pierre-Yves David
> <pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>>
> wrote:
>
>     # HG changeset patch
>     # User Pierre-Yves David <pierre-yves.david@fb.com
>     <mailto:pierre-yves.david@fb.com>>
>     # Date 1433481032 25200
>     #      Thu Jun 04 22:10:32 2015 -0700
>     # Node ID 88a6501758d2779176b47f23a0bca33f563881d8
>     # Parent  51e7acc34b0ab0e540dffdb22127914f2353d5e2
>     dirstate: avoid invalidating every entries when list is empty
>
>     Default value was not tested with 'is None', this made empty list
>     seen as
>     default value and result the invalidation of every single entry in the
>     dirstate. On repos with hundred of thousand of files, this results
>     in minutes
>     of lookup time instead nothing.
>
>
> On what command?

This is trigger by the `hg reset` command from our experimental 
extensions. That's an API bug, I'm not aware of any faulty user in core 
(but spend ø time searching).
Augie Fackler - June 5, 2015, 5:41 p.m.
On Thu, Jun 04, 2015 at 10:35:18PM -0700, Pierre-Yves David wrote:
>
>
> On 06/04/2015 10:24 PM, Martin von Zweigbergk wrote:
> >
> >
> >On Thu, Jun 4, 2015 at 10:20 PM Pierre-Yves David
> ><pierre-yves.david@ens-lyon.org <mailto:pierre-yves.david@ens-lyon.org>>
> >wrote:
> >
> >    # HG changeset patch
> >    # User Pierre-Yves David <pierre-yves.david@fb.com
> >    <mailto:pierre-yves.david@fb.com>>
> >    # Date 1433481032 25200
> >    #      Thu Jun 04 22:10:32 2015 -0700
> >    # Node ID 88a6501758d2779176b47f23a0bca33f563881d8
> >    # Parent  51e7acc34b0ab0e540dffdb22127914f2353d5e2
> >    dirstate: avoid invalidating every entries when list is empty
> >
> >    Default value was not tested with 'is None', this made empty list
> >    seen as
> >    default value and result the invalidation of every single entry in the
> >    dirstate. On repos with hundred of thousand of files, this results
> >    in minutes
> >    of lookup time instead nothing.
> >
> >
> >On what command?
>
> This is trigger by the `hg reset` command from our experimental extensions.
> That's an API bug, I'm not aware of any faulty user in core (but spend ø
> time searching).

Looks like enough of a correctness fix I'm queueing it. Thanks!

>
> --
> Pierre-Yves David
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel
Matt Mackall - June 5, 2015, 5:53 p.m.
On Thu, 2015-06-04 at 22:19 -0700, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1433481032 25200
> #      Thu Jun 04 22:10:32 2015 -0700
> # Node ID 88a6501758d2779176b47f23a0bca33f563881d8
> # Parent  51e7acc34b0ab0e540dffdb22127914f2353d5e2
> dirstate: avoid invalidating every entries when list is empty

Queued for default, thanks.
Pierre-Yves David - June 5, 2015, 5:57 p.m.
On 06/05/2015 10:53 AM, Matt Mackall wrote:
> On Thu, 2015-06-04 at 22:19 -0700, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@fb.com>
>> # Date 1433481032 25200
>> #      Thu Jun 04 22:10:32 2015 -0700
>> # Node ID 88a6501758d2779176b47f23a0bca33f563881d8
>> # Parent  51e7acc34b0ab0e540dffdb22127914f2353d5e2
>> dirstate: avoid invalidating every entries when list is empty
>
> Queued for default, thanks.

Augies won the queuing race by 12 minutes.

Patch

diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -570,11 +570,12 @@  class dirstate(object):
         self._pl = [nullid, nullid]
         self._lastnormaltime = 0
         self._dirty = True
 
     def rebuild(self, parent, allfiles, changedfiles=None):
-        changedfiles = changedfiles or allfiles
+        if changedfiles is None:
+            changedfiles = allfiles
         oldmap = self._map
         self.clear()
         for f in allfiles:
             if f not in changedfiles:
                 self._map[f] = oldmap[f]