Submitter | Martin von Zweigbergk |
---|---|
Date | Nov. 20, 2014, 5:53 a.m. |
Message ID | <0eff6940a2efac28fbbf.1416462807@martinvonz.mtv.corp.google.com> |
Download | mbox | patch |
Permalink | /patch/6807/ |
State | Accepted |
Commit | a179db3db9b96b38c10c491e6e7e7ad5f40a7787 |
Headers | show |
Comments
On 11/19/2014 09:53 PM, Martin von Zweigbergk wrote: > # HG changeset patch > # User Martin von Zweigbergk <martinvonz@google.com> > # Date 1416460026 28800 > # Wed Nov 19 21:07:06 2014 -0800 > # Node ID 0eff6940a2efac28fbbf4180592a0f4e1404744e > # Parent cc83cac41619b32375b28f821237fb0747010b80 > dirstate: speed up repeated missing directory checks Nice and nice, pushed to the clowncopter.
Hmm... that 'break' should have been dropped :-( It won't impact timings, so no need to update description. Will you fix or do you want a resend? Either way, might as well also change the 'self._map' to 'dmap' for consistency. On Wed Nov 19 2014 at 10:49:01 PM Pierre-Yves David < pierre-yves.david@ens-lyon.org> wrote: > > > On 11/19/2014 09:53 PM, Martin von Zweigbergk wrote: > > # HG changeset patch > > # User Martin von Zweigbergk <martinvonz@google.com> > > # Date 1416460026 28800 > > # Wed Nov 19 21:07:06 2014 -0800 > > # Node ID 0eff6940a2efac28fbbf4180592a0f4e1404744e > > # Parent cc83cac41619b32375b28f821237fb0747010b80 > > dirstate: speed up repeated missing directory checks > > Nice and nice, pushed to the clowncopter. > > -- > Pierre-Yves David >
On 11/19/2014 10:59 PM, Martin von Zweigbergk wrote: > Hmm... that 'break' should have been dropped :-( It won't impact > timings, so no need to update description. Will you fix or do you want a > resend? Either way, might as well also change the 'self._map' to 'dmap' > for consistency. patch update on the clowncopter
Patch
diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py --- a/mercurial/dirstate.py +++ b/mercurial/dirstate.py @@ -629,6 +629,7 @@ results = dict.fromkeys(subrepos) results['.hg'] = None + alldirs = None for ff in files: if normalize: nf = normalize(normpath(ff), False, True) @@ -657,13 +658,13 @@ if nf in dmap: # does it exactly match a missing file? results[nf] = None else: # does it match a missing directory? - prefix = nf + "/" - for fn in dmap: - if fn.startswith(prefix): - if matchedir: - matchedir(nf) - notfoundadd(nf) - break + if alldirs is None: + alldirs = scmutil.dirs(self._map) + if nf in alldirs: + if matchedir: + matchedir(nf) + notfoundadd(nf) + break else: badfn(ff, inst.strerror)
# HG changeset patch # User Martin von Zweigbergk <martinvonz@google.com> # Date 1416460026 28800 # Wed Nov 19 21:07:06 2014 -0800 # Node ID 0eff6940a2efac28fbbf4180592a0f4e1404744e # Parent cc83cac41619b32375b28f821237fb0747010b80 dirstate: speed up repeated missing directory checks In a mozilla repo with tip at bb3ff09f52fe, hg update tip~1000 && time hg revert -nq -r tip . displays ~4:20 minutes. With tip~100, it runs in ~11 s. With revision 100000, it did not finish in 12 minutes. Revert calls dirstate.status() with a matcher that matches each file in the target revision. The main problem [1] lies in dirstate._walkexplicit(), which looks for matching deleted directories by checking whether each path is prefix of any path in the dirstate. With m files in the dirstate and n files in the target revision that are not in the dirstate, this is clearly O(m*n). Let's improve by keeping a lazily initialized set of all the directories in the dirstate, so the time becomes O(m+n). After this patch, the 4:20 minutes become 5.5 s, while for a single missing path, it slows down from 1.092 s to 1.150 s (best of 4). The >12 min case becomes 5.8 s. [1] A narrower optimization would be to make revert take the fast path for '.' and '--all'.