Patchwork context: extend efficient manifest filtering to when all paths are files

Submitter Siddharth Agarwal
Date July 16, 2014, 10:18 p.m.
Message ID <8fe65fda60f276308ec8.1405549120@dev1738.prn1.facebook.com>
Permalink /patch/5181/
State Superseded
Commit 5809d62e7106a80cc8087bc8f1e76fbfe1f9b141

Comments

Siddharth Agarwal - July 16, 2014, 10:18 p.m.
# HG changeset patch
# User Siddharth Agarwal <sid0@fb.com>
# Date 1405547583 25200
#      Wed Jul 16 14:53:03 2014 -0700
# Node ID 8fe65fda60f276308ec84f998014f2c848410a50
# Parent  e6c10ed302aaba92013aae7c4d6ca7d855851978
context: extend efficient manifest filtering to when all paths are files

On a repository with over 250,000 files and 700,000 commits, this improves
cases like

hg status -r <rev> -- <file>  # rev is not .

from 2.1 seconds to 1.4 seconds.

There is further scope for improvement here: for a single file or a small set
of files, it is probably more efficient to use filelog linkrevs when possible.
However there will always be cases where that will fail (multiple commits
pointing to the same file revision, removed files...), so this is independently
useful.
Siddharth Agarwal - July 16, 2014, 10:29 p.m.
On 07/16/2014 03:18 PM, Siddharth Agarwal wrote:
> # HG changeset patch
> # User Siddharth Agarwal <sid0@fb.com>
> # Date 1405547583 25200
> #      Wed Jul 16 14:53:03 2014 -0700
> # Node ID 8fe65fda60f276308ec84f998014f2c848410a50
> # Parent  e6c10ed302aaba92013aae7c4d6ca7d855851978
> context: extend efficient manifest filtering to when all paths are files
>
> On a repository with over 250,000 files and 700,000 commits, this improves
> cases like
>
> hg status -r <rev> -- <file>  # rev is not .

Messed up this message (-r is not --rev). Sending again with a fixed 
message.

>
> from 2.1 seconds to 1.4 seconds.
>
> There is further scope for improvement here: for a single file or a small set
> of files, it is probably more efficient to use filelog linkrevs when possible.
> However there will always be cases where that will fail (multiple commits
> pointing to the same file revision, removed files...), so this is independently
> useful.
>
> diff --git a/mercurial/context.py b/mercurial/context.py
> --- a/mercurial/context.py
> +++ b/mercurial/context.py
> @@ -74,8 +74,10 @@
>           if match.always():
>               return self.manifest().copy()
>   
> -        if match.matchfn == match.exact:
> -            return self.manifest().intersectfiles(match.files())
> +        files = match.files()
> +        if (match.matchfn == match.exact or
> +            (not match.anypats() and util.all(fn in self for fn in files))):
> +            return self.manifest().intersectfiles(files)
>   
>           mf = self.manifest().copy()
>           for fn in mf.keys():

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -74,8 +74,10 @@ 
         if match.always():
             return self.manifest().copy()
 
-        if match.matchfn == match.exact:
-            return self.manifest().intersectfiles(match.files())
+        files = match.files()
+        if (match.matchfn == match.exact or
+            (not match.anypats() and util.all(fn in self for fn in files))):
+            return self.manifest().intersectfiles(files)
 
         mf = self.manifest().copy()
         for fn in mf.keys():
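
The condition added by this hunk can be illustrated outside Mercurial with a minimal sketch. It assumes the manifest is modeled as a plain dict from path to node, and it uses a hypothetical PathMatcher class as a stand-in for a matcher; neither is Mercurial's actual API. The idea is that if the matcher has no patterns and every requested name is itself a file in the manifest, then no name can be a directory, so a direct intersection gives the same result as walking the whole manifest.

```python
class PathMatcher:
    """Hypothetical stand-in for a matcher; not Mercurial's match class."""

    def __init__(self, paths, exact=False, patterns=False):
        self._paths = list(paths)
        self._exact = exact
        self._patterns = patterns

    def files(self):
        # Names the user asked for (files or directories).
        return list(self._paths)

    def anypats(self):
        # True if any pattern (glob, regex, ...) was given; only a flag here.
        return self._patterns

    def isexact(self):
        return self._exact

    def __call__(self, fn):
        # A listed name always matches. A non-exact matcher also accepts
        # anything underneath a listed directory.
        if fn in self._paths:
            return True
        if self._exact:
            return False
        return any(fn.startswith(p + "/") for p in self._paths)


def intersectfiles(manifest, files):
    """Keep only the requested names: cost grows with len(files)."""
    return {fn: manifest[fn] for fn in files if fn in manifest}


def slow_filter(manifest, match):
    """Test every manifest entry: cost grows with len(manifest)."""
    return {fn: node for fn, node in manifest.items() if match(fn)}


def filtered_manifest(manifest, match):
    files = match.files()
    # Fast path: exact matcher, or no patterns and every name is a file
    # present in the manifest (so none of them can be a directory).
    if match.isexact() or (not match.anypats()
                           and all(fn in manifest for fn in files)):
        return intersectfiles(manifest, files)
    return slow_filter(manifest, match)


manifest = {"a.txt": 1, "dir/b.txt": 2, "dir/c.txt": 3}
only_files = PathMatcher(["a.txt", "dir/b.txt"])
with_dir = PathMatcher(["dir"])

fast = filtered_manifest(manifest, only_files)      # takes the fast path
fallback = filtered_manifest(manifest, with_dir)    # "dir" is not a file
```

In this sketch the fast path touches only the names passed in, while the fallback tests every manifest entry, which is where the saving on a repository with hundreds of thousands of files would come from.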