Patchwork [2,of,2] revert: special case 'hg revert --all'

login
register
mail settings
Submitter Durham Goode
Date Sept. 20, 2014, 1:53 a.m.
Message ID <4ba21a38b9dc7d351e88.1411177991@dev2000.prn2.facebook.com>
Download mbox | patch
Permalink /patch/5899/
State Superseded
Commit f528bfb25b45e329665dcc21cccf61e1ea954aa6
Headers show

Comments

Durham Goode - Sept. 20, 2014, 1:53 a.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1411177433 25200
#      Fri Sep 19 18:43:53 2014 -0700
# Node ID 4ba21a38b9dc7d351e88cc575516443046f6d913
# Parent  b74b7d5f46d5d8ad13ca74e6b740970129090529
revert: special case 'hg revert --all'

On large repos, hg revert --all can take over 13 seconds. This is mainly due to
it walking the tree three times: once to find the list of files in the
dirstate, once to find the list of files in the target, and once to compute the
status from the dirstate to the target.

This optimizes the hg revert --all case to only require the final status. This
speeds it up to 1.3 seconds or so (with hgwatchman enabled).

Further optimizations could be done for the -r NODE and pattern cases, but they
are significantly more complex.
Pierre-Yves David - Sept. 26, 2014, 7:19 a.m.
On 09/19/2014 06:53 PM, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1411177433 25200
> #      Fri Sep 19 18:43:53 2014 -0700
> # Node ID 4ba21a38b9dc7d351e88cc575516443046f6d913
> # Parent  b74b7d5f46d5d8ad13ca74e6b740970129090529
> revert: special case 'hg revert --all'

Augies said he was going to queue it. But a pparently he went too deep 
in the manifest internal and have probably been eaten by a grue by then.

The change looks okay to me (except that I would add a comment 
explaining we have this if/else purely for performance reason.)

But I cannot crew it because patch 1 is in the crew repo. I'll push it 
as soon as the crew repo makes it in main (with the comment things).
Augie Fackler - Sept. 29, 2014, 7:09 p.m.
On Fri, Sep 19, 2014 at 06:53:11PM -0700, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1411177433 25200
> #      Fri Sep 19 18:43:53 2014 -0700
> # Node ID 4ba21a38b9dc7d351e88cc575516443046f6d913
> # Parent  b74b7d5f46d5d8ad13ca74e6b740970129090529
> revert: special case 'hg revert --all'

Queued for crew. Sorry for delay.

(Thanks to Pierre-Yves for reminding me.)

>
> On large repos, hg revert --all can take over 13 seconds. This is mainly due to
> it walking the tree three times: once to find the list of files in the
> dirstate, once to find the list of files in the target, and once to compute the
> status from the dirstate to the target.
>
> This optimizes the hg revert --all case to only require the final status. This
> speeds it up to 1.3 seconds or so (with hgwatchman enabled).
>
> Further optimizations could be done for the -r NODE and pattern cases, but they
> are significantly more complex.
>
> diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
> --- a/mercurial/cmdutil.py
> +++ b/mercurial/cmdutil.py
> @@ -2480,34 +2480,43 @@
>          # walk dirstate to fill `names`
>
>          m = scmutil.match(repo[None], pats, opts)
> -        m.bad = lambda x, y: False
> -        for abs in repo.walk(m):
> -            names[abs] = m.rel(abs), m.exact(abs)
> -
> -        # walk target manifest to fill `names`
> -
> -        def badfn(path, msg):
> -            if path in names:
> -                return
> -            if path in ctx.substate:
> -                return
> -            path_ = path + '/'
> -            for f in names:
> -                if f.startswith(path_):
> +        if not m.always() or node != parent:
> +            m.bad = lambda x, y: False
> +            for abs in repo.walk(m):
> +                names[abs] = m.rel(abs), m.exact(abs)
> +
> +            # walk target manifest to fill `names`
> +
> +            def badfn(path, msg):
> +                if path in names:
>                      return
> -            ui.warn("%s: %s\n" % (m.rel(path), msg))
> -
> -        m = scmutil.match(ctx, pats, opts)
> -        m.bad = badfn
> -        for abs in ctx.walk(m):
> -            if abs not in names:
> -                names[abs] = m.rel(abs), m.exact(abs)
> -
> -        # Find status of all file in `names`.
> -        m = scmutil.matchfiles(repo, names)
> -
> -        changes = repo.status(node1=node, match=m,
> -                              unknown=True, ignored=True, clean=True)
> +                if path in ctx.substate:
> +                    return
> +                path_ = path + '/'
> +                for f in names:
> +                    if f.startswith(path_):
> +                        return
> +                ui.warn("%s: %s\n" % (m.rel(path), msg))
> +
> +            m = scmutil.match(ctx, pats, opts)
> +            m.bad = badfn
> +            for abs in ctx.walk(m):
> +                if abs not in names:
> +                    names[abs] = m.rel(abs), m.exact(abs)
> +
> +            # Find status of all file in `names`.
> +            m = scmutil.matchfiles(repo, names)
> +
> +            changes = repo.status(node1=node, match=m,
> +                                  unknown=True, ignored=True, clean=True)
> +        else:
> +            changes = repo.status(match=m)
> +            for kind in changes:
> +                for abs in kind:
> +                    names[abs] = m.rel(abs), m.exact(abs)
> +
> +            m = scmutil.matchfiles(repo, names)
> +
>          modified = set(changes[0])
>          added    = set(changes[1])
>          removed  = set(changes[2])
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -2480,34 +2480,43 @@ 
         # walk dirstate to fill `names`
 
         m = scmutil.match(repo[None], pats, opts)
-        m.bad = lambda x, y: False
-        for abs in repo.walk(m):
-            names[abs] = m.rel(abs), m.exact(abs)
-
-        # walk target manifest to fill `names`
-
-        def badfn(path, msg):
-            if path in names:
-                return
-            if path in ctx.substate:
-                return
-            path_ = path + '/'
-            for f in names:
-                if f.startswith(path_):
+        if not m.always() or node != parent:
+            m.bad = lambda x, y: False
+            for abs in repo.walk(m):
+                names[abs] = m.rel(abs), m.exact(abs)
+
+            # walk target manifest to fill `names`
+
+            def badfn(path, msg):
+                if path in names:
                     return
-            ui.warn("%s: %s\n" % (m.rel(path), msg))
-
-        m = scmutil.match(ctx, pats, opts)
-        m.bad = badfn
-        for abs in ctx.walk(m):
-            if abs not in names:
-                names[abs] = m.rel(abs), m.exact(abs)
-
-        # Find status of all file in `names`.
-        m = scmutil.matchfiles(repo, names)
-
-        changes = repo.status(node1=node, match=m,
-                              unknown=True, ignored=True, clean=True)
+                if path in ctx.substate:
+                    return
+                path_ = path + '/'
+                for f in names:
+                    if f.startswith(path_):
+                        return
+                ui.warn("%s: %s\n" % (m.rel(path), msg))
+
+            m = scmutil.match(ctx, pats, opts)
+            m.bad = badfn
+            for abs in ctx.walk(m):
+                if abs not in names:
+                    names[abs] = m.rel(abs), m.exact(abs)
+
+            # Find status of all file in `names`.
+            m = scmutil.matchfiles(repo, names)
+
+            changes = repo.status(node1=node, match=m,
+                                  unknown=True, ignored=True, clean=True)
+        else:
+            changes = repo.status(match=m)
+            for kind in changes:
+                for abs in kind:
+                    names[abs] = m.rel(abs), m.exact(abs)
+
+            m = scmutil.matchfiles(repo, names)
+
         modified = set(changes[0])
         added    = set(changes[1])
         removed  = set(changes[2])