Patchwork [4,of,4,V2] parsers: pure implementation of the new dirstate ordering

login
register
mail settings
Submitter Laurent Charignon
Date Dec. 1, 2015, 12:52 a.m.
Message ID <f9959f2e4cd73d0bf06c.1448931166@lcharignon-mbp.local>
Download mbox | patch
Permalink /patch/11698/
State Changes Requested
Headers show

Comments

Laurent Charignon - Dec. 1, 2015, 12:52 a.m.
# HG changeset patch
# User Laurent Charignon <lcharignon@fb.com>
# Date 1448411881 28800
#      Tue Nov 24 16:38:01 2015 -0800
# Node ID f9959f2e4cd73d0bf06c6a325c4a33e4f5bd1a04
# Parent  749f9f5d6476c60ba9841415e4e94358fe3adf01
parsers: pure implementation of the new dirstate ordering

This patch makes the pure implementation of pack_dirstate up to date with the
C implementation. It effectively writes the dirstate starting with the
non-normal files and ends with the normal files.
Sean Farley - Dec. 1, 2015, 1:30 a.m.
Laurent Charignon <lcharignon@fb.com> writes:

> # HG changeset patch
> # User Laurent Charignon <lcharignon@fb.com>
> # Date 1448411881 28800
> #      Tue Nov 24 16:38:01 2015 -0800
> # Node ID f9959f2e4cd73d0bf06c6a325c4a33e4f5bd1a04
> # Parent  749f9f5d6476c60ba9841415e4e94358fe3adf01
> parsers: pure implementation of the new dirstate ordering
>
> This patch makes the pure implementation of pack_dirstate up to date with the
> C implementation. It effectively writes the dirstate starting with the
> non-normal files and ends with the normal files.

Sorry if this was answered in the first version but are we now
(unconditionally) going to write the dirstate in order? I assume because
there would be some speedup (IIRC Durham had a patch assuming the order
of the dirstate)?
Sean Farley - Dec. 1, 2015, 2:18 a.m.
Laurent Charignon <lcharignon@fb.com> writes:

> # HG changeset patch
> # User Laurent Charignon <lcharignon@fb.com>
> # Date 1448411881 28800
> #      Tue Nov 24 16:38:01 2015 -0800
> # Node ID f9959f2e4cd73d0bf06c6a325c4a33e4f5bd1a04
> # Parent  749f9f5d6476c60ba9841415e4e94358fe3adf01
> parsers: pure implementation of the new dirstate ordering
>
> This patch makes the pure implementation of pack_dirstate up to date with the
> C implementation. It effectively writes the dirstate starting with the
> non-normal files and ends with the normal files.
>
> diff --git a/mercurial/pure/parsers.py b/mercurial/pure/parsers.py
> --- a/mercurial/pure/parsers.py
> +++ b/mercurial/pure/parsers.py
> @@ -88,8 +88,13 @@ def pack_dirstate(dmap, copymap, pl, now
>      cs = cStringIO.StringIO()
>      write = cs.write
>      write("".join(pl))
> -    if True:
> +    # First pass: non normal files, second pass: normal files to improve status
> +    # performance as status generally only need the non normal files
> +    for _pass in [0, 1]:

Very minor nit: this should probably be (0, 1) since we're not modifying
_pass.
Sean Farley - Dec. 1, 2015, 2:20 a.m.
Sean Farley <sean@farley.io> writes:

> Laurent Charignon <lcharignon@fb.com> writes:
>
>> # HG changeset patch
>> # User Laurent Charignon <lcharignon@fb.com>
>> # Date 1448411881 28800
>> #      Tue Nov 24 16:38:01 2015 -0800
>> # Node ID f9959f2e4cd73d0bf06c6a325c4a33e4f5bd1a04
>> # Parent  749f9f5d6476c60ba9841415e4e94358fe3adf01
>> parsers: pure implementation of the new dirstate ordering
>>
>> This patch makes the pure implementation of pack_dirstate up to date with the
>> C implementation. It effectively writes the dirstate starting with the
>> non-normal files and ends with the normal files.
>
> Sorry if this was answered in the first version but are we now
> (unconditionally) going to write the dirstate in order? I assume because
> there would be some speedup (IIRC Durham had a patch assuming the order
> of the dirstate)?

Regardless of the answer to my question, I like the speed up of the
series. The C code looks fine to me and the slowdown is minor enough
that this passes my review.
Augie Fackler - Dec. 1, 2015, 4:20 p.m.
On Mon, Nov 30, 2015 at 04:52:46PM -0800, Laurent Charignon wrote:
> # HG changeset patch
> # User Laurent Charignon <lcharignon@fb.com>
> # Date 1448411881 28800
> #      Tue Nov 24 16:38:01 2015 -0800
> # Node ID f9959f2e4cd73d0bf06c6a325c4a33e4f5bd1a04
> # Parent  749f9f5d6476c60ba9841415e4e94358fe3adf01
> parsers: pure implementation of the new dirstate ordering
>
> This patch makes the pure implementation of pack_dirstate up to date with the
> C implementation. It effectively writes the dirstate starting with the
> non-normal files and ends with the normal files.
>
> diff --git a/mercurial/pure/parsers.py b/mercurial/pure/parsers.py
> --- a/mercurial/pure/parsers.py
> +++ b/mercurial/pure/parsers.py
> @@ -88,8 +88,13 @@ def pack_dirstate(dmap, copymap, pl, now
>      cs = cStringIO.StringIO()
>      write = cs.write
>      write("".join(pl))
> -    if True:
> +    # First pass: non normal files, second pass: normal files to improve status
> +    # performance as status generally only need the non normal files
> +    for _pass in [0, 1]:
>          for f, e in dmap.iteritems():
> +            normal = e[0] == 'n' and e[3] != -1
> +            if normal != _pass:

Using boolean equality with integers is weirding me out here. Can I
interest you in doing something like:

for writenormal in False, True:
  normal = (thing)
  if normal and not writenormal:
    continue

instead of what you've got?

> +                continue
>              if e[0] == 'n' and e[3] == now:
>                  # The file was last modified "simultaneously" with the current
>                  # write to dirstate (i.e. within the same second for file-
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel
Augie Fackler - Dec. 1, 2015, 5:39 p.m.
On Mon, Nov 30, 2015 at 04:52:46PM -0800, Laurent Charignon wrote:
> # HG changeset patch
> # User Laurent Charignon <lcharignon@fb.com>
> # Date 1448411881 28800
> #      Tue Nov 24 16:38:01 2015 -0800
> # Node ID f9959f2e4cd73d0bf06c6a325c4a33e4f5bd1a04
> # Parent  749f9f5d6476c60ba9841415e4e94358fe3adf01
> parsers: pure implementation of the new dirstate ordering
>
> This patch makes the pure implementation of pack_dirstate up to date with the
> C implementation. It effectively writes the dirstate starting with the
> non-normal files and ends with the normal files.
>
> diff --git a/mercurial/pure/parsers.py b/mercurial/pure/parsers.py
> --- a/mercurial/pure/parsers.py
> +++ b/mercurial/pure/parsers.py
> @@ -88,8 +88,13 @@ def pack_dirstate(dmap, copymap, pl, now
>      cs = cStringIO.StringIO()
>      write = cs.write
>      write("".join(pl))
> -    if True:
> +    # First pass: non normal files, second pass: normal files to improve status
> +    # performance as status generally only need the non normal files
> +    for _pass in [0, 1]:
>          for f, e in dmap.iteritems():
> +            normal = e[0] == 'n' and e[3] != -1
> +            if normal != _pass:
> +                continue

Hm, I just realized what I sent isn't even equivalent to this. I'd
definitely like to see same-type comparisons here though.

Might be clearer to extract the writing of entries from the map into a
function which can then write either normal or abnormal files into the
dirstate object?

>              if e[0] == 'n' and e[3] == now:
>                  # The file was last modified "simultaneously" with the current
>                  # write to dirstate (i.e. within the same second for file-
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/pure/parsers.py b/mercurial/pure/parsers.py
--- a/mercurial/pure/parsers.py
+++ b/mercurial/pure/parsers.py
@@ -88,8 +88,13 @@  def pack_dirstate(dmap, copymap, pl, now
     cs = cStringIO.StringIO()
     write = cs.write
     write("".join(pl))
-    if True:
+    # First pass: non normal files, second pass: normal files to improve status
+    # performance as status generally only need the non normal files
+    for _pass in [0, 1]:
         for f, e in dmap.iteritems():
+            normal = e[0] == 'n' and e[3] != -1
+            if normal != _pass:
+                continue
             if e[0] == 'n' and e[3] == now:
                 # The file was last modified "simultaneously" with the current
                 # write to dirstate (i.e. within the same second for file-