Patchwork D1253: dirstate: avoid reading the map when possible (issue5713) (issue5717)

login
register
mail settings
Submitter phabricator
Date Oct. 26, 2017, 11:25 p.m.
Message ID <differential-rev-PHID-DREV-eeotwvw7mzegor25dxh3-req@phab.mercurial-scm.org>
Download mbox | patch
Permalink /patch/25289/
State Superseded
Headers show

Comments

phabricator - Oct. 26, 2017, 11:25 p.m.
durham created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Before the recent refactor, we would not load the entire map until it was
  accessed. As part of the refactor, that got lost and even just trying to load
  the dirstate parents would load the whole map. This caused a perf regression
  (issue5713) and a regression with static http serving (issue5717).
  
  Making it lazy loaded again fixes both.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

AFFECTED FILES
  mercurial/dirstate.py

CHANGE DETAILS




To: durham, #hg-reviewers
Cc: mercurial-devel
phabricator - Oct. 27, 2017, 3:32 a.m.
mharbison72 added a comment.


  This makes Windows mostly happy again, but now I'm getting the following diff.  Any ideas?
  
  diff --git a/tests/test-static-http.t b/tests/test-static-http.t
  
  - a/tests/test-static-http.t
  
  +++ b/tests/test-static-http.t
  @@ -221,34 +221,31 @@
  
    $ cat server.log | sed -n -e 's|.*GET \(/[^ ]*\).*|\1|p' | sort -u
    /.hg/bookmarks
    /.hg/bookmarks.current
  
  - /.hg/cache/hgtagsfnodes1 /.hg/requires /.hg/store/00changelog.i /.hg/store/00manifest.i /.hg/store/data/%7E2ehgsub.i /.hg/store/data/%7E2ehgsubstate.i /.hg/store/data/a.i
  
  +  /.hg/cache/hgtagsfnodes1 (glob)
  
    /notarepo/.hg/00changelog.i
    /notarepo/.hg/requires
    /remote-with-names/.hg/bookmarks
    /remote-with-names/.hg/bookmarks.current
  
  - /remote-with-names/.hg/cache/branch2-served
  - /remote-with-names/.hg/cache/hgtagsfnodes1
  - /remote-with-names/.hg/cache/tags2-served /remote-with-names/.hg/localtags /remote-with-names/.hg/requires /remote-with-names/.hg/store/00changelog.i /remote-with-names/.hg/store/00manifest.i /remote-with-names/.hg/store/data/%7E2ehgtags.i /remote-with-names/.hg/store/data/foo.i
  
  +  /remote-with-names/.hg/cache/branch2-base (glob)
  +  /remote-with-names/.hg/cache/branch2-immutable (glob)
  +  /remote-with-names/.hg/cache/branch2-served (glob)
  +  /remote-with-names/.hg/cache/hgtagsfnodes1 (glob)
  +  /remote-with-names/.hg/cache/rbc-names-v1 (glob)
  +  /remote-with-names/.hg/cache/tags2-served (glob)
  
    /remote/.hg/bookmarks
    /remote/.hg/bookmarks.current
  
  - /remote/.hg/cache/branch2-base
  - /remote/.hg/cache/branch2-immutable
  - /remote/.hg/cache/branch2-served
  - /remote/.hg/cache/hgtagsfnodes1
  - /remote/.hg/cache/rbc-names-v1
  - /remote/.hg/cache/tags2-served /remote/.hg/localtags /remote/.hg/requires /remote/.hg/store/00changelog.i
  
  @@ -257,6 +254,12 @@
  
    /remote/.hg/store/data/%7E2ehgtags.i
    /remote/.hg/store/data/bar.i
    /remote/.hg/store/data/quux.i
  
  +  /remote/.hg/cache/branch2-base (glob)
  +  /remote/.hg/cache/branch2-immutable (glob)
  +  /remote/.hg/cache/branch2-served (glob)
  +  /remote/.hg/cache/hgtagsfnodes1 (glob)
  +  /remote/.hg/cache/rbc-names-v1 (glob)
  +  /remote/.hg/cache/tags2-served (glob)
  
    /remotempty/.hg/bookmarks
    /remotempty/.hg/bookmarks.current
    /remotempty/.hg/requires
  
  @@ -264,9 +267,9 @@
  
    /remotempty/.hg/store/00manifest.i
    /sub/.hg/bookmarks
    /sub/.hg/bookmarks.current
  
  - /sub/.hg/cache/hgtagsfnodes1 /sub/.hg/requires /sub/.hg/store/00changelog.i /sub/.hg/store/00manifest.i /sub/.hg/store/data/%7E2ehgtags.i /sub/.hg/store/data/test.i
  
  +  /sub/.hg/cache/hgtagsfnodes1 (glob)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

To: durham, #hg-reviewers
Cc: mharbison72, mercurial-devel
phabricator - Oct. 27, 2017, 3:42 a.m.
mharbison72 added a comment.


  Try that again without the reformatting
  
    diff --git a/tests/test-static-http.t b/tests/test-static-http.t
    --- a/tests/test-static-http.t
    +++ b/tests/test-static-http.t
    @@ -221,34 +221,31 @@
       $ cat server.log | sed -n -e 's|.*GET \(/[^ ]*\).*|\1|p' | sort -u
       /.hg/bookmarks
       /.hg/bookmarks.current
    -  /.hg/cache/hgtagsfnodes1
       /.hg/requires
       /.hg/store/00changelog.i
       /.hg/store/00manifest.i
       /.hg/store/data/%7E2ehgsub.i
       /.hg/store/data/%7E2ehgsubstate.i
       /.hg/store/data/a.i
    +  /.hg/cache/hgtagsfnodes1 (glob)
       /notarepo/.hg/00changelog.i
       /notarepo/.hg/requires
       /remote-with-names/.hg/bookmarks
       /remote-with-names/.hg/bookmarks.current
    -  /remote-with-names/.hg/cache/branch2-served
    -  /remote-with-names/.hg/cache/hgtagsfnodes1
    -  /remote-with-names/.hg/cache/tags2-served
       /remote-with-names/.hg/localtags
       /remote-with-names/.hg/requires
       /remote-with-names/.hg/store/00changelog.i
       /remote-with-names/.hg/store/00manifest.i
       /remote-with-names/.hg/store/data/%7E2ehgtags.i
       /remote-with-names/.hg/store/data/foo.i
    +  /remote-with-names/.hg/cache/branch2-base (glob)
    +  /remote-with-names/.hg/cache/branch2-immutable (glob)
    +  /remote-with-names/.hg/cache/branch2-served (glob)
    +  /remote-with-names/.hg/cache/hgtagsfnodes1 (glob)
    +  /remote-with-names/.hg/cache/rbc-names-v1 (glob)
    +  /remote-with-names/.hg/cache/tags2-served (glob)
       /remote/.hg/bookmarks
       /remote/.hg/bookmarks.current
    -  /remote/.hg/cache/branch2-base
    -  /remote/.hg/cache/branch2-immutable
    -  /remote/.hg/cache/branch2-served
    -  /remote/.hg/cache/hgtagsfnodes1
    -  /remote/.hg/cache/rbc-names-v1
    -  /remote/.hg/cache/tags2-served
       /remote/.hg/localtags
       /remote/.hg/requires
       /remote/.hg/store/00changelog.i
    @@ -257,6 +254,12 @@
       /remote/.hg/store/data/%7E2ehgtags.i
       /remote/.hg/store/data/bar.i
       /remote/.hg/store/data/quux.i
    +  /remote/.hg/cache/branch2-base (glob)
    +  /remote/.hg/cache/branch2-immutable (glob)
    +  /remote/.hg/cache/branch2-served (glob)
    +  /remote/.hg/cache/hgtagsfnodes1 (glob)
    +  /remote/.hg/cache/rbc-names-v1 (glob)
    +  /remote/.hg/cache/tags2-served (glob)
       /remotempty/.hg/bookmarks
       /remotempty/.hg/bookmarks.current
       /remotempty/.hg/requires
    @@ -264,9 +267,9 @@
       /remotempty/.hg/store/00manifest.i
       /sub/.hg/bookmarks
       /sub/.hg/bookmarks.current
    -  /sub/.hg/cache/hgtagsfnodes1
       /sub/.hg/requires
       /sub/.hg/store/00changelog.i
       /sub/.hg/store/00manifest.i
       /sub/.hg/store/data/%7E2ehgtags.i
       /sub/.hg/store/data/test.i
    +  /sub/.hg/cache/hgtagsfnodes1 (glob)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

To: durham, #hg-reviewers
Cc: mharbison72, mercurial-devel
phabricator - Oct. 27, 2017, 2:27 p.m.
lothiraldan accepted this revision.
lothiraldan added a comment.


  It looks good to me.
  
  I did some quick and dirty performance check and I see a performance improvement. On the mozilla-central repository, without this change, I get these results:
  
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 154 ms +- 2 ms
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 154 ms +- 2 ms
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 156 ms +- 3 ms
  
  And with this changeset I get better results:
  
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 119 ms +- 3 ms
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 120 ms +- 3 ms
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 123 ms +- 7 ms
  
  Where perf is https://perf.readthedocs.io/en/latest/

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

To: durham, #hg-reviewers, lothiraldan
Cc: lothiraldan, mharbison72, mercurial-devel
phabricator - Oct. 27, 2017, 3:17 p.m.
yuja added a comment.


  (I don't read the patch content yet.)
  
  I just made static-http repo not see dirstate because of the issue5717.
  I don't know the issue5713, but a fewer loading over HTTP should be generally better.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

To: durham, #hg-reviewers, lothiraldan
Cc: yuja, lothiraldan, mharbison72, mercurial-devel
phabricator - Oct. 27, 2017, 3:45 p.m.
lothiraldan added a comment.


  Here are some measures https://phab.mercurial-scm.org/rHG0865d25e8a8ad664e8da31ab1bf12f1fc9bf0c7a:
  
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 109 ms +- 3 ms
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 110 ms +- 2 ms
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 111 ms +- 3 ms
  
  And here are the mesaures on https://phab.mercurial-scm.org/rHGc36c3fa7d35b0aa1c29d316927b21f77ecc809b8:
  
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 166 ms +- 5 ms
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 165 ms +- 3 ms
    $ python -m perf command /home/lothiraldan/project/mercurial/mercurial/hg id -r .
    .....................
    command: Mean +- std dev: 166 ms +- 5 ms
  
  So it's seems that this patch solves most of the performance regression, but with so many changesets since https://phab.mercurial-scm.org/rHGc36c3fa7d35b0aa1c29d316927b21f77ecc809b8 I'm not sure we can do better.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

To: durham, #hg-reviewers, lothiraldan
Cc: yuja, lothiraldan, mharbison72, mercurial-devel
phabricator - Oct. 27, 2017, 3:54 p.m.
durham added a comment.


  @mharbison72 your test output looks like the sort order changed?  Does it consistently repro before and after this patch?  The sort order is from the sort -u in the test, so I can't imagine this affecting it.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

To: durham, #hg-reviewers, lothiraldan
Cc: yuja, lothiraldan, mharbison72, mercurial-devel
phabricator - Oct. 28, 2017, 7:54 a.m.
yuja added a comment.


  Queued up to this patch, thanks.

INLINE COMMENTS

> dirstate.py:132
>          (state, mode, size, time).'''
> -        self._read()
> +        self._map = dirstatemap(self._ui, self._opener, self._root)
>          return self._map

Perhaps this can be moved to __init__() since calling dirstatemap()
doesn't do any IO now.

> dirstate.py:1209
> +        self._map = {}
> +        self.read()
> +        return self._map

Nit: dirstatemap.read() could be a private function.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

To: durham, #hg-reviewers, lothiraldan, yuja
Cc: yuja, lothiraldan, mharbison72, mercurial-devel
phabricator - Oct. 28, 2017, 6:54 p.m.
lothiraldan added a comment.


  After rereading the numbers, I realize that with this patch we are still 10ms (around 10%) slower than before. The numbers before `0865d25e8a8a` were `[109, 110, 111]`ms and with this patch we are at `[119, 120, 123]`ms.
  
  We should take a look at it.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

To: durham, #hg-reviewers, lothiraldan, yuja
Cc: yuja, lothiraldan, mharbison72, mercurial-devel
phabricator - Oct. 29, 2017, 1:55 a.m.
mharbison72 added a comment.


  @durham It did turn out to be a sort issue: some paths had '\' and were (partially?) converted to '/' by the test runner _after_ the sort.  Piping through sed to fix the paths beforehand reduces it to this:
  
    --- c:/Users/Matt/projects/hg/tests/test-static-http.t
    +++ c:/Users/Matt/projects/hg/tests/test-static-http.t.err
    @@ -232,8 +232,11 @@
       /notarepo/.hg/requires
       /remote-with-names/.hg/bookmarks
       /remote-with-names/.hg/bookmarks.current
    +  /remote-with-names/.hg/cache/branch2-base
    +  /remote-with-names/.hg/cache/branch2-immutable
       /remote-with-names/.hg/cache/branch2-served
       /remote-with-names/.hg/cache/hgtagsfnodes1
    +  /remote-with-names/.hg/cache/rbc-names-v1
       /remote-with-names/.hg/cache/tags2-served
       /remote-with-names/.hg/localtags
       /remote-with-names/.hg/requires
  
  Not sure why yet, but this patch isn't the problem.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1253

To: durham, #hg-reviewers, lothiraldan, yuja
Cc: yuja, lothiraldan, mharbison72, mercurial-devel

Patch

diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -129,7 +129,7 @@ 
     def _map(self):
         '''Return the dirstate contents as a map from filename to
         (state, mode, size, time).'''
-        self._read()
+        self._map = dirstatemap(self._ui, self._opener, self._root)
         return self._map
 
     @property
@@ -353,10 +353,6 @@ 
             f.discard()
             raise
 
-    def _read(self):
-        self._map = dirstatemap(self._ui, self._opener, self._root)
-        self._map.read()
-
     def invalidate(self):
         '''Causes the next access to reread the dirstate.
 
@@ -1201,14 +1197,24 @@ 
         self._root = root
         self._filename = 'dirstate'
 
-        self._map = {}
-        self.copymap = {}
         self._parents = None
         self._dirtyparents = False
 
         # for consistent view between _pl() and _read() invocations
         self._pendingmode = None
 
+    @propertycache
+    def _map(self):
+        self._map = {}
+        self.read()
+        return self._map
+
+    @propertycache
+    def copymap(self):
+        self.copymap = {}
+        self._map
+        return self.copymap
+
     def clear(self):
         self._map = {}
         self.copymap = {}
@@ -1388,7 +1394,7 @@ 
 
     @propertycache
     def identity(self):
-        self.read()
+        self._map
         return self.identity
 
     @propertycache