Submitter | Pulkit Goyal |
---|---|
Date | March 16, 2017, 4:13 a.m. |
Message ID | <b5673a08993652a92c0e.1489637600@pulkit-goyal> |
Download | mbox | patch |
Permalink | /patch/19378/ |
State | Accepted |
Headers | show |
Comments
On Wed, Mar 15, 2017 at 9:13 PM, Pulkit Goyal <7895pulkit@gmail.com> wrote: > # HG changeset patch > # User Pulkit Goyal <7895pulkit@gmail.com> > # Date 1489635027 -19800 > # Thu Mar 16 09:00:27 2017 +0530 > # Node ID b5673a08993652a92c0e20a4e24d842194872454 > # Parent 69aafa5d3e0f109c7b501758381f65fd9b375196 > dirstate: use list comprehension to get a list of keys > > We have used dict.keys() which returns a dict_keys() object instead > of list on Python 3. So this patch replaces that with list comprehension > which works both on Python 2 and 3. > > diff -r 69aafa5d3e0f -r b5673a089936 mercurial/dirstate.py > --- a/mercurial/dirstate.py Thu Mar 16 08:57:53 2017 +0530 > +++ b/mercurial/dirstate.py Thu Mar 16 09:00:27 2017 +0530 > @@ -1079,7 +1079,7 @@ > # a) not matching matchfn b) ignored, c) missing, or d) under a > # symlink directory. > if not results and matchalways: > - visit = dmap.keys() > + visit = [f for f in dmap] Dirstate performance is critical. I don't know if this particular place matters. Could you see if this makes a measurable difference on a large repo like Mozilla's? > else: > visit = [f for f in dmap if f not in results and matchfn(f)] > visit.sort() > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
On Wed, Mar 15, 2017 at 10:19 PM, Pulkit Goyal <7895pulkit@gmail.com> wrote: > > > On Thu, Mar 16, 2017 at 10:30 AM, Martin von Zweigbergk > <martinvonz@google.com> wrote: >> >> On Wed, Mar 15, 2017 at 9:13 PM, Pulkit Goyal <7895pulkit@gmail.com> >> wrote: >> > # HG changeset patch >> > # User Pulkit Goyal <7895pulkit@gmail.com> >> > # Date 1489635027 -19800 >> > # Thu Mar 16 09:00:27 2017 +0530 >> > # Node ID b5673a08993652a92c0e20a4e24d842194872454 >> > # Parent 69aafa5d3e0f109c7b501758381f65fd9b375196 >> > dirstate: use list comprehension to get a list of keys >> > >> > We have used dict.keys() which returns a dict_keys() object instead >> > of list on Python 3. So this patch replaces that with list comprehension >> > which works both on Python 2 and 3. >> > >> > diff -r 69aafa5d3e0f -r b5673a089936 mercurial/dirstate.py >> > --- a/mercurial/dirstate.py Thu Mar 16 08:57:53 2017 +0530 >> > +++ b/mercurial/dirstate.py Thu Mar 16 09:00:27 2017 +0530 >> > @@ -1079,7 +1079,7 @@ >> > # a) not matching matchfn b) ignored, c) missing, or d) >> > under a >> > # symlink directory. >> > if not results and matchalways: >> > - visit = dmap.keys() >> > + visit = [f for f in dmap] >> >> Dirstate performance is critical. I don't know if this particular >> place matters. Could you see if this makes a measurable difference on >> a large repo like Mozilla's? > > > I didn't tried on Mozilla but I can see significance performance difference. Where did you see that difference? Simply when running "hg status" in the hg core repo? > I will send a V2 with not touching the dirstate code. Note that 1-3 are already queued.
On Wed, Mar 15, 2017 at 10:40 PM, Pulkit Goyal <7895pulkit@gmail.com> wrote: > > > On Thu, Mar 16, 2017 at 10:55 AM, Martin von Zweigbergk > <martinvonz@google.com> wrote: >> >> On Wed, Mar 15, 2017 at 10:19 PM, Pulkit Goyal <7895pulkit@gmail.com> >> wrote: >> > >> > >> > On Thu, Mar 16, 2017 at 10:30 AM, Martin von Zweigbergk >> > <martinvonz@google.com> wrote: >> >> >> >> On Wed, Mar 15, 2017 at 9:13 PM, Pulkit Goyal <7895pulkit@gmail.com> >> >> wrote: >> >> > # HG changeset patch >> >> > # User Pulkit Goyal <7895pulkit@gmail.com> >> >> > # Date 1489635027 -19800 >> >> > # Thu Mar 16 09:00:27 2017 +0530 >> >> > # Node ID b5673a08993652a92c0e20a4e24d842194872454 >> >> > # Parent 69aafa5d3e0f109c7b501758381f65fd9b375196 >> >> > dirstate: use list comprehension to get a list of keys >> >> > >> >> > We have used dict.keys() which returns a dict_keys() object instead >> >> > of list on Python 3. So this patch replaces that with list >> >> > comprehension >> >> > which works both on Python 2 and 3. >> >> > >> >> > diff -r 69aafa5d3e0f -r b5673a089936 mercurial/dirstate.py >> >> > --- a/mercurial/dirstate.py Thu Mar 16 08:57:53 2017 +0530 >> >> > +++ b/mercurial/dirstate.py Thu Mar 16 09:00:27 2017 +0530 >> >> > @@ -1079,7 +1079,7 @@ >> >> > # a) not matching matchfn b) ignored, c) missing, or d) >> >> > under a >> >> > # symlink directory. >> >> > if not results and matchalways: >> >> > - visit = dmap.keys() >> >> > + visit = [f for f in dmap] >> >> >> >> Dirstate performance is critical. I don't know if this particular >> >> place matters. Could you see if this makes a measurable difference on >> >> a large repo like Mozilla's? >> > >> > >> > I didn't tried on Mozilla but I can see significance performance >> > difference. >> >> Where did you see that difference? Simply when running "hg status" in >> the hg core repo? > > > I did the following and saw significant difference. This can be wrong as I > don't know much about the factors taken into consideration while computing > the time. > >>>> dic = {1:2, 3:4, 5:6, 7:8} > >>>> def a(): > ... return dic.keys() > ... > >>>> def b(): > ... return [ls for ls in dic] > ... > >>>> timeit.timeit(b, number=1000) > 0.0006921291351318359 > >>>> timeit.timeit(a, number=1000) > 0.0004601478576660156 That is quite a significant difference. However, it may not be significant in the context of any commands the user may run, and if that's not the case, it doesn't matter if it got slower. Actually, I ended up testing it myself on the Mozilla repo. It turns out "hg status -a" causes that line to be run. I could not see any difference at all in the time of "hg status -a" (stable at 450ms). The repo had about 67k files at the revision I tested. So I'll queue this patch too, thanks. > >> >> >> > I will send a V2 with not touching the dirstate code. >> >> Note that 1-3 are already queued. > > > Thanks! > >
Patch
diff -r 69aafa5d3e0f -r b5673a089936 mercurial/dirstate.py --- a/mercurial/dirstate.py Thu Mar 16 08:57:53 2017 +0530 +++ b/mercurial/dirstate.py Thu Mar 16 09:00:27 2017 +0530 @@ -1079,7 +1079,7 @@ # a) not matching matchfn b) ignored, c) missing, or d) under a # symlink directory. if not results and matchalways: - visit = dmap.keys() + visit = [f for f in dmap] else: visit = [f for f in dmap if f not in results and matchfn(f)] visit.sort()