From patchwork Sun Nov 22 01:14:02 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: [5, of, 6, frozen-repos] localrepo: evaluate revsets against frozen repos
From: Gregory Szorc
X-Patchwork-Id: 11569
Message-Id:
To: mercurial-devel@selenic.com
Date: Sat, 21 Nov 2015 17:14:02 -0800

# HG changeset patch
# User Gregory Szorc
# Date 1448146482 28800
#      Sat Nov 21 14:54:42 2015 -0800
# Node ID ad891b362564b30ec69bc844a3fd98e73b6e032e
# Parent  888c2171adffa8340406b50aae02375f7bef50f4
localrepo: evaluate revsets against frozen repos

Previously, revsets were evaluated against a repository/changelog that
could change. This felt wrong. And, changectx lookups during revset
evaluation would result in repoview constantly performing changelog
correctness checks, adding overhead.

This patch results in some significant performance wins, especially
when changectx are involved. There are some minor regressions, but the
absolute time increase is so small that they can arguably be ignored.
A detailed analysis follows.

Running the revset benchmarks in default mode of returning integers,
we see some interesting changes:

revset #1: draft()
   plain
0) 0.000040
1) 0.000053 132%

   plain
0) 0.000233
1) 0.000236

revset #7: author(lmoscovicz)
   plain
0) 0.994968
1) 0.702156  70%

revset #8: author(mpm)
   plain
0) 0.982039
1) 0.696124  70%

revset #9: author(lmoscovicz) or author(mpm)
   plain
0) 1.944505
1) 1.372315  70%

revset #10: author(mpm) or author(lmoscovicz)
   plain
0) 1.970464
1) 1.393157  70%

revset #13: roots((tip~100::) - (tip~100::tip))
   plain
0) 0.000636
1) 0.000603  94%

revset #15: 42:68 and roots(42:tip)
   plain
0) 0.000226
1) 0.000178  78%

revset #19: draft()
   plain
0) 0.000040
1) 0.000056 140%

revset #22: (not public() - obsolete())
   plain
0) 0.000088
1) 0.000111 126%

revset #23: (_intlist('20000\x0020001')) and merge()
   plain
0) 0.000066
1) 0.000086 130%

First, the improvements. revsets with author() improved significantly.
The reason is that, unlike most revset functions, author() needs to
obtain a changectx to inspect the author field. And since
repo.changelog lookups are faster, that revset function became faster.

Now, the regressions. The percentages here are concerning. However,
when you look at the absolute values, we're going from e.g. 40us/call
to 53us/call. Across the board, very fast revsets regressed by
10-20us. Profiling reveals this regression is because of the creation
and instantiation of the new frozen repo class. The very act of
dynamically defining and then instantiating *any* proxy class appears
to add this overhead, even if the class's __init__ does practically
nothing. While there is a regression here, the wall times are so small
that I don't think it is concerning.

It's also worth noting that if I modify `hg perfrevset` to re-use the
same frozen repo instance (instead of instantiating a new one every
benchmark loop), benchmark times improve across the board! So, as more
internal consumers start using frozen repos, there will be fewer
frozen repo classes instantiated and less of a performance penalty. Of
course, a penalty of 20us is probably not worth worrying about.
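
The fixed per-call cost described above can be reproduced outside of
Mercurial. The following is a minimal standalone sketch, not code from
this series: Repo, make_frozen, FrozenProxy and evaluate are
hypothetical stand-ins, and the assumption is that the frozen repo is
built by defining a proxy subclass on the fly, as the profiling notes
suggest. It times a cheap operation with a freshly defined proxy class
per call against a single reused instance.

```python
import timeit


class Repo(object):
    """Stand-in for a repository object (hypothetical, not Mercurial's)."""

    def __init__(self):
        self.revisions = list(range(100))


def make_frozen(repo):
    """Define a proxy class on the fly and instantiate it, once per call."""

    class FrozenProxy(repo.__class__):
        def __init__(self, wrapped):
            # Nearly empty __init__: only snapshot the data we need.
            self.revisions = tuple(wrapped.revisions)

    return FrozenProxy(repo)


def evaluate(frozen):
    """Stand-in for a very cheap revset evaluation."""
    return len(frozen.revisions)


repo = Repo()
n = 2000

# A new proxy class is defined and instantiated on every iteration.
per_call = timeit.timeit(lambda: evaluate(make_frozen(repo)), number=n)

# One proxy instance is created up front and then reused.
shared = make_frozen(repo)
reused = timeit.timeit(lambda: evaluate(shared), number=n)

print("new proxy per call: %.1f us/call" % (per_call / n * 1e6))
print("reused proxy:       %.1f us/call" % (reused / n * 1e6))
```

Absolute numbers depend on the machine and Python version, so none are
quoted here. The point of the sketch is only that the class definition
is a fixed cost paid on every call unless the instance is shared.
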

When we benchmark revsets obtaining changectxs, we see a more drastic
improvement:

revset #0: all()
   plain
0) 0.164450
1) 0.057012  34%

revset #1: draft()
   plain
0) 0.000131
1) 0.000091  69%

revset #2: ::tip
   plain
0) 0.202160
1) 0.094987  46%

revset #4: ::tip and draft()
   plain
0) 0.000269
1) 0.000251  93%

revset #5: 0::tip
   plain
0) 0.165030
1) 0.056363  34%

revset #7: author(lmoscovicz)
   plain
0) 0.997949
1) 0.727015  72%

revset #8: author(mpm)
   plain
0) 1.015544
1) 0.743669  73%

revset #9: author(lmoscovicz) or author(mpm)
   plain
0) 2.087182
1) 1.483129  71%

revset #10: author(mpm) or author(lmoscovicz)
   plain
0) 2.087200
1) 1.482509  71%

revset #11: tip:0
   plain
0) 0.169505
1) 0.056132  33%

revset #12: 0::
   plain
0) 0.210607
1) 0.089506  42%

revset #13: roots((tip~100::) - (tip~100::tip))
   plain
0) 0.000701
1) 0.000594  84%

revset #15: 42:68 and roots(42:tip)
   plain
0) 0.000234
1) 0.000177  75%

revset #16: ::p1(p1(tip))::
   plain
0) 0.266980
1) 0.144333  54%

revset #17: public()
   plain
0) 0.190038
1) 0.068690  36%

revset #18: :10000 and public()
   plain
0) 0.070399
1) 0.026308  37%

revset #19: draft()
   plain
0) 0.000131
1) 0.000090  68%

revset #22: (not public() - obsolete())
   plain
0) 0.000187
1) 0.000146  78%

revset #23: (_intlist('20000\x0020001')) and merge()
   plain
0) 0.000067
1) 0.000083 123%

revset #25: (20000::) - (20000)
   plain
0) 0.073760
1) 0.044524  60%

revset #26: (children(ancestor(tip~5, tip)) and ::(tip~5))::
   plain
0) 0.041947
1) 0.039813  94%

The one regression in revset #23 is likely probe overhead and normal
variation due to the extremely small times involved.

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -530,26 +530,27 @@ class localrepository(object):
         The revset is specified as a string ``expr`` that may contain
         %-formatting to escape certain types. See ``revset.formatspec``.
 
         Return a revset.abstractsmartset, which is a list-like interface
         that contains integer revisions.
         '''
         expr = revset.formatspec(expr, *args)
         m = revset.match(None, expr)
-        return m(self)
+        return m(self.frozen())
 
     def set(self, expr, *args):
         '''Find revisions matching a revset and emit changectx instances.
 
         This is a convenience wrapper around ``revs()`` that iterates the
         result and is a generator of changectx instances.
         '''
-        for r in self.revs(expr, *args):
-            yield self[r]
+        frozen = self.frozen()
+        for r in frozen.revs(expr, *args):
+            yield frozen[r]
 
     def url(self):
         return 'file:' + self.root
 
     def hook(self, name, throw=False, **args):
         """Call a hook, passing this repo instance.
 
         This a convenience method to aid invoking hooks. Extensions likely