Submitter | Siddharth Agarwal |
---|---|
Date | July 15, 2014, 11:15 p.m. |
Message ID | <2d645e993f8cb3d386ae.1405466131@dev1738.prn1.facebook.com> |
Permalink | /patch/5178/ |
State | Accepted |
Commit | d516b6de38210dea25aad7cb59ee999c6cbe37fd |
Comments
On Tue, Jul 15, 2014 at 04:15:31PM -0700, Siddharth Agarwal wrote:
> # HG changeset patch
> # User Siddharth Agarwal <sid0@fb.com>
> # Date 1405463690 25200
> # Tue Jul 15 15:34:50 2014 -0700
> # Node ID 2d645e993f8cb3d386ae520e7233089316e830f2
> # Parent 8ec138de734383da9ab4fd60e4a61054906f50ed
> match: use util.re.escape instead of re.escape

Series looks sensible and straightforward. Queued.

(I particularly like that (ab)use of @propertycache to return different functions and have it look like a class method. Clever.)

>
> For a pathological .hgignore with over 2500 glob lines and over 200000 calls to
> re.escape, and with re2 available, this speeds up parsing the .hgignore from
> 0.75 seconds to 0.20 seconds. This causes e.g. 'hg status' with hgwatchman
> enabled to go from 1.02 seconds to 0.47 seconds.
>
> diff --git a/mercurial/match.py b/mercurial/match.py
> --- a/mercurial/match.py
> +++ b/mercurial/match.py
> @@ -247,7 +247,7 @@
>      i, n = 0, len(pat)
>      res = ''
>      group = 0
> -    escape = re.escape
> +    escape = util.re.escape
>      def peek():
>          return i < n and pat[i]
>      while i < n:
> @@ -310,11 +310,11 @@
>      if kind == 're':
>          return pat
>      if kind == 'path':
> -        return '^' + re.escape(pat) + '(?:/|$)'
> +        return '^' + util.re.escape(pat) + '(?:/|$)'
>      if kind == 'relglob':
>          return '(?:|.*/)' + _globre(pat) + globsuffix
>      if kind == 'relpath':
> -        return re.escape(pat) + '(?:/|$)'
> +        return util.re.escape(pat) + '(?:/|$)'
>      if kind == 'relre':
>          if pat.startswith('^'):
>              return pat
On 07/17/2014 07:26 PM, Augie Fackler wrote:
> On Tue, Jul 15, 2014 at 04:15:31PM -0700, Siddharth Agarwal wrote:
>> # HG changeset patch
>> # User Siddharth Agarwal <sid0@fb.com>
>> # Date 1405463690 25200
>> # Tue Jul 15 15:34:50 2014 -0700
>> # Node ID 2d645e993f8cb3d386ae520e7233089316e830f2
>> # Parent 8ec138de734383da9ab4fd60e4a61054906f50ed
>> match: use util.re.escape instead of re.escape
> Series looks sensible and straightforward. Queued.
>
> (I particularly like that (ab)use of @propertycache to return different
> functions and have it look like a class method. Clever.)

My motivation for it was to avoid an extra function call per invocation of util.re.escape -- once the value's been saved as a local, as is done in patch 9.

>
>> For a pathological .hgignore with over 2500 glob lines and over 200000 calls to
>> re.escape, and with re2 available, this speeds up parsing the .hgignore from
>> 0.75 seconds to 0.20 seconds. This causes e.g. 'hg status' with hgwatchman
>> enabled to go from 1.02 seconds to 0.47 seconds.
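The @propertycache trick discussed in this exchange can be illustrated with a minimal sketch. It is not Mercurial's actual code: the `propertycache` class below is a simplified stand-in for the decorator the thread mentions, and `RegexUtil`, `fast_module`, and `util_re` are hypothetical names chosen for illustration. The idea shown is that the property returns a function rather than a value, so callers write `util_re.escape(...)` as if it were a method, while the choice of implementation is made only once.

```python
import re


class propertycache:
    """Simplified stand-in: compute an attribute once, then cache it."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # Writing to the instance __dict__ shadows this non-data descriptor,
        # so later attribute lookups return the cached value directly.
        obj.__dict__[self.name] = value
        return value


class RegexUtil:
    """Hypothetical wrapper that picks an escape implementation lazily."""

    def __init__(self, fast_module=None):
        # fast_module is an assumption: any object exposing escape(),
        # standing in for an optional faster engine such as re2.
        self._fast = fast_module

    @propertycache
    def escape(self):
        # Return the chosen *function*; no wrapper call is added on top.
        if self._fast is not None:
            return self._fast.escape
        return re.escape


util_re = RegexUtil()
escape = util_re.escape  # bind once to a local, as the patch does
escaped = [escape(p) for p in ("a.b", "README")]
```

After the first access, `util_re.escape` is the underlying function itself, so each call costs one function call rather than a wrapper call plus the real one, which matches the motivation given in the reply.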
Patch
diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -247,7 +247,7 @@
     i, n = 0, len(pat)
     res = ''
     group = 0
-    escape = re.escape
+    escape = util.re.escape
     def peek():
         return i < n and pat[i]
     while i < n:
@@ -310,11 +310,11 @@
     if kind == 're':
         return pat
     if kind == 'path':
-        return '^' + re.escape(pat) + '(?:/|$)'
+        return '^' + util.re.escape(pat) + '(?:/|$)'
     if kind == 'relglob':
         return '(?:|.*/)' + _globre(pat) + globsuffix
     if kind == 'relpath':
-        return re.escape(pat) + '(?:/|$)'
+        return util.re.escape(pat) + '(?:/|$)'
     if kind == 'relre':
         if pat.startswith('^'):
             return pat
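To make the 'path' and 'relpath' branches of the diff concrete, here is a small sketch of the regular expressions they build. The helper names `path_regex` and `relpath_regex` are hypothetical, and the standard library's `re.escape` is used as the default in place of `util.re.escape`; only the string expressions themselves are taken from the patch.

```python
import re


def path_regex(pat, escape=re.escape):
    # 'path' kind: anchored at the start, literal match of pat,
    # followed by either a path separator or the end of the string.
    return '^' + escape(pat) + '(?:/|$)'


def relpath_regex(pat, escape=re.escape):
    # 'relpath' kind: same shape, but without the leading '^' anchor.
    return escape(pat) + '(?:/|$)'


rx = re.compile(path_regex('docs/a.b'))

matches_exact = bool(rx.match('docs/a.b'))            # the path itself
matches_child = bool(rx.match('docs/a.b/file.txt'))   # anything beneath it
matches_other = bool(rx.match('docs/aXb'))            # '.' is literal here
matches_longer = bool(rx.match('docs/a.bc'))          # must end or hit '/'
```

Because the pattern is escaped, the `.` in `a.b` matches only a literal dot, and the trailing `(?:/|$)` keeps `docs/a.b` from matching a sibling such as `docs/a.bc`. Passing the escape function as a parameter mirrors how the patch swaps one implementation for another without changing the surrounding expression.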