Patchwork [6,of,6] check-code: cache translation table for repquote to reuse in subsequent use

login
register
mail settings
Submitter Katsunori FUJIWARA
Date May 31, 2016, 12:15 p.m.
Message ID <434fb7ea4e47f82ab473.1464696940@feefifofum>
Download mbox | patch
Permalink /patch/15278/
State Changes Requested
Headers show

Comments

Katsunori FUJIWARA - May 31, 2016, 12:15 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1464696191 -32400
#      Tue May 31 21:03:11 2016 +0900
# Node ID 434fb7ea4e47f82ab47368498ed325dbe2cf303b
# Parent  d01e8eeec80bcd1057fce6d70dba942bc92100a5
check-code: cache translation table for repquote to reuse in subsequent use

Rebuilding translation table (256 size) at each repquote() invocations
is redundant.

For example, this patch decreases user time of command invocation
below from 18.417s to 13.065s (about -30%) on a Linux box. This
command is main part of test-check-code.t.

    hg locate | xargs python contrib/check-code.py --warnings --per-file=0
timeless - May 31, 2016, 6:55 p.m.
thanks!
Yuya Nishihara - June 1, 2016, 2:58 p.m.
On Tue, 31 May 2016 21:15:40 +0900, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1464696191 -32400
> #      Tue May 31 21:03:11 2016 +0900
> # Node ID 434fb7ea4e47f82ab47368498ed325dbe2cf303b
> # Parent  d01e8eeec80bcd1057fce6d70dba942bc92100a5
> check-code: cache translation table for repquote to reuse in subsequent use
> 
> Rebuilding translation table (256 size) at each repquote() invocations
> is redundant.
> 
> For example, this patch decreases user time of command invocation
> below from 18.417s to 13.065s (about -30%) on a Linux box. This
> command is main part of test-check-code.t.
> 
>     hg locate | xargs python contrib/check-code.py --warnings --per-file=0
> 
> diff --git a/contrib/check-code.py b/contrib/check-code.py
> --- a/contrib/check-code.py
> +++ b/contrib/check-code.py
> @@ -50,7 +50,7 @@ def compilere(pat, multiline=False):
>              pass
>      return re.compile(pat)
>  
> -def repquote(m):
> +def _repquotett():
>      # check "rules depending on implementation of repquote()" in each
>      # patterns (especially pypats), before changing this function
>      fixedmap = {' ': ' ', '\n': '\n', '.': 'p', ':': 'q',
> @@ -66,9 +66,16 @@ def repquote(m):
>          if c.isdigit():
>              return 'n'
>          return 'o'
> +    tt = ''.join(encodechr(i) for i in xrange(256))
> +
> +    # return tt immediately at subsequent invocations
> +    global _repquotett
> +    _repquotett = lambda : tt
> +    return tt
> +
> +def repquote(m):
>      t = m.group('text')
> -    tt = ''.join(encodechr(i) for i in xrange(256))
> -    t = t.translate(tt)
> +    t = t.translate(_repquotett())
>      return m.group('quote') + t + m.group('quote')

Any reason you didn't make 'tt' a global constant?
Katsunori FUJIWARA - June 1, 2016, 6:54 p.m.
At Wed, 1 Jun 2016 23:58:12 +0900,
Yuya Nishihara wrote:
> 
> On Tue, 31 May 2016 21:15:40 +0900, FUJIWARA Katsunori wrote:
> > # HG changeset patch
> > # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> > # Date 1464696191 -32400
> > #      Tue May 31 21:03:11 2016 +0900
> > # Node ID 434fb7ea4e47f82ab47368498ed325dbe2cf303b
> > # Parent  d01e8eeec80bcd1057fce6d70dba942bc92100a5
> > check-code: cache translation table for repquote to reuse in subsequent use
> > 
> > Rebuilding translation table (256 size) at each repquote() invocations
> > is redundant.
> > 
> > For example, this patch decreases user time of command invocation
> > below from 18.417s to 13.065s (about -30%) on a Linux box. This
> > command is main part of test-check-code.t.
> > 
> >     hg locate | xargs python contrib/check-code.py --warnings --per-file=0
> > 
> > diff --git a/contrib/check-code.py b/contrib/check-code.py
> > --- a/contrib/check-code.py
> > +++ b/contrib/check-code.py
> > @@ -50,7 +50,7 @@ def compilere(pat, multiline=False):
> >              pass
> >      return re.compile(pat)
> >  
> > -def repquote(m):
> > +def _repquotett():
> >      # check "rules depending on implementation of repquote()" in each
> >      # patterns (especially pypats), before changing this function
> >      fixedmap = {' ': ' ', '\n': '\n', '.': 'p', ':': 'q',
> > @@ -66,9 +66,16 @@ def repquote(m):
> >          if c.isdigit():
> >              return 'n'
> >          return 'o'
> > +    tt = ''.join(encodechr(i) for i in xrange(256))
> > +
> > +    # return tt immediately at subsequent invocations
> > +    global _repquotett
> > +    _repquotett = lambda : tt
> > +    return tt
> > +
> > +def repquote(m):
> >      t = m.group('text')
> > -    tt = ''.join(encodechr(i) for i in xrange(256))
> > -    t = t.translate(tt)
> > +    t = t.translate(_repquotett())
> >      return m.group('quote') + t + m.group('quote')
> 
> Any reason you didn't make 'tt' a global constant?
> 

I thought about delaying calculation of translation table until
invocation of repquote(). But main usecase of check-code.py is
test-check-code.t, and it always invoke check-code.py *.py/*.c files.
Therefore, this delaying isn't so effective.

I'll post revised one.

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy@lares.dti.ne.jp

Patch

diff --git a/contrib/check-code.py b/contrib/check-code.py
--- a/contrib/check-code.py
+++ b/contrib/check-code.py
@@ -50,7 +50,7 @@  def compilere(pat, multiline=False):
             pass
     return re.compile(pat)
 
-def repquote(m):
+def _repquotett():
     # check "rules depending on implementation of repquote()" in each
     # patterns (especially pypats), before changing this function
     fixedmap = {' ': ' ', '\n': '\n', '.': 'p', ':': 'q',
@@ -66,9 +66,16 @@  def repquote(m):
         if c.isdigit():
             return 'n'
         return 'o'
+    tt = ''.join(encodechr(i) for i in xrange(256))
+
+    # return tt immediately at subsequent invocations
+    global _repquotett
+    _repquotett = lambda : tt
+    return tt
+
+def repquote(m):
     t = m.group('text')
-    tt = ''.join(encodechr(i) for i in xrange(256))
-    t = t.translate(tt)
+    t = t.translate(_repquotett())
     return m.group('quote') + t + m.group('quote')
 
 def reppython(m):
diff --git a/tests/test-contrib-check-code.t b/tests/test-contrib-check-code.t
--- a/tests/test-contrib-check-code.t
+++ b/tests/test-contrib-check-code.t
@@ -262,7 +262,18 @@  web templates
   >        'bar foo-'
   >        'bar')
   > EOF
-  $ "$check_code" stringjoin.py
+
+'missing _() in ui message' detection
+
+  $ cat > missinggettext.py <<EOF
+  > ui.write('01234' '%s foobar')
+  > EOF
+
+(Checking multiple invalid files at once examines whether caching
+translation table for repquote() works as expected or not. All files
+should break rules depending on result of repquote(), in this case)
+
+  $ "$check_code" stringjoin.py missinggettext.py
   stringjoin.py:1:
    > foo = (' foo'
    string join across lines with no space
@@ -287,4 +298,7 @@  web templates
   stringjoin.py:8:
    >        'bar foo-'
    string join across lines with no space
+  missinggettext.py:1:
+   > ui.write('01234' '%s foobar')
+   missing _() in ui message (use () to hide false-positives)
   [1]