Patchwork [1,of,2,STABLE] context: don't complain about a matcher's subrepo paths in changectx.walk()

login
register
mail settings
Submitter Matt Mackall
Date May 19, 2015, 1:14 p.m.
Message ID <1432041253.12150.182.camel@selenic.com>
Download mbox | patch
Permalink /patch/9170/
State Not Applicable
Headers show

Comments

Matt Mackall - May 19, 2015, 1:14 p.m.
On Mon, 2015-05-18 at 22:14 -0400, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1431839170 14400
> #      Sun May 17 01:06:10 2015 -0400
> # Branch stable
> # Node ID 73abecce8140e8c03b215c1b68c2669d44c62d91
> # Parent  91c2278c68a387903c00a7170509c9dd343a7e55
> context: don't complain about a matcher's subrepo paths in changectx.walk()

Here's an alternate solution to the problem of the "bad" callback, let
me know what you think:
Matt Harbison - May 19, 2015, 11:24 p.m.
On Tue, 19 May 2015 09:14:13 -0400, Matt Mackall <mpm@selenic.com> wrote:

> On Mon, 2015-05-18 at 22:14 -0400, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1431839170 14400
>> #      Sun May 17 01:06:10 2015 -0400
>> # Branch stable
>> # Node ID 73abecce8140e8c03b215c1b68c2669d44c62d91
>> # Parent  91c2278c68a387903c00a7170509c9dd343a7e55
>> context: don't complain about a matcher's subrepo paths in  
>> changectx.walk()
>
> Here's an alternate solution to the problem of the "bad" callback, let
> me know what you think:

I like it.  I was worried about recursive methods like cmdutil.add(), but  
it doesn't look like any of the 20 or so matches for '\.bad =' would be  
problematic as long as the bad matcher isn't passed to the recursive  
call.  And it would fix some current mis-usage.

Any reason not to make it a function on matchmod that internally calls  
copy() and then overwrites bad?  I'm thinking avoiding long undetected and  
subtle bugs if someone adds a field to the base class, but forgets to copy  
it here.  I have no idea what a copy() costs though.

I'll make a pass over these 20 once it shows up in the main repo in  
whatever form.  Should we add a checkcode rule, even though some uses  
(like scmutil.matchandpats or lfutil.getstandinmatcher) don't need to be  
(and shouldn't be) replaced?


> diff -r 472a685a4961 mercurial/context.py
> --- a/mercurial/context.py	Tue May 19 07:17:57 2015 -0500
> +++ b/mercurial/context.py	Tue May 19 08:11:21 2015 -0500
> @@ -9,7 +9,7 @@
>  from i18n import _
>  import mdiff, error, util, scmutil, subrepo, patch, encoding, phases
>  import match as matchmod
> -import copy, os, errno, stat
> +import os, errno, stat
>  import obsolete as obsmod
>  import repoview
>  import fileset
> @@ -591,19 +591,16 @@
>      def walk(self, match):
>          '''Generates matching file names.'''
> -        # Override match.bad method to have message with nodeid
> -        match = copy.copy(match)
> -        oldbad = match.bad
>          def bad(fn, msg):
>              # The manifest doesn't know about subrepos, so don't  
> complain about
>              # paths into valid subrepos.
>              if any(fn == s or fn.startswith(s + '/')
>                     for s in self.substate):
>                  return
> -            oldbad(fn, _('no such file in rev %s') % self)
> -        match.bad = bad
> +            match.bad(fn, _('no such file in rev %s') % self)
> +        m = matchmod.badmatch(match, bad)
> -        return self._manifest.walk(match)
> +        return self._manifest.walk(m)
>     def matches(self, match):
>          return self.walk(match)
> diff -r 472a685a4961 mercurial/match.py
> --- a/mercurial/match.py	Tue May 19 07:17:57 2015 -0500
> +++ b/mercurial/match.py	Tue May 19 08:11:21 2015 -0500
> @@ -234,6 +234,23 @@
>  def always(root, cwd):
>      return match(root, cwd, [])
> +class badmatch(match):
> +    '''wrap a matcher with a bad() callback'''
> +    def __init__(self, matcher, badfn):
> +        # copy base object
> +        self._root = matcher._root
> +        self._cwd = matcher._cwd
> +        self._matcher = matcher
> +        self._always = matcher._always
> +        self._pathrestricted = matcher._pathrestricted
> +        self._files = matcher._files
> +        self._always = matcher._always
> +        self._anypats = matcher._anypats
> +        self._fileroots = matcher._fileroots
> +        self.matchfn = matcher.matchfn
> +        # add callback
> +        self.bad = badfn
> +
>  class narrowmatcher(match):
>      """Adapt a matcher to work on a subdirectory only.
Matt Harbison - June 4, 2015, 12:52 a.m.
Gentle ping on this.  I didn't want to just swipe the idea and run with it.

------- Forwarded message -------
From: "Matt Harbison" <mharbison72@gmail.com>
To: "Matt Mackall" <mpm@selenic.com>
Cc: mercurial-devel@selenic.com, matt_harbison@yahoo.com
Subject: Re: [PATCH 1 of 2 STABLE] context: don't complain about a  
matcher's subrepo paths in changectx.walk()
Date: Tue, 19 May 2015 19:24:51 -0400

On Tue, 19 May 2015 09:14:13 -0400, Matt Mackall <mpm@selenic.com> wrote:

> On Mon, 2015-05-18 at 22:14 -0400, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1431839170 14400
>> #      Sun May 17 01:06:10 2015 -0400
>> # Branch stable
>> # Node ID 73abecce8140e8c03b215c1b68c2669d44c62d91
>> # Parent  91c2278c68a387903c00a7170509c9dd343a7e55
>> context: don't complain about a matcher's subrepo paths in  
>> changectx.walk()
>
> Here's an alternate solution to the problem of the "bad" callback, let
> me know what you think:

I like it.  I was worried about recursive methods like cmdutil.add(), but
it doesn't look like any of the 20 or so matches for '\.bad =' would be
problematic as long as the bad matcher isn't passed to the recursive
call.  And it would fix some current mis-usage.

Any reason not to make it a function on matchmod that internally calls
copy() and then overwrites bad?  I'm thinking avoiding long undetected and
subtle bugs if someone adds a field to the base class, but forgets to copy
it here.  I have no idea what a copy() costs though.

I'll make a pass over these 20 once it shows up in the main repo in
whatever form.  Should we add a checkcode rule, even though some uses
(like scmutil.matchandpats or lfutil.getstandinmatcher) don't need to be
(and shouldn't be) replaced?


> diff -r 472a685a4961 mercurial/context.py
> --- a/mercurial/context.py	Tue May 19 07:17:57 2015 -0500
> +++ b/mercurial/context.py	Tue May 19 08:11:21 2015 -0500
> @@ -9,7 +9,7 @@
>  from i18n import _
>  import mdiff, error, util, scmutil, subrepo, patch, encoding, phases
>  import match as matchmod
> -import copy, os, errno, stat
> +import os, errno, stat
>  import obsolete as obsmod
>  import repoview
>  import fileset
> @@ -591,19 +591,16 @@
>      def walk(self, match):
>          '''Generates matching file names.'''
> -        # Override match.bad method to have message with nodeid
> -        match = copy.copy(match)
> -        oldbad = match.bad
>          def bad(fn, msg):
>              # The manifest doesn't know about subrepos, so don't  
> complain about
>              # paths into valid subrepos.
>              if any(fn == s or fn.startswith(s + '/')
>                     for s in self.substate):
>                  return
> -            oldbad(fn, _('no such file in rev %s') % self)
> -        match.bad = bad
> +            match.bad(fn, _('no such file in rev %s') % self)
> +        m = matchmod.badmatch(match, bad)
> -        return self._manifest.walk(match)
> +        return self._manifest.walk(m)
>     def matches(self, match):
>          return self.walk(match)
> diff -r 472a685a4961 mercurial/match.py
> --- a/mercurial/match.py	Tue May 19 07:17:57 2015 -0500
> +++ b/mercurial/match.py	Tue May 19 08:11:21 2015 -0500
> @@ -234,6 +234,23 @@
>  def always(root, cwd):
>      return match(root, cwd, [])
> +class badmatch(match):
> +    '''wrap a matcher with a bad() callback'''
> +    def __init__(self, matcher, badfn):
> +        # copy base object
> +        self._root = matcher._root
> +        self._cwd = matcher._cwd
> +        self._matcher = matcher
> +        self._always = matcher._always
> +        self._pathrestricted = matcher._pathrestricted
> +        self._files = matcher._files
> +        self._always = matcher._always
> +        self._anypats = matcher._anypats
> +        self._fileroots = matcher._fileroots
> +        self.matchfn = matcher.matchfn
> +        # add callback
> +        self.bad = badfn
> +
>  class narrowmatcher(match):
>      """Adapt a matcher to work on a subdirectory only.
Martin von Zweigbergk - June 4, 2015, 4:37 a.m.
On Wed, Jun 3, 2015 at 5:52 PM Matt Harbison <mharbison72@gmail.com> wrote:

> Gentle ping on this.  I didn't want to just swipe the idea and run with it.
>
> ------- Forwarded message -------
> From: "Matt Harbison" <mharbison72@gmail.com>
> To: "Matt Mackall" <mpm@selenic.com>
> Cc: mercurial-devel@selenic.com, matt_harbison@yahoo.com
> Subject: Re: [PATCH 1 of 2 STABLE] context: don't complain about a
> matcher's subrepo paths in changectx.walk()
> Date: Tue, 19 May 2015 19:24:51 -0400
>
> On Tue, 19 May 2015 09:14:13 -0400, Matt Mackall <mpm@selenic.com> wrote:
>
> > On Mon, 2015-05-18 at 22:14 -0400, Matt Harbison wrote:
> >> # HG changeset patch
> >> # User Matt Harbison <matt_harbison@yahoo.com>
> >> # Date 1431839170 14400
> >> #      Sun May 17 01:06:10 2015 -0400
> >> # Branch stable
> >> # Node ID 73abecce8140e8c03b215c1b68c2669d44c62d91
> >> # Parent  91c2278c68a387903c00a7170509c9dd343a7e55
> >> context: don't complain about a matcher's subrepo paths in
> >> changectx.walk()
> >
> > Here's an alternate solution to the problem of the "bad" callback, let
> > me know what you think:
>
> I like it.  I was worried about recursive methods like cmdutil.add(), but
> it doesn't look like any of the 20 or so matches for '\.bad =' would be
> problematic as long as the bad matcher isn't passed to the recursive
> call.  And it would fix some current mis-usage.
>
> Any reason not to make it a function on matchmod that internally calls
> copy() and then overwrites bad?  I'm thinking avoiding long undetected and
> subtle bugs if someone adds a field to the base class, but forgets to copy
> it here.


I think a copy() method makes sense.


> I have no idea what a copy() costs though.
>

I don't think matchers are created frequently enough that it matters. It's
usually a few times per repo, right? Ever per revision? Definitely not per
file, I would think.


>
> I'll make a pass over these 20 once it shows up in the main repo in
> whatever form.  Should we add a checkcode rule, even though some uses
> (like scmutil.matchandpats or lfutil.getstandinmatcher) don't need to be
> (and shouldn't be) replaced?
>
>
> > diff -r 472a685a4961 mercurial/context.py
> > --- a/mercurial/context.py    Tue May 19 07:17:57 2015 -0500
> > +++ b/mercurial/context.py    Tue May 19 08:11:21 2015 -0500
> > @@ -9,7 +9,7 @@
> >  from i18n import _
> >  import mdiff, error, util, scmutil, subrepo, patch, encoding, phases
> >  import match as matchmod
> > -import copy, os, errno, stat
> > +import os, errno, stat
> >  import obsolete as obsmod
> >  import repoview
> >  import fileset
> > @@ -591,19 +591,16 @@
> >      def walk(self, match):
> >          '''Generates matching file names.'''
> > -        # Override match.bad method to have message with nodeid
> > -        match = copy.copy(match)
> > -        oldbad = match.bad
> >          def bad(fn, msg):
> >              # The manifest doesn't know about subrepos, so don't
> > complain about
> >              # paths into valid subrepos.
> >              if any(fn == s or fn.startswith(s + '/')
> >                     for s in self.substate):
> >                  return
> > -            oldbad(fn, _('no such file in rev %s') % self)
> > -        match.bad = bad
> > +            match.bad(fn, _('no such file in rev %s') % self)
> > +        m = matchmod.badmatch(match, bad)
> > -        return self._manifest.walk(match)
> > +        return self._manifest.walk(m)
> >     def matches(self, match):
> >          return self.walk(match)
> > diff -r 472a685a4961 mercurial/match.py
> > --- a/mercurial/match.py      Tue May 19 07:17:57 2015 -0500
> > +++ b/mercurial/match.py      Tue May 19 08:11:21 2015 -0500
> > @@ -234,6 +234,23 @@
> >  def always(root, cwd):
> >      return match(root, cwd, [])
> > +class badmatch(match):
> > +    '''wrap a matcher with a bad() callback'''
> > +    def __init__(self, matcher, badfn):
> > +        # copy base object
> > +        self._root = matcher._root
> > +        self._cwd = matcher._cwd
> > +        self._matcher = matcher
> > +        self._always = matcher._always
> > +        self._pathrestricted = matcher._pathrestricted
> > +        self._files = matcher._files
> > +        self._always = matcher._always
> > +        self._anypats = matcher._anypats
> > +        self._fileroots = matcher._fileroots
> > +        self.matchfn = matcher.matchfn
> > +        # add callback
> > +        self.bad = badfn
> > +
> >  class narrowmatcher(match):
> >      """Adapt a matcher to work on a subdirectory only.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel
>
Matt Mackall - June 4, 2015, 5:08 p.m.
On Wed, 2015-06-03 at 20:52 -0400, Matt Harbison wrote:
> Gentle ping on this.  I didn't want to just swipe the idea and run with it.

Go for it.

> Any reason not to make it a function on matchmod that internally calls
> copy() and then overwrites bad?  I'm thinking avoiding long undetected and
> subtle bugs if someone adds a field to the base class, but forgets to copy
> it here.  I have no idea what a copy() costs though.

Shortly after I proposed this, Drew added some new member variables to
match, so I may be coming around to your point of view.
Matt Harbison - June 5, 2015, 1:03 a.m.
On Thu, 04 Jun 2015 00:37:44 -0400, Martin von Zweigbergk  
<martinvonz@google.com> wrote:

> On Wed, Jun 3, 2015 at 5:52 PM Matt Harbison <mharbison72@gmail.com>  
> wrote:
>
>> Gentle ping on this.  I didn't want to just swipe the idea and run with  
>> it.
>>
>> ------- Forwarded message -------
>> From: "Matt Harbison" <mharbison72@gmail.com>
>> To: "Matt Mackall" <mpm@selenic.com>
>> Cc: mercurial-devel@selenic.com, matt_harbison@yahoo.com
>> Subject: Re: [PATCH 1 of 2 STABLE] context: don't complain about a
>> matcher's subrepo paths in changectx.walk()
>> Date: Tue, 19 May 2015 19:24:51 -0400
>>
>> On Tue, 19 May 2015 09:14:13 -0400, Matt Mackall <mpm@selenic.com>  
>> wrote:
>>
>> > On Mon, 2015-05-18 at 22:14 -0400, Matt Harbison wrote:
>> >> # HG changeset patch
>> >> # User Matt Harbison <matt_harbison@yahoo.com>
>> >> # Date 1431839170 14400
>> >> #      Sun May 17 01:06:10 2015 -0400
>> >> # Branch stable
>> >> # Node ID 73abecce8140e8c03b215c1b68c2669d44c62d91
>> >> # Parent  91c2278c68a387903c00a7170509c9dd343a7e55
>> >> context: don't complain about a matcher's subrepo paths in
>> >> changectx.walk()
>> >
>> > Here's an alternate solution to the problem of the "bad" callback, let
>> > me know what you think:
>>
>> I like it.  I was worried about recursive methods like cmdutil.add(),  
>> but
>> it doesn't look like any of the 20 or so matches for '\.bad =' would be
>> problematic as long as the bad matcher isn't passed to the recursive
>> call.  And it would fix some current mis-usage.
>>
>> Any reason not to make it a function on matchmod that internally calls
>> copy() and then overwrites bad?  I'm thinking avoiding long undetected  
>> and
>> subtle bugs if someone adds a field to the base class, but forgets to  
>> copy
>> it here.
>
>
> I think a copy() method makes sense.
>
>
>> I have no idea what a copy() costs though.
>>
>
> I don't think matchers are created frequently enough that it matters.  
> It's
> usually a few times per repo, right? Ever per revision? Definitely not  
> per
> file, I would think.

To my knowledge, it's generally once per command.  The exception being  
narrowing for subrepos, and some largefile -> standin translations  
(usually also once per command per repo).

I just haven't see it used much in the code I'm familiar with aside from  
largefiles, and given the manual copying in the strawman, I thought maybe  
there's a reason to avoid it.  I'll submit a series with a method and an  
internal copy().

Patch

diff -r 472a685a4961 mercurial/context.py
--- a/mercurial/context.py	Tue May 19 07:17:57 2015 -0500
+++ b/mercurial/context.py	Tue May 19 08:11:21 2015 -0500
@@ -9,7 +9,7 @@ 
 from i18n import _
 import mdiff, error, util, scmutil, subrepo, patch, encoding, phases
 import match as matchmod
-import copy, os, errno, stat
+import os, errno, stat
 import obsolete as obsmod
 import repoview
 import fileset
@@ -591,19 +591,16 @@ 
     def walk(self, match):
         '''Generates matching file names.'''
 
-        # Override match.bad method to have message with nodeid
-        match = copy.copy(match)
-        oldbad = match.bad
         def bad(fn, msg):
             # The manifest doesn't know about subrepos, so don't complain about
             # paths into valid subrepos.
             if any(fn == s or fn.startswith(s + '/')
                    for s in self.substate):
                 return
-            oldbad(fn, _('no such file in rev %s') % self)
-        match.bad = bad
+            match.bad(fn, _('no such file in rev %s') % self)
+        m = matchmod.badmatch(match, bad)
 
-        return self._manifest.walk(match)
+        return self._manifest.walk(m)
 
     def matches(self, match):
         return self.walk(match)
diff -r 472a685a4961 mercurial/match.py
--- a/mercurial/match.py	Tue May 19 07:17:57 2015 -0500
+++ b/mercurial/match.py	Tue May 19 08:11:21 2015 -0500
@@ -234,6 +234,23 @@ 
 def always(root, cwd):
     return match(root, cwd, [])
 
+class badmatch(match):
+    '''wrap a matcher with a bad() callback'''
+    def __init__(self, matcher, badfn):
+        # copy base object
+        self._root = matcher._root
+        self._cwd = matcher._cwd
+        self._matcher = matcher
+        self._always = matcher._always
+        self._pathrestricted = matcher._pathrestricted
+        self._files = matcher._files
+        self._always = matcher._always
+        self._anypats = matcher._anypats
+        self._fileroots = matcher._fileroots
+        self.matchfn = matcher.matchfn
+        # add callback
+        self.bad = badfn
+
 class narrowmatcher(match):
     """Adapt a matcher to work on a subdirectory only.