Patchwork [3,of,8] ignore: add ui to ignore stack

login
register
mail settings
Submitter Durham Goode
Date May 13, 2015, 3:13 p.m.
Message ID <5af3cb8572c462a3f21c.1431529997@dev2000.prn2.facebook.com>
Download mbox | patch
Permalink /patch/9045/
State Changes Requested
Headers show

Comments

Durham Goode - May 13, 2015, 3:13 p.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1431468882 25200
#      Tue May 12 15:14:42 2015 -0700
# Node ID 5af3cb8572c462a3f21c45f7f8d9fdb0289dd373
# Parent  1236d101b01d251715016df1b0ba53d3a37edfb8
ignore: add ui to ignore stack

In a future patch we will need access to the ui.config inside the ignore logic.
This patch adds it to the functions.  A future patch will also remove the
redundant 'warn' variable here.
Pierre-Yves David - May 14, 2015, 1:20 a.m.
On 05/13/2015 08:13 AM, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1431468882 25200
> #      Tue May 12 15:14:42 2015 -0700
> # Node ID 5af3cb8572c462a3f21c45f7f8d9fdb0289dd373
> # Parent  1236d101b01d251715016df1b0ba53d3a37edfb8
> ignore: add ui to ignore stack
>
> In a future patch we will need access to the ui.config inside the ignore logic.
> This patch adds it to the functions.  A future patch will also remove the
> redundant 'warn' variable here.

Do you have to pass the whole ui object there. Can't we just use the one 
boolean of interested in there? I'm getting concerns at the `ui` object 
getting its own cult, leaking everywhere, having small and simple piece 
of code depending on it.
Durham Goode - May 14, 2015, 1:34 a.m.
On 5/13/15, 6:20 PM, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org>
wrote:

>
>
>On 05/13/2015 08:13 AM, Durham Goode wrote:
>> # HG changeset patch
>> # User Durham Goode <durham@fb.com>
>> # Date 1431468882 25200
>> #      Tue May 12 15:14:42 2015 -0700
>> # Node ID 5af3cb8572c462a3f21c45f7f8d9fdb0289dd373
>> # Parent  1236d101b01d251715016df1b0ba53d3a37edfb8
>> ignore: add ui to ignore stack
>>
>> In a future patch we will need access to the ui.config inside the
>>ignore logic.
>> This patch adds it to the functions.  A future patch will also remove
>>the
>> redundant 'warn' variable here.
>
>Do you have to pass the whole ui object there. Can't we just use the one
>boolean of interested in there? I'm getting concerns at the `ui` object
>getting its own cult, leaking everywhere, having small and simple piece
>of code depending on it.

I figured the fact that it needs to call warn() means the ui probably
should've been passed in to begin with (since passing the warn function is
funky).  In this case, since it needs both ui.warn and ui.config, it seems
like passing the ui makes sense.

If people disagree, I can change it.  I just think passing the ui is
cleaner and the down sides are minimal.
Pierre-Yves David - May 14, 2015, 6:03 a.m.
On 05/13/2015 06:34 PM, Durham Goode wrote:
>
>
> On 5/13/15, 6:20 PM, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org>
> wrote:
>
>>
>>
>> On 05/13/2015 08:13 AM, Durham Goode wrote:
>>> # HG changeset patch
>>> # User Durham Goode <durham@fb.com>
>>> # Date 1431468882 25200
>>> #      Tue May 12 15:14:42 2015 -0700
>>> # Node ID 5af3cb8572c462a3f21c45f7f8d9fdb0289dd373
>>> # Parent  1236d101b01d251715016df1b0ba53d3a37edfb8
>>> ignore: add ui to ignore stack
>>>
>>> In a future patch we will need access to the ui.config inside the
>>> ignore logic.
>>> This patch adds it to the functions.  A future patch will also remove
>>> the
>>> redundant 'warn' variable here.
>>
>> Do you have to pass the whole ui object there. Can't we just use the one
>> boolean of interested in there? I'm getting concerns at the `ui` object
>> getting its own cult, leaking everywhere, having small and simple piece
>> of code depending on it.
>
> I figured the fact that it needs to call warn() means the ui probably
> should've been passed in to begin with (since passing the warn function is
> funky).  In this case, since it needs both ui.warn and ui.config, it seems
> like passing the ui makes sense.
>
> If people disagree, I can change it.  I just think passing the ui is
> cleaner and the down sides are minimal.

The warn function is a fairly agnostic way to "express error message if 
any", we use this pattern in some other part of the code.

Passing the ui object is definitely not a crazy approach. But I would 
like to avoid the automatism of having ui object everywhere. I'll not 
fight too much about that if the rest of the project disagree.
Matt Mackall - May 15, 2015, 9:26 p.m.
On Wed, 2015-05-13 at 23:03 -0700, Pierre-Yves David wrote:
> 
> On 05/13/2015 06:34 PM, Durham Goode wrote:
> >
> >
> > On 5/13/15, 6:20 PM, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org>
> > wrote:
> >
> >>
> >>
> >> On 05/13/2015 08:13 AM, Durham Goode wrote:
> >>> # HG changeset patch
> >>> # User Durham Goode <durham@fb.com>
> >>> # Date 1431468882 25200
> >>> #      Tue May 12 15:14:42 2015 -0700
> >>> # Node ID 5af3cb8572c462a3f21c45f7f8d9fdb0289dd373
> >>> # Parent  1236d101b01d251715016df1b0ba53d3a37edfb8
> >>> ignore: add ui to ignore stack
> >>>
> >>> In a future patch we will need access to the ui.config inside the
> >>> ignore logic.
> >>> This patch adds it to the functions.  A future patch will also remove
> >>> the
> >>> redundant 'warn' variable here.
> >>
> >> Do you have to pass the whole ui object there. Can't we just use the one
> >> boolean of interested in there? I'm getting concerns at the `ui` object
> >> getting its own cult, leaking everywhere, having small and simple piece
> >> of code depending on it.
> >
> > I figured the fact that it needs to call warn() means the ui probably
> > should've been passed in to begin with (since passing the warn function is
> > funky).  In this case, since it needs both ui.warn and ui.config, it seems
> > like passing the ui makes sense.
> >
> > If people disagree, I can change it.  I just think passing the ui is
> > cleaner and the down sides are minimal.
> 
> The warn function is a fairly agnostic way to "express error message if 
> any", we use this pattern in some other part of the code.
> 
> Passing the ui object is definitely not a crazy approach. But I would 
> like to avoid the automatism of having ui object everywhere. I'll not 
> fight too much about that if the rest of the project disagree.

It's not obvious to me that we need a config knob here, but perhaps I'm
missing something.
Pierre-Yves David - May 16, 2015, 7:18 a.m.
On 05/15/2015 02:26 PM, Matt Mackall wrote:
> On Wed, 2015-05-13 at 23:03 -0700, Pierre-Yves David wrote:
>>
>> On 05/13/2015 06:34 PM, Durham Goode wrote:
>>>
>>>
>>> On 5/13/15, 6:20 PM, "Pierre-Yves David" <pierre-yves.david@ens-lyon.org>
>>> wrote:
>>>
>>>>
>>>>
>>>> On 05/13/2015 08:13 AM, Durham Goode wrote:
>>>>> # HG changeset patch
>>>>> # User Durham Goode <durham@fb.com>
>>>>> # Date 1431468882 25200
>>>>> #      Tue May 12 15:14:42 2015 -0700
>>>>> # Node ID 5af3cb8572c462a3f21c45f7f8d9fdb0289dd373
>>>>> # Parent  1236d101b01d251715016df1b0ba53d3a37edfb8
>>>>> ignore: add ui to ignore stack
>>>>>
>>>>> In a future patch we will need access to the ui.config inside the
>>>>> ignore logic.
>>>>> This patch adds it to the functions.  A future patch will also remove
>>>>> the
>>>>> redundant 'warn' variable here.
>>>>
>>>> Do you have to pass the whole ui object there. Can't we just use the one
>>>> boolean of interested in there? I'm getting concerns at the `ui` object
>>>> getting its own cult, leaking everywhere, having small and simple piece
>>>> of code depending on it.
>>>
>>> I figured the fact that it needs to call warn() means the ui probably
>>> should've been passed in to begin with (since passing the warn function is
>>> funky).  In this case, since it needs both ui.warn and ui.config, it seems
>>> like passing the ui makes sense.
>>>
>>> If people disagree, I can change it.  I just think passing the ui is
>>> cleaner and the down sides are minimal.
>>
>> The warn function is a fairly agnostic way to "express error message if
>> any", we use this pattern in some other part of the code.
>>
>> Passing the ui object is definitely not a crazy approach. But I would
>> like to avoid the automatism of having ui object everywhere. I'll not
>> fight too much about that if the rest of the project disagree.
>
> It's not obvious to me that we need a config knob here, but perhaps I'm
> missing something.

I discussed with Durham about that today and here is the reasoning:

- We do not know how to combine regex to add the prefix.
- So we should not allow regex in included ignored by default.
- User who know about the limitation and might want regex anyway still 
want the feature.
- So we need a config knob to disable that by default but let power user 
enable it.
Martin von Zweigbergk - May 16, 2015, 9:18 a.m.
As mentioned elsewhere, I and the other Martin would really like to see if
we can afford (performance-wise) to replace pattern-rewriting with
application of the pattern to a substring. It just seems like the right
thing to do from the user's point of view. At least if done in C and
avoiding string copying, it doesn't seem like a terribly expensive
operation. What else is done in this code path? Stat'ing the files? Listing
files in directory? Seems like that should be more expensive, but maybe not
on warm disk?

On Sat, May 16, 2015, 00:19 Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 05/15/2015 02:26 PM, Matt Mackall wrote:
> > On Wed, 2015-05-13 at 23:03 -0700, Pierre-Yves David wrote:
> >>
> >> On 05/13/2015 06:34 PM, Durham Goode wrote:
> >>>
> >>>
> >>> On 5/13/15, 6:20 PM, "Pierre-Yves David" <
> pierre-yves.david@ens-lyon.org>
> >>> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 05/13/2015 08:13 AM, Durham Goode wrote:
> >>>>> # HG changeset patch
> >>>>> # User Durham Goode <durham@fb.com>
> >>>>> # Date 1431468882 25200
> >>>>> #      Tue May 12 15:14:42 2015 -0700
> >>>>> # Node ID 5af3cb8572c462a3f21c45f7f8d9fdb0289dd373
> >>>>> # Parent  1236d101b01d251715016df1b0ba53d3a37edfb8
> >>>>> ignore: add ui to ignore stack
> >>>>>
> >>>>> In a future patch we will need access to the ui.config inside the
> >>>>> ignore logic.
> >>>>> This patch adds it to the functions.  A future patch will also remove
> >>>>> the
> >>>>> redundant 'warn' variable here.
> >>>>
> >>>> Do you have to pass the whole ui object there. Can't we just use the
> one
> >>>> boolean of interested in there? I'm getting concerns at the `ui`
> object
> >>>> getting its own cult, leaking everywhere, having small and simple
> piece
> >>>> of code depending on it.
> >>>
> >>> I figured the fact that it needs to call warn() means the ui probably
> >>> should've been passed in to begin with (since passing the warn
> function is
> >>> funky).  In this case, since it needs both ui.warn and ui.config, it
> seems
> >>> like passing the ui makes sense.
> >>>
> >>> If people disagree, I can change it.  I just think passing the ui is
> >>> cleaner and the down sides are minimal.
> >>
> >> The warn function is a fairly agnostic way to "express error message if
> >> any", we use this pattern in some other part of the code.
> >>
> >> Passing the ui object is definitely not a crazy approach. But I would
> >> like to avoid the automatism of having ui object everywhere. I'll not
> >> fight too much about that if the rest of the project disagree.
> >
> > It's not obvious to me that we need a config knob here, but perhaps I'm
> > missing something.
>
> I discussed with Durham about that today and here is the reasoning:
>
> - We do not know how to combine regex to add the prefix.
> - So we should not allow regex in included ignored by default.
> - User who know about the limitation and might want regex anyway still
> want the feature.
> - So we need a config knob to disable that by default but let power user
> enable it.
>
> --
> Pierre-Yves David
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>
Pierre-Yves David - May 18, 2015, 3:12 a.m.
On 05/16/2015 02:18 AM, Martin von Zweigbergk wrote:
> As mentioned elsewhere, I and the other Martin would really like to see
> if we can afford (performance-wise) to replace pattern-rewriting with
> application of the pattern to a substring. It just seems like the right
> thing to do from the user's point of view. At least if done in C and
> avoiding string copying, it doesn't seem like a terribly expensive
> operation. What else is done in this code path? Stat'ing the files?
> Listing files in directory? Seems like that should be more expensive,
> but maybe not on warm disk?

With watchman, the disk is rarely touched directly.
Martin von Zweigbergk - May 18, 2015, 9:12 a.m.
OK, so that argument didn't fly. How about this:

I agree that the matcher logic needs to be fast in general, but how often
will this particular matcher be called? My guess is that it's only on
untracked files and directories (but not on files inside ignored
directories). Is that correct? How many such files and directories will
reasonably exist? (That last question wasn't rhetorical; I'd be happy to
hear if you can find out.)

Do we at least agree about the UI problem in having a config option that
makes the user say they're using sane regexes? I don't think it's a huge
problem, but I think it's large enough to do some profiling with hgwatchman
on a realistic working copy (and maybe a "worst-case" one).

On Sun, May 17, 2015, 20:12 Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 05/16/2015 02:18 AM, Martin von Zweigbergk wrote:
> > As mentioned elsewhere, I and the other Martin would really like to see
> > if we can afford (performance-wise) to replace pattern-rewriting with
> > application of the pattern to a substring. It just seems like the right
> > thing to do from the user's point of view. At least if done in C and
> > avoiding string copying, it doesn't seem like a terribly expensive
> > operation. What else is done in this code path? Stat'ing the files?
> > Listing files in directory? Seems like that should be more expensive,
> > but maybe not on warm disk?
>
> With watchman, the disk is rarely touched directly.
>
> --
> Pierre-Yves David
>
Durham Goode - May 18, 2015, 5:59 p.m.
On 5/18/15 2:12 AM, Martin von Zweigbergk wrote:
>
> OK, so that argument didn't fly. How about this:
>
> I agree that the matcher logic needs to be fast in general, but how 
> often will this particular matcher be called? My guess is that it's 
> only on untracked files and directories (but not on files inside 
> ignored directories). Is that correct? How many such files and 
> directories will reasonably exist? (That last question wasn't 
> rhetorical; I'd be happy to hear if you can find out.)
>
> Do we at least agree about the UI problem in having a config option 
> that makes the user say they're using sane regexes? I don't think it's 
> a huge problem, but I think it's large enough to do some profiling 
> with hgwatchman on a realistic working copy (and maybe a "worst-case" 
> one).
>
I'm going to send a patch series with doing it the way you mentioned.  
It's easier to do it that way when doing the feature the way Matt wants 
("include:path/to/ignore" style rules).  We'll deal with the perf it it 
becomes a problem.
>
> On Sun, May 17, 2015, 20:12 Pierre-Yves David 
> <pierre-yves.david@ens-lyon.org 
> <mailto:pierre-yves.david@ens-lyon.org>> wrote:
>
>
>
>     On 05/16/2015 02:18 AM, Martin von Zweigbergk wrote:
>     > As mentioned elsewhere, I and the other Martin would really like
>     to see
>     > if we can afford (performance-wise) to replace pattern-rewriting
>     with
>     > application of the pattern to a substring. It just seems like
>     the right
>     > thing to do from the user's point of view. At least if done in C and
>     > avoiding string copying, it doesn't seem like a terribly expensive
>     > operation. What else is done in this code path? Stat'ing the files?
>     > Listing files in directory? Seems like that should be more
>     expensive,
>     > but maybe not on warm disk?
>
>     With watchman, the disk is rarely touched directly.
>
>     --
>     Pierre-Yves David
>
>
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -149,7 +149,7 @@  class dirstate(object):
                 # we need to use os.path.join here rather than self._join
                 # because path is arbitrary and user-specified
                 files.append(os.path.join(self._rootdir, util.expandpath(path)))
-        return ignore.ignore(self._root, files, self._ui.warn)
+        return ignore.ignore(self._ui, self._root, files, self._ui.warn)
 
     @propertycache
     def _slash(self):
diff --git a/mercurial/ignore.py b/mercurial/ignore.py
--- a/mercurial/ignore.py
+++ b/mercurial/ignore.py
@@ -55,7 +55,7 @@  def ignorepats(lines):
 
     return patterns, warnings
 
-def readignorefile(filepath, warn, skipwarning=False):
+def readignorefile(ui, filepath, warn, skipwarning=False):
     try:
         pats = []
         fp = open(filepath)
@@ -69,7 +69,7 @@  def readignorefile(filepath, warn, skipw
                  (filepath, inst.strerror))
     return pats
 
-def readpats(root, files, warn):
+def readpats(ui, root, files, warn):
     '''return a dict mapping ignore-file-name to list-of-patterns'''
 
     pats = {}
@@ -77,11 +77,11 @@  def readpats(root, files, warn):
         if f in pats:
             continue
         skipwarning = f == files[0]
-        pats[f] = readignorefile(f, warn, skipwarning=skipwarning)
+        pats[f] = readignorefile(ui, f, warn, skipwarning=skipwarning)
 
     return [(f, pats[f]) for f in files if f in pats]
 
-def ignore(root, files, warn):
+def ignore(ui, root, files, warn):
     '''return matcher covering patterns in 'files'.
 
     the files parsed for patterns include:
@@ -101,7 +101,7 @@  def ignore(root, files, warn):
     glob:pattern   # non-rooted glob
     pattern        # pattern of the current default type'''
 
-    pats = readpats(root, files, warn)
+    pats = readpats(ui, root, files, warn)
 
     allpats = []
     for f, patlist in pats: