Patchwork [2,of,3] revert: rename working directory matcher 'm' -> 'wmatch'

login
register
mail settings
Submitter Matt Harbison
Date March 26, 2015, 4:20 a.m.
Message ID <781d27e7fcd3a438d02b.1427343656@Envy>
Download mbox | patch
Permalink /patch/8288/
State Accepted
Headers show

Comments

Matt Harbison - March 26, 2015, 4:20 a.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1427333998 14400
#      Wed Mar 25 21:39:58 2015 -0400
# Node ID 781d27e7fcd3a438d02bf2e1a9ba0a85537da6b9
# Parent  6e1b26088b8eca13825a91e4e1e3547c2e89f6a0
revert: rename working directory matcher 'm' -> 'wmatch'

Further down in the function, 'm' is reassigned to a matcher for a list of
files.  At the bottom of the function, another matcher is created to figure out
what subrepos need to be reverted.  For consistency with 5b85a5bc5bbb, this will
be changed to a working directory matcher.

We don't just recreate the working directory matcher based on 'pats' however,
because reusing the original working directory matcher will allow the 'pats'
parameter to be converted to a matcher.  That will allow better subrepo support,
and dropping the scmutil.match monkey patching done by largefiles.
Matt Mackall - March 26, 2015, 9:09 p.m.
On Thu, 2015-03-26 at 00:20 -0400, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1427333998 14400
> #      Wed Mar 25 21:39:58 2015 -0400
> # Node ID 781d27e7fcd3a438d02bf2e1a9ba0a85537da6b9
> # Parent  6e1b26088b8eca13825a91e4e1e3547c2e89f6a0
> revert: rename working directory matcher 'm' -> 'wmatch'
> 
> Further down in the function, 'm' is reassigned to a matcher for a list of
> files.  At the bottom of the function, another matcher is created to figure out
> what subrepos need to be reverted.  For consistency with 5b85a5bc5bbb, this will
> be changed to a working directory matcher.

Not sure about this one. The first use of 'm' here is the standard "make
a matcher from the supplied args" pattern and the second definition
appears to have very limited use.

It seems the shadowing here is kind of intentional: we're making a more
narrower version of the initial match that should be used for the rest
of the function. So perhaps we should instead be moving the subrepo set
calculation higher up instead of doing any renaming?

But really, your patch 3 would actually be _simpler_ as a patch 2 that
just did s/ctx/wctx/ locally, right? I'll just do that for now. Patch 1
and 3' queued, thanks.
Matt Harbison - March 27, 2015, 1:06 a.m.
On Thu, 26 Mar 2015 17:09:13 -0400, Matt Mackall <mpm@selenic.com> wrote:

> On Thu, 2015-03-26 at 00:20 -0400, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1427333998 14400
>> #      Wed Mar 25 21:39:58 2015 -0400
>> # Node ID 781d27e7fcd3a438d02bf2e1a9ba0a85537da6b9
>> # Parent  6e1b26088b8eca13825a91e4e1e3547c2e89f6a0
>> revert: rename working directory matcher 'm' -> 'wmatch'
>>
>> Further down in the function, 'm' is reassigned to a matcher for a list  
>> of
>> files.  At the bottom of the function, another matcher is created to  
>> figure out
>> what subrepos need to be reverted.  For consistency with 5b85a5bc5bbb,  
>> this will
>> be changed to a working directory matcher.
>
> Not sure about this one. The first use of 'm' here is the standard "make
> a matcher from the supplied args" pattern and the second definition
> appears to have very limited use.
>
> It seems the shadowing here is kind of intentional: we're making a more
> narrower version of the initial match that should be used for the rest
> of the function. So perhaps we should instead be moving the subrepo set
> calculation higher up instead of doing any renaming?
>
> But really, your patch 3 would actually be _simpler_ as a patch 2 that
> just did s/ctx/wctx/ locally, right? I'll just do that for now. Patch 1
> and 3' queued, thanks.

That looks like it works for this series.  The next thing to do though is  
to pass a matcher to the subrepo, that is a narrowed form of the original  
passed to the function.  The original matcher needs to make it to the  
subrepo section, but moving the whole subrepo section to the top looks  
like it would work too.


I guess now is a good time to ask before doing the work- subrepo revert  
works by essentially doing a clean update of the subrepo, after doing a  
cmdutil.revert() on that layer to generate backup files.  That  
functionality would seem to preclude the path/into/subrepo form that is  
being supported on more commands, and would seem to violate the documented  
help for revert:

	 "Because revert does not change the working directory parents..."

Granted, it isn't changing the parents of the outer repo, but the subrepo  
working directory seems like an extension of the outer repo's working  
directory.  Backout doesn't support subrepos, and maybe it isn't possible  
to do so.  But it certainly can't if subrepos are being updated like this  
by revert.  And because of the recursive nature, a subrepo N subrepos deep  
will be clean updated N times.

It's been like this for 3 years now, but revert subrepo support is totally  
undocumented AFAICT- nothing in 'hg help revert' or in 'hg help  
subrepos'.  Would you consider this a bug (and accept patches to fix  
this), or are we stuck with this?

--Matt
Matt Mackall - March 27, 2015, 6:15 p.m.
On Thu, 2015-03-26 at 21:06 -0400, Matt Harbison wrote:
> On Thu, 26 Mar 2015 17:09:13 -0400, Matt Mackall <mpm@selenic.com> wrote:
> 
> > On Thu, 2015-03-26 at 00:20 -0400, Matt Harbison wrote:
> >> # HG changeset patch
> >> # User Matt Harbison <matt_harbison@yahoo.com>
> >> # Date 1427333998 14400
> >> #      Wed Mar 25 21:39:58 2015 -0400
> >> # Node ID 781d27e7fcd3a438d02bf2e1a9ba0a85537da6b9
> >> # Parent  6e1b26088b8eca13825a91e4e1e3547c2e89f6a0
> >> revert: rename working directory matcher 'm' -> 'wmatch'
> >>
> >> Further down in the function, 'm' is reassigned to a matcher for a list  
> >> of
> >> files.  At the bottom of the function, another matcher is created to  
> >> figure out
> >> what subrepos need to be reverted.  For consistency with 5b85a5bc5bbb,  
> >> this will
> >> be changed to a working directory matcher.
> >
> > Not sure about this one. The first use of 'm' here is the standard "make
> > a matcher from the supplied args" pattern and the second definition
> > appears to have very limited use.
> >
> > It seems the shadowing here is kind of intentional: we're making a more
> > narrower version of the initial match that should be used for the rest
> > of the function. So perhaps we should instead be moving the subrepo set
> > calculation higher up instead of doing any renaming?
> >
> > But really, your patch 3 would actually be _simpler_ as a patch 2 that
> > just did s/ctx/wctx/ locally, right? I'll just do that for now. Patch 1
> > and 3' queued, thanks.
> 
> That looks like it works for this series.  The next thing to do though is  
> to pass a matcher to the subrepo, that is a narrowed form of the original  
> passed to the function.  The original matcher needs to make it to the  
> subrepo section, but moving the whole subrepo section to the top looks  
> like it would work too.
> 
> 
> I guess now is a good time to ask before doing the work- subrepo revert  
> works by essentially doing a clean update of the subrepo, after doing a  
> cmdutil.revert() on that layer to generate backup files.  That  
> functionality would seem to preclude the path/into/subrepo form that is  
> being supported on more commands, and would seem to violate the documented  
> help for revert:
> 
> 	 "Because revert does not change the working directory parents..."
> 
> Granted, it isn't changing the parents of the outer repo, but the subrepo  
> working directory seems like an extension of the outer repo's working  
> directory.  Backout doesn't support subrepos, and maybe it isn't possible  
> to do so.  But it certainly can't if subrepos are being updated like this  
> by revert.  And because of the recursive nature, a subrepo N subrepos deep  
> will be clean updated N times.

Oh man, that's a hard question. My internal model of subrepo semantics
says that yes, we should get a clean checkout of subrepos under the top
level, but I can believe that would surprise people. I'm not sure if the
alternative is particularly logically consistent though.

> It's been like this for 3 years now, but revert subrepo support is totally  
> undocumented AFAICT- nothing in 'hg help revert' or in 'hg help  
> subrepos'.  Would you consider this a bug (and accept patches to fix  
> this), or are we stuck with this?

If "been like this" means we do get clean checkouts today, then I think
we should stick with it.
Jordi GutiƩrrez Hermoso - March 27, 2015, 6:59 p.m.
On Fri, 2015-03-27 at 13:15 -0500, Matt Mackall wrote:
> On Thu, 2015-03-26 at 21:06 -0400, Matt Harbison wrote:

> > 	 "Because revert does not change the working directory parents..."
[snip]
> 
> Oh man, that's a hard question. My internal model of subrepo semantics
> says that yes, we should get a clean checkout of subrepos under the top
> level, but I can believe that would surprise people. I'm not sure if the
> alternative is particularly logically consistent though.

I think the expected behaviour would be to propagate operations to
subrepos? revert at the top level also reverts any unclean state in
subrepos, and update at the top level also changes the revisions in
subrepos.

It would be super-weird if revert at the top level translated into
update in a subrepo. It's already weird enough that update ceases to
be a local operation with subrepos.
Matt Harbison - March 28, 2015, 1:45 a.m.
On Fri, 27 Mar 2015 14:15:09 -0400, Matt Mackall <mpm@selenic.com> wrote:

> On Thu, 2015-03-26 at 21:06 -0400, Matt Harbison wrote:
>> On Thu, 26 Mar 2015 17:09:13 -0400, Matt Mackall <mpm@selenic.com>  
>> wrote:
>>
>> > On Thu, 2015-03-26 at 00:20 -0400, Matt Harbison wrote:
>> >> # HG changeset patch
>> >> # User Matt Harbison <matt_harbison@yahoo.com>
>> >> # Date 1427333998 14400
>> >> #      Wed Mar 25 21:39:58 2015 -0400
>> >> # Node ID 781d27e7fcd3a438d02bf2e1a9ba0a85537da6b9
>> >> # Parent  6e1b26088b8eca13825a91e4e1e3547c2e89f6a0
>> >> revert: rename working directory matcher 'm' -> 'wmatch'
>> >>
>> >> Further down in the function, 'm' is reassigned to a matcher for a  
>> list
>> >> of
>> >> files.  At the bottom of the function, another matcher is created to
>> >> figure out
>> >> what subrepos need to be reverted.  For consistency with  
>> 5b85a5bc5bbb,
>> >> this will
>> >> be changed to a working directory matcher.
>> >
>> > Not sure about this one. The first use of 'm' here is the standard  
>> "make
>> > a matcher from the supplied args" pattern and the second definition
>> > appears to have very limited use.
>> >
>> > It seems the shadowing here is kind of intentional: we're making a  
>> more
>> > narrower version of the initial match that should be used for the rest
>> > of the function. So perhaps we should instead be moving the subrepo  
>> set
>> > calculation higher up instead of doing any renaming?
>> >
>> > But really, your patch 3 would actually be _simpler_ as a patch 2 that
>> > just did s/ctx/wctx/ locally, right? I'll just do that for now. Patch  
>> 1
>> > and 3' queued, thanks.
>>
>> That looks like it works for this series.  The next thing to do though  
>> is
>> to pass a matcher to the subrepo, that is a narrowed form of the  
>> original
>> passed to the function.  The original matcher needs to make it to the
>> subrepo section, but moving the whole subrepo section to the top looks
>> like it would work too.
>>
>>
>> I guess now is a good time to ask before doing the work- subrepo revert
>> works by essentially doing a clean update of the subrepo, after doing a
>> cmdutil.revert() on that layer to generate backup files.  That
>> functionality would seem to preclude the path/into/subrepo form that is
>> being supported on more commands, and would seem to violate the  
>> documented
>> help for revert:
>>
>> 	 "Because revert does not change the working directory parents..."
>>
>> Granted, it isn't changing the parents of the outer repo, but the  
>> subrepo
>> working directory seems like an extension of the outer repo's working
>> directory.  Backout doesn't support subrepos, and maybe it isn't  
>> possible
>> to do so.  But it certainly can't if subrepos are being updated like  
>> this
>> by revert.  And because of the recursive nature, a subrepo N subrepos  
>> deep
>> will be clean updated N times.
>
> Oh man, that's a hard question. My internal model of subrepo semantics
> says that yes, we should get a clean checkout of subrepos under the top
> level, but I can believe that would surprise people. I'm not sure if the
> alternative is particularly logically consistent though.

I'm not sure I understand the logical consistency part.  If I want to  
revert a directory in a flat repo, that's spelled 'hg revert dir/subdir'.   
The result may or may not be consistent codewise- other code depending on  
subdir may now be missing a dependency, but that's what I asked for, and  
the tool doesn't get in the way.

Given that a subdirectory can be reverted (and because other commands  
support path/into/subrepo syntax), I think from a command line POV, it is  
more consistent to be able to do 'hg revert subrepo/dir' or 'hg revert  
subrepo/file'.  A subrepo clean update prevents that (or at the very least  
is inconsistent with that, if we keep the current behavior as well when no  
path or only the root of the subrepo is given.)

Any other top level repo using the subrepo is still in a consistent state,  
until it updates the subrepo to a different revision.  (And then it is  
free to add more commits to the subrepo to "fix" it before committing a  
specific subrepo rev).

The reason I brought up backout is because in a flat repo, a one line  
commit even hundreds of revs back can be surgically nuked.  But a 'backout  
-S' using the current revert would kill *all* of the subsequent changes,  
and I think that would be very surprising.

I remember the threads a few years back about 'should commit recurse into  
subrepos or abort?'.  You argued consistency then, and my take away was  
that letting a user accidentally commit something not identical to the  
filesystem would be a nasty (and permanent, without history editing)  
surprise.  I don't see how that principle applies here either though-  
revert isn't recording history.


>> It's been like this for 3 years now, but revert subrepo support is  
>> totally
>> undocumented AFAICT- nothing in 'hg help revert' or in 'hg help
>> subrepos'.  Would you consider this a bug (and accept patches to fix
>> this), or are we stuck with this?
>
> If "been like this" means we do get clean checkouts today, then I think
> we should stick with it.

Yes.  The current code basically translates the parent rev to a subrepo  
rev, and does 'hg update -C subrev', after making the backup files.

--Matt
Matt Harbison - March 28, 2015, 1:54 a.m.
On Fri, 27 Mar 2015 14:59:17 -0400, Jordi GutiƩrrez Hermoso  
<jordigh@octave.org> wrote:

> On Fri, 2015-03-27 at 13:15 -0500, Matt Mackall wrote:
>> On Thu, 2015-03-26 at 21:06 -0400, Matt Harbison wrote:
>
>> > 	 "Because revert does not change the working directory parents..."
> [snip]
>>
>> Oh man, that's a hard question. My internal model of subrepo semantics
>> says that yes, we should get a clean checkout of subrepos under the top
>> level, but I can believe that would surprise people. I'm not sure if the
>> alternative is particularly logically consistent though.
>
> I think the expected behaviour would be to propagate operations to
> subrepos? revert at the top level also reverts any unclean state in
> subrepos, and update at the top level also changes the revisions in
> subrepos.

That's my expectation as well- I can't think of any commands that support  
subrepos that aren't command propagation.  (Maybe update, but that's still  
update, possibly with a pull/clone if necessary).

For better or worse, I think of the support commands have for subrepos as  
basically shorthand for 'hg -R subrepo <cmd> [subrepo/path/to/file]'.  And  
that to me is nice, because the more commands support them directly, the  
more they fade into the background.

> It would be super-weird if revert at the top level translated into
> update in a subrepo. It's already weird enough that update ceases to
> be a local operation with subrepos.
>
Matt Mackall - March 31, 2015, 6:46 p.m.
On Fri, 2015-03-27 at 21:45 -0400, Matt Harbison wrote:
> On Fri, 27 Mar 2015 14:15:09 -0400, Matt Mackall <mpm@selenic.com> wrote:
> 
> > On Thu, 2015-03-26 at 21:06 -0400, Matt Harbison wrote:
> >> On Thu, 26 Mar 2015 17:09:13 -0400, Matt Mackall <mpm@selenic.com>  
> >> wrote:
> >>
> >> > On Thu, 2015-03-26 at 00:20 -0400, Matt Harbison wrote:
> >> >> # HG changeset patch
> >> >> # User Matt Harbison <matt_harbison@yahoo.com>
> >> >> # Date 1427333998 14400
> >> >> #      Wed Mar 25 21:39:58 2015 -0400
> >> >> # Node ID 781d27e7fcd3a438d02bf2e1a9ba0a85537da6b9
> >> >> # Parent  6e1b26088b8eca13825a91e4e1e3547c2e89f6a0
> >> >> revert: rename working directory matcher 'm' -> 'wmatch'
> >> >>
> >> >> Further down in the function, 'm' is reassigned to a matcher for a  
> >> list
> >> >> of
> >> >> files.  At the bottom of the function, another matcher is created to
> >> >> figure out
> >> >> what subrepos need to be reverted.  For consistency with  
> >> 5b85a5bc5bbb,
> >> >> this will
> >> >> be changed to a working directory matcher.
> >> >
> >> > Not sure about this one. The first use of 'm' here is the standard  
> >> "make
> >> > a matcher from the supplied args" pattern and the second definition
> >> > appears to have very limited use.
> >> >
> >> > It seems the shadowing here is kind of intentional: we're making a  
> >> more
> >> > narrower version of the initial match that should be used for the rest
> >> > of the function. So perhaps we should instead be moving the subrepo  
> >> set
> >> > calculation higher up instead of doing any renaming?
> >> >
> >> > But really, your patch 3 would actually be _simpler_ as a patch 2 that
> >> > just did s/ctx/wctx/ locally, right? I'll just do that for now. Patch  
> >> 1
> >> > and 3' queued, thanks.
> >>
> >> That looks like it works for this series.  The next thing to do though  
> >> is
> >> to pass a matcher to the subrepo, that is a narrowed form of the  
> >> original
> >> passed to the function.  The original matcher needs to make it to the
> >> subrepo section, but moving the whole subrepo section to the top looks
> >> like it would work too.
> >>
> >>
> >> I guess now is a good time to ask before doing the work- subrepo revert
> >> works by essentially doing a clean update of the subrepo, after doing a
> >> cmdutil.revert() on that layer to generate backup files.  That
> >> functionality would seem to preclude the path/into/subrepo form that is
> >> being supported on more commands, and would seem to violate the  
> >> documented
> >> help for revert:
> >>
> >> 	 "Because revert does not change the working directory parents..."
> >>
> >> Granted, it isn't changing the parents of the outer repo, but the  
> >> subrepo
> >> working directory seems like an extension of the outer repo's working
> >> directory.  Backout doesn't support subrepos, and maybe it isn't  
> >> possible
> >> to do so.  But it certainly can't if subrepos are being updated like  
> >> this
> >> by revert.  And because of the recursive nature, a subrepo N subrepos  
> >> deep
> >> will be clean updated N times.
> >
> > Oh man, that's a hard question. My internal model of subrepo semantics
> > says that yes, we should get a clean checkout of subrepos under the top
> > level, but I can believe that would surprise people. I'm not sure if the
> > alternative is particularly logically consistent though.
> 
> I'm not sure I understand the logical consistency part.  If I want to  
> revert a directory in a flat repo, that's spelled 'hg revert dir/subdir'.   
> The result may or may not be consistent codewise- other code depending on  
> subdir may now be missing a dependency, but that's what I asked for, and  
> the tool doesn't get in the way.
> 
> Given that a subdirectory can be reverted (and because other commands  
> support path/into/subrepo syntax), I think from a command line POV, it is  
> more consistent to be able to do 'hg revert subrepo/dir' or 'hg revert  
> subrepo/file'.  A subrepo clean update prevents that (or at the very least  
> is inconsistent with that, if we keep the current behavior as well when no  
> path or only the root of the subrepo is given.)
> 
> Any other top level repo using the subrepo is still in a consistent state,  
> until it updates the subrepo to a different revision.  (And then it is  
> free to add more commits to the subrepo to "fix" it before committing a  
> specific subrepo rev).
> 
> The reason I brought up backout is because in a flat repo, a one line  
> commit even hundreds of revs back can be surgically nuked.  But a 'backout  
> -S' using the current revert would kill *all* of the subsequent changes,  
> and I think that would be very surprising.
> 
> I remember the threads a few years back about 'should commit recurse into  
> subrepos or abort?'.  You argued consistency then, and my take away was  
> that letting a user accidentally commit something not identical to the  
> filesystem would be a nasty (and permanent, without history editing)  
> surprise.  I don't see how that principle applies here either though-  
> revert isn't recording history.

Consider that "hg revert -a" is basically the same as "hg update -C .":
change the working copy to match the parent. So if I do "hg revert
subrepo", I'd expect that to give me the state of the subrepo in the
parent... which is not the same as "hg --cwd subrepo revert:.

Another way to look at it is this. We can think of the working directory
as a tuple:

 < parent | [files...] >

(such that if hash(files) = parent, we're clean)

The most obvious way to think about subrepos is:

 < parent | [files..., <subparent | subfiles>...] >

So when we talk about revert reverting "the working copy".. well
subparent is a property of that. In particular, it's included in the
hash above.

Patch

diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -2809,11 +2809,11 @@ 
         ## filling of the `names` mapping
         # walk dirstate to fill `names`
 
-        m = scmutil.match(repo[None], pats, opts)
-        if not m.always() or node != parent:
-            m.bad = lambda x, y: False
-            for abs in repo.walk(m):
-                names[abs] = m.rel(abs), m.exact(abs)
+        wmatch = scmutil.match(repo[None], pats, opts)
+        if not wmatch.always() or node != parent:
+            wmatch.bad = lambda x, y: False
+            for abs in repo.walk(wmatch):
+                names[abs] = wmatch.rel(abs), wmatch.exact(abs)
 
             # walk target manifest to fill `names`
 
@@ -2826,12 +2826,12 @@ 
                 for f in names:
                     if f.startswith(path_):
                         return
-                ui.warn("%s: %s\n" % (m.rel(path), msg))
-
-            m.bad = badfn
-            for abs in ctx.walk(m):
+                ui.warn("%s: %s\n" % (wmatch.rel(path), msg))
+
+            wmatch.bad = badfn
+            for abs in ctx.walk(wmatch):
                 if abs not in names:
-                    names[abs] = m.rel(abs), m.exact(abs)
+                    names[abs] = wmatch.rel(abs), wmatch.exact(abs)
 
             # Find status of all file in `names`.
             m = scmutil.matchfiles(repo, names)
@@ -2839,10 +2839,10 @@ 
             changes = repo.status(node1=node, match=m,
                                   unknown=True, ignored=True, clean=True)
         else:
-            changes = repo.status(match=m)
+            changes = repo.status(match=wmatch)
             for kind in changes:
                 for abs in kind:
-                    names[abs] = m.rel(abs), m.exact(abs)
+                    names[abs] = wmatch.rel(abs), wmatch.exact(abs)
 
             m = scmutil.matchfiles(repo, names)