Patchwork subrepos: support adding files in git subrepos

login
register
mail settings
Submitter Mathias De Maré
Date Feb. 24, 2015, 7:49 a.m.
Message ID <391c33821c0399e40979.1424764189@mathias-1015PE>
Download mbox | patch
Permalink /patch/7822/
State Superseded
Commit bd9f64ec891d926abf7bdfc1b4430106f1f11090
Headers show

Comments

Mathias De Maré - Feb. 24, 2015, 7:49 a.m.
# HG changeset patch
# User Mathias De Maré <mathias.demare@gmail.com>
# Date 1424764162 -3600
#      Tue Feb 24 08:49:22 2015 +0100
# Node ID 391c33821c0399e40979ef2440289844f8e12756
# Parent  ff5caa8dfd993680d9602ca6ebb14da9de10d5f4
subrepos: support adding files in git subrepos

This support includes correct matching, so includes,
excludes and patterns are all supported.
Matt Harbison - Feb. 25, 2015, 1:46 a.m.
On Tue, 24 Feb 2015 02:49:49 -0500, Mathias De Maré  
<mathias.demare@gmail.com> wrote:

> # HG changeset patch
> # User Mathias De Maré <mathias.demare@gmail.com>
> # Date 1424764162 -3600
> #      Tue Feb 24 08:49:22 2015 +0100
> # Node ID 391c33821c0399e40979ef2440289844f8e12756
> # Parent  ff5caa8dfd993680d9602ca6ebb14da9de10d5f4
> subrepos: support adding files in git subrepos
>
> This support includes correct matching, so includes,
> excludes and patterns are all supported.
>
> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
> --- a/mercurial/subrepo.py
> +++ b/mercurial/subrepo.py
> @@ -1524,6 +1524,24 @@
>              return False
>     @annotatesubrepoerror
> +    def add(self, ui, match, prefix, explicitonly, **opts):
> +        rev = self._state[1]
> +        if match.files():
> +            files = match.files()
> +        elif match.always():
> +            self._gitcommand(["add", "--all"])
> +            return []
> +        else:
> +            (modified, added, removed,
> +             deleted, unknown, ignored, clean) = self.status(None)
> +            files = unknown
> +        if match:
> +            files = [f for f in files if match(f)]

Will match ever be None, since .files() was called on it first?

> +        for f in files:
> +            self._gitcommand(["add", f])
> +        return []
> +
> +    @annotatesubrepoerror
>      def remove(self):
>          if self._gitmissing():
>              return

Do we need this _gitmissing() check for add() too?  Don't forget to update  
mercurial/help/subrepos.txt.

I was surprised that the matcher has all of the files in it, but your  
tests seem to prove it does.  Looking at the code, it looks like it goes  
cmdutil.add() -> wctx.walk(match) -> dirstate.walk(), which I guess picks  
up the (non-hg?) subrepo files.  But that dirstate walk is called in a way  
to ignore the files in the parent repo's .hgignore.  And since it doesn't  
know git's ignore file, will it end up adding files git wants to ignore if  
a pattern _other_ than an exact file is supplied, like a glob?  (IDK if  
git will add ignored files if explicitly named like hg.  Maybe it isn't an  
issue.)

The .gitignore issue you can probably solve by always calling  
git.status(), and only acting on the intersection of unknown and  
match.files(), unless a file is match.exact() (or match.always()).  I'm  
not sure what to suggest about bypassing .hgignore, other than to walk  
again with the correct parameter?  Maybe _always_ iterating over unknown  
 from git.status() and calling the matcher on each entry?  That may fix  
both cases, but opens the door to silently ignoring the error for adding  
an already tracked file.  That may not be worth worrying about.

--Matt
Mathias De Maré - Feb. 25, 2015, 6:21 a.m.
On Wed, Feb 25, 2015 at 2:46 AM, Matt Harbison <mharbison72@gmail.com>
wrote:

> On Tue, 24 Feb 2015 02:49:49 -0500, Mathias De Maré <
> mathias.demare@gmail.com> wrote:
>
>  # HG changeset patch
>> # User Mathias De Maré <mathias.demare@gmail.com>
>> # Date 1424764162 -3600
>> #      Tue Feb 24 08:49:22 2015 +0100
>> # Node ID 391c33821c0399e40979ef2440289844f8e12756
>> # Parent  ff5caa8dfd993680d9602ca6ebb14da9de10d5f4
>> subrepos: support adding files in git subrepos
>>
>> This support includes correct matching, so includes,
>> excludes and patterns are all supported.
>>
>> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
>> --- a/mercurial/subrepo.py
>> +++ b/mercurial/subrepo.py
>> @@ -1524,6 +1524,24 @@
>>              return False
>>     @annotatesubrepoerror
>> +    def add(self, ui, match, prefix, explicitonly, **opts):
>> +        rev = self._state[1]
>> +        if match.files():
>> +            files = match.files()
>> +        elif match.always():
>> +            self._gitcommand(["add", "--all"])
>> +            return []
>> +        else:
>> +            (modified, added, removed,
>> +             deleted, unknown, ignored, clean) = self.status(None)
>> +            files = unknown
>> +        if match:
>> +            files = [f for f in files if match(f)]
>>
>
> Will match ever be None, since .files() was called on it first?
>
Hm, I'll go over the code calling this, I'm unsure.

>
>  +        for f in files:
>> +            self._gitcommand(["add", f])
>> +        return []
>> +
>> +    @annotatesubrepoerror
>>      def remove(self):
>>          if self._gitmissing():
>>              return
>>
>
> Do we need this _gitmissing() check for add() too?  Don't forget to update
> mercurial/help/subrepos.txt.
>
I'll have a look as well :-) Note that 'remove' is not for removing git
files, it's for removing the entire subrepository. 'forget' is used for
removing files (but is not yet implemented for git).

>
> I was surprised that the matcher has all of the files in it, but your
> tests seem to prove it does.  Looking at the code, it looks like it goes
> cmdutil.add() -> wctx.walk(match) -> dirstate.walk(), which I guess picks
> up the (non-hg?) subrepo files.  But that dirstate walk is called in a way
> to ignore the files in the parent repo's .hgignore.  And since it doesn't
> know git's ignore file, will it end up adding files git wants to ignore if
> a pattern _other_ than an exact file is supplied, like a glob?  (IDK if git
> will add ignored files if explicitly named like hg.  Maybe it isn't an
> issue.)
>
'git help add' says:

>    -f, --force
>        Allow adding otherwise ignored files.
>
> So without using that option, it should apparently be ok (though I wonder
if it would give an error in that case, I'll test).

>
> The .gitignore issue you can probably solve by always calling
> git.status(), and only acting on the intersection of unknown and
> match.files(), unless a file is match.exact() (or match.always()).  I'm not
> sure what to suggest about bypassing .hgignore, other than to walk again
> with the correct parameter?  Maybe _always_ iterating over unknown from
> git.status() and calling the matcher on each entry?  That may fix both
> cases, but opens the door to silently ignoring the error for adding an
> already tracked file.  That may not be worth worrying about.
>
I'll look into this in a bit more detail. Explicitly using the git status()
seems like the best fix, but I'll add some additional tests, both with a
.hgignore and with a .gitignore.

Thanks for the feedback!

Greetings,
Mathias

>
> --Matt
>
Matt Harbison - Feb. 25, 2015, 2:03 p.m.
On Wed, 25 Feb 2015 01:21:31 -0500, Mathias De Maré  
<mathias.demare@gmail.com> wrote:

> On Wed, Feb 25, 2015 at 2:46 AM, Matt Harbison <mharbison72@gmail.com>
> wrote:
>
>> On Tue, 24 Feb 2015 02:49:49 -0500, Mathias De Maré <
>> mathias.demare@gmail.com> wrote:
>>
>>  # HG changeset patch
>>> # User Mathias De Maré <mathias.demare@gmail.com>
>>> # Date 1424764162 -3600
>>> #      Tue Feb 24 08:49:22 2015 +0100
>>> # Node ID 391c33821c0399e40979ef2440289844f8e12756
>>> # Parent  ff5caa8dfd993680d9602ca6ebb14da9de10d5f4
>>> subrepos: support adding files in git subrepos
>>>
>>> This support includes correct matching, so includes,
>>> excludes and patterns are all supported.
>>>
>>> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
>>> --- a/mercurial/subrepo.py
>>> +++ b/mercurial/subrepo.py
>>> @@ -1524,6 +1524,24 @@
>>>              return False
>>>     @annotatesubrepoerror
>>> +    def add(self, ui, match, prefix, explicitonly, **opts):
>>> +        rev = self._state[1]
>>> +        if match.files():
>>> +            files = match.files()
>>> +        elif match.always():
>>> +            self._gitcommand(["add", "--all"])
>>> +            return []
>>> +        else:
>>> +            (modified, added, removed,
>>> +             deleted, unknown, ignored, clean) = self.status(None)
>>> +            files = unknown
>>> +        if match:
>>> +            files = [f for f in files if match(f)]
>>>
>>
>> Will match ever be None, since .files() was called on it first?
>>
> Hm, I'll go over the code calling this, I'm unsure.
>
>>
>>  +        for f in files:
>>> +            self._gitcommand(["add", f])
>>> +        return []
>>> +
>>> +    @annotatesubrepoerror
>>>      def remove(self):
>>>          if self._gitmissing():
>>>              return
>>>
>>
>> Do we need this _gitmissing() check for add() too?  Don't forget to  
>> update
>> mercurial/help/subrepos.txt.
>>
> I'll have a look as well :-) Note that 'remove' is not for removing git
> files, it's for removing the entire subrepository. 'forget' is used for
> removing files (but is not yet implemented for git).
>
>>
>> I was surprised that the matcher has all of the files in it, but your
>> tests seem to prove it does.  Looking at the code, it looks like it goes
>> cmdutil.add() -> wctx.walk(match) -> dirstate.walk(), which I guess  
>> picks
>> up the (non-hg?) subrepo files.  But that dirstate walk is called in a  
>> way
>> to ignore the files in the parent repo's .hgignore.  And since it  
>> doesn't
>> know git's ignore file, will it end up adding files git wants to ignore  
>> if
>> a pattern _other_ than an exact file is supplied, like a glob?  (IDK if  
>> git
>> will add ignored files if explicitly named like hg.  Maybe it isn't an
>> issue.)
>>
> 'git help add' says:
>
>>    -f, --force
>>        Allow adding otherwise ignored files.
>>
>> So without using that option, it should apparently be ok (though I  
>> wonder
>> if it would give an error in that case, I'll test).

I thought about this some more, and should clarify.  I'm pretty sure that  
if you specify an exact file in hg add, it is added, even if it matches an  
ignore pattern.  And if you specify a pattern to add, ignored files are  
ignored.

I think we should follow those semantics here too, because the (user)  
operation is carried out with an hg command, not git.  So -f will be  
needed in some cases I think, for the git.status() ignored & match.exact()  
cases anyway.  Maybe it's harmless to always add -f if the match is exact,  
and let git throw away the inexact + ignored files (assuming it doesn't  
write junk to stderr).

Not sure how much of a hassle all of this becomes, or how much is worth  
handling.  I didn't see anything stating deviation from existing  
conventions, so it got me wondering about corner cases.

>>
>> The .gitignore issue you can probably solve by always calling
>> git.status(), and only acting on the intersection of unknown and
>> match.files(), unless a file is match.exact() (or match.always()).  I'm  
>> not
>> sure what to suggest about bypassing .hgignore, other than to walk again
>> with the correct parameter?  Maybe _always_ iterating over unknown from
>> git.status() and calling the matcher on each entry?  That may fix both
>> cases, but opens the door to silently ignoring the error for adding an
>> already tracked file.  That may not be worth worrying about.
>>
> I'll look into this in a bit more detail. Explicitly using the git  
> status()
> seems like the best fix, but I'll add some additional tests, both with a
> .hgignore and with a .gitignore.
>
> Thanks for the feedback!
>
> Greetings,
> Mathias
>
>>
>> --Matt
Mathias De Maré - Feb. 26, 2015, 6:50 p.m.
On Wed, Feb 25, 2015 at 2:46 AM, Matt Harbison <mharbison72@gmail.com>
wrote:

> On Tue, 24 Feb 2015 02:49:49 -0500, Mathias De Maré <
> mathias.demare@gmail.com> wrote:
>
>  # HG changeset patch
>> # User Mathias De Maré <mathias.demare@gmail.com>
>> # Date 1424764162 -3600
>> #      Tue Feb 24 08:49:22 2015 +0100
>> # Node ID 391c33821c0399e40979ef2440289844f8e12756
>> # Parent  ff5caa8dfd993680d9602ca6ebb14da9de10d5f4
>> subrepos: support adding files in git subrepos
>>
>> This support includes correct matching, so includes,
>> excludes and patterns are all supported.
>>
>> diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
>> --- a/mercurial/subrepo.py
>> +++ b/mercurial/subrepo.py
>> @@ -1524,6 +1524,24 @@
>>              return False
>>     @annotatesubrepoerror
>> +    def add(self, ui, match, prefix, explicitonly, **opts):
>> +        rev = self._state[1]
>> +        if match.files():
>> +            files = match.files()
>> +        elif match.always():
>> +            self._gitcommand(["add", "--all"])
>> +            return []
>> +        else:
>> +            (modified, added, removed,
>> +             deleted, unknown, ignored, clean) = self.status(None)
>> +            files = unknown
>> +        if match:
>> +            files = [f for f in files if match(f)]
>>
>
> Will match ever be None, since .files() was called on it first?
>
Turns out it won't, going back to commands::add(), it looks to me like this
is always created. I've removed the check (and like you say, it doesn't
make sense to have this check after calling .files() already).

>
>  +        for f in files:
>> +            self._gitcommand(["add", f])
>> +        return []
>> +
>> +    @annotatesubrepoerror
>>      def remove(self):
>>          if self._gitmissing():
>>              return
>>
>
> Do we need this _gitmissing() check for add() too?  Don't forget to update
> mercurial/help/subrepos.txt.
>
We do indeed. I've also checked other commands and I have a separate patch
to fix those.

>
> I was surprised that the matcher has all of the files in it, but your
> tests seem to prove it does.  Looking at the code, it looks like it goes
> cmdutil.add() -> wctx.walk(match) -> dirstate.walk(), which I guess picks
> up the (non-hg?) subrepo files.  But that dirstate walk is called in a way
> to ignore the files in the parent repo's .hgignore.  And since it doesn't
> know git's ignore file, will it end up adding files git wants to ignore if
> a pattern _other_ than an exact file is supplied, like a glob?  (IDK if git
> will add ignored files if explicitly named like hg.  Maybe it isn't an
> issue.)
>
cmdutil first does wctx.walk(match), but afterwards it goes over all of the
subrepos :-)

>
> The .gitignore issue you can probably solve by always calling
> git.status(), and only acting on the intersection of unknown and
> match.files(), unless a file is match.exact() (or match.always()).  I'm not
> sure what to suggest about bypassing .hgignore, other than to walk again
> with the correct parameter?  Maybe _always_ iterating over unknown from
> git.status() and calling the matcher on each entry?  That may fix both
> cases, but opens the door to silently ignoring the error for adding an
> already tracked file.  That may not be worth worrying about.
>
I've implemented this by checking using match.exact() for each file if I
should add '-f'.
I don't do this for 'match.always()', since Mercurial doesn't add ignored
files with a no-args 'hg add'. In fact, I don't treat 'always()' like a
special case, since the regular case allows me to add the "adding filename"
text in the output.

I think the only unhandled case is already tracked files, I decided to do
like you mentioned and not worry about it :-)

Greetings,
Mathias

>
> --Matt
>

Patch

diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
--- a/mercurial/subrepo.py
+++ b/mercurial/subrepo.py
@@ -1524,6 +1524,24 @@ 
             return False
 
     @annotatesubrepoerror
+    def add(self, ui, match, prefix, explicitonly, **opts):
+        rev = self._state[1]
+        if match.files():
+            files = match.files()
+        elif match.always():
+            self._gitcommand(["add", "--all"])
+            return []
+        else:
+            (modified, added, removed,
+             deleted, unknown, ignored, clean) = self.status(None)
+            files = unknown
+        if match:
+            files = [f for f in files if match(f)]
+        for f in files:
+            self._gitcommand(["add", f])
+        return []
+
+    @annotatesubrepoerror
     def remove(self):
         if self._gitmissing():
             return
diff --git a/tests/test-subrepo-git.t b/tests/test-subrepo-git.t
--- a/tests/test-subrepo-git.t
+++ b/tests/test-subrepo-git.t
@@ -850,4 +850,59 @@ 
   $ hg cat -r "parents(.)" --output tmp/%b/foobar s/foobar
   $ diff tmp/tc/foobar catparents
 
+cleanup
+  $ rm -r tmp
+  $ rm catparents
+
+add git files, using either files or patterns
+  $ echo "hsss! hsssssssh!" > s/snake.python
+  $ echo "ccc" > s/c.c
+  $ echo "cpp" > s/cpp.cpp
+
+  $ hg add s/snake.python s/c.c s/cpp.cpp
+  $ hg st --subrepos s
+  M s/foobar
+  A s/c.c
+  A s/cpp.cpp
+  A s/snake.python
+  ? s/barfoo
+  $ hg revert s
+  reverting subrepo ../gitroot
+
+  $ hg add --subrepos "glob:**.python"
+  $ hg st --subrepos s
+  A s/snake.python
+  ? s/barfoo
+  ? s/c.c
+  ? s/cpp.cpp
+  ? s/foobar.orig
+  $ hg revert s
+  reverting subrepo ../gitroot
+
+  $ hg add --subrepos s
+  $ hg st --subrepos s
+  A s/barfoo
+  A s/c.c
+  A s/cpp.cpp
+  A s/foobar.orig
+  A s/snake.python
+  $ hg revert s
+  reverting subrepo ../gitroot
+make sure everything is reverted correctly
+  $ hg st --subrepos s
+  ? s/barfoo
+  ? s/c.c
+  ? s/cpp.cpp
+  ? s/foobar.orig
+  ? s/snake.python
+
+  $ hg add --subrepos --exclude "path:s/c.c"
+  $ hg st --subrepos s
+  A s/barfoo
+  A s/cpp.cpp
+  A s/foobar.orig
+  A s/snake.python
+  ? s/c.c
+  $ hg revert --all -q
+
   $ cd ..