Patchwork [2,of,2,RFC] extdiff: use -S to archive the full repo

Submitter Matt Harbison
Date Feb. 13, 2015, 3:45 a.m.
Message ID <op.xty8pvhu9lwrgf@envy>
Permalink /patch/7798/
State Deferred

Comments

Matt Harbison - Feb. 13, 2015, 3:45 a.m.
On Wed, 11 Feb 2015 07:13:17 -0500, Mathias De Maré  
<mathias.demare@gmail.com> wrote:

> On Wed, Feb 11, 2015 at 4:23 AM, Matt Harbison <mharbison72@gmail.com>
> wrote:
>
>> On Tue, 10 Feb 2015 14:36:42 -0500, Mathias De Maré
>> <mathias.demare@gmail.com> wrote:
>>
>>> Adding a single file in the working directory worked fine (since it only
>>> does the snapshot for the old (non-working dir) context).
>>> When I add a file and remove another, I get 'abort: no files match the
>>> archive pattern' when the snapshot of the working directory is done. I
>>> had a further look, and it seems (archive.py:307-310) like for the
>>> subrepositories, the revision is always extracted by checking the
>>> substate.
>>> As a result, the last revision is used (instead of the working
>>> directory).
>>> Specifically for Git, I have the impression there's additionally the
>>> problem that 'git archive' does not support archiving the working
>>> directory.
>>>
>>
>> After some digging, I think that makes sense.  hg doesn't support
>> archiving the working copy from the command line either.  Maybe what you
>> can try is in subrepo:1550, put a print statement before returning from
>> the 'if not revision' check.  The working copy is represented by None, so 0
>> files is returned, and if the parent didn't have anything to archive,
>> archive() complains.  (I also bet that if the delta is only file removes,
>> it will also archive 0 files and abort.  Maybe we need to pass some sort of
>> flag from extdiff to archive to control whether it aborts.)
>>
>
> No, I think the issue is simply that when you select the subrepo version
> through substate (like is done in archive.py:307-310), you will get the
> latest version of the substate file, and as a result you will get the
> latest committed revision, instead of None.
> I just tried this (with 1 added file and 1 removed file), and the output
> when adding on subrepo:1550 is (this is with your 2 RFC patches and the
> one you presented in this mail):
>
>> revision: aa307b592e8a579e599c79665d9f1c4ec19af59f
>> revision: aa307b592e8a579e599c79665d9f1c4ec19af59f
>>
>
> Since the only file that is passed to the subrepo archive is exactly the
> file that was added (and so is not in the revision specified), the
> archive will indeed say that no files can be archived.
> So archive should actually return 'None' to the subrepo, so we can handle
> it correctly in the subrepo.
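
[Illustration of the failure mode described in the quoted text above. This
is a toy Python 3 model, not Mercurial code: Abort, archive_subrepo() and
archive() are hypothetical stand-ins, and the assumption is simply that
the subrepo is archived at its last committed revision, so a file that
was only added in the working directory is never found.]

```python
# Toy model only: these names are hypothetical and are not Mercurial APIs.


class Abort(Exception):
    """Stands in for the error behind 'no files match the archive pattern'."""


def archive_subrepo(committed_files, wanted):
    """Count the wanted files that exist in the subrepo's committed revision."""
    return len([f for f in wanted if f in committed_files])


def archive(parent_files, sub_files, wanted):
    """Archive matching parent and subrepo files; abort if nothing matched."""
    total = len([f for f in wanted if f in parent_files])
    total += archive_subrepo(sub_files, wanted)
    if total == 0:
        raise Abort("no files match the archive pattern")
    return total
```

In this model, asking for only a freshly added file against a revision
that does not contain it archives nothing and raises Abort, which matches
the behavior reported above.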

Sorry for the delayed response.  I had to reboot last night, and Windows  
decided to install 11 updates... and it still wasn't done 12 hours later  
when I left for work this morning.

I put together some tests to see if I could figure out the differences  
between what I was seeing and what you were.  I had assumed that adding a  
file to a subrepo and modifying one in a subrepo wouldn't cause a  
difference in how the recursion is handled, but that doesn't appear to be  
the case.  This can be applied on top of the 3rd patch in this chain.   
There's either a subtle subrepo bug, or (more likely), I'm missing  
something.

# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1423795917 18000
#      Thu Feb 12 21:51:57 2015 -0500
# Node ID 2a257d5dafbb12c498e47a8c5d0a5c29495f1ea2
# Parent  57c3177c05122a6655ac27ed71f07276ef5c8375
extdiff: basic hg subrepo testing

What extdiff sees when a subrepo has a modified file, and when it has an added
file is different for reasons unknown.  A modified file in a git subrepo also
works as we would like (i.e. the modified file is archived), but apparently it
doesn't archive an added file either.



>>
>> I think you may need to manually copy from the filesystem to the archive
>> for this conditional.  Since the status call was previously done, every
>> file you need to archive is listed in the matcher.
>>
> That is indeed correct, I'll have a look if I can create a patch to
> handle that case for git and for svn. However, we don't even get into
> that case yet, because of the issue above.

After further digging, the only time I've seen this case triggered is if  
the subrepo hasn't been committed to the parent repo yet.  I was hoping  
this would be used as 'give me the wctx of this subrepo', but it's not.  I  
can see how it gets there in subrepo:145.
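
[A minimal sketch of the "manually copy from the filesystem" idea in the
quoted text above, in plain Python 3. snapshot_files() is a hypothetical
helper, not an existing Mercurial function, and it assumes the caller has
already turned the matcher into a list of relative paths.]

```python
import os
import shutil


def snapshot_files(srcroot, destroot, relpaths):
    """Copy each listed file from srcroot into destroot, keeping relative paths.

    Paths that do not exist on disk (for example removed files) are skipped.
    Returns the number of files copied.
    """
    copied = 0
    for rel in relpaths:
        src = os.path.join(srcroot, rel)
        if not os.path.isfile(src):
            continue
        dest = os.path.join(destroot, rel)
        # Create intermediate directories so nested paths keep their layout.
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy2(src, dest)
        copied += 1
    return copied
```

Because this only touches the filesystem, the same approach would apply to
git and svn subrepos alike. Returning the count lets the caller decide
whether zero copied files should be treated as an error.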

I wonder if context.substate() should be overridden in workingctx to set  
the state to None, so that repo[None].substate always yields subrepo  
working directories.  We would need it for a revset symbol that indicates  
the working directory too.  It's different from repo['.'].substate, so I  
don't think it will break anything.  I'll try it and see what happens, but  
maybe a subrepo expert can chime in.
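
[A toy sketch of the override proposed above, in Python 3. These classes
only borrow the names changectx and workingctx for readability; they are
not Mercurial's real context classes. The assumption is that each substate
entry is a (source, revision, kind) tuple, with the revision at index 1 as
in self._state[1].]

```python
# Toy classes only; they mirror the idea, not Mercurial's real contexts.


class changectx(object):
    """A committed revision: substate maps path -> (source, revision, kind)."""

    def __init__(self, substate):
        self._substate = dict(substate)

    @property
    def substate(self):
        return dict(self._substate)


class workingctx(changectx):
    """The working directory: report every subrepo revision as None."""

    @property
    def substate(self):
        return {path: (source, None, kind)
                for path, (source, _rev, kind) in self._substate.items()}
```

With this shape, code that reads the revision from a working-directory
context would see None and could take the working-copy path, while
committed contexts keep returning the recorded revision.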

>>
>> Whatever you implement should work for svn too, since it is just
>> filesystem access.  (But svn seems to use the base class implementation,
>> which doesn't check the revision.  Not sure what is going on there.)
>>
>>
>>> If I then commit the added and removed file and run extdiff on those,
>>> it works fine.
>>>
>>> So to get the working directory working, we'll also need to modify the
>>> support for archiving subrepos in general and git subrepos in
>>> particular.
>>>
>>
>> Here's a demo that working directory archiving for hgsubrepos works.
>> The thing that surprised me is it seems to ignore uncommitted adds.  But
>> so does 'hg diff' apparently.  This should apply on top of this series.
>> Ignore the tests at the very bottom around line 518- it uses largefiles,
>> and is probably wonky.
>>
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1423621101 18000
>> #      Tue Feb 10 21:18:21 2015 -0500
>> # Node ID 57c3177c05122a6655ac27ed71f07276ef5c8375
>> # Parent  9c4f27e5c804662d2d25581ad3856ef6da3729ec
>> extdiff: quick and dirty working copy snapshot via archive

[snip]

Patch

diff --git a/mercurial/subrepo.py b/mercurial/subrepo.py
--- a/mercurial/subrepo.py
+++ b/mercurial/subrepo.py
@@ -658,6 +658,8 @@ 
      def diff(self, ui, diffopts, node2, match, prefix, **opts):
          try:
              node1 = node.bin(self._state[1])
+#            print('hgsub diff: "%s"' % self._state[1])
+
              # We currently expect node2 to come from substate and be
              # in hex format
              if node2 is not None:
@@ -676,6 +678,7 @@ 
          total = abstractsubrepo.archive(self, archiver, prefix, match)
          rev = self._state[1]
          ctx = self._repo[rev]
+#        print('hgsub archive: "%s, ctx type is %s"' % (rev, type(ctx)))
          for subpath in ctx.substate:
              s = subrepo(ctx, subpath)
              submatch = matchmod.narrowmatcher(subpath, match)
diff --git a/tests/test-subrepo-extdiff.t b/tests/test-subrepo-extdiff.t
new file mode 100644
--- /dev/null
+++ b/tests/test-subrepo-extdiff.t
@@ -0,0 +1,93 @@ 
+  $ cat >> $HGRCPATH <<EOF
+  > [extensions]
+  > extdiff=
+  > EOF
+
+  $ hg init root
+  $ echo "hgsub = hgsub" > root/.hgsub
+  $ cd root
+  $ hg init hgsub
+  $ hg add .hgsub
+  $ echo test > hgsub/modified.txt
+  $ hg -R hgsub add hgsub/modified.txt
+
+-------------------------------------------------------------------------------
+For hg subrepos that have not been locked in, changes are visible to diff, but
+NOT extdiff.  self._state[1] is ''
+
+  $ hg status -S
+  A .hgsub
+  A hgsub/modified.txt
+
+  $ hg diff -S
+  diff -r 000000000000 .hgsub
+  --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+  +++ b/.hgsub	* (glob)
+  @@ -0,0 +1,1 @@
+  +hgsub = hgsub
+  diff -r 000000000000 hgsub/modified.txt
+  --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+  +++ b/hgsub/modified.txt	* (glob)
+  @@ -0,0 +1,1 @@
+  +test
+
+  $ hg extdiff -S
+  diff -Npru root.000000000000/.hgsub root/.hgsub
+  --- root.000000000000/.hgsub	1970-01-01 00:00:00 +0000
+  +++ root/.hgsub	* (glob)
+  @@ -0,0 +1 @@
+  +hgsub = hgsub
+  [1]
+
+  $ hg ci -qm "modified.txt -> test" -S
+
+-------------------------------------------------------------------------------
+For hg subrepos that *have* been locked in, changes are visible to diff
+AND extdiff.  self._state[1] is 50a11ebc6587
+
+  $ echo 'second mod' > hgsub/modified.txt
+
+  $ hg status -S
+  M hgsub/modified.txt
+  $ hg diff -S
+  diff -r 50a11ebc6587 hgsub/modified.txt
+  --- a/hgsub/modified.txt	Thu Jan 01 00:00:00 1970 +0000
+  +++ b/hgsub/modified.txt	* (glob)
+  @@ -1,1 +1,1 @@
+  -test
+  +second mod
+  $ hg extdiff -S
+  --- c:/users/matt/appdata/local/temp/extdiff.*/root.*/hgsub/modified.txt	* (glob)
+  +++ $TESTTMP/root/hgsub/modified.txt	* (glob)
+  @@ -1 +1 @@
+  -test
+  +second mod
+  [1]
+
+-------------------------------------------------------------------------------
+But adding a file to hgsubrepo causes extdiff to see *nothing*.  self._state[1]
+is still 50a11ebc6587, as it was when we had better results in the last test.
+
+  $ echo 'added' > hgsub/added.txt
+#  $ hg -R hgsub add hgsub/added.txt
+  $ hg add hgsub/added.txt
+
+  $ hg status -S
+  M hgsub/modified.txt
+  A hgsub/added.txt
+
+  $ hg diff -S
+  diff -r 50a11ebc6587 hgsub/added.txt
+  --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+  +++ b/hgsub/added.txt	* (glob)
+  @@ -0,0 +1,1 @@
+  +added
+  diff -r 50a11ebc6587 hgsub/modified.txt
+  --- a/hgsub/modified.txt	Thu Jan 01 00:00:00 1970 +0000
+  +++ b/hgsub/modified.txt	* (glob)
+  @@ -1,1 +1,1 @@
+  -test
+  +second mod
+
+  $ hg extdiff -S
+  [1]
diff --git a/tests/test-subrepo-git.t b/tests/test-subrepo-git.t
--- a/tests/test-subrepo-git.t
+++ b/tests/test-subrepo-git.t
@@ -698,6 +698,16 @@ 
    +foo
    +bar

+  $ hg extdiff -S --config extensions.extdiff=
+  --- nul	1970-01-01 00:00:00 +0000
+  +++ $TESTTMP/tc/s/foobar	* (glob)
+  @@ -0,0 +1,4 @@
+  +woopwoop
+  +
+  +foo
+  +bar
+  [1]
+
    $ hg commit --subrepos -m "Added foobar"
    committing subrepository s
    created new head