Patchwork treemanifest: don't use a manifest in the cache if it's dirty

login
register
mail settings
Submitter Tony Tung
Date Feb. 18, 2016, 11:46 p.m.
Message ID <a7cd652b0f4398e939eb.1455839178@andromeda.local>
Download mbox | patch
Permalink /patch/13259/
State Changes Requested
Delegated to: Augie Fackler
Headers show

Comments

Tony Tung - Feb. 18, 2016, 11:46 p.m.
# HG changeset patch
# User Tony Tung <tonytung@merly.org>
# Date 1455839061 28800
#      Thu Feb 18 15:44:21 2016 -0800
# Node ID a7cd652b0f4398e939ebc10c6ceff86c4c119a72
# Parent  c4ee1010595a7cae4aad0c7dbcd152719f780e6a
treemanifest: don't use a manifest in the cache if it's dirty.

When doing hg convert, we may encounter subrepo merges.  In that case, the
working directory subrepo will be dirty (i.e., differ from the repository
state), which will cause the manifest to be written to.  Subsequent reads
of this manifest entry will note that it's dirty, and calls to node() will
fail.

Resolves https://bz.mercurial-scm.org/show_bug.cgi?id=5076
Martin von Zweigbergk - Feb. 21, 2016, 5:52 a.m.
On Thu, Feb 18, 2016 at 3:46 PM, Tony Tung <tonytung@fb.com> wrote:
> # HG changeset patch
> # User Tony Tung <tonytung@merly.org>
> # Date 1455839061 28800
> #      Thu Feb 18 15:44:21 2016 -0800
> # Node ID a7cd652b0f4398e939ebc10c6ceff86c4c119a72
> # Parent  c4ee1010595a7cae4aad0c7dbcd152719f780e6a
> treemanifest: don't use a manifest in the cache if it's dirty.

Drop the trailing period and add " (issue5076)" instead (see
https://www.mercurial-scm.org/wiki/ContributingChanges#Submission_checklist).

> When doing hg convert, we may encounter subrepo merges.  In that case, the
> working directory subrepo will be dirty (i.e., differ from the repository
> state), which will cause the manifest to be written to.  Subsequent reads
> of this manifest entry will note that it's dirty, and calls to node() will
> fail.

Nice find!

However, it wasn't clear to me why this would be specific to
treemanifests, so I added dirty checking to the flat manifest code as
well (see patch on http://paste.debian.net/402058/) and ran the test
in the issue tracker (probably the same as in your patch) *without*
treemanifests. Sure enough, it fails there too. So it seems like the
only reason this is failing only with treemanifests is that only
treemanifests check that they are clean in some cases. Please test it
for yourself to make sure I didn't just see what I wanted to see when
I tested it.

As I said on the issue tracker, I've spent a lot of time trying to
understand how manifestmerge() interacts with submerge(). Perhaps the
reading of dirty manifests is the reason I never understood what was
happening.

Anyway, what do we do now? I think the test case should be moved out
of test-treemanifest.t at least. Is the fix to add dirty-checking in
manifestdict as well and not bypass the cache there too? Or perhaps
the bug is in the subrepo code?
Tony Tung - Feb. 22, 2016, 10:42 p.m.
> On Feb 20, 2016, at 9:52 PM, Martin von Zweigbergk <martinvonz@google.com> wrote:

> 

> On Thu, Feb 18, 2016 at 3:46 PM, Tony Tung <tonytung@fb.com> wrote:

>> # HG changeset patch

>> # User Tony Tung <tonytung@merly.org>

>> # Date 1455839061 28800

>> #      Thu Feb 18 15:44:21 2016 -0800

>> # Node ID a7cd652b0f4398e939ebc10c6ceff86c4c119a72

>> # Parent  c4ee1010595a7cae4aad0c7dbcd152719f780e6a

>> treemanifest: don't use a manifest in the cache if it's dirty.

> 

> Drop the trailing period and add " (issue5076)" instead (see

> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mercurial-2Dscm.org_wiki_ContributingChanges-23Submission-5Fchecklist&d=CwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=nX46-h6uPFfl2aGGFhBQbg&m=kj1Em7qqtj4fSbjVd3KC92KDCH1vKBgo45lrg82b8zI&s=XGXEMdVOTsQ6ObOYMeSjsuEn-gDSsEHfkIeppaPaL7E&e= ).


Thanks, will do.

>> When doing hg convert, we may encounter subrepo merges.  In that case, the

>> working directory subrepo will be dirty (i.e., differ from the repository

>> state), which will cause the manifest to be written to.  Subsequent reads

>> of this manifest entry will note that it's dirty, and calls to node() will

>> fail.

> 

> Nice find!

> 

> However, it wasn't clear to me why this would be specific to

> treemanifests, so I added dirty checking to the flat manifest code as

> well (see patch on https://urldefense.proofpoint.com/v2/url?u=http-3A__paste.debian.net_402058_&d=CwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=nX46-h6uPFfl2aGGFhBQbg&m=kj1Em7qqtj4fSbjVd3KC92KDCH1vKBgo45lrg82b8zI&s=GQhm80laTpyUYP6IY5JZG3FvWnv7io_Q7f1r29o4iKE&e= ) and ran the test

> in the issue tracker (probably the same as in your patch) *without*

> treemanifests. Sure enough, it fails there too. So it seems like the

> only reason this is failing only with treemanifests is that only

> treemanifests check that they are clean in some cases. Please test it

> for yourself to make sure I didn't just see what I wanted to see when

> I tested it.

> 

> As I said on the issue tracker, I've spent a lot of time trying to

> understand how manifestmerge() interacts with submerge(). Perhaps the

> reading of dirty manifests is the reason I never understood what was

> happening.

> 

> Anyway, what do we do now? I think the test case should be moved out

> of test-treemanifest.t at least. Is the fix to add dirty-checking in

> manifestdict as well and not bypass the cache there too? Or perhaps

> the bug is in the subrepo code?


I think the fundamental issue (bearing in mind that I’m pretty inexperienced in mercurial internals :) ) is that the convert code is doing a manifestmerge into a working context.  This is not actually the case, as it does not update the working directory (which is why the subrepo is considered dirty).

I had an alternate fix that went deep into merge.py, but was pretty ugly.  It would behave differently if called with a special argument indicating that it was not a working context.  However, it may be more correct.

Happy to move the test case out, and to add dirty checking for alternate manifest implementations.

Love to hear your thoughts.

Thanks,
Tony

Patch

diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -963,7 +963,11 @@ 
         if node == revlog.nullid:
             return self._newmanifest() # don't upset local cache
         if node in self._mancache:
-            return self._mancache[node][0]
+            cached = self._mancache[node][0]
+            # if it's a tree manifest and dirty, then throw it away.  refetch
+            # it.
+            if not (self._treeondisk and cached._dirty):
+                return cached
         if self._treeondisk:
             def gettext():
                 return self.revision(node)
diff --git a/tests/test-treemanifest.t b/tests/test-treemanifest.t
--- a/tests/test-treemanifest.t
+++ b/tests/test-treemanifest.t
@@ -626,3 +626,82 @@ 
   bundle requirements: generaldelta, revlogv1, treemanifest
   $ hg debugbundle --spec repo-packed.hg
   none-packed1;requirements%3Dgeneraldelta%2Crevlogv1%2Ctreemanifest
+
+Converting a repo with subrepos
+
+  $ mkdir subrepos
+  $ cd subrepos
+  $ hg init main
+  $ cd main
+  $ echo "bar" > foo
+  $ hg add foo
+  $ hg commit -m "adding foo"
+  $ hg init nested
+  $ echo nested = nested > .hgsub
+  $ hg add .hgsub
+  $ hg commit -m "subrepo"
+  $ cd nested
+  $ echo "foo" > baz
+  $ hg add baz
+  $ hg commit -m baz
+  $ cd ..
+  $ hg commit -m "substate"
+  $ hg up -C 0
+  0 files updated, 0 files merged, 2 files removed, 0 files unresolved
+  $ echo "foomain" > barmain
+  $ hg add barmain
+  $ hg commit -m "foo:bar:main"
+  created new head
+  $ hg up 2
+  2 files updated, 0 files merged, 1 files removed, 0 files unresolved
+  $ hg merge 3
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  (branch merge, don't forget to commit)
+  $ hg commit -m "substate merge"
+  $ cd ..
+  $ hg init --config experimental.treemanifest=1 main-tree
+  $ hg convert --config experimental.treemanifest=1 -s hg -d hg --config extensions.convert= main/nested main-tree/nested
+  initializing destination main-tree/nested repository
+  scanning source...
+  sorting...
+  converting...
+  0 baz
+  $ hg convert --config experimental.treemanifest=1 -s hg -d hg --config extensions.convert= main main-tree
+  scanning source...
+  sorting...
+  converting...
+  4 adding foo
+  3 subrepo
+  2 substate
+  1 foo:bar:main
+  0 substate merge
+  $ cd main-tree
+  $ hg log -G -T 'changeset: {node}\n desc: {desc}\n'
+  o    changeset: dd89f10386a7a20b6c47ba1f654fd9351d3a8af1
+  |\    desc: substate merge
+  | o  changeset: d7f4a353d996a012f0f96da6a64a3897834fea66
+  | |   desc: foo:bar:main
+  o |  changeset: 1e5c05b42e66cf36e1fdf89688a619a91e7e0fcf
+  | |   desc: substate
+  o |  changeset: 5fa49dfed1f161b05df29131c919a823a3cfaf18
+  |/    desc: subrepo
+  o  changeset: 7ea9107ac22e3f641679fe6eb24f714f8e215a8e
+      desc: adding foo
+  $ hg diff -r 3 -r 4
+  diff -r d7f4a353d996 -r dd89f10386a7 .hgsub
+  --- /dev/null	* (glob)
+  +++ b/.hgsub	* (glob)
+  @@ -0,0 +1,1 @@
+  +nested = nested
+  diff -r d7f4a353d996 -r dd89f10386a7 .hgsubstate
+  --- /dev/null	* (glob)
+  +++ b/.hgsubstate	* (glob)
+  @@ -0,0 +1,1 @@
+  +a66b20dc9550d7f9d48381371c10d14fe0e78957 nested
+  $ hg diff -r 2 -r 4
+  diff -r 1e5c05b42e66 -r dd89f10386a7 barmain
+  --- /dev/null	* (glob)
+  +++ b/barmain	* (glob)
+  @@ -0,0 +1,1 @@
+  +foomain
+  $ cd ..