Patchwork [5,of,6,V2,RFC] share: add "full share" suport

login
register
mail settings
Submitter Angel Ezquerra
Date Dec. 26, 2014, 11:46 a.m.
Message ID <bb70464b9121df236db6.1419594413@108.1.168.192.in-addr.arpa>
Download mbox | patch
Permalink /patch/7246/
State Changes Requested
Headers show

Comments

Angel Ezquerra - Dec. 26, 2014, 11:46 a.m.
# HG changeset patch
# User Angel Ezquerra <angel.ezquerra@gmail.com>
# Date 1419360788 -3600
#      Tue Dec 23 19:53:08 2014 +0100
# Node ID bb70464b9121df236db6d264d57b6f56ad78cd3b
# Parent  dd180345dd166fc51d061a8041e014f320c20d6c
share: add "full share" suport

Add a new --full flag that makes it possible to create "full shares".

The actual full commit message comes below, but first some notes and caveats:

# NOTES:
1. This is the first step on the "subrepo store" plan that we discussed during
the mercurial 3.2 sprint
2. This is RFC because I am not certain that this is the right way to do this.
Maybe there is a better, obvious way to do it. In that case I'd love to hear
about it
3. If this approach is not completely crazy, are there any files that I should
add to the fullshareexceptions list? (see below for context)
4. hg unshare is currently unsupported
5. This revision does not handle the locking of the non-store parts of the
repository properly. That will be handled on a separate patch (which in the
final form of this series should be folded into this revision).

# Actual commit message:

A "fully shared repository" is one that shares _everything_ with its source
repository including bookmarks, mq patches and configuration files. The only
difference between the source repo and the fully shared copy is that their
working directory can be different and point to a different revision,
bookmark and branch.

In order to do so we replace the regular localrepo vfs object with a unionvfs,
object, which is the union of a vfs that points to the share source and another
that points to the local repository .hg path. The unionvfs selection function
is such that all files are taken from the shared repository vfs except for a
few key files (dirstate, branch, bookmarks.current, requires, sharedpath and
sharedinfull).

The idea is to only keep the files that are strictly necessary to maintain
separate working directories and a different current branch and active bookmark
in the shared repository, and share everything else with the shared source
repository.
Pierre-Yves David - Dec. 30, 2014, 9:58 p.m.
On 12/26/2014 03:46 AM, Angel Ezquerra wrote:
> # HG changeset patch
> # User Angel Ezquerra <angel.ezquerra@gmail.com>
> # Date 1419360788 -3600
> #      Tue Dec 23 19:53:08 2014 +0100
> # Node ID bb70464b9121df236db6d264d57b6f56ad78cd3b
> # Parent  dd180345dd166fc51d061a8041e014f320c20d6c
> share: add "full share" suport

I've been through the RFC, The vfs dispatcher is a bit scary and can 
probably be avoided, the multi locking is less scary and could be okay.

But this series ring some familiar bells. You are going through some 
dancing to handle "few key files that represent the working copy state". 
I've recently been thinking about how to move toward a newer repository 
format that would allow fully atomic transaction for new data type 
(phases, bookmarks, obsmarkers) and efficient sharing. One of the way to 
improve the situation is to improve "tagging". Knowing what file are 
critical for the transaction, which one depends/control the working copy 
etc.

In my current opinion, the best way to achieve this is to have distinct 
vfs object for distinct category of file. This would remove the needs 
for the unionvfs withcraft.

What do you think?

(note: there is various typo, all around your commit messages. I 
recommend running some spell checker on them)
Angel Ezquerra - Dec. 31, 2014, 1:07 a.m.
On Tue, Dec 30, 2014 at 10:58 PM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
>
>
> On 12/26/2014 03:46 AM, Angel Ezquerra wrote:
>>
>> # HG changeset patch
>> # User Angel Ezquerra <angel.ezquerra@gmail.com>
>> # Date 1419360788 -3600
>> #      Tue Dec 23 19:53:08 2014 +0100
>> # Node ID bb70464b9121df236db6d264d57b6f56ad78cd3b
>> # Parent  dd180345dd166fc51d061a8041e014f320c20d6c
>> share: add "full share" suport
>
>
> I've been through the RFC, The vfs dispatcher is a bit scary and can
> probably be avoided, the multi locking is less scary and could be okay.
>
> But this series ring some familiar bells. You are going through some dancing
> to handle "few key files that represent the working copy state". I've
> recently been thinking about how to move toward a newer repository format
> that would allow fully atomic transaction for new data type (phases,
> bookmarks, obsmarkers) and efficient sharing. One of the way to improve the
> situation is to improve "tagging". Knowing what file are critical for the
> transaction, which one depends/control the working copy etc.
>
> In my current opinion, the best way to achieve this is to have distinct vfs
> object for distinct category of file. This would remove the needs for the
> unionvfs withcraft.
>
> What do you think?

So your proposal would be to remove the existing vfs localrepo
property and replace it with multiple vfs objects, one for each
related group of files? If that is what you propose it could be a way
to avoid the unionvfs. That being said there are quite a few uses (not
a lot, but not few either) of the vfs property in the code, so this
sort of thing would require a number of changes to the code.

To be completely honest my goal is to implement the subrepo store that
we discussed in the Munich sprint. Given the bandwidth that I will
have in the next few months I worry that trying to do this sort of big
architectural change might stop me from reaching that goal. I don't
want to sacrifice the quality of the code of course but I wonder,
isn't using a unionvfs an implementation detail? The important thing
is the behavior of the full shared repos. The implementation itself
can be changed later, can't it?

> (note: there is various typo, all around your commit messages. I recommend
> running some spell checker on them)

Thanks for letting me know. I'd be nice if TortoiseHg had an
integrated spell checker. Probably I'm not the only one who would
benefit from it ( *wink* *wink* ;-) ).

Cheers,

Angel
Pierre-Yves David - Dec. 31, 2014, 1:39 a.m.
On 12/30/2014 05:07 PM, Angel Ezquerra wrote:
> On Tue, Dec 30, 2014 at 10:58 PM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
>>
>>
>> On 12/26/2014 03:46 AM, Angel Ezquerra wrote:
>>>
>>> # HG changeset patch
>>> # User Angel Ezquerra <angel.ezquerra@gmail.com>
>>> # Date 1419360788 -3600
>>> #      Tue Dec 23 19:53:08 2014 +0100
>>> # Node ID bb70464b9121df236db6d264d57b6f56ad78cd3b
>>> # Parent  dd180345dd166fc51d061a8041e014f320c20d6c
>>> share: add "full share" suport
>>
>>
>> I've been through the RFC, The vfs dispatcher is a bit scary and can
>> probably be avoided, the multi locking is less scary and could be okay.
>>
>> But this series ring some familiar bells. You are going through some dancing
>> to handle "few key files that represent the working copy state". I've
>> recently been thinking about how to move toward a newer repository format
>> that would allow fully atomic transaction for new data type (phases,
>> bookmarks, obsmarkers) and efficient sharing. One of the way to improve the
>> situation is to improve "tagging". Knowing what file are critical for the
>> transaction, which one depends/control the working copy etc.
>>
>> In my current opinion, the best way to achieve this is to have distinct vfs
>> object for distinct category of file. This would remove the needs for the
>> unionvfs withcraft.
>>
>> What do you think?
>
> So your proposal would be to remove the existing vfs localrepo
> property and replace it with multiple vfs objects, one for each
> related group of files? If that is what you propose it could be a way
> to avoid the unionvfs. That being said there are quite a few uses (not
> a lot, but not few either) of the vfs property in the code, so this
> sort of thing would require a number of changes to the code.

The localrepo.vfs would not necessary disappear. we already have other 
vfs alongside it (wvfs, svfs). We would introduces a new vfs attribute 
for the category of file you already isolated and use it for these file. 
It should not be too complicated since you already went through the 
trouble of isolating the list of affected file.

> To be completely honest my goal is to implement the subrepo store that
> we discussed in the Munich sprint. Given the bandwidth that I will
> have in the next few months I worry that trying to do this sort of big
> architectural change might stop me from reaching that goal. I don't
> want to sacrifice the quality of the code of course but I wonder,
> isn't using a unionvfs an implementation detail? The important thing
> is the behavior of the full shared repos. The implementation itself
> can be changed later, can't it?

the unionvfs is a bit scary and looks fragile, the new vfs approach is a 
step in a right direction and does not introduce new magical class. The 
point here is not to isolate all possibles category and adds handful of 
new vfs. Just change the minimal amount of call to get you code to work.
Angel Ezquerra - Jan. 3, 2015, 4:45 p.m.
On Wed, Dec 31, 2014 at 2:39 AM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
>
>
> On 12/30/2014 05:07 PM, Angel Ezquerra wrote:
>>
>> On Tue, Dec 30, 2014 at 10:58 PM, Pierre-Yves David
>> <pierre-yves.david@ens-lyon.org> wrote:
>>>
>>>
>>>
>>> On 12/26/2014 03:46 AM, Angel Ezquerra wrote:
>>>>
>>>>
>>>> # HG changeset patch
>>>> # User Angel Ezquerra <angel.ezquerra@gmail.com>
>>>> # Date 1419360788 -3600
>>>> #      Tue Dec 23 19:53:08 2014 +0100
>>>> # Node ID bb70464b9121df236db6d264d57b6f56ad78cd3b
>>>> # Parent  dd180345dd166fc51d061a8041e014f320c20d6c
>>>> share: add "full share" suport
>>>
>>>
>>>
>>> I've been through the RFC, The vfs dispatcher is a bit scary and can
>>> probably be avoided, the multi locking is less scary and could be okay.
>>>
>>> But this series ring some familiar bells. You are going through some
>>> dancing
>>> to handle "few key files that represent the working copy state". I've
>>> recently been thinking about how to move toward a newer repository format
>>> that would allow fully atomic transaction for new data type (phases,
>>> bookmarks, obsmarkers) and efficient sharing. One of the way to improve
>>> the
>>> situation is to improve "tagging". Knowing what file are critical for the
>>> transaction, which one depends/control the working copy etc.
>>>
>>> In my current opinion, the best way to achieve this is to have distinct
>>> vfs
>>> object for distinct category of file. This would remove the needs for the
>>> unionvfs withcraft.
>>>
>>> What do you think?
>>
>>
>> So your proposal would be to remove the existing vfs localrepo
>> property and replace it with multiple vfs objects, one for each
>> related group of files? If that is what you propose it could be a way
>> to avoid the unionvfs. That being said there are quite a few uses (not
>> a lot, but not few either) of the vfs property in the code, so this
>> sort of thing would require a number of changes to the code.
>
>
> The localrepo.vfs would not necessary disappear. we already have other vfs
> alongside it (wvfs, svfs). We would introduces a new vfs attribute for the
> category of file you already isolated and use it for these file. It should
> not be too complicated since you already went through the trouble of
> isolating the list of affected file.

If I understand you correctly, we would need some sort of new
"narrowvfs" class that gets a list of files that it can actually
handle (e.g. the ones needed to maintain the working directory state)
and then enforce those on all of its operations. Then we would have
multiple such narrowvfs objects inside the localrepository, and we
would need to use one of them depending on the context. Is that what
you meant?

>> To be completely honest my goal is to implement the subrepo store that
>> we discussed in the Munich sprint. Given the bandwidth that I will
>> have in the next few months I worry that trying to do this sort of big
>> architectural change might stop me from reaching that goal. I don't
>> want to sacrifice the quality of the code of course but I wonder,
>> isn't using a unionvfs an implementation detail? The important thing
>> is the behavior of the full shared repos. The implementation itself
>> can be changed later, can't it?
>
> the unionvfs is a bit scary and looks fragile, the new vfs approach is a
> step in a right direction and does not introduce new magical class. The
> point here is not to isolate all possibles category and adds handful of new
> vfs. Just change the minimal amount of call to get you code to work.

The thing I like about my original approach (the altvfs) and the new
unionvfs based approach that Matt suggested is that the concentrate
the difference between a regular repository and a fully shared
repository in a single place, the vfs. I think it makes it easy to
reasib about it and to avoid problems when people access the internals
of the localrepository class. The vfs makes sure that you access the
right files on the right location and that's it. The localrepository
users do not need to worry about anything else.

Cheers,

Angel

Patch

diff --git a/hgext/share.py b/hgext/share.py
--- a/hgext/share.py
+++ b/hgext/share.py
@@ -16,10 +16,11 @@ 
 
 @command('share',
     [('U', 'noupdate', None, _('do not create a working copy')),
-     ('B', 'bookmarks', None, _('also share bookmarks'))],
-    _('[-U] [-B] SOURCE [DEST]'),
+     ('B', 'bookmarks', None, _('also share bookmarks')),
+     ('', 'full', None, _('share in full'))],
+    _('[-U] [-B] [--full] SOURCE [DEST]'),
     norepo=True)
-def share(ui, source, dest=None, noupdate=False, bookmarks=False):
+def share(ui, source, dest=None, noupdate=False, bookmarks=False, full=False):
     """create a new shared repository
 
     Initialize a new repository and working directory that shares its
@@ -37,7 +38,8 @@ 
        the broken clone to reset it to a changeset that still exists.
     """
 
-    return hg.share(ui, source, dest, not noupdate, bookmarks)
+    return hg.share(ui, source, dest, not noupdate, bookmarks=bookmarks,
+                    fullshare=full)
 
 @command('unshare', [], '')
 def unshare(ui, repo):
@@ -46,6 +48,8 @@ 
     Copy the store data to the repo and remove the sharedpath data.
     """
 
+    if repo.fullshare:
+        raise util.Abort(_("unsharing a full repository share is unsupported"))
     if not repo.shared():
         raise util.Abort(_("this is not a shared repo"))
 
diff --git a/mercurial/hg.py b/mercurial/hg.py
--- a/mercurial/hg.py
+++ b/mercurial/hg.py
@@ -158,7 +158,7 @@ 
         return ''
     return os.path.basename(os.path.normpath(path))
 
-def share(ui, source, dest=None, update=True, bookmarks=True):
+def share(ui, source, dest=None, update=True, bookmarks=True, fullshare=False):
     '''create a shared repository'''
 
     if not islocal(source):
@@ -201,6 +201,8 @@ 
     requirements += 'shared\n'
     destvfs.write('requires', requirements)
     destvfs.write('sharedpath', sharedpath)
+    if fullshare:
+        destvfs.write('sharedinfull', '')
 
     r = repository(ui, destwvfs.base)
 
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -189,6 +189,7 @@ 
         return self.requirements[:]
 
     def __init__(self, baseui, path=None, create=False):
+        self.fullshare = False
         self.wvfs = scmutil.vfs(path, expandpath=True, realpath=True)
         self.wopener = self.wvfs
         self.root = self.wvfs.base
@@ -256,13 +257,28 @@ 
 
         self.sharedpath = self.path
         try:
-            vfs = scmutil.vfs(self.vfs.read("sharedpath").rstrip('\n'),
+            fullshareexceptions = ('dirstate', 'branch', 'bookmarks.current',
+                                   'requires', 'sharedpath', 'sharedinfull')
+            sourcevfs = scmutil.vfs(self.vfs.read("sharedpath").rstrip('\n'),
                               realpath=True)
-            s = vfs.base
-            if not vfs.exists():
+            s = sourcevfs.base
+            if not sourcevfs.exists():
                 raise error.RepoError(
                     _('.hg/sharedpath points to nonexistent directory %s') % s)
             self.sharedpath = s
+            self.fullshare = self.vfs.exists('sharedinfull')
+            if self.fullshare:
+                # full shares are those in which all files except for the
+                # requires, sharedpath and sharedinfull files must be read from
+                # the source repository
+                self.origpath = self.path
+                self.path = self.sharedpath
+                def selectfn(path):
+                    if path in fullshareexceptions:
+                        return 1
+                    return 0
+                self.vfs = scmutil.unionvfs(selectfn, sourcevfs, self.vfs)
+                self.opener = self.vfs
         except IOError, inst:
             if inst.errno != errno.ENOENT:
                 raise
@@ -760,7 +776,9 @@ 
 
     def shared(self):
         '''the type of shared repository (None if not shared)'''
-        if self.sharedpath != self.path:
+        if self.fullshare:
+            return 'full'
+        elif self.sharedpath != self.path:
             return 'store'
         return None
 
diff --git a/tests/test-share.t b/tests/test-share.t
--- a/tests/test-share.t
+++ b/tests/test-share.t
@@ -297,6 +297,176 @@ 
      bm4                       5:92793bfc8cad
   $ cd ..
 
+create a full share
+
+  $ hg share --full repo1 repo-fullshare
+  updating working directory
+  2 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ cd repo-fullshare
+
+full shares share the whole set of revisions
+  $ hg log -G
+  @  changeset:   5:92793bfc8cad
+  |  bookmark:    bm4
+  |  tag:         tip
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     foo in b
+  |
+  o  changeset:   4:62f4ded848e4
+  |  bookmark:    bm3
+  |  parent:      2:c2e0ac586386
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     testing shared bookmarks
+  |
+  | o  changeset:   3:b87954705719
+  |/   bookmark:    bm1
+  |    user:        test
+  |    date:        Thu Jan 01 00:00:00 1970 +0000
+  |    summary:     testing shared bookmarks
+  |
+  o  changeset:   2:c2e0ac586386
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     another file
+  |
+  o  changeset:   1:8af4dc49db9e
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     change in shared clone
+  |
+  o  changeset:   0:d3873e73d99e
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     init
+  
+and bookmarks
+  $ hg bookmarks
+     bm1                       3:b87954705719
+     bm3                       4:62f4ded848e4
+     bm4                       5:92793bfc8cad
+  $ hg -R ../repo1 bookmarks
+   * bm1                       3:b87954705719
+     bm3                       4:62f4ded848e4
+     bm4                       5:92793bfc8cad
+
+the source and shared repo working directories can point to different revisions
+and bookmarks
+  $ hg update bm3
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  (activating bookmark bm3)
+  $ hg log -r .
+  changeset:   4:62f4ded848e4
+  bookmark:    bm3
+  parent:      2:c2e0ac586386
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     testing shared bookmarks
+  
+  $ hg -R ../repo1 log -r .
+  changeset:   3:b87954705719
+  bookmark:    bm1
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     testing shared bookmarks
+  
+they can also commit to different branches
+  $ hg branch createdinshare
+  marked working directory as branch createdinshare
+  (branches are permanent and global, did you want a bookmark?)
+  $ hg -R ../repo1 branch
+  default
+
+and commit on top of different revisions
+  $ cat a
+  more shared bookmarks
+  $ echo a > a
+  $ hg commit -m "created in full share"
+  $ echo a > ../repo1/a
+  $ hg commit -R ../repo1 -m "created in share source"
+  $ hg log -G
+  o  changeset:   7:e0fb5f85a10b
+  |  bookmark:    bm1
+  |  tag:         tip
+  |  parent:      3:b87954705719
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     created in share source
+  |
+  | @  changeset:   6:5ea6503932c4
+  | |  branch:      createdinshare
+  | |  bookmark:    bm3
+  | |  parent:      4:62f4ded848e4
+  | |  user:        test
+  | |  date:        Thu Jan 01 00:00:00 1970 +0000
+  | |  summary:     created in full share
+  | |
+  | | o  changeset:   5:92793bfc8cad
+  | |/   bookmark:    bm4
+  | |    user:        test
+  | |    date:        Thu Jan 01 00:00:00 1970 +0000
+  | |    summary:     foo in b
+  | |
+  | o  changeset:   4:62f4ded848e4
+  | |  parent:      2:c2e0ac586386
+  | |  user:        test
+  | |  date:        Thu Jan 01 00:00:00 1970 +0000
+  | |  summary:     testing shared bookmarks
+  | |
+  o |  changeset:   3:b87954705719
+  |/   user:        test
+  |    date:        Thu Jan 01 00:00:00 1970 +0000
+  |    summary:     testing shared bookmarks
+  |
+  o  changeset:   2:c2e0ac586386
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     another file
+  |
+  o  changeset:   1:8af4dc49db9e
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     change in shared clone
+  |
+  o  changeset:   0:d3873e73d99e
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     init
+  
+
+full shares should have a limited set of files in its .hg folder
+
+  $ ls .hg
+  bookmarks.current
+  branch
+  dirstate
+  requires
+  sharedinfull
+  sharedpath
+
+  $ ls ../repo1/.hg
+  00changelog.i
+  bookmarks
+  bookmarks.current
+  branch
+  cache
+  dirstate
+  last-message.txt
+  requires
+  store
+  undo.bookmarks
+  undo.branch
+  undo.desc
+  undo.dirstate
+
+unsharing a full share is unsupported
+
+  $ hg unshare
+  abort: unsharing a full repository share is unsupported
+  [255]
+
 Explicitly kill daemons to let the test exit on Windows
 
   $ "$TESTDIR/killdaemons.py" $DAEMON_PIDS