Patchwork [STABLE,V3] hg: obtain lock when creating share from pooled repo (issue5104)

login
register
mail settings
Submitter Gregory Szorc
Date Feb. 28, 2016, 2:26 a.m.
Message ID <1fbee0f1decbd380655c.1456626377@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/13439/
State Accepted
Headers show

Comments

Gregory Szorc - Feb. 28, 2016, 2:26 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1456626169 28800
#      Sat Feb 27 18:22:49 2016 -0800
# Branch stable
# Node ID 1fbee0f1decbd380655c9759f2bbf369c12e3b47
# Parent  cb6a952efbf48d306a7c82d7fe46115f072cbb1d
hg: obtain lock when creating share from pooled repo (issue5104)

There are race conditions between clients performing a shared clone
to pooled storage:

1) Clients race to create the new shared repo in the pool directory
2) 1 client is seeding the repo in the pool directory and another goes
   to share it before it is fully cloned

We prevent these race conditions by obtaining a lock in the pool
directory that is derived from the name of the repo we will be
accessing.

To test this, a simple generic "lockdelay" extension has been added.
The extension inserts an optional, configurable delay before or after
lock acquisition. In the test, we delay 2 seconds after lock acquisition
in the first process and 1 second before lock acquisition in the 2nd
process. This means the first process has 1s to obtain the lock. There
is a race condition here. If we encounter it in the wild, we could
change the dummy extension to wait on the lock file to appear instead
of relying on timing. But that's more complicated. Let's see what
happens first.
Matt Mackall - March 1, 2016, 11:55 p.m.
On Sat, 2016-02-27 at 18:26 -0800, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1456626169 28800
> #      Sat Feb 27 18:22:49 2016 -0800
> # Branch stable
> # Node ID 1fbee0f1decbd380655c9759f2bbf369c12e3b47
> # Parent  cb6a952efbf48d306a7c82d7fe46115f072cbb1d
> hg: obtain lock when creating share from pooled repo (issue5104)

Queued for stable, thanks.

-- 
Mathematics is the supreme nostalgia of our time.
Danek Duvall - March 3, 2016, 4:26 a.m.
I'm getting this:

    --- .../tests/test-clone.t
    +++ .../tests/test-clone.t.err
    @@ -1067,18 +1067,18 @@
       adding manifests
       adding file changes
       added 3 changesets with 3 changes to 1 files
    -  updating working directory
    -  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
       searching for changes
       no changes found
       adding remote bookmark bookA
    +  updating working directory
    +  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
     
       $ cat race2.log
       (sharing from existing pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
    -  updating working directory
    -  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
    -  waiting for lock on repository share-destrace2 held by * (glob)
    -  got lock after \d+ seconds (re)
    +  waiting for lock on repository share-destrace2 held by 'smelly:17476'
    +  got lock after 0 seconds
       searching for changes
       no changes found
       adding remote bookmark bookA
    +  updating working directory
    +  1 files updated, 0 files merged, 0 files removed, 0 files unresolved

but I'm not sure what to do with it.  I can try to dive into it if the
cause isn't apparent to you.

Thanks,
Danek
timeless - March 3, 2016, 2 p.m.
Fwiw, I get that on default, but not on stable. - I.e. it was correct for
stable.

I ran it using repeat tests with a large number.
On Mar 3, 2016 12:24 AM, "Danek Duvall" <danek.duvall@oracle.com> wrote:

> I'm getting this:
>
>     --- .../tests/test-clone.t
>     +++ .../tests/test-clone.t.err
>     @@ -1067,18 +1067,18 @@
>        adding manifests
>        adding file changes
>        added 3 changesets with 3 changes to 1 files
>     -  updating working directory
>     -  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
>        searching for changes
>        no changes found
>        adding remote bookmark bookA
>     +  updating working directory
>     +  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
>
>        $ cat race2.log
>        (sharing from existing pooled repository
> b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
>     -  updating working directory
>     -  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
>     -  waiting for lock on repository share-destrace2 held by * (glob)
>     -  got lock after \d+ seconds (re)
>     +  waiting for lock on repository share-destrace2 held by
> 'smelly:17476'
>     +  got lock after 0 seconds
>        searching for changes
>        no changes found
>        adding remote bookmark bookA
>     +  updating working directory
>     +  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
>
> but I'm not sure what to do with it.  I can try to dive into it if the
> cause isn't apparent to you.
>
> Thanks,
> Danek
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>

Patch

diff --git a/mercurial/hg.py b/mercurial/hg.py
--- a/mercurial/hg.py
+++ b/mercurial/hg.py
@@ -330,27 +330,40 @@  def clonewithshare(ui, peeropts, sharepa
     revs = None
     if rev:
         if not srcpeer.capable('lookup'):
             raise error.Abort(_("src repository does not support "
                                "revision lookup and so doesn't "
                                "support clone by revision"))
         revs = [srcpeer.lookup(r) for r in rev]
 
+    # Obtain a lock before checking for or cloning the pooled repo otherwise
+    # 2 clients may race creating or populating it.
+    pooldir = os.path.dirname(sharepath)
+    # lock class requires the directory to exist.
+    try:
+        util.makedir(pooldir, False)
+    except OSError as e:
+        if e.errno != errno.EEXIST:
+            raise
+
+    poolvfs = scmutil.vfs(pooldir)
     basename = os.path.basename(sharepath)
 
-    if os.path.exists(sharepath):
-        ui.status(_('(sharing from existing pooled repository %s)\n') %
-                  basename)
-    else:
-        ui.status(_('(sharing from new pooled repository %s)\n') % basename)
-        # Always use pull mode because hardlinks in share mode don't work well.
-        # Never update because working copies aren't necessary in share mode.
-        clone(ui, peeropts, source, dest=sharepath, pull=True,
-              rev=rev, update=False, stream=stream)
+    with lock.lock(poolvfs, '%s.lock' % basename):
+        if os.path.exists(sharepath):
+            ui.status(_('(sharing from existing pooled repository %s)\n') %
+                      basename)
+        else:
+            ui.status(_('(sharing from new pooled repository %s)\n') % basename)
+            # Always use pull mode because hardlinks in share mode don't work
+            # well. Never update because working copies aren't necessary in
+            # share mode.
+            clone(ui, peeropts, source, dest=sharepath, pull=True,
+                  rev=rev, update=False, stream=stream)
 
     sharerepo = repository(ui, path=sharepath)
     share(ui, sharerepo, dest=dest, update=update, bookmarks=False)
 
     # We need to perform a pull against the dest repo to fetch bookmarks
     # and other non-store data that isn't shared by default. In the case of
     # non-existing shared repo, this means we pull from the remote twice. This
     # is a bit weird. But at the time it was implemented, there wasn't an easy
diff --git a/tests/lockdelay.py b/tests/lockdelay.py
new file mode 100644
--- /dev/null
+++ b/tests/lockdelay.py
@@ -0,0 +1,26 @@ 
+# Dummy extension that adds a delay after acquiring a lock.
+#
+# This extension can be used to test race conditions between lock acquisition.
+
+from __future__ import absolute_import
+
+import os
+import time
+
+from mercurial import (
+    lock as lockmod,
+)
+
+class delaylock(lockmod.lock):
+    def lock(self):
+        delay = float(os.environ.get('HGPRELOCKDELAY', '0.0'))
+        if delay:
+            time.sleep(delay)
+        res = super(delaylock, self).lock()
+        delay = float(os.environ.get('HGPOSTLOCKDELAY', '0.0'))
+        if delay:
+            time.sleep(delay)
+        return res
+
+def extsetup(ui):
+    lockmod.lock = delaylock
diff --git a/tests/test-clone.t b/tests/test-clone.t
--- a/tests/test-clone.t
+++ b/tests/test-clone.t
@@ -1020,8 +1020,54 @@  Test that auto sharing doesn't cause fai
   $ hg -R a id -r 0
   acb14030fe0a
   $ hg id -R remote -r 0
   abort: repository remote not found!
   [255]
   $ hg --config share.pool=share -q clone -e "python \"$TESTDIR/dummyssh\"" a ssh://user@dummy/remote
   $ hg -R remote id -r 0
   acb14030fe0a
+
+Cloning into pooled storage doesn't race (issue5104)
+
+  $ HGPOSTLOCKDELAY=2.0 hg --config share.pool=racepool --config extensions.lockdelay=$TESTDIR/lockdelay.py clone source1a share-destrace1 > race1.log 2>&1 &
+  $ HGPRELOCKDELAY=1.0 hg --config share.pool=racepool --config extensions.lockdelay=$TESTDIR/lockdelay.py clone source1a share-destrace2  > race2.log 2>&1
+  $ wait
+
+  $ hg -R share-destrace1 log -r tip
+  changeset:   2:e5bfe23c0b47
+  bookmark:    bookA
+  tag:         tip
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     1a
+  
+
+  $ hg -R share-destrace2 log -r tip
+  changeset:   2:e5bfe23c0b47
+  bookmark:    bookA
+  tag:         tip
+  user:        test
+  date:        Thu Jan 01 00:00:00 1970 +0000
+  summary:     1a
+  
+  $ cat race1.log
+  (sharing from new pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  searching for changes
+  no changes found
+  adding remote bookmark bookA
+
+  $ cat race2.log
+  (sharing from existing pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  waiting for lock on repository share-destrace2 held by * (glob)
+  got lock after \d+ seconds (re)
+  searching for changes
+  no changes found
+  adding remote bookmark bookA