Patchwork [4,of,4,STABLE,V2] lock: show about possibility of lock corruption for empty locker

login
register
mail settings
Submitter Katsunori FUJIWARA
Date May 1, 2017, 10:33 a.m.
Message ID <bfde0f99282eccec8c63.1493634806@juju>
Download mbox | patch
Permalink /patch/20323/
State Superseded
Headers show

Comments

Katsunori FUJIWARA - May 1, 2017, 10:33 a.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1493634414 -32400
#      Mon May 01 19:26:54 2017 +0900
# Branch stable
# Node ID bfde0f99282eccec8c63f61905f7165fe4fcf24d
# Parent  df143d27afc9ec8c6f7be606dd3735faa1e9f79f
lock: show about possibility of lock corruption for empty locker

Before this patch, if corrupted lock file makes vfs.readlink() return
empty string, lock acquisition waits for a while, even though lock is
never acquired in such case (this issue was reported by a user using
Mercurial on Windows).

This issue occurs in steps below:

  1. vfs.readlink() returns empty string as "locker" for corrupted
     lock file

  2. lock._testlock() returns empty string, because split(":", 1) on
     "locker" fails by ValueError

  3. lock._trylock() raises LockHeld(EAGAIN), because empty string
     "locker" isn't None

  4. lock.lock() sleeps for next trial

  5. steps above are repeated while timeout, because corrupted lock
     file isn't removed by lock._testlock() at (2)

  6. lock acquisition (and "hg" command, maybe) is aborted by timeout

Then, lock acquisition fails until removal of corrupted lock file by
hand ("hg debuglocks" or so).

On the other hand, it isn't safe to assume that lock was corrupted at
unlocking. There is no mechanical way to determine whether "locker"
process of corrupted lock is still alive or not exactly.

In addition to it, empty "locker" might mean "locking in progress" on
no-symlink platform, because vfs.makelock() isn't atomic action on
such platform (consisting of "open", "write', and "close").

Therefore, it should be responsibility of end users to determine
whether such corrupted lock can be safely removed or not.

This patch shows about possibility of lock corruption for empty
"locker", and suggests use of "hg debuglocks" in "hint" message to
resolve an issue, even though debug commands should be hidden from
ordinary users.

Patch

diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -156,7 +156,8 @@  def callcatch(ui, func):
             reason = _('lock held by %r') % inst.locker
         ui.warn(_("abort: %s: %s\n") % (inst.desc or inst.filename, reason))
         if not inst.locker:
-            ui.warn(_("(lock might be very busy)\n"))
+            ui.warn(_("(lock might be very busy, or corrupted;"
+                      " use 'hg debuglocks' to check corruption)\n"))
     except error.LockUnavailable as inst:
         ui.warn(_("abort: could not lock %s: %s\n") %
                (inst.desc or inst.filename, inst.strerror))
diff --git a/tests/test-no-symlinks.t b/tests/test-no-symlinks.t
--- a/tests/test-no-symlinks.t
+++ b/tests/test-no-symlinks.t
@@ -64,6 +64,12 @@  Test handling lock corruption
   lock:  free
   wlock: free
 
+  $ cat >> .hg/hgrc <<EOF
+  > [ui]
+  > # for immediate timeout
+  > timeout = 0
+  > EOF
+
 (emulate lock corruption by empty file)
 
   $ printf '' > .hg/wlock
@@ -72,3 +78,33 @@  Test handling lock corruption
   lock:  free
   wlock: user *, but lock might be corrupted (*s) (glob)
   [1]
+
+  $ echo foo > foo
+  $ hg add foo
+  waiting for lock on working directory of $TESTTMP/t2 held by '' (glob)
+  abort: working directory of $TESTTMP/t2: timed out waiting for lock held by '' (glob)
+  (lock might be very busy, or corrupted; use 'hg debuglocks' to check corruption)
+  [255]
+
+(check still corrupted)
+
+  $ hg debuglocks
+  lock:  free
+  wlock: user *, but lock might be corrupted (*s) (glob)
+  [1]
+
+(check normal waiting/timedout messages for safety, too)
+
+  $ printf '1234' > .hg/wlock
+
+  $ hg add foo
+  waiting for lock on working directory of $TESTTMP/t2 held by '1234' (glob)
+  abort: working directory of $TESTTMP/t2: timed out waiting for lock held by '1234' (glob)
+  [255]
+
+  $ printf 'foobar:1234' > .hg/wlock
+
+  $ hg add foo
+  waiting for lock on working directory of $TESTTMP/t2 held by process '1234' on host 'foobar' (glob)
+  abort: working directory of $TESTTMP/t2: timed out waiting for lock held by 'foobar:1234' (glob)
+  [255]