Patchwork change posix test for symlinks

login
register
mail settings
Submitter Matt Mackall
Date July 28, 2013, 10:48 p.m.
Message ID <1375051717.5856.42.camel@calx>
Download mbox | patch
Permalink /patch/1978/
State Rejected
Headers show

Comments

Matt Mackall - July 28, 2013, 10:48 p.m.
On Sun, 2013-07-28 at 17:28 +0000, DeathGorePain wrote:
> Hello,
> 
> I've been having problems with hg leaving artifacts when the repository
> / working directory is mounted on sshfs. hg-checklink-* linking to the
> `hg root` are left in root of the repository and render all operations
> slow. In somebody else's words (stackoverflow).
> <http://stackoverflow.com/questions/5213876/mercurial-hg-checklinks-recursive-symlink-over-nfs-samba-sshfs-network-drive>
> 
> I have not figured out how your tests work (confusing output) :/ But a
> local test (sudo cp posix.py ...) returned favorable results and the
> artifacts don't exist anymore.
> Please find the patch below.

This patch looks like voodoo.

The existing code was obviously correct and reasonable on normal
filesystems, and we've arrived at its current state after half a decade
of deployment and feedback. You threw that code out and completely
rewrote it apparently without ever diagnosing what the underlying
problem was. And without having done that, you can't tell us why the new
code works but the old code doesn't: voodoo.

(It's also got a few new bugs.)

Ideally, you'd first find the root cause of the problem, and then make a
_minimal_ change to the existing code to deal with it, with an
explanation in the comments or commit messages.

That said, the problem here is almost certainly sshfs's follow_symlinks
mount option, which is a horrible and magically broken feature. After
poking at its weirdness for a bit, I came up with this:

# HG changeset patch
# User Matt Mackall <mpm@selenic.com>
# Date 1375041752 18000
#      Sun Jul 28 15:02:32 2013 -0500
# Branch stable
# Node ID cfdae231ba781a590e2e27b3eeee4dabda884be5
# Parent  9e8298a324accd509cd8df1915cd98dfbdcdb512
checklink: work around sshfs brain-damage (issue3636)

With the follow_symlinks option, sshfs will successfully create links
while claiming it encountered an I/O error. In addition, depending on
the type of link, it may subsequently be impossible to delete the link
via sshfs. Our existing link to '.' will cause sshfs to think the link
is a directory, and thus cause unlink to give EISDIR. Links to
non-existent names or circular links will cause the created link to not
even be visible.

Thus, we need to create a new temporary file and link to that. We'll
still get a failure, but we'll be able to remove the link.
DeathGorePain - July 29, 2013, 6:05 a.m.
You go off attacking my code calling it "voodoo", critising me for a
lack of explanation. Did you even read the stackoverflow post and my
commit message?

>hg-checklink:change posix test for symlinks
>
>old:link to currentdir(hg root), if fail: leave artefacts
>new:link to temporary file and clean up when method exits

And then in the code

>+    # Will create a temporary file in the path
>+    # Try to symlink to it
>+    # And clean up all traces if anything goes wrong

Anyway, whichever patch gets accepted, I don't care as long as the issue
is solved. That stackoverflow post is nearly 2 years.

Thank you so much for your kind words and making contribution to an
opensource project so rewarding. I will definitely be doing this again!

Cheers,
DeathGorePain

Le dim. 28 juil. 2013 22:48:37 UTC, Matt Mackall a écrit :
>
> On Sun, 2013-07-28 at 17:28 +0000, DeathGorePain wrote:
>>
>> Hello,
>>
>> I've been having problems with hg leaving artifacts when the repository
>> / working directory is mounted on sshfs. hg-checklink-* linking to the
>> `hg root` are left in root of the repository and render all operations
>> slow. In somebody else's words (stackoverflow).
>> <http://stackoverflow.com/questions/5213876/mercurial-hg-checklinks-recursive-symlink-over-nfs-samba-sshfs-network-drive>
>>
>> I have not figured out how your tests work (confusing output) :/ But a
>> local test (sudo cp posix.py ...) returned favorable results and the
>> artifacts don't exist anymore.
>> Please find the patch below.
>
>
> This patch looks like voodoo.
>
> The existing code was obviously correct and reasonable on normal
> filesystems, and we've arrived at its current state after half a decade
> of deployment and feedback. You threw that code out and completely
> rewrote it apparently without ever diagnosing what the underlying
> problem was. And without having done that, you can't tell us why the new
> code works but the old code doesn't: voodoo.
>
> (It's also got a few new bugs.)
>
> Ideally, you'd first find the root cause of the problem, and then make a
> _minimal_ change to the existing code to deal with it, with an
> explanation in the comments or commit messages.
>
> That said, the problem here is almost certainly sshfs's follow_symlinks
> mount option, which is a horrible and magically broken feature. After
> poking at its weirdness for a bit, I came up with this:
>
> # HG changeset patch
> # User Matt Mackall <mpm@selenic.com>
> # Date 1375041752 18000
> # Sun Jul 28 15:02:32 2013 -0500
> # Branch stable
> # Node ID cfdae231ba781a590e2e27b3eeee4dabda884be5
> # Parent 9e8298a324accd509cd8df1915cd98dfbdcdb512
> checklink: work around sshfs brain-damage (issue3636)
>
> With the follow_symlinks option, sshfs will successfully create links
> while claiming it encountered an I/O error. In addition, depending on
> the type of link, it may subsequently be impossible to delete the link
> via sshfs. Our existing link to '.' will cause sshfs to think the link
> is a directory, and thus cause unlink to give EISDIR. Links to
> non-existent names or circular links will cause the created link to not
> even be visible.
>
> Thus, we need to create a new temporary file and link to that. We'll
> still get a failure, but we'll be able to remove the link.
>
> diff -r 9e8298a324ac -r cfdae231ba78 mercurial/posix.py
> --- a/mercurial/posix.py Sat Jul 27 19:31:14 2013 -0500
> +++ b/mercurial/posix.py Sun Jul 28 15:02:32 2013 -0500
> @@ -154,10 +154,16 @@
> # file already exists
> name = tempfile.mktemp(dir=path, prefix='hg-checklink-')
> try:
> - os.symlink(".", name)
> + fd = tempfile.NamedTemporaryFile(dir=path, prefix='hg-checklink-')
> + os.symlink(os.path.basename(fd.name), name)
> os.unlink(name)
> return True
> - except (OSError, AttributeError):
> + except AttributeError:
> + return False
> + except OSError, inst:
> + # sshfs might report failure while successfully creating the link
> + if inst[0] == errno.EIO and os.path.exists(name):
> + os.unlink(name)
> return False
>
> def checkosfilename(path):
>
>
Matt Mackall - July 29, 2013, 8:13 p.m.
On Mon, 2013-07-29 at 06:05 +0000, DeathGorePain wrote:
> You go off attacking my code calling it "voodoo", critising me for a
> lack of explanation. Did you even read the stackoverflow post and my
> commit message?
> 
> >hg-checklink:change posix test for symlinks
> >
> >old:link to currentdir(hg root), if fail: leave artefacts
> >new:link to temporary file and clean up when method exits

Let me try again:

You're saying something like "my face swelled up when I ate M&Ms, when I
eat some tree bark it's better". It's great that you found a fix and
brought it to our attention, but unfortunately it's way too mysterious
to be useful to us without more work.

What we need to see is something like "my face swelled up when I ate
_peanut_ M&Ms _because I'm allergic to peanuts_, when I eat this tree
bark _that contains a chemical that counteracts allergic reactions_ it's
better".

Here, the fix is the same, only the explanation has changed. Voodoo vs
engineering is all about the specificity and testability of the
explanations.

Here, "sshfs" is not a sufficient explanation in the same way "M&Ms" are
not a sufficient explanation. In fact, sshfs works just fine with
symlinks with its built-in defaults[1], which I tested right after I got
your message. And the existing code is of course perfectly correct for
real POSIX filesystems.

Sshfs (a) with -o follow_symlinks (b) lying about creating symlinks and
(c) refusing to delete the kind of otherwise perfectly reasonable link
we were creating is the explanation. That's something we can understand,
document, diagnose, test, improve, and avoid regressing in the future.

None of the key facts (a) through (c) is on Stackoverflow or in your
commit message or in your comments, which is crucial for avoiding this
bug coming back when future hackers are tempted to cleanup this
mysterious code that's trying to delete symlinks that were never
created. In fact, your commit message and comments neglect to mention
'sshfs' at all(!).

(Normally, we walk people through the steps needed to fix up the various
problems in their submissions, but instead I got deep enough into
researching the underlying problem that it was easier to just show you
what the end result ought to look like.)


[1] But at least Macfusion screws up the defaults:
https://code.google.com/p/macfusion/issues/detail?id=284

Patch

diff -r 9e8298a324ac -r cfdae231ba78 mercurial/posix.py
--- a/mercurial/posix.py	Sat Jul 27 19:31:14 2013 -0500
+++ b/mercurial/posix.py	Sun Jul 28 15:02:32 2013 -0500
@@ -154,10 +154,16 @@ 
     # file already exists
     name = tempfile.mktemp(dir=path, prefix='hg-checklink-')
     try:
-        os.symlink(".", name)
+        fd = tempfile.NamedTemporaryFile(dir=path, prefix='hg-checklink-')
+        os.symlink(os.path.basename(fd.name), name)
         os.unlink(name)
         return True
-    except (OSError, AttributeError):
+    except AttributeError:
+        return False
+    except OSError, inst:
+        # sshfs might report failure while successfully creating the link
+        if inst[0] == errno.EIO and os.path.exists(name):
+            os.unlink(name)
         return False
 
 def checkosfilename(path):