Patchwork [5,of,5] posix: give checklink a fast path that cache the check file and is read only

login
register
mail settings
Submitter Mads Kiilerich
Date Nov. 17, 2016, 6:44 p.m.
Message ID <73b671fbed41d82a5dd4.1479408261@madski>
Download mbox | patch
Permalink /patch/17625/
State Accepted
Headers show

Comments

Mads Kiilerich - Nov. 17, 2016, 6:44 p.m.
# HG changeset patch
# User Mads Kiilerich <madski@unity3d.com>
# Date 1421194526 -3600
#      Wed Jan 14 01:15:26 2015 +0100
# Node ID 73b671fbed41d82a5dd46e485c61ddb8afe42faf
# Parent  5409e0c5e6c0764e802360a3912f7719885ba2b5
posix: give checklink a fast path that cache the check file and is read only

util.checklink would create a symlink and remove it again. That would sometimes
happen multiple times. Write operations are relatively expensive and give disk
tear and noise for applications monitoring file system activity.

Instead of creating a symlink and deleting it again, just create it once and
leave it in .hg/cache/check-link . If the file exists, just verify that
os.islink reports true. We will assume that this check is as good as symlink
creation not failing.

Note: The symlink left in .hg/cache has to resolve to a file - otherwise 'make
dist' will fail ...

test-symlink-os-yes-fs-no.py does some monkey patching to simulate a platform
without symlink support. The slightly different testing method requires
additional monkeying.
Augie Fackler - Nov. 19, 2016, 1:36 a.m.
On Thu, Nov 17, 2016 at 07:44:21PM +0100, Mads Kiilerich wrote:
> # HG changeset patch
> # User Mads Kiilerich <madski@unity3d.com>
> # Date 1421194526 -3600
> #      Wed Jan 14 01:15:26 2015 +0100
> # Node ID 73b671fbed41d82a5dd46e485c61ddb8afe42faf
> # Parent  5409e0c5e6c0764e802360a3912f7719885ba2b5
> posix: give checklink a fast path that cache the check file and is read only

Queued these, thanks.

>
> util.checklink would create a symlink and remove it again. That would sometimes
> happen multiple times. Write operations are relatively expensive and give disk
> tear and noise for applications monitoring file system activity.
>
> Instead of creating a symlink and deleting it again, just create it once and
> leave it in .hg/cache/check-link . If the file exists, just verify that
> os.islink reports true. We will assume that this check is as good as symlink
> creation not failing.
>
> Note: The symlink left in .hg/cache has to resolve to a file - otherwise 'make
> dist' will fail ...
>
> test-symlink-os-yes-fs-no.py does some monkey patching to simulate a platform
> without symlink support. The slightly different testing method requires
> additional monkeying.
>
> diff --git a/mercurial/posix.py b/mercurial/posix.py
> --- a/mercurial/posix.py
> +++ b/mercurial/posix.py
> @@ -220,6 +220,10 @@ def checklink(path):
>      # file already exists
>      while True:
>          cachedir = os.path.join(path, '.hg', 'cache')
> +        checklink = os.path.join(cachedir, 'checklink')
> +        # try fast path, read only
> +        if os.path.islink(checklink):
> +            return True
>          if os.path.isdir(cachedir):
>              checkdir = cachedir
>          else:
> @@ -231,7 +235,13 @@ def checklink(path):
>                                               prefix='hg-checklink-')
>              try:
>                  os.symlink(os.path.basename(fd.name), name)
> -                os.unlink(name)
> +                if cachedir is None:
> +                    os.unlink(name)
> +                else:
> +                    try:
> +                        os.rename(name, checklink)
> +                    except OSError:
> +                        os.unlink(name)
>                  return True
>              except OSError as inst:
>                  # link creation might race, try again
> diff --git a/tests/test-clone.t b/tests/test-clone.t
> --- a/tests/test-clone.t
> +++ b/tests/test-clone.t
> @@ -32,6 +32,7 @@ Trigger branchcache creation:
>    $ ls .hg/cache
>    branch2-served
>    checkisexec
> +  checklink
>    checknoexec
>    rbc-names-v1
>    rbc-revs-v1
> @@ -48,6 +49,7 @@ Ensure branchcache got copied over:
>    $ ls .hg/cache
>    branch2-served
>    checkisexec
> +  checklink
>
>    $ cat a
>    a
> diff --git a/tests/test-symlink-os-yes-fs-no.py b/tests/test-symlink-os-yes-fs-no.py
> --- a/tests/test-symlink-os-yes-fs-no.py
> +++ b/tests/test-symlink-os-yes-fs-no.py
> @@ -35,6 +35,9 @@ commands.status(u, repo)
>  def symlink_failure(src, dst):
>      raise OSError(1, "Operation not permitted")
>  os.symlink = symlink_failure
> +def islink_failure(path):
> +    return False
> +os.path.islink = islink_failure
>
>  # dereference links as if a Samba server has exported this to a
>  # Windows client
> diff --git a/tests/test-tags.t b/tests/test-tags.t
> --- a/tests/test-tags.t
> +++ b/tests/test-tags.t
> @@ -673,6 +673,7 @@ Missing tags2* files means the cache was
>    $ ls tagsclient/.hg/cache
>    branch2-served
>    checkisexec
> +  checklink
>    hgtagsfnodes1
>    rbc-names-v1
>    rbc-revs-v1
> @@ -698,6 +699,7 @@ Running hg tags should produce tags2* fi
>    $ ls tagsclient/.hg/cache
>    branch2-served
>    checkisexec
> +  checklink
>    hgtagsfnodes1
>    rbc-names-v1
>    rbc-revs-v1
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Danek Duvall - Nov. 23, 2016, 4 a.m.
Mads Kiilerich wrote:

> # HG changeset patch
> # User Mads Kiilerich <madski@unity3d.com>
> # Date 1421194526 -3600
> #      Wed Jan 14 01:15:26 2015 +0100
> # Node ID 73b671fbed41d82a5dd46e485c61ddb8afe42faf
> # Parent  5409e0c5e6c0764e802360a3912f7719885ba2b5
> posix: give checklink a fast path that cache the check file and is read only
> 
> util.checklink would create a symlink and remove it again. That would sometimes
> happen multiple times. Write operations are relatively expensive and give disk
> tear and noise for applications monitoring file system activity.
> 
> Instead of creating a symlink and deleting it again, just create it once and
> leave it in .hg/cache/check-link . If the file exists, just verify that
> os.islink reports true. We will assume that this check is as good as symlink
> creation not failing.
> 
> Note: The symlink left in .hg/cache has to resolve to a file - otherwise 'make
> dist' will fail ...
> 
> test-symlink-os-yes-fs-no.py does some monkey patching to simulate a platform
> without symlink support. The slightly different testing method requires
> additional monkeying.
> 
> diff --git a/mercurial/posix.py b/mercurial/posix.py
> --- a/mercurial/posix.py
> +++ b/mercurial/posix.py
> @@ -220,6 +220,10 @@ def checklink(path):
>      # file already exists
>      while True:
>          cachedir = os.path.join(path, '.hg', 'cache')
> +        checklink = os.path.join(cachedir, 'checklink')
> +        # try fast path, read only
> +        if os.path.islink(checklink):
> +            return True
>          if os.path.isdir(cachedir):
>              checkdir = cachedir
>          else:
> @@ -231,7 +235,13 @@ def checklink(path):
>                                               prefix='hg-checklink-')
>              try:
>                  os.symlink(os.path.basename(fd.name), name)
> -                os.unlink(name)
> +                if cachedir is None:
> +                    os.unlink(name)
> +                else:
> +                    try:
> +                        os.rename(name, checklink)
> +                    except OSError:
> +                        os.unlink(name)
>                  return True

I'm a little confused by this.  Since fd refers to a temporary file that
gets deleted when the variable goes out of scope, then checklink ends up
dangling after all, violating your Note in the description.

A handful of tests fail on Solaris because "cp -r" follows symlinks by
default, and it squawks on all the dangling checklink files.  That's
obviously related to this somehow, though whether or not it's specifically
because of that I'm not positive.

I could fix it by making all the failing cp -r calls be cp -rP, but I don't
really like that solution.

I tried adding "delete=(cachedir is None)" (or the equivalent) to the
NamedTemporaryFile call, and that cleared up all the cp -r errors, but left
behind three tests failing due to the files left behind.  At least two of
them are straightforward to fix, I think, since they're just the output of
ls commands, but the ones in test-hardlinks.t I'm not so sure about.

Thanks,
Danek
Danek Duvall - Nov. 23, 2016, 5:53 a.m.
Danek Duvall wrote:

> I tried adding "delete=(cachedir is None)" (or the equivalent) to the
> NamedTemporaryFile call, and that cleared up all the cp -r errors, but left
> behind three tests failing due to the files left behind.  At least two of
> them are straightforward to fix, I think, since they're just the output of
> ls commands, but the ones in test-hardlinks.t I'm not so sure about.

I'm wondering if util.copyfiles() needs to be smarter about copying/linking
symlinks.  But I could just add "| grep -v checklink" to those tests and
sweep it under the rug.

Danek

Patch

diff --git a/mercurial/posix.py b/mercurial/posix.py
--- a/mercurial/posix.py
+++ b/mercurial/posix.py
@@ -220,6 +220,10 @@  def checklink(path):
     # file already exists
     while True:
         cachedir = os.path.join(path, '.hg', 'cache')
+        checklink = os.path.join(cachedir, 'checklink')
+        # try fast path, read only
+        if os.path.islink(checklink):
+            return True
         if os.path.isdir(cachedir):
             checkdir = cachedir
         else:
@@ -231,7 +235,13 @@  def checklink(path):
                                              prefix='hg-checklink-')
             try:
                 os.symlink(os.path.basename(fd.name), name)
-                os.unlink(name)
+                if cachedir is None:
+                    os.unlink(name)
+                else:
+                    try:
+                        os.rename(name, checklink)
+                    except OSError:
+                        os.unlink(name)
                 return True
             except OSError as inst:
                 # link creation might race, try again
diff --git a/tests/test-clone.t b/tests/test-clone.t
--- a/tests/test-clone.t
+++ b/tests/test-clone.t
@@ -32,6 +32,7 @@  Trigger branchcache creation:
   $ ls .hg/cache
   branch2-served
   checkisexec
+  checklink
   checknoexec
   rbc-names-v1
   rbc-revs-v1
@@ -48,6 +49,7 @@  Ensure branchcache got copied over:
   $ ls .hg/cache
   branch2-served
   checkisexec
+  checklink
 
   $ cat a
   a
diff --git a/tests/test-symlink-os-yes-fs-no.py b/tests/test-symlink-os-yes-fs-no.py
--- a/tests/test-symlink-os-yes-fs-no.py
+++ b/tests/test-symlink-os-yes-fs-no.py
@@ -35,6 +35,9 @@  commands.status(u, repo)
 def symlink_failure(src, dst):
     raise OSError(1, "Operation not permitted")
 os.symlink = symlink_failure
+def islink_failure(path):
+    return False
+os.path.islink = islink_failure
 
 # dereference links as if a Samba server has exported this to a
 # Windows client
diff --git a/tests/test-tags.t b/tests/test-tags.t
--- a/tests/test-tags.t
+++ b/tests/test-tags.t
@@ -673,6 +673,7 @@  Missing tags2* files means the cache was
   $ ls tagsclient/.hg/cache
   branch2-served
   checkisexec
+  checklink
   hgtagsfnodes1
   rbc-names-v1
   rbc-revs-v1
@@ -698,6 +699,7 @@  Running hg tags should produce tags2* fi
   $ ls tagsclient/.hg/cache
   branch2-served
   checkisexec
+  checklink
   hgtagsfnodes1
   rbc-names-v1
   rbc-revs-v1