Patchwork win32: improve the performance of win32.unlink() over CIFS

login
register
mail settings
Submitter Kaz Nishimura
Date Feb. 4, 2014, 6:55 a.m.
Message ID <be24685ef8ae89aa8996.1391496951@Canopus.local.linuxfront.com>
Download mbox | patch
Permalink /patch/3465/
State Superseded
Commit 1a9ebc83a74c7e7872006491bd984997a9348031
Headers show

Comments

Kaz Nishimura - Feb. 4, 2014, 6:55 a.m.
# HG changeset patch
# User Kaz Nishimura <kazssym@vx68k.org>
# Date 1381984037 -32400
#      Thu Oct 17 13:27:17 2013 +0900
# Branch stable
# Node ID be24685ef8ae89aa89964fc0b147260325b309d9
# Parent  7d269e7620c48eee93871d33e673b7a8acb503cc
win32: improve the performance of win32.unlink() over CIFS

Emulating POSIX unlink() behavior with os.rename() and os.unlink() is often slow
especially over CIFS from Windows clients due to its protocol overhead. This
patch changes win32.unlink() to try first an exclusive open with the Win32
delete-on-close flag, and if a sharing violation is detected, to fall back to
the original emulation.

This patch also removes a test with os.path.isdir() since we expect opening a
directory shall fail as os.unlink() would.

Example measurements (repeated 3-times after 1-time calibration):

(Without this patch: hg update from null to default)
127 files updated, 0 files merged, 0 files removed, 0 files unresolved
time: real 19.871 secs (user 0.328+0.000 sys 1.794+0.000)
time: real 19.622 secs (user 0.312+0.000 sys 2.044+0.000)
time: real 19.138 secs (user 0.250+0.000 sys 1.872+0.000)

(Without this patch: hg update from default to null)
0 files updated, 0 files merged, 127 files removed, 0 files unresolved
time: real 35.158 secs (user 0.156+0.000 sys 2.512+0.000)
time: real 35.272 secs (user 0.250+0.000 sys 2.512+0.000)
time: real 36.569 secs (user 0.203+0.000 sys 2.387+0.000)

(With this patch: hg update from null to default)
127 files updated, 0 files merged, 0 files removed, 0 files unresolved
time: real 17.893 secs (user 0.328+0.000 sys 1.700+0.000)
time: real 18.512 secs (user 0.265+0.000 sys 1.529+0.000)
time: real 20.238 secs (user 0.312+0.000 sys 1.685+0.000)

(With this patch: hg update from default to null)
0 files updated, 0 files merged, 127 files removed, 0 files unresolved
time: real 12.312 secs (user 0.250+0.000 sys 0.811+0.000)
time: real 12.471 secs (user 0.156+0.000 sys 0.889+0.000)
time: real 9.727 secs (user 0.125+0.000 sys 0.858+0.000)
Kaz Nishimura - Feb. 4, 2014, 7:03 a.m.
I am not sure that this patch breaks nothing.  Please let me know if it
actually improves performance or breaks something on your environments.
Thanks in advance.


On Tue, Feb 4, 2014 at 3:55 PM, Kaz Nishimura <kazssym@vx68k.org> wrote:

> # HG changeset patch
> # User Kaz Nishimura <kazssym@vx68k.org>
> # Date 1381984037 -32400
> #      Thu Oct 17 13:27:17 2013 +0900
> # Branch stable
> # Node ID be24685ef8ae89aa89964fc0b147260325b309d9
> # Parent  7d269e7620c48eee93871d33e673b7a8acb503cc
> win32: improve the performance of win32.unlink() over CIFS
>
> Emulating POSIX unlink() behavior with os.rename() and os.unlink() is
> often slow
> especially over CIFS from Windows clients due to its protocol overhead.
> This
> patch changes win32.unlink() to try first an exclusive open with the Win32
> delete-on-close flag, and if a sharing violation is detected, to fall back
> to
> the original emulation.
>
> This patch also removes a test with os.path.isdir() since we expect
> opening a
> directory shall fail as os.unlink() would.
>
> Example measurements (repeated 3-times after 1-time calibration):
>
> (Without this patch: hg update from null to default)
> 127 files updated, 0 files merged, 0 files removed, 0 files unresolved
> time: real 19.871 secs (user 0.328+0.000 sys 1.794+0.000)
> time: real 19.622 secs (user 0.312+0.000 sys 2.044+0.000)
> time: real 19.138 secs (user 0.250+0.000 sys 1.872+0.000)
>
> (Without this patch: hg update from default to null)
> 0 files updated, 0 files merged, 127 files removed, 0 files unresolved
> time: real 35.158 secs (user 0.156+0.000 sys 2.512+0.000)
> time: real 35.272 secs (user 0.250+0.000 sys 2.512+0.000)
> time: real 36.569 secs (user 0.203+0.000 sys 2.387+0.000)
>
> (With this patch: hg update from null to default)
> 127 files updated, 0 files merged, 0 files removed, 0 files unresolved
> time: real 17.893 secs (user 0.328+0.000 sys 1.700+0.000)
> time: real 18.512 secs (user 0.265+0.000 sys 1.529+0.000)
> time: real 20.238 secs (user 0.312+0.000 sys 1.685+0.000)
>
> (With this patch: hg update from default to null)
> 0 files updated, 0 files merged, 127 files removed, 0 files unresolved
> time: real 12.312 secs (user 0.250+0.000 sys 0.811+0.000)
> time: real 12.471 secs (user 0.156+0.000 sys 0.889+0.000)
> time: real 9.727 secs (user 0.125+0.000 sys 0.858+0.000)
>
> diff -r 7d269e7620c4 -r be24685ef8ae mercurial/win32.py
> --- a/mercurial/win32.py
> +++ b/mercurial/win32.py
> @@ -25,4 +25,5 @@
>  # GetLastError
>  _ERROR_SUCCESS = 0
> +_ERROR_SHARING_VIOLATION = 32
>  _ERROR_INVALID_PARAMETER = 87
>  _ERROR_INSUFFICIENT_BUFFER = 122
> @@ -60,5 +61,7 @@
>  _OPEN_EXISTING = 3
>
> +_FILE_FLAG_OPEN_REPARSE_POINT = 0x00200000
>  _FILE_FLAG_BACKUP_SEMANTICS = 0x02000000
> +_FILE_FLAG_DELETE_ON_CLOSE = 0x04000000
>
>  # SetFileAttributes
> @@ -345,9 +348,16 @@
>      '''try to implement POSIX' unlink semantics on Windows'''
>
> -    if os.path.isdir(f):
> -        # use EPERM because it is POSIX prescribed value, even though
> -        # unlink(2) on directories returns EISDIR on Linux
> -        raise IOError(errno.EPERM,
> -                      "Unlinking directory not permitted: '%s'" % f)
> +    # If we can open f exclusively, no other processes must have open
> handles
> +    # for it and we can expect its name will be deleted immediately when
> we
> +    # close the handle unless we have another in the same process.  We
> also
> +    # expect we shall simply fail to open f if it is a directory.
> +    fh = _kernel32.CreateFileA(f, 0, 0, None, _OPEN_EXISTING,
> +        _FILE_FLAG_OPEN_REPARSE_POINT | _FILE_FLAG_DELETE_ON_CLOSE, None)
> +    if fh != _INVALID_HANDLE_VALUE:
> +        _kernel32.CloseHandle(fh)
> +        return
> +    error = _kernel32.GetLastError()
> +    if error != _ERROR_SHARING_VIOLATION:
> +        raise ctypes.WinError(error)
>
>      # POSIX allows to unlink and rename open files. Windows has serious
>
Kaz Nishimura - Feb. 12, 2014, 6:26 a.m.
For your information, this patch could improve file deletion performance
even on local drives.  Here is my measurements on a local clone of the
Mercurial repository.  The actual times may be affected by real-time virus
scanners but it looks interesting, doesn't it?

(Without this patch: hg update from null to 2.9)
1036 files updated, 0 files merged, 0 files removed, 0 files unresolved
time: real 18.233 secs (user 1.919+0.000 sys 3.744+0.000)
time: real 18.065 secs (user 1.856+0.000 sys 4.165+0.000)
time: real 17.867 secs (user 1.778+0.000 sys 4.056+0.000)

(Without this patch: hg update from 2.9 to null)
0 files updated, 0 files merged, 1036 files removed, 0 files unresolved
time: real 16.639 secs (user 0.593+0.000 sys 7.192+0.000)
time: real 16.379 secs (user 0.593+0.000 sys 7.207+0.000)
time: real 16.574 secs (user 0.562+0.000 sys 6.770+0.000)

(With this patch: hg update from null to 2.9)
1036 files updated, 0 files merged, 0 files removed, 0 files unresolved
time: real 18.212 secs (user 1.638+0.000 sys 3.494+0.000)
time: real 18.239 secs (user 1.544+0.000 sys 3.791+0.000)
time: real 18.412 secs (user 1.607+0.000 sys 3.962+0.000)

(With this patch: hg update from 2.9 to null)
0 files updated, 0 files merged, 1036 files removed, 0 files unresolved
time: real 1.210 secs (user 0.374+0.000 sys 0.764+0.000)
time: real 1.267 secs (user 0.390+0.000 sys 0.842+0.000)
time: real 1.335 secs (user 0.343+0.000 sys 0.842+0.000)

Patch

diff -r 7d269e7620c4 -r be24685ef8ae mercurial/win32.py
--- a/mercurial/win32.py
+++ b/mercurial/win32.py
@@ -25,4 +25,5 @@ 
 # GetLastError
 _ERROR_SUCCESS = 0
+_ERROR_SHARING_VIOLATION = 32
 _ERROR_INVALID_PARAMETER = 87
 _ERROR_INSUFFICIENT_BUFFER = 122
@@ -60,5 +61,7 @@ 
 _OPEN_EXISTING = 3
 
+_FILE_FLAG_OPEN_REPARSE_POINT = 0x00200000
 _FILE_FLAG_BACKUP_SEMANTICS = 0x02000000
+_FILE_FLAG_DELETE_ON_CLOSE = 0x04000000
 
 # SetFileAttributes
@@ -345,9 +348,16 @@ 
     '''try to implement POSIX' unlink semantics on Windows'''
 
-    if os.path.isdir(f):
-        # use EPERM because it is POSIX prescribed value, even though
-        # unlink(2) on directories returns EISDIR on Linux
-        raise IOError(errno.EPERM,
-                      "Unlinking directory not permitted: '%s'" % f)
+    # If we can open f exclusively, no other processes must have open handles
+    # for it and we can expect its name will be deleted immediately when we
+    # close the handle unless we have another in the same process.  We also
+    # expect we shall simply fail to open f if it is a directory.
+    fh = _kernel32.CreateFileA(f, 0, 0, None, _OPEN_EXISTING,
+        _FILE_FLAG_OPEN_REPARSE_POINT | _FILE_FLAG_DELETE_ON_CLOSE, None)
+    if fh != _INVALID_HANDLE_VALUE:
+        _kernel32.CloseHandle(fh)
+        return
+    error = _kernel32.GetLastError()
+    if error != _ERROR_SHARING_VIOLATION:
+        raise ctypes.WinError(error)
 
     # POSIX allows to unlink and rename open files. Windows has serious