Patchwork [1,of,3] killdaemons: fix error handling on Windows

login
register
mail settings
Submitter Matt Harbison
Date May 15, 2017, 4:08 a.m.
Message ID <024271e90987e5794dab.1494821310@Envy>
Download mbox | patch
Permalink /patch/20623/
State Changes Requested
Delegated to: Yuya Nishihara
Headers show

Comments

Matt Harbison - May 15, 2017, 4:08 a.m.
# HG changeset patch
# User Matt Harbison <matt_harbison@yahoo.com>
# Date 1494779534 14400
#      Sun May 14 12:32:14 2017 -0400
# Node ID 024271e90987e5794dab6eb00844467065fae7e4
# Parent  db0b470547d5e21f042390a204f09ae7d7303757
killdaemons: fix error handling on Windows

After taking Adrian's suggestion[1] to use util.posixfile to avoid os.unlink()
errors, the error changed from:

  Errored test-hgweb-json.t: Traceback (most recent call last):
    File "./run-tests.py", line 724, in run
      self.tearDown()
    File "./run-tests.py", line 805, in tearDown
      killdaemons(entry)
    File "./run-tests.py", line 540, in killdaemons
      logfn=vlog)
    File "...\tests\killdaemons.py", line 94, in killdaemons
      os.unlink(pidfile)
  WindowsError: [Error 32] The process cannot access the file because it is
     being used by another process: '...\\hgtests.zmpqj3\\child80\\daemon.pids'

to:

  Errored test-largefiles.t: Traceback (most recent call last):
    File "./run-tests.py", line 724, in run
      self.tearDown()
    File "./run-tests.py", line 805, in tearDown
      killdaemons(entry)
    File "./run-tests.py", line 540, in killdaemons
      logfn=vlog)
    File "...\tests\killdaemons.py", line 91, in killdaemons
      kill(pid, logfn, tryhard)
    File "...\tests\killdaemons.py", line 55, in kill
      _check(ctypes.windll.kernel32.CloseHandle(handle))
    File "...\tests\killdaemons.py", line 18, in _check
      raise ctypes.WinError(winerrno)
  WindowsError: [Error 6] The handle is invalid.

The handle in question is for the process, not the file.  That made me wonder
why WaitForSingleObject() didn't raise an error, and it's because it isn't a
BOOL return.  (For this method, a return of 0 is WAIT_OBJECT_0, and an error is
signalled by returning (DWORD) 0xFFFFFFFF.)  When I printed out that value (as
returned by the function) with '0x%x', it printed '0x-1'.  There are some bugs
around DWORDs in this range[2], and since the code explictly tests for errors it
wants to handle, it seems safer to test for 'ret != TRUE', to avoid silent
breakage in the future if this bug is fixed.  The only case not handled is
WAIT_ABANDONED, which will now raise.  ctypes.windll.kernel32.GetLastError has a
'c_long' in its restype field, instead of 'c_ulong' for some reason.

With that fixed, WaitForSingleObject() started raising an error with the same
message, meaning OpenProcess() was returning an invalid handle.  Since HANDLE is
a pointer, the proper check is for None, not 0.  With this fixed, I confirmed
with a print statement that the err 87 case now properly bails out.

[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-May/097600.html
[2] https://bugs.python.org/issue28474
Yuya Nishihara - May 15, 2017, 2:21 p.m.
On Mon, 15 May 2017 00:08:30 -0400, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison@yahoo.com>
> # Date 1494779534 14400
> #      Sun May 14 12:32:14 2017 -0400
> # Node ID 024271e90987e5794dab6eb00844467065fae7e4
> # Parent  db0b470547d5e21f042390a204f09ae7d7303757
> killdaemons: fix error handling on Windows
> 
> After taking Adrian's suggestion[1] to use util.posixfile to avoid os.unlink()
> errors, the error changed from:
> 
>   Errored test-hgweb-json.t: Traceback (most recent call last):
>     File "./run-tests.py", line 724, in run
>       self.tearDown()
>     File "./run-tests.py", line 805, in tearDown
>       killdaemons(entry)
>     File "./run-tests.py", line 540, in killdaemons
>       logfn=vlog)
>     File "...\tests\killdaemons.py", line 94, in killdaemons
>       os.unlink(pidfile)
>   WindowsError: [Error 32] The process cannot access the file because it is
>      being used by another process: '...\\hgtests.zmpqj3\\child80\\daemon.pids'
> 
> to:
> 
>   Errored test-largefiles.t: Traceback (most recent call last):
>     File "./run-tests.py", line 724, in run
>       self.tearDown()
>     File "./run-tests.py", line 805, in tearDown
>       killdaemons(entry)
>     File "./run-tests.py", line 540, in killdaemons
>       logfn=vlog)
>     File "...\tests\killdaemons.py", line 91, in killdaemons
>       kill(pid, logfn, tryhard)
>     File "...\tests\killdaemons.py", line 55, in kill
>       _check(ctypes.windll.kernel32.CloseHandle(handle))
>     File "...\tests\killdaemons.py", line 18, in _check
>       raise ctypes.WinError(winerrno)
>   WindowsError: [Error 6] The handle is invalid.
> 
> The handle in question is for the process, not the file.  That made me wonder
> why WaitForSingleObject() didn't raise an error, and it's because it isn't a
> BOOL return.

I think it's better to define _check() function per return-value type. Win32
API doesn't state that the TRUE value is 1, even though it's practically 1.

> (For this method, a return of 0 is WAIT_OBJECT_0, and an error is
> signalled by returning (DWORD) 0xFFFFFFFF.)  When I printed out that value (as
> returned by the function) with '0x%x', it printed '0x-1'.

That's probably because we don't teach ctypes the function signature. A return
value is taken as int by default.

> -        if handle == 0:
> +        if handle is None:
>              _check(0, 87) # err 87 when process not found

This could be _checkptr(handle, 87) for example.
Matt Harbison - May 16, 2017, 1:03 a.m.
On Mon, 15 May 2017 10:21:13 -0400, Yuya Nishihara <yuya@tcha.org> wrote:

> On Mon, 15 May 2017 00:08:30 -0400, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison@yahoo.com>
>> # Date 1494779534 14400
>> #      Sun May 14 12:32:14 2017 -0400
>> # Node ID 024271e90987e5794dab6eb00844467065fae7e4
>> # Parent  db0b470547d5e21f042390a204f09ae7d7303757
>> killdaemons: fix error handling on Windows
>>
>> After taking Adrian's suggestion[1] to use util.posixfile to avoid  
>> os.unlink()
>> errors, the error changed from:
>>
>>   Errored test-hgweb-json.t: Traceback (most recent call last):
>>     File "./run-tests.py", line 724, in run
>>       self.tearDown()
>>     File "./run-tests.py", line 805, in tearDown
>>       killdaemons(entry)
>>     File "./run-tests.py", line 540, in killdaemons
>>       logfn=vlog)
>>     File "...\tests\killdaemons.py", line 94, in killdaemons
>>       os.unlink(pidfile)
>>   WindowsError: [Error 32] The process cannot access the file because  
>> it is
>>      being used by another process:  
>> '...\\hgtests.zmpqj3\\child80\\daemon.pids'
>>
>> to:
>>
>>   Errored test-largefiles.t: Traceback (most recent call last):
>>     File "./run-tests.py", line 724, in run
>>       self.tearDown()
>>     File "./run-tests.py", line 805, in tearDown
>>       killdaemons(entry)
>>     File "./run-tests.py", line 540, in killdaemons
>>       logfn=vlog)
>>     File "...\tests\killdaemons.py", line 91, in killdaemons
>>       kill(pid, logfn, tryhard)
>>     File "...\tests\killdaemons.py", line 55, in kill
>>       _check(ctypes.windll.kernel32.CloseHandle(handle))
>>     File "...\tests\killdaemons.py", line 18, in _check
>>       raise ctypes.WinError(winerrno)
>>   WindowsError: [Error 6] The handle is invalid.
>>
>> The handle in question is for the process, not the file.  That made me  
>> wonder
>> why WaitForSingleObject() didn't raise an error, and it's because it  
>> isn't a
>> BOOL return.
>
> I think it's better to define _check() function per return-value type.  
> Win32
> API doesn't state that the TRUE value is 1, even though it's practically  
> 1.
>
>> (For this method, a return of 0 is WAIT_OBJECT_0, and an error is
>> signalled by returning (DWORD) 0xFFFFFFFF.)  When I printed out that  
>> value (as
>> returned by the function) with '0x%x', it printed '0x-1'.
>
> That's probably because we don't teach ctypes the function signature. A  
> return
> value is taken as int by default.
>
>> -        if handle == 0:
>> +        if handle is None:
>>              _check(0, 87) # err 87 when process not found
>
> This could be _checkptr(handle, 87) for example.

Since there are 3 different return types in play that would need  
specialized check functions (and one of those is special cased to  
WaitForSingleObject()), isn't that a bit over-engineered?  I thought about  
just using _check(0) to handle WFSO() right as I was about to patchbomb  
it, and that would allow the previous 'ret == 0' test.
Yuya Nishihara - May 16, 2017, 2:27 p.m.
On Mon, 15 May 2017 21:03:13 -0400, Matt Harbison wrote:
> On Mon, 15 May 2017 10:21:13 -0400, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> > On Mon, 15 May 2017 00:08:30 -0400, Matt Harbison wrote:
> >> # HG changeset patch
> >> # User Matt Harbison <matt_harbison@yahoo.com>
> >> # Date 1494779534 14400
> >> #      Sun May 14 12:32:14 2017 -0400
> >> # Node ID 024271e90987e5794dab6eb00844467065fae7e4
> >> # Parent  db0b470547d5e21f042390a204f09ae7d7303757
> >> killdaemons: fix error handling on Windows
> >>
> >> After taking Adrian's suggestion[1] to use util.posixfile to avoid  
> >> os.unlink()
> >> errors, the error changed from:
> >>
> >>   Errored test-hgweb-json.t: Traceback (most recent call last):
> >>     File "./run-tests.py", line 724, in run
> >>       self.tearDown()
> >>     File "./run-tests.py", line 805, in tearDown
> >>       killdaemons(entry)
> >>     File "./run-tests.py", line 540, in killdaemons
> >>       logfn=vlog)
> >>     File "...\tests\killdaemons.py", line 94, in killdaemons
> >>       os.unlink(pidfile)
> >>   WindowsError: [Error 32] The process cannot access the file because  
> >> it is
> >>      being used by another process:  
> >> '...\\hgtests.zmpqj3\\child80\\daemon.pids'
> >>
> >> to:
> >>
> >>   Errored test-largefiles.t: Traceback (most recent call last):
> >>     File "./run-tests.py", line 724, in run
> >>       self.tearDown()
> >>     File "./run-tests.py", line 805, in tearDown
> >>       killdaemons(entry)
> >>     File "./run-tests.py", line 540, in killdaemons
> >>       logfn=vlog)
> >>     File "...\tests\killdaemons.py", line 91, in killdaemons
> >>       kill(pid, logfn, tryhard)
> >>     File "...\tests\killdaemons.py", line 55, in kill
> >>       _check(ctypes.windll.kernel32.CloseHandle(handle))
> >>     File "...\tests\killdaemons.py", line 18, in _check
> >>       raise ctypes.WinError(winerrno)
> >>   WindowsError: [Error 6] The handle is invalid.
> >>
> >> The handle in question is for the process, not the file.  That made me  
> >> wonder
> >> why WaitForSingleObject() didn't raise an error, and it's because it  
> >> isn't a
> >> BOOL return.
> >
> > I think it's better to define _check() function per return-value type.  
> > Win32
> > API doesn't state that the TRUE value is 1, even though it's practically  
> > 1.
> >
> >> (For this method, a return of 0 is WAIT_OBJECT_0, and an error is
> >> signalled by returning (DWORD) 0xFFFFFFFF.)  When I printed out that  
> >> value (as
> >> returned by the function) with '0x%x', it printed '0x-1'.
> >
> > That's probably because we don't teach ctypes the function signature. A  
> > return
> > value is taken as int by default.
> >
> >> -        if handle == 0:
> >> +        if handle is None:
> >>              _check(0, 87) # err 87 when process not found
> >
> > This could be _checkptr(handle, 87) for example.
> 
> Since there are 3 different return types in play that would need  
> specialized check functions (and one of those is special cased to  
> WaitForSingleObject()), isn't that a bit over-engineered?  I thought about  
> just using _check(0) to handle WFSO() right as I was about to patchbomb  
> it, and that would allow the previous 'ret == 0' test.

Then, how about extracting a function that translates LastError to WinError?

  if handle is None:
      _mayberaisewinerror(87)

Patch

diff --git a/tests/killdaemons.py b/tests/killdaemons.py
--- a/tests/killdaemons.py
+++ b/tests/killdaemons.py
@@ -11,7 +11,10 @@ 
     import ctypes
 
     def _check(ret, expectederr=None):
-        if ret == 0:
+        # 0 is typically a failure, but NOT for WaitForSingleObject().  It is
+        # expected that the cases which can be handled for WaitForSingleObject()
+        # will be before calling this.
+        if ret != 1:
             winerrno = ctypes.GetLastError()
             if winerrno == expectederr:
                 return True
@@ -27,7 +30,7 @@ 
         handle = ctypes.windll.kernel32.OpenProcess(
                 PROCESS_TERMINATE|SYNCHRONIZE|PROCESS_QUERY_INFORMATION,
                 False, pid)
-        if handle == 0:
+        if handle is None:
             _check(0, 87) # err 87 when process not found
             return # process not found, already finished
         try: