Submitter | Pulkit Goyal |
---|---|
Date | Nov. 5, 2016, 11:16 p.m. |
Message ID | <b5fc4e71286dd4f33336.1478387785@pulkit-goyal> |
Download | mbox | patch |
Permalink | /patch/17360/ |
State | Accepted |
Headers | show |
Comments
On Sun, 06 Nov 2016 04:46:25 +0530, Pulkit Goyal wrote: > # HG changeset patch > # User Pulkit Goyal <7895pulkit@gmail.com> > # Date 1478387186 -19800 > # Sun Nov 06 04:36:26 2016 +0530 > # Node ID b5fc4e71286dd4f33336e4f38e0b9fb17f51f1e3 > # Parent 6eed3ee0df425da61d03bfe024dd082f3176ce5d > py3: have bytes version of sys.argv > > sys.argv returns unicodes on Python 3. We need a bytes version for us. > There was also a python bug/feature request which wanted then to implement > one. They rejected and it is quoted in one of the comments that we can use > fsencode() to get a bytes version of sys.argv. Though not sure about its > correctness. > > Link to the comment: http://bugs.python.org/issue8776#msg217416 > > After this patch we will have pycompat.sysargv which will return us bytes > version of sys.argv. If this patch goes in, i will like to make transformer > rewrite sys.argv with pycompat.argv because there are lot of occurences. > > diff -r 6eed3ee0df42 -r b5fc4e71286d mercurial/pycompat.py > --- a/mercurial/pycompat.py Sun Nov 06 04:17:19 2016 +0530 > +++ b/mercurial/pycompat.py Sun Nov 06 04:36:26 2016 +0530 > @@ -41,6 +41,7 @@ > osname = os.name.encode('ascii') > ospathsep = os.pathsep.encode('ascii') > ossep = os.sep.encode('ascii') > + sysargv = list(map(os.fsencode, sys.argv)) Looks good to me. Can you add a comment why we can use os.fsencode() here (and the weirdness of Python 3 on Unix.) We might need a Windows workaround because the situation is slightly different, but we wouldn't want to care for now.
On Sun, Nov 6, 2016 at 9:04 AM, Yuya Nishihara <yuya@tcha.org> wrote: > On Sun, 06 Nov 2016 04:46:25 +0530, Pulkit Goyal wrote: >> # HG changeset patch >> # User Pulkit Goyal <7895pulkit@gmail.com> >> # Date 1478387186 -19800 >> # Sun Nov 06 04:36:26 2016 +0530 >> # Node ID b5fc4e71286dd4f33336e4f38e0b9fb17f51f1e3 >> # Parent 6eed3ee0df425da61d03bfe024dd082f3176ce5d >> py3: have bytes version of sys.argv >> >> sys.argv returns unicodes on Python 3. We need a bytes version for us. >> There was also a python bug/feature request which wanted then to implement >> one. They rejected and it is quoted in one of the comments that we can use >> fsencode() to get a bytes version of sys.argv. Though not sure about its >> correctness. >> >> Link to the comment: http://bugs.python.org/issue8776#msg217416 >> >> After this patch we will have pycompat.sysargv which will return us bytes >> version of sys.argv. If this patch goes in, i will like to make transformer >> rewrite sys.argv with pycompat.argv because there are lot of occurences. >> >> diff -r 6eed3ee0df42 -r b5fc4e71286d mercurial/pycompat.py >> --- a/mercurial/pycompat.py Sun Nov 06 04:17:19 2016 +0530 >> +++ b/mercurial/pycompat.py Sun Nov 06 04:36:26 2016 +0530 >> @@ -41,6 +41,7 @@ >> osname = os.name.encode('ascii') >> ospathsep = os.pathsep.encode('ascii') >> ossep = os.sep.encode('ascii') >> + sysargv = list(map(os.fsencode, sys.argv)) > > Looks good to me. Can you add a comment why we can use os.fsencode() here > (and the weirdness of Python 3 on Unix.) We might need a Windows workaround > because the situation is slightly different, but we wouldn't want to care > for now. Well I will resend this patch because I am not sure about its correctness still. I followed that issue where Victor Stinner, one who wrote os.environb commented this. There are few doubts/confusions or maybe I want to just confirm it with MJ once.
On Sun, 6 Nov 2016 23:55:50 +0530, Pulkit Goyal wrote: > On Sun, Nov 6, 2016 at 9:04 AM, Yuya Nishihara <yuya@tcha.org> wrote: > > On Sun, 06 Nov 2016 04:46:25 +0530, Pulkit Goyal wrote: > >> # HG changeset patch > >> # User Pulkit Goyal <7895pulkit@gmail.com> > >> # Date 1478387186 -19800 > >> # Sun Nov 06 04:36:26 2016 +0530 > >> # Node ID b5fc4e71286dd4f33336e4f38e0b9fb17f51f1e3 > >> # Parent 6eed3ee0df425da61d03bfe024dd082f3176ce5d > >> py3: have bytes version of sys.argv > >> > >> sys.argv returns unicodes on Python 3. We need a bytes version for us. > >> There was also a python bug/feature request which wanted then to implement > >> one. They rejected and it is quoted in one of the comments that we can use > >> fsencode() to get a bytes version of sys.argv. Though not sure about its > >> correctness. > >> > >> Link to the comment: http://bugs.python.org/issue8776#msg217416 > >> > >> After this patch we will have pycompat.sysargv which will return us bytes > >> version of sys.argv. If this patch goes in, i will like to make transformer > >> rewrite sys.argv with pycompat.argv because there are lot of occurences. > >> > >> diff -r 6eed3ee0df42 -r b5fc4e71286d mercurial/pycompat.py > >> --- a/mercurial/pycompat.py Sun Nov 06 04:17:19 2016 +0530 > >> +++ b/mercurial/pycompat.py Sun Nov 06 04:36:26 2016 +0530 > >> @@ -41,6 +41,7 @@ > >> osname = os.name.encode('ascii') > >> ospathsep = os.pathsep.encode('ascii') > >> ossep = os.sep.encode('ascii') > >> + sysargv = list(map(os.fsencode, sys.argv)) > > > > Looks good to me. Can you add a comment why we can use os.fsencode() here > > (and the weirdness of Python 3 on Unix.) We might need a Windows workaround > > because the situation is slightly different, but we wouldn't want to care > > for now. > > Well I will resend this patch because I am not sure about its > correctness still. I followed that issue where Victor Stinner, one who > wrote os.environb commented this. There are few doubts/confusions or > maybe I want to just confirm it with MJ once. It's generally wrong to assume argv is in filesystem encoding, but I think that's okay for Python 3 on Unix. They builds "wchar_t argv" from "char argv" by Py_DecodeLocale(), which would be identical to fsdecode() on Unix. https://hg.python.org/cpython/file/v3.5.1/Programs/python.c#l55 On Windows, the native argv appears to be wchar_t, so we'll need a different hack to simulate the Python 2 (i.e. ANSI Win32 API) behavior.
(resend with CC the list) On Mon, 7 Nov 2016 20:40:19 +0530, Pulkit Goyal wrote: > > On Windows, the native argv appears to be wchar_t, so we'll need a different > > hack to simulate the Python 2 (i.e. ANSI Win32 API) behavior. > > Well we are not touching the Python 2 behaviour, it remains the same > with the patch. And it will be great if you can push this one with > your explanation appended :) Ok, I'll queue this and send a follow-up patch to add comment. > After this patch we will have pycompat.sysargv which will return us bytes > version of sys.argv. If this patch goes in, i will like to make transformer > rewrite sys.argv with pycompat.argv because there are lot of occurences. Maybe I said at the sprint, I'm not a big fan of rewriting complex nodes by the transformer. And some of sys.argv might not work if replaced to bytes argv.
Patch
diff -r 6eed3ee0df42 -r b5fc4e71286d mercurial/pycompat.py --- a/mercurial/pycompat.py Sun Nov 06 04:17:19 2016 +0530 +++ b/mercurial/pycompat.py Sun Nov 06 04:36:26 2016 +0530 @@ -41,6 +41,7 @@ osname = os.name.encode('ascii') ospathsep = os.pathsep.encode('ascii') ossep = os.sep.encode('ascii') + sysargv = list(map(os.fsencode, sys.argv)) def sysstr(s): """Return a keyword str to be passed to Python functions such as @@ -89,6 +90,7 @@ osname = os.name ospathsep = os.pathsep ossep = os.sep + sysargv = sys.argv stringio = io.StringIO empty = _queue.Empty