Patchwork [3,of,3,v2] run-tests: use strings for env object

login
register
mail settings
Submitter timeless@mozdev.org
Date March 30, 2016, 8:29 a.m.
Message ID <3ac7919ccd75ec188623.1459326593@waste.org>
Download mbox | patch
Permalink /patch/14176/
State Changes Requested
Delegated to: Yuya Nishihara
Headers show

Comments

timeless@mozdev.org - March 30, 2016, 8:29 a.m.
# HG changeset patch
# User timeless <timeless@mozdev.org>
# Date 1459319268 0
#      Wed Mar 30 06:27:48 2016 +0000
# Node ID 3ac7919ccd75ec18862379804a4227f652f346c3
# Parent  bc149b7f13ad842e7945704a6b27467a1481dab4
run-tests: use strings for env object
Yuya Nishihara - March 31, 2016, 1:39 p.m.
On Wed, 30 Mar 2016 03:29:53 -0500, timeless wrote:
> # HG changeset patch
> # User timeless <timeless@mozdev.org>
> # Date 1459319268 0
> #      Wed Mar 30 06:27:48 2016 +0000
> # Node ID 3ac7919ccd75ec18862379804a4227f652f346c3
> # Parent  bc149b7f13ad842e7945704a6b27467a1481dab4
> run-tests: use strings for env object
> 
> diff --git a/tests/run-tests.py b/tests/run-tests.py
> --- a/tests/run-tests.py
> +++ b/tests/run-tests.py
> @@ -804,14 +804,16 @@
>              offset = '' if i == 0 else '%s' % i
>              env["HGPORT%s" % offset] = '%s' % (self._startport + i)
>          env = os.environ.copy()
> -        env['TESTTMP'] = self._testtmp
> -        env['HOME'] = self._testtmp
> +        testtmp = self._testtmp.decode('utf-8')
> +        threadtmp = self._threadtmp.decode('utf-8')
> +        env['TESTTMP'] = testtmp
> +        env['HOME'] = testtmp
>          # This number should match portneeded in _getport
>          for port in xrange(3):
>              # This list should be parallel to _portmap in _getreplacements
>              defineport(port)
> -        env["HGRCPATH"] = os.path.join(self._threadtmp, b'.hgrc')
> -        env["DAEMON_PIDS"] = os.path.join(self._threadtmp, b'daemon.pids')
> +        env["HGRCPATH"] = os.path.join(threadtmp, '.hgrc')
> +        env["DAEMON_PIDS"] = os.path.join(threadtmp, 'daemon.pids')

This seems going to the opposite direction. We generally avoid encoding
problems by not using unicode objects.

Perhaps it should use os.environb on Python 3.
Augie Fackler - April 5, 2016, 6:14 p.m.
On Thu, Mar 31, 2016 at 10:39:04PM +0900, Yuya Nishihara wrote:
> On Wed, 30 Mar 2016 03:29:53 -0500, timeless wrote:
> > # HG changeset patch
> > # User timeless <timeless@mozdev.org>
> > # Date 1459319268 0
> > #      Wed Mar 30 06:27:48 2016 +0000
> > # Node ID 3ac7919ccd75ec18862379804a4227f652f346c3
> > # Parent  bc149b7f13ad842e7945704a6b27467a1481dab4
> > run-tests: use strings for env object
> >
> > diff --git a/tests/run-tests.py b/tests/run-tests.py
> > --- a/tests/run-tests.py
> > +++ b/tests/run-tests.py
> > @@ -804,14 +804,16 @@
> >              offset = '' if i == 0 else '%s' % i
> >              env["HGPORT%s" % offset] = '%s' % (self._startport + i)
> >          env = os.environ.copy()
> > -        env['TESTTMP'] = self._testtmp
> > -        env['HOME'] = self._testtmp
> > +        testtmp = self._testtmp.decode('utf-8')
> > +        threadtmp = self._threadtmp.decode('utf-8')
> > +        env['TESTTMP'] = testtmp
> > +        env['HOME'] = testtmp
> >          # This number should match portneeded in _getport
> >          for port in xrange(3):
> >              # This list should be parallel to _portmap in _getreplacements
> >              defineport(port)
> > -        env["HGRCPATH"] = os.path.join(self._threadtmp, b'.hgrc')
> > -        env["DAEMON_PIDS"] = os.path.join(self._threadtmp, b'daemon.pids')
> > +        env["HGRCPATH"] = os.path.join(threadtmp, '.hgrc')
> > +        env["DAEMON_PIDS"] = os.path.join(threadtmp, 'daemon.pids')
>
> This seems going to the opposite direction. We generally avoid encoding
> problems by not using unicode objects.
>
> Perhaps it should use os.environb on Python 3.

Likely. That was the approach I took last time I hack-and-slashed my
way to some progress (but those patches were unfit for human
consumption.)

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
timeless - May 5, 2016, 11:09 a.m.
I just spent some time investigating this.

The answer is: NO. WE SHOULD NOT USE environb.

That's an awful red herring.

os.environb

Bytes version of environ: a mapping object representing the
environment as byte strings. environ and environb are synchronized
(modify environb updates environ, and vice versa).

environb is only available if supports_bytes_environ is True.

New in version 3.2.

https://docs.python.org/3/library/os.html#os.environb

os.supports_bytes_environ

True if the native OS type of the environment is bytes (eg. False on Windows).

New in version 3.2.

https://docs.python.org/3/library/os.html#os.supports_bytes_environ

Trying to use environb would break python3 on windows because we'd
have an os.environ object instead of an os.environb object, and we'd
be trying to stash bytes into it.

I'm going to reassert that what I'm doing is correct. And I'd like to
get a version of it in so that I can unclutter my repository. I have a
whole slew of things that are blocked by this. Especially test.env
work.

I will send a patch to *REMOVE* all use of environb and an item for
check code to reject it.

On Thu, Mar 31, 2016 at 9:39 AM, Yuya Nishihara <yuya@tcha.org> wrote:
> On Wed, 30 Mar 2016 03:29:53 -0500, timeless wrote:
>> # HG changeset patch
>> # User timeless <timeless@mozdev.org>
>> # Date 1459319268 0
>> #      Wed Mar 30 06:27:48 2016 +0000
>> # Node ID 3ac7919ccd75ec18862379804a4227f652f346c3
>> # Parent  bc149b7f13ad842e7945704a6b27467a1481dab4
>> run-tests: use strings for env object
>>
>> diff --git a/tests/run-tests.py b/tests/run-tests.py
>> --- a/tests/run-tests.py
>> +++ b/tests/run-tests.py
>> @@ -804,14 +804,16 @@
>>              offset = '' if i == 0 else '%s' % i
>>              env["HGPORT%s" % offset] = '%s' % (self._startport + i)
>>          env = os.environ.copy()
>> -        env['TESTTMP'] = self._testtmp
>> -        env['HOME'] = self._testtmp
>> +        testtmp = self._testtmp.decode('utf-8')
>> +        threadtmp = self._threadtmp.decode('utf-8')
>> +        env['TESTTMP'] = testtmp
>> +        env['HOME'] = testtmp
>>          # This number should match portneeded in _getport
>>          for port in xrange(3):
>>              # This list should be parallel to _portmap in _getreplacements
>>              defineport(port)
>> -        env["HGRCPATH"] = os.path.join(self._threadtmp, b'.hgrc')
>> -        env["DAEMON_PIDS"] = os.path.join(self._threadtmp, b'daemon.pids')
>> +        env["HGRCPATH"] = os.path.join(threadtmp, '.hgrc')
>> +        env["DAEMON_PIDS"] = os.path.join(threadtmp, 'daemon.pids')
>
> This seems going to the opposite direction. We generally avoid encoding
> problems by not using unicode objects.
>
> Perhaps it should use os.environb on Python 3.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Pierre-Yves David - May 5, 2016, 12:17 p.m.
On 05/05/2016 01:09 PM, timeless wrote:
> I just spent some time investigating this.
>
> The answer is: NO. WE SHOULD NOT USE environb.
>
> That's an awful red herring.
>
> os.environb
>
> Bytes version of environ: a mapping object representing the
> environment as byte strings. environ and environb are synchronized
> (modify environb updates environ, and vice versa).
>
> environb is only available if supports_bytes_environ is True.
>
> New in version 3.2.
>
> https://docs.python.org/3/library/os.html#os.environb
>
> os.supports_bytes_environ
>
> True if the native OS type of the environment is bytes (eg. False on Windows).
>
> New in version 3.2.
>
> https://docs.python.org/3/library/os.html#os.supports_bytes_environ
>
> Trying to use environb would break python3 on windows because we'd
> have an os.environ object instead of an os.environb object, and we'd
> be trying to stash bytes into it.
>
> I'm going to reassert that what I'm doing is correct. And I'd like to
> get a version of it in so that I can unclutter my repository. I have a
> whole slew of things that are blocked by this. Especially test.env
> work.
>
> I will send a patch to *REMOVE* all use of environb and an item for
> check code to reject it.

Could we have a version that use environb implementing a thin layer in 
the window case to provide an environb like object in this case?
timeless - May 5, 2016, 12:48 p.m.
It's possible. I really don't want to do it.
This whole thing is a mess. If someone else wants to do that and save
me from another 24 hours of encoding hell, i'd be appreciative.

This is probably the fourth or fifth night I've lost trying to fight encodings.

On Thu, May 5, 2016 at 8:17 AM, Pierre-Yves David
<pierre-yves.david@ens-lyon.org> wrote:
>
>
> On 05/05/2016 01:09 PM, timeless wrote:
>>
>> I just spent some time investigating this.
>>
>> The answer is: NO. WE SHOULD NOT USE environb.
>>
>> That's an awful red herring.
>>
>> os.environb
>>
>> Bytes version of environ: a mapping object representing the
>> environment as byte strings. environ and environb are synchronized
>> (modify environb updates environ, and vice versa).
>>
>> environb is only available if supports_bytes_environ is True.
>>
>> New in version 3.2.
>>
>> https://docs.python.org/3/library/os.html#os.environb
>>
>> os.supports_bytes_environ
>>
>> True if the native OS type of the environment is bytes (eg. False on
>> Windows).
>>
>> New in version 3.2.
>>
>> https://docs.python.org/3/library/os.html#os.supports_bytes_environ
>>
>> Trying to use environb would break python3 on windows because we'd
>> have an os.environ object instead of an os.environb object, and we'd
>> be trying to stash bytes into it.
>>
>> I'm going to reassert that what I'm doing is correct. And I'd like to
>> get a version of it in so that I can unclutter my repository. I have a
>> whole slew of things that are blocked by this. Especially test.env
>> work.
>>
>> I will send a patch to *REMOVE* all use of environb and an item for
>> check code to reject it.
>
>
> Could we have a version that use environb implementing a thin layer in the
> window case to provide an environb like object in this case?
>
> --
> Pierre-Yves David
Yuya Nishihara - May 6, 2016, 1:15 a.m.
On Thu, 5 May 2016 08:48:00 -0400, timeless wrote:
> It's possible. I really don't want to do it.
> This whole thing is a mess. If someone else wants to do that and save
> me from another 24 hours of encoding hell, i'd be appreciative.
> 
> This is probably the fourth or fifth night I've lost trying to fight encodings.
> 
> On Thu, May 5, 2016 at 8:17 AM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
> >
> >
> > On 05/05/2016 01:09 PM, timeless wrote:  
> >>
> >> I just spent some time investigating this.
> >>
> >> The answer is: NO. WE SHOULD NOT USE environb.
> >>
> >> That's an awful red herring.
> >>
> >> os.environb
> >>
> >> Bytes version of environ: a mapping object representing the
> >> environment as byte strings. environ and environb are synchronized
> >> (modify environb updates environ, and vice versa).
> >>
> >> environb is only available if supports_bytes_environ is True.
> >>
> >> New in version 3.2.
> >>
> >> https://docs.python.org/3/library/os.html#os.environb
> >>
> >> os.supports_bytes_environ
> >>
> >> True if the native OS type of the environment is bytes (eg. False on
> >> Windows).
> >>
> >> New in version 3.2.
> >>
> >> https://docs.python.org/3/library/os.html#os.supports_bytes_environ
> >>
> >> Trying to use environb would break python3 on windows because we'd
> >> have an os.environ object instead of an os.environb object, and we'd
> >> be trying to stash bytes into it.
> >>
> >> I'm going to reassert that what I'm doing is correct. And I'd like to
> >> get a version of it in so that I can unclutter my repository. I have a
> >> whole slew of things that are blocked by this. Especially test.env
> >> work.
> >>
> >> I will send a patch to *REMOVE* all use of environb and an item for
> >> check code to reject it.  
> >
> > Could we have a version that use environb implementing a thin layer in the
> > window case to provide an environb like object in this case?

Writing a thin environb would be easy if we don't have to update the real
environment variables in flight. Is it difficult to make run-tests.py use
subprocess.Popen() and pass env explicitly?

Anyway we'll need environb and Popen() wrappers to make hg work on Python3
on Windows.
timeless - May 6, 2016, 11:44 a.m.
We already have a Popen4, and generally pass environment objects.
On May 5, 2016 9:16 PM, "Yuya Nishihara" <yuya@tcha.org> wrote:

> On Thu, 5 May 2016 08:48:00 -0400, timeless wrote:
> > It's possible. I really don't want to do it.
> > This whole thing is a mess. If someone else wants to do that and save
> > me from another 24 hours of encoding hell, i'd be appreciative.
> >
> > This is probably the fourth or fifth night I've lost trying to fight
> encodings.
> >
> > On Thu, May 5, 2016 at 8:17 AM, Pierre-Yves David
> > <pierre-yves.david@ens-lyon.org> wrote:
> > >
> > >
> > > On 05/05/2016 01:09 PM, timeless wrote:
> > >>
> > >> I just spent some time investigating this.
> > >>
> > >> The answer is: NO. WE SHOULD NOT USE environb.
> > >>
> > >> That's an awful red herring.
> > >>
> > >> os.environb
> > >>
> > >> Bytes version of environ: a mapping object representing the
> > >> environment as byte strings. environ and environb are synchronized
> > >> (modify environb updates environ, and vice versa).
> > >>
> > >> environb is only available if supports_bytes_environ is True.
> > >>
> > >> New in version 3.2.
> > >>
> > >> https://docs.python.org/3/library/os.html#os.environb
> > >>
> > >> os.supports_bytes_environ
> > >>
> > >> True if the native OS type of the environment is bytes (eg. False on
> > >> Windows).
> > >>
> > >> New in version 3.2.
> > >>
> > >> https://docs.python.org/3/library/os.html#os.supports_bytes_environ
> > >>
> > >> Trying to use environb would break python3 on windows because we'd
> > >> have an os.environ object instead of an os.environb object, and we'd
> > >> be trying to stash bytes into it.
> > >>
> > >> I'm going to reassert that what I'm doing is correct. And I'd like to
> > >> get a version of it in so that I can unclutter my repository. I have a
> > >> whole slew of things that are blocked by this. Especially test.env
> > >> work.
> > >>
> > >> I will send a patch to *REMOVE* all use of environb and an item for
> > >> check code to reject it.
> > >
> > > Could we have a version that use environb implementing a thin layer in
> the
> > > window case to provide an environb like object in this case?
>
> Writing a thin environb would be easy if we don't have to update the real
> environment variables in flight. Is it difficult to make run-tests.py use
> subprocess.Popen() and pass env explicitly?
>
> Anyway we'll need environb and Popen() wrappers to make hg work on Python3
> on Windows.
>
Yuya Nishihara - May 6, 2016, 1 p.m.
On Fri, 6 May 2016 07:44:32 -0400, timeless wrote:
> We already have a Popen4, and generally pass environment objects.

Yeah, if it works with bytes environ, we don't have to mix unicodes and bytes,
which is horrible without strict type checking.
Gregory Szorc - May 8, 2016, 7:01 p.m.
I haven't been following this thread too closely. But in case it helps, the
"latin1" encoding can be used to convert between str/unicode without data
loss. I've used it to e.g. decode random byte sequences to unicode types to
appease APIs insisting on being passed unicode instances. It's definitely a
very hacky approach. And obviously it really only works if you start with
str/bytes up front: if something has already assumed an encoding, you are
out of luck :/

On Thu, May 5, 2016 at 5:48 AM, timeless <timeless@gmail.com> wrote:

> It's possible. I really don't want to do it.
> This whole thing is a mess. If someone else wants to do that and save
> me from another 24 hours of encoding hell, i'd be appreciative.
>
> This is probably the fourth or fifth night I've lost trying to fight
> encodings.
>
> On Thu, May 5, 2016 at 8:17 AM, Pierre-Yves David
> <pierre-yves.david@ens-lyon.org> wrote:
> >
> >
> > On 05/05/2016 01:09 PM, timeless wrote:
> >>
> >> I just spent some time investigating this.
> >>
> >> The answer is: NO. WE SHOULD NOT USE environb.
> >>
> >> That's an awful red herring.
> >>
> >> os.environb
> >>
> >> Bytes version of environ: a mapping object representing the
> >> environment as byte strings. environ and environb are synchronized
> >> (modify environb updates environ, and vice versa).
> >>
> >> environb is only available if supports_bytes_environ is True.
> >>
> >> New in version 3.2.
> >>
> >> https://docs.python.org/3/library/os.html#os.environb
> >>
> >> os.supports_bytes_environ
> >>
> >> True if the native OS type of the environment is bytes (eg. False on
> >> Windows).
> >>
> >> New in version 3.2.
> >>
> >> https://docs.python.org/3/library/os.html#os.supports_bytes_environ
> >>
> >> Trying to use environb would break python3 on windows because we'd
> >> have an os.environ object instead of an os.environb object, and we'd
> >> be trying to stash bytes into it.
> >>
> >> I'm going to reassert that what I'm doing is correct. And I'd like to
> >> get a version of it in so that I can unclutter my repository. I have a
> >> whole slew of things that are blocked by this. Especially test.env
> >> work.
> >>
> >> I will send a patch to *REMOVE* all use of environb and an item for
> >> check code to reject it.
> >
> >
> > Could we have a version that use environb implementing a thin layer in
> the
> > window case to provide an environb like object in this case?
> >
> > --
> > Pierre-Yves David
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>

Patch

diff --git a/tests/run-tests.py b/tests/run-tests.py
--- a/tests/run-tests.py
+++ b/tests/run-tests.py
@@ -804,14 +804,16 @@ 
             offset = '' if i == 0 else '%s' % i
             env["HGPORT%s" % offset] = '%s' % (self._startport + i)
         env = os.environ.copy()
-        env['TESTTMP'] = self._testtmp
-        env['HOME'] = self._testtmp
+        testtmp = self._testtmp.decode('utf-8')
+        threadtmp = self._threadtmp.decode('utf-8')
+        env['TESTTMP'] = testtmp
+        env['HOME'] = testtmp
         # This number should match portneeded in _getport
         for port in xrange(3):
             # This list should be parallel to _portmap in _getreplacements
             defineport(port)
-        env["HGRCPATH"] = os.path.join(self._threadtmp, b'.hgrc')
-        env["DAEMON_PIDS"] = os.path.join(self._threadtmp, b'daemon.pids')
+        env["HGRCPATH"] = os.path.join(threadtmp, '.hgrc')
+        env["DAEMON_PIDS"] = os.path.join(threadtmp, 'daemon.pids')
         env["HGEDITOR"] = ('"' + sys.executable + '"'
                            + ' -c "import sys; sys.exit(0)"')
         env["HGMERGE"] = "internal:merge"