Patchwork [1,of,7] py3: make util.datapath a bytes variable

Submitter Pulkit Goyal
Date Nov. 2, 2016, 10:23 p.m.
Message ID <e541b0e5839988f63446.1478125386@pulkit-goyal>
Permalink /patch/17298/
State Accepted

Comments

Pulkit Goyal - Nov. 2, 2016, 10:23 p.m.
# HG changeset patch
# User Pulkit Goyal <7895pulkit@gmail.com>
# Date 1478113113 -19800
#      Thu Nov 03 00:28:33 2016 +0530
# Node ID e541b0e5839988f63446c88509db68772a55775b
# Parent  bb586966818986131068280bfd95fc96fbdaaa0d
py3: make util.datapath a bytes variable

Fixing things whenever a warning or error comes up was a good approach, but it
won't work in the long run: we would end up with all the errors fixed but no
idea which variables we set to bytes and which to unicode, or which functions
return bytes and which return unicode. We have to make sure that when a
variable is changed, its effects throughout the repository are taken care of.

In this patch we make util.datapath a bytes variable.

The line containing i18n.setdatapath is skipped for a reason.
i18n.setdatapath looks something like this.

def setdatapath(datapath):
    localedir = os.path.join(datapath, pycompat.sysstr('locale'))
    t = gettextmod.translation('hg', localedir, _languages, fallback=True)
    ....

Here we can't pass bytes to gettextmod.translation() when _languages is
None in Python 3.5. We could pass 'hg' as bytes, because the code that
raises TypeError deals only with the localedir variable. So localedir needs to
be unicode to keep gettextmod.translation() happy. If we passed the bytes
version of datapath, we would have to convert localedir back to unicode.
So the i18n.setdatapath call is left untouched and runs before util.datapath
is converted to bytes for use in the rest of the code.
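
As a rough illustration of the constraint described above (not code from the
patch), the sketch below shows that on Python 3 os.path.join() refuses to mix
bytes and str components. The helper name join_locale and the example path are
hypothetical; a plain 'locale' literal stands in for pycompat.sysstr('locale').

```python
import os


def join_locale(datapath):
    # Hypothetical helper mirroring the first line of setdatapath();
    # 'locale' is a native str here, standing in for pycompat.sysstr('locale').
    return os.path.join(datapath, 'locale')


# A str datapath joins cleanly with the str 'locale' component.
str_dir = join_locale(os.path.join('usr', 'share', 'mercurial'))

# A bytes datapath cannot be joined with a str component on Python 3.
try:
    join_locale(b'usr/share/mercurial')
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

This is why a bytes datapath would need converting back to unicode before it
could be used to build localedir.
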
Yuya Nishihara - Nov. 4, 2016, 3:32 a.m.
On Thu, 03 Nov 2016 03:53:06 +0530, Pulkit Goyal wrote:
> # HG changeset patch
> # User Pulkit Goyal <7895pulkit@gmail.com>
> # Date 1478113113 -19800
> #      Thu Nov 03 00:28:33 2016 +0530
> # Node ID e541b0e5839988f63446c88509db68772a55775b
> # Parent  bb586966818986131068280bfd95fc96fbdaaa0d
> py3: make util.datapath a bytes variable
> 
> Fixing things whenever a warning or error comes up was a good approach, but it
> won't work in the long run: we would end up with all the errors fixed but no
> idea which variables we set to bytes and which to unicode, or which functions
> return bytes and which return unicode. We have to make sure that when a
> variable is changed, its effects throughout the repository are taken care of.
> 
> In this patch we make util.datapath a bytes variable.
> 
> The line containing i18n.setdatapath is skipped for a reason.
> i18n.setdatapath looks something like this.
> 
> def setdatapath(datapath):
>     localedir = os.path.join(datapath, pycompat.sysstr('locale'))
>     t = gettextmod.translation('hg', localedir, _languages, fallback=True)
>     ....
> 
> Here we can't pass bytes to gettextmod.translation() when _languages is
> None in Python 3.5. We could pass 'hg' as bytes, because the code that
> raises TypeError deals only with the localedir variable. So localedir needs to
> be unicode to keep gettextmod.translation() happy. If we passed the bytes
> version of datapath, we would have to convert localedir back to unicode.
> So the i18n.setdatapath call is left untouched and runs before util.datapath
> is converted to bytes for use in the rest of the code.

i18n.setdatapath() could decode bytes to unicode with fsdecode(), which would
keep the API consistent.
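
A minimal sketch of that suggestion, assuming the standard library's
os.fsdecode() is an acceptable stand-in for whatever fsdecode helper Mercurial
would use. The languages parameter and the return value are additions for
illustration only; the real function uses the module-level _languages.

```python
import gettext
import os


def setdatapath(datapath, languages=None):
    # Accept bytes or str: decode bytes with the filesystem encoding so
    # callers can pass the same bytes datapath used elsewhere.
    if isinstance(datapath, bytes):
        datapath = os.fsdecode(datapath)
    localedir = os.path.join(datapath, 'locale')
    # fallback=True yields a NullTranslations object if no catalog is found.
    return gettext.translation('hg', localedir, languages, fallback=True)
```

With this shape, the conversion of util.datapath to bytes could happen before
the setdatapath() call rather than after it.
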

> diff -r bb5869668189 -r e541b0e58399 mercurial/util.py
> --- a/mercurial/util.py	Tue Nov 01 15:40:21 2016 -0400
> +++ b/mercurial/util.py	Thu Nov 03 00:28:33 2016 +0530
> @@ -940,6 +940,9 @@
>  
>  i18n.setdatapath(datapath)
>  
> +if not isinstance(datapath, bytes):
> +    datapath = datapath.encode('utf-8')

Perhaps we can use pycompat.fsencode(), which would be what Python 2 does
on Windows.
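
For comparison with the hard-coded 'utf-8' above, here is a small sketch using
os.fsencode() from the standard library. It assumes pycompat.fsencode() behaves
like os.fsencode() on Python 3, which is not confirmed in this thread; the
helper name tobytespath and the example path are hypothetical.

```python
import os


def tobytespath(path):
    # Leave bytes untouched; encode str with the filesystem encoding and
    # error handler instead of assuming UTF-8.
    if isinstance(path, bytes):
        return path
    return os.fsencode(path)


path = os.path.join('usr', 'share', 'mercurial')
encoded = tobytespath(path)

# os.fsdecode() is the inverse of os.fsencode(), so the path round-trips.
roundtrip = os.fsdecode(encoded)
```
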

Patch

diff -r bb5869668189 -r e541b0e58399 mercurial/util.py
--- a/mercurial/util.py	Tue Nov 01 15:40:21 2016 -0400
+++ b/mercurial/util.py	Thu Nov 03 00:28:33 2016 +0530
@@ -940,6 +940,9 @@ 
 
 i18n.setdatapath(datapath)
 
+if not isinstance(datapath, bytes):
+    datapath = datapath.encode('utf-8')
+
 _hgexecutable = None
 
 def hgexecutable():