Patchwork [1,of,4] mail: let headencode() return a native string

login
register
mail settings
Submitter Denis Laxalde
Date Nov. 9, 2019, 4:06 p.m.
Message ID <1c7a66bf3315f6903642.1573315600@marimba>
Download mbox | patch
Permalink /patch/43098/
State Accepted
Headers show

Comments

Denis Laxalde - Nov. 9, 2019, 4:06 p.m.
# HG changeset patch
# User Denis Laxalde <denis@laxalde.org>
# Date 1573299914 -3600
#      Sat Nov 09 12:45:14 2019 +0100
# Node ID 1c7a66bf3315f69036420e2b07b584abacfef881
# Parent  a78a65c33b5aa84cc9c3ae5f7dad636d371f149c
mail: let headencode() return a native string

This is to avoid conversion to/from str on py3.
Yuya Nishihara - Nov. 10, 2019, 8:19 a.m.
On Sat, 09 Nov 2019 17:06:40 +0100, Denis Laxalde wrote:
> # HG changeset patch
> # User Denis Laxalde <denis@laxalde.org>
> # Date 1573299914 -3600
> #      Sat Nov 09 12:45:14 2019 +0100
> # Node ID 1c7a66bf3315f69036420e2b07b584abacfef881
> # Parent  a78a65c33b5aa84cc9c3ae5f7dad636d371f149c
> mail: let headencode() return a native string

Queued the first two, thanks.

> diff --git a/mercurial/mail.py b/mercurial/mail.py
> --- a/mercurial/mail.py
> +++ b/mercurial/mail.py
> @@ -365,13 +365,13 @@ def headencode(ui, s, charsets=None, dis
>      if not display:
>          # split into words?
>          s, cs = _encode(ui, s, charsets)
> -        return encoding.strtolocal(email.header.Header(s, cs).encode())
> -    return s
> +        return email.header.Header(s, cs).encode()
> +    return encoding.strfromlocal(s)

I think we should add some type comments to help readers to see which string
type is returned/accepted/used. Mixing bytes and unicode is a nightmare.

I'm not an expert, but maybe we can annotate both functions and variables
with "# type: <type expr>" comment?

https://google.github.io/pytype/user_guide.html#variable-annotations

Augie, do you have any idea?
Augie Fackler - Nov. 11, 2019, 3:55 p.m.
> On Nov 10, 2019, at 03:19, Yuya Nishihara <yuya@tcha.org> wrote:
> 
> On Sat, 09 Nov 2019 17:06:40 +0100, Denis Laxalde wrote:
>> # HG changeset patch
>> # User Denis Laxalde <denis@laxalde.org>
>> # Date 1573299914 -3600
>> #      Sat Nov 09 12:45:14 2019 +0100
>> # Node ID 1c7a66bf3315f69036420e2b07b584abacfef881
>> # Parent  a78a65c33b5aa84cc9c3ae5f7dad636d371f149c
>> mail: let headencode() return a native string
> 
> Queued the first two, thanks.
> 
>> diff --git a/mercurial/mail.py b/mercurial/mail.py
>> --- a/mercurial/mail.py
>> +++ b/mercurial/mail.py
>> @@ -365,13 +365,13 @@ def headencode(ui, s, charsets=None, dis
>>     if not display:
>>         # split into words?
>>         s, cs = _encode(ui, s, charsets)
>> -        return encoding.strtolocal(email.header.Header(s, cs).encode())
>> -    return s
>> +        return email.header.Header(s, cs).encode()
>> +    return encoding.strfromlocal(s)
> 
> I think we should add some type comments to help readers to see which string
> type is returned/accepted/used. Mixing bytes and unicode is a nightmare.
> 
> I'm not an expert, but maybe we can annotate both functions and variables
> with "# type: <type expr>" comment?
> 
> https://google.github.io/pytype/user_guide.html#variable-annotations
> 
> Augie, do you have any idea?

Yes, you should be able to annotate individual variables like that. I've never actually annotated a variable, as typically I get plenty out of annotating functions, but it's definitely worth a shot where it's messy.

Patch

diff --git a/hgext/notify.py b/hgext/notify.py
--- a/hgext/notify.py
+++ b/hgext/notify.py
@@ -421,8 +421,8 @@  class notifier(object):
         maxsubject = int(self.ui.config(b'notify', b'maxsubject'))
         if maxsubject:
             subject = stringutil.ellipsis(subject, maxsubject)
-        msg[r'Subject'] = encoding.strfromlocal(
-            mail.headencode(self.ui, subject, self.charsets, self.test)
+        msg[r'Subject'] = mail.headencode(
+            self.ui, subject, self.charsets, self.test
         )
 
         # try to make message have proper sender
diff --git a/mercurial/mail.py b/mercurial/mail.py
--- a/mercurial/mail.py
+++ b/mercurial/mail.py
@@ -365,13 +365,13 @@  def headencode(ui, s, charsets=None, dis
     if not display:
         # split into words?
         s, cs = _encode(ui, s, charsets)
-        return encoding.strtolocal(email.header.Header(s, cs).encode())
-    return s
+        return email.header.Header(s, cs).encode()
+    return encoding.strfromlocal(s)
 
 
 def _addressencode(ui, name, addr, charsets=None):
     assert isinstance(addr, bytes)
-    name = encoding.strfromlocal(headencode(ui, name, charsets))
+    name = headencode(ui, name, charsets)
     try:
         acc, dom = addr.split(b'@')
         acc.decode('ascii')