Patchwork D7020: pycompat: implement a shlexquote that properly handles bytes

login
register
mail settings
Submitter phabricator
Date Oct. 8, 2019, 4:43 a.m.
Message ID <differential-rev-PHID-DREV-74xv7ttpvdbhscxid4yl-req@mercurial-scm.org>
Download mbox | patch
Permalink /patch/42092/
State Superseded
Headers show

Comments

phabricator - Oct. 8, 2019, 4:43 a.m.
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  2cc453284d5 <https://phab.mercurial-scm.org/rHG2cc453284d5c473dc722c1cca1d4ea403fc5ebd2> introduced this function call to for mail.py. This broke Python
  3 because shlex.quote() expects str, not bytes. In order to unbust it,
  we need to normalize bytes to str then go back to bytes to appease the
  caller.
  
  This is a bit ugly. But I don't see any other obvious solution.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D7020

AFFECTED FILES
  mercurial/pycompat.py

CHANGE DETAILS




To: indygreg, #hg-reviewers
Cc: mercurial-devel
phabricator - Oct. 8, 2019, 10:33 a.m.
marmoute added inline comments.

INLINE COMMENTS

> pycompat.py:343
> +        s = shlex.quote(s)
> +        return s.encode('latin-1')
> +

It might be a silly questions, but why are we going through `latin-1` instead of `utf-8` or any other crazy encoding ?

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7020/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7020

To: indygreg, #hg-reviewers
Cc: marmoute, mercurial-devel
phabricator - Oct. 8, 2019, 3:05 p.m.
indygreg added inline comments.

INLINE COMMENTS

> marmoute wrote in pycompat.py:343
> It might be a silly questions, but why are we going through `latin-1` instead of `utf-8` or any other crazy encoding ?

Because the underlying string is likely raw bytes and `latin-1` will preserve those code points. i.e. we don't care what the encoding is: just that we can pass `str` to `shlex.quote`.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7020/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7020

To: indygreg, #hg-reviewers
Cc: marmoute, mercurial-devel
Yuya Nishihara - Oct. 8, 2019, 3:40 p.m.
See https://www.mercurial-scm.org/pipermail/mercurial-devel/2019-October/134352.html
phabricator - Oct. 8, 2019, 3:42 p.m.
yuja added a comment.


  See https://www.mercurial-scm.org/pipermail/mercurial-devel/2019-October/134352.html

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7020/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7020

To: indygreg, #hg-reviewers
Cc: yuja, marmoute, mercurial-devel
phabricator - Oct. 10, 2019, 3:15 a.m.
indygreg added a comment.
indygreg abandoned this revision.


  Yuya's patch fixed the test failure and is a better solution.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7020/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7020

To: indygreg, #hg-reviewers
Cc: yuja, marmoute, mercurial-devel

Patch

diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py
--- a/mercurial/pycompat.py
+++ b/mercurial/pycompat.py
@@ -337,7 +337,11 @@ 
         ret = shlex.split(s.decode('latin-1'), comments, posix)
         return [a.encode('latin-1') for a in ret]
 
-    shlexquote = shlex.quote
+    def shlexquote(s):
+        s = s.decode('latin-1')
+        s = shlex.quote(s)
+        return s.encode('latin-1')
+
     iteritems = lambda x: x.items()
     itervalues = lambda x: x.values()