Patchwork D6044: phabricator: convert conduit response JSON unicode to bytes inside callconduit

login
register
mail settings
Submitter phabricator
Date March 2, 2019, 6:51 p.m.
Message ID <differential-rev-PHID-DREV-fjfmism4fkarnjzscbxj-req@phab.mercurial-scm.org>
Download mbox | patch
Permalink /patch/38995/
State Superseded
Headers show

Comments

phabricator - March 2, 2019, 6:51 p.m.
Kwan created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Previously the byte conversion was happening piecemeal in callers, and in the
  case of createdifferentialrevision not at all, leading to UnicodeEncodeErrors
  when trying to phabsend a commit with a description containing characters not
  representable in ascii. (issue6040)
  
  Remove all the scattered encoding.unitolocal calls and perform it once, inside
  callconduit, on the entire response dict recursively before returning it, in
  keeping with the strategy of converting at the earliest opportunity.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6044

AFFECTED FILES
  hgext/phabricator.py

CHANGE DETAILS




To: Kwan, #hg-reviewers
Cc: mercurial-devel
phabricator - March 2, 2019, 6:55 p.m.
Kwan added a comment.


  Apologies for the lack of a test, but I can't seem to write one that works.  Either the test harness isn't capable of tests with unicode in them, or I don't know how to represent it in the test files in such a way that works.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6044

To: Kwan, #hg-reviewers
Cc: mercurial-devel
phabricator - March 2, 2019, 11:08 p.m.
Kwan added a comment.


  Okay, making progress on the test now, just need to figure out how to update the vcr files cleanly.
  Slightly complicated by the fact that there was a small error in the original creation of the phabsend-update-alpha-create-beta.json recording.  It must have been made after the updated diff for alpha was already on here, because it hasn't recorded the alpha update, since the query reported the diff to already be the updated version.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6044

To: Kwan, #hg-reviewers
Cc: mercurial-devel
Yuya Nishihara - March 3, 2019, 9:34 a.m.
> -    parsed = json.loads(body)
> +    parsed = pycompat.rapply(lambda x: encoding.unitolocal(x)
> +                             if isinstance(x, unicode) else x, json.loads(body))

s/unicode/pycompat.unicode/

Perhaps some of `r''` would have to be changed to `b''` since dict keys
are now byte strings. See the wiki page for our Py3 hacks.

https://www.mercurial-scm.org/wiki/Python3
phabricator - March 3, 2019, 9:47 a.m.
yuja added a comment.


  > - parsed = json.loads(body) +    parsed = pycompat.rapply(lambda x: encoding.unitolocal(x) +                             if isinstance(x, unicode) else x, json.loads(body))
  
  s/unicode/pycompat.unicode/
  
  Perhaps some of `r''` would have to be changed to `b''` since dict keys
  are now byte strings. See the wiki page for our Py3 hacks.
  
  https://www.mercurial-scm.org/wiki/Python3

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6044

To: Kwan, #hg-reviewers
Cc: yuja, mercurial-devel
phabricator - March 4, 2019, 5:12 p.m.
Kwan added a comment.


  In https://phab.mercurial-scm.org/D6044#88262, @yuja wrote:
  
  > > - parsed = json.loads(body) +    parsed = pycompat.rapply(lambda x: encoding.unitolocal(x) +                             if isinstance(x, unicode) else x, json.loads(body))
  >
  > s/unicode/pycompat.unicode/
  
  
  Thanks, done.
  
  > Perhaps some of `r''` would have to be changed to `b''` since dict keys
  >  are now byte strings. See the wiki page for our Py3 hacks.
  > 
  > https://www.mercurial-scm.org/wiki/Python3
  
  Ah, good point, thanks.  Would it be worth keeping the keys as r'' strings?  rapply can fairly easily be extended with an optional notkeys boolean that would allow doing so.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6044

To: Kwan, #hg-reviewers
Cc: yuja, mercurial-devel
Yuya Nishihara - March 5, 2019, 1:44 p.m.
>   > Perhaps some of `r''` would have to be changed to `b''` since dict keys
>   >  are now byte strings. See the wiki page for our Py3 hacks.
>   > 
>   > https://www.mercurial-scm.org/wiki/Python3
>   
>   Ah, good point, thanks.  Would it be worth keeping the keys as r'' strings?  rapply can fairly easily be extended with an optional notkeys boolean that would allow doing so.

I prefer converting everything to bytes so we won't have to handle both
unicode and byte strings.
phabricator - March 5, 2019, 1:45 p.m.
yuja added a comment.


  >   > Perhaps some of `r''` would have to be changed to `b''` since dict keys
  >   >  are now byte strings. See the wiki page for our Py3 hacks.
  >   > 
  >   > https://www.mercurial-scm.org/wiki/Python3
  >   
  >   Ah, good point, thanks.  Would it be worth keeping the keys as r'' strings?  rapply can fairly easily be extended with an optional notkeys boolean that would allow doing so.
  
  I prefer converting everything to bytes so we won't have to handle both
  unicode and byte strings.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6044

To: Kwan, #hg-reviewers
Cc: yuja, mercurial-devel

Patch

diff --git a/hgext/phabricator.py b/hgext/phabricator.py
--- a/hgext/phabricator.py
+++ b/hgext/phabricator.py
@@ -60,6 +60,7 @@ 
     parser,
     patch,
     phases,
+    pycompat,
     registrar,
     scmutil,
     smartset,
@@ -219,7 +220,8 @@ 
         with contextlib.closing(urlopener.open(request)) as rsp:
             body = rsp.read()
     repo.ui.debug(b'Conduit Response: %s\n' % body)
-    parsed = json.loads(body)
+    parsed = pycompat.rapply(lambda x: encoding.unitolocal(x)
+                             if isinstance(x, unicode) else x, json.loads(body))
     if parsed.get(r'error_code'):
         msg = (_(b'Conduit Error (%s): %s')
                % (parsed[r'error_code'], parsed[r'error_info']))
@@ -251,7 +253,7 @@ 
                         {b'constraints': {b'callsigns': [callsign]}})
     if len(query[r'data']) == 0:
         return None
-    repophid = encoding.strtolocal(query[r'data'][0][r'phid'])
+    repophid = query[r'data'][0][r'phid']
     repo.ui.setconfig(b'phabricator', b'repophid', repophid)
     return repophid
 
@@ -305,8 +307,8 @@ 
         drevs = [drev for force, precs, drev in toconfirm.values()]
         alldiffs = callconduit(unfi, b'differential.querydiffs',
                                {b'revisionIDs': drevs})
-        getnode = lambda d: bin(encoding.unitolocal(
-            getdiffmeta(d).get(r'node', b''))) or None
+        getnode = lambda d: bin(
+            getdiffmeta(d).get(r'node', b'')) or None
         for newnode, (force, precset, drev) in toconfirm.items():
             diffs = [d for d in alldiffs.values()
                      if int(d[r'revisionID']) == drev]
@@ -582,7 +584,6 @@ 
                 drevid = drevids[i]
                 drev = [d for d in drevs if int(d[r'id']) == drevid][0]
                 newdesc = getdescfromdrev(drev)
-                newdesc = encoding.unitolocal(newdesc)
                 # Make sure commit message contain "Differential Revision"
                 if old.description() != newdesc:
                     if old.phase() == phases.public:
@@ -923,7 +924,7 @@ 
                 header += b'# %s %s\n' % (_metanamemap[k], meta[k])
 
         content = b'%s%s\n%s' % (header, desc, body)
-        write(encoding.unitolocal(content))
+        write(content)
 
 @vcrcommand(b'phabread',
          [(b'', b'stack', False, _(b'read dependencies'))],