Patchwork [7,of,7] templatefilters: document json filter that requires UTF-8 bytes

login
register
mail settings
Submitter Yuya Nishihara
Date Feb. 23, 2016, 3:45 p.m.
Message ID <02072a6dba91478d5f12.1456242332@mimosa>
Download mbox | patch
Permalink /patch/13318/
State Rejected
Delegated to: Yuya Nishihara
Headers show

Comments

Yuya Nishihara - Feb. 23, 2016, 3:45 p.m.
# HG changeset patch
# User Yuya Nishihara <yuya@tcha.org>
# Date 1452939833 -32400
#      Sat Jan 16 19:23:53 2016 +0900
# Node ID 02072a6dba91478d5f12fd52113dce45f0b2a8c3
# Parent  484371c3f9c61fd2694aae90b1c650dec203c596
templatefilters: document json filter that requires UTF-8 bytes

I don't know why we haven't documented it, but "json" filter is essential
for web templates. It should be a public template filter.
Augie Fackler - Feb. 23, 2016, 7:35 p.m.
On Wed, Feb 24, 2016 at 12:45:32AM +0900, Yuya Nishihara wrote:
> # HG changeset patch
> # User Yuya Nishihara <yuya@tcha.org>
> # Date 1452939833 -32400
> #      Sat Jan 16 19:23:53 2016 +0900
> # Node ID 02072a6dba91478d5f12fd52113dce45f0b2a8c3
> # Parent  484371c3f9c61fd2694aae90b1c650dec203c596
> templatefilters: document json filter that requires UTF-8 bytes

I've queued patches 1-6, but I have a fuzzy memory that mpm had a
reason to not document json. Matt?

>
> I don't know why we haven't documented it, but "json" filter is essential
> for web templates. It should be a public template filter.
>
> diff --git a/mercurial/templatefilters.py b/mercurial/templatefilters.py
> --- a/mercurial/templatefilters.py
> +++ b/mercurial/templatefilters.py
> @@ -193,6 +193,8 @@ def indent(text, prefix):
>      return "".join(indenter())
>
>  def json(obj):
> +    """:json: Any object. Serializes the object to a JSON formatted text.
> +    Input text should be encoded in UTF-8."""
>      if obj is None or obj is False or obj is True:
>          return {None: 'null', False: 'false', True: 'true'}[obj]
>      elif isinstance(obj, int) or isinstance(obj, float):
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Pierre-Yves David - March 7, 2016, 3:11 p.m.
On 02/23/2016 08:35 PM, Augie Fackler wrote:
> On Wed, Feb 24, 2016 at 12:45:32AM +0900, Yuya Nishihara wrote:
>> # HG changeset patch
>> # User Yuya Nishihara <yuya@tcha.org>
>> # Date 1452939833 -32400
>> #      Sat Jan 16 19:23:53 2016 +0900
>> # Node ID 02072a6dba91478d5f12fd52113dce45f0b2a8c3
>> # Parent  484371c3f9c61fd2694aae90b1c650dec203c596
>> templatefilters: document json filter that requires UTF-8 bytes
>
> I've queued patches 1-6, but I have a fuzzy memory that mpm had a
> reason to not document json. Matt?

Gentle ping.
Pierre-Yves David - March 9, 2016, 5:03 p.m.
On 03/07/2016 03:11 PM, Pierre-Yves David wrote:
>
>
> On 02/23/2016 08:35 PM, Augie Fackler wrote:
>> On Wed, Feb 24, 2016 at 12:45:32AM +0900, Yuya Nishihara wrote:
>>> # HG changeset patch
>>> # User Yuya Nishihara <yuya@tcha.org>
>>> # Date 1452939833 -32400
>>> #      Sat Jan 16 19:23:53 2016 +0900
>>> # Node ID 02072a6dba91478d5f12fd52113dce45f0b2a8c3
>>> # Parent  484371c3f9c61fd2694aae90b1c650dec203c596
>>> templatefilters: document json filter that requires UTF-8 bytes
>>
>> I've queued patches 1-6, but I have a fuzzy memory that mpm had a
>> reason to not document json. Matt?
>
> Gentle ping.

According to timeless, we don't want to document this filter because it 
would freeze the json format, something we are not ready to. I'm 
dropping the patch from patchwork.

Cheers,
Yuya Nishihara - March 11, 2016, 1:35 p.m.
On Wed, 9 Mar 2016 17:03:30 +0000, Pierre-Yves David wrote:
> On 03/07/2016 03:11 PM, Pierre-Yves David wrote:
> > On 02/23/2016 08:35 PM, Augie Fackler wrote:
> >> On Wed, Feb 24, 2016 at 12:45:32AM +0900, Yuya Nishihara wrote:
> >>> # HG changeset patch
> >>> # User Yuya Nishihara <yuya@tcha.org>
> >>> # Date 1452939833 -32400
> >>> #      Sat Jan 16 19:23:53 2016 +0900
> >>> # Node ID 02072a6dba91478d5f12fd52113dce45f0b2a8c3
> >>> # Parent  484371c3f9c61fd2694aae90b1c650dec203c596
> >>> templatefilters: document json filter that requires UTF-8 bytes
> >>
> >> I've queued patches 1-6, but I have a fuzzy memory that mpm had a
> >> reason to not document json. Matt?
> >
> > Gentle ping.
> 
> According to timeless, we don't want to document this filter because it
> would freeze the json format, something we are not ready to. I'm
> dropping the patch from patchwork.

I don't think this is related to timeless' work. "|json" filter is just
a general-purpose JSON serializer. But I'm okay to go without a public
docstring now. Instead, I'll add a comment that says input text should be
UTF-8.
timeless - March 11, 2016, 2:39 p.m.
So, a reason not to document this filter is if we haven't decided to
formally freeze at our encoding.

I'm not sure if we have, and I don't know if we have enough tests for its
input/output. If we do, then, it's OK to freeze this (if we're confident in
the encoding).
On Mar 11, 2016 8:39 AM, "Yuya Nishihara" <yuya@tcha.org> wrote:

> On Wed, 9 Mar 2016 17:03:30 +0000, Pierre-Yves David wrote:
> > On 03/07/2016 03:11 PM, Pierre-Yves David wrote:
> > > On 02/23/2016 08:35 PM, Augie Fackler wrote:
> > >> On Wed, Feb 24, 2016 at 12:45:32AM +0900, Yuya Nishihara wrote:
> > >>> # HG changeset patch
> > >>> # User Yuya Nishihara <yuya@tcha.org>
> > >>> # Date 1452939833 -32400
> > >>> #      Sat Jan 16 19:23:53 2016 +0900
> > >>> # Node ID 02072a6dba91478d5f12fd52113dce45f0b2a8c3
> > >>> # Parent  484371c3f9c61fd2694aae90b1c650dec203c596
> > >>> templatefilters: document json filter that requires UTF-8 bytes
> > >>
> > >> I've queued patches 1-6, but I have a fuzzy memory that mpm had a
> > >> reason to not document json. Matt?
> > >
> > > Gentle ping.
> >
> > According to timeless, we don't want to document this filter because it
> > would freeze the json format, something we are not ready to. I'm
> > dropping the patch from patchwork.
>
> I don't think this is related to timeless' work. "|json" filter is just
> a general-purpose JSON serializer. But I'm okay to go without a public
> docstring now. Instead, I'll add a comment that says input text should be
> UTF-8.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
Yuya Nishihara - March 11, 2016, 3:28 p.m.
On Fri, 11 Mar 2016 09:39:22 -0500, timeless wrote:
> So, a reason not to document this filter is if we haven't decided to
> formally freeze at our encoding.

You mean "|json" filter might be changed to convert character encoding?

The current version does no encoding conversion. If input is UTF-8, output
is UTF-8. If input is non-UTF-8 bytes, output is those bytes encapsulated in
UTF-8 structure to comply JSON spec.

> I'm not sure if we have, and I don't know if we have enough tests for its
> input/output. If we do, then, it's OK to freeze this (if we're confident in
> the encoding).

Well, I don't think there would be a better way, but I'll happily wait the
freeze of the whole JSON formatting issues.

Thanks for the clarification.
timeless - March 11, 2016, 7:05 p.m.
Yuya Nishihara <yuya@tcha.org> wrote:
> On Fri, 11 Mar 2016 09:39:22 -0500, timeless wrote:
>> So, a reason not to document this filter is if we haven't decided to

(I should have capitalized the "IF" here)

>> formally freeze at our encoding.
>
> You mean "|json" filter might be changed to convert character encoding?

I roughly meant that I don't know the details of what |json does /
what we want it to do / how much testing we have to verify it does
what we want it to do.

> The current version does no encoding conversion.

afaik, it would be possible to add an optional argument to json()
which wouldn't break the behavior of |json, right?
If so, then I don't think that an optional encoding thing blocks
moving forward w/ |json.

> If input is UTF-8, output
> is UTF-8. If input is non-UTF-8 bytes, output is those bytes encapsulated in
> UTF-8 structure to comply JSON spec.

If we're certain about the byte encapsulation (is there testing to verify this?)
and that the json spec is stable, and we don't intend to change it,
then I think we can freeze it and thus a patch to document it is fine.

>> I'm not sure if we have, and I don't know if we have enough tests for its
>> input/output. If we do, then, it's OK to freeze this (if we're confident in
>> the encoding).
>
> Well, I don't think there would be a better way, but I'll happily wait the
> freeze of the whole JSON formatting issues.

The other side (naming conventions) isn't related (imo) and shouldn't
block your change.

There reason I'm handwavy is that I don't want to spend the time
reviewing the JSON spec, the converter code, and the testbase to do
the verification (I have enough things to worry about right now, and
I'm not an official reviewer), so I'm only trying to describe things
in generalities. I believe I've outlined the requirements that someone
else could use to determine whether it's ready.

> Thanks for the clarification.

I hope this clarifies it.
Yuya Nishihara - March 14, 2016, 1:40 p.m.
On Fri, 11 Mar 2016 14:05:26 -0500, timeless wrote:
> I roughly meant that I don't know the details of what |json does /
> what we want it to do / how much testing we have to verify it does
> what we want it to do.

- what |json does:
  serialize an input object as JSON

- what we want it to do:
  (AFAIK) serialize an input object as JSON without loosing non-UTF8 bytes

- how much testing we have to verify it does what we want it to do:
  I think encoding handling is well tested thanks to a couple of bugs found
  by Hypothesis. But handling of non-trivial types (e.g. a _hybrid object)
  seems not tested well.

> > The current version does no encoding conversion.
> 
> afaik, it would be possible to add an optional argument to json()
> which wouldn't break the behavior of |json, right?
> If so, then I don't think that an optional encoding thing blocks
> moving forward w/ |json.

Yep, we can change it to a template function and add an optional parameter
if we want.

So, "encoding" won't be a blocker. Do anybody remember why "|json" wasn't
documented?
timeless - March 14, 2016, 1:43 p.m.
Yuya Nishihara <yuya@tcha.org> wrote:
> - how much testing we have to verify it does what we want it to do:
>   I think encoding handling is well tested thanks to a couple of bugs found
>   by Hypothesis. But handling of non-trivial types (e.g. a _hybrid object)
>   seems not tested well.

I'd feel more comfortable if we had more testing here :)

Patch

diff --git a/mercurial/templatefilters.py b/mercurial/templatefilters.py
--- a/mercurial/templatefilters.py
+++ b/mercurial/templatefilters.py
@@ -193,6 +193,8 @@  def indent(text, prefix):
     return "".join(indenter())
 
 def json(obj):
+    """:json: Any object. Serializes the object to a JSON formatted text.
+    Input text should be encoded in UTF-8."""
     if obj is None or obj is False or obj is True:
         return {None: 'null', False: 'false', True: 'true'}[obj]
     elif isinstance(obj, int) or isinstance(obj, float):