Patchwork [FOLLOWUP,STABLE] commitextras: make sure keys contains ascii letters, numbers, '_' and '-' only

login
register
mail settings
Submitter Pulkit Goyal
Date July 28, 2017, 2:20 p.m.
Message ID <2545b963f05a731d51f6.1501251641@workspace>
Download mbox | patch
Permalink /patch/22572/
State Superseded
Headers show

Comments

Pulkit Goyal - July 28, 2017, 2:20 p.m.
# HG changeset patch
# User Pulkit Goyal <7895pulkit@gmail.com>
# Date 1501207975 -19800
#      Fri Jul 28 07:42:55 2017 +0530
# Branch stable
# Node ID 2545b963f05a731d51f632bdb0572a755e80eca5
# Parent  850d2ec2cf6a266987a401752c909f95dd8c4c53
commitextras: make sure keys contains ascii letters, numbers, '_' and '-' only
via Mercurial-devel - July 28, 2017, 3:37 p.m.
On Fri, Jul 28, 2017 at 7:20 AM, Pulkit Goyal <7895pulkit@gmail.com> wrote:
> # HG changeset patch
> # User Pulkit Goyal <7895pulkit@gmail.com>
> # Date 1501207975 -19800
> #      Fri Jul 28 07:42:55 2017 +0530
> # Branch stable
> # Node ID 2545b963f05a731d51f632bdb0572a755e80eca5
> # Parent  850d2ec2cf6a266987a401752c909f95dd8c4c53
> commitextras: make sure keys contains ascii letters, numbers, '_' and '-' only

Why is that important? Are stored as is in the changeset data so they
can make it unparseable?
Yuya Nishihara - July 29, 2017, 1:04 p.m.
On Fri, 28 Jul 2017 08:37:33 -0700, Martin von Zweigbergk via Mercurial-devel wrote:
> On Fri, Jul 28, 2017 at 7:20 AM, Pulkit Goyal <7895pulkit@gmail.com> wrote:
> > # HG changeset patch
> > # User Pulkit Goyal <7895pulkit@gmail.com>
> > # Date 1501207975 -19800
> > #      Fri Jul 28 07:42:55 2017 +0530
> > # Branch stable
> > # Node ID 2545b963f05a731d51f632bdb0572a755e80eca5
> > # Parent  850d2ec2cf6a266987a401752c909f95dd8c4c53
> > commitextras: make sure keys contains ascii letters, numbers, '_' and '-' only
> 
> Why is that important? Are stored as is in the changeset data so they
> can make it unparseable?

':' and 8-bit characters have a real problem. ':' is a key-value separator,
and extras are stored without encoding conversion. The other characters are
probably okay, but I don't think it's good idea to allow free-form keys.

We can relax the restriction later, but tightening it wouldn't be easy.

> > +                    if re.match('[\w-]+', k).group() != k:

Common regex idiom is finding bad characters.

  re.search('[^\w-]', k)

Patch

diff --git a/hgext/commitextras.py b/hgext/commitextras.py
--- a/hgext/commitextras.py
+++ b/hgext/commitextras.py
@@ -9,6 +9,8 @@ 
 
 from __future__ import absolute_import
 
+import re
+
 from mercurial.i18n import _
 from mercurial import (
     commands,
@@ -52,6 +54,10 @@ 
                                 "KEY=VALUE format")
                         raise error.Abort(msg % raw)
                     k, v = raw.split('=', 1)
+                    if re.match('[\w-]+', k).group() != k:
+                        msg = _("keys can only contain ascii letters, digits,"
+                                " '_' and '-'")
+                        raise error.Abort(msg)
                     if k in usedinternally:
                         msg = _("key '%s' is used internally, can't be set "
                                 "manually")
diff --git a/tests/test-commit.t b/tests/test-commit.t
--- a/tests/test-commit.t
+++ b/tests/test-commit.t
@@ -135,6 +135,9 @@ 
   $ hg commit -m "adding internal used extras" --extra amend_source=hash
   abort: key 'amend_source' is used internally, can't be set manually
   [255]
+  $ hg commit -m "special chars in extra" --extra id@phab=214
+  abort: keys can only contain ascii letters, digits, '_' and '-'
+  [255]
   $ hg commit -m "adding extras" --extra sourcehash=foo --extra oldhash=bar
   $ hg log -r . -T '{extras % "{extra}\n"}'
   branch=default