Patchwork bundle2: enforce parttype as alphanumerical

login
register
mail settings
Submitter Pierre-Yves David
Date Jan. 16, 2015, 10:30 a.m.
Message ID <0f1be09c71fcc524eb66.1421404233@marginatus.alto.octopoid.net>
Download mbox | patch
Permalink /patch/7489/
State Accepted
Commit 54139ea7ecfc1fea30b1afe47bf1e03a45aef61e
Headers show

Comments

Pierre-Yves David - Jan. 16, 2015, 10:30 a.m.
# HG changeset patch
# User Pierre-Yves David <pierre-yves.david@fb.com>
# Date 1418958841 28800
#      Thu Dec 18 19:14:01 2014 -0800
# Node ID 0f1be09c71fcc524eb66815003752abc41477587
# Parent  81349f4b47f4c793873290852808ff095ab24c00
bundle2: enforce parttype as alphanumerical

The binary format description have always state that the parttype should simple,
but it was never really enforced. Recent discussion have convinced me we want to
keep the part type simple and easy to debug. There is enough extensibility in
the rest of the format.
Augie Fackler - Jan. 16, 2015, 4:01 p.m.
On Fri, Jan 16, 2015 at 02:30:33AM -0800, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david@fb.com>
> # Date 1418958841 28800
> #      Thu Dec 18 19:14:01 2014 -0800
> # Node ID 0f1be09c71fcc524eb66815003752abc41477587
> # Parent  81349f4b47f4c793873290852808ff095ab24c00
> bundle2: enforce parttype as alphanumerical

Queued this, thanks.

>
> The binary format description have always state that the parttype should simple,
> but it was never really enforced. Recent discussion have convinced me we want to
> keep the part type simple and easy to debug. There is enough extensibility in
> the rest of the format.
>
> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
> --- a/mercurial/bundle2.py
> +++ b/mercurial/bundle2.py
> @@ -83,11 +83,11 @@ Binary format is as follow
>
>      The binary format of the header is has follow
>
>      :typesize: (one byte)
>
> -    :parttype: alphanumerical part name
> +    :parttype: alphanumerical part name (restricted to ^a-zA-Z0-9_:-])
>
>      :partid: A 32bits integer (unique in the bundle) that can be used to refer
>               to this part.
>
>      :parameters:
> @@ -151,10 +151,11 @@ import struct
>  import urllib
>  import string
>  import obsolete
>  import pushkey
>  import url
> +import re
>
>  import changegroup, error
>  from i18n import _
>
>  _pack = struct.pack
> @@ -169,10 +170,17 @@ from i18n import _
>  _fpayloadsize = '>i'
>  _fpartparamcount = '>BB'
>
>  preferedchunksize = 4096
>
> +_parttypeforbidden = re.compile('[^a-zA-Z0-9_:-]')
> +
> +def validateparttype(parttype):
> +    """raise ValueError if a parttype contains invalid character"""
> +    if _parttypeforbidden.match(parttype):
> +        raise ValueError(parttype)
> +
>  def _makefpartparamsizes(nbparams):
>      """return a struct format to read part parameter sizes
>
>      The number parameters is variable so we need to build that format
>      dynamically.
> @@ -189,10 +197,11 @@ def parthandler(parttype, params=()):
>          @parthandler('myparttype', ('mandatory', 'param', 'handled'))
>          def myparttypehandler(...):
>              '''process a part of type "my part".'''
>              ...
>      """
> +    validateparttype(parttype)
>      def _decorator(func):
>          lparttype = parttype.lower() # enforce lower case matching.
>          assert lparttype not in parthandlermapping
>          parthandlermapping[lparttype] = func
>          func.params = frozenset(params)
> @@ -588,10 +597,11 @@ class bundlepart(object):
>      Both data and parameters cannot be modified after the generation has begun.
>      """
>
>      def __init__(self, parttype, mandatoryparams=(), advisoryparams=(),
>                   data='', mandatory=True):
> +        validateparttype(parttype)
>          self.id = None
>          self.type = parttype
>          self._data = data
>          self._mandatoryparams = list(mandatoryparams)
>          self._advisoryparams = list(advisoryparams)
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Malte Helmert - Jan. 18, 2015, 1:16 a.m.
On 16.01.2015 17:01, Augie Fackler wrote:
> On Fri, Jan 16, 2015 at 02:30:33AM -0800, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david@fb.com>
>> # Date 1418958841 28800
>> #      Thu Dec 18 19:14:01 2014 -0800
>> # Node ID 0f1be09c71fcc524eb66815003752abc41477587
>> # Parent  81349f4b47f4c793873290852808ff095ab24c00
>> bundle2: enforce parttype as alphanumerical
> 
> Queued this, thanks.

I may be a bit late, but...

>> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
>> --- a/mercurial/bundle2.py
>> +++ b/mercurial/bundle2.py
>> @@ -83,11 +83,11 @@ Binary format is as follow
>>
>>      The binary format of the header is has follow
>>
>>      :typesize: (one byte)
>>
>> -    :parttype: alphanumerical part name
>> +    :parttype: alphanumerical part name (restricted to ^a-zA-Z0-9_:-])

...the comment seems to have two errors: "^" shouldn't be there because
the comment says what is allowed, not what is forbidden. And if "]" is
part of the syntax here, then "[" should be too. Replacing "^" by "["
should fix both.

But more importantly:

>> +_parttypeforbidden = re.compile('[^a-zA-Z0-9_:-]')
>> +
>> +def validateparttype(parttype):
>> +    """raise ValueError if a parttype contains invalid character"""
>> +    if _parttypeforbidden.match(parttype):
>> +        raise ValueError(parttype)
>> +

...given that this is "match", it only check the first character of
parttype. So something like "a$@%" would be accepted. Using "search"
instead of "match" should work.
Matt Mackall - Jan. 18, 2015, 2:09 a.m.
On Sun, 2015-01-18 at 02:16 +0100, Malte Helmert wrote:
> On 16.01.2015 17:01, Augie Fackler wrote:
> > On Fri, Jan 16, 2015 at 02:30:33AM -0800, Pierre-Yves David wrote:
> >> # HG changeset patch
> >> # User Pierre-Yves David <pierre-yves.david@fb.com>
> >> # Date 1418958841 28800
> >> #      Thu Dec 18 19:14:01 2014 -0800
> >> # Node ID 0f1be09c71fcc524eb66815003752abc41477587
> >> # Parent  81349f4b47f4c793873290852808ff095ab24c00
> >> bundle2: enforce parttype as alphanumerical
> > 
> > Queued this, thanks.
> 
> I may be a bit late, but...
> 
> >> diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
> >> --- a/mercurial/bundle2.py
> >> +++ b/mercurial/bundle2.py
> >> @@ -83,11 +83,11 @@ Binary format is as follow
> >>
> >>      The binary format of the header is has follow
> >>
> >>      :typesize: (one byte)
> >>
> >> -    :parttype: alphanumerical part name
> >> +    :parttype: alphanumerical part name (restricted to ^a-zA-Z0-9_:-])
> 
> ...the comment seems to have two errors: "^" shouldn't be there because
> the comment says what is allowed, not what is forbidden. And if "]" is
> part of the syntax here, then "[" should be too. Replacing "^" by "["
> should fix both.
> 
> But more importantly:
> 
> >> +_parttypeforbidden = re.compile('[^a-zA-Z0-9_:-]')
> >> +
> >> +def validateparttype(parttype):
> >> +    """raise ValueError if a parttype contains invalid character"""
> >> +    if _parttypeforbidden.match(parttype):
> >> +        raise ValueError(parttype)
> >> +
> 
> ...given that this is "match", it only check the first character of
> parttype. So something like "a$@%" would be accepted. Using "search"
> instead of "match" should work.

Fixed, thanks. Good spotting!

Patch

diff --git a/mercurial/bundle2.py b/mercurial/bundle2.py
--- a/mercurial/bundle2.py
+++ b/mercurial/bundle2.py
@@ -83,11 +83,11 @@  Binary format is as follow
 
     The binary format of the header is has follow
 
     :typesize: (one byte)
 
-    :parttype: alphanumerical part name
+    :parttype: alphanumerical part name (restricted to ^a-zA-Z0-9_:-])
 
     :partid: A 32bits integer (unique in the bundle) that can be used to refer
              to this part.
 
     :parameters:
@@ -151,10 +151,11 @@  import struct
 import urllib
 import string
 import obsolete
 import pushkey
 import url
+import re
 
 import changegroup, error
 from i18n import _
 
 _pack = struct.pack
@@ -169,10 +170,17 @@  from i18n import _
 _fpayloadsize = '>i'
 _fpartparamcount = '>BB'
 
 preferedchunksize = 4096
 
+_parttypeforbidden = re.compile('[^a-zA-Z0-9_:-]')
+
+def validateparttype(parttype):
+    """raise ValueError if a parttype contains invalid character"""
+    if _parttypeforbidden.match(parttype):
+        raise ValueError(parttype)
+
 def _makefpartparamsizes(nbparams):
     """return a struct format to read part parameter sizes
 
     The number parameters is variable so we need to build that format
     dynamically.
@@ -189,10 +197,11 @@  def parthandler(parttype, params=()):
         @parthandler('myparttype', ('mandatory', 'param', 'handled'))
         def myparttypehandler(...):
             '''process a part of type "my part".'''
             ...
     """
+    validateparttype(parttype)
     def _decorator(func):
         lparttype = parttype.lower() # enforce lower case matching.
         assert lparttype not in parthandlermapping
         parthandlermapping[lparttype] = func
         func.params = frozenset(params)
@@ -588,10 +597,11 @@  class bundlepart(object):
     Both data and parameters cannot be modified after the generation has begun.
     """
 
     def __init__(self, parttype, mandatoryparams=(), advisoryparams=(),
                  data='', mandatory=True):
+        validateparttype(parttype)
         self.id = None
         self.type = parttype
         self._data = data
         self._mandatoryparams = list(mandatoryparams)
         self._advisoryparams = list(advisoryparams)