Patchwork [3,of,3] listkeypattern: add listkeypattern wireproto method

login
register
mail settings
Submitter Stanislau Hlebik
Date Aug. 15, 2016, 8:34 a.m.
Message ID <C366F670-2F89-41CB-8175-789EB5BA9887@fb.com>
Download mbox | patch
Permalink /patch/16303/
State Not Applicable
Headers show

Comments

Stanislau Hlebik - Aug. 15, 2016, 8:34 a.m.
So to summarize, you suggest to wait for [PATCH 7 of 9 RFC] pushkey: support for encoding and decoding raw listkeys dicts to be committed and implement new wireproto method listkeypattern using encodekeysraw/decodekeysraw? That sounds good for me.

From: Gregory Szorc <gregory.szorc@gmail.com>

Date: Sunday, August 14, 2016 at 6:17 PM
To: Stanislau Hlebik <stash@fb.com>
Cc: mercurial-devel <mercurial-devel@mercurial-scm.org>
Subject: Re: [PATCH 3 of 3] listkeypattern: add listkeypattern wireproto method

On Fri, Aug 12, 2016 at 5:09 AM, Stanislau Hlebik <stash@fb.com<mailto:stash@fb.com>> wrote:
# HG changeset patch
# User Stanislau Hlebik <stash@fb.com<mailto:stash@fb.com>>
# Date 1470999441 25200
#      Fri Aug 12 03:57:21 2016 -0700
# Node ID c2ee493e216c60ff439ab93cc1efe6ac5922d8eb
# Parent  fd2185d7c2f7aa529b2ad0a6584832fb2b1b4ecb
listkeypattern: add listkeypattern wireproto method

wireproto method to list remote keys by pattern

 @wireprotocommand('lookup', 'key')
 def lookup(repo, proto, key):
     try:

I think introducing a new wire protocol command is the correct way to solve this problem (as opposed to introducing a new argument on the existing command).
However, if we're introducing a new wire protocol command for obtaining pushkey values, I think we should improve deficiencies in the response encoding rather than propagate its problems.

The "listkeys" response encoding can't transmit the full range of binary values. This can lead to larger (and slower) responses sizes. For example, a number of pushkey namespaces exchange lists of nodes. These have to be represented as hex instead of binary. For pushkey namespaces like phases or obsolescence that can exchange hundreds or thousands of nodes, the overhead can add up.
I think the response from a new listkeys command should be using framing to encode the key names and values so the full range of binary values can be efficiently transferred. We may also want a special mechanism to represent a list of nodes, as avoiding the overhead of framing on fixed width values would be desirable.

Of course, at the point you introduce a new response encoding, we may want to call the command "listkeys2." If others agree, I can code up an implementation and you can add the patterns functionality on top of it.
Gregory Szorc - Aug. 15, 2016, 4:12 p.m.
> On Aug 15, 2016, at 01:34, Stanislau Hlebik <stash@fb.com> wrote:
> 
> So to summarize, you suggest to wait for [PATCH 7 of 9 RFC] pushkey: support for encoding and decoding raw listkeys dicts to be committed and implement new wireproto method listkeypattern using encodekeysraw/decodekeysraw? That sounds good for me.

Or add a new argument to the "listkeys2" command I sent in that series.

Others may have different opinions on the need to overhaul pushkey. At the very minimum I think any new pushkey command should be using a more efficient and robust encoding.

>  
> From: Gregory Szorc <gregory.szorc@gmail.com>
> Date: Sunday, August 14, 2016 at 6:17 PM
> To: Stanislau Hlebik <stash@fb.com>
> Cc: mercurial-devel <mercurial-devel@mercurial-scm.org>
> Subject: Re: [PATCH 3 of 3] listkeypattern: add listkeypattern wireproto method
>  
> On Fri, Aug 12, 2016 at 5:09 AM, Stanislau Hlebik <stash@fb.com> wrote:
> # HG changeset patch
> # User Stanislau Hlebik <stash@fb.com>
> # Date 1470999441 25200
> #      Fri Aug 12 03:57:21 2016 -0700
> # Node ID c2ee493e216c60ff439ab93cc1efe6ac5922d8eb
> # Parent  fd2185d7c2f7aa529b2ad0a6584832fb2b1b4ecb
> listkeypattern: add listkeypattern wireproto method
> 
> wireproto method to list remote keys by pattern
> 
> diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py
> --- a/mercurial/wireproto.py
> +++ b/mercurial/wireproto.py
> @@ -353,6 +353,22 @@
>                        % (namespace, len(d)))
>          yield pushkeymod.decodekeys(d)
> 
> +    @batchable
> +    def listkeypattern(self, namespace, patterns):
> +        if not self.capable('pushkey'):
> +            yield {}, None
> +        f = future()
> +        self.ui.debug('preparing listkeys for "%s" with pattern "%s"\n' %
> +                      (namespace, patterns))
> +        yield {
> +            'namespace': encoding.fromlocal(namespace),
> +            'patterns': encodelist(patterns)
> +        }, f
> +        d = f.value
> +        self.ui.debug('received listkey for "%s": %i bytes\n'
> +                      % (namespace, len(d)))
> +        yield pushkeymod.decodekeys(d)
> +
>      def stream_out(self):
>          return self._callstream('stream_out')
> 
> @@ -676,7 +692,8 @@
>      return repo.opener.tryread('clonebundles.manifest')
> 
>  wireprotocaps = ['lookup', 'changegroupsubset', 'branchmap', 'pushkey',
> -                 'known', 'getbundle', 'unbundlehash', 'batch']
> +                 'known', 'getbundle', 'unbundlehash', 'batch',
> +                 'listkeypattern']
> 
>  def _capabilities(repo, proto):
>      """return a list of capabilities for a repo
> @@ -791,6 +808,12 @@
>      d = repo.listkeys(encoding.tolocal(namespace)).items()
>      return pushkeymod.encodekeys(d)
> 
> +@wireprotocommand('listkeypattern', 'namespace patterns *')
>  
> Why the "*" here? "others" is not used in the function implementation.
>  
> +def listkeypattern(repo, proto, namespace, patterns, others):
> +    patterns = decodelist(patterns)
> +    d = repo.listkeys(encoding.tolocal(namespace), patterns=patterns).items()
> +    return pushkeymod.encodekeys(d)
> +
>  @wireprotocommand('lookup', 'key')
>  def lookup(repo, proto, key):
>      try:
>  
> I think introducing a new wire protocol command is the correct way to solve this problem (as opposed to introducing a new argument on the existing command).
> 
> However, if we're introducing a new wire protocol command for obtaining pushkey values, I think we should improve deficiencies in the response encoding rather than propagate its problems.
> 
> The "listkeys" response encoding can't transmit the full range of binary values. This can lead to larger (and slower) responses sizes. For example, a number of pushkey namespaces exchange lists of nodes. These have to be represented as hex instead of binary. For pushkey namespaces like phases or obsolescence that can exchange hundreds or thousands of nodes, the overhead can add up.
> 
> I think the response from a new listkeys command should be using framing to encode the key names and values so the full range of binary values can be efficiently transferred. We may also want a special mechanism to represent a list of nodes, as avoiding the overhead of framing on fixed width values would be desirable.
> 
> Of course, at the point you introduce a new response encoding, we may want to call the command "listkeys2." If others agree, I can code up an implementation and you can add the patterns functionality on top of it.
Durham Goode - Aug. 17, 2016, 7:27 p.m.
On 8/15/16 9:12 AM, Gregory Szorc wrote:
>
>
> On Aug 15, 2016, at 01:34, Stanislau Hlebik <stash@fb.com 
> <mailto:stash@fb.com>> wrote:
>
>> So to summarize, you suggest to wait for /[PATCH 7 of 9 RFC] pushkey: 
>> support for encoding and decoding raw listkeys dicts**/to be 
>> committed and implement new wireproto method /listkeypattern/ using 
>> encodekeysraw/decodekeysraw?**That sounds good for me.
>>
>
> Or add a new argument to the "listkeys2" command I sent in that series.
>
> Others may have different opinions on the need to overhaul pushkey. At 
> the very minimum I think any new pushkey command should be using a 
> more efficient and robust encoding.
>
I don't think over hauling the pushkey infra should be a prerequisite 
for adding this api.  But if there is incoming support for serializing 
raw stuff, then using that seems like a reasonable ask.

Patch

diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py

--- a/mercurial/wireproto.py

+++ b/mercurial/wireproto.py

@@ -353,6 +353,22 @@ 

                       % (namespace, len(d)))
         yield pushkeymod.decodekeys(d)

+    @batchable

+    def listkeypattern(self, namespace, patterns):

+        if not self.capable('pushkey'):

+            yield {}, None

+        f = future()

+        self.ui.debug('preparing listkeys for "%s" with pattern "%s"\n' %

+                      (namespace, patterns))

+        yield {

+            'namespace': encoding.fromlocal(namespace),

+            'patterns': encodelist(patterns)

+        }, f

+        d = f.value

+        self.ui.debug('received listkey for "%s": %i bytes\n'

+                      % (namespace, len(d)))

+        yield pushkeymod.decodekeys(d)

+

     def stream_out(self):
         return self._callstream('stream_out')

@@ -676,7 +692,8 @@ 

     return repo.opener.tryread('clonebundles.manifest')

 wireprotocaps = ['lookup', 'changegroupsubset', 'branchmap', 'pushkey',
-                 'known', 'getbundle', 'unbundlehash', 'batch']

+                 'known', 'getbundle', 'unbundlehash', 'batch',

+                 'listkeypattern']


 def _capabilities(repo, proto):
     """return a list of capabilities for a repo
@@ -791,6 +808,12 @@ 

     d = repo.listkeys(encoding.tolocal(namespace)).items()
     return pushkeymod.encodekeys(d)

+@wireprotocommand('listkeypattern', 'namespace patterns *')


Why the "*" here? "others" is not used in the function implementation.

+def listkeypattern(repo, proto, namespace, patterns, others):

+    patterns = decodelist(patterns)

+    d = repo.listkeys(encoding.tolocal(namespace), patterns=patterns).items()

+    return pushkeymod.encodekeys(d)

+