Patchwork remotefilectx: use manifest.find() to look for a single file

login
register
mail settings
Submitter via Mercurial-devel
Date Sept. 28, 2016, 12:19 a.m.
Message ID <38faebcddb68e8425fd5.1475021944@martinvonz.mtv.corp.google.com>
Download mbox | patch
Permalink /patch/16796/
State Changes Requested
Headers show

Comments

via Mercurial-devel - Sept. 28, 2016, 12:19 a.m.
# HG changeset patch
# User Martin von Zweigbergk <martinvonz@google.com>
# Date 1475020863 25200
#      Tue Sep 27 17:01:03 2016 -0700
# Node ID 38faebcddb68e8425fd5fcc64b58b61002ce0eaa
# Parent  69eeafa68cd0e62916dab43e1c5bcdc62069e19d
remotefilectx: use manifest.find() to look for a single file

The find() method on the manifest class (i.e. the revlog) is optimized
for both flat and treemanifests, so use that where we're only looking
for a single file.
Durham Goode - Oct. 11, 2016, 10:37 p.m.
Sorry for the delay in reviewing this. I am a terrible person.


On 9/27/16 5:19 PM, Martin von Zweigbergk wrote:
> # HG changeset patch
> # User Martin von Zweigbergk <martinvonz@google.com>
> # Date 1475020863 25200
> #      Tue Sep 27 17:01:03 2016 -0700
> # Node ID 38faebcddb68e8425fd5fcc64b58b61002ce0eaa
> # Parent  69eeafa68cd0e62916dab43e1c5bcdc62069e19d
> remotefilectx: use manifest.find() to look for a single file
>
> The find() method on the manifest class (i.e. the revlog) is optimized
> for both flat and treemanifests, so use that where we're only looking
> for a single file.
>
> diff -r 69eeafa68cd0 -r 38faebcddb68 remotefilelog/remotefilectx.py
> --- a/remotefilelog/remotefilectx.py	Tue Aug 30 10:18:24 2016 -0700
> +++ b/remotefilelog/remotefilectx.py	Tue Sep 27 17:01:03 2016 -0700
> @@ -68,7 +68,7 @@
>               if path in data[3]: # checking the 'files' field.
>                   # The file has been touched, check if the hash is what we're
>                   # looking for.
> -                if fileid == ma.readfast(data[0]).get(path):
> +                if fileid == ma.find(data[0], path)[0]:
Even though find is optimized for finding the file quickly within the 
manifest, doesn't it require that we decompress the entire manifest 
first? That adds 500+ms to every invocation here (which is in a tight 
loop over the changelog).
via Mercurial-devel - Oct. 11, 2016, 11:13 p.m.
On Tue, Oct 11, 2016 at 3:37 PM, Durham Goode <durham@fb.com> wrote:
> Sorry for the delay in reviewing this. I am a terrible person.
>
>
> On 9/27/16 5:19 PM, Martin von Zweigbergk wrote:
>>
>> # HG changeset patch
>> # User Martin von Zweigbergk <martinvonz@google.com>
>> # Date 1475020863 25200
>> #      Tue Sep 27 17:01:03 2016 -0700
>> # Node ID 38faebcddb68e8425fd5fcc64b58b61002ce0eaa
>> # Parent  69eeafa68cd0e62916dab43e1c5bcdc62069e19d
>> remotefilectx: use manifest.find() to look for a single file
>>
>> The find() method on the manifest class (i.e. the revlog) is optimized
>> for both flat and treemanifests, so use that where we're only looking
>> for a single file.
>>
>> diff -r 69eeafa68cd0 -r 38faebcddb68 remotefilelog/remotefilectx.py
>> --- a/remotefilelog/remotefilectx.py    Tue Aug 30 10:18:24 2016 -0700
>> +++ b/remotefilelog/remotefilectx.py    Tue Sep 27 17:01:03 2016 -0700
>> @@ -68,7 +68,7 @@
>>               if path in data[3]: # checking the 'files' field.
>>                   # The file has been touched, check if the hash is what
>> we're
>>                   # looking for.
>> -                if fileid == ma.readfast(data[0]).get(path):
>> +                if fileid == ma.find(data[0], path)[0]:
>
> Even though find is optimized for finding the file quickly within the
> manifest, doesn't it require that we decompress the entire manifest first?
> That adds 500+ms to every invocation here (which is in a tight loop over the
> changelog).

Good point. Maybe I need to handle it differently for flat and tree,
because for tree manifests ma.find() calls slow_slowreaddelta()
(ironic, I know).

Patch

diff -r 69eeafa68cd0 -r 38faebcddb68 remotefilelog/remotefilectx.py
--- a/remotefilelog/remotefilectx.py	Tue Aug 30 10:18:24 2016 -0700
+++ b/remotefilelog/remotefilectx.py	Tue Sep 27 17:01:03 2016 -0700
@@ -68,7 +68,7 @@ 
             if path in data[3]: # checking the 'files' field.
                 # The file has been touched, check if the hash is what we're
                 # looking for.
-                if fileid == ma.readfast(data[0]).get(path):
+                if fileid == ma.find(data[0], path)[0]:
                     return rev
 
         # Couldn't find the linkrev. This should generally not happen, and will
@@ -215,7 +215,7 @@ 
             if path in ac[3]: # checking the 'files' field.
                 # The file has been touched, check if the content is
                 # similar to the one we search for.
-                if fnode == ma.readfast(ac[0]).get(path):
+                if fnode == ma.find(ac[0], path)[0]:
                     return cl.node(a)
 
         return linknode