Patchwork [2,of,2] manifest: drop the now-unused find method

login
register
mail settings
Submitter Bryan O'Sullivan
Date March 12, 2013, 9:31 p.m.
Message ID <6142662eb478db7da277.1363123892@australite.thefacebook.com>
Download mbox | patch
Permalink /patch/1117/
State Rejected, archived
Headers show

Comments

Bryan O'Sullivan - March 12, 2013, 9:31 p.m.
# HG changeset patch
# User Bryan O'Sullivan <bryano@fb.com>
# Date 1363123859 25200
# Node ID 6142662eb478db7da277a74381b01251c67c7342
# Parent  be03cde093c4ab03c92e31fc55d5680c35bd8327
manifest: drop the now-unused find method

It didn't manage _mancache correctly, and it was also slower than simply
parsing the entire manifest (which has been done in C since 2008) and
looking up the resulting dicts.
Durham Goode - March 12, 2013, 10:37 p.m.
Looks good.  I don't see anywhere else in the code that uses it.
Bryan O'Sullivan - March 12, 2013, 11:51 p.m.
On Tue, Mar 12, 2013 at 3:37 PM, Durham Goode <durham@fb.com> wrote:

> Looks good.  I don't see anywhere else in the code that uses it.
>

It turns out to affect grep performance after all, at least in the
small-number-of-changes case. That's ... awkward.

So we need something like manifest.find to help (a) the common case of just
a few changes to the manifest, but we also need something for (b) large
numbers of changes, but it has to not clobber performance for case (a).
This needs some more thinking :-(
Sean Farley - March 13, 2013, 1:01 a.m.
Bryan O'Sullivan writes:

> On Tue, Mar 12, 2013 at 3:37 PM, Durham Goode <durham@fb.com> wrote:
>
>> Looks good.  I don't see anywhere else in the code that uses it.
>>
>
> It turns out to affect grep performance after all, at least in the
> small-number-of-changes case. That's ... awkward.
>
> So we need something like manifest.find to help (a) the common case of just
> a few changes to the manifest, but we also need something for (b) large
> numbers of changes, but it has to not clobber performance for case (a).
> This needs some more thinking :-(

For me, this almost doubled the time for 'hg grep' but I don't know if
it's in case (a) or not. I'll post the results for completeness. The
number of revisions is 101978 and the manifest is

$ hg manifest | wc -l
16395

-- Before --
$ time /opt/local/bin/hg grep canonical

real	6m20.292s
user	6m6.175s
sys	0m11.345s

-- After --
$ ~/projects/hg/hg grep canonical

real	12m17.326s
user	12m1.745s
sys	0m12.773s

Patch

diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -100,20 +100,6 @@  class manifest(revlog.revlog):
         else:
             return (lo, lo)
 
-    def find(self, node, f):
-        '''look up entry for a single file efficiently.
-        return (node, flags) pair if found, (None, None) if not.'''
-        if node in self._mancache:
-            mapping = self._mancache[node][0]
-            return mapping.get(f), mapping.flags(f)
-        text = self.revision(node)
-        start, end = self._search(text, f)
-        if start == end:
-            return None, None
-        l = text[start:end]
-        f, n = l.split('\0')
-        return revlog.bin(n[:40]), n[40:-1]
-
     def add(self, map, transaction, link, p1=None, p2=None,
             changed=None):
         # apply the changes collected during the bisect loop to our addlist