Patchwork [7,of,9] context: enhance findbyhash for partial hash ID

login
register
mail settings
Submitter Katsunori FUJIWARA
Date March 29, 2015, 6:34 p.m.
Message ID <ce47b26e3c0d6104b5bf.1427654069@juju>
Download mbox | patch
Permalink /patch/8360/
State Changes Requested
Headers show

Comments

Katsunori FUJIWARA - March 29, 2015, 6:34 p.m.
# HG changeset patch
# User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
# Date 1427653752 -32400
#      Mon Mar 30 03:29:12 2015 +0900
# Node ID ce47b26e3c0d6104b5bfb2d15e59b5a2af573956
# Parent  5070134e83f1a46a55f74e6e6dce6ebd75a5eee9
context: enhance findbyhash for partial hash ID

Before this patch, there is no easy way to examine existence of the
revision by partial hash ID strictly.

"hashid in repo" may match against bookmarks, tags and so on.

This patch factors out the logic for it in "changectx.__init__()" and
enhance "findbyhash()" with it, to avoid duplication of similar code.

This is a preparation to fix equivalence problem of revset predicate
"id()".
Pierre-Yves David - March 30, 2015, 10:13 p.m.
On 03/29/2015 11:34 AM, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> # Date 1427653752 -32400
> #      Mon Mar 30 03:29:12 2015 +0900
> # Node ID ce47b26e3c0d6104b5bfb2d15e59b5a2af573956
> # Parent  5070134e83f1a46a55f74e6e6dce6ebd75a5eee9
> context: enhance findbyhash for partial hash ID
>
> Before this patch, there is no easy way to examine existence of the
> revision by partial hash ID strictly.
>
> "hashid in repo" may match against bookmarks, tags and so on.
>
> This patch factors out the logic for it in "changectx.__init__()" and
> enhance "findbyhash()" with it, to avoid duplication of similar code.
>
> This is a preparation to fix equivalence problem of revset predicate
> "id()".
>
> diff --git a/mercurial/context.py b/mercurial/context.py
> --- a/mercurial/context.py
> +++ b/mercurial/context.py
> @@ -426,6 +426,12 @@ def _findby40hash(repo, changeid):
>       except (TypeError, LookupError):
>           return None
>
> +def _findbypartialhash(repo, changeid):
> +    node = repo.unfiltered().changelog._partialmatch(changeid)
> +    if node is not None:
> +        return (node, repo.changelog.rev(node))
> +    return None
> +
>   def findbyhash(repo, hashid, abort=False):
>       """Find the revision by hash ID
>
> @@ -443,8 +449,12 @@ def findbyhash(repo, hashid, abort=False
>       branches and so on.
>       """
>       assert isinstance(hashid, str)
> -    assert len(hashid) == 40
> -    return _wraplookup(_findby40hash, repo, hashid, abort)
> +    assert len(hashid) <= 40
> +    if len(hashid) == 40:
> +        lookupfunc = _findby40hash
> +    else:
> +        lookupfunc = _findbypartialhash

Why are we not using _partialmatch in all case ?.

> +    return _wraplookup(lookupfunc, repo, hashid, abort)
>
>   class changectx(basectx):
>       """A changecontext object makes access to data related to a particular
> @@ -528,9 +538,9 @@ class changectx(basectx):
>               except error.RepoLookupError:
>                   pass
>
> -            self._node = repo.unfiltered().changelog._partialmatch(changeid)
> -            if self._node is not None:
> -                self._rev = repo.changelog.rev(self._node)
> +            found = _findbypartialhash(repo, changeid)
> +            if found:
> +                self._node, self._rev = found
>                   return
>
>               # lookup failed
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>
Katsunori FUJIWARA - March 31, 2015, 10 a.m.
At Mon, 30 Mar 2015 15:13:41 -0700,
Pierre-Yves David wrote:
> 
> 
> 
> On 03/29/2015 11:34 AM, FUJIWARA Katsunori wrote:
> > # HG changeset patch
> > # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
> > # Date 1427653752 -32400
> > #      Mon Mar 30 03:29:12 2015 +0900
> > # Node ID ce47b26e3c0d6104b5bfb2d15e59b5a2af573956
> > # Parent  5070134e83f1a46a55f74e6e6dce6ebd75a5eee9
> > context: enhance findbyhash for partial hash ID
> >
> > Before this patch, there is no easy way to examine existence of the
> > revision by partial hash ID strictly.
> >
> > "hashid in repo" may match against bookmarks, tags and so on.
> >
> > This patch factors out the logic for it in "changectx.__init__()" and
> > enhance "findbyhash()" with it, to avoid duplication of similar code.
> >
> > This is a preparation to fix equivalence problem of revset predicate
> > "id()".
> >
> > diff --git a/mercurial/context.py b/mercurial/context.py
> > --- a/mercurial/context.py
> > +++ b/mercurial/context.py
> > @@ -426,6 +426,12 @@ def _findby40hash(repo, changeid):
> >       except (TypeError, LookupError):
> >           return None
> >
> > +def _findbypartialhash(repo, changeid):
> > +    node = repo.unfiltered().changelog._partialmatch(changeid)
> > +    if node is not None:
> > +        return (node, repo.changelog.rev(node))
> > +    return None
> > +
> >   def findbyhash(repo, hashid, abort=False):
> >       """Find the revision by hash ID
> >
> > @@ -443,8 +449,12 @@ def findbyhash(repo, hashid, abort=False
> >       branches and so on.
> >       """
> >       assert isinstance(hashid, str)
> > -    assert len(hashid) == 40
> > -    return _wraplookup(_findby40hash, repo, hashid, abort)
> > +    assert len(hashid) <= 40
> > +    if len(hashid) == 40:
> > +        lookupfunc = _findby40hash
> > +    else:
> > +        lookupfunc = _findbypartialhash
> 
> Why are we not using _partialmatch in all case ?.

Because "revlog.rev()" is more efficient than "revlog._partialmatch()"
for 40 letter hash ID, which doesn't have any ambiguity.

According to revlog implementation, "revlog._partialmatch()" implies
both "revlog.index.partialmatch()" and "revlog.rev()" (via
"revlog.hasnode()"), if specified id is valid.

Even if C implementation of "index.partialmatch()" is much faster than
"revlog.rev()", it is just overhead of "revlog._partialmatch()" at
comparison with "revlog.rev()".

Do I overlook any reasons (perhaps, around C implementation) to have
to use "revlog._partialmatch()" in both cases ?


> > +    return _wraplookup(lookupfunc, repo, hashid, abort)
> >
> >   class changectx(basectx):
> >       """A changecontext object makes access to data related to a particular
> > @@ -528,9 +538,9 @@ class changectx(basectx):
> >               except error.RepoLookupError:
> >                   pass
> >
> > -            self._node = repo.unfiltered().changelog._partialmatch(changeid)
> > -            if self._node is not None:
> > -                self._rev = repo.changelog.rev(self._node)
> > +            found = _findbypartialhash(repo, changeid)
> > +            if found:
> > +                self._node, self._rev = found
> >                   return
> >
> >               # lookup failed
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel@selenic.com
> > http://selenic.com/mailman/listinfo/mercurial-devel
> >
> 
> -- 
> Pierre-Yves David
> 

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy@lares.dti.ne.jp
Pierre-Yves David - March 31, 2015, 4:47 p.m.
On 03/31/2015 03:00 AM, FUJIWARA Katsunori wrote:
> At Mon, 30 Mar 2015 15:13:41 -0700,
> Pierre-Yves David wrote:
>>
>>
>>
>> On 03/29/2015 11:34 AM, FUJIWARA Katsunori wrote:
>>> # HG changeset patch
>>> # User FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
>>> # Date 1427653752 -32400
>>> #      Mon Mar 30 03:29:12 2015 +0900
>>> # Node ID ce47b26e3c0d6104b5bfb2d15e59b5a2af573956
>>> # Parent  5070134e83f1a46a55f74e6e6dce6ebd75a5eee9
>>> context: enhance findbyhash for partial hash ID
>>>
>>> Before this patch, there is no easy way to examine existence of the
>>> revision by partial hash ID strictly.
>>>
>>> "hashid in repo" may match against bookmarks, tags and so on.
>>>
>>> This patch factors out the logic for it in "changectx.__init__()" and
>>> enhance "findbyhash()" with it, to avoid duplication of similar code.
>>>
>>> This is a preparation to fix equivalence problem of revset predicate
>>> "id()".
>>>
>>> diff --git a/mercurial/context.py b/mercurial/context.py
>>> --- a/mercurial/context.py
>>> +++ b/mercurial/context.py
>>> @@ -426,6 +426,12 @@ def _findby40hash(repo, changeid):
>>>        except (TypeError, LookupError):
>>>            return None
>>>
>>> +def _findbypartialhash(repo, changeid):
>>> +    node = repo.unfiltered().changelog._partialmatch(changeid)
>>> +    if node is not None:
>>> +        return (node, repo.changelog.rev(node))
>>> +    return None
>>> +
>>>    def findbyhash(repo, hashid, abort=False):
>>>        """Find the revision by hash ID
>>>
>>> @@ -443,8 +449,12 @@ def findbyhash(repo, hashid, abort=False
>>>        branches and so on.
>>>        """
>>>        assert isinstance(hashid, str)
>>> -    assert len(hashid) == 40
>>> -    return _wraplookup(_findby40hash, repo, hashid, abort)
>>> +    assert len(hashid) <= 40
>>> +    if len(hashid) == 40:
>>> +        lookupfunc = _findby40hash
>>> +    else:
>>> +        lookupfunc = _findbypartialhash
>>
>> Why are we not using _partialmatch in all case ?.
>
> Because "revlog.rev()" is more efficient than "revlog._partialmatch()"
> for 40 letter hash ID, which doesn't have any ambiguity.
>
> According to revlog implementation, "revlog._partialmatch()" implies
> both "revlog.index.partialmatch()" and "revlog.rev()" (via
> "revlog.hasnode()"), if specified id is valid.
>
> Even if C implementation of "index.partialmatch()" is much faster than
> "revlog.rev()", it is just overhead of "revlog._partialmatch()" at
> comparison with "revlog.rev()".
>
> Do I overlook any reasons (perhaps, around C implementation) to have
> to use "revlog._partialmatch()" in both cases ?

It does not seems so, this reply is can probably be reduced to an inline 
comment explaining this.

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -426,6 +426,12 @@  def _findby40hash(repo, changeid):
     except (TypeError, LookupError):
         return None
 
+def _findbypartialhash(repo, changeid):
+    node = repo.unfiltered().changelog._partialmatch(changeid)
+    if node is not None:
+        return (node, repo.changelog.rev(node))
+    return None
+
 def findbyhash(repo, hashid, abort=False):
     """Find the revision by hash ID
 
@@ -443,8 +449,12 @@  def findbyhash(repo, hashid, abort=False
     branches and so on.
     """
     assert isinstance(hashid, str)
-    assert len(hashid) == 40
-    return _wraplookup(_findby40hash, repo, hashid, abort)
+    assert len(hashid) <= 40
+    if len(hashid) == 40:
+        lookupfunc = _findby40hash
+    else:
+        lookupfunc = _findbypartialhash
+    return _wraplookup(lookupfunc, repo, hashid, abort)
 
 class changectx(basectx):
     """A changecontext object makes access to data related to a particular
@@ -528,9 +538,9 @@  class changectx(basectx):
             except error.RepoLookupError:
                 pass
 
-            self._node = repo.unfiltered().changelog._partialmatch(changeid)
-            if self._node is not None:
-                self._rev = repo.changelog.rev(self._node)
+            found = _findbypartialhash(repo, changeid)
+            if found:
+                self._node, self._rev = found
                 return
 
             # lookup failed