Submitter | Mads Kiilerich |
---|---|
Date | Oct. 8, 2015, 11:20 p.m. |
Message ID | <dfa056583dbbec4e0a2c.1444346427@localhost.localdomain> |
Download | mbox | patch |
Permalink | /patch/10904/ |
State | Accepted |
Commit | a3fcc8e3136bd19012d28b863d6bf4429948c573 |
Headers | show |
Comments
On Fri, 2015-10-09 at 01:20 +0200, Mads Kiilerich wrote: > # HG changeset patch > # User Mads Kiilerich <madski@unity3d.com> > # Date 1444346377 -7200 > # Fri Oct 09 01:19:37 2015 +0200 > # Node ID dfa056583dbbec4e0a2cc31366a46599c989a979 > # Parent 3f234db6fe8d7cc3d39083f7e7523444bc21e5a2 > context: don't hex encode all unknown 20 char revision specs (issue4890) > > d3908c911d5e introduced nice hexified display of missing nodes. It did however > also make missing 20 character revision specifications be shown as hex - very > confusing. > > Users are often wrong and somehow specify revisions that don't exist. Nodes > will however rarely be missing ... and they will only look like a user provided > revision specification and be all ascii in 1 of 4*10**9. It's actually 4e8, not 9. Which means it has something like a 1% chance for mysteriously breaking one of the repos at my day job. > - if len(changeid) == 20: > + if len(changeid) == 20 and nonascii(changeid): > changeid = hex(changeid) You cannot put a regex check in the context creation fast path. That code desperately needs to be faster, not slower. You are, however, welcome to make the error handler as slow as you want.
On 10/09/2015 06:16 AM, Matt Mackall wrote: > On Fri, 2015-10-09 at 01:20 +0200, Mads Kiilerich wrote: >> # HG changeset patch >> # User Mads Kiilerich <madski@unity3d.com> >> # Date 1444346377 -7200 >> # Fri Oct 09 01:19:37 2015 +0200 >> # Node ID dfa056583dbbec4e0a2cc31366a46599c989a979 >> # Parent 3f234db6fe8d7cc3d39083f7e7523444bc21e5a2 >> context: don't hex encode all unknown 20 char revision specs (issue4890) >> >> d3908c911d5e introduced nice hexified display of missing nodes. It did however >> also make missing 20 character revision specifications be shown as hex - very >> confusing. >> >> Users are often wrong and somehow specify revisions that don't exist. Nodes >> will however rarely be missing ... and they will only look like a user provided >> revision specification and be all ascii in 1 of 4*10**9. > It's actually 4e8, not 9. Which means it has something like a 1% chance > for mysteriously breaking one of the repos at my day job. Yes, if you make lookup of all the nodes and none of them exists. I don't know, what is the simplest way of triggering a lookup of a non-existing node? >> - if len(changeid) == 20: >> + if len(changeid) == 20 and nonascii(changeid): >> changeid = hex(changeid) > You cannot put a regex check in the context creation fast path. That > code desperately needs to be faster, not slower. > > You are, however, welcome to make the error handler as slow as you want. This _is_ in the error handler. /Mads
On Fri, 2015-10-09 at 12:57 +0200, Mads Kiilerich wrote: > >> - if len(changeid) == 20: > >> + if len(changeid) == 20 and nonascii(changeid): > >> changeid = hex(changeid) > > You cannot put a regex check in the context creation fast path. That > > code desperately needs to be faster, not slower. > > > > You are, however, welcome to make the error handler as slow as you want. > > This _is_ in the error handler. Right, now it all makes sense. I thought this was the earlier "== 20" block. Queued for default, thanks.
On 10/11/2015 05:57 PM, Matt Mackall wrote: > On Fri, 2015-10-09 at 12:57 +0200, Mads Kiilerich wrote: >>>> - if len(changeid) == 20: >>>> + if len(changeid) == 20 and nonascii(changeid): >>>> changeid = hex(changeid) >>> You cannot put a regex check in the context creation fast path. That >>> code desperately needs to be faster, not slower. >>> >>> You are, however, welcome to make the error handler as slow as you want. >> This _is_ in the error handler. > Right, now it all makes sense. I thought this was the earlier "== 20" > block. Queued for default, thanks. I looked more at it and also agree that the end of the __init__ not _just_ is an _error_ handler. It is also reached from localrepo __contains__ and lookup where RepoLookupError is caught and the nice localized message is discarded. Some anecdotal evidence from the test suite: it only ends up in the case with missing 20 char changeids 50 times. 37 of them are '_rebasedefaultdest()'. Only 7 of them are binary. The code did thus already have room for improvement and the extra regexp check is not in a fast fast path and will not slow anything down in a noticeable way. Also agreed, as a part of the mentioned desperately needed improvement, these nice and often irrelevant messages could be moved to the "real" error handler in dispatch. /Mads
Patch
diff --git a/mercurial/context.py b/mercurial/context.py --- a/mercurial/context.py +++ b/mercurial/context.py @@ -5,6 +5,8 @@ # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. +import re + from node import nullid, nullrev, wdirid, short, hex, bin from i18n import _ import mdiff, error, util, scmutil, subrepo, patch, encoding, phases @@ -22,6 +24,8 @@ propertycache = util.propertycache # dirty in the working copy. _newnode = '!' * 21 +nonascii = re.compile(r'[^\x21-\x7f]').search + class basectx(object): """A basectx object represents the common logic for its children: changectx: read-only context that is already present in the repo, @@ -466,7 +470,7 @@ class changectx(basectx): msg = _("working directory has unknown parent '%s'!") raise error.Abort(msg % short(changeid)) try: - if len(changeid) == 20: + if len(changeid) == 20 and nonascii(changeid): changeid = hex(changeid) except TypeError: pass diff --git a/tests/test-pull.t b/tests/test-pull.t --- a/tests/test-pull.t +++ b/tests/test-pull.t @@ -52,6 +52,18 @@ $ hg rollback --dry-run --verbose repository tip rolled back to revision -1 (undo pull: http://foo:***@localhost:$HGPORT/) +Test pull of non-existing 20 character revision specification, making sure plain ascii identifiers +not are encoded like a node: + + $ hg pull -r 'xxxxxxxxxxxxxxxxxxxy' + pulling from http://foo@localhost:$HGPORT/ + abort: unknown revision 'xxxxxxxxxxxxxxxxxxxy'! + [255] + $ hg pull -r 'xxxxxxxxxxxxxxxxxx y' + pulling from http://foo@localhost:$HGPORT/ + abort: unknown revision '7878787878787878787878787878787878782079'! + [255] + Issue622: hg init && hg pull -u URL doesn't checkout default branch $ cd ..