Patchwork [3,of,3,V3] localrepo: better error message when revlog error and file addition (issue4675)

login
register
mail settings
Submitter Matt Mackall
Date June 4, 2015, 7:57 p.m.
Message ID <1433447820.3128.14.camel@selenic.com>
Download mbox | patch
Permalink /patch/9487/
State Not Applicable
Headers show

Comments

Matt Mackall - June 4, 2015, 7:57 p.m.
On Wed, 2015-06-03 at 17:21 -0400, Jordi Gutiérrez Hermoso wrote:
> On Wed, 2015-06-03 at 15:20 -0500, Matt Mackall wrote:
> > On Tue, 2015-06-02 at 15:07 -0400, Jordi Gutiérrez Hermoso wrote:
> > > # HG changeset patch
> > > # User Jordi Gutiérrez Hermoso <jordigh@octave.org>
> > > # Date 1432239912 14400
> > > #      Thu May 21 16:25:12 2015 -0400
> > > # Node ID 6299517509f9a8912ff83fffa46cecf964ba779c
> > > # Parent  3353bc00fbb14f2930cf0e9b72be312e578d2f6e
> > > localrepo: better error message when revlog error and file addition (issue4675)
> > > 
> > > At this level we have enough context for saying which filename caused
> > > the revlog error, and we can also add the hint from the revlog
> > > exception, if any.
> > > 
> > > diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> > > --- a/mercurial/localrepo.py
> > > +++ b/mercurial/localrepo.py
> > > @@ -1348,7 +1348,10 @@ class localrepository(object):
> > >          text = fctx.data()
> > >          if fparent2 != nullid or flog.cmp(fparent1, text) or meta:
> > >              changelist.append(fname)
> > > -            return flog.add(text, meta, tr, linkrev, fparent1, fparent2)
> > > +            try:
> > > +                return flog.add(text, meta, tr, linkrev, fparent1, fparent2)
> > > +            except error.RevlogError as exc:
> > > +                raise util.Abort("%s: %s" % (exc, fname), hint=exc.hint)
> > >          # are just the flags changed during merge?
> > >          elif fname in manifest1 and manifest1.flags(fname) != fctx.flags():
> > >              changelist.append(fname)
> > 
> > Looks like the wrong layer for this. Either belongs in
> > revlog._addrevision (where it'll catch changelog and manifest overflows
> > too)
> 
> I thought it would make most sense to catch manifest overflows
> elsewhere, not in revlog.py. We need more context in order to form an
> error message about how the problem is about too many files, not a
> single large file.
> 
> Which makes me think that the largefiles hint should also be added at
> localrepo, but then the exception raised by revlogio should be
> something more specific than RevlogError. Are we ok with
> RevlogOverflow or something like it?

Here's what I think we ought to do:


Things to note:
- it hides the ugly traceback
- it handles both the revlogio and revlogoldio paths
- it handles all the revlogs
- it hides the details from upper levels
- it shows a filename that will probably be user-comprehensible
- we probably won't need to think about this again for years
Jordi Gutiérrez Hermoso - June 5, 2015, 1:53 p.m.
On Thu, 2015-06-04 at 14:57 -0500, Matt Mackall wrote:
> Here's what I think we ought to do:
> 
> diff -r 69609f43c752 mercurial/revlog.py
> --- a/mercurial/revlog.py	Fri May 29 13:11:52 2015 -0700
> +++ b/mercurial/revlog.py	Thu Jun 04 14:48:20 2015 -0500
> @@ -167,12 +167,6 @@
>          return index, getattr(index, 'nodemap', None), cache
>  
>      def packentry(self, entry, node, version, rev):
> -        uncompressedlength = entry[2]
> -        if uncompressedlength > _maxentrysize:
> -            raise RevlogError(
> -                _("size of %d bytes exceeds maximum revlog storage of 2GiB")
> -                % uncompressedlength)
> -
>          p = _pack(indexformatng, *entry)
>          if rev == 0:
>              p = _pack(versionformat, version) + p[4:]
> @@ -1190,6 +1184,12 @@
>          if link == nullrev:
>              raise RevlogError(_("attempted to add linkrev -1 to %s")
>                                % self.indexfile)
> +
> +        if len(text) > _maxentrysize:
> +            raise RevlogError(
> +                _("%s: size of %d bytes exceeds maximum revlog storage of 2GiB")
> +                % (self.indexfile, uncompressedlength))
> +
>          node = node or self.hash(text, p1, p2)
>          if node in self.nodemap:
>              return node
> 
> Things to note:
> - it hides the ugly traceback
> - it handles both the revlogio and revlogoldio paths
> - it handles all the revlogs
> - it hides the details from upper levels
> - it shows a filename that will probably be user-comprehensible
> - we probably won't need to think about this again for years

I like everything except exposing the revlog's file name. I disagree
that users will know what it means. Most users don't even know what a
revlog is, and without some knowledge of hg internals, they won't know
how to tell apart a manifest from a file revlog.

But I'm also tired of arguing about this. Just push your patch. It's
certainly better than what we have now, and they can always do a web
search for the error message if they need an explanation.

Patch

diff -r 69609f43c752 mercurial/revlog.py
--- a/mercurial/revlog.py	Fri May 29 13:11:52 2015 -0700
+++ b/mercurial/revlog.py	Thu Jun 04 14:48:20 2015 -0500
@@ -167,12 +167,6 @@ 
         return index, getattr(index, 'nodemap', None), cache
 
     def packentry(self, entry, node, version, rev):
-        uncompressedlength = entry[2]
-        if uncompressedlength > _maxentrysize:
-            raise RevlogError(
-                _("size of %d bytes exceeds maximum revlog storage of 2GiB")
-                % uncompressedlength)
-
         p = _pack(indexformatng, *entry)
         if rev == 0:
             p = _pack(versionformat, version) + p[4:]
@@ -1190,6 +1184,12 @@ 
         if link == nullrev:
             raise RevlogError(_("attempted to add linkrev -1 to %s")
                               % self.indexfile)
+
+        if len(text) > _maxentrysize:
+            raise RevlogError(
+                _("%s: size of %d bytes exceeds maximum revlog storage of 2GiB")
+                % (self.indexfile, uncompressedlength))
+
         node = node or self.hash(text, p1, p2)
         if node in self.nodemap:
             return node