Patchwork [3,of,7] revlog: do not return raw cache blindly

login
register
mail settings
Submitter Jun Wu
Date March 28, 2017, 7:49 a.m.
Message ID <d557aaee6ada70bf51fc.1490687344@localhost.localdomain>
Download mbox | patch
Permalink /patch/19773/
State Changes Requested
Headers show

Comments

Jun Wu - March 28, 2017, 7:49 a.m.
# HG changeset patch
# User Jun Wu <quark@fb.com>
# Date 1490685397 25200
#      Tue Mar 28 00:16:37 2017 -0700
# Node ID d557aaee6ada70bf51fcc9d4d05d07a54d8f2d4e
# Parent  99cfe31d37df62b50e53a126f0eb31a1e352ac67
# Available At https://bitbucket.org/quark-zju/hg-draft
#              hg pull https://bitbucket.org/quark-zju/hg-draft -r d557aaee6ada
revlog: do not return raw cache blindly

Previously, revlog.revision(raw=False) may return raw content on cache hit.
That is wrong. This patch adds a check to fix it.

This slows down common use-cases where raw=False and flags=0. That
performance issue will be fixed later.
Ryan McElroy - March 29, 2017, 12:40 p.m.
On 3/28/17 8:49 AM, Jun Wu wrote:
> # HG changeset patch
> # User Jun Wu <quark@fb.com>
> # Date 1490685397 25200
> #      Tue Mar 28 00:16:37 2017 -0700
> # Node ID d557aaee6ada70bf51fcc9d4d05d07a54d8f2d4e
> # Parent  99cfe31d37df62b50e53a126f0eb31a1e352ac67
> revlog: do not return raw cache blindly
>
> Previously, revlog.revision(raw=False) may return raw content on cache hit.
> That is wrong. This patch adds a check to fix it.
>
> This slows down common use-cases where raw=False and flags=0. That
> performance issue will be fixed later.

s/later/in a later patch
This will make it clear that it's addressed in the same series

>
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -1262,5 +1262,6 @@ class revlog(object):
>           if self._cache:
>               if self._cache[0] == node:
> -                return self._cache[2]
> +                if raw:

You can fold this raw check into the previous if statement. Also, I'd 
really like a comment about why "raw is right" here.

The "stop at cachedrev" stuff is also pretty non-obvious, I'd love a 
comment around that too if you have the chance.

> +                    return self._cache[2]
>               cachedrev = self._cache[1]
>   
>
Jun Wu - March 29, 2017, 3:21 p.m.
Excerpts from Ryan McElroy's message of 2017-03-29 13:40:10 +0100:
> On 3/28/17 8:49 AM, Jun Wu wrote:
> > # HG changeset patch
> > # User Jun Wu <quark@fb.com>
> > # Date 1490685397 25200
> > #      Tue Mar 28 00:16:37 2017 -0700
> > # Node ID d557aaee6ada70bf51fcc9d4d05d07a54d8f2d4e
> > # Parent  99cfe31d37df62b50e53a126f0eb31a1e352ac67
> > revlog: do not return raw cache blindly
> >
> > Previously, revlog.revision(raw=False) may return raw content on cache hit.
> > That is wrong. This patch adds a check to fix it.
> >
> > This slows down common use-cases where raw=False and flags=0. That
> > performance issue will be fixed later.
> 
> s/later/in a later patch
> This will make it clear that it's addressed in the same series
> 
> >
> > diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> > --- a/mercurial/revlog.py
> > +++ b/mercurial/revlog.py
> > @@ -1262,5 +1262,6 @@ class revlog(object):
> >           if self._cache:
> >               if self._cache[0] == node:
> > -                return self._cache[2]
> > +                if raw:
> 
> You can fold this raw check into the previous if statement. Also, I'd 
> really like a comment about why "raw is right" here.

It's also documented in revlog.__init__:

    # 3-tuple of (node, rev, text) for a raw revision.
    self._cache = None

So it seems unnecessary to repeat that comment.

> 
> The "stop at cachedrev" stuff is also pretty non-obvious, I'd love a 
> comment around that too if you have the chance.

I think it's reasonably obvious if people know what "deltachain" is. I don't
think it needs a comment. Maybe others can convince me.

> 
> > +                    return self._cache[2]
> >               cachedrev = self._cache[1]
> >   
> >

Patch

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -1262,5 +1262,6 @@  class revlog(object):
         if self._cache:
             if self._cache[0] == node:
-                return self._cache[2]
+                if raw:
+                    return self._cache[2]
             cachedrev = self._cache[1]