Patchwork changelog: make sure datafile is 00changelog.d

login
register
mail settings
Submitter Jun Wu
Date May 18, 2017, 3:01 a.m.
Message ID <9pmaojgrydoac8j0atv9bttb.1495076495705@email.android.com>
Download mbox | patch
Permalink /patch/20671/
State Accepted
Headers show

Comments

Jun Wu - May 18, 2017, 3:01 a.m.
(replying from mobile - sorry if the format is bad)

We got new user reports about ".i.d" do not exist today. So I think it needs to be fixed regardless of whether previous version is broken or not.

I think "(API)" is cleaner. Hopefully that could be fixed in flight. Or I can send a V2 2ith the title change.



-------- Original message --------
From: Gregory Szorc <gregory.szorc@gmail.com>
Date: 5/17/17 4:46 PM (GMT-08:00)
To: Jun Wu <quark@fb.com>
Cc: mercurial-devel <mercurial-devel@mercurial-scm.org>
Subject: Re: [PATCH] changelog: make sure datafile is 00changelog.d

On Wed, May 17, 2017 at 3:30 PM, Jun Wu <quark@fb.com<mailto:quark@fb.com>> wrote:
# HG changeset patch
# User Jun Wu <quark@fb.com<mailto:quark@fb.com>>
# Date 1495059840 25200
#      Wed May 17 15:24:00 2017 -0700
# Node ID ea13a8402d0c78e843e811617c41e3d4c23222e6
# Parent  2d19664e257da7ad5cb97150d81838c25872fac7
# Available At https://bitbucket.org/quark-zju/hg-draft<https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwMFaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=goC_KEU1hoLmVFdaD4QgHA&m=Bo0RE33WfGu_xbfPYzPx8uhr_g4APfOqlkmvQtWBuWM&s=aK9lPRhPGmY7NKH19igEB6L9_aErDkN_-r5DrlAPc40&e=>
#              hg pull https://bitbucket.org/quark-zju/hg-draft<https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwMFaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=goC_KEU1hoLmVFdaD4QgHA&m=Bo0RE33WfGu_xbfPYzPx8uhr_g4APfOqlkmvQtWBuWM&s=aK9lPRhPGmY7NKH19igEB6L9_aErDkN_-r5DrlAPc40&e=> -r ea13a8402d0c
changelog: make sure datafile is 00changelog.d

0ad0d26ff7 makes it possible for changelog datafile to be "00changelog.i.d",
which is wrong. This patch adds an explicit datafile parameter to fix it.

The bad path to datafile existed before. But it was less exposed since the changelog/revlog instance having that value was very shortly lived.

Anyway, this shouldn't matter since the changelog/revlog instance associated with the 00changelog.i.a file is never written to via the revlog APIs. But, I agree we shouldn't take the chance. I also agree we should pass the datafile as an argument rather than build ".i.a" parsing into revlog (since it is changelog specific).



         """
         create a revlog object
@@ -261,5 +261,5 @@ class revlog(object):
         """
         self.indexfile = indexfile
-        self.datafile = indexfile[:-2] + ".d"
+        self.datafile = datafile or (indexfile[:-2] + ".d")
         self.opener = opener
         #  When True, indexfile is opened with checkambig=True at writing, to
Gregory Szorc - May 18, 2017, 3:09 a.m.
On Wed, May 17, 2017 at 8:01 PM, Jun Wu <quark@fb.com> wrote:

> (replying from mobile - sorry if the format is bad)
>
> We got new user reports about ".i.d" do not exist today. So I think it
> needs to be fixed regardless of whether previous version is broken or not.
>

Hmmm. I suppose if there is a pretxnchangegroup process hook that needs to
access non-index data we could hit this. I would like to think we have test
coverage of this. But something tells me most/all our tests are for
changelogs smaller than 128kb and thus use inline data.

Yeah, that's a pretty serious bug.

I'll queue this with (API). If a queuer doesn't like the API change, we can
work it out later.


>
> I think "(API)" is cleaner. Hopefully that could be fixed in flight. Or I
> can send a V2 2ith the title change.
>
>
>
> -------- Original message --------
> From: Gregory Szorc <gregory.szorc@gmail.com>
> Date: 5/17/17 4:46 PM (GMT-08:00)
> To: Jun Wu <quark@fb.com>
> Cc: mercurial-devel <mercurial-devel@mercurial-scm.org>
> Subject: Re: [PATCH] changelog: make sure datafile is 00changelog.d
>
> On Wed, May 17, 2017 at 3:30 PM, Jun Wu <quark@fb.com> wrote:
>
>> # HG changeset patch
>> # User Jun Wu <quark@fb.com>
>> # Date 1495059840 25200
>> #      Wed May 17 15:24:00 2017 -0700
>> # Node ID ea13a8402d0c78e843e811617c41e3d4c23222e6
>> # Parent  2d19664e257da7ad5cb97150d81838c25872fac7
>> # Available At https://bitbucket.org/quark-zju/hg-draft
>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwMFaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=goC_KEU1hoLmVFdaD4QgHA&m=Bo0RE33WfGu_xbfPYzPx8uhr_g4APfOqlkmvQtWBuWM&s=aK9lPRhPGmY7NKH19igEB6L9_aErDkN_-r5DrlAPc40&e=>
>> #              hg pull https://bitbucket.org/quark-zju/hg-draft
>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwMFaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=goC_KEU1hoLmVFdaD4QgHA&m=Bo0RE33WfGu_xbfPYzPx8uhr_g4APfOqlkmvQtWBuWM&s=aK9lPRhPGmY7NKH19igEB6L9_aErDkN_-r5DrlAPc40&e=>
>> -r ea13a8402d0c
>> changelog: make sure datafile is 00changelog.d
>>
>> 0ad0d26ff7 makes it possible for changelog datafile to be
>> "00changelog.i.d",
>> which is wrong. This patch adds an explicit datafile parameter to fix it.
>>
>
> The bad path to datafile existed before. But it was less exposed since the
> changelog/revlog instance having that value was very shortly lived.
>
> Anyway, this shouldn't matter since the changelog/revlog instance
> associated with the 00changelog.i.a file is never written to via the revlog
> APIs. But, I agree we shouldn't take the chance. I also agree we should
> pass the datafile as an argument rather than build ".i.a" parsing into
> revlog (since it is changelog specific).
>
>
>>
>> diff --git a/mercurial/changelog.py b/mercurial/changelog.py
>> --- a/mercurial/changelog.py
>> +++ b/mercurial/changelog.py
>> @@ -274,5 +274,7 @@ class changelog(revlog.revlog):
>>              indexfile = '00changelog.i'
>>
>> -        revlog.revlog.__init__(self, opener, indexfile, checkambig=True)
>> +        datafile = '00changelog.d'
>> +        revlog.revlog.__init__(self, opener, indexfile,
>> datafile=datafile,
>> +                               checkambig=True)
>>
>>          if self._initempty:
>> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
>> --- a/mercurial/revlog.py
>> +++ b/mercurial/revlog.py
>> @@ -253,5 +253,5 @@ class revlog(object):
>>      writing, to avoid file stat ambiguity.
>>      """
>> -    def __init__(self, opener, indexfile, checkambig=False):
>> +    def __init__(self, opener, indexfile, datafile=None,
>> checkambig=False):
>>
>
> Being paranoid, I think we should preserve the argument order in case any
> caller (in extensions) isn't used named arguments. That, or this should be
> marked (API).
>
>
>>          """
>>          create a revlog object
>> @@ -261,5 +261,5 @@ class revlog(object):
>>          """
>>          self.indexfile = indexfile
>> -        self.datafile = indexfile[:-2] + ".d"
>> +        self.datafile = datafile or (indexfile[:-2] + ".d")
>>          self.opener = opener
>>          #  When True, indexfile is opened with checkambig=True at
>> writing, to
>>
>
>
Jun Wu - May 18, 2017, 4:38 a.m.
Excerpts from Gregory Szorc's message of 2017-05-17 20:09:53 -0700:
> Hmmm. I suppose if there is a pretxnchangegroup process hook that needs to
> access non-index data we could hit this. I would like to think we have test
> coverage of this. But something tells me most/all our tests are for
> changelogs smaller than 128kb and thus use inline data.

I did consider adding a test and test-hook.t does catch this if inline
revlog is disabled. I thought ideally with my run-tests .t multi cases
change, test-hook.t could have a new test case with inline disabled.

Since run-tests change was not ready yet but the fix is important and
straightforward, I sent it without a test.

> Yeah, that's a pretty serious bug.
> 
> I'll queue this with (API). If a queuer doesn't like the API change, we can
> work it out later.

Patch

diff --git a/mercurial/changelog.py b/mercurial/changelog.py
--- a/mercurial/changelog.py
+++ b/mercurial/changelog.py
@@ -274,5 +274,7 @@  class changelog(revlog.revlog):
             indexfile = '00changelog.i'

-        revlog.revlog.__init__(self, opener, indexfile, checkambig=True)
+        datafile = '00changelog.d'
+        revlog.revlog.__init__(self, opener, indexfile, datafile=datafile,
+                               checkambig=True)

         if self._initempty:
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -253,5 +253,5 @@  class revlog(object):
     writing, to avoid file stat ambiguity.
     """
-    def __init__(self, opener, indexfile, checkambig=False):
+    def __init__(self, opener, indexfile, datafile=None, checkambig=False):

Being paranoid, I think we should preserve the argument order in case any caller (in extensions) isn't used named arguments. That, or this should be marked (API).