Submitter | Pierre-Yves David |
---|---|
Date | May 11, 2016, 7:38 a.m. |
Message ID | <b62cf5924c3b5684cb06.1462952333@nodosa.octopoid.net> |
Download | mbox | patch |
Permalink | /patch/15009/ |
State | Accepted |
Commit | 0e9ed09f5fe9c390e60a98426f64bc3a231f7aa0 |
Headers | show |
Comments
The series itself is ok. I'm not a huge fan of "x uses y instead of z", it's moderately painful to read. But that shouldn't stop this from going in. On Wed, May 11, 2016 at 3:38 AM, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: > # HG changeset patch > # User Pierre-Yves David <pierre-yves.david@ens-lyon.org> > # Date 1462469571 -7200 > # Thu May 05 19:32:51 2016 +0200 > # Node ID b62cf5924c3b5684cb0693fe4186dc607dcc5b50 > # Parent 685102426f222a6746960a905e2392cc2c4607da > # EXP-Topic cleanup > cleanup: replace False identity testing with an explicit token object > > The recommended way to check default value (when None is not as option) is a > token object. Identity testing to integer is less explicit and not guaranteed to > work in all implementations. > > diff -r 685102426f22 -r b62cf5924c3b mercurial/dirstate.py > --- a/mercurial/dirstate.py Wed May 11 09:31:47 2016 +0200 > +++ b/mercurial/dirstate.py Thu May 05 19:32:51 2016 +0200 > @@ -74,6 +74,8 @@ def _trypending(root, vfs, filename): > raise > return (vfs(filename), False) > > +_token = object() > + > class dirstate(object): > > def __init__(self, opener, ui, root, validate): > @@ -688,12 +690,12 @@ class dirstate(object): > self._pl = (parent, nullid) > self._dirty = True > > - def write(self, tr=False): > + def write(self, tr=_token): > if not self._dirty: > return > > filename = self._filename > - if tr is False: # not explicitly specified > + if tr is _token: # not explicitly specified > self._ui.deprecwarn('use dirstate.write with ' > 'repo.currenttransaction()', > '3.9') > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> On May 11, 2016, at 3:38 AM, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: > > # HG changeset patch > # User Pierre-Yves David <pierre-yves.david@ens-lyon.org> > # Date 1462469571 -7200 > # Thu May 05 19:32:51 2016 +0200 > # Node ID b62cf5924c3b5684cb0693fe4186dc607dcc5b50 > # Parent 685102426f222a6746960a905e2392cc2c4607da > # EXP-Topic cleanup > cleanup: replace False identity testing with an explicit token object > > The recommended way to check default value (when None is not as option) is a > token object. Identity testing to integer is less explicit and not guaranteed to > work in all implementations. > > diff -r 685102426f22 -r b62cf5924c3b mercurial/dirstate.py > --- a/mercurial/dirstate.py Wed May 11 09:31:47 2016 +0200 > +++ b/mercurial/dirstate.py Thu May 05 19:32:51 2016 +0200 > @@ -74,6 +74,8 @@ def _trypending(root, vfs, filename): > raise > return (vfs(filename), False) > > +_token = object() > + > class dirstate(object): > > def __init__(self, opener, ui, root, validate): > @@ -688,12 +690,12 @@ class dirstate(object): > self._pl = (parent, nullid) > self._dirty = True > > - def write(self, tr=False): > + def write(self, tr=_token): > if not self._dirty: > return > > filename = self._filename > - if tr is False: # not explicitly specified > + if tr is _token: # not explicitly specified I just saw some code (seriously, I tripped over it about at the same time as I saw this patch) in localrepo.py that explicitly calls dirstate.write(None). Is that a bug? It looks like a bug. I suspect this might want to just look for a truthy value to sniff out clients that are doing the wrong thing? (if not, we should maybe have some documentation that states why internal users are allowed to skip the transaction here) > self._ui.deprecwarn('use dirstate.write with ' > 'repo.currenttransaction()', > '3.9') > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
On 05/11/2016 02:34 PM, Augie Fackler wrote: >> On May 11, 2016, at 3:38 AM, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: >> >> # HG changeset patch >> # User Pierre-Yves David <pierre-yves.david@ens-lyon.org> >> # Date 1462469571 -7200 >> # Thu May 05 19:32:51 2016 +0200 >> # Node ID b62cf5924c3b5684cb0693fe4186dc607dcc5b50 >> # Parent 685102426f222a6746960a905e2392cc2c4607da >> # EXP-Topic cleanup >> cleanup: replace False identity testing with an explicit token object >> >> The recommended way to check default value (when None is not as option) is a >> token object. Identity testing to integer is less explicit and not guaranteed to >> work in all implementations. >> >> diff -r 685102426f22 -r b62cf5924c3b mercurial/dirstate.py >> --- a/mercurial/dirstate.py Wed May 11 09:31:47 2016 +0200 >> +++ b/mercurial/dirstate.py Thu May 05 19:32:51 2016 +0200 >> @@ -74,6 +74,8 @@ def _trypending(root, vfs, filename): >> raise >> return (vfs(filename), False) >> >> +_token = object() >> + >> class dirstate(object): >> >> def __init__(self, opener, ui, root, validate): >> @@ -688,12 +690,12 @@ class dirstate(object): >> self._pl = (parent, nullid) >> self._dirty = True >> >> - def write(self, tr=False): >> + def write(self, tr=_token): >> if not self._dirty: >> return >> >> filename = self._filename >> - if tr is False: # not explicitly specified >> + if tr is _token: # not explicitly specified > I just saw some code (seriously, I tripped over it about at the same time as I saw this patch) in localrepo.py that explicitly calls dirstate.write(None). Is that a bug? It looks like a bug. I suspect this might want to just look for a truthy value to sniff out clients that are doing the wrong thing? > > (if not, we should maybe have some documentation that states why internal users are allowed to skip the transaction here) Dirstate change -must- collaborate with the transaction if there is one. However, if there is no currently open transaction, it wont.
> On May 11, 2016, at 8:38 AM, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: > >>> - if tr is False: # not explicitly specified >>> + if tr is _token: # not explicitly specified >> I just saw some code (seriously, I tripped over it about at the same time as I saw this patch) in localrepo.py that explicitly calls dirstate.write(None). Is that a bug? It looks like a bug. I suspect this might want to just look for a truthy value to sniff out clients that are doing the wrong thing? >> >> (if not, we should maybe have some documentation that states why internal users are allowed to skip the transaction here) > > Dirstate change -must- collaborate with the transaction if there is one. However, if there is no currently open transaction, it wont. So, I think I’ve answered my own question (this response from you did not help in the slightest). Here’s what I think I’m seeing: localrepo.transaction() needs a way to flush any pending writes in the dirstate before a transaction starts, so that the only things which get rolled back as part of the transaction are things that happen during the transaction. As such, localrepo needs a way to flush the dirstate to disk without a transaction in play (in case it’s dirty), and the way to do that is by calling dirstate.write(None). Is that correct? If that’s correct, then we should probably just make an _ensureclean() method that the transaction layer can call that does the write without a transaction, and document why it exists. As it is, anyone that wants to bypass a transaction can do so by passing any falsey value into dirstate.write(). (I’ll probably still take this patch, but want to continue this discussion.)
On 05/11/2016 02:46 PM, Augie Fackler wrote: >> On May 11, 2016, at 8:38 AM, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: >> >>>> - if tr is False: # not explicitly specified >>>> + if tr is _token: # not explicitly specified >>> I just saw some code (seriously, I tripped over it about at the same time as I saw this patch) in localrepo.py that explicitly calls dirstate.write(None). Is that a bug? It looks like a bug. I suspect this might want to just look for a truthy value to sniff out clients that are doing the wrong thing? >>> >>> (if not, we should maybe have some documentation that states why internal users are allowed to skip the transaction here) >> Dirstate change -must- collaborate with the transaction if there is one. However, if there is no currently open transaction, it wont. > So, I think I’ve answered my own question (this response from you did not help in the slightest). Here’s what I think I’m seeing: > > localrepo.transaction() needs a way to flush any pending writes in the dirstate before a transaction starts, so that the only things which get rolled back as part of the transaction are things that happen during the transaction. As such, localrepo needs a way to flush the dirstate to disk without a transaction in play (in case it’s dirty), and the way to do that is by calling dirstate.write(None). Is that correct? That seems like a plausible explanation. I think foozy is the one who did this, I've CCed him for validation. > If that’s correct, then we should probably just make an _ensureclean() method that the transaction layer can call that does the write without a transaction, and document why it exists. As it is, anyone that wants to bypass a transaction can do so by passing any falsey value into dirstate.write(). It is not about falsey value, it is about passing None, the official way for "write outside of a transaction". I agree with you that we should should update the documentation to make it clear, but the "write" method seems to be the place were it belong. I'm not sure what would be the value of a dedicated "_ensureclean" and why it would have to be private. > (I’ll probably still take this patch, but want to continue this discussion.) That patch is not related to these dirstate.write(None) call in localrepo I'm not sure why this discussion would delay them. (discussion in itself it useful, but unrelated)
> On May 11, 2016, at 09:34, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: > >> (I’ll probably still take this patch, but want to continue this discussion.) > > That patch is not related to these dirstate.write(None) call in localrepo I'm not sure why this discussion would delay them. > (discussion in itself it useful, but unrelated) It's related in my mind, because I was having a lot of trouble understanding why we're using a sentinel object instead of just requiring callers to always pass a transaction. Now that I understand, this patch is a strict improvement (so I'll take it), but there's still some work to be done to make this a clean API (which I'll probably do sometime this week.) Thanks for bearing with me.
On 05/11/2016 04:04 PM, Augie Fackler wrote: >> On May 11, 2016, at 09:34, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: >> >>> (I’ll probably still take this patch, but want to continue this discussion.) >> That patch is not related to these dirstate.write(None) call in localrepo I'm repo._phasecache.invalidate(not sure why this discussion would delay them. >> (discussion in itself it useful, but unrelated) > It's related in my mind, because I was having a lot of trouble understanding why we're using a sentinel object instead of just requiring callers to always pass a transaction. Now that I understand, this patch is a strict improvement (so I'll take it), but there's still some work to be done to make this a clean API (which I'll probably do sometime this week.) The sentinel object is here for deprecation purpose, the argument will become mandatory after 3.9. Your last statement is a bit confusing to me, especially I'm not sure what you find "unclear" in the current API. Here is a wide explanation of the situation as an attempt to clarify. None is valid value and will remain a valid value because dirstate change do not requires a transaction. They just have to collaborate with one if present. For example `hg update` or `hg add` do not use a transaction. As you pointed out in your previous email, there is also case around the transaction code that explicitly call the write with None because it is the transaction code dirstate is collaborating with, so it is supposed to know what it is doing.
> On May 11, 2016, at 10:17, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: > > None is valid value and will remain a valid value because dirstate change do not requires a transaction. They just have to collaborate with one if present. For example `hg update` or `hg add` do not use a transaction. As you pointed out in your previous email, there is also case around the transaction code that explicitly call the write with None because it is the transaction code dirstate is collaborating with, so it is supposed to know what it is doing. Aha! This is the missing piece of understanding for me: it's valid for update() (et al) codepaths to adjust dirstate without a transaction - it's only required to pass a transaction if one is in play.[1] In summary, today we have three cases: dirstate.write(), no active transaction -> fine dirstate.write(), a transaction is active -> buggy dirstate.write(tr) -> always fine This series is merely about disabling that middle case more formally. I think, given that, the direction I'd actually rather go (but won't block this patch on), is make tr an optional argument to dirstate.write()[0], and if devel-warnings are enabled, check to see if there's a current transaction and log if so. Does that sound like a workable future approach? (My concern here is that None as a magic value for "I know what I'm doing" is going to be prone to misuse and/or abuse, and it makes the API a little weird.) 0: It'll also be legal to pass a falsey value for tr, which is another way of explicitly saying "don't use a transaction", but that's still Wrong if there's a transaction in flight, right?
On 05/11/2016 04:26 PM, Augie Fackler wrote: >> On May 11, 2016, at 10:17, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: >> >> None is valid value and will remain a valid value because dirstate change do not requires a transaction. They just have to collaborate with one if present. For example `hg update` or `hg add` do not use a transaction. As you pointed out in your previous email, there is also case around the transaction code that explicitly call the write with None because it is the transaction code dirstate is collaborating with, so it is supposed to know what it is doing. > Aha! This is the missing piece of understanding for me: it's valid for update() (et al) codepaths to adjust dirstate without a transaction - it's only required to pass a transaction if one is in play.[1] In summary, today we have three cases: > > dirstate.write(), no active transaction -> fine > dirstate.write(), a transaction is active -> buggy > dirstate.write(tr) -> always fine > > This series is merely about disabling that middle case more formally. The first case is also problematic. Unless you are in a very narrow set of code (pretty much, the strip code that explicitly forbid transaction to be in place). There is no way for any code to know if it will be use within a transaction at some point. So all code using dirstate.write() should call it using dirstate.write(repo.currenttransaction()) the currenttransaction bit will take are of returning None or the active transaction. This construct of "dirstate.write(repo.currenttransaction())" is the result of the long discussion in the middle of last year with at least Foozy and I. I do not think it is worth revisiting, but if you really feel like doing so, please go dig that these discussion out first. > I think, given that, the direction I'd actually rather go (but won't block this patch on), is make tr an optional argument to dirstate.write()[0], and if devel-warnings are enabled, check to see if there's a current transaction and log if so. Does that sound like a workable future approach? as pointed a bit earlier in this message, because you never now if code will eventually be wrapped within a transaction, all normal code should be called with "dirstate.write(repo.currenttransaction())" all the time. So making this argument mandatory is the way to go to me. We should improve the documentation to make that clear. > (My concern here is that None as a magic value for "I know what I'm doing" is going to be prone to misuse and/or abuse, and it makes the API a little weird.) "None" is not just a magic value for dirstate.write it is the official return of repo.currenttransaction() when none exist, various other part of the code rely on this and I see his as a valid python idiom. The case that explicitly pass None are narrow and we should document them properly. This is python, if programmer decide to ignore the warning sign they are responsible adult. We should make sure they see the sign. > 0: It'll also be legal to pass a falsey value for tr, which is another way of explicitly saying "don't use a transaction", but that's still Wrong if there's a transaction in flight, right? Nope, the current code make it legal but we should enforce a None checking here. I don't see a reason for a legal falsey value here.
At Wed, 11 May 2016 15:34:30 +0200, Pierre-Yves David wrote: > > On 05/11/2016 02:46 PM, Augie Fackler wrote: > >> On May 11, 2016, at 8:38 AM, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: > >> > >>>> - if tr is False: # not explicitly specified > >>>> + if tr is _token: # not explicitly specified > >>> I just saw some code (seriously, I tripped over it about at the same time as I saw this patch) in localrepo.py that explicitly calls dirstate.write(None). Is that a bug? It looks like a bug. I suspect this might want to just look for a truthy value to sniff out clients that are doing the wrong thing? > >>> > >>> (if not, we should maybe have some documentation that states why internal users are allowed to skip the transaction here) > >> Dirstate change -must- collaborate with the transaction if there is one. However, if there is no currently open transaction, it wont. > > So, I think I’ve answered my own question (this response from you did not help in the slightest). Here’s what I think I’m seeing: > > > > localrepo.transaction() needs a way to flush any pending writes in > > the dirstate before a transaction starts, so that the only things > > which get rolled back as part of the transaction are things that > > happen during the transaction. As such, localrepo needs a way to > > flush the dirstate to disk without a transaction in play (in case > > it’s dirty), and the way to do that is by calling > > dirstate.write(None). Is that correct? > > > That seems like a plausible explanation. I think foozy is the one who > did this, I've CCed him for validation. (I'm just writing reply to Augie :-)) Yes, that is correct. But strictly speaking, using 'None' as 'tr' is not for "we want to bypass transaction" but for "we can ensure there is no transaction in flight" in cases below: 1. before writing journal files out https://selenic.com/repo/hg/file/df838803c1d4/mercurial/localrepo.py#l1016 2. after successful closing transaction https://selenic.com/repo/hg/file/df838803c1d4/mercurial/localrepo.py#l1036 3. at releasing wlock https://selenic.com/repo/hg/file/df838803c1d4/mercurial/localrepo.py#l1360 (In fact, in corner case of #3, we can't assume that. But it isn't related to this series. I'll post it as new topic) BTW, largefilesdirstate class derived from dirstate also invokes dirstate.write(None). This should be replaced, too, if we introduce _ensureclean(). > > If that’s correct, then we should probably just make an _ensureclean() method that the transaction layer can call that does the write without a transaction, and document why it exists. As it is, anyone that wants to bypass a transaction can do so by passing any falsey value into dirstate.write(). > > > It is not about falsey value, it is about passing None, the official > way for "write outside of a transaction". I agree with you that we > should should update the documentation to make it clear, but the > "write" method seems to be the place were it belong. I'm not sure > what would be the value of a dedicated "_ensureclean" and why it > would have to be private. > > > (I’ll probably still take this patch, but want to continue this discussion.) > > That patch is not related to these dirstate.write(None) call in > localrepo I'm not sure why this discussion would delay them. > (discussion in itself it useful, but unrelated) > > -- > Pierre-Yves David > ---------------------------------------------------------------------- [FUJIWARA Katsunori] foozy@lares.dti.ne.jp
At Wed, 11 May 2016 10:26:37 -0400, Augie Fackler wrote: > > > On May 11, 2016, at 10:17, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: > > > > None is valid value and will remain a valid value because dirstate > > change do not requires a transaction. They just have to > > collaborate with one if present. For example `hg update` or `hg > > add` do not use a transaction. As you pointed out in your previous > > email, there is also case around the transaction code that > > explicitly call the write with None because it is the transaction > > code dirstate is collaborating with, so it is supposed to know > > what it is doing. > > Aha! This is the missing piece of understanding for me: it's valid > for update() (et al) codepaths to adjust dirstate without a > transaction - it's only required to pass a transaction if one is in > play.[1] In summary, today we have three cases: > > dirstate.write(), no active transaction -> fine > dirstate.write(), a transaction is active -> buggy > dirstate.write(tr) -> always fine > > This series is merely about disabling that middle case more formally. > > I think, given that, the direction I'd actually rather go (but won't > block this patch on), is make tr an optional argument to > dirstate.write()[0], and if devel-warnings are enabled, check to see > if there's a current transaction and log if so. Does that sound like > a workable future approach? Unfortunately, there is no easy way on dirstate.write() side to examine whether transaction is running, because dirstate doesn't have reference to repo object (even though thread local variable or so might help that idea). > (My concern here is that None as a magic value for "I know what I'm > doing" is going to be prone to misuse and/or abuse, and it makes the > API a little weird.) > > 0: It'll also be legal to pass a falsey value for tr, which is > another way of explicitly saying "don't use a transaction", but > that's still Wrong if there's a transaction in flight, right? Right. We can't easily assume there is no transaction in flight. It is reason why we should recommend to pass the result of repo.currenttransaction() to dirstate.write() (wlock, slock and transaction scope is easily nested :-<) > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel ---------------------------------------------------------------------- [FUJIWARA Katsunori] foozy@lares.dti.ne.jp
At Wed, 11 May 2016 16:43:39 +0200, Pierre-Yves David wrote: > > On 05/11/2016 04:26 PM, Augie Fackler wrote: > >> On May 11, 2016, at 10:17, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote: > >> > >> None is valid value and will remain a valid value because dirstate change do not requires a transaction. They just have to collaborate with one if present. For example `hg update` or `hg add` do not use a transaction. As you pointed out in your previous email, there is also case around the transaction code that explicitly call the write with None because it is the transaction code dirstate is collaborating with, so it is supposed to know what it is doing. > > Aha! This is the missing piece of understanding for me: it's valid for update() (et al) codepaths to adjust dirstate without a transaction - it's only required to pass a transaction if one is in play.[1] In summary, today we have three cases: > > > > dirstate.write(), no active transaction -> fine > > dirstate.write(), a transaction is active -> buggy > > dirstate.write(tr) -> always fine > > > > This series is merely about disabling that middle case more formally. > > The first case is also problematic. Unless you are in a very narrow set > of code (pretty much, the strip code that explicitly forbid transaction > to be in place). There is no way for any code to know if it will be use > within a transaction at some point. So all code using dirstate.write() > should call it using dirstate.write(repo.currenttransaction()) the > currenttransaction bit will take are of returning None or the active > transaction. > > This construct of "dirstate.write(repo.currenttransaction())" is the > result of the long discussion in the middle of last year with at least > Foozy and I. I do not think it is worth revisiting, but if you really > feel like doing so, please go dig that these discussion out first. For future reference: https://www.mercurial-scm.org/wiki/DirstateTransactionPlan ---------------------------------------------------------------------- [FUJIWARA Katsunori] foozy@lares.dti.ne.jp
Patch
diff -r 685102426f22 -r b62cf5924c3b mercurial/dirstate.py --- a/mercurial/dirstate.py Wed May 11 09:31:47 2016 +0200 +++ b/mercurial/dirstate.py Thu May 05 19:32:51 2016 +0200 @@ -74,6 +74,8 @@ def _trypending(root, vfs, filename): raise return (vfs(filename), False) +_token = object() + class dirstate(object): def __init__(self, opener, ui, root, validate): @@ -688,12 +690,12 @@ class dirstate(object): self._pl = (parent, nullid) self._dirty = True - def write(self, tr=False): + def write(self, tr=_token): if not self._dirty: return filename = self._filename - if tr is False: # not explicitly specified + if tr is _token: # not explicitly specified self._ui.deprecwarn('use dirstate.write with ' 'repo.currenttransaction()', '3.9')