Patchwork [1,of,3] dirstate: add begin/endparentchange to dirstate

login
register
mail settings
Submitter Durham Goode
Date Sept. 5, 2014, 7:36 p.m.
Message ID <8c6d502c92c8ed8dfa79.1409945779@dev2000.prn2.facebook.com>
Download mbox | patch
Permalink /patch/5705/
State Superseded
Commit 12bc7f06fc4151cd9951aad5ce77bb310b04f0a2
Headers show

Comments

Durham Goode - Sept. 5, 2014, 7:36 p.m.
# HG changeset patch
# User Durham Goode <durham@fb.com>
# Date 1409942069 25200
#      Fri Sep 05 11:34:29 2014 -0700
# Node ID 8c6d502c92c8ed8dfa79b8fac4c8a2cc9e6c4153
# Parent  188b8aa2120b03eead618ba150319074f4e3b42b
dirstate: add begin/endparentchange to dirstate

It's possible for the dirstate to become incoherent (issue4353) if there is an
exception in the middle of the dirstate parent and entries being written (like
if the user ctrl+c's). This change adds begin/endparentchange which a future
patch will require to be set before changing the dirstate parent.  This will
allow us to prevent writing the dirstate in the event of an exception while
changing the parent.
Mads Kiilerich - Sept. 8, 2014, 6:27 p.m.
On 09/05/2014 09:36 PM, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham@fb.com>
> # Date 1409942069 25200
> #      Fri Sep 05 11:34:29 2014 -0700
> # Node ID 8c6d502c92c8ed8dfa79b8fac4c8a2cc9e6c4153
> # Parent  188b8aa2120b03eead618ba150319074f4e3b42b
> dirstate: add begin/endparentchange to dirstate
>
> It's possible for the dirstate to become incoherent (issue4353) if there is an
> exception in the middle of the dirstate parent and entries being written (like
> if the user ctrl+c's). This change adds begin/endparentchange which a future
> patch will require to be set before changing the dirstate parent.  This will
> allow us to prevent writing the dirstate in the event of an exception while
> changing the parent.
>
> diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
> --- a/mercurial/dirstate.py
> +++ b/mercurial/dirstate.py
> @@ -44,6 +44,17 @@
>           self._lastnormaltime = 0
>           self._ui = ui
>           self._filecache = {}
> +        self._parentwriters = 0
> +
> +    def beginparentchange(self):
> +        self._parentwriters += 1
> +
> +    def endparentchange(self):
> +        if self._parentwriters > 0:
> +            self._parentwriters -= 1

The primary use case for this might be parent changes but it seems like 
it is a general transaction concept that should have a more generic name.

> +
> +    def pendingparentchange(self):
> +        return self._parentwriters > 0
>   
>       @propertycache
>       def _map(self):
> @@ -300,6 +311,7 @@
>                   delattr(self, a)
>           self._lastnormaltime = 0
>           self._dirty = False
> +        self._parentwriters = 0
>   
>       def copy(self, source, dest):
>           """Mark dest as a copy of source. Unmark dest if source is None."""
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -1108,7 +1108,11 @@
>               return l
>   
>           def unlock():
> -            self.dirstate.write()
> +            if self.dirstate.pendingparentchange():
> +                self.dirstate.invalidate()
> +            else:
> +                self.dirstate.write()
> +

Shouldn't this "don't write with pending parent change" live inside 
dirstate? Why expose it?

I also wonder if it would be a good idea to have some built-in 
verification that the count reaches 0 unless we are in an exceptional 
case. That would probably require some way of indicating whether it is a 
normal or exceptional unlock we are doing ... which perhaps could be 
used by itself instead of this semaphore. Hmm ... I don't know where 
this is going. I will keep wondering.

/Mads
Durham Goode - Sept. 8, 2014, 6:42 p.m.
On 9/8/14, 6:27 PM, "Mads Kiilerich" <mads@kiilerich.com> wrote:

>On 09/05/2014 09:36 PM, Durham Goode wrote:

>> # HG changeset patch

>> # User Durham Goode <durham@fb.com>

>> # Date 1409942069 25200

>> #      Fri Sep 05 11:34:29 2014 -0700

>> # Node ID 8c6d502c92c8ed8dfa79b8fac4c8a2cc9e6c4153

>> # Parent  188b8aa2120b03eead618ba150319074f4e3b42b

>> dirstate: add begin/endparentchange to dirstate

>>

>> It's possible for the dirstate to become incoherent (issue4353) if 

>>there is an

>> exception in the middle of the dirstate parent and entries being 

>>written (like

>> if the user ctrl+c's). This change adds begin/endparentchange which a 

>>future

>> patch will require to be set before changing the dirstate parent.  This 

>>will

>> allow us to prevent writing the dirstate in the event of an exception 

>>while

>> changing the parent.

>>

>> diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py

>> --- a/mercurial/dirstate.py

>> +++ b/mercurial/dirstate.py

>> @@ -44,6 +44,17 @@

>>           self._lastnormaltime = 0

>>           self._ui = ui

>>           self._filecache = {}

>> +        self._parentwriters = 0

>> +

>> +    def beginparentchange(self):

>> +        self._parentwriters += 1

>> +

>> +    def endparentchange(self):

>> +        if self._parentwriters > 0:

>> +            self._parentwriters -= 1

>

>The primary use case for this might be parent changes but it seems like 

>it is a general transaction concept that should have a more generic name.


It’s not fully transactional, so I didn’t want to give the wrong 
impression.  For instance, if someone explicitly calls dirstate.write() 
during the middle of this, it will still write to disk.

>

>> +

>> +    def pendingparentchange(self):

>> +        return self._parentwriters > 0

>>   

>>       @propertycache

>>       def _map(self):

>> @@ -300,6 +311,7 @@

>>                   delattr(self, a)

>>           self._lastnormaltime = 0

>>           self._dirty = False

>> +        self._parentwriters = 0

>>   

>>       def copy(self, source, dest):

>>           """Mark dest as a copy of source. Unmark dest if source is 

>>None."""

>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py

>> --- a/mercurial/localrepo.py

>> +++ b/mercurial/localrepo.py

>> @@ -1108,7 +1108,11 @@

>>               return l

>>   

>>           def unlock():

>> -            self.dirstate.write()

>> +            if self.dirstate.pendingparentchange():

>> +                self.dirstate.invalidate()

>> +            else:

>> +                self.dirstate.write()

>> +

>

>Shouldn't this "don't write with pending parent change" live inside 

>dirstate? Why expose it?


Because certain functionality relies on calling dirstate.write halfway 
through (like rebase). So this allows callers to treat dirstate.write like 
a dirstate.commit type function, then wlock makes a decision on whether to 
auto-commit or not based on if there was an exception or not (I.e. if 
pendingparentchanges == 0).

>

>I also wonder if it would be a good idea to have some built-in 

>verification that the count reaches 0 unless we are in an exceptional 

>case. That would probably require some way of indicating whether it is a 

>normal or exceptional unlock we are doing ... which perhaps could be 

>used by itself instead of this semaphore. Hmm ... I don't know where 

>this is going. I will keep wondering.


I originally wanted more transactional semantics:

Try:
    Dirstate.begin()


    Dirstate.commit()
Finally:
    Dirstate.close()

But that would introduce a lot of duplication (with wlock.begin/release) 
and Matt wanted to still allow the one off dirstate changes (like hg add) 
without the begin/end stuff

>

>/Mads

>

Patch

diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -44,6 +44,17 @@ 
         self._lastnormaltime = 0
         self._ui = ui
         self._filecache = {}
+        self._parentwriters = 0
+
+    def beginparentchange(self):
+        self._parentwriters += 1
+
+    def endparentchange(self):
+        if self._parentwriters > 0:
+            self._parentwriters -= 1
+
+    def pendingparentchange(self):
+        return self._parentwriters > 0
 
     @propertycache
     def _map(self):
@@ -300,6 +311,7 @@ 
                 delattr(self, a)
         self._lastnormaltime = 0
         self._dirty = False
+        self._parentwriters = 0
 
     def copy(self, source, dest):
         """Mark dest as a copy of source. Unmark dest if source is None."""
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1108,7 +1108,11 @@ 
             return l
 
         def unlock():
-            self.dirstate.write()
+            if self.dirstate.pendingparentchange():
+                self.dirstate.invalidate()
+            else:
+                self.dirstate.write()
+
             self._filecache['dirstate'].refresh()
 
         l = self._lock(self.vfs, "wlock", wait, unlock,