Patchwork [3,of,3,V2] pack_dirstate: only invalidate mtime for files written in the last second

login
register
mail settings
Submitter Siddharth Agarwal
Date Aug. 31, 2013, 8:14 p.m.
Message ID <d932486c0285a75fcbe3.1377980061@dev1091.prn1.facebook.com>
Download mbox | patch
Permalink /patch/2297/
State Superseded
Commit 187bf2dde7c11b073c384de6a1470eba2707eefb
Headers show

Comments

Siddharth Agarwal - Aug. 31, 2013, 8:14 p.m.
# HG changeset patch
# User Siddharth Agarwal <sid0@fb.com>
# Date 1376797729 25200
#      Sat Aug 17 20:48:49 2013 -0700
# Node ID d932486c0285a75fcbe322a3028aaa5e1d5b2570
# Parent  c22d981740c0b6484e437461ade3d7dc9a01aa6b
pack_dirstate: only invalidate mtime for files written in the last second

Previously we'd place files written in the last second in the lookup set. This
can lead to pathological cases where a file always remains in the lookup set if
it gets modified before the next time status is run.

With this patch, only the mtime of those files is invalidated. This means that
if a file's size or mode changes, we can immediately declare it as modified
without needing to compare file contents.
Adrian Buehlmann - Aug. 31, 2013, 9:28 p.m.
On 2013-08-31 22:14, Siddharth Agarwal wrote:
> # HG changeset patch
> # User Siddharth Agarwal <sid0@fb.com>
> # Date 1376797729 25200
> #      Sat Aug 17 20:48:49 2013 -0700
> # Node ID d932486c0285a75fcbe322a3028aaa5e1d5b2570
> # Parent  c22d981740c0b6484e437461ade3d7dc9a01aa6b
> pack_dirstate: only invalidate mtime for files written in the last second
> 
> Previously we'd place files written in the last second in the lookup set. This
> can lead to pathological cases where a file always remains in the lookup set if
> it gets modified before the next time status is run.
> 
> With this patch, only the mtime of those files is invalidated. This means that
> if a file's size or mode changes, we can immediately declare it as modified
> without needing to compare file contents.
> 
> diff --git a/mercurial/parsers.c b/mercurial/parsers.c
> --- a/mercurial/parsers.c
> +++ b/mercurial/parsers.c
> @@ -330,7 +330,7 @@
>  			 * this. */
>  			if (PyDict_SetItem(map, k, dirstate_unset) == -1)
>  				goto bail;
> -			mode = 0, size = -1, mtime = -1;
> +			mtime = -1;
>  		}
>  		putbe32(mode, p);
>  		putbe32(size, p + 4);
> diff --git a/mercurial/pure/parsers.py b/mercurial/pure/parsers.py
> --- a/mercurial/pure/parsers.py
> +++ b/mercurial/pure/parsers.py
> @@ -100,11 +100,11 @@
>              # systems with a granularity of 1 sec). This commonly happens
>              # for at least a couple of files on 'update'.
>              # The user could change the file without changing its size
> -            # within the same second. Invalidate the file's stat data in
> +            # within the same second. Invalidate the file's mtime in
>              # dirstate, forcing future 'status' calls to compare the
> -            # contents of the file. This prevents mistakenly treating such
> -            # files as clean.
> -            e = (e[0], 0, -1, -1)   # mark entry as 'unset'
> +            # contents of the file if the size is the same. This prevents
> +            # mistakenly treating such files as clean.
> +            e = (e[0], e[1], e[2], -1)
>              dmap[f] = e
>  
>          if f in copymap:

Clever.

If we won't be able to resolve the time, we only force the mtime value
entry to invalid state. "Problem with time" -> "tweak time only". Elegant.

Patch looks correct to me.
Augie Fackler - Sept. 3, 2013, 7:01 p.m.
On Sat, Aug 31, 2013 at 01:14:21PM -0700, Siddharth Agarwal wrote:
> # HG changeset patch
> # User Siddharth Agarwal <sid0@fb.com>
> # Date 1376797729 25200
> #      Sat Aug 17 20:48:49 2013 -0700
> # Node ID d932486c0285a75fcbe322a3028aaa5e1d5b2570
> # Parent  c22d981740c0b6484e437461ade3d7dc9a01aa6b
> pack_dirstate: only invalidate mtime for files written in the last second

Queued the lot. My head hurts.

>
> Previously we'd place files written in the last second in the lookup set. This
> can lead to pathological cases where a file always remains in the lookup set if
> it gets modified before the next time status is run.
>
> With this patch, only the mtime of those files is invalidated. This means that
> if a file's size or mode changes, we can immediately declare it as modified
> without needing to compare file contents.
>
> diff --git a/mercurial/parsers.c b/mercurial/parsers.c
> --- a/mercurial/parsers.c
> +++ b/mercurial/parsers.c
> @@ -330,7 +330,7 @@
>                        * this. */
>                       if (PyDict_SetItem(map, k, dirstate_unset) == -1)
>                               goto bail;
> -			mode = 0, size = -1, mtime = -1;
> +			mtime = -1;
>               }
>               putbe32(mode, p);
>               putbe32(size, p + 4);
> diff --git a/mercurial/pure/parsers.py b/mercurial/pure/parsers.py
> --- a/mercurial/pure/parsers.py
> +++ b/mercurial/pure/parsers.py
> @@ -100,11 +100,11 @@
>              # systems with a granularity of 1 sec). This commonly happens
>              # for at least a couple of files on 'update'.
>              # The user could change the file without changing its size
> -            # within the same second. Invalidate the file's stat data in
> +            # within the same second. Invalidate the file's mtime in
>              # dirstate, forcing future 'status' calls to compare the
> -            # contents of the file. This prevents mistakenly treating such
> -            # files as clean.
> -            e = (e[0], 0, -1, -1)   # mark entry as 'unset'
> +            # contents of the file if the size is the same. This prevents
> +            # mistakenly treating such files as clean.
> +            e = (e[0], e[1], e[2], -1)
>              dmap[f] = e
>
>          if f in copymap:
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Augie Fackler - Sept. 3, 2013, 7:05 p.m.
Dropping this series because check-code says hi.

On Tue, Sep 3, 2013 at 3:01 PM, Augie Fackler <raf@durin42.com> wrote:
> On Sat, Aug 31, 2013 at 01:14:21PM -0700, Siddharth Agarwal wrote:
>> # HG changeset patch
>> # User Siddharth Agarwal <sid0@fb.com>
>> # Date 1376797729 25200
>> #      Sat Aug 17 20:48:49 2013 -0700
>> # Node ID d932486c0285a75fcbe322a3028aaa5e1d5b2570
>> # Parent  c22d981740c0b6484e437461ade3d7dc9a01aa6b
>> pack_dirstate: only invalidate mtime for files written in the last second
>
> Queued the lot. My head hurts.
>
>>
>> Previously we'd place files written in the last second in the lookup set. This
>> can lead to pathological cases where a file always remains in the lookup set if
>> it gets modified before the next time status is run.
>>
>> With this patch, only the mtime of those files is invalidated. This means that
>> if a file's size or mode changes, we can immediately declare it as modified
>> without needing to compare file contents.
>>
>> diff --git a/mercurial/parsers.c b/mercurial/parsers.c
>> --- a/mercurial/parsers.c
>> +++ b/mercurial/parsers.c
>> @@ -330,7 +330,7 @@
>>                        * this. */
>>                       if (PyDict_SetItem(map, k, dirstate_unset) == -1)
>>                               goto bail;
>> -                     mode = 0, size = -1, mtime = -1;
>> +                     mtime = -1;
>>               }
>>               putbe32(mode, p);
>>               putbe32(size, p + 4);
>> diff --git a/mercurial/pure/parsers.py b/mercurial/pure/parsers.py
>> --- a/mercurial/pure/parsers.py
>> +++ b/mercurial/pure/parsers.py
>> @@ -100,11 +100,11 @@
>>              # systems with a granularity of 1 sec). This commonly happens
>>              # for at least a couple of files on 'update'.
>>              # The user could change the file without changing its size
>> -            # within the same second. Invalidate the file's stat data in
>> +            # within the same second. Invalidate the file's mtime in
>>              # dirstate, forcing future 'status' calls to compare the
>> -            # contents of the file. This prevents mistakenly treating such
>> -            # files as clean.
>> -            e = (e[0], 0, -1, -1)   # mark entry as 'unset'
>> +            # contents of the file if the size is the same. This prevents
>> +            # mistakenly treating such files as clean.
>> +            e = (e[0], e[1], e[2], -1)
>>              dmap[f] = e
>>
>>          if f in copymap:
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@selenic.com
>> http://selenic.com/mailman/listinfo/mercurial-devel
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/parsers.c b/mercurial/parsers.c
--- a/mercurial/parsers.c
+++ b/mercurial/parsers.c
@@ -330,7 +330,7 @@ 
 			 * this. */
 			if (PyDict_SetItem(map, k, dirstate_unset) == -1)
 				goto bail;
-			mode = 0, size = -1, mtime = -1;
+			mtime = -1;
 		}
 		putbe32(mode, p);
 		putbe32(size, p + 4);
diff --git a/mercurial/pure/parsers.py b/mercurial/pure/parsers.py
--- a/mercurial/pure/parsers.py
+++ b/mercurial/pure/parsers.py
@@ -100,11 +100,11 @@ 
             # systems with a granularity of 1 sec). This commonly happens
             # for at least a couple of files on 'update'.
             # The user could change the file without changing its size
-            # within the same second. Invalidate the file's stat data in
+            # within the same second. Invalidate the file's mtime in
             # dirstate, forcing future 'status' calls to compare the
-            # contents of the file. This prevents mistakenly treating such
-            # files as clean.
-            e = (e[0], 0, -1, -1)   # mark entry as 'unset'
+            # contents of the file if the size is the same. This prevents
+            # mistakenly treating such files as clean.
+            e = (e[0], e[1], e[2], -1)
             dmap[f] = e
 
         if f in copymap: