Patchwork [09,of,10] repair: copy non-revlog store files during upgrade

login
register
mail settings
Submitter Gregory Szorc
Date Nov. 6, 2016, 4:40 a.m.
Message ID <3d4dd237b705479f8c74.1478407225@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/17372/
State Changes Requested
Headers show

Comments

Gregory Szorc - Nov. 6, 2016, 4:40 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1478392019 25200
#      Sat Nov 05 17:26:59 2016 -0700
# Node ID 3d4dd237b705479f8c7475b821ae719893381fa8
# Parent  d2261c558ca9639fb81c182de15d75151cbad0f9
repair: copy non-revlog store files during upgrade

The store contains more than just revlogs. This patch teaches the
upgrade code to copy regular files as well.

As the test changes demonstrate, the phaseroots file is now copied.
Augie Fackler - Nov. 21, 2016, 10:08 p.m.
On Sat, Nov 05, 2016 at 09:40:25PM -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1478392019 25200
> #      Sat Nov 05 17:26:59 2016 -0700
> # Node ID 3d4dd237b705479f8c7475b821ae719893381fa8
> # Parent  d2261c558ca9639fb81c182de15d75151cbad0f9
> repair: copy non-revlog store files during upgrade
>
> The store contains more than just revlogs. This patch teaches the
> upgrade code to copy regular files as well.
>
> As the test changes demonstrate, the phaseroots file is now copied.
>
> diff --git a/mercurial/repair.py b/mercurial/repair.py
> --- a/mercurial/repair.py
> +++ b/mercurial/repair.py
> @@ -10,6 +10,7 @@ from __future__ import absolute_import
>
>  import errno
>  import hashlib
> +import stat
>  import tempfile
>  import time
>
> @@ -622,6 +623,39 @@ def _copyrevlogs(ui, srcrepo, dstrepo, t
>                 'across %d revlogs and %d revisions\n') % (
>               dstsize, dstsize - srcsize, rlcount, revcount))
>
> +def _upgradefilterstorefile(srcrepo, dstrepo, requirements, path, mode, st):
> +    """Determine whether to copy a store file during upgrade.
> +
> +    This function is called when migrating store files from ``srcrepo`` to
> +    ``dstrepo`` as part of upgrading a repository. The function receives
> +    the ``requirements`` for ``dstrepo``, the relative ``path`` of the
> +    store file being examined, its ``ST_MODE`` file type, and a full
> +    stat data structure.

This args list is pretty substantial. Can I interest you in adopting
the style used at
https://www.mercurial-scm.org/repo/hg/file/tip/mercurial/manifest.py#l526
or something similar that would give a more bullet-list-like
experience for enumerating the arguments and their types and meanings?

> +
> +    Function should return ``True`` if the file is to be copied.
> +    """
> +    # Skip revlogs.
> +    if path.endswith(('.i', '.d')):
> +        return False
> +    # Skip transaction related files.
> +    if path.startswith('undo'):
> +        return False
> +    # Only copy regular files.
> +    if mode != stat.S_IFREG:
> +        return False
> +    # Skip other skipped files.
> +    if path in ('lock', 'fncache'):
> +        return False
> +
> +    return True

I'm unclear which way we should bias. This is probably better than not
copying anything we don't recognize though.

This appears to only be files inside store. What about localtags and
bookmarks? I suppose it doesn't matter, since this is a store-level
upgrade that shouldn't change hashes.

> +
> +def _upgradefinishdatamigration(ui, srcrepo, dstrepo, requirements):
> +    """Hook point for extensions to perform additional actions during upgrade.
> +
> +    This function is called after revlogs and store files have been copied but
> +    before the new store is swapped into the original location.
> +    """
> +
>  def _upgraderepo(ui, srcrepo, dstrepo, requirements):
>      """Do the low-level work of upgrading a repository.
>
> @@ -641,7 +675,18 @@ def _upgraderepo(ui, srcrepo, dstrepo, r
>      with dstrepo.transaction('upgrade') as tr:
>          _copyrevlogs(ui, srcrepo, dstrepo, tr)
>
> -        # TODO copy non-revlog store files
> +    # Now copy other files in the store directory.
> +    for p, kind, st in srcrepo.store.vfs.readdir('', stat=True):
> +        if not _upgradefilterstorefile(srcrepo, dstrepo, requirements,
> +                                       p, kind, st):
> +            continue
> +
> +        srcrepo.ui.write(_('copying %s\n') % p)
> +        src = srcrepo.store.vfs.join(p)
> +        dst = dstrepo.store.vfs.join(p)
> +        util.copyfile(src, dst, copystat=True)
> +
> +    _upgradefinishdatamigration(ui, srcrepo, dstrepo, requirements)
>
>      ui.write(_('data fully migrated to temporary repository\n'))
>
> diff --git a/tests/test-upgrade-repo.t b/tests/test-upgrade-repo.t
> --- a/tests/test-upgrade-repo.t
> +++ b/tests/test-upgrade-repo.t
> @@ -149,6 +149,7 @@ Upgrading a repository to generaldelta w
>    migrating manifests...
>    migrating changelog...
>    revlogs migration complete; wrote 917 bytes (delta 0 bytes) across 5 revlogs and 9 revisions
> +  copying phaseroots
>    data fully migrated to temporary repository
>    starting in-place swap of repository data
>    (clients may error or see inconsistent repository data until this operation completes)
> @@ -182,6 +183,7 @@ store directory has files we expect
>    00manifest.i
>    data
>    fncache
> +  phaseroots
>    undo
>    undo.backupfiles
>    undo.phaseroots
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Pierre-Yves David - Nov. 22, 2016, 2:24 a.m.
On 11/21/2016 11:08 PM, Augie Fackler wrote:
> On Sat, Nov 05, 2016 at 09:40:25PM -0700, Gregory Szorc wrote:
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc@gmail.com>
>> # Date 1478392019 25200
>> #      Sat Nov 05 17:26:59 2016 -0700
>> # Node ID 3d4dd237b705479f8c7475b821ae719893381fa8
>> # Parent  d2261c558ca9639fb81c182de15d75151cbad0f9
>> repair: copy non-revlog store files during upgrade
>>
>> The store contains more than just revlogs. This patch teaches the
>> upgrade code to copy regular files as well.
>>
>> As the test changes demonstrate, the phaseroots file is now copied.
>>
>> diff --git a/mercurial/repair.py b/mercurial/repair.py
>> --- a/mercurial/repair.py
>> +++ b/mercurial/repair.py
>> @@ -10,6 +10,7 @@ from __future__ import absolute_import
>>
>>  import errno
>>  import hashlib
>> +import stat
>>  import tempfile
>>  import time
>>
>> @@ -622,6 +623,39 @@ def _copyrevlogs(ui, srcrepo, dstrepo, t
>>                 'across %d revlogs and %d revisions\n') % (
>>               dstsize, dstsize - srcsize, rlcount, revcount))
>>
>> +def _upgradefilterstorefile(srcrepo, dstrepo, requirements, path, mode, st):
>> +    """Determine whether to copy a store file during upgrade.
>> +
>> +    This function is called when migrating store files from ``srcrepo`` to
>> +    ``dstrepo`` as part of upgrading a repository. The function receives
>> +    the ``requirements`` for ``dstrepo``, the relative ``path`` of the
>> +    store file being examined, its ``ST_MODE`` file type, and a full
>> +    stat data structure.
>
> This args list is pretty substantial. Can I interest you in adopting
> the style used at
> https://www.mercurial-scm.org/repo/hg/file/tip/mercurial/manifest.py#l526
> or something similar that would give a more bullet-list-like
> experience for enumerating the arguments and their types and meanings?

+1

>> +
>> +    Function should return ``True`` if the file is to be copied.
>> +    """
>> +    # Skip revlogs.
>> +    if path.endswith(('.i', '.d')):
>> +        return False
>> +    # Skip transaction related files.
>> +    if path.startswith('undo'):
>> +        return False
>> +    # Only copy regular files.
>> +    if mode != stat.S_IFREG:
>> +        return False
>> +    # Skip other skipped files.
>> +    if path in ('lock', 'fncache'):
>> +        return False
>> +
>> +    return True
>
> I'm unclear which way we should bias. This is probably better than not
> copying anything we don't recognize though.

I might lean the other way, if an extension is adding data to the store, 
it would be better to blindly copy unknown content of the store to avoid 
discarding them. That would work great for extension which add new kind 
of information (eg: old evolve). The only case this would go bad is if 
an extension is directly referencing data in '.i'/'.d' from another 
file. But I've no such example in mind (but Jun probably have a secret 
laboratory under the Thames where he torture '.d' in such manners).

> This appears to only be files inside store. What about localtags and
> bookmarks? I suppose it doesn't matter, since this is a store-level
> upgrade that shouldn't change hashes.

as far as I understand we only touch the store so these one are just fine.

Other file we might worry about are the caches, some of them might rely 
on file size for validation and get confused. Greg, can you assert the 
situation regarding cache and come with a good story ?

>> +def _upgradefinishdatamigration(ui, srcrepo, dstrepo, requirements):
>> +    """Hook point for extensions to perform additional actions during upgrade.
>> +
>> +    This function is called after revlogs and store files have been copied but
>> +    before the new store is swapped into the original location.
>> +    """

That's "working" but we might consider going straight to the 
"list+dict+decorator" scheme we use in a couple of place (notable in 
exchange.py. It got created because the pure wrapping approach was often 
too limited.

>>  def _upgraderepo(ui, srcrepo, dstrepo, requirements):
>>      """Do the low-level work of upgrading a repository.
>>
>> @@ -641,7 +675,18 @@ def _upgraderepo(ui, srcrepo, dstrepo, r
>>      with dstrepo.transaction('upgrade') as tr:
>>          _copyrevlogs(ui, srcrepo, dstrepo, tr)
>>
>> -        # TODO copy non-revlog store files
>> +    # Now copy other files in the store directory.
>> +    for p, kind, st in srcrepo.store.vfs.readdir('', stat=True):
>> +        if not _upgradefilterstorefile(srcrepo, dstrepo, requirements,
>> +                                       p, kind, st):
>> +            continue
>> +
>> +        srcrepo.ui.write(_('copying %s\n') % p)
>> +        src = srcrepo.store.vfs.join(p)
>> +        dst = dstrepo.store.vfs.join(p)
>> +        util.copyfile(src, dst, copystat=True)
>> +
>> +    _upgradefinishdatamigration(ui, srcrepo, dstrepo, requirements)
>>
>>      ui.write(_('data fully migrated to temporary repository\n'))
>>
>> diff --git a/tests/test-upgrade-repo.t b/tests/test-upgrade-repo.t
>> --- a/tests/test-upgrade-repo.t
>> +++ b/tests/test-upgrade-repo.t
>> @@ -149,6 +149,7 @@ Upgrading a repository to generaldelta w
>>    migrating manifests...
>>    migrating changelog...
>>    revlogs migration complete; wrote 917 bytes (delta 0 bytes) across 5 revlogs and 9 revisions
>> +  copying phaseroots
>>    data fully migrated to temporary repository
>>    starting in-place swap of repository data
>>    (clients may error or see inconsistent repository data until this operation completes)
>> @@ -182,6 +183,7 @@ store directory has files we expect
>>    00manifest.i
>>    data
>>    fncache
>> +  phaseroots
>>    undo
>>    undo.backupfiles
>>    undo.phaseroots
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel@mercurial-scm.org
>> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>

Patch

diff --git a/mercurial/repair.py b/mercurial/repair.py
--- a/mercurial/repair.py
+++ b/mercurial/repair.py
@@ -10,6 +10,7 @@  from __future__ import absolute_import
 
 import errno
 import hashlib
+import stat
 import tempfile
 import time
 
@@ -622,6 +623,39 @@  def _copyrevlogs(ui, srcrepo, dstrepo, t
                'across %d revlogs and %d revisions\n') % (
              dstsize, dstsize - srcsize, rlcount, revcount))
 
+def _upgradefilterstorefile(srcrepo, dstrepo, requirements, path, mode, st):
+    """Determine whether to copy a store file during upgrade.
+
+    This function is called when migrating store files from ``srcrepo`` to
+    ``dstrepo`` as part of upgrading a repository. The function receives
+    the ``requirements`` for ``dstrepo``, the relative ``path`` of the
+    store file being examined, its ``ST_MODE`` file type, and a full
+    stat data structure.
+
+    Function should return ``True`` if the file is to be copied.
+    """
+    # Skip revlogs.
+    if path.endswith(('.i', '.d')):
+        return False
+    # Skip transaction related files.
+    if path.startswith('undo'):
+        return False
+    # Only copy regular files.
+    if mode != stat.S_IFREG:
+        return False
+    # Skip other skipped files.
+    if path in ('lock', 'fncache'):
+        return False
+
+    return True
+
+def _upgradefinishdatamigration(ui, srcrepo, dstrepo, requirements):
+    """Hook point for extensions to perform additional actions during upgrade.
+
+    This function is called after revlogs and store files have been copied but
+    before the new store is swapped into the original location.
+    """
+
 def _upgraderepo(ui, srcrepo, dstrepo, requirements):
     """Do the low-level work of upgrading a repository.
 
@@ -641,7 +675,18 @@  def _upgraderepo(ui, srcrepo, dstrepo, r
     with dstrepo.transaction('upgrade') as tr:
         _copyrevlogs(ui, srcrepo, dstrepo, tr)
 
-        # TODO copy non-revlog store files
+    # Now copy other files in the store directory.
+    for p, kind, st in srcrepo.store.vfs.readdir('', stat=True):
+        if not _upgradefilterstorefile(srcrepo, dstrepo, requirements,
+                                       p, kind, st):
+            continue
+
+        srcrepo.ui.write(_('copying %s\n') % p)
+        src = srcrepo.store.vfs.join(p)
+        dst = dstrepo.store.vfs.join(p)
+        util.copyfile(src, dst, copystat=True)
+
+    _upgradefinishdatamigration(ui, srcrepo, dstrepo, requirements)
 
     ui.write(_('data fully migrated to temporary repository\n'))
 
diff --git a/tests/test-upgrade-repo.t b/tests/test-upgrade-repo.t
--- a/tests/test-upgrade-repo.t
+++ b/tests/test-upgrade-repo.t
@@ -149,6 +149,7 @@  Upgrading a repository to generaldelta w
   migrating manifests...
   migrating changelog...
   revlogs migration complete; wrote 917 bytes (delta 0 bytes) across 5 revlogs and 9 revisions
+  copying phaseroots
   data fully migrated to temporary repository
   starting in-place swap of repository data
   (clients may error or see inconsistent repository data until this operation completes)
@@ -182,6 +183,7 @@  store directory has files we expect
   00manifest.i
   data
   fncache
+  phaseroots
   undo
   undo.backupfiles
   undo.phaseroots