Patchwork debug: automate the process of truncating a damaged obsstore

Submitter Simon Farnsworth
Date June 30, 2016, 2:04 p.m.
Message ID <252cefa0326063bd664f.1467295449@devvm631.lla1.facebook.com>
Permalink /patch/15695/
State Changes Requested
Delegated to: Yuya Nishihara

Comments

Simon Farnsworth - June 30, 2016, 2:04 p.m.
# HG changeset patch
# User Simon Farnsworth <simonfar@fb.com>
# Date 1467295436 25200
#      Thu Jun 30 07:03:56 2016 -0700
# Node ID 252cefa0326063bd664f47e3628942df30d08ccf
# Parent  c42a3fd5c1fc5193e5f45887bfddaf05ca977fa4
debug: automate the process of truncating a damaged obsstore

We occasionally see users who've had a system crash damage the obsstore file
in their .hg/store directory; this makes all `hg` commands fail until we go
in and remove the damaged section of the obsstore by hand.

Automate the process we use when this happens, as a debug command because it
loses the corrupted data. We only use it in rare circumstances when it's
important to retrieve a user's work and apply it to a fresh clone.
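
As an illustration of the manual process described above, here is a minimal standalone sketch of "drop trailing bytes until the file parses again". It does not use Mercurial: the record layout (one version byte followed by length-prefixed records) and the names `parse_records` and `trim_to_valid` are assumptions made up for this example and are not the real obsstore format or API.

```python
def parse_records(data):
    """Parse a toy store: 1 version byte, then (1-byte length, payload) records.

    Hypothetical format for illustration only. Raises ValueError if the
    data is empty or ends in the middle of a record.
    """
    if not data:
        raise ValueError("missing version byte")
    version = data[0]
    records = []
    off = 1
    while off < len(data):
        length = data[off]
        payload = data[off + 1:off + 1 + length]
        if len(payload) != length:
            raise ValueError("truncated record at offset %d" % off)
        records.append(payload)
        off += 1 + length
    return version, records


def trim_to_valid(data, parse=parse_records):
    """Drop trailing bytes until parse() accepts the data.

    Returns (trimmed_data, was_corrupt). Every failed attempt re-parses
    the whole buffer, so this is quadratic in the worst case.
    """
    corrupt = False
    while data:
        try:
            parse(data)
            return data, corrupt
        except ValueError:
            corrupt = True
            data = data[:-1]
    return data, corrupt
```

Calling `trim_to_valid` on a buffer whose last record was cut short returns the bytes up to the last complete record together with `True`; an intact buffer comes back unchanged with `False`.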
Yuya Nishihara - July 3, 2016, 3:59 a.m.
On Thu, 30 Jun 2016 07:04:09 -0700, Simon Farnsworth wrote:
> # HG changeset patch
> # User Simon Farnsworth <simonfar@fb.com>
> # Date 1467295436 25200
> #      Thu Jun 30 07:03:56 2016 -0700
> # Node ID 252cefa0326063bd664f47e3628942df30d08ccf
> # Parent  c42a3fd5c1fc5193e5f45887bfddaf05ca977fa4
> debug: automate the process of truncating a damaged obsstore

(issue5265)

> +@command('debugtruncatestore',
> +    [('', 'obsolete', None, _('truncate bad markers in obsstore'))],
> +    _('[OPTION]'))
> +def debugtruncatestore(ui, repo, **opts):
> +    """Fix up repository corruption by truncating damaged files
> +
> +    Most on-disk data structures are designed to be append-only. A failed write
> +    (e.g. due to an unexpected power failure) can leave the file corrupted.
> +
> +    This command attempts to recover from that situation by replacing the
> +    corrupted file with a version that only contains the valid records from the
> +    broken file.
> +
> +    You should normally use :hg:`recover` before resorting to this command.
> +    """
> +
> +    if 'obsolete' in opts:
> +        data = repo.svfs.tryread('obsstore')
> +        if data:
> +            # Slow algorithm - but this is an emergency debug operation
> +            version = None
> +            corrupt = False
> +            while version is None:
> +                try:
> +                    (version, markers) = obsolete._readmarkers(data)
> +                except ValueError:
> +                    corrupt = True
> +                    version = None
> +                    data = data[:-1]
> +                    continue
> +                break
> +            if corrupt:
> +                repo.svfs.write('obsstore', data)
> +                ui.write(_('truncated obsstore\n'))
> +            else:
> +                ui.write(_('no corruption\n'))

It should be covered by a lock. I'm not sure if a transaction is necessary.

And can you add a test? I want to run it with/without --pure to see both
cases are handled well.
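
The "Slow algorithm" comment in the quoted patch refers to re-parsing the whole file after each removed byte. If the record boundaries can be walked directly, the last good offset can be found in a single pass. The sketch below shows this on a made-up layout (one version byte, then records with a one-byte length prefix); the layout and the name `valid_prefix_length` are assumptions for illustration, and whether the real obsstore parsers allow the same shortcut is not established in this thread. It only detects a record that runs past the end of the file, not damage in the middle.

```python
def valid_prefix_length(data):
    """Return the length of the longest prefix made of whole records.

    Hypothetical toy format: 1 version byte, then (1-byte length, payload)
    records. A single forward scan, so linear in the size of the data.
    """
    if not data:
        return 0
    off = 1  # skip the version byte
    while off < len(data):
        end = off + 1 + data[off]  # length byte plus payload
        if end > len(data):
            break  # this record runs past the end of the file
        off = end
    return off
```

The caller would then keep `data[:valid_prefix_length(data)]` and treat the store as damaged whenever that is shorter than `data`.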
Simon Farnsworth - July 4, 2016, 3:20 p.m.
On 03/07/2016 04:59, Yuya Nishihara wrote:
> On Thu, 30 Jun 2016 07:04:09 -0700, Simon Farnsworth wrote:
>> # HG changeset patch
>> # User Simon Farnsworth <simonfar@fb.com>
>> # Date 1467295436 25200
>> #      Thu Jun 30 07:03:56 2016 -0700
>> # Node ID 252cefa0326063bd664f47e3628942df30d08ccf
>> # Parent  c42a3fd5c1fc5193e5f45887bfddaf05ca977fa4
>> debug: automate the process of truncating a damaged obsstore
>
> (issue5265)
>
Added to end of line.

>> +@command('debugtruncatestore',
>> +    [('', 'obsolete', None, _('truncate bad markers in obsstore'))],
>> +    _('[OPTION]'))
>> +def debugtruncatestore(ui, repo, **opts):
>> +    """Fix up repository corruption by truncating damaged files
>> +
>> +    Most on-disk data structures are designed to be append-only. A failed write
>> +    (e.g. due to an unexpected power failure) can leave the file corrupted.
>> +
>> +    This command attempts to recover from that situation by replacing the
>> +    corrupted file with a version that only contains the valid records from the
>> +    broken file.
>> +
>> +    You should normally use :hg:`recover` before resorting to this command.
>> +    """
>> +
>> +    if 'obsolete' in opts:
>> +        data = repo.svfs.tryread('obsstore')
>> +        if data:
>> +            # Slow algorithm - but this is an emergency debug operation
>> +            version = None
>> +            corrupt = False
>> +            while version is None:
>> +                try:
>> +                    (version, markers) = obsolete._readmarkers(data)
>> +                except ValueError:
>> +                    corrupt = True
>> +                    version = None
>> +                    data = data[:-1]
>> +                    continue
>> +                break
>> +            if corrupt:
>> +                repo.svfs.write('obsstore', data)
>> +                ui.write(_('truncated obsstore\n'))
>> +            else:
>> +                ui.write(_('no corruption\n'))
>
> It should be covered by a lock. I'm not sure if a transaction is necessary.
>

I've added a lock. Instead of a transaction, I've renamed the corrupt
file and written out the new data. Not perfect,

> And can you add a test? I want to run it with/without --pure to see both
> cases are handled well.
>
Sure; I've also fixed it up to handle the different behaviour between C 
and pure Python implementations. v2 inbound.

Patch

diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -3702,6 +3702,46 @@ 
             displayer.show(repo[r], **props)
         displayer.close()
 
+@command('debugtruncatestore',
+    [('', 'obsolete', None, _('truncate bad markers in obsstore'))],
+    _('[OPTION]'))
+def debugtruncatestore(ui, repo, **opts):
+    """Fix up repository corruption by truncating damaged files
+
+    Most on-disk data structures are designed to be append-only. A failed write
+    (e.g. due to an unexpected power failure) can leave the file corrupted.
+
+    This command attempts to recover from that situation by replacing the
+    corrupted file with a version that only contains the valid records from the
+    broken file.
+
+    You should normally use :hg:`recover` before resorting to this command.
+    """
+
+    if opts.get('obsolete'):
+        data = repo.svfs.tryread('obsstore')
+        if data:
+            # Slow algorithm - but this is an emergency debug operation
+            version = None
+            corrupt = False
+            while version is None:
+                try:
+                    (version, markers) = obsolete._readmarkers(data)
+                except ValueError:
+                    corrupt = True
+                    version = None
+                    data = data[:-1]
+                    continue
+                break
+            if corrupt:
+                repo.svfs.write('obsstore', data)
+                ui.write(_('truncated obsstore\n'))
+            else:
+                ui.write(_('no corruption\n'))
+
+        else:
+            ui.write(_('no obsstore\n'))
+
 @command('debugwalk', walkopts, _('[OPTION]... [FILE]...'), inferrepo=True)
 def debugwalk(ui, repo, *pats, **opts):
     """show how files match on given patterns"""