Patchwork [1,of,3,RFC] context: move dirstate to workingctx

login
register
mail settings
Submitter Sean Farley
Date June 1, 2017, 12:25 a.m.
Message ID <498dae194ccf1e82caed.1496276720@1.0.0.127.in-addr.arpa>
Download mbox | patch
Permalink /patch/21111/
State Accepted, archived
Headers show

Comments

Sean Farley - June 1, 2017, 12:25 a.m.
# HG changeset patch
# User Sean Farley <sean@farley.io>
# Date 1494536549 25200
#      Thu May 11 14:02:29 2017 -0700
# Branch wctxds
# Node ID 498dae194ccf1e82caed51a02e6ce0b77f8d92e8
# Parent  c8f9bb73d4308b5629ac786fd5e671ed717b2b58
context: move dirstate to workingctx

This commit eventually needs to be broken up (so consider this a WIP).
Currently, dirstate is mostly an implementation optimization to prevent
reading the status from disk. It's also a staging area for metadata for
the next commit (never mind the branch name ;-)).

As such, let's move the dirstate to workingctx and split up the logic
for the metadata at some point later.
Sean Farley - June 1, 2017, 12:28 a.m.
Sean Farley <sean@farley.io> writes:

> # HG changeset patch
> # User Sean Farley <sean@farley.io>
> # Date 1494536549 25200
> #      Thu May 11 14:02:29 2017 -0700
> # Branch wctxds
> # Node ID 498dae194ccf1e82caed51a02e6ce0b77f8d92e8
> # Parent  c8f9bb73d4308b5629ac786fd5e671ed717b2b58
> context: move dirstate to workingctx
>
> This commit eventually needs to be broken up (so consider this a WIP).
> Currently, dirstate is mostly an implementation optimization to prevent
> reading the status from disk. It's also a staging area for metadata for
> the next commit (never mind the branch name ;-)).
>
> As such, let's move the dirstate to workingctx and split up the logic
> for the metadata at some point later.

This is a request-for-comment about removing the dirstate property from
localrepo before going all the way through with it. The general idea is
that things involving the working directory will be migrated to
workingctx and localrepo will no long having methods to access the disk.

That would make future work for in-memory operations easy to swap out by
just passing in a memctx instead of workingctx.
Jun Wu - June 1, 2017, 2:05 a.m.
While this might work in a short term to work with merge.update, I think
dirstate is special and complex enough to be detached from a ctx (ex.
memctx) object, except for wctx.

Ideally, I think the main wctx/mctx API is just __setitem__(path, fctx):

  # change content
  ctx[path] = overlayfilectx(ctx[path], datafunc=lambda: newdata)

  # change exec or symlink
  ctx[path] = overlayfilectx(ctx[path], flags='xl')

  # change copy information
  ctx[path] = overlayfilectx(ctx[path], copied=(path, filenode))

  # remove a file
  ctx[path] = None

Since ctx knows a manifest, the above interface is enough for a mctx to
figure out what's added, modified, removed, copied at commit time. wctx
could use dirstate as an implementation detail to deal with "copied"
change. The benefit of "writing a fctx" allows us to have LFS fast path for
free.

For an extension developer's point of view, I think it's more friendly to
make the following just work:

  mctx = memctx()
  mctx['newfile'] = memfctx(data='new file content')
  mctx['cheap-copy-a-large-file'] = ctx['a-large-file']
  mctx['deleteme'] = None
  mctx.commit()

without requiring a developer to figure out dirstate manually like:

  mctx.dirstate.add('newfile')
  ...

Excerpts from Sean Farley's message of 2017-05-31 17:28:49 -0700:
> Sean Farley <sean@farley.io> writes:
> 
> > # HG changeset patch
> > # User Sean Farley <sean@farley.io>
> > # Date 1494536549 25200
> > #      Thu May 11 14:02:29 2017 -0700
> > # Branch wctxds
> > # Node ID 498dae194ccf1e82caed51a02e6ce0b77f8d92e8
> > # Parent  c8f9bb73d4308b5629ac786fd5e671ed717b2b58
> > context: move dirstate to workingctx
> >
> > This commit eventually needs to be broken up (so consider this a WIP).
> > Currently, dirstate is mostly an implementation optimization to prevent
> > reading the status from disk. It's also a staging area for metadata for
> > the next commit (never mind the branch name ;-)).
> >
> > As such, let's move the dirstate to workingctx and split up the logic
> > for the metadata at some point later.
> 
> This is a request-for-comment about removing the dirstate property from
> localrepo before going all the way through with it. The general idea is
> that things involving the working directory will be migrated to
> workingctx and localrepo will no long having methods to access the disk.
> 
> That would make future work for in-memory operations easy to swap out by
> just passing in a memctx instead of workingctx.
Sean Farley - June 1, 2017, 2:12 a.m.
Jun Wu <quark@fb.com> writes:

> While this might work in a short term to work with merge.update, I think
> dirstate is special and complex enough to be detached from a ctx (ex.
> memctx) object, except for wctx.
>
> Ideally, I think the main wctx/mctx API is just __setitem__(path, fctx):
>
>   # change content
>   ctx[path] = overlayfilectx(ctx[path], datafunc=lambda: newdata)
>
>   # change exec or symlink
>   ctx[path] = overlayfilectx(ctx[path], flags='xl')
>
>   # change copy information
>   ctx[path] = overlayfilectx(ctx[path], copied=(path, filenode))
>
>   # remove a file
>   ctx[path] = None
>
> Since ctx knows a manifest, the above interface is enough for a mctx to
> figure out what's added, modified, removed, copied at commit time. wctx
> could use dirstate as an implementation detail to deal with "copied"
> change. The benefit of "writing a fctx" allows us to have LFS fast path for
> free.
>
> For an extension developer's point of view, I think it's more friendly to
> make the following just work:
>
>   mctx = memctx()
>   mctx['newfile'] = memfctx(data='new file content')
>   mctx['cheap-copy-a-large-file'] = ctx['a-large-file']
>   mctx['deleteme'] = None
>   mctx.commit()
>
> without requiring a developer to figure out dirstate manually like:
>
>   mctx.dirstate.add('newfile')
>   ...

Yeah, that's pretty much what I had in mind. I guess I had imagined this
RFC direction to be more-or-less a cleanup that would end with dirstate
becoming a private member (then later doing as you outline here)
Augie Fackler - June 1, 2017, 2:21 p.m.
On Wed, May 31, 2017 at 05:28:49PM -0700, Sean Farley wrote:
> Sean Farley <sean@farley.io> writes:
>
> > # HG changeset patch
> > # User Sean Farley <sean@farley.io>
> > # Date 1494536549 25200
> > #      Thu May 11 14:02:29 2017 -0700
> > # Branch wctxds
> > # Node ID 498dae194ccf1e82caed51a02e6ce0b77f8d92e8
> > # Parent  c8f9bb73d4308b5629ac786fd5e671ed717b2b58
> > context: move dirstate to workingctx
> >
> > This commit eventually needs to be broken up (so consider this a WIP).
> > Currently, dirstate is mostly an implementation optimization to prevent
> > reading the status from disk. It's also a staging area for metadata for
> > the next commit (never mind the branch name ;-)).
> >
> > As such, let's move the dirstate to workingctx and split up the logic
> > for the metadata at some point later.
>
> This is a request-for-comment about removing the dirstate property from
> localrepo before going all the way through with it. The general idea is
> that things involving the working directory will be migrated to
> workingctx and localrepo will no long having methods to access the disk.

+1 - I love things that make it easier to inject alternate backends
for things into localrepo. Sounds like this is going in a good direction.

>
> That would make future work for in-memory operations easy to swap out by
> just passing in a memctx instead of workingctx.



> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/context.py b/mercurial/context.py
index e29eff7..a3dc7c5 100644
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -1562,25 +1562,32 @@  class workingctx(committablectx):
     """
     def __init__(self, repo, text="", user=None, date=None, extra=None,
                  changes=None):
         super(workingctx, self).__init__(repo, text, user, date, extra, changes)
 
+        # This is where working directory references should live. For now we
+        # just borrow references to stuff in localrepo.
+
+    @property
+    def dirstate(self):
+        return self._repo.dirstate
+
     def __iter__(self):
-        d = self._repo.dirstate
+        d = self.dirstate
         for f in d:
             if d[f] != 'r':
                 yield f
 
     def __contains__(self, key):
-        return self._repo.dirstate[key] not in "?r"
+        return self.dirstate[key] not in "?r"
 
     def hex(self):
         return hex(wdirid)
 
     @propertycache
     def _parents(self):
-        p = self._repo.dirstate.parents()
+        p = self.dirstate.parents()
         if p[1] == nullid:
             p = p[:-1]
         return [changectx(self._repo, x) for x in p]
 
     def filectx(self, path, filelog=None):
@@ -1601,11 +1608,11 @@  class workingctx(committablectx):
                 (missing and self.deleted()))
 
     def add(self, list, prefix=""):
         join = lambda f: os.path.join(prefix, f)
         with self._repo.wlock():
-            ui, ds = self._repo.ui, self._repo.dirstate
+            ui, ds = self._repo.ui, self.dirstate
             rejected = []
             lstat = self._repo.wvfs.lstat
             for f in list:
                 scmutil.checkportable(ui, join(f))
                 try:
@@ -1635,30 +1642,30 @@  class workingctx(committablectx):
     def forget(self, files, prefix=""):
         join = lambda f: os.path.join(prefix, f)
         with self._repo.wlock():
             rejected = []
             for f in files:
-                if f not in self._repo.dirstate:
+                if f not in self.dirstate:
                     self._repo.ui.warn(_("%s not tracked!\n") % join(f))
                     rejected.append(f)
-                elif self._repo.dirstate[f] != 'a':
-                    self._repo.dirstate.remove(f)
+                elif self.dirstate[f] != 'a':
+                    self.dirstate.remove(f)
                 else:
-                    self._repo.dirstate.drop(f)
+                    self.dirstate.drop(f)
             return rejected
 
     def undelete(self, list):
         pctxs = self.parents()
         with self._repo.wlock():
             for f in list:
-                if self._repo.dirstate[f] != 'r':
+                if self.dirstate[f] != 'r':
                     self._repo.ui.warn(_("%s not removed!\n") % f)
                 else:
                     fctx = f in pctxs[0] and pctxs[0][f] or pctxs[1][f]
                     t = fctx.data()
                     self._repo.wwrite(f, t, fctx.flags())
-                    self._repo.dirstate.normal(f)
+                    self.dirstate.normal(f)
 
     def copy(self, source, dest):
         try:
             st = self._repo.wvfs.lstat(dest)
         except OSError as err:
@@ -1669,15 +1676,15 @@  class workingctx(committablectx):
         if not (stat.S_ISREG(st.st_mode) or stat.S_ISLNK(st.st_mode)):
             self._repo.ui.warn(_("copy failed: %s is not a file or a "
                                  "symbolic link\n") % dest)
         else:
             with self._repo.wlock():
-                if self._repo.dirstate[dest] in '?':
-                    self._repo.dirstate.add(dest)
-                elif self._repo.dirstate[dest] in 'r':
-                    self._repo.dirstate.normallookup(dest)
-                self._repo.dirstate.copy(source, dest)
+                if self.dirstate[dest] in '?':
+                    self.dirstate.add(dest)
+                elif self.dirstate[dest] in 'r':
+                    self.dirstate.normallookup(dest)
+                self.dirstate.copy(source, dest)
 
     def match(self, pats=None, include=None, exclude=None, default='glob',
               listsubrepos=False, badfn=None):
         if pats is None:
             pats = []
@@ -1690,11 +1697,11 @@  class workingctx(committablectx):
                               default, auditor=r.auditor, ctx=self,
                               listsubrepos=listsubrepos, badfn=badfn,
                               icasefs=icasefs)
 
     def _filtersuspectsymlink(self, files):
-        if not files or self._repo.dirstate._checklink:
+        if not files or self.dirstate._checklink:
             return files
 
         # Symlink placeholders may get non-symlink-like contents
         # via user error or dereferencing by NFS or Samba servers,
         # so we filter out any placeholders that don't look like a
@@ -1732,18 +1739,18 @@  class workingctx(committablectx):
                 # updating the dirstate is optional
                 # so we don't wait on the lock
                 # wlock can invalidate the dirstate, so cache normal _after_
                 # taking the lock
                 with self._repo.wlock(False):
-                    normal = self._repo.dirstate.normal
+                    normal = self.dirstate.normal
                     for f in fixup:
                         normal(f)
                     # write changes out explicitly, because nesting
                     # wlock at runtime may prevent 'wlock.release()'
                     # after this block from doing so for subsequent
                     # changing files
-                    self._repo.dirstate.write(self._repo.currenttransaction())
+                    self.dirstate.write(self._repo.currenttransaction())
             except error.LockError:
                 pass
         return modified, fixup
 
     def _dirstatestatus(self, match=None, ignored=False, clean=False,
@@ -1752,11 +1759,11 @@  class workingctx(committablectx):
         listignored, listclean, listunknown = ignored, clean, unknown
         match = match or matchmod.always(self._repo.root, self._repo.getcwd())
         subrepos = []
         if '.hgsub' in self:
             subrepos = sorted(self.substate)
-        cmp, s = self._repo.dirstate.status(match, subrepos, listignored,
+        cmp, s = self.dirstate.status(match, subrepos, listignored,
                                             listclean, listunknown)
 
         # check for any possibly clean files
         if cmp:
             modified2, fixup = self._checklookup(cmp)
@@ -1846,11 +1853,11 @@  class workingctx(committablectx):
             def bad(f, msg):
                 # 'f' may be a directory pattern from 'match.files()',
                 # so 'f not in ctx1' is not enough
                 if f not in other and not other.hasdir(f):
                     self._repo.ui.warn('%s: %s\n' %
-                                       (self._repo.dirstate.pathto(f), msg))
+                                       (self.dirstate.pathto(f), msg))
             match.bad = bad
         return match
 
 class committablefilectx(basefilectx):
     """A committablefilectx provides common functionality for a file context