Submitter | Sean Farley |
---|---|
Date | June 1, 2017, 12:25 a.m. |
Message ID | <498dae194ccf1e82caed.1496276720@1.0.0.127.in-addr.arpa> |
Download | mbox | patch |
Permalink | /patch/21111/ |
State | Accepted, archived |
Headers | show |
Comments
Sean Farley <sean@farley.io> writes: > # HG changeset patch > # User Sean Farley <sean@farley.io> > # Date 1494536549 25200 > # Thu May 11 14:02:29 2017 -0700 > # Branch wctxds > # Node ID 498dae194ccf1e82caed51a02e6ce0b77f8d92e8 > # Parent c8f9bb73d4308b5629ac786fd5e671ed717b2b58 > context: move dirstate to workingctx > > This commit eventually needs to be broken up (so consider this a WIP). > Currently, dirstate is mostly an implementation optimization to prevent > reading the status from disk. It's also a staging area for metadata for > the next commit (never mind the branch name ;-)). > > As such, let's move the dirstate to workingctx and split up the logic > for the metadata at some point later. This is a request-for-comment about removing the dirstate property from localrepo before going all the way through with it. The general idea is that things involving the working directory will be migrated to workingctx and localrepo will no long having methods to access the disk. That would make future work for in-memory operations easy to swap out by just passing in a memctx instead of workingctx.
While this might work in a short term to work with merge.update, I think dirstate is special and complex enough to be detached from a ctx (ex. memctx) object, except for wctx. Ideally, I think the main wctx/mctx API is just __setitem__(path, fctx): # change content ctx[path] = overlayfilectx(ctx[path], datafunc=lambda: newdata) # change exec or symlink ctx[path] = overlayfilectx(ctx[path], flags='xl') # change copy information ctx[path] = overlayfilectx(ctx[path], copied=(path, filenode)) # remove a file ctx[path] = None Since ctx knows a manifest, the above interface is enough for a mctx to figure out what's added, modified, removed, copied at commit time. wctx could use dirstate as an implementation detail to deal with "copied" change. The benefit of "writing a fctx" allows us to have LFS fast path for free. For an extension developer's point of view, I think it's more friendly to make the following just work: mctx = memctx() mctx['newfile'] = memfctx(data='new file content') mctx['cheap-copy-a-large-file'] = ctx['a-large-file'] mctx['deleteme'] = None mctx.commit() without requiring a developer to figure out dirstate manually like: mctx.dirstate.add('newfile') ... Excerpts from Sean Farley's message of 2017-05-31 17:28:49 -0700: > Sean Farley <sean@farley.io> writes: > > > # HG changeset patch > > # User Sean Farley <sean@farley.io> > > # Date 1494536549 25200 > > # Thu May 11 14:02:29 2017 -0700 > > # Branch wctxds > > # Node ID 498dae194ccf1e82caed51a02e6ce0b77f8d92e8 > > # Parent c8f9bb73d4308b5629ac786fd5e671ed717b2b58 > > context: move dirstate to workingctx > > > > This commit eventually needs to be broken up (so consider this a WIP). > > Currently, dirstate is mostly an implementation optimization to prevent > > reading the status from disk. It's also a staging area for metadata for > > the next commit (never mind the branch name ;-)). > > > > As such, let's move the dirstate to workingctx and split up the logic > > for the metadata at some point later. > > This is a request-for-comment about removing the dirstate property from > localrepo before going all the way through with it. The general idea is > that things involving the working directory will be migrated to > workingctx and localrepo will no long having methods to access the disk. > > That would make future work for in-memory operations easy to swap out by > just passing in a memctx instead of workingctx.
Jun Wu <quark@fb.com> writes: > While this might work in a short term to work with merge.update, I think > dirstate is special and complex enough to be detached from a ctx (ex. > memctx) object, except for wctx. > > Ideally, I think the main wctx/mctx API is just __setitem__(path, fctx): > > # change content > ctx[path] = overlayfilectx(ctx[path], datafunc=lambda: newdata) > > # change exec or symlink > ctx[path] = overlayfilectx(ctx[path], flags='xl') > > # change copy information > ctx[path] = overlayfilectx(ctx[path], copied=(path, filenode)) > > # remove a file > ctx[path] = None > > Since ctx knows a manifest, the above interface is enough for a mctx to > figure out what's added, modified, removed, copied at commit time. wctx > could use dirstate as an implementation detail to deal with "copied" > change. The benefit of "writing a fctx" allows us to have LFS fast path for > free. > > For an extension developer's point of view, I think it's more friendly to > make the following just work: > > mctx = memctx() > mctx['newfile'] = memfctx(data='new file content') > mctx['cheap-copy-a-large-file'] = ctx['a-large-file'] > mctx['deleteme'] = None > mctx.commit() > > without requiring a developer to figure out dirstate manually like: > > mctx.dirstate.add('newfile') > ... Yeah, that's pretty much what I had in mind. I guess I had imagined this RFC direction to be more-or-less a cleanup that would end with dirstate becoming a private member (then later doing as you outline here)
On Wed, May 31, 2017 at 05:28:49PM -0700, Sean Farley wrote: > Sean Farley <sean@farley.io> writes: > > > # HG changeset patch > > # User Sean Farley <sean@farley.io> > > # Date 1494536549 25200 > > # Thu May 11 14:02:29 2017 -0700 > > # Branch wctxds > > # Node ID 498dae194ccf1e82caed51a02e6ce0b77f8d92e8 > > # Parent c8f9bb73d4308b5629ac786fd5e671ed717b2b58 > > context: move dirstate to workingctx > > > > This commit eventually needs to be broken up (so consider this a WIP). > > Currently, dirstate is mostly an implementation optimization to prevent > > reading the status from disk. It's also a staging area for metadata for > > the next commit (never mind the branch name ;-)). > > > > As such, let's move the dirstate to workingctx and split up the logic > > for the metadata at some point later. > > This is a request-for-comment about removing the dirstate property from > localrepo before going all the way through with it. The general idea is > that things involving the working directory will be migrated to > workingctx and localrepo will no long having methods to access the disk. +1 - I love things that make it easier to inject alternate backends for things into localrepo. Sounds like this is going in a good direction. > > That would make future work for in-memory operations easy to swap out by > just passing in a memctx instead of workingctx. > _______________________________________________ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Patch
diff --git a/mercurial/context.py b/mercurial/context.py index e29eff7..a3dc7c5 100644 --- a/mercurial/context.py +++ b/mercurial/context.py @@ -1562,25 +1562,32 @@ class workingctx(committablectx): """ def __init__(self, repo, text="", user=None, date=None, extra=None, changes=None): super(workingctx, self).__init__(repo, text, user, date, extra, changes) + # This is where working directory references should live. For now we + # just borrow references to stuff in localrepo. + + @property + def dirstate(self): + return self._repo.dirstate + def __iter__(self): - d = self._repo.dirstate + d = self.dirstate for f in d: if d[f] != 'r': yield f def __contains__(self, key): - return self._repo.dirstate[key] not in "?r" + return self.dirstate[key] not in "?r" def hex(self): return hex(wdirid) @propertycache def _parents(self): - p = self._repo.dirstate.parents() + p = self.dirstate.parents() if p[1] == nullid: p = p[:-1] return [changectx(self._repo, x) for x in p] def filectx(self, path, filelog=None): @@ -1601,11 +1608,11 @@ class workingctx(committablectx): (missing and self.deleted())) def add(self, list, prefix=""): join = lambda f: os.path.join(prefix, f) with self._repo.wlock(): - ui, ds = self._repo.ui, self._repo.dirstate + ui, ds = self._repo.ui, self.dirstate rejected = [] lstat = self._repo.wvfs.lstat for f in list: scmutil.checkportable(ui, join(f)) try: @@ -1635,30 +1642,30 @@ class workingctx(committablectx): def forget(self, files, prefix=""): join = lambda f: os.path.join(prefix, f) with self._repo.wlock(): rejected = [] for f in files: - if f not in self._repo.dirstate: + if f not in self.dirstate: self._repo.ui.warn(_("%s not tracked!\n") % join(f)) rejected.append(f) - elif self._repo.dirstate[f] != 'a': - self._repo.dirstate.remove(f) + elif self.dirstate[f] != 'a': + self.dirstate.remove(f) else: - self._repo.dirstate.drop(f) + self.dirstate.drop(f) return rejected def undelete(self, list): pctxs = self.parents() with self._repo.wlock(): for f in list: - if self._repo.dirstate[f] != 'r': + if self.dirstate[f] != 'r': self._repo.ui.warn(_("%s not removed!\n") % f) else: fctx = f in pctxs[0] and pctxs[0][f] or pctxs[1][f] t = fctx.data() self._repo.wwrite(f, t, fctx.flags()) - self._repo.dirstate.normal(f) + self.dirstate.normal(f) def copy(self, source, dest): try: st = self._repo.wvfs.lstat(dest) except OSError as err: @@ -1669,15 +1676,15 @@ class workingctx(committablectx): if not (stat.S_ISREG(st.st_mode) or stat.S_ISLNK(st.st_mode)): self._repo.ui.warn(_("copy failed: %s is not a file or a " "symbolic link\n") % dest) else: with self._repo.wlock(): - if self._repo.dirstate[dest] in '?': - self._repo.dirstate.add(dest) - elif self._repo.dirstate[dest] in 'r': - self._repo.dirstate.normallookup(dest) - self._repo.dirstate.copy(source, dest) + if self.dirstate[dest] in '?': + self.dirstate.add(dest) + elif self.dirstate[dest] in 'r': + self.dirstate.normallookup(dest) + self.dirstate.copy(source, dest) def match(self, pats=None, include=None, exclude=None, default='glob', listsubrepos=False, badfn=None): if pats is None: pats = [] @@ -1690,11 +1697,11 @@ class workingctx(committablectx): default, auditor=r.auditor, ctx=self, listsubrepos=listsubrepos, badfn=badfn, icasefs=icasefs) def _filtersuspectsymlink(self, files): - if not files or self._repo.dirstate._checklink: + if not files or self.dirstate._checklink: return files # Symlink placeholders may get non-symlink-like contents # via user error or dereferencing by NFS or Samba servers, # so we filter out any placeholders that don't look like a @@ -1732,18 +1739,18 @@ class workingctx(committablectx): # updating the dirstate is optional # so we don't wait on the lock # wlock can invalidate the dirstate, so cache normal _after_ # taking the lock with self._repo.wlock(False): - normal = self._repo.dirstate.normal + normal = self.dirstate.normal for f in fixup: normal(f) # write changes out explicitly, because nesting # wlock at runtime may prevent 'wlock.release()' # after this block from doing so for subsequent # changing files - self._repo.dirstate.write(self._repo.currenttransaction()) + self.dirstate.write(self._repo.currenttransaction()) except error.LockError: pass return modified, fixup def _dirstatestatus(self, match=None, ignored=False, clean=False, @@ -1752,11 +1759,11 @@ class workingctx(committablectx): listignored, listclean, listunknown = ignored, clean, unknown match = match or matchmod.always(self._repo.root, self._repo.getcwd()) subrepos = [] if '.hgsub' in self: subrepos = sorted(self.substate) - cmp, s = self._repo.dirstate.status(match, subrepos, listignored, + cmp, s = self.dirstate.status(match, subrepos, listignored, listclean, listunknown) # check for any possibly clean files if cmp: modified2, fixup = self._checklookup(cmp) @@ -1846,11 +1853,11 @@ class workingctx(committablectx): def bad(f, msg): # 'f' may be a directory pattern from 'match.files()', # so 'f not in ctx1' is not enough if f not in other and not other.hasdir(f): self._repo.ui.warn('%s: %s\n' % - (self._repo.dirstate.pathto(f), msg)) + (self.dirstate.pathto(f), msg)) match.bad = bad return match class committablefilectx(basefilectx): """A committablefilectx provides common functionality for a file context