Submitter | Matt Harbison |
---|---|
Date | July 3, 2016, 10 p.m. |
Message ID | <e9fce4275ce6b26b941f.1467583211@Envy> |
Download | mbox | patch |
Permalink | /patch/15727/ |
State | RFC, archived |
Headers | show |
Comments
On Sun, 03 Jul 2016 18:00:11 -0400, Matt Harbison wrote: > # HG changeset patch > # User Matt Harbison <matt_harbison@yahoo.com> > # Date 1467571904 14400 > # Sun Jul 03 14:51:44 2016 -0400 > # Node ID e9fce4275ce6b26b941f47044744015e90367e7b > # Parent de4a80a2b45c6fcae0948ac12872dd8a61ced26a > scmutil: introduce the flagstate class to suppliment missing flag support > > This will allow manipulation of the executable bit when filesystems don't track > it natively. This file isn't tracked in any form, and will likely be small > since it can be purged completely when wdir is cleaned- the manifest tracks the > flags in permanent history. It is intended only for platforms that don't have > native flag support, and will noop on platforms that do for simplicity. > Obviously a tree copy in either direction between filesystems with and without > flag support will lose the flag state, but that seems like an edge case that > isn't worth having two sources of data that can get out of sync on Unix. > > We could in theory also support designating a file as a symlink, so I kept this > generic. Since the file isn't tracked, and the flags are stored symbolically, > it should be extensible without changing the file format. So I didn't bother > with a version identifier. I have no idea if there would be any encoding > concerns (I saw references to encode/decode in the fncache class when I was > trying to find a suitable bit of code to model), but I wouldn't think so. > > I couldn't find an existing suitable file format, and I really wanted to just > munge the flags on dirstate to get `commit`, `update -C` and `revert` handling > for free. But dirstate doesn't know about the flag building function used in > context.py, and seems to be populated by what the OS reports, though I couldn't > figure out where exactly. I tried overwriting the dirstate entry for a file > with a new dirstatetuple containing the new mode and writing it out, but it > eventually seemed to revert back to missing the executable bit. Since dirstate > is performance sensitive, maybe we shouldn't be fiddling with that anyway. Since dirstate seems to keep mode bits for 'n'ormal files, this field could be (re|ab)used. I don't know how hard it would be, but there's an old proposal. https://www.mercurial-scm.org/wiki/DirState#Proposed_extensions Using a separate storage class might be the reason why we have to fix the flagstate explicitly as you do in the patch 3. > + def write(self, tr): > + if self._dirty: > + tr.addbackup('flagstate') # was fncache > + fp = self._repo.svfs('flagstate', mode='wb', atomictemp=True) If I understand it, the flagstate belongs to the working directory, not to the store. > + try: > + for k, v in self._entries.iteritems(): > + fp.write(k + '\0' + v + '\n') A filename may contain '\n' on non-Windows.
On Tue, 05 Jul 2016 11:04:10 -0400, Yuya Nishihara <yuya@tcha.org> wrote: > On Sun, 03 Jul 2016 18:00:11 -0400, Matt Harbison wrote: >> # HG changeset patch >> # User Matt Harbison <matt_harbison@yahoo.com> >> # Date 1467571904 14400 >> # Sun Jul 03 14:51:44 2016 -0400 >> # Node ID e9fce4275ce6b26b941f47044744015e90367e7b >> # Parent de4a80a2b45c6fcae0948ac12872dd8a61ced26a >> scmutil: introduce the flagstate class to suppliment missing flag >> support >> >> This will allow manipulation of the executable bit when filesystems >> don't track >> it natively. This file isn't tracked in any form, and will likely be >> small >> since it can be purged completely when wdir is cleaned- the manifest >> tracks the >> flags in permanent history. It is intended only for platforms that >> don't have >> native flag support, and will noop on platforms that do for simplicity. >> Obviously a tree copy in either direction between filesystems with and >> without >> flag support will lose the flag state, but that seems like an edge case >> that >> isn't worth having two sources of data that can get out of sync on Unix. >> >> We could in theory also support designating a file as a symlink, so I >> kept this >> generic. Since the file isn't tracked, and the flags are stored >> symbolically, >> it should be extensible without changing the file format. So I didn't >> bother >> with a version identifier. I have no idea if there would be any >> encoding >> concerns (I saw references to encode/decode in the fncache class when I >> was >> trying to find a suitable bit of code to model), but I wouldn't think >> so. >> >> I couldn't find an existing suitable file format, and I really wanted >> to just >> munge the flags on dirstate to get `commit`, `update -C` and `revert` >> handling >> for free. But dirstate doesn't know about the flag building function >> used in >> context.py, and seems to be populated by what the OS reports, though I >> couldn't >> figure out where exactly. I tried overwriting the dirstate entry for a >> file >> with a new dirstatetuple containing the new mode and writing it out, >> but it >> eventually seemed to revert back to missing the executable bit. Since >> dirstate >> is performance sensitive, maybe we shouldn't be fiddling with that >> anyway. > > Since dirstate seems to keep mode bits for 'n'ormal files, this field > could be > (re|ab)used. I don't know how hard it would be, but there's an old > proposal. > > https://www.mercurial-scm.org/wiki/DirState#Proposed_extensions > > Using a separate storage class might be the reason why we have to fix the > flagstate explicitly as you do in the patch 3. OK, I think I understand what is being proposed on the wiki, and *might* see a path forward. Everyone is OK with stealing an implementation specific unused bit to signal that the rest of the field is unused? I guess I can do that as a first patch. It looks like update (via merge) is calling util.setflags(), and conveniently, so does patch. However, dirstate is nowhere to be found there. I assume that it is a layering violation for that method to take in a dirstate. Alternately, we could add dirstate.setflags() that calls util.setflags(). It seems like a weird construct either way (especially since dirstate needs to be written out to complete the flag change, but only on some platforms). There is at least one place (the convert svn sink), where we will never be able to make this work. That one is understandable, and I think the others are salvageable. >> + def write(self, tr): >> + if self._dirty: >> + tr.addbackup('flagstate') # was fncache >> + fp = self._repo.svfs('flagstate', mode='wb', >> atomictemp=True) > > If I understand it, the flagstate belongs to the working directory, not > to > the store. Yep, should have been repo.vfs. Any UI thoughts? >> + try: >> + for k, v in self._entries.iteritems(): >> + fp.write(k + '\0' + v + '\n') > > A filename may contain '\n' on non-Windows.
Patch
diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py --- a/mercurial/scmutil.py +++ b/mercurial/scmutil.py @@ -36,6 +36,7 @@ else: from . import scmposix as scmplatform +propertycache = util.propertycache systemrcpath = scmplatform.systemrcpath userrcpath = scmplatform.userrcpath @@ -230,6 +231,90 @@ key = s.digest() return key +class flagstate(object): + """Tracks file flags when the filesystem doesn't support them.""" + def __init__(self, repo): + self._entries = dict() + self._repo = repo + self._dirty = False + self._load() + + def __contains__(self, fn): + return fn in self._entries + + def __getitem__(self, fn): + return self._entries[fn] + + def __iter__(self): + for x in self._entries: + yield x + + def __setitem__(self, fn, flags): + if not self._checkexec: + self._dirty = True + self._entries[fn] = flags + + # Mark as unsure, so workingctx._dirstatestatus() notices the + # change. The dirstate code doesn't know about the flagsfunc that + # context uses, and only calculates its status based on actual + # filesystem permissions. + if self._repo.dirstate[fn] == 'n': + self._repo.dirstate.normallookup(fn) + + @propertycache + def _checkexec(self): + return util.checkexec(self._repo.root) + + def _load(self): + '''fill the entries from the flagstate file''' + if self._checkexec: + # The filesystem is authoratative if it supports exec permissions + return + + self._dirty = False + try: + fp = self._repo.svfs('flagstate', mode='rb') + except IOError as err: + # XXX: fncachestore.datafiles() is catching OSError, but this needs + # to be IOError here. Figure out the difference. + if err.errno != errno.ENOENT: + raise + return + + try: + for l in fp.read().splitlines(): + fn, flags = l.split('\0') + self._entries[fn] = flags + finally: + fp.close() + + def write(self, tr): + if self._dirty: + tr.addbackup('flagstate') # was fncache + fp = self._repo.svfs('flagstate', mode='wb', atomictemp=True) + try: + for k, v in self._entries.iteritems(): + fp.write(k + '\0' + v + '\n') + fp._fp.truncate() # XXX: Add to atomictempfile? + finally: + fp.close() + + self._repo.dirstate.write(tr) + + self._dirty = False + + def discard(self, filename=None): + """Remove the named file, or all files""" + if not self._checkexec: + if filename: + if filename in self: + del self._entries[filename] + self._dirty = True + else: + self._repo.svfs.unlinkpath(path='flagstate', ignoremissing=True) + self._entries.clear() + self._dirty = False + class abstractvfs(object): """Abstract base class; cannot be instantiated"""