Patchwork [V4] status: add a flag to terse the output (issue4119)

login
register
mail settings
Submitter Pulkit Goyal
Date July 13, 2017, 6:58 p.m.
Message ID <d2f046b37522465606f3.1499972297@workspace>
Download mbox | patch
Permalink /patch/22299/
State Accepted
Headers show

Comments

Pulkit Goyal - July 13, 2017, 6:58 p.m.
# HG changeset patch
# User Pulkit Goyal <7895pulkit@gmail.com>
# Date 1497710422 -19800
#      Sat Jun 17 20:10:22 2017 +0530
# Node ID d2f046b37522465606f3824d663ece2608f0a23d
# Parent  50243c975fc2ee605ebac493f4ab18d37117e46a
# EXP-Topic tersestatus
status: add a flag to terse the output (issue4119)

This adds an experimental flag -t/--terse which will terse the output. The terse flag
will respect other flags which filters the output. The flag takes a string
whose value can be a subsequence of "marduic" (the order does not matter here.)

Ignored files are not considered while tersing unless -i flag is passed or 'i'
is there is the terse flag value.

The flag is experimental for testing as there may be cases which will produce
strange results with the flag. We can set the terse on by default by simply
passing 'u' to the cmdutil.tersestatus().

This patch also adds a test file with tests covering the new feature.
Augie Fackler - July 17, 2017, 3:08 p.m.
On Fri, Jul 14, 2017 at 12:28:17AM +0530, Pulkit Goyal wrote:
> # HG changeset patch
> # User Pulkit Goyal <7895pulkit@gmail.com>
> # Date 1497710422 -19800
> #      Sat Jun 17 20:10:22 2017 +0530
> # Node ID d2f046b37522465606f3824d663ece2608f0a23d
> # Parent  50243c975fc2ee605ebac493f4ab18d37117e46a
> # EXP-Topic tersestatus
> status: add a flag to terse the output (issue4119)

queued, thanks

I'm a little bummed at how complicated this turned out to be, but I'm
not convinced that only supporting tersing of unknown states (which I
suspect is the only common use case of this feature) would be
significantly simpler. :/ If we end up figuring out a better tradeoff
that requires removing some functionality, I'm fine with that given
the experimental nature of the feature.

One thing that'd be glorious to thread through in the future would be
a way to elide walking directories that we know will be shown as ? as
soon as we find a ? file in a directory that can't have any other
status (eg no tracked files in that dir and --ignore not set), as that
could stand to be an _enormous_ performance win for `hg status` when
ignores are incomplete.

>
> This adds an experimental flag -t/--terse which will terse the output. The terse flag
> will respect other flags which filters the output. The flag takes a string
> whose value can be a subsequence of "marduic" (the order does not matter here.)
>
> Ignored files are not considered while tersing unless -i flag is passed or 'i'
> is there is the terse flag value.
>
> The flag is experimental for testing as there may be cases which will produce
> strange results with the flag. We can set the terse on by default by simply
> passing 'u' to the cmdutil.tersestatus().
>
> This patch also adds a test file with tests covering the new feature.
Pulkit Goyal - July 26, 2017, 11:25 p.m.
Sorry for late reply.

On Mon, Jul 17, 2017 at 8:38 PM, Augie Fackler <raf@durin42.com> wrote:
> On Fri, Jul 14, 2017 at 12:28:17AM +0530, Pulkit Goyal wrote:
>> # HG changeset patch
>> # User Pulkit Goyal <7895pulkit@gmail.com>
>> # Date 1497710422 -19800
>> #      Sat Jun 17 20:10:22 2017 +0530
>> # Node ID d2f046b37522465606f3824d663ece2608f0a23d
>> # Parent  50243c975fc2ee605ebac493f4ab18d37117e46a
>> # EXP-Topic tersestatus
>> status: add a flag to terse the output (issue4119)
>
> queued, thanks
Thanks!
>
> I'm a little bummed at how complicated this turned out to be

I am also sadly.

> but I'm
> not convinced that only supporting tersing of unknown states (which I
> suspect is the only common use case of this feature) would be
> significantly simpler. :/ If we end up figuring out a better tradeoff
> that requires removing some functionality, I'm fine with that given
> the experimental nature of the feature.

Maybe we can precompute the ignored files and have them in statlist
before passing into tersestatus function. That way the code will me
more cleaner but I don't have idea about the runtime effect of this
approach.
>
> One thing that'd be glorious to thread through in the future would be
> a way to elide walking directories that we know will be shown as ? as
> soon as we find a ? file in a directory that can't have any other
> status (eg no tracked files in that dir and --ignore not set), as that
> could stand to be an _enormous_ performance win for `hg status` when
> ignores are incomplete.

I like the idea, there are also some other things which can be
improved. I will try to improve on the perf on this once the freeze
ends.

Patch

diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -400,6 +400,178 @@ 
 
     return commit(ui, repo, recordinwlock, pats, opts)
 
+def tersestatus(root, statlist, status, ignorefn, ignore):
+    """
+    Returns a list of statuses with directory collapsed if all the files in the
+    directory has the same status.
+    """
+
+    def numfiles(dirname):
+        """
+        Calculates the number of tracked files in a given directory which also
+        includes files which were removed or deleted. Considers ignored files
+        if ignore argument is True or 'i' is present in status argument.
+        """
+        if lencache.get(dirname):
+            return lencache[dirname]
+        if 'i' in status or ignore:
+            def match(localpath):
+                absolutepath = os.path.join(root, localpath)
+                if os.path.isdir(absolutepath) and isemptydir(absolutepath):
+                    return True
+                return False
+        else:
+            def match(localpath):
+                # there can be directory whose all the files are ignored and
+                # hence the drectory should also be ignored while counting
+                # number of files or subdirs in it's parent directory. This
+                # checks the same.
+                # XXX: We need a better logic here.
+                if os.path.isdir(os.path.join(root, localpath)):
+                    return isignoreddir(localpath)
+                else:
+                    # XXX: there can be files which have the ignored pattern but
+                    # are not ignored. That leads to bug in counting number of
+                    # tracked files in the directory.
+                    return ignorefn(localpath)
+        lendir = 0
+        abspath = os.path.join(root, dirname)
+        # There might be cases when a directory does not exists as the whole
+        # directory can be removed and/or deleted.
+        try:
+            for f in os.listdir(abspath):
+                localpath = os.path.join(dirname, f)
+                if not match(localpath):
+                    lendir += 1
+        except OSError:
+            pass
+        lendir += len(absentdir.get(dirname, []))
+        lencache[dirname] = lendir
+        return lendir
+
+    def isemptydir(abspath):
+        """
+        Check whether a directory is empty or not, i.e. there is no files in the
+        directory and all its subdirectories.
+        """
+        for f in os.listdir(abspath):
+            fullpath = os.path.join(abspath, f)
+            if os.path.isdir(fullpath):
+                # recursion here
+                ret = isemptydir(fullpath)
+                if not ret:
+                    return False
+            else:
+                return False
+        return True
+
+    def isignoreddir(localpath):
+        """
+        This function checks whether the directory contains only ignored files
+        and hence should the directory be considered ignored. Returns True, if
+        that should be ignored otherwise False.
+        """
+        dirpath = os.path.join(root, localpath)
+        for f in os.listdir(dirpath):
+            filepath = os.path.join(dirpath, f)
+            if os.path.isdir(filepath):
+                # recursion here
+                ret = isignoreddir(os.path.join(localpath, f))
+                if not ret:
+                    return False
+            else:
+                if not ignorefn(os.path.join(localpath, f)):
+                    return False
+        return True
+
+    def absentones(removedfiles, missingfiles):
+        """
+        Returns a dictionary of directories with files in it which are either
+        removed or missing (deleted) in them.
+        """
+        absentdir = {}
+        absentfiles = removedfiles + missingfiles
+        while absentfiles:
+            f = absentfiles.pop()
+            par = os.path.dirname(f)
+            if par == '':
+                continue
+            # we need to store files rather than number of files as some files
+            # or subdirectories in a directory can be counted twice. This is
+            # also we have used sets here.
+            try:
+                absentdir[par].add(f)
+            except KeyError:
+                absentdir[par] = set([f])
+            absentfiles.append(par)
+        return absentdir
+
+    indexes = {'m': 0, 'a': 1, 'r': 2, 'd': 3, 'u': 4, 'i': 5, 'c': 6}
+    # get a dictonary of directories and files which are missing as os.listdir()
+    # won't be able to list them.
+    absentdir = absentones(statlist[2], statlist[3])
+    finalrs = [[]] * len(indexes)
+    didsomethingchanged = False
+    # dictionary to store number of files and subdir in a directory so that we
+    # don't compute that again.
+    lencache = {}
+
+    for st in pycompat.bytestr(status):
+
+        try:
+            ind = indexes[st]
+        except KeyError:
+            # TODO: Need a better error message here
+            raise error.Abort("'%s' not recognized" % st)
+
+        sfiles = statlist[ind]
+        if not sfiles:
+            continue
+        pardict = {}
+        for a in sfiles:
+            par = os.path.dirname(a)
+            pardict.setdefault(par, []).append(a)
+
+        rs = []
+        newls = []
+        for par, files in pardict.iteritems():
+            lenpar = numfiles(par)
+            if lenpar == len(files):
+                newls.append(par)
+
+        if not newls:
+            continue
+
+        while newls:
+            newel = newls.pop()
+            if newel == '':
+                continue
+            parn = os.path.dirname(newel)
+            pardict[newel] = []
+            # Adding pycompat.ossep as newel is a directory.
+            pardict.setdefault(parn, []).append(newel + pycompat.ossep)
+            lenpar = numfiles(parn)
+            if lenpar == len(pardict[parn]):
+                newls.append(parn)
+
+        # dict.values() for Py3 compatibility
+        for files in pardict.values():
+            rs.extend(files)
+
+        rs.sort()
+        finalrs[ind] = rs
+        didsomethingchanged = True
+
+    # If nothing is changed, make sure the order of files is preserved.
+    if not didsomethingchanged:
+        return statlist
+
+    for x in xrange(len(indexes)):
+        if not finalrs[x]:
+            finalrs[x] = statlist[x]
+
+    return finalrs
+
 def findpossible(cmd, table, strict=False):
     """
     Return cmd -> (aliases, command table entry)
diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -4617,6 +4617,7 @@ 
     ('u', 'unknown', None, _('show only unknown (not tracked) files')),
     ('i', 'ignored', None, _('show only ignored files')),
     ('n', 'no-status', None, _('hide status prefix')),
+    ('t', 'terse', '', _('show the terse output (EXPERIMENTAL)')),
     ('C', 'copies', None, _('show source of copied files')),
     ('0', 'print0', None, _('end filenames with NUL, for use with xargs')),
     ('', 'rev', [], _('show difference from revision'), _('REV')),
@@ -4662,6 +4663,16 @@ 
 
     .. container:: verbose
 
+      The -t/--terse option abbreviates the output by showing directory name
+      if all the files in it share the same status. The option expects a value
+      which can be a string formed by using 'm', 'a', 'r', 'd', 'u', 'i', 'c'
+      where, 'm' stands for 'modified', 'a' for 'added', 'r' for 'removed',
+      'd' for 'deleted', 'u' for 'unknown', 'i' for 'ignored' and 'c' for clean.
+
+      It terses the output of only those status which are passed. The ignored
+      files are not considered while tersing until 'i' is there in --terse value
+      or the --ignored option is used.
+
       Examples:
 
       - show changes in the working directory relative to a
@@ -4688,10 +4699,14 @@ 
     opts = pycompat.byteskwargs(opts)
     revs = opts.get('rev')
     change = opts.get('change')
+    terse = opts.get('terse')
 
     if revs and change:
         msg = _('cannot specify --rev and --change at the same time')
         raise error.Abort(msg)
+    elif revs and terse:
+        msg = _('cannot use --terse with --rev')
+        raise error.Abort(msg)
     elif change:
         node2 = scmutil.revsingle(repo, change, None).node()
         node1 = repo[node2].p1().node()
@@ -4712,6 +4727,7 @@ 
     show = [k for k in states if opts.get(k)]
     if opts.get('all'):
         show += ui.quiet and (states[:4] + ['clean']) or states
+
     if not show:
         if ui.quiet:
             show = states[:4]
@@ -4722,6 +4738,9 @@ 
     stat = repo.status(node1, node2, m,
                        'ignored' in show, 'clean' in show, 'unknown' in show,
                        opts.get('subrepos'))
+    if terse:
+        stat = cmdutil.tersestatus(repo.root, stat, terse,
+                                    repo.dirstate._ignore, opts.get('ignored'))
     changestates = zip(states, pycompat.iterbytestr('MAR!?IC'), stat)
 
     if (opts.get('all') or opts.get('copies')
diff --git a/tests/test-completion.t b/tests/test-completion.t
--- a/tests/test-completion.t
+++ b/tests/test-completion.t
@@ -230,7 +230,7 @@ 
   push: force, rev, bookmark, branch, new-branch, ssh, remotecmd, insecure
   remove: after, force, subrepos, include, exclude
   serve: accesslog, daemon, daemon-postexec, errorlog, port, address, prefix, name, web-conf, webdir-conf, pid-file, stdio, cmdserver, templates, style, ipv6, certificate, subrepos
-  status: all, modified, added, removed, deleted, clean, unknown, ignored, no-status, copies, print0, rev, change, include, exclude, subrepos, template
+  status: all, modified, added, removed, deleted, clean, unknown, ignored, no-status, terse, copies, print0, rev, change, include, exclude, subrepos, template
   summary: remote
   update: clean, check, merge, date, rev, tool
   addremove: similarity, subrepos, include, exclude, dry-run
diff --git a/tests/test-terse-status.t b/tests/test-terse-status.t
new file mode 100644
--- /dev/null
+++ b/tests/test-terse-status.t
@@ -0,0 +1,185 @@ 
+  $ mkdir folder
+  $ cd folder
+  $ hg init
+  $ mkdir x x/l x/m x/n x/l/u x/l/u/a
+  $ touch a b x/aa.o x/bb.o
+  $ hg status
+  ? a
+  ? b
+  ? x/aa.o
+  ? x/bb.o
+
+  $ hg status --terse u
+  ? a
+  ? b
+  ? x/
+  $ hg status --terse maudric
+  ? a
+  ? b
+  ? x/
+  $ hg status --terse madric
+  ? a
+  ? b
+  ? x/aa.o
+  ? x/bb.o
+  $ hg status --terse f
+  abort: 'f' not recognized
+  [255]
+
+Add a .hgignore so that we can also have ignored files
+
+  $ echo ".*\.o" > .hgignore
+  $ hg status
+  ? .hgignore
+  ? a
+  ? b
+  $ hg status -i
+  I x/aa.o
+  I x/bb.o
+
+Tersing ignored files
+  $ hg status -t i --ignored
+  I x/
+
+Adding more files
+  $ mkdir y
+  $ touch x/aa x/bb y/l y/m y/l.o y/m.o
+  $ touch x/l/aa x/m/aa x/n/aa x/l/u/bb x/l/u/a/bb
+
+  $ hg status
+  ? .hgignore
+  ? a
+  ? b
+  ? x/aa
+  ? x/bb
+  ? x/l/aa
+  ? x/l/u/a/bb
+  ? x/l/u/bb
+  ? x/m/aa
+  ? x/n/aa
+  ? y/l
+  ? y/m
+
+  $ hg status --terse u
+  ? .hgignore
+  ? a
+  ? b
+  ? x/
+  ? y/
+
+  $ hg add x/aa x/bb .hgignore
+  $ hg status --terse au
+  A .hgignore
+  A x/aa
+  A x/bb
+  ? a
+  ? b
+  ? x/l/
+  ? x/m/
+  ? x/n/
+  ? y/
+
+Including ignored files
+
+  $ hg status --terse aui
+  A .hgignore
+  A x/aa
+  A x/bb
+  ? a
+  ? b
+  ? x/l/
+  ? x/m/
+  ? x/n/
+  ? y/l
+  ? y/m
+  $ hg status --terse au -i
+  I x/aa.o
+  I x/bb.o
+  I y/l.o
+  I y/m.o
+
+Committing some of the files
+
+  $ hg commit x/aa x/bb .hgignore -m "First commit"
+  $ hg status
+  ? a
+  ? b
+  ? x/l/aa
+  ? x/l/u/a/bb
+  ? x/l/u/bb
+  ? x/m/aa
+  ? x/n/aa
+  ? y/l
+  ? y/m
+  $ hg status --terse mardu
+  ? a
+  ? b
+  ? x/l/
+  ? x/m/
+  ? x/n/
+  ? y/
+
+Modifying already committed files
+
+  $ echo "Hello" >> x/aa
+  $ echo "World" >> x/bb
+  $ hg status --terse maurdc
+  M x/aa
+  M x/bb
+  ? a
+  ? b
+  ? x/l/
+  ? x/m/
+  ? x/n/
+  ? y/
+
+Respecting other flags
+
+  $ hg status --terse marduic --all
+  M x/aa
+  M x/bb
+  ? a
+  ? b
+  ? x/l/
+  ? x/m/
+  ? x/n/
+  ? y/l
+  ? y/m
+  I x/aa.o
+  I x/bb.o
+  I y/l.o
+  I y/m.o
+  C .hgignore
+  $ hg status --terse marduic -a
+  $ hg status --terse marduic -c
+  C .hgignore
+  $ hg status --terse marduic -m
+  M x/aa
+  M x/bb
+
+Passing 'i' in terse value will consider the ignored files while tersing
+
+  $ hg status --terse marduic -u
+  ? a
+  ? b
+  ? x/l/
+  ? x/m/
+  ? x/n/
+  ? y/l
+  ? y/m
+
+Omitting 'i' in terse value does not consider ignored files while tersing
+
+  $ hg status --terse marduc -u
+  ? a
+  ? b
+  ? x/l/
+  ? x/m/
+  ? x/n/
+  ? y/
+
+Trying with --rev
+
+  $ hg status --terse marduic --rev 0 --rev 1
+  abort: cannot use --terse with --rev
+  [255]