Patchwork [3,of,4,V3] tags: add mergemode flag to _readtags

login
register
mail settings
Submitter Angel Ezquerra
Date June 25, 2014, 11:22 p.m.
Message ID <488f01d33770c338ebe9.1403738529@ubuntu>
Download mbox | patch
Permalink /patch/5066/
State Changes Requested
Headers show

Comments

Angel Ezquerra - June 25, 2014, 11:22 p.m.
# HG changeset patch
# User Angel Ezquerra <angel.ezquerra@gmail.com>
# Date 1403677632 -7200
#      Wed Jun 25 08:27:12 2014 +0200
# Node ID 488f01d33770c338ebe987b4efbd986b0ef91ed0
# Parent  10ffd0db8fea2e7324233136b262822e775c45a5
tags: add mergemode flag to _readtags

When this flag is enabled the output of the function changes in a way that will
make it easier to merge tags read using the _readtags function.

By default _readtags returns a sortdict of tags where each item in the sortdict
correponds to a two element tuple containing the current tag node and the node
history of the tag.

When the mergemode flag is enabled the output is a sortdict mapping each tag to
a list of [hexnode, line number] pairs. That is, each element is a pair where
the first item is a hex node and the second is the line number where that
hexnode was found on the tag file being read. Pairs are ordered from the oldest
to the most recent, current hexnode.

This is not used yet, but will be used on a follow up patch that will introduce
an automatic tag merge algorithm.
Matt Mackall - June 26, 2014, 11:38 p.m.
On Thu, 2014-06-26 at 01:22 +0200, Angel Ezquerra wrote:
> # HG changeset patch
> # User Angel Ezquerra <angel.ezquerra@gmail.com>
> # Date 1403677632 -7200
> #      Wed Jun 25 08:27:12 2014 +0200
> # Node ID 488f01d33770c338ebe987b4efbd986b0ef91ed0
> # Parent  10ffd0db8fea2e7324233136b262822e775c45a5
> tags: add mergemode flag to _readtags
...
> +    The output changes depending on the mergemode flag:
> +    - When mergemode is False (the default), return a mapping from tag name to
> +      (node, hist): node is the node id from the last line read for that name,
> +      and hist is the list of node ids previously associated with it (in file
> +      order).  All node ids are binary, not hex.
> +    - When mergemode is True, return a mapping from tag name to a list of
> +      [hexnode, line number] pairs, ordered from the oldest to the newest node.
> +      hex node is hex not binary.
> +      '''

I generally frown on functions that change return type based on
argument. Perhaps we can have:

_readtags() -> _readtagshist()

..assuming the overhead of tracking the extra history isn't too high.
Angel Ezquerra - June 27, 2014, 7:46 a.m.
On Fri, Jun 27, 2014 at 1:38 AM, Matt Mackall <mpm@selenic.com> wrote:
> On Thu, 2014-06-26 at 01:22 +0200, Angel Ezquerra wrote:
>> # HG changeset patch
>> # User Angel Ezquerra <angel.ezquerra@gmail.com>
>> # Date 1403677632 -7200
>> #      Wed Jun 25 08:27:12 2014 +0200
>> # Node ID 488f01d33770c338ebe987b4efbd986b0ef91ed0
>> # Parent  10ffd0db8fea2e7324233136b262822e775c45a5
>> tags: add mergemode flag to _readtags
> ...
>> +    The output changes depending on the mergemode flag:
>> +    - When mergemode is False (the default), return a mapping from tag name to
>> +      (node, hist): node is the node id from the last line read for that name,
>> +      and hist is the list of node ids previously associated with it (in file
>> +      order).  All node ids are binary, not hex.
>> +    - When mergemode is True, return a mapping from tag name to a list of
>> +      [hexnode, line number] pairs, ordered from the oldest to the newest node.
>> +      hex node is hex not binary.
>> +      '''
>
> I generally frown on functions that change return type based on
> argument. Perhaps we can have:
>
> _readtags() -> _readtagshist()
>
> ..assuming the overhead of tracking the extra history isn't too high.

I don't like doing that much either so I'll be happy to try something
else. I don't want to duplicate _readtags() on the new tagmerge
module, so the tags._readtags() function needs to change a bit
though...

I think that if _readtagshist() had two separte outputs (e.g. one with
the original _readtags() format) and another one with just the line
numbers) the performance hit should be fine (managing the extra dict
plus the extra function call). We could improve things further by
passing an extra flag to _readtagshist() which would disable the
calculation of the second output dict (e.g. make it return None). Then
the only overhead would be the extra function call and an additional
comparison for each tag.

I'll give it a try and send a new patch series.

Thanks,

Angel

Patch

diff --git a/mercurial/tags.py b/mercurial/tags.py
--- a/mercurial/tags.py
+++ b/mercurial/tags.py
@@ -75,20 +75,25 @@ 
         recode=encoding.fromlocal)
     _updatetags(filetags, "local", alltags, tagtypes)
 
-def _readtags(ui, repo, lines, fn, recode=None):
+def _readtags(ui, repo, lines, fn, recode=None, mergemode=False):
     '''Read tag definitions from a file (or any source of lines).
-    Return a mapping from tag name to (node, hist): node is the node id
-    from the last line read for that name, and hist is the list of node
-    ids previously associated with it (in file order).  All node ids are
-    binary, not hex.'''
+    The output changes depending on the mergemode flag:
+    - When mergemode is False (the default), return a mapping from tag name to
+      (node, hist): node is the node id from the last line read for that name,
+      and hist is the list of node ids previously associated with it (in file
+      order).  All node ids are binary, not hex.
+    - When mergemode is True, return a mapping from tag name to a list of
+      [hexnode, line number] pairs, ordered from the oldest to the newest node.
+      hex node is hex not binary.
+      '''
 
-    filetags = util.sortdict()  # map tag name to (node, hist)
+    filetags = util.sortdict()
     count = 0
 
     def warn(msg):
         ui.warn(_("%s, line %s: %s\n") % (fn, count, msg))
 
-    for line in lines:
+    for nline, line in enumerate(lines):
         count += 1
         if not line:
             continue
@@ -108,6 +113,13 @@ 
 
         # update filetags
         hist = []
+        if mergemode:
+            # map tag name to list of [node, line number]
+            if name not in filetags:
+                filetags[name] = []
+            filetags[name].append([nodehex, nline])
+            continue
+        # map tag name to (node, hist)
         if name in filetags:
             n, hist = filetags[name]
             hist.append(n)