Patchwork [2,of,8] discovery: move code for creating outgoing from heads and common (API)

login
register
mail settings
Submitter Gregory Szorc
Date Aug. 5, 2016, 3:16 a.m.
Message ID <19d16c9bce2fdf86d8c8.1470367018@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/16106/
State Accepted
Headers show

Comments

Gregory Szorc - Aug. 5, 2016, 3:16 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1470365488 25200
#      Thu Aug 04 19:51:28 2016 -0700
# Node ID 19d16c9bce2fdf86d8c84915a45815bb7d4fd932
# Parent  8a046444390debab5372de0b4f38863c57058f69
discovery: move code for creating outgoing from heads and common (API)

We recently moved similar code from changegroup.py to
discovery.outgoingbetween(). It makes sense for creation of outgoing
instances to all live together in discovery.py. So move
changegroup.computeoutgoing() to discovery.outgoingheadsandcommon() and
update consumers to use the new function.
Pierre-Yves David - Aug. 6, 2016, 12:37 a.m.
On 08/05/2016 05:16 AM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1470365488 25200
> #      Thu Aug 04 19:51:28 2016 -0700
> # Node ID 19d16c9bce2fdf86d8c84915a45815bb7d4fd932
> # Parent  8a046444390debab5372de0b4f38863c57058f69
> discovery: move code for creating outgoing from heads and common (API)
>
> We recently moved similar code from changegroup.py to
> discovery.outgoingbetween(). It makes sense for creation of outgoing
> instances to all live together in discovery.py. So move
> changegroup.computeoutgoing() to discovery.outgoingheadsandcommon() and
> update consumers to use the new function.

I like the general idea of cleaning the changegroup API. but I think we 
can move toward something even saner:

With your series we end up with 3 way to obtains an outgoing object

- outgoingbetween(repo, roots, heads)
   # root can be empty//False

- outgoingheadsandcommon(repo, heads, common)
   # common will be filtered from locally missing
   # heads can be empty (means all heads)

- outgoing(revlog, commonheads, missingheads):

That is probably 2 more function than we want.

First, even if the object will only bear a changelog object, all 
creators seems to have access to a repository anyway. So we can probably 
make the class internal and have all caller go through a builder function.

Second, we can probably have a single function handling all this with 
default value:

- outgoing(repo, roots=None, commonheads=None, heads=None)
   # 'roots' and 'commonheads' cannot be both specified
   # if neither 'roots' or 'commonheads' is set, 'commonheads' is nullid
   # commonheads is filtered if specified
   # heads=None mean 'all heads in the repo

In my opinion having a single high level API would make things much simpler.
Gregory Szorc - Aug. 6, 2016, 2:15 a.m.
> On Aug 5, 2016, at 17:37, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
> 
> 
> 
>> On 08/05/2016 05:16 AM, Gregory Szorc wrote:
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc@gmail.com>
>> # Date 1470365488 25200
>> #      Thu Aug 04 19:51:28 2016 -0700
>> # Node ID 19d16c9bce2fdf86d8c84915a45815bb7d4fd932
>> # Parent  8a046444390debab5372de0b4f38863c57058f69
>> discovery: move code for creating outgoing from heads and common (API)
>> 
>> We recently moved similar code from changegroup.py to
>> discovery.outgoingbetween(). It makes sense for creation of outgoing
>> instances to all live together in discovery.py. So move
>> changegroup.computeoutgoing() to discovery.outgoingheadsandcommon() and
>> update consumers to use the new function.
> 
> I like the general idea of cleaning the changegroup API. but I think we can move toward something even saner:
> 
> With your series we end up with 3 way to obtains an outgoing object
> 
> - outgoingbetween(repo, roots, heads)
>  # root can be empty//False
> 
> - outgoingheadsandcommon(repo, heads, common)
>  # common will be filtered from locally missing
>  # heads can be empty (means all heads)
> 
> - outgoing(revlog, commonheads, missingheads):
> 
> That is probably 2 more function than we want.
> 
> First, even if the object will only bear a changelog object, all creators seems to have access to a repository anyway. So we can probably make the class internal and have all caller go through a builder function.
> 
> Second, we can probably have a single function handling all this with default value:
> 
> - outgoing(repo, roots=None, commonheads=None, heads=None)
>  # 'roots' and 'commonheads' cannot be both specified
>  # if neither 'roots' or 'commonheads' is set, 'commonheads' is nullid
>  # commonheads is filtered if specified
>  # heads=None mean 'all heads in the repo
> 
> In my opinion having a single high level API would make things much simpler.

I agree there is room to improve these APIs. However, that's work for a follow-up series. It's not something that interests me greatly. So if you want to write the patches, go for it.
Augie Fackler - Aug. 6, 2016, 2:26 p.m.
On Fri, Aug 05, 2016 at 07:15:37PM -0700, Gregory Szorc wrote:
>
>
> > On Aug 5, 2016, at 17:37, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
> >
> >
> >
> >> On 08/05/2016 05:16 AM, Gregory Szorc wrote:
> >> # HG changeset patch
> >> # User Gregory Szorc <gregory.szorc@gmail.com>
> >> # Date 1470365488 25200
> >> #      Thu Aug 04 19:51:28 2016 -0700
> >> # Node ID 19d16c9bce2fdf86d8c84915a45815bb7d4fd932
> >> # Parent  8a046444390debab5372de0b4f38863c57058f69
> >> discovery: move code for creating outgoing from heads and common (API)
> >>
> >> We recently moved similar code from changegroup.py to
> >> discovery.outgoingbetween(). It makes sense for creation of outgoing
> >> instances to all live together in discovery.py. So move
> >> changegroup.computeoutgoing() to discovery.outgoingheadsandcommon() and
> >> update consumers to use the new function.
> >
> > I like the general idea of cleaning the changegroup API. but I think we can move toward something even saner:
> >
> > With your series we end up with 3 way to obtains an outgoing object
> >
> > - outgoingbetween(repo, roots, heads)
> >  # root can be empty//False
> >
> > - outgoingheadsandcommon(repo, heads, common)
> >  # common will be filtered from locally missing
> >  # heads can be empty (means all heads)
> >
> > - outgoing(revlog, commonheads, missingheads):
> >
> > That is probably 2 more function than we want.
> >
> > First, even if the object will only bear a changelog object, all creators seems to have access to a repository anyway. So we can probably make the class internal and have all caller go through a builder function.
> >
> > Second, we can probably have a single function handling all this with default value:
> >
> > - outgoing(repo, roots=None, commonheads=None, heads=None)
> >  # 'roots' and 'commonheads' cannot be both specified
> >  # if neither 'roots' or 'commonheads' is set, 'commonheads' is nullid
> >  # commonheads is filtered if specified
> >  # heads=None mean 'all heads in the repo
> >
> > In my opinion having a single high level API would make things much simpler.
>
> I agree there is room to improve these APIs. However, that's work for a follow-up series. It's not something that interests me greatly. So if you want to write the patches, go for it.

I felt like this series was a good incremental step towards being able
to have a single coherent UI, whether or not Greg does that work.

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Pierre-Yves David - Aug. 7, 2016, 11:41 p.m.
On 08/06/2016 04:15 AM, Gregory Szorc wrote:
>
>
>> On Aug 5, 2016, at 17:37, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
>>
>>
>>
>>> On 08/05/2016 05:16 AM, Gregory Szorc wrote:
>>> # HG changeset patch
>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>> # Date 1470365488 25200
>>> #      Thu Aug 04 19:51:28 2016 -0700
>>> # Node ID 19d16c9bce2fdf86d8c84915a45815bb7d4fd932
>>> # Parent  8a046444390debab5372de0b4f38863c57058f69
>>> discovery: move code for creating outgoing from heads and common (API)
>>>
>>> We recently moved similar code from changegroup.py to
>>> discovery.outgoingbetween(). It makes sense for creation of outgoing
>>> instances to all live together in discovery.py. So move
>>> changegroup.computeoutgoing() to discovery.outgoingheadsandcommon() and
>>> update consumers to use the new function.
>>
>> I like the general idea of cleaning the changegroup API. but I think we can move toward something even saner:
>>
>> With your series we end up with 3 way to obtains an outgoing object
>>
>> - outgoingbetween(repo, roots, heads)
>>  # root can be empty//False
>>
>> - outgoingheadsandcommon(repo, heads, common)
>>  # common will be filtered from locally missing
>>  # heads can be empty (means all heads)
>>
>> - outgoing(revlog, commonheads, missingheads):
>>
>> That is probably 2 more function than we want.
>>
>> First, even if the object will only bear a changelog object, all creators seems to have access to a repository anyway. So we can probably make the class internal and have all caller go through a builder function.
>>
>> Second, we can probably have a single function handling all this with default value:
>>
>> - outgoing(repo, roots=None, commonheads=None, heads=None)
>>  # 'roots' and 'commonheads' cannot be both specified
>>  # if neither 'roots' or 'commonheads' is set, 'commonheads' is nullid
>>  # commonheads is filtered if specified
>>  # heads=None mean 'all heads in the repo
>>
>> In my opinion having a single high level API would make things much simpler.
>
> I agree there is room to improve these APIs. However, that's work for a follow-up series. It's not something that interests me greatly. So if you want to write the patches, go for it.

My main issue here is that it make some of the craziness of changegroup 
escape into other module. If we are to clean up this changegroup things, 
cleaning up the API in the process does not seems like a huges overhead. 
You are touching code that is messy and sensitive enough that some round 
trip of discussion is expected before we are happy about the API we are 
building.

Cheers,
Gregory Szorc - Aug. 8, 2016, 1:34 a.m.
On Sun, Aug 7, 2016 at 4:41 PM, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 08/06/2016 04:15 AM, Gregory Szorc wrote:
>
>>
>>
>> On Aug 5, 2016, at 17:37, Pierre-Yves David <
>>> pierre-yves.david@ens-lyon.org> wrote:
>>>
>>>
>>>
>>> On 08/05/2016 05:16 AM, Gregory Szorc wrote:
>>>> # HG changeset patch
>>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>>> # Date 1470365488 25200
>>>> #      Thu Aug 04 19:51:28 2016 -0700
>>>> # Node ID 19d16c9bce2fdf86d8c84915a45815bb7d4fd932
>>>> # Parent  8a046444390debab5372de0b4f38863c57058f69
>>>> discovery: move code for creating outgoing from heads and common (API)
>>>>
>>>> We recently moved similar code from changegroup.py to
>>>> discovery.outgoingbetween(). It makes sense for creation of outgoing
>>>> instances to all live together in discovery.py. So move
>>>> changegroup.computeoutgoing() to discovery.outgoingheadsandcommon() and
>>>> update consumers to use the new function.
>>>>
>>>
>>> I like the general idea of cleaning the changegroup API. but I think we
>>> can move toward something even saner:
>>>
>>> With your series we end up with 3 way to obtains an outgoing object
>>>
>>> - outgoingbetween(repo, roots, heads)
>>>  # root can be empty//False
>>>
>>> - outgoingheadsandcommon(repo, heads, common)
>>>  # common will be filtered from locally missing
>>>  # heads can be empty (means all heads)
>>>
>>> - outgoing(revlog, commonheads, missingheads):
>>>
>>> That is probably 2 more function than we want.
>>>
>>> First, even if the object will only bear a changelog object, all
>>> creators seems to have access to a repository anyway. So we can probably
>>> make the class internal and have all caller go through a builder function.
>>>
>>> Second, we can probably have a single function handling all this with
>>> default value:
>>>
>>> - outgoing(repo, roots=None, commonheads=None, heads=None)
>>>  # 'roots' and 'commonheads' cannot be both specified
>>>  # if neither 'roots' or 'commonheads' is set, 'commonheads' is nullid
>>>  # commonheads is filtered if specified
>>>  # heads=None mean 'all heads in the repo
>>>
>>> In my opinion having a single high level API would make things much
>>> simpler.
>>>
>>
>> I agree there is room to improve these APIs. However, that's work for a
>> follow-up series. It's not something that interests me greatly. So if you
>> want to write the patches, go for it.
>>
>
> My main issue here is that it make some of the craziness of changegroup
> escape into other module.


Changegroup craziness already escapes into other modules through a highly
complicated API with redundant functions that all do nearly the same thing,
but not obviously.


> If we are to clean up this changegroup things, cleaning up the API in the
> process does not seems like a huges overhead. You are touching code that is
> messy and sensitive enough that some round trip of discussion is expected
> before we are happy about the API we are building.
>

One of the reasons I'm touching this code is *because* it is messy and
sensitive. I contend that nearly anything is better than what we have today.

I've submitted 2 series that clean up this code. One of them even got
queued - only to be unqueued later. This is quite frustrating, as it is
blocking other work on bundling.

I'm not sure what kind of refactor will pass your review. If you jot down
your ideas for an API, I'll implement them if I agree they are good.
Otherwise, I kindly ask you to consider that perfect is the enemy of done
and to let this refactoring proceed. Worst case, we refactor the API again
later. It's not like the internal API is stable or anything ;)

Patch

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -967,46 +967,27 @@  def getlocalchangegroup(repo, source, ou
 
     This is only implemented for local repos and reuses potentially
     precomputed sets in outgoing."""
     if not outgoing.missing:
         return None
     bundler = getbundler(version, repo, bundlecaps)
     return getsubset(repo, outgoing, bundler, source)
 
-def computeoutgoing(repo, heads, common):
-    """Computes which revs are outgoing given a set of common
-    and a set of heads.
-
-    This is a separate function so extensions can have access to
-    the logic.
-
-    Returns a discovery.outgoing object.
-    """
-    cl = repo.changelog
-    if common:
-        hasnode = cl.hasnode
-        common = [n for n in common if hasnode(n)]
-    else:
-        common = [nullid]
-    if not heads:
-        heads = cl.heads()
-    return discovery.outgoing(cl, common, heads)
-
 def getchangegroup(repo, source, heads=None, common=None, bundlecaps=None,
                    version='01'):
     """Like changegroupsubset, but returns the set difference between the
     ancestors of heads and the ancestors common.
 
     If heads is None, use the local heads. If common is None, use [nullid].
 
     The nodes in common might not all be known locally due to the way the
     current discovery protocol works.
     """
-    outgoing = computeoutgoing(repo, heads, common)
+    outgoing = discovery.outgoingheadsandcommon(repo, heads, common)
     return getlocalchangegroup(repo, source, outgoing, bundlecaps=bundlecaps,
                                version=version)
 
 def changegroup(repo, basenodes, source):
     # to avoid a race we use changegroupsubset() (issue1320)
     return changegroupsubset(repo, basenodes, repo.heads(), source)
 
 def _addchangegroupfiles(repo, source, revmap, trp, expectedfiles, needfiles):
diff --git a/mercurial/discovery.py b/mercurial/discovery.py
--- a/mercurial/discovery.py
+++ b/mercurial/discovery.py
@@ -117,16 +117,34 @@  def outgoingbetween(repo, roots, heads):
     # TODO remove call to nodesbetween.
     # TODO populate attributes on outgoing instance instead of setting
     # discbases.
     csets, roots, heads = cl.nodesbetween(roots, heads)
     included = set(csets)
     discbases = [n for n in discbases if n not in included]
     return outgoing(cl, discbases, heads)
 
+def outgoingheadsandcommon(repo, heads, common):
+    """Create an ``outgoing`` from heads and common nodes.
+
+    ``heads`` is a list of binary head nodes or None to use existing
+    heads. ``common`` is a list of binary nodes or ``None`` to indicate
+    no common nodes.
+    """
+    cl = repo.changelog
+    if common:
+        hasnode = cl.hasnode
+        common = [n for n in common if hasnode(n)]
+    else:
+        common = [nullid]
+
+    heads = heads or cl.heads()
+
+    return outgoing(cl, common, heads)
+
 def findcommonoutgoing(repo, other, onlyheads=None, force=False,
                        commoninc=None, portable=False):
     '''Return an outgoing instance to identify the nodes present in repo but
     not in other.
 
     If onlyheads is given, only nodes ancestral to nodes in onlyheads
     (inclusive) are included. If you already know the local repo's heads,
     passing them in onlyheads is faster than letting them be recomputed here.
diff --git a/mercurial/exchange.py b/mercurial/exchange.py
--- a/mercurial/exchange.py
+++ b/mercurial/exchange.py
@@ -1567,17 +1567,17 @@  def _getbundlechangegrouppart(bundler, r
     version = '01'
     cgversions = b2caps.get('changegroup')
     if cgversions:  # 3.1 and 3.2 ship with an empty value
         cgversions = [v for v in cgversions
                       if v in changegroup.supportedoutgoingversions(repo)]
         if not cgversions:
             raise ValueError(_('no common changegroup version'))
         version = max(cgversions)
-    outgoing = changegroup.computeoutgoing(repo, heads, common)
+    outgoing = discovery.outgoingheadsandcommon(repo, heads, common)
     cg = changegroup.getlocalchangegroupraw(repo, source, outgoing,
                                             bundlecaps=bundlecaps,
                                             version=version)
 
     if not cg:
         return
 
     part = bundler.newpart('changegroup', data=cg)
@@ -1622,17 +1622,17 @@  def _getbundletagsfnodes(bundler, repo, 
     filenodes raw values.
     """
     # Don't send unless:
     # - changeset are being exchanged,
     # - the client supports it.
     if not (kwargs.get('cg', True) and 'hgtagsfnodes' in b2caps):
         return
 
-    outgoing = changegroup.computeoutgoing(repo, heads, common)
+    outgoing = discovery.outgoingheadsandcommon(repo, heads, common)
 
     if not outgoing.missingheads:
         return
 
     cache = tags.hgtagsfnodescache(repo.unfiltered())
     chunks = []
 
     # .hgtags fnodes are only relevant for head changesets. While we could