Patchwork [1,of,2] localrepo: add afterpush()

login
register
mail settings
Submitter Gregory Szorc
Date July 8, 2014, 3:43 a.m.
Message ID <30333e516874716b8d9b.1404790986@77.1.168.192.in-addr.arpa>
Download mbox | patch
Permalink /patch/5132/
State Changes Requested
Headers show

Comments

Gregory Szorc - July 8, 2014, 3:43 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1404785357 25200
#      Mon Jul 07 19:09:17 2014 -0700
# Node ID 30333e516874716b8d9b41450282d1f22b87eba1
# Parent  7537e57f5dbd0805aad3c9e5e4f7dd8bbf6bbde1
localrepo: add afterpush()

The exchange.pushoperation type exposes a lot of useful information
about the push. This information can be very valuable to extensions
wishing to perform additional activity at push time. For example, an
extension may wish to initiate code review against pushed changesets or
may wish to communicate to a 3rd party service that a push took place.

Until this patch, there was no easy way to do this. Extensions could
wrap localrepo.checkpush() to obtain an instance of the
exchange.pushoperation. They would have to stuff a reference somewhere
and later consult it after the wrapped localrepo.push() completed.
This was a bit hacky.

This patch introduces localrepo.afterpush() to complement
localrepo.checkpush(). It gives extensions a clear and easy-to-use
extensibility point for "do your post-push operations here."

Changing localrepo.push() to return a pushoperation was considered.
However, this would break a long-standing API and would likely break a
handful of extensions. While this API is internal, the author believed
it best to avoid the concern entirely by introducing a new method.
Pierre-Yves David - July 8, 2014, 9:31 a.m.
On 07/08/2014 05:43 AM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1404785357 25200
> #      Mon Jul 07 19:09:17 2014 -0700
> # Node ID 30333e516874716b8d9b41450282d1f22b87eba1
> # Parent  7537e57f5dbd0805aad3c9e5e4f7dd8bbf6bbde1
> localrepo: add afterpush()

The trend is to remove method from localrepo, not adding new ones.

(`localrepo.push(…)` was not thrown in fire as a courtesy to all the 
extension that may rely on it.)

The hook in itself is a good idea, but please add it in the 
`mercurial.exchange` module.
Gregory Szorc - July 8, 2014, 10:03 a.m.
On Jul 8, 2014, at 2:31, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
> 
> 
> 
>> On 07/08/2014 05:43 AM, Gregory Szorc wrote:
>> # HG changeset patch
>> # User Gregory Szorc <gregory.szorc@gmail.com>
>> # Date 1404785357 25200
>> #      Mon Jul 07 19:09:17 2014 -0700
>> # Node ID 30333e516874716b8d9b41450282d1f22b87eba1
>> # Parent  7537e57f5dbd0805aad3c9e5e4f7dd8bbf6bbde1
>> localrepo: add afterpush()
> 
> The trend is to remove method from localrepo, not adding new ones.
> 
> (`localrepo.push(…)` was not thrown in fire as a courtesy to all the extension that may rely on it.)
> 
> The hook in itself is a good idea, but please add it in the `mercurial.exchange` module.

An issue with moving things out of localrepo is that changes to module level symbols are global and apply to all repos. With localrepo, the pattern is to dynamically declare a new repo class during reposetup and only hack up the repo instances you want. I find this approach far more maintainable than peppering isinstance/hasattribute checks in monkeypatched module functions.

FWIW, this reasoning is why I'm a fan of grouping related functions in classes over the traditional Mercurial "style" of module level functions. With classes, I can go in with a scalpel and adjust __class__ on a per instance basis. When that instance is destroyed, the hacks die with it. Adjusting module symbols is juggling chainsaws by comparison. There's too many considerations, especially in persistent processes, such as hgweb and the command server. I'd prefer we limit the surface area for bugs by encouraging instance specific hacking over global/module hacking.
Pierre-Yves David - July 8, 2014, 10:38 a.m.
On 07/08/2014 12:03 PM, Gregory Szorc wrote:
> On Jul 8, 2014, at 2:31, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
>>
>>
>>
>>> On 07/08/2014 05:43 AM, Gregory Szorc wrote:
>>> # HG changeset patch
>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>> # Date 1404785357 25200
>>> #      Mon Jul 07 19:09:17 2014 -0700
>>> # Node ID 30333e516874716b8d9b41450282d1f22b87eba1
>>> # Parent  7537e57f5dbd0805aad3c9e5e4f7dd8bbf6bbde1
>>> localrepo: add afterpush()
>>
>> The trend is to remove method from localrepo, not adding new ones.
>>
>> (`localrepo.push(…)` was not thrown in fire as a courtesy to all the extension that may rely on it.)
>>
>> The hook in itself is a good idea, but please add it in the `mercurial.exchange` module.
>
> An issue with moving things out of localrepo is that changes to module level symbols are global and apply to all repos. With localrepo, the pattern is to dynamically declare a new repo class during reposetup and only hack up the repo instances you want. I find this approach far more maintainable than peppering isinstance/hasattribute checks in monkeypatched module functions.

This is not a massive issue as the module level function should be able 
to easily check if the repo has the extension enabled or not.

> FWIW, this reasoning is why I'm a fan of grouping related functions in classes over the traditional Mercurial "style" of module level functions. With classes, I can go in with a scalpel and adjust __class__ on a per instance basis. When that instance is destroyed, the hacks die with it. Adjusting module symbols is juggling chainsaws by comparison. There's too many considerations, especially in persistent processes, such as hgweb and the command server. I'd prefer we limit the surface area for bugs by encouraging instance specific hacking over global/module hacking.

The issue here is that all of the Mercurial logic ultimatly ends up on a 
single class, localrepo. And all the possible state that someone persist 
ends up on a single class, localrepo. This has proven to be a massive 
pain an we have been actively working on removing code from localrepo.

Push is a simple, (usually) one shot call. It has nothing to do on the 
localrepo class.
Gregory Szorc - July 8, 2014, 6:31 p.m.
On 7/8/14, 3:38 AM, Pierre-Yves David wrote:
>
>
> On 07/08/2014 12:03 PM, Gregory Szorc wrote:
>> On Jul 8, 2014, at 2:31, Pierre-Yves David
>> <pierre-yves.david@ens-lyon.org> wrote:
>>>
>>>
>>>
>>>> On 07/08/2014 05:43 AM, Gregory Szorc wrote:
>>>> # HG changeset patch
>>>> # User Gregory Szorc <gregory.szorc@gmail.com>
>>>> # Date 1404785357 25200
>>>> #      Mon Jul 07 19:09:17 2014 -0700
>>>> # Node ID 30333e516874716b8d9b41450282d1f22b87eba1
>>>> # Parent  7537e57f5dbd0805aad3c9e5e4f7dd8bbf6bbde1
>>>> localrepo: add afterpush()
>>>
>>> The trend is to remove method from localrepo, not adding new ones.
>>>
>>> (`localrepo.push(…)` was not thrown in fire as a courtesy to all the
>>> extension that may rely on it.)
>>>
>>> The hook in itself is a good idea, but please add it in the
>>> `mercurial.exchange` module.
>>
>> An issue with moving things out of localrepo is that changes to module
>> level symbols are global and apply to all repos. With localrepo, the
>> pattern is to dynamically declare a new repo class during reposetup
>> and only hack up the repo instances you want. I find this approach far
>> more maintainable than peppering isinstance/hasattribute checks in
>> monkeypatched module functions.
>
> This is not a massive issue as the module level function should be able
> to easily check if the repo has the extension enabled or not.

"extension enabled" does not imply "extension active for a given repo".

>> FWIW, this reasoning is why I'm a fan of grouping related functions in
>> classes over the traditional Mercurial "style" of module level
>> functions. With classes, I can go in with a scalpel and adjust
>> __class__ on a per instance basis. When that instance is destroyed,
>> the hacks die with it. Adjusting module symbols is juggling chainsaws
>> by comparison. There's too many considerations, especially in
>> persistent processes, such as hgweb and the command server. I'd prefer
>> we limit the surface area for bugs by encouraging instance specific
>> hacking over global/module hacking.
>
> The issue here is that all of the Mercurial logic ultimatly ends up on a
> single class, localrepo. And all the possible state that someone persist
> ends up on a single class, localrepo. This has proven to be a massive
> pain an we have been actively working on removing code from localrepo.

Sure. But at the same time, the collection of all the APIs on localrepo 
gives extension authors a clear target for extensibility points. As a 
newbie extension author, I would have loved the docs to say "open up 
localrepo.py and look at the methods on localrepo - those are what you 
should consider targeting first." This would have enabled me to write 
extensions easier and faster. Asking extension authors to sift through a 
mountain of modules is comparatively very difficult. Since the Python 
API is not stable, you are asking extension authors to step outside a 
safe zone. Having clear extensibility points [on classes via methods 
such as this patch introduced] is a huge win for extension authors.

I think our disagreement is mostly due to there not being an obvious 
solution to the problem of giving extensions clear extensibility points 
for per-repo behavior. I have a [crazy] idea.

Hooks are a pattern to solve this problem. I think they work well. 
However, the currents hooks system is tailored for out-of-process 
consumers. What if we supplemented that hooks system by adding a 
mechanism for Python-only hooks that is geared towards giving extensions 
a more clear API. Here is an example extension:

def onpushopcreate(repo, pushop):
     # Do stuff with a pushoperation before push

def onpushopfinish(repo, pushop):
     # Do stuff with a pushoperation after push

def reposetup(ui, repo):
     if not repo.local():
         return

     repo.addhook('pushopcreate', onpushopcreate)
     repo.addhook('pushopfinish', onpushopfinish)

We sort of have this today with localrepo.prepushoutgoinghooks. What if 
we made it generic?

It solves the problem the patch is trying to solve, solves it with 
per-repo instance hacking, avoids module level monkeypatching, and gives 
extension authors a clear set of recommended extension points. I think 
there are wins all around.

What do you think? I would happily implement this if given the green light.
Matt Mackall - July 9, 2014, 8:10 p.m.
On Tue, 2014-07-08 at 03:03 -0700, Gregory Szorc wrote:

> FWIW, this reasoning is why I'm a fan of grouping related functions in
> classes over the traditional Mercurial "style" of module level
> functions. With classes, I can go in with a scalpel and adjust
> __class__ on a per instance basis. When that instance is destroyed,
> the hacks die with it. Adjusting module symbols is juggling chainsaws
> by comparison. There's too many considerations, especially in
> persistent processes, such as hgweb and the command server. I'd prefer
> we limit the surface area for bugs by encouraging instance specific
> hacking over global/module hacking.

That line of reasoning is all very good and I'd be inclined to agree
with you if extensions were first-class citizens. But they're explicitly
not: we intentionally sacrifice maintainability of modules for
maintainability of the core. And the complexity of localrepo in
particular has become a very obvious maintenance problem.

However, if this is a big deal, perhaps the answer is a helper in
extensions.py that calls your replacement function if the repo arg is of
a particular class. This probably makes sense for both function and
command wrappers.

Relatedly, lots of our monkeypatch setup work could probably be replaced
with decorators.


However, we can perhaps add a magic wrapper to extensions that does some
sort of class matching on repo args for you.
Augie Fackler - July 11, 2014, 5:17 p.m.
On Wed, Jul 09, 2014 at 03:10:14PM -0500, Matt Mackall wrote:
> On Tue, 2014-07-08 at 03:03 -0700, Gregory Szorc wrote:
>
> > FWIW, this reasoning is why I'm a fan of grouping related functions in
> > classes over the traditional Mercurial "style" of module level
> > functions. With classes, I can go in with a scalpel and adjust
> > __class__ on a per instance basis. When that instance is destroyed,
> > the hacks die with it. Adjusting module symbols is juggling chainsaws
> > by comparison. There's too many considerations, especially in
> > persistent processes, such as hgweb and the command server. I'd prefer
> > we limit the surface area for bugs by encouraging instance specific
> > hacking over global/module hacking.
>
> That line of reasoning is all very good and I'd be inclined to agree
> with you if extensions were first-class citizens. But they're explicitly
> not: we intentionally sacrifice maintainability of modules for
> maintainability of the core. And the complexity of localrepo in
> particular has become a very obvious maintenance problem.
>
> However, if this is a big deal, perhaps the answer is a helper in
> extensions.py that calls your replacement function if the repo arg is of
> a particular class. This probably makes sense for both function and
> command wrappers.

hgsubversion has wrappers along these lines - perhaps those could be a
good guide for this extension hackery as well.

> Relatedly, lots of our monkeypatch setup work could probably be replaced
> with decorators.
>
>
> However, we can perhaps add a magic wrapper to extensions that does some
> sort of class matching on repo args for you.
>
> --
> Mathematics is the supreme nostalgia of our time.
>
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
Gregory Szorc - July 12, 2014, 5:42 p.m.
On 7/9/2014 1:10 PM, Matt Mackall wrote:
> On Tue, 2014-07-08 at 03:03 -0700, Gregory Szorc wrote:
> 
>> FWIW, this reasoning is why I'm a fan of grouping related functions in
>> classes over the traditional Mercurial "style" of module level
>> functions. With classes, I can go in with a scalpel and adjust
>> __class__ on a per instance basis. When that instance is destroyed,
>> the hacks die with it. Adjusting module symbols is juggling chainsaws
>> by comparison. There's too many considerations, especially in
>> persistent processes, such as hgweb and the command server. I'd prefer
>> we limit the surface area for bugs by encouraging instance specific
>> hacking over global/module hacking.
> 
> That line of reasoning is all very good and I'd be inclined to agree
> with you if extensions were first-class citizens. But they're explicitly
> not: we intentionally sacrifice maintainability of modules for
> maintainability of the core. And the complexity of localrepo in
> particular has become a very obvious maintenance problem.
> 
> However, if this is a big deal, perhaps the answer is a helper in
> extensions.py that calls your replacement function if the repo arg is of
> a particular class. This probably makes sense for both function and
> command wrappers.
> 
> Relatedly, lots of our monkeypatch setup work could probably be replaced
> with decorators.
> 
> 
> However, we can perhaps add a magic wrapper to extensions that does some
> sort of class matching on repo args for you.

I don't think this approach will work because I don't believe there's a
good way to get a reference to a dynamically created class to perform
the isinstance() check. e.g.

class Base(object):
    pass

def changeclass(o):
    class Child(o.__class__):
        pass

    o.__class__ = Child

c = Base()
changeclass(c)

print(c)
> <__main__.Child object at 0x7f14836ad450>

print(c.__class__)
> <class '__main__.Child'>

print(dir(changeclass))
['__call__', '__class__', '__closure__', '__code__', '__defaults__',
'__delattr__', '__dict__', '__doc__', '__format__', '__get__',
'__getattribute__', '__globals__', '__hash__', '__init__', '__module__',
'__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__',
'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc',
'func_globals', 'func_name']

print(sorted(globals()))
['Base', '__builtins__', '__doc__', '__file__', '__name__',
'__package__', 'c', 'changeclass']

print(c.__class__.__module__)
__main__

import sys
print(sorted(dir(sys.modules['__main__'])))
['Base', '__builtins__', '__doc__', '__file__', '__name__',
'__package__', 'c', 'changeclass', 'sys']

Furthermore,

c2 = Base()
changeclass(c2)
c.__class__ == c2.__class__
False
isinstance(c2, c.__class__)
False

Such is the nature of dynamic class creation. Each function call
produces a new class type. The only handle on it is the .__class__ of
the instance it is attached to.

You /could/ look at class names. e.g. c.__class__.__name__. But I'm
going to argue that isn't robust enough.

I'm pretty sure we need to store state on each localrepo instance for
this to work. I'll send a few alternative patches to the list. I hope
one will be liked.

Please disregard this patch series.

Patch

diff --git a/mercurial/exchange.py b/mercurial/exchange.py
--- a/mercurial/exchange.py
+++ b/mercurial/exchange.py
@@ -148,8 +148,9 @@  def push(repo, remote, force=False, revs
         if locallock is not None:
             locallock.release()
 
     _pushbookmark(pushop)
+    pushop.repo.afterpush(pushop)
     return pushop.ret
 
 def _pushdiscovery(pushop):
     # discovery
diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1588,8 +1588,24 @@  class localrepository(object):
 
     def push(self, remote, force=False, revs=None, newbranch=False):
         return exchange.push(self, remote, force, revs, newbranch)
 
+    def afterpush(self, pushop):
+        """Extensibility point to do work after push.
+
+        Extensions can wrap this method if they wish to perform additional
+        work after a push has completed.
+
+        The method receives an exchange.pushoperation instance.
+
+        When this is called, the lock on both the local and remote repositories
+        has already been released.
+
+        This is called even if the remote rejected the push. Extensions should
+        consult pushop.ret to determine whether the push was successful.
+        """
+        pass
+
     def stream_in(self, remote, requirements):
         lock = self.lock()
         try:
             # Save remote branchmap. We will use it later