Patchwork [2,of,6] localrepo: add internal "pyhooks" hooks mechanism

login
register
mail settings
Submitter Gregory Szorc
Date July 12, 2014, 8:12 p.m.
Message ID <8fa7b8de6cf5c6f809ca.1405195942@vm-ubuntu-main.gateway.sonic.net>
Download mbox | patch
Permalink /patch/5148/
State Changes Requested
Headers show

Comments

Gregory Szorc - July 12, 2014, 8:12 p.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1405188833 25200
#      Sat Jul 12 11:13:53 2014 -0700
# Node ID 8fa7b8de6cf5c6f809cabfd530d7f14383a18c2c
# Parent  ab912eb22894240c60ff757db60e06e343fef6a7
localrepo: add internal "pyhooks" hooks mechanism

The intent of this feature addition is to give extensions (and possibly
even core code) a better way than monkeypatching/wrapping to
modify/supplement run-time behavior.

Hooks have a few significant advantages over monkeypatching/wrapping:

* Extensibility points are well defined. If Mercurial maintainers wish
  to clearly mark an activity as extensible, they can add a pyhook
  for it. Contrast with monkeypatching/wrapping, where it isn't always
  clear what the risks with each function are. Hooks give extension
  authors a clear set of points from which to start extending.

* They are instance-specific. Monkeypatching often results in changing
  symbols on modules as opposed to a per-instance basis. Extensions may
  wish to modify behavior for classes of a certain type or perhaps even
  specific instances of a specific class. Globally modifying functions
  and then filtering for applicability at run-time can be difficult
  and dangerous. Beginning extension authors may not realize the full
  impact of global changes, especially in "shared" process spaces, such
  as hgweb or the command server. Per-instance hooks are much safer.

The patch author considered an alternative implementation that
introduced hooks.addrepohook() or extensions.addrepohook() and
hooks.runrepohook() or extensions.runrepohook(). In the mind of
the patch author, the choice of where the API should live and the
names of the APIs (the author concedes "pyhook" isn't a great
name but can think of nothing better) are better decided by
someone with more experience than him. The author anticipates
much bikeshedding on this patch.
Gregory Szorc - July 12, 2014, 9:05 p.m.
On 7/12/2014 1:12 PM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1405188833 25200
> #      Sat Jul 12 11:13:53 2014 -0700
> # Node ID 8fa7b8de6cf5c6f809cabfd530d7f14383a18c2c
> # Parent  ab912eb22894240c60ff757db60e06e343fef6a7
> localrepo: add internal "pyhooks" hooks mechanism
> 
> The intent of this feature addition is to give extensions (and possibly
> even core code) a better way than monkeypatching/wrapping to
> modify/supplement run-time behavior.
> 
> Hooks have a few significant advantages over monkeypatching/wrapping:
> 
> * Extensibility points are well defined. If Mercurial maintainers wish
>   to clearly mark an activity as extensible, they can add a pyhook
>   for it. Contrast with monkeypatching/wrapping, where it isn't always
>   clear what the risks with each function are. Hooks give extension
>   authors a clear set of points from which to start extending.
> 
> * They are instance-specific. Monkeypatching often results in changing
>   symbols on modules as opposed to a per-instance basis. Extensions may
>   wish to modify behavior for classes of a certain type or perhaps even
>   specific instances of a specific class. Globally modifying functions
>   and then filtering for applicability at run-time can be difficult
>   and dangerous. Beginning extension authors may not realize the full
>   impact of global changes, especially in "shared" process spaces, such
>   as hgweb or the command server. Per-instance hooks are much safer.
> 
> The patch author considered an alternative implementation that
> introduced hooks.addrepohook() or extensions.addrepohook() and
> hooks.runrepohook() or extensions.runrepohook(). In the mind of
> the patch author, the choice of where the API should live and the
> names of the APIs (the author concedes "pyhook" isn't a great
> name but can think of nothing better) are better decided by
> someone with more experience than him. The author anticipates
> much bikeshedding on this patch.
> 
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -300,8 +300,11 @@ class localrepository(object):
>          # - working directory parent change,
>          # - bookmark changes
>          self.filteredrevcache = {}
>  
> +        # Maps names to list of callables.
> +        self._hooks = {}
> +
>      def close(self):
>          pass
>  
>      def _restrictcapabilities(self, caps):
> @@ -485,8 +488,50 @@ class localrepository(object):
>          replacing code that is expected to call a hook.
>          """
>          return hook.hook(self.ui, self, name, throw, **args)
>  
> +    def addpyhook(self, name, fn):
> +        """Register a Python hook against this repo instance.
> +
> +        It is common to want to execute some Python code when certain events
> +        occur. This is common in extensions. This method provides a
> +        registration mechanism to do that.
> +
> +        This method receives the name of a hook, name, and a callable, fn.
> +
> +        If the hook name is not known, KeyError will be raised. This means
> +        that if a hook is deleted, extensions will fail fast unless they catch
> +        KeyError.
> +
> +        The Mercurial API does not make any guarantees about the stability
> +        of arguments passed to any called hook. However, an effort is made
> +        to avoid unnecessary churn.
> +
> +        "pyhooks" are an internal-oriented variation of the external-facing
> +        hooks mechanism. The latter has strong API guarantees and hooks can
> +        be added via hgrc files. pyhooks are strictly internal and have
> +        weaker API guarantees.
> +        """
> +        if not callable(fn):
> +            raise ValueError('Argument is not callable')
> +        self._hooks[name].append(fn)
> +
> +    def runpyhook(self, name, **args):
> +        """Run a Python hook.
> +
> +        All callables registered via addpyhook() will be executed in the order
> +        they were registered.
> +
> +        Each hook will receive as arguments this repo instance as its single
> +        positional argument and the named arguments passed into this method, if
> +        any. All custom arguments are named because the API contract is not
> +        guaranteed and this gives extensions yet another to point to fail
> +        fast (unknown arguments will result in call failures and will require
> +        extensions to adapt to changes in the API).
> +        """
> +        for fn in self._hooks[name]:
> +            fn(self, **args)
> +
>      @unfilteredmethod
>      def _tag(self, names, node, message, local, user, date, extra={},
>               editor=False):
>          if isinstance(names, str):
> 

I went on a short walk and realized of all the potential names I chose
for this, "pyhooks" is one of the worst.

Drawing from personal experience, "event" is almost certainly a better
terminology, made even more certain because of the ambiguity with
existing Mercurial "hooks." "Events" immediately made me think of
"dispatch," "run," and "handlers." I anticipate bikeshedding on names.

I also anticipate bikeshedding on the mechanism for registering these
handlers. I've coded per-type methods for registering per-instance event
handlers. Alternatively, we could stick a .events dict on each instance:

repo.events.beforepush += custombeforepush

or

repo.events.beforepush.append(custombeforepush)

In this approach, each type has its event handling mechanism. Perhaps we
want to unify this in extensions.py or hooks.py? e.g.

extensions.addeventhandler(repo, 'beforepush' custombeforepush)

Where addeventhandler looks something like:

def addeventhandler(thing, event, handler):
    if not hasattr(thing, 'events'):
        raise ValueError('object does not support event handlers')
    thing.events[event].append(handler)

Perhaps this could get rolled into a common base class for everything
with event handlers so instances have a .addeventhandler. This quickly
turns into a DOM-style event handling mechanism. I'm not sure if that's
good or bad.

Honestly, the simplest solution is to add a bunch of instance variables
to classes like localrepo, one variable per event name. e.g.
|repo.beforepush += custombeforepush| or even |repo.onbeforepush +=
custombeforepush| so we can easily denote which attributes are events.
But people have opinions about localrepo having too many methods. I'm
not sure if this extends to attributes or just methods?

I'm just not sure how we want to do this. I've put one solution forward
as code. I'd prefer to only code one rewrite.

Let the bikeshedding begin.
Siddharth Agarwal - July 13, 2014, 2:51 a.m.
On 07/12/2014 01:12 PM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1405188833 25200
> #      Sat Jul 12 11:13:53 2014 -0700
> # Node ID 8fa7b8de6cf5c6f809cabfd530d7f14383a18c2c
> # Parent  ab912eb22894240c60ff757db60e06e343fef6a7
> localrepo: add internal "pyhooks" hooks mechanism

I really like the idea of this, though as you've already said the name 
deserves some bikeshedding.

The use case I had in mind was remotefilelog not needing to override a 
bunch of methods to batch prefetch files, and instead core having a 
'prefetchfiles' hook that would be called whenever core code knows it's 
going to need a bunch of file revisions. Augie (cc'd) and I talked about 
this in the context of hg-git and remotefilelog.

- Siddharth


>
> The intent of this feature addition is to give extensions (and possibly
> even core code) a better way than monkeypatching/wrapping to
> modify/supplement run-time behavior.
>
> Hooks have a few significant advantages over monkeypatching/wrapping:
>
> * Extensibility points are well defined. If Mercurial maintainers wish
>    to clearly mark an activity as extensible, they can add a pyhook
>    for it. Contrast with monkeypatching/wrapping, where it isn't always
>    clear what the risks with each function are. Hooks give extension
>    authors a clear set of points from which to start extending.
>
> * They are instance-specific. Monkeypatching often results in changing
>    symbols on modules as opposed to a per-instance basis. Extensions may
>    wish to modify behavior for classes of a certain type or perhaps even
>    specific instances of a specific class. Globally modifying functions
>    and then filtering for applicability at run-time can be difficult
>    and dangerous. Beginning extension authors may not realize the full
>    impact of global changes, especially in "shared" process spaces, such
>    as hgweb or the command server. Per-instance hooks are much safer.
>
> The patch author considered an alternative implementation that
> introduced hooks.addrepohook() or extensions.addrepohook() and
> hooks.runrepohook() or extensions.runrepohook(). In the mind of
> the patch author, the choice of where the API should live and the
> names of the APIs (the author concedes "pyhook" isn't a great
> name but can think of nothing better) are better decided by
> someone with more experience than him. The author anticipates
> much bikeshedding on this patch.
>
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -300,8 +300,11 @@ class localrepository(object):
>           # - working directory parent change,
>           # - bookmark changes
>           self.filteredrevcache = {}
>   
> +        # Maps names to list of callables.
> +        self._hooks = {}
> +
>       def close(self):
>           pass
>   
>       def _restrictcapabilities(self, caps):
> @@ -485,8 +488,50 @@ class localrepository(object):
>           replacing code that is expected to call a hook.
>           """
>           return hook.hook(self.ui, self, name, throw, **args)
>   
> +    def addpyhook(self, name, fn):
> +        """Register a Python hook against this repo instance.
> +
> +        It is common to want to execute some Python code when certain events
> +        occur. This is common in extensions. This method provides a
> +        registration mechanism to do that.
> +
> +        This method receives the name of a hook, name, and a callable, fn.
> +
> +        If the hook name is not known, KeyError will be raised. This means
> +        that if a hook is deleted, extensions will fail fast unless they catch
> +        KeyError.
> +
> +        The Mercurial API does not make any guarantees about the stability
> +        of arguments passed to any called hook. However, an effort is made
> +        to avoid unnecessary churn.
> +
> +        "pyhooks" are an internal-oriented variation of the external-facing
> +        hooks mechanism. The latter has strong API guarantees and hooks can
> +        be added via hgrc files. pyhooks are strictly internal and have
> +        weaker API guarantees.
> +        """
> +        if not callable(fn):
> +            raise ValueError('Argument is not callable')
> +        self._hooks[name].append(fn)
> +
> +    def runpyhook(self, name, **args):
> +        """Run a Python hook.
> +
> +        All callables registered via addpyhook() will be executed in the order
> +        they were registered.
> +
> +        Each hook will receive as arguments this repo instance as its single
> +        positional argument and the named arguments passed into this method, if
> +        any. All custom arguments are named because the API contract is not
> +        guaranteed and this gives extensions yet another to point to fail
> +        fast (unknown arguments will result in call failures and will require
> +        extensions to adapt to changes in the API).
> +        """
> +        for fn in self._hooks[name]:
> +            fn(self, **args)
> +
>       @unfilteredmethod
>       def _tag(self, names, node, message, local, user, date, extra={},
>                editor=False):
>           if isinstance(names, str):
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -300,8 +300,11 @@  class localrepository(object):
         # - working directory parent change,
         # - bookmark changes
         self.filteredrevcache = {}
 
+        # Maps names to list of callables.
+        self._hooks = {}
+
     def close(self):
         pass
 
     def _restrictcapabilities(self, caps):
@@ -485,8 +488,50 @@  class localrepository(object):
         replacing code that is expected to call a hook.
         """
         return hook.hook(self.ui, self, name, throw, **args)
 
+    def addpyhook(self, name, fn):
+        """Register a Python hook against this repo instance.
+
+        It is common to want to execute some Python code when certain events
+        occur. This is common in extensions. This method provides a
+        registration mechanism to do that.
+
+        This method receives the name of a hook, name, and a callable, fn.
+
+        If the hook name is not known, KeyError will be raised. This means
+        that if a hook is deleted, extensions will fail fast unless they catch
+        KeyError.
+
+        The Mercurial API does not make any guarantees about the stability
+        of arguments passed to any called hook. However, an effort is made
+        to avoid unnecessary churn.
+
+        "pyhooks" are an internal-oriented variation of the external-facing
+        hooks mechanism. The latter has strong API guarantees and hooks can
+        be added via hgrc files. pyhooks are strictly internal and have
+        weaker API guarantees.
+        """
+        if not callable(fn):
+            raise ValueError('Argument is not callable')
+        self._hooks[name].append(fn)
+
+    def runpyhook(self, name, **args):
+        """Run a Python hook.
+
+        All callables registered via addpyhook() will be executed in the order
+        they were registered.
+
+        Each hook will receive as arguments this repo instance as its single
+        positional argument and the named arguments passed into this method, if
+        any. All custom arguments are named because the API contract is not
+        guaranteed and this gives extensions yet another to point to fail
+        fast (unknown arguments will result in call failures and will require
+        extensions to adapt to changes in the API).
+        """
+        for fn in self._hooks[name]:
+            fn(self, **args)
+
     @unfilteredmethod
     def _tag(self, names, node, message, local, user, date, extra={},
              editor=False):
         if isinstance(names, str):