Patchwork [V2] hgweb: make refresh interval configurable

Submitter Gregory Szorc
Date Aug. 23, 2015, 6:01 a.m.
Message ID <3ea9ccc86445b39ae41a.1440309690@gps-mbp>
Permalink /patch/10262/
State Accepted
Commit 06320fb1169935a68082895c146a3a2fda2ba62a

Comments

Gregory Szorc - Aug. 23, 2015, 6:01 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1440309591 25200
#      Sat Aug 22 22:59:51 2015 -0700
# Node ID 3ea9ccc86445b39ae41af1621ad87663e5ad6ed0
# Parent  d9d3d49c4cf77049d12920980e91bf8e4a4ecda2
hgweb: make refresh interval configurable

hgwebdir refreshes the set of known repositories periodically. This
is necessary because refreshing on every request could add significant
request latency.

More than once I've found myself wanting to tweak this interval at
Mozilla. I've also wanted the ability to always refresh (often when
writing tests for our replication setup).

This patch makes the refresh interval configurable. Values less than
or equal to 0 cause a refresh on every request. The default (20
seconds) is left unchanged.
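
The check this patch introduces can be sketched in isolation. This is
only an illustration, not the code in hgwebdir_mod.py: the function
name should_refresh and the now parameter are hypothetical, added so
the logic can be exercised without a running server.

```python
import time


def should_refresh(lastrefresh, refreshinterval, now=None):
    """Return True when the repository list should be re-scanned.

    Mirrors the condition in the patch: an interval <= 0 always
    refreshes; otherwise a refresh happens only once the interval
    has elapsed since the last one.
    """
    if now is None:
        # Hypothetical convenience: default to the wall clock, as
        # hgwebdir.refresh() does via time.time().
        now = time.time()
    if refreshinterval > 0 and lastrefresh + refreshinterval > now:
        return False
    return True
```

With lastrefresh=100 and the default interval of 20, a request at
now=110 would skip the re-scan, while one at now=120 would trigger it.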
Yuya Nishihara - Aug. 23, 2015, 2 p.m.
On Sat, 22 Aug 2015 23:01:30 -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1440309591 25200
> #      Sat Aug 22 22:59:51 2015 -0700
> # Node ID 3ea9ccc86445b39ae41af1621ad87663e5ad6ed0
> # Parent  d9d3d49c4cf77049d12920980e91bf8e4a4ecda2
> hgweb: make refresh interval configurable
> 
> hgwebdir refreshes the set of known repositories periodically. This
> is necessary because refreshing on every request could add significant
> request latency.
> 
> More than once I've found myself wanting to tweak this interval at
> Mozilla. I've also wanted the ability to always refresh (often when
> writing tests for our replication setup).
> 
> This patch makes the refresh interval configurable. Values less than
> or equal to 0 cause a refresh on every request. The default (20
> seconds) is left unchanged.
> 
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1751,8 +1751,16 @@ The full set of options is:
>  ``push_ssl``
>      Whether to require that inbound pushes be transported over SSL to
>      prevent password sniffing. Default is True.
>  
> +``refreshinterval``
> +    How frequently directory listings re-scan the filesystem for new
> +    repositories, in seconds. This is relevant when wildcards are used
> +    to define paths. Depending on how much filesystem traversal is
> +    required, refreshing may negatively impact performance.
> +
> +    Default is 20. Values less than or equal to 0 always refresh.
> +
>  ``staticurl``
>      Base URL to use for static files. If unset, static files (e.g. the
>      hgicon.png favicon) will be served by the CGI script itself. Use
>      this setting to serve them directly with the HTTP server.
> diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py
> --- a/mercurial/hgweb/hgwebdir_mod.py
> +++ b/mercurial/hgweb/hgwebdir_mod.py
> @@ -78,19 +78,25 @@ def geturlcgivars(baseurl, port):
>  
>      return name, str(port), path
>  
>  class hgwebdir(object):
> -    refreshinterval = 20
> -
>      def __init__(self, conf, baseui=None):
>          self.conf = conf
>          self.baseui = baseui
> +        self.ui = None
>          self.lastrefresh = 0
>          self.motd = None
>          self.refresh()
>  
>      def refresh(self):
> -        if self.lastrefresh + self.refreshinterval > time.time():
> +        refreshinterval = 20
> +        if self.ui:

Off-topic: I'm not sure how important it is to delay the initialization
of self.ui.

> +            refreshinterval = self.ui.configint('web', 'refreshinterval',
> +                                                refreshinterval)
> +
> +        # refreshinterval <= 0 means to always refresh.
> +        if (refreshinterval > 0 and
> +            self.lastrefresh + refreshinterval > time.time()):
>              return

Is the "refreshinterval > 0" check necessary?

if
    lastrefresh <= time.time() and
    refreshinterval <= 0
then
    lastrefresh + refreshinterval <= time.time()
    (so the early return is never taken and we always refresh)
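
The argument above can be checked by brute force over small integers.
This is a rough sketch, not part of the patch; all function names are
hypothetical. It compares the early-return condition with and without
the explicit guard, and also shows the one case the premise excludes:
if lastrefresh is ahead of the current time (for example after the
clock is set back), the two conditions can differ.

```python
def skip_with_guard(lastrefresh, refreshinterval, now):
    # Early-return condition as written in the patch.
    return refreshinterval > 0 and lastrefresh + refreshinterval > now


def skip_without_guard(lastrefresh, refreshinterval, now):
    # Early-return condition with the "refreshinterval > 0" test dropped.
    return lastrefresh + refreshinterval > now


def conditions_agree(lastrefresh, refreshinterval, now):
    return (skip_with_guard(lastrefresh, refreshinterval, now)
            == skip_without_guard(lastrefresh, refreshinterval, now))


def find_disagreements(values):
    """List (lastrefresh, refreshinterval, now) triples where the two
    conditions differ, restricted to the premise lastrefresh <= now."""
    values = list(values)
    return [(last, interval, now)
            for last in values
            for interval in values
            for now in values
            if last <= now and not conditions_agree(last, interval, now)]
```

Under the premise lastrefresh <= now, find_disagreements(range(-3, 4))
comes back empty. Outside it, conditions_agree(10, 0, 5) is False: the
guarded version still refreshes, the unguarded one skips.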
Matt Mackall - Aug. 25, 2015, 8:46 p.m.
On Sat, 2015-08-22 at 23:01 -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1440309591 25200
> #      Sat Aug 22 22:59:51 2015 -0700
> # Node ID 3ea9ccc86445b39ae41af1621ad87663e5ad6ed0
> # Parent  d9d3d49c4cf77049d12920980e91bf8e4a4ecda2
> hgweb: make refresh interval configurable

I've replaced v1 with v2 here. Too much mail.

Patch

diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
--- a/mercurial/help/config.txt
+++ b/mercurial/help/config.txt
@@ -1751,8 +1751,16 @@  The full set of options is:
 ``push_ssl``
     Whether to require that inbound pushes be transported over SSL to
     prevent password sniffing. Default is True.
 
+``refreshinterval``
+    How frequently directory listings re-scan the filesystem for new
+    repositories, in seconds. This is relevant when wildcards are used
+    to define paths. Depending on how much filesystem traversal is
+    required, refreshing may negatively impact performance.
+
+    Default is 20. Values less than or equal to 0 always refresh.
+
 ``staticurl``
     Base URL to use for static files. If unset, static files (e.g. the
     hgicon.png favicon) will be served by the CGI script itself. Use
     this setting to serve them directly with the HTTP server.
diff --git a/mercurial/hgweb/hgwebdir_mod.py b/mercurial/hgweb/hgwebdir_mod.py
--- a/mercurial/hgweb/hgwebdir_mod.py
+++ b/mercurial/hgweb/hgwebdir_mod.py
@@ -78,19 +78,25 @@  def geturlcgivars(baseurl, port):
 
     return name, str(port), path
 
 class hgwebdir(object):
-    refreshinterval = 20
-
     def __init__(self, conf, baseui=None):
         self.conf = conf
         self.baseui = baseui
+        self.ui = None
         self.lastrefresh = 0
         self.motd = None
         self.refresh()
 
     def refresh(self):
-        if self.lastrefresh + self.refreshinterval > time.time():
+        refreshinterval = 20
+        if self.ui:
+            refreshinterval = self.ui.configint('web', 'refreshinterval',
+                                                refreshinterval)
+
+        # refreshinterval <= 0 means to always refresh.
+        if (refreshinterval > 0 and
+            self.lastrefresh + refreshinterval > time.time()):
             return
 
         if self.baseui:
             u = self.baseui.copy()
diff --git a/tests/test-hgwebdir.t b/tests/test-hgwebdir.t
--- a/tests/test-hgwebdir.t
+++ b/tests/test-hgwebdir.t
@@ -1244,8 +1244,69 @@  rss-log with basedir /foo/
 
   $ get-with-headers.py localhost:$HGPORT2 'a/rss-log' | grep '<guid'
       <guid isPermaLink="true">http://hg.example.com:8080/foo/a/rev/8580ff50825a</guid>
 
+Path refreshing works as expected
+
+  $ killdaemons.py
+  $ mkdir $root/refreshtest
+  $ hg init $root/refreshtest/a
+  $ cat > paths.conf << EOF
+  > [paths]
+  > / = $root/refreshtest/*
+  > EOF
+  $ hg serve -p $HGPORT1 -d --pid-file hg.pid --webdir-conf paths.conf
+  $ cat hg.pid >> $DAEMON_PIDS
+
+  $ get-with-headers.py localhost:$HGPORT1 '?style=raw'
+  200 Script output follows
+  
+  
+  /a/
+  
+
+By default refreshing occurs every 20s and a new repo won't be listed
+immediately.
+
+  $ hg init $root/refreshtest/b
+  $ get-with-headers.py localhost:$HGPORT1 '?style=raw'
+  200 Script output follows
+  
+  
+  /a/
+  
+
+Restart the server with no refresh interval. New repo should appear
+immediately.
+
+  $ killdaemons.py
+  $ cat > paths.conf << EOF
+  > [web]
+  > refreshinterval = -1
+  > [paths]
+  > / = $root/refreshtest/*
+  > EOF
+  $ hg serve -p $HGPORT1 -d --pid-file hg.pid --webdir-conf paths.conf
+  $ cat hg.pid >> $DAEMON_PIDS
+
+  $ get-with-headers.py localhost:$HGPORT1 '?style=raw'
+  200 Script output follows
+  
+  
+  /a/
+  /b/
+  
+
+  $ hg init $root/refreshtest/c
+  $ get-with-headers.py localhost:$HGPORT1 '?style=raw'
+  200 Script output follows
+  
+  
+  /a/
+  /b/
+  /c/
+  
+
 paths errors 1
 
   $ cat error-paths-1.log