Patchwork [5,of,5,clonebundles,V2] clonebundles: support sorting URLs by client-side preferences

login
register
mail settings
Submitter Gregory Szorc
Date Oct. 9, 2015, 7:33 p.m.
Message ID <f9437fc112e5daa1aaa6.1444419234@gps-mbp.local>
Download mbox | patch
Permalink /patch/10935/
State Changes Requested
Delegated to: Matt Mackall
Headers show

Comments

Gregory Szorc - Oct. 9, 2015, 7:33 p.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1444418772 25200
#      Fri Oct 09 12:26:12 2015 -0700
# Node ID f9437fc112e5daa1aaa6d7b9bdc8e78082a23e8e
# Parent  5a067bc8dc2c1c1edb0b42e786fa846a003903da
clonebundles: support sorting URLs by client-side preferences

Not all bundles are appropriate for all clients. For example, someone
with a slow Internet connection may want to prefer bz2 bundles over gzip
bundles because they are smaller and don't take as long to transfer.
This is information that a server cannot know on its own. So, we invent
a mechanism for "preferring" server-advertised URLs based on their
attributes.

We could invent a negotiation between client and server where the client
sends its preferences and the sorting/filtering is done server-side.
However, this feels complex. We can avoid complicating the wire protocol
and exposing ourselves to backwards compatible concerns by performing
the sorting locally.

This patch defines a new config option for expressing preferred
attributes in server-advertised bundles.

At Mozilla, we leverage this feature so clients in fast data centers
prefer uncompressed bundles. (We advertise gzip bundles first because
that is a reasonable default.)

I consider this an advanced feature. I'm on the fence as to whether it
should be documented in `hg help config`.
Augie Fackler - Oct. 12, 2015, 2:36 p.m.
On Fri, Oct 09, 2015 at 12:33:54PM -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1444418772 25200
> #      Fri Oct 09 12:26:12 2015 -0700
> # Node ID f9437fc112e5daa1aaa6d7b9bdc8e78082a23e8e
> # Parent  5a067bc8dc2c1c1edb0b42e786fa846a003903da
> clonebundles: support sorting URLs by client-side preferences

I'm happy with this, but would like to hear mpm's feelings before I queue it.

>
> Not all bundles are appropriate for all clients. For example, someone
> with a slow Internet connection may want to prefer bz2 bundles over gzip
> bundles because they are smaller and don't take as long to transfer.
> This is information that a server cannot know on its own. So, we invent
> a mechanism for "preferring" server-advertised URLs based on their
> attributes.
>
> We could invent a negotiation between client and server where the client
> sends its preferences and the sorting/filtering is done server-side.
> However, this feels complex. We can avoid complicating the wire protocol
> and exposing ourselves to backwards compatible concerns by performing
> the sorting locally.
>
> This patch defines a new config option for expressing preferred
> attributes in server-advertised bundles.
>
> At Mozilla, we leverage this feature so clients in fast data centers
> prefer uncompressed bundles. (We advertise gzip bundles first because
> that is a reasonable default.)
>
> I consider this an advanced feature. I'm on the fence as to whether it
> should be documented in `hg help config`.
>
> diff --git a/mercurial/exchange.py b/mercurial/exchange.py
> --- a/mercurial/exchange.py
> +++ b/mercurial/exchange.py
> @@ -1539,9 +1539,9 @@ def _maybeapplyclonebundle(pullop):
>          repo.ui.warn(_('(you may want to report this to the server '
>                         'operator)\n'))
>          return
>
> -    # TODO sort entries by user preferences.
> +    entries = sortclonebundleentries(repo.ui, entries)
>
>      url = entries[0]['URL']
>      repo.ui.status(_('applying clone bundle from %s\n') % url)
>      if trypullbundlefromurl(repo.ui, repo, url):
> @@ -1600,8 +1600,52 @@ def filterclonebundleentries(ui, entries
>          newentries.append(e)
>
>      return newentries
>
> +def sortclonebundleentries(ui, entries):
> +    prefers = ui.configlist('ui', 'clonebundleprefers', default=[])
> +    if not prefers:
> +        return list(entries)
> +
> +    prefers = [p.split('=', 1) for p in prefers]
> +
> +    # Our sort function.
> +    def compareentry(a, b):
> +        for prefkey, prefvalue in prefers:
> +            avalue = a.get(prefkey)
> +            bvalue = b.get(prefkey)
> +
> +            # Special case for b missing attribute and a matches exactly.
> +            if avalue is not None and bvalue is None and avalue == prefvalue:
> +                return -1
> +
> +            # Special case for a missing attribute and b matches exactly.
> +            if bvalue is not None and avalue is None and bvalue == prefvalue:
> +                return 1
> +
> +            # We can't compare unless attribute present on both.
> +            if avalue is None or bvalue is None:
> +                continue
> +
> +            # Same values should fall back to next attribute.
> +            if avalue == bvalue:
> +                continue
> +
> +            # Exact matches come first.
> +            if avalue == prefvalue:
> +                return -1
> +            if bvalue == prefvalue:
> +                return 1
> +
> +            # Fall back to next attribute.
> +            continue
> +
> +        # If we got here we couldn't sort by attributes and prefers. Fall
> +        # back to index order.
> +        return 0
> +
> +    return sorted(entries, cmp=compareentry)
> +
>  def trypullbundlefromurl(ui, repo, url):
>      """Attempt to apply a bundle from a URL."""
>      lock = repo.lock()
>      try:
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1426,8 +1426,23 @@ User interface controls.
>      fails.
>
>      (default: False)
>
> +``clonebundleprefers``
> +    List of attributes and values to prefer from server advertised
> +    bundles.
> +
> +    When servers advertise multiple clone bundle URLs, the first
> +    compatible one is used by default. Setting this option overrides
> +    server order with your own preferred order.
> +
> +    Value is a list of "key=value" strings that correspond to attributes
> +    and values from the server advertised manifest.
> +
> +    Contact your server operator for suggested values.
> +
> +    (default: empty)
> +
>  ``commitsubrepos``
>      Whether to commit modified subrepositories when committing the
>      parent repository. If False and one subrepository has uncommitted
>      changes, abort the commit.
> diff --git a/tests/test-clonebundles.t b/tests/test-clonebundles.t
> --- a/tests/test-clonebundles.t
> +++ b/tests/test-clonebundles.t
> @@ -222,4 +222,98 @@ Python <2.7.9 will filter SNI URLs
>    finished applying clone bundle
>    searching for changes
>    no changes found
>  #endif
> +
> +Set up manifest for testing preferences
> +(Remember, the TYPE does not have to match reality - the URL is
> +important)
> +
> +  $ cp full.hg gz-a.hg
> +  $ cp full.hg gz-b.hg
> +  $ cp full.hg bz2-a.hg
> +  $ cp full.hg bz2-b.hg
> +  $ cat > server/.hg/clonebundles.manifest << EOF
> +  > http://localhost:$HGPORT1/gz-a.hg TYPE=HG10GZ extra=a
> +  > http://localhost:$HGPORT1/bz2-a.hg TYPE=HG10BZ extra=a
> +  > http://localhost:$HGPORT1/gz-b.hg TYPE=HG10GZ extra=b
> +  > http://localhost:$HGPORT1/bz2-b.hg TYPE=HG10BZ extra=b
> +  > EOF
> +
> +Preferring an undefined attribute will take first entry
> +
> +  $ hg --config ui.clonebundleprefers=foo=bar clone -U http://localhost:$HGPORT prefer-foo
> +  applying clone bundle from http://localhost:$HGPORT1/gz-a.hg
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files
> +  finished applying clone bundle
> +  searching for changes
> +  no changes found
> +
> +Preferring bz2 type will download first entry of that type
> +
> +  $ hg --config ui.clonebundleprefers=TYPE=HG10BZ clone -U http://localhost:$HGPORT prefer-bz
> +  applying clone bundle from http://localhost:$HGPORT1/bz2-a.hg
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files
> +  finished applying clone bundle
> +  searching for changes
> +  no changes found
> +
> +Preferring multiple values of an option will still work
> +
> +  $ hg --config ui.clonebundleprefers=TYPE=unknown,TYPE=HG10BZ clone -U http://localhost:$HGPORT prefer-multiple-bz
> +  applying clone bundle from http://localhost:$HGPORT1/bz2-a.hg
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files
> +  finished applying clone bundle
> +  searching for changes
> +  no changes found
> +
> +Sorting multiple values should get us back to original first entry
> +
> +  $ hg --config ui.clonebundleprefers=TYPE=unknown,TYPE=HG10GZ,TYPE=HG10BZ clone -U http://localhost:$HGPORT prefer-multiple-gz
> +  applying clone bundle from http://localhost:$HGPORT1/gz-a.hg
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files
> +  finished applying clone bundle
> +  searching for changes
> +  no changes found
> +
> +Preferring multiple attributes has correct order
> +
> +  $ hg --config ui.clonebundleprefers=extra=b,TYPE=HG10BZ clone -U http://localhost:$HGPORT prefer-separate-attributes
> +  applying clone bundle from http://localhost:$HGPORT1/bz2-b.hg
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files
> +  finished applying clone bundle
> +  searching for changes
> +  no changes found
> +
> +Test where attribute is missing from some entries
> +
> +  $ cat > server/.hg/clonebundles.manifest << EOF
> +  > http://localhost:$HGPORT1/gz-a.hg TYPE=HG10GZ
> +  > http://localhost:$HGPORT1/bz2-a.hg TYPE=HG10BZ
> +  > http://localhost:$HGPORT1/gz-b.hg TYPE=HG10GZ extra=b
> +  > http://localhost:$HGPORT1/bz2-b.hg TYPE=HG10BZ extra=b
> +  > EOF
> +
> +  $ hg --config ui.clonebundleprefers=extra=b clone -U http://localhost:$HGPORT prefer-partially-defined-attribute
> +  applying clone bundle from http://localhost:$HGPORT1/gz-b.hg
> +  adding changesets
> +  adding manifests
> +  adding file changes
> +  added 2 changesets with 2 changes to 2 files
> +  finished applying clone bundle
> +  searching for changes
> +  no changes found
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel

Patch

diff --git a/mercurial/exchange.py b/mercurial/exchange.py
--- a/mercurial/exchange.py
+++ b/mercurial/exchange.py
@@ -1539,9 +1539,9 @@  def _maybeapplyclonebundle(pullop):
         repo.ui.warn(_('(you may want to report this to the server '
                        'operator)\n'))
         return
 
-    # TODO sort entries by user preferences.
+    entries = sortclonebundleentries(repo.ui, entries)
 
     url = entries[0]['URL']
     repo.ui.status(_('applying clone bundle from %s\n') % url)
     if trypullbundlefromurl(repo.ui, repo, url):
@@ -1600,8 +1600,52 @@  def filterclonebundleentries(ui, entries
         newentries.append(e)
 
     return newentries
 
+def sortclonebundleentries(ui, entries):
+    prefers = ui.configlist('ui', 'clonebundleprefers', default=[])
+    if not prefers:
+        return list(entries)
+
+    prefers = [p.split('=', 1) for p in prefers]
+
+    # Our sort function.
+    def compareentry(a, b):
+        for prefkey, prefvalue in prefers:
+            avalue = a.get(prefkey)
+            bvalue = b.get(prefkey)
+
+            # Special case for b missing attribute and a matches exactly.
+            if avalue is not None and bvalue is None and avalue == prefvalue:
+                return -1
+
+            # Special case for a missing attribute and b matches exactly.
+            if bvalue is not None and avalue is None and bvalue == prefvalue:
+                return 1
+
+            # We can't compare unless attribute present on both.
+            if avalue is None or bvalue is None:
+                continue
+
+            # Same values should fall back to next attribute.
+            if avalue == bvalue:
+                continue
+
+            # Exact matches come first.
+            if avalue == prefvalue:
+                return -1
+            if bvalue == prefvalue:
+                return 1
+
+            # Fall back to next attribute.
+            continue
+
+        # If we got here we couldn't sort by attributes and prefers. Fall
+        # back to index order.
+        return 0
+
+    return sorted(entries, cmp=compareentry)
+
 def trypullbundlefromurl(ui, repo, url):
     """Attempt to apply a bundle from a URL."""
     lock = repo.lock()
     try:
diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
--- a/mercurial/help/config.txt
+++ b/mercurial/help/config.txt
@@ -1426,8 +1426,23 @@  User interface controls.
     fails.
 
     (default: False)
 
+``clonebundleprefers``
+    List of attributes and values to prefer from server advertised
+    bundles.
+
+    When servers advertise multiple clone bundle URLs, the first
+    compatible one is used by default. Setting this option overrides
+    server order with your own preferred order.
+
+    Value is a list of "key=value" strings that correspond to attributes
+    and values from the server advertised manifest.
+
+    Contact your server operator for suggested values.
+
+    (default: empty)
+
 ``commitsubrepos``
     Whether to commit modified subrepositories when committing the
     parent repository. If False and one subrepository has uncommitted
     changes, abort the commit.
diff --git a/tests/test-clonebundles.t b/tests/test-clonebundles.t
--- a/tests/test-clonebundles.t
+++ b/tests/test-clonebundles.t
@@ -222,4 +222,98 @@  Python <2.7.9 will filter SNI URLs
   finished applying clone bundle
   searching for changes
   no changes found
 #endif
+
+Set up manifest for testing preferences
+(Remember, the TYPE does not have to match reality - the URL is
+important)
+
+  $ cp full.hg gz-a.hg
+  $ cp full.hg gz-b.hg
+  $ cp full.hg bz2-a.hg
+  $ cp full.hg bz2-b.hg
+  $ cat > server/.hg/clonebundles.manifest << EOF
+  > http://localhost:$HGPORT1/gz-a.hg TYPE=HG10GZ extra=a
+  > http://localhost:$HGPORT1/bz2-a.hg TYPE=HG10BZ extra=a
+  > http://localhost:$HGPORT1/gz-b.hg TYPE=HG10GZ extra=b
+  > http://localhost:$HGPORT1/bz2-b.hg TYPE=HG10BZ extra=b
+  > EOF
+
+Preferring an undefined attribute will take first entry
+
+  $ hg --config ui.clonebundleprefers=foo=bar clone -U http://localhost:$HGPORT prefer-foo
+  applying clone bundle from http://localhost:$HGPORT1/gz-a.hg
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+  finished applying clone bundle
+  searching for changes
+  no changes found
+
+Preferring bz2 type will download first entry of that type
+
+  $ hg --config ui.clonebundleprefers=TYPE=HG10BZ clone -U http://localhost:$HGPORT prefer-bz
+  applying clone bundle from http://localhost:$HGPORT1/bz2-a.hg
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+  finished applying clone bundle
+  searching for changes
+  no changes found
+
+Preferring multiple values of an option will still work
+
+  $ hg --config ui.clonebundleprefers=TYPE=unknown,TYPE=HG10BZ clone -U http://localhost:$HGPORT prefer-multiple-bz
+  applying clone bundle from http://localhost:$HGPORT1/bz2-a.hg
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+  finished applying clone bundle
+  searching for changes
+  no changes found
+
+Sorting multiple values should get us back to original first entry
+
+  $ hg --config ui.clonebundleprefers=TYPE=unknown,TYPE=HG10GZ,TYPE=HG10BZ clone -U http://localhost:$HGPORT prefer-multiple-gz
+  applying clone bundle from http://localhost:$HGPORT1/gz-a.hg
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+  finished applying clone bundle
+  searching for changes
+  no changes found
+
+Preferring multiple attributes has correct order
+
+  $ hg --config ui.clonebundleprefers=extra=b,TYPE=HG10BZ clone -U http://localhost:$HGPORT prefer-separate-attributes
+  applying clone bundle from http://localhost:$HGPORT1/bz2-b.hg
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+  finished applying clone bundle
+  searching for changes
+  no changes found
+
+Test where attribute is missing from some entries
+
+  $ cat > server/.hg/clonebundles.manifest << EOF
+  > http://localhost:$HGPORT1/gz-a.hg TYPE=HG10GZ
+  > http://localhost:$HGPORT1/bz2-a.hg TYPE=HG10BZ
+  > http://localhost:$HGPORT1/gz-b.hg TYPE=HG10GZ extra=b
+  > http://localhost:$HGPORT1/bz2-b.hg TYPE=HG10BZ extra=b
+  > EOF
+
+  $ hg --config ui.clonebundleprefers=extra=b clone -U http://localhost:$HGPORT prefer-partially-defined-attribute
+  applying clone bundle from http://localhost:$HGPORT1/gz-b.hg
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+  finished applying clone bundle
+  searching for changes
+  no changes found