Patchwork [2,of,2] tags: make tags cache compatible with future split of filenode cache

Submitter Gregory Szorc
Date March 25, 2015, 4:46 a.m.
Message ID <314d22b3f6418885c3c3.1427258764@vm-ubuntu-main.gateway.sonic.net>
Permalink /patch/8250/
State Rejected
Delegated to: Pierre-Yves David

Comments

Gregory Szorc - March 25, 2015, 4:46 a.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1427258716 25200
#      Tue Mar 24 21:45:16 2015 -0700
# Node ID 314d22b3f6418885c3c395f57ce2b2a95e69b3c5
# Parent  30988005a918c28154c77e98160bd6ea3a3b16a1
tags: make tags cache compatible with future split of filenode cache

Pierre-Yves has plans to establish separate, per-filter tags cache
files.

I have plans to establish a separate, shared cache file for .hgtags
filenode data so we don't have redundant lookups of potentially
expensive filenode data.

This patch adds future-proofing to tags cache reading so per-filter
tags cache can land in a way that doesn't make the transition to a
separate .hgtags filenode cache painful.

With this patch applied, clients are able to recognize the planned
future format of the tags cache with external .hgtags filenode data.
When support for that cache lands, clients can read existing .hgtags
filenode data from the prior tags cache files and then write out the
new file format without having to recompute .hgtags filenodes. For
users of large repositories, this will potentially save minutes of
wall time.

Without this patch, we would have to create a new version of the tags
cache files when rolling out the separate .hgtags filenode cache,
because the file formats would need to differ: we want the tags cache
to record the tip rev and node for quick cache freshness checks. We
can't simply add this data to an existing cache file, because an old
client that came along would choke when reading it (the new line has
to look different so it isn't confused for a .hgtags filenode mapping
of a single head).
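
To make the ambiguity concrete, here is a minimal Python sketch. It is
not code from Mercurial: classify() and old_reader_accepts() are
hypothetical helpers that only mimic the field layout described in the
tags.py comment and the int(line[0]) step of the existing reader.

```python
# Illustration only: hypothetical helpers, not part of mercurial/tags.py.
NODE = "0c192d7d5e6b78a714de54a2e9627952a877e25a"


def classify(line):
    """Classify one line from the first section of a tags cache file."""
    fields = line.split()
    if fields and fields[0] == "external":
        return "external-marker"
    if len(fields) == 2:
        # <headrev> <headnode>, with the optional <hgtagsnode> omitted
        return "head-without-hgtags"
    if len(fields) == 3:
        return "head-with-hgtags"
    return "unknown"


def old_reader_accepts(line):
    """Mimic the pre-patch parsing step: int(<first field>) must succeed."""
    try:
        int(line.split()[0])
    except (ValueError, IndexError):
        return False
    return True


# A bare "<tiprev> <tipnode>" line looks exactly like a head entry that
# has no .hgtags filenode, so it cannot be told apart from one.
bare_tip_kind = classify("4 " + NODE)

# With the "external" keyword the line is unambiguous, but a reader
# that does not know the keyword fails on int("external").
marked_kind = classify("external 4 " + NODE)
old_reader_ok = old_reader_accepts("external 4 " + NODE)
```

The keyword removes the ambiguity at the cost of breaking readers that
predate it, which is why the reader needs to learn about the keyword
before any writer starts emitting it.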

The assumption here is that per-filter tags caches will land shortly
after this patch. If an old client talks to a future repository, that
client is either going to:

a) read the old .hg/cache/tags file
b) read the per-filter tags cache

The old/existing tags cache will *never* have external .hgtags
filenodes, so an old client will always be able to read that file. (The
client will likely have to process a bunch of new heads, but that's the
price for using an old client.)

The new, per-filter tags cache *may* have external .hgtags filenodes.
But, since the code for recognizing the declaration of external .hgtags
(this patch) exists for all clients that support per-filter tags cache
reading (which will land after this patch), all clients are guaranteed
to parse per-filter tags cache files with or without external .hgtags
filenodes. Thus, moving data to an external .hgtags filenodes cache
can occur without backwards compatibility concerns and without
potentially expensive recomputation of .hgtags filenodes values.
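
For readers who want to experiment with the format, the following is a
rough, self-contained Python sketch of reading the first section of a
tags cache with the "external" line tolerated. It is a simplified
stand-in, not the body of _readtagcache: read_head_section() is a
hypothetical name, binascii.unhexlify stands in for Mercurial's bin(),
and recording the marker's tip rev and node is an assumption (the patch
itself only skips the line).

```python
from binascii import unhexlify


def read_head_section(cachelines):
    """Parse cache lines up to the first blank line.

    Returns (revs, heads, fnodes, external), where external is None or
    a (tiprev, tipnode) tuple taken from an "external" marker line.
    """
    revs, heads, fnodes = [], [], {}
    external = None
    for line in cachelines:
        if line == "\n":
            break  # a blank line ends the first section

        # Reserved future form: "external" <tiprev> <tipnode>.
        # Assumption: we keep the values; the patch just continues.
        if line.startswith("external "):
            _, tiprev, tipnode = line.split()
            external = (int(tiprev), unhexlify(tipnode))
            continue

        # Historical form: <headrev> <headnode> [<hgtagsnode>]
        fields = line.split()
        revs.append(int(fields[0]))
        headnode = unhexlify(fields[1])
        heads.append(headnode)
        if len(fields) == 3:
            fnodes[headnode] = unhexlify(fields[2])
    return revs, heads, fnodes, external


# Sample data copied from the test case added to tests/test-tags.t.
new_style = [
    "external 4 0c192d7d5e6b78a714de54a2e9627952a877e25a\n",
    "\n",
    "bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar\n",
]
old_style = [
    "4 0c192d7d5e6b78a714de54a2e9627952a877e25a 0c04f2a8af31de17fab7422878ee5a2dadbc943d\n",
    "3 6fa450212aeb2a21ed616a54aea39a4a27894cd7 7d3b718c964ef37b89e550ebdafd5789e76ce1b0\n",
    "2 7a94127795a33c10a370c93f731fd9fea0b79af6 0c04f2a8af31de17fab7422878ee5a2dadbc943d\n",
    "\n",
    "bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar\n",
]

new_result = read_head_section(new_style)
old_result = read_head_section(old_style)
```

With the marker form, the head lists come back empty, which matches the
behaviour exercised by the test in this patch: the client sees no
cached heads, recomputes them, and writes an old style cache again.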
Gregory Szorc - March 30, 2015, 12:12 a.m.
Please disregard this patch: I've got a new series consolidating
Pierre-Yves's and my work. This patch will be rolled into that one.

Patch

diff --git a/mercurial/tags.py b/mercurial/tags.py
--- a/mercurial/tags.py
+++ b/mercurial/tags.py
@@ -27,8 +27,16 @@  import time
 # The first part consists of lines of the form:
 #
 #   <headrev> <headnode> [<hgtagsnode>]
 #
+# *OR* a line of the form:
+#
+#   "external" <tiprev> <tipnode>
+#
+# The first form is the historical method of storing the .hgtags filenode
+# mapping inline. The second form (which is reserved for future use) uses
+# a separate file for this data.
+#
 # <headrev> is an integer revision and <headnode> is a 40 character hex
 # node for that changeset. These redundantly identify a repository
 # head from the time the cache was written.
 #
@@ -262,8 +270,13 @@  def _readtagcache(ui, repo):
         try:
             for line in cachelines:
                 if line == "\n":
                     break
+
+                # Future version of cache encountered. Do nothing yet.
+                if line.startswith("external "):
+                    continue
+
                 line = line.split()
                 cacherevs.append(int(line[0]))
                 headnode = bin(line[1])
                 cacheheads.append(headnode)
diff --git a/tests/test-tags.t b/tests/test-tags.t
--- a/tests/test-tags.t
+++ b/tests/test-tags.t
@@ -224,8 +224,33 @@  Dump cache:
   bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
   bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
   78391a272241d70354aa14c874552cad6b51bb42 bar
 
+External .hgtags filenode cache marker is handled
+
+  $ cat > .hg/cache/tags << EOF
+  > external 4 0c192d7d5e6b78a714de54a2e9627952a877e25a
+  > 
+  > bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
+  > bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
+  > 78391a272241d70354aa14c874552cad6b51bb42 bar
+  > EOF
+
+  $ hg tags
+  tip                                4:0c192d7d5e6b
+  bar                                1:78391a272241
+
+We should get an old style cache again
+
+  $ cat .hg/cache/tags
+  4 0c192d7d5e6b78a714de54a2e9627952a877e25a 0c04f2a8af31de17fab7422878ee5a2dadbc943d
+  3 6fa450212aeb2a21ed616a54aea39a4a27894cd7 7d3b718c964ef37b89e550ebdafd5789e76ce1b0
+  2 7a94127795a33c10a370c93f731fd9fea0b79af6 0c04f2a8af31de17fab7422878ee5a2dadbc943d
+  
+  bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
+  bbd179dfa0a71671c253b3ae0aa1513b60d199fa bar
+  78391a272241d70354aa14c874552cad6b51bb42 bar
+
 Test tag removal:
 
   $ hg tag --remove bar     # rev 5
   $ hg tip -vp