Patchwork who: remove OpenJDK

Submitter David Demelier
Date July 25, 2020, 8:11 a.m.
Message ID <7eaad1ed8c743d40fe71.1595664667@lotus.home>
Permalink /patch/46886/
State Accepted

Comments

David Demelier - July 25, 2020, 8:11 a.m.
# HG changeset patch
# User David Demelier <markand@malikania.fr>
# Date 1595664656 -7200
#      Sat Jul 25 10:10:56 2020 +0200
# Node ID 7eaad1ed8c743d40fe71620434f3a151f0067105
# Parent  b0e3c6141a7844e1fdd55535677ea3bfb1527707
who: remove OpenJDK

They unfortunately moved to GitHub.

https://openjdk.java.net/jeps/369
Pulkit Goyal - July 25, 2020, 9:45 a.m.
On Sat, Jul 25, 2020 at 1:43 PM David Demelier <markand@malikania.fr> wrote:
>
> # HG changeset patch
> # User David Demelier <markand@malikania.fr>
> # Date 1595664656 -7200
> #      Sat Jul 25 10:10:56 2020 +0200
> # Node ID 7eaad1ed8c743d40fe71620434f3a151f0067105
> # Parent  b0e3c6141a7844e1fdd55535677ea3bfb1527707
> who: remove OpenJDK
>
> They unfortunately moved to GitHub.
>
> https://openjdk.java.net/jeps/369

Queued this, many thanks!
>
> diff -r b0e3c6141a78 -r 7eaad1ed8c74 templates/who/index.html
> --- a/templates/who/index.html  Fri Jul 26 14:27:08 2019 +0200
> +++ b/templates/who/index.html  Sat Jul 25 10:10:56 2020 +0200
> @@ -9,9 +9,6 @@
>          <h3>Mozilla</h3>
>          Mozilla is an open source project that is currently developing the popular <a href="https://www.mozilla.org/firefox">Firefox</a> internet browser, the email client <a href="https://www.mozilla.org/thunderbird">Thunderbird</a> and the application suite SeaMonkey. Mozilla chose Mercurial in 2006.</p>
>          <p><a href="https://www.mozilla.org">https://www.mozilla.org</a></p>
> -        <h3>Java / OpenJDK</h3>
> -        OpenJDK is the official open sourced Java implementation of Sun Microsystems. When open sourcing the project, Sun chose Mercurial as their main version control system.
> -        <p><a href="http://openjdk.java.net/">http://openjdk.java.net/</a></p>
>          <h3>Nginx</h3>
>          The nginx web server is among one of the most popular and used over the world.
>          <p><a href="http://nginx.org/">http://nginx.org/</a></p>
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Antonio Muci via Mercurial-devel - July 25, 2020, 10:27 a.m.
That's sad.

Apparently OpenJDK started contemplating a migration to git one year ago 
(2019-07-12): https://openjdk.java.net/jeps/357

I am reporting (an edited version of) the "motivation" section of that 
ticket, because I'd like a reflection about how mercurial is perceived 
"out there":


===

Motivation:

There are three primary reasons for migrating to Git:

1. Size of version control system metadata:

   a. Initial prototypes [...] show a significant reduction in 
[metadata] size. For example, the .git directory of the jdk/jdk 
repository is approximately 300 MB with Git and the .hg directory is 
around 1.2 GB with Mercurial, depending on the Mercurial version being 
used. The reduction in metadata preserves local disk space and reduces 
clone times [...].

   b. Git also features shallow clones that only clone parts of the 
history, resulting in even less metadata for those users who do not need 
the entire history.


2. Available tooling

There are many more tools for interacting with Git than Mercurial:

    a. All text editors have Git integration, either natively or in the 
form of plugins including Emacs (magit plugin), Vim (fugitive.git 
plugin), VS Code (builtin), and Atom (builtin).

    b. Almost all integrated development environments (IDEs) also ship 
with Git integration out-of-the-box, including IntelliJ (builtin), 
Eclipse (builtin), NetBeans (builtin), and Visual Studio (builtin).

    c. There are multiple desktop clients available for interacting with 
Git repositories locally.

3. Available hosting

Lastly, there are many options available for hosting Git repositories, 
whether self-hosted or hosted as a service.

===

About .hg size (1a): is it really true that .hg is 1.2GB and the 
corresponding .git version is 300 MB? Verifying it should not be too 
difficult. If it's true (I doubt it), something has to be done.

Shallow clones (1b): I never needed that, but now I am curious: do we 
have a similar feature in core or in an extension? If yes (and even if 
no, really), how can we better communicate that feature-wise mercurial 
is on par with (and sometimes better than) git?

Tooling (2): maybe git has much more, but TortoiseHG has a lot of 
potential. A lot of git tools are not Free Software, too.

Hosting (3): there is Heptapod, there is Kallithea (how is it doing?). 
Once more, there is not enough communication IMHO.





On 25/07/20 10:11, David Demelier wrote:
> # HG changeset patch
> # User David Demelier <markand@malikania.fr>
> # Date 1595664656 -7200
> #      Sat Jul 25 10:10:56 2020 +0200
> # Node ID 7eaad1ed8c743d40fe71620434f3a151f0067105
> # Parent  b0e3c6141a7844e1fdd55535677ea3bfb1527707
> who: remove OpenJDK
>
> They unfortunately moved to GitHub.
>
> https://openjdk.java.net/jeps/369
>
> diff -r b0e3c6141a78 -r 7eaad1ed8c74 templates/who/index.html
> --- a/templates/who/index.html	Fri Jul 26 14:27:08 2019 +0200
> +++ b/templates/who/index.html	Sat Jul 25 10:10:56 2020 +0200
> @@ -9,9 +9,6 @@
>           <h3>Mozilla</h3>
>           Mozilla is an open source project that is currently developing the popular <a href="https://www.mozilla.org/firefox">Firefox</a> internet browser, the email client <a href="https://www.mozilla.org/thunderbird">Thunderbird</a> and the application suite SeaMonkey. Mozilla chose Mercurial in 2006.</p>
>           <p><a href="https://www.mozilla.org">https://www.mozilla.org</a></p>
> -        <h3>Java / OpenJDK</h3>
> -        OpenJDK is the official open sourced Java implementation of Sun Microsystems. When open sourcing the project, Sun chose Mercurial as their main version control system.
> -        <p><a href="http://openjdk.java.net/">http://openjdk.java.net/</a></p>
>           <h3>Nginx</h3>
>           The nginx web server is among one of the most popular and used over the world.
>           <p><a href="http://nginx.org/">http://nginx.org/</a></p>
>
Josef 'Jeff' Sipek - July 25, 2020, 5:36 p.m.
On Sat, Jul 25, 2020 at 12:27:42 +0200, Antonio Muci via Mercurial-devel wrote:
> That's sad.

Yeah.

This motivated me enough to clone the repos (hg and git) and collect some
data.  Maybe people here will find it useful.

First off, the clone itself.  I cloned it from the official upstream repos.
My internet connection is 150 Mbit/s, the storage is a 3-way ZFS mirror.  I
used hg 4.9.1 (py27), and git 2.21.0.  (I know, I need to update both.  This
is on a box that has a solid network connection but is harder to update.  If
there is interest I can spend the effort to update them and re-run it with
newer versions.)

$ hg clone https://hg.openjdk.java.net/jdk/jdk
destination directory: jdk
requesting all changes
adding changesets
adding manifests
adding file changes
added 60318 changesets with 516970 changes to 187542 files
new changesets fd16c54261b3:227cd01f15fa
updating to branch default
65415 files updated, 0 files merged, 0 files removed, 0 files unresolved

This took a total of ~16.3 mins (978 seconds), of which:

 1) ~30 seconds were used by "adding changesets"
 2) ~8 mins were used by "adding manifests"
 3) ~7 mins were used by "adding file changes"

The adding of manifests and files was receiving ~1.0-1.2 MB/s (bytes
received on the NIC, *not* actual payload inside TCP and hg specific
framing).

My box still had plenty of CPU, RAM, and I/O left so I don't know if the 1.0
MB/s was a result of hg being sub-optimal or if the hg server or the network
connection were the bottleneck.

To rule out internet slowness, I ran 'hg serve' on the clone and did a clone
on my laptop (5.5rc0+25-fbc53c5853b0, py3) on the same subnet (wifi
connected).  It took 495 seconds (2x faster), and I saw slightly higher
network utilization (~1.7 MB/s) and the laptop CPU pegged at 100% for pretty
much the entire duration of the "adding file changes" portion.  (The laptop
has an SSD, so that probably helped eliminate some of the slowness - it is a
bit of an apples and oranges comparison, but interesting none the less.)

Cloning directly from java.net on my laptop took 1400 seconds - so, about
50% slower.  This could be because of the wifi, py3 vs. py27, hg version
difference, etc., etc.


$ git clone https://github.com/openjdk/jdk.git jdk-git
Cloning into 'jdk-git'...
remote: Enumerating objects: 819, done.
remote: Counting objects: 100% (819/819), done.
remote: Compressing objects: 100% (577/577), done.
remote: Total 1072595 (delta 356), reused 423 (delta 199), pack-reused 1071776
Receiving objects: 100% (1072595/1072595), 414.42 MiB | 6.17 MiB/s, done.
Resolving deltas: 100% (800673/800673), done.
Checking out files: 100% (65415/65415), done.

This took a total of 1 min 49 secs (109 seconds), of which:

 1) 1 min 8 secs were used by "receiving objects"
 2) 25 seconds were used by "resolving deltas"

The receiving of objects was pulling in 6.8 MB/s.

Cloning directly on my laptop took 99 seconds with git version 2.26.2.
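
To put the clone timings above side by side, here is a minimal Python sketch.
The labels and the slowdown() helper are illustrative names only; the numbers
are the wall-clock seconds reported above, and the baseline (git clone on the
first box) is an arbitrary choice.

```python
# Wall-clock clone times in seconds, copied from the measurements above.
clone_seconds = {
    "hg clone (upstream, hg 4.9.1)": 978,
    "hg clone (LAN via hg serve, laptop)": 495,
    "hg clone (upstream, laptop)": 1400,
    "git clone (GitHub, git 2.21.0)": 109,
    "git clone (GitHub, laptop)": 99,
}


def slowdown(seconds, baseline):
    """Return how many times longer `seconds` is than `baseline`."""
    return seconds / baseline


# Baseline: the git clone done on the same box as the first hg clone.
baseline = clone_seconds["git clone (GitHub, git 2.21.0)"]
for label, seconds in clone_seconds.items():
    print(f"{label:40s} {seconds:5d} s  {slowdown(seconds, baseline):5.1f}x")
```

These are single runs on different machines and networks, so the ratios are
only a rough guide.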

...
> About .hg size (1a): is it really true that .hg is 1.2GB and the 
> corresponding .git version is 300 MB? Verifying it should not be too 
> difficult. If it's true (I doubt it), something has to be done.

$ du -shA jdk-*/.{hg,git}
1.10G   jdk-hg/.hg
452M    jdk-git/.git

So, both numbers seem to be tweaked to justify migration - at least on a
fresh clone - but I'd say hg is worse by 2-3x.
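
A quick check of both ratios, as a minimal Python sketch.  It assumes that "G"
and "M" are binary units (1 G = 1024 M) for the JEP figures as well as for the
du output; size_ratio() is just an illustrative helper.

```python
MIB_PER_GIB = 1024  # assumption: sizes are binary units, as du -h reports them


def size_ratio(hg_mib, git_mib):
    """Return how many times larger the hg metadata is than the git metadata."""
    return hg_mib / git_mib


jep_ratio = size_ratio(1.2 * MIB_PER_GIB, 300)        # figures quoted from the JEP
measured_ratio = size_ratio(1.10 * MIB_PER_GIB, 452)  # fresh clones measured above

print(f"JEP figures:  {jep_ratio:.2f}x")
print(f"fresh clones: {measured_ratio:.2f}x")
```

With decimal units the JEP ratio would be exactly 4x; either way the fresh
clones come out noticeably closer together than the JEP numbers suggest.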

The whole checkout in case anyone cares:

$ du -shA *
1014M   jdk-git
1.65G   jdk-hg

Now, hg specifics.  It looks like the manifest is huge.  This corresponds to
how long it took to download.

-rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 25 12:16 00changelog.d
-rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 25 12:01 00changelog.i
-rw-r--r--   1 jeffpc   jeffpc      434M Jul 25 12:09 00manifest.d
-rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 25 12:09 00manifest.i

Not a complete surprise given that there are a lot of files (~65k) tracked
and many use the super-long file paths (e.g.,
test/hotspot/jtreg/runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java).
That adds up.  Just the paths in the manifest itself add up to almost 4.7MB.

$ hg manifest | wc
   65415   65415 4694467
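
As a rough illustration of why this adds up, here is a small Python sketch
based on the wc output.  It assumes the flat manifest stores one entry per
file in the form "<path>\0<40 hex digits>\n" and ignores the optional flag
character, so the full-text size is an estimate, not a measurement.

```python
FILES = 65415         # lines reported by `hg manifest | wc`
PATH_BYTES = 4694467  # bytes reported by wc, one trailing newline per path

# Assumption: each flat-manifest entry is "<path>\0<40 hex digits>\n",
# ignoring the optional one-character flag.
NUL_AND_HASH = 1 + 40

avg_path_len = PATH_BYTES / FILES - 1  # minus the newline counted by wc
full_manifest_bytes = PATH_BYTES + FILES * NUL_AND_HASH

print(f"average path length: {avg_path_len:.1f} bytes")
print(f"one full manifest:   {full_manifest_bytes / 2**20:.2f} MiB")
```

So every manifest revision describes roughly 7 MiB of text, and how well the
deltas between revisions are chosen matters a lot for the size on disk.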

I'm guessing that they would have benefited from treemanifest.


I also tried to clone locally to see what sort of thing a user would see.

$ hg clone jdk-hg test
$ git clone jdk-git test-git

hg took 60 seconds (with hot cache, ~120 secs cold cache), git took 13
seconds.  Git hardlinked the one big pack file, while hg hardlinked each of
the files in .hg/store.  Obviously, hardlinking 2 files is much faster than
hardlinking ~180k.  (treemanifest would have made this even worse for hg.)


I just kicked off a conversion to treemanifest.  It'll take a while.

Jeff.
Joerg Sonnenberger - July 26, 2020, 2:11 a.m.
On Sat, Jul 25, 2020 at 01:36:32PM -0400, Josef 'Jeff' Sipek wrote:
> First off, the clone itself.  I cloned it from the official upstream repos.
> My internet connection is 150 Mbit/s, the storage is a 3-way ZFS mirror.  I
> used hg 4.9.1 (py27), and git 2.21.0.  (I know, I need to update both.  This
> is on a box that has a solid network connection but is harder to update.  If
> there is interest I can spend the effort to update them and re-run it with
> newer versions.)

It should be noted that for all intents and purposes, a git clone is
much more comparable to hg clone --stream.

> Now, hg specifics.  It looks like the manifest is huge.  This corresponds to
> how long it took to download.
> 
> -rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 25 12:16 00changelog.d
> -rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 25 12:01 00changelog.i
> -rw-r--r--   1 jeffpc   jeffpc      434M Jul 25 12:09 00manifest.d
> -rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 25 12:09 00manifest.i

I have similar reservations about the way manifests are handled for the
NetBSD repository. It's been a topic of discussion recently on IRC. The
manifest processing itself currently takes nearly half of the total
clone time and that looks ...suspicious at best.


> I'm guessing that they would have benefited from treemanifest.

From my testing, treemanifests don't help at all.

> I also tried to clone locally to see what sort of thing a user would see.
> 
> $ hg clone jdk-hg test
> $ git clone jdk-git test-git
> 
> hg took 60 seconds (with hot cache, ~120 secs cold cache), git took 13
> seconds.  Git hardlinked the one big pack file, while hg hardlinked each of
> the file in .hg/store.  Obviosly, hardlinking 2 files is much faster than
> hardlinking ~180k.  (treemanifest would have made this even worse for hg.)

Using a unified storage would help somewhat in general, but I don't
consider local clone a big use case. share serves the purpose generally
much better.

> I just kicked off a conversion to treemanifest.  It'll take a while.

Did you convert to generaldelta etc. already?

Joerg
Josef 'Jeff' Sipek - July 26, 2020, 3:12 p.m.
On Sun, Jul 26, 2020 at 04:11:06 +0200, Joerg Sonnenberger wrote:
> On Sat, Jul 25, 2020 at 01:36:32PM -0400, Josef 'Jeff' Sipek wrote:
> > First off, the clone itself.  I cloned it from the official upstream repos.
> > My internet connection is 150 Mbit/s, the storage is a 3-way ZFS mirror.  I
> > used hg 4.9.1 (py27), and git 2.21.0.  (I know, I need to update both.  This
> > is on a box that has a solid network connection but is harder to update.  If
> > there is interest I can spend the effort to update them and re-run it with
> > newer versions.)
> 
> It should be noted that for all intends and purposes, a git clone is
> much more comparable to hg clone --stream.

I don't know if this is a temporary error or if the java.net server
disallows it, but:

$ hg clone --stream https://hg.openjdk.java.net/jdk/jdk jdk-stream
streaming all changes
abort: locking the remote repository failed

It'd make sense for this to be disabled by policy, because you don't want
someone doing a slow streaming pull to lock the server's repo for hours,
preventing other pushes (assuming that's the same lock).


Doing the clone over the LAN (gigabit ethernet) took 1m26s total (including
the checkout):

$ hg clone --stream http://server-host:8000 test-hg
streaming all changes
187754 files to transfer, 1.07 GB of data
transferred 1.07 GB in 45.5 seconds (24.0 MB/sec)
updating to branch default
65415 files updated, 0 files merged, 0 files removed, 0 files unresolved

The client host was running at 99% CPU while receiving the data, while the
server was at around 80-90%.  So, I'm concluding that in this local case I
was CPU bound on the client, but the server wasn't exactly lightly loaded.

For comparison, git cloning (including checkout) over the same LAN took 60
seconds.  So, faster than hg streaming clone, but only by ~26 seconds.

> > Now, hg specifics.  It looks like the manifest is huge.  This corresponds to
> > how long it took to download.
> > 
> > -rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 25 12:16 00changelog.d
> > -rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 25 12:01 00changelog.i
> > -rw-r--r--   1 jeffpc   jeffpc      434M Jul 25 12:09 00manifest.d
> > -rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 25 12:09 00manifest.i
> 
> I have similar reservations about the way manifests are handled for the
> NetBSD repository. It's been a topic of discussion recently on IRC. The
> manifest processing itself currently takes nearly half of the total
> clone time and that looks ...suspicious at best.

Indeed.  I don't have the knowledge/experience to suggest improvements, but
I can run benchmarks :)

> > I'm guessing that they would have benefited from treemanifest.
> 
> From my testing, treemanifests don't help at all.

They seemed to help with the jdk repo.  I'm guessing that jdk has more deeply
nested directories with longer file names, because the conversion certainly
seemed to help (tm == treemanifest):

$ hg --config extensions.convert= convert ../jdk-hg . ../tm-map
$ cd ..
$ du -sAh */.{git,hg}
452M    jdk-git/.git 
1.11G   jdk-hg/.hg
784M    jdk-tm/.hg

Not amazing, but it is about 70% of the "monolithic" manifest repo.  The
manifest part itself:

$ ls -lh 00*
-rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 25 20:46 00changelog.d
-rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 25 20:47 00changelog.i
-rw-r--r--   1 jeffpc   jeffpc     4.08M Jul 25 20:46 00manifest.d
-rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 25 20:47 00manifest.i

$ du -sAh meta    
89.4M   meta

So, the (treemanifest) manifest data is about 97 MB total vs. 437 MB total
with the monolithic manifest.  This equates to 22% of the original manifest
size.
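
The arithmetic behind those percentages, as a minimal Python sketch.  The
sizes are copied from the listings above; treating 1.11G as 1.11 * 1024 M is
an assumption about du's units.

```python
# Sizes in MiB, copied from the listings above.
flat_manifest = 434 + 3.67          # 00manifest.d + 00manifest.i (jdk-hg)
tree_manifest = 4.08 + 3.67 + 89.4  # root manifest revlog + meta/ (jdk-tm)

manifest_share = tree_manifest / flat_manifest
repo_share = 784 / (1.11 * 1024)    # jdk-tm/.hg relative to jdk-hg/.hg

print(f"treemanifest data: {tree_manifest:.1f}M of {flat_manifest:.1f}M "
      f"({manifest_share:.0%})")
print(f"whole .hg:         {repo_share:.0%} of the flat-manifest clone")
```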

...
> > I just kicked off a conversion to treemanifest.  It'll take a while.
> 
> Did you convert to generaldelta and etc already?

'hg clone' produced a reasonable repo without conversion.  The only
requirement added during the conversion was treemanifest.

$ cat jdk-hg/.hg/requires
dotencode
fncache
generaldelta
revlogv1
sparserevlog
store
$ diff jdk-{hg,tm}/.hg/requires
6a7
> treemanifest
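
The same comparison can be scripted.  A minimal Python sketch follows; the
requires contents are pasted in as strings so it runs standalone, and
parse_requires() / added_requirements() are illustrative helpers, not
Mercurial APIs.

```python
# Contents of jdk-hg/.hg/requires as shown above.
HG_REQUIRES = """dotencode
fncache
generaldelta
revlogv1
sparserevlog
store
"""
# jdk-tm/.hg/requires differs only by the extra treemanifest line.
TM_REQUIRES = HG_REQUIRES + "treemanifest\n"


def parse_requires(text):
    """Return the set of requirement names, one per non-empty line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def added_requirements(before, after):
    """Return requirements present in `after` but not in `before`, sorted."""
    return sorted(parse_requires(after) - parse_requires(before))


print(added_requirements(HG_REQUIRES, TM_REQUIRES))
print("generaldelta" in parse_requires(HG_REQUIRES))
```

Note that a requirement only says which on-disk format the repo uses; it does
not say how good the stored deltas are.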

I can try other requirements, but I think the manifest problem jdk people
saw was the huge size due to data duplication inside the manifest data -
duplication that went away by manifest subtree "dedup" between revisions.

Jeff.
Joerg Sonnenberger - July 26, 2020, 4:35 p.m.
On Sun, Jul 26, 2020 at 11:12:25AM -0400, Josef 'Jeff' Sipek wrote:
> > > I'm guessing that they would have benefited from treemanifest.
> > 
> > From my testing, treemanifests don't help at all.
> 
> They seemed to help with the jdk repo.  I'm guessing that jdk has a deeper
> nested directories with longer file names because the conversion certainly
> seemed to help (tm == treemanifest):

Can you run "hg debugupgraderepo -o re-delta-all" once? IIRC the
original repository doesn't use generaldelta and this would also affect
the manifest. 

Joerg
Josef 'Jeff' Sipek - July 26, 2020, 6:28 p.m.
On Sun, Jul 26, 2020 at 18:35:03 +0200, Joerg Sonnenberger wrote:
> On Sun, Jul 26, 2020 at 11:12:25AM -0400, Josef 'Jeff' Sipek wrote:
> > > > I'm guessing that they would have benefited from treemanifest.
> > > 
> > > From my testing, treemanifests don't help at all.
> > 
> > They seemed to help with the jdk repo.  I'm guessing that jdk has a deeper
> > nested directories with longer file names because the conversion certainly
> > seemed to help (tm == treemanifest):
> 
> Can you run "hg debugupgraderepo -o re-delta-all" once? IIRC the
> original repository doesn't use generaldelta and this would also affect
> the manifest. 

$ hg debugupgraderepo -o re-delta-all --run --no-backup
...
beginning upgrade...
repository locked and read-only
creating temporary repository to stage migrated data: /ws/tmp/jdk-hg/.hg/upgrade.UaS6Ss
(it is safe to interrupt this process any time before data migration completes)
migrating 637431 total revisions (516970 in filelogs, 60143 in manifests, 60318 in changelog)
migrating 1.07 GB in store; 298 GB tracked data
migrating 187542 filelogs containing 516970 revisions (625 MB in store; 11.9 GB tracked data)
finished migrating 516970 filelog revisions across 187542 filelogs; change in size: -2.14 MB
migrating 1 manifests containing 60143 revisions (438 MB in store; 286 GB tracked data)
finished migrating 60143 manifest revisions across 1 manifests; change in size: -382 MB
migrating changelog containing 60318 revisions (28.8 MB in store; 175 MB tracked data)
finished migrating 60318 changelog revisions; change in size: 0 bytes
finished migrating 637431 total revisions; total change in store size: -384 MB
copying phaseroots
...

Wow, that's a massive change to the manifest size!

-rw-r--r--   1 jeffpc   jeffpc     25.2M Jul 26 13:55 00changelog.d
-rw-r--r--   1 jeffpc   jeffpc     3.68M Jul 26 13:55 00changelog.i
-rw-r--r--   1 jeffpc   jeffpc     52.3M Jul 26 13:54 00manifest.d
-rw-r--r--   1 jeffpc   jeffpc     3.67M Jul 26 13:54 00manifest.i
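
For scale, a minimal Python sketch of what the upgrade output implies.  The
figures are taken from the debugupgraderepo log above; reading "1.07 GB" as
1.07 * 1024 MB is an assumption about the units hg prints.

```python
# Figures copied from the debugupgraderepo output above, in MB.
manifest_before_mb = 438       # "438 MB in store" for the manifest
manifest_change_mb = -382      # "change in size: -382 MB"
store_before_mb = 1.07 * 1024  # "migrating 1.07 GB in store" (assumed binary)
store_change_mb = -384         # "total change in store size: -384 MB"

manifest_after_mb = manifest_before_mb + manifest_change_mb
reduction = -manifest_change_mb / manifest_before_mb
store_after_mb = store_before_mb + store_change_mb

print(f"manifest after re-delta: {manifest_after_mb} MB ({reduction:.0%} smaller)")
print(f"store after re-delta:    {store_after_mb:.0f} MB")
```

The 56 MB result agrees with the listing (52.3M + 3.67M).  The store is only
part of .hg, so the second number is not directly comparable to the du
figures for .hg and .git.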

After the repo upgrade, I ran hg serve and cloned it (non-streaming).  The
clone's manifest is somewhat larger but still reasonably sized:

-rw-r--r--   1 jeffpc  jeffpc    25M Jul 26 14:23 00changelog.d
-rw-r--r--   1 jeffpc  jeffpc   3.7M Jul 26 14:16 00changelog.i
-rw-r--r--   1 jeffpc  jeffpc    61M Jul 26 14:17 00manifest.d
-rw-r--r--   1 jeffpc  jeffpc   3.7M Jul 26 14:17 00manifest.i

Jeff.
Augie Fackler - July 26, 2020, 8:57 p.m.
On Sat, Jul 25, 2020 at 10:19 PM Joerg Sonnenberger <joerg@bec.de> wrote:
>
> On Sat, Jul 25, 2020 at 01:36:32PM -0400, Josef 'Jeff' Sipek wrote:
> > First off, the clone itself.  I cloned it from the official upstream repos.
> > My internet connection is 150 Mbit/s, the storage is a 3-way ZFS mirror.  I
> > used hg 4.9.1 (py27), and git 2.21.0.  (I know, I need to update both.  This
> > is on a box that has a solid network connection but is harder to update.  If
> > there is interest I can spend the effort to update them and re-run it with
> > newer versions.)
>
> It should be noted that for all intends and purposes, a git clone is
> much more comparable to hg clone --stream.

One thing we did on Google Code that I've never been able to convince
someone to try is cache deltas: we had an outage caused by delta
computation being slow, and a side effect of that was caching deltas
pretty aggressively. That moved our servers from being CPU-bound to
being IO-bound on BigTable reads, and IIRC we were able to satisfy
pretty much any request at client-limited speeds from then on. It'd
probably still be a worthwhile effort to see about allowing memcached
or similar to store deltas for a server pool and let them avoid
significant amounts of delta computation.
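
To make the idea concrete, here is a toy Python sketch of a delta cache.
Everything in it is hypothetical: DeltaCache, compute_delta() and the in-memory
dict merely stand in for a shared store such as memcached, and difflib's
line-based diff stands in for Mercurial's real binary deltas.  It is not how
Google Code or Mercurial implement this.

```python
import difflib


def compute_delta(base, target):
    """Toy line-based delta; a stand-in for the expensive server-side step."""
    return list(difflib.unified_diff(base.splitlines(), target.splitlines(),
                                     lineterm=""))


class DeltaCache:
    """Hypothetical cache of deltas keyed by (base revision, target revision)."""

    def __init__(self):
        self._store = {}  # stand-in for memcached or a similar shared store
        self.hits = 0
        self.misses = 0

    def get_delta(self, base_id, target_id, load_text):
        """Return the cached delta, computing and storing it on a miss."""
        key = (base_id, target_id)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        delta = compute_delta(load_text(base_id), load_text(target_id))
        self._store[key] = delta
        return delta


# Usage: two clients asking for the same delta cost one computation.
revisions = {"r0": "a\nb\n", "r1": "a\nc\n"}
cache = DeltaCache()
first = cache.get_delta("r0", "r1", revisions.__getitem__)
second = cache.get_delta("r0", "r1", revisions.__getitem__)
print(cache.hits, cache.misses)
```

Because revisions are immutable once committed, a cache keyed this way would
never need invalidation, only eviction.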

AF
Pierre-Yves David - July 31, 2020, 3:55 p.m.
On 7/25/20 7:36 PM, Josef 'Jeff' Sipek wrote:
> On Sat, Jul 25, 2020 at 12:27:42 +0200, Antonio Muci via Mercurial-devel wrote:
>> That's sad.
> 
> Yeah.
> 
> This motivated me enough to clone the repos (hg and git) and collect some
> data.  Maybe people here will find it useful.
I got in touch with the OpenJDK people a year and a half ago. The 
version of Mercurial they use on the server is extremely old. The 
repository format they use is ancient (not even general delta IIRC).

Moving to a modern Mercurial version, using sparse revlog for storage 
and recomputing delta gave a massive boost to storage size and clone 
performance.

However, I never managed to get them to even simply upgrade their Mercurial 
server side. Some of the issues OpenJDK had were legitimate concerns that 
we could improve on, but a good share was also a lack of interest in 
actually improving their Mercurial situation. The desire to move to GitHub 
for community reasons was strong.
Antonio Muci via Mercurial-devel - July 31, 2020, 4:30 p.m.
> Il 31/07/2020 17:55 Pierre-Yves David <pierre-yves.david@ens-lyon.org> ha scritto:
>
> I got int touch with the OpenJDK people one and half year ago. [...]

Very active move on your part. Kudos.


> Moving to a modern Mercurial version, using sparse revlog for storage 
> and recomputing delta gave a massive boost to storage size and clone 
> performance.

At least this is reassuring: performance-wise, mercurial has not fallen that far behind.
The tests performed by Josef and Joerg confirm that a performance disadvantage does exist, but it's not massive.

> a good share was also lack of interrest in 
> actually improves their Mercurial situation. The crave to move to Github
> for community reason was strong.

I can understand wanting to benefit from the GitHub network effect, and I do not want to focus on it here.

What concerns me the most are two things:


1. scripta manent: when, some years from now, people google for "mercurial performance", they will stumble upon the JDK considerations and take them for granted. What will remain in a potential user's head is "mercurial is slow, go for git. JDK guys have done the same". There is no other written material counterbalancing these moves (except for very interesting blog entries by Gregory Szorc, possibly), and so the collective mindset slowly slips away.

2. (consequence of 1) no mindset that another valid SCM exists: SCM == GitHub, because - obviously - git == "hosted service with integrated issue tracker, CI and whatnot", right?

I am wondering if the countermeasures to this have to be only technical. I see this more as a communication disadvantage compared to the git ecosystem.
Pierre-Yves David - July 31, 2020, 4:43 p.m.
On 7/31/20 6:30 PM, Antonio Muci wrote:
> I am wondering if the countermeasures to this have to be only technical. I see this more as a communication disadvantage compared to the git ecosystem.

We could definitely use more communication :-/
Joerg Sonnenberger - July 31, 2020, 4:59 p.m.
On Fri, Jul 31, 2020 at 06:30:57PM +0200, Antonio Muci via Mercurial-devel wrote:
> What concerns me the most are two things:
> 
> 1. scripta manent: when in some years people will google for "mercurial
> performance" they will stumble upon JDK considerations, and take them
> form granted. What will remain in a potential user's head is "mercurial
> is slow, go for git. JDK guys have done the same". There is no other
> written material counterweighting these moves (except for very
> interesting blog entries by Gregory Szorc, possibly), and so the
> collective mindset slowly slips away.

I fully agree with the problem and I've had to deal with the same issue
before. I consider the write-up from the OpenJDK people to be quite
dishonest, but there is little that we can do about it.

Joerg
Josef 'Jeff' Sipek - Aug. 1, 2020, 2:11 a.m.
On Fri, Jul 31, 2020 at 18:30:57 +0200, Antonio Muci wrote:
> > Il 31/07/2020 17:55 Pierre-Yves David <pierre-yves.david@ens-lyon.org> ha scritto:
...
> > Moving to a modern Mercurial version, using sparse revlog for storage 
> > and recomputing delta gave a massive boost to storage size and clone 
> > performance.
> 
> At least this reassures that performance-wise mercurial has not fallen
> behind so much.
> The tests performed by Josef and Joerg confirm that a performance
> disadvantage exists indeed, but it's not massive.

Keep in mind that I did only clone testing.  I use both hg and git (hg
because I want to, git because I have to), and I have to admit that
something as simple as 'hg log' / 'git log' feels completely different.
git's log output appears on the screen instantaneously, while hg's takes a
fraction of a second.  It is a small fraction, but it "feels" slower.  I
think this has been diagnosed over and over as slow python startup.

...
> What concerns me the most are two things:
> 
> 1. scripta manent: when in some years people will google for "mercurial
> performance" they will stumble upon JDK considerations, and take them form
> granted. What will remain in a potential user's head is "mercurial is
> slow, go for git. JDK guys have done the same". There is no other written
> material counterweighting these moves (except for very interesting blog
> entries by Gregory Szorc, possibly), and so the collective mindset slowly
> slips away.

Around 2010, I messed quite a bit with the xfs file system in linux.  It was
really annoying that users found "tuning guide" slashdot posts from
2001-2003 that were completely wrong but they still kept finding them and
using them.  Often, this resulted in worse performance but the users were
also bad at benchmarking so they didn't notice until it was too late and
their file systems had a lot of data.  (I think it has gotten better, but
those horrid guides are still out there.)  In other words, it takes a *lot*
of effort to make sure people on the internet don't find misinformation.  I
don't really know how, but I think it needs to be a concentrated effort to
be "louder" than the misinformation.  (I consider outdated information
misinformation.)

Jeff.
Marcus Harnisch - Aug. 5, 2020, 8:18 a.m.
On 25/07/2020 10.11, David Demelier wrote:
> # HG changeset patch
> # User David Demelier <markand@malikania.fr>
> # Date 1595664656 -7200
> #      Sat Jul 25 10:10:56 2020 +0200
> # Node ID 7eaad1ed8c743d40fe71620434f3a151f0067105
> # Parent  b0e3c6141a7844e1fdd55535677ea3bfb1527707
> who: remove OpenJDK
> 
> They unfortunately moved to GitHub.


Both OpenJDK and NetBeans are still mentioned here:

   https://www.mercurial-scm.org/about

Perhaps these could be replaced with other large repos. Mozilla comes to 
mind.

Cheers,
Marcus
Pulkit Goyal - Aug. 5, 2020, 8:47 a.m.
On Wed, Aug 5, 2020 at 1:55 PM Marcus Harnisch <mh-mercurial@online.de> wrote:
>
> On 25/07/2020 10.11, David Demelier wrote:
> > # HG changeset patch
> > # User David Demelier <markand@malikania.fr>
> > # Date 1595664656 -7200
> > #      Sat Jul 25 10:10:56 2020 +0200
> > # Node ID 7eaad1ed8c743d40fe71620434f3a151f0067105
> > # Parent  b0e3c6141a7844e1fdd55535677ea3bfb1527707
> > who: remove OpenJDK
> >
> > They unfortunately moved to GitHub.
>
>
> Both, OpenJDK and NetBeans are still mentioned here:
>
>    https://www.mercurial-scm.org/about

Oops, if possible, can you email a patch for this? The website
repository lives at https://www.mercurial-scm.org/repo/hg-website/.
>
> Perhaps these could be replaced with other large repos. Mozilla comes to
> mind.

Yes, that sounds like a good replacement.
>
> Cheers,
> Marcus
>

Thanks and Regards
Pulkit

Patch

diff -r b0e3c6141a78 -r 7eaad1ed8c74 templates/who/index.html
--- a/templates/who/index.html	Fri Jul 26 14:27:08 2019 +0200
+++ b/templates/who/index.html	Sat Jul 25 10:10:56 2020 +0200
@@ -9,9 +9,6 @@ 
         <h3>Mozilla</h3>
         Mozilla is an open source project that is currently developing the popular <a href="https://www.mozilla.org/firefox">Firefox</a> internet browser, the email client <a href="https://www.mozilla.org/thunderbird">Thunderbird</a> and the application suite SeaMonkey. Mozilla chose Mercurial in 2006.</p>
         <p><a href="https://www.mozilla.org">https://www.mozilla.org</a></p>
-        <h3>Java / OpenJDK</h3>
-        OpenJDK is the official open sourced Java implementation of Sun Microsystems. When open sourcing the project, Sun chose Mercurial as their main version control system.
-        <p><a href="http://openjdk.java.net/">http://openjdk.java.net/</a></p>
         <h3>Nginx</h3>
         The nginx web server is among one of the most popular and used over the world.
         <p><a href="http://nginx.org/">http://nginx.org/</a></p>