Patchwork [3,of,3,RFC] sslutil: try to find and use system CA file when appropriate

login
register
mail settings
Submitter Gregory Szorc
Date June 25, 2016, 4:37 p.m.
Message ID <fb5033513e39b96777f9.1466872635@ubuntu-vm-main>
Download mbox | patch
Permalink /patch/15614/
State Accepted
Headers show

Comments

Gregory Szorc - June 25, 2016, 4:37 p.m.
# HG changeset patch
# User Gregory Szorc <gregory.szorc@gmail.com>
# Date 1466866076 25200
#      Sat Jun 25 07:47:56 2016 -0700
# Node ID fb5033513e39b96777f9794b9542c3632a64fa75
# Parent  13818ab440e16c575b09a6e4583f9a17550a6e52
[RFC] sslutil: try to find and use system CA file when appropriate

Previously, when running on Python installs that don't know how to
load the system CA cert store we wouldn't load any CA certificates
and would subsequently refuse to talk to servers since we were unable
to verify their certificate was signed by an appropriate CA. The
default user experience on these Python installs was poor.

With this patch, we attempt to improve the default situation by
trying to find the system CA store file by looking in well-known
locations. We've defined locations used by popular Linux
distributions.

When running Python 2.6.9 on Ubuntu 16.04, this patch has been
observed to convert aborts from CA verification failure to successful
connections.

I'm not fully convinced this patch is the correct behavior. I can
make an argument that we should issue a ui.warn() when we're
hitting this code because I think Mercurial installations should have
web.cacerts explicitly configured in the system hgrc. Otherwise, we
may inadvertently use a wrong file for the current system and that
could lead to failure to validate or even validating using certificates
that shouldn't be used. In other words, I think that Mercurial packagers
have a responsibility to point Mercurial at the CA store appropriate
for that system and that Mercurial should defer security decisions to
packagers. This works well for system packagers but falls apart for
e.g. pip packages, which are system/disto agnostic.

Another option is to put this new code behind a config option.
Pierre-Yves David - June 26, 2016, 2:06 a.m.
On 06/25/2016 06:37 PM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc@gmail.com>
> # Date 1466866076 25200
> #      Sat Jun 25 07:47:56 2016 -0700
> # Node ID fb5033513e39b96777f9794b9542c3632a64fa75
> # Parent  13818ab440e16c575b09a6e4583f9a17550a6e52
> [RFC] sslutil: try to find and use system CA file when appropriate

I like the idea of trying to reduce the pain of user still on 2.6 by
looking the CA semi-automatically. However, I agree with you that doing
it might be suboptimal (especially security wise), I do not have a
strong opinion about doing it or not in absolute yet, but we should
probably at least issue a warning when doing so. With a pointer to why
the warning exist and how to suppress it.
As I understand it, in the same situation before we would have accept
the certificate with a warning so this seems like a proper evolution of
the behavior.
timeless - June 27, 2016, 10:37 p.m.
One thing we could do is abort but spit out a config that would work.

So if we detect:
/etc/pki/tls/certs/ca-bundle.trust.crt

abort: ca bundle isn't configured
you can add the following to .hg/hgrc:
web.cacerts=/etc/pki/tls/certs/ca-bundle.trust.crt
Gregory Szorc - June 30, 2016, 3:14 a.m.
On Sat, Jun 25, 2016 at 7:06 PM, Pierre-Yves David <
pierre-yves.david@ens-lyon.org> wrote:

>
>
> On 06/25/2016 06:37 PM, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc@gmail.com>
> > # Date 1466866076 25200
> > #      Sat Jun 25 07:47:56 2016 -0700
> > # Node ID fb5033513e39b96777f9794b9542c3632a64fa75
> > # Parent  13818ab440e16c575b09a6e4583f9a17550a6e52
> > [RFC] sslutil: try to find and use system CA file when appropriate
>
> I like the idea of trying to reduce the pain of user still on 2.6 by
> looking the CA semi-automatically. However, I agree with you that doing
> it might be suboptimal (especially security wise), I do not have a
> strong opinion about doing it or not in absolute yet, but we should
> probably at least issue a warning when doing so. With a pointer to why
> the warning exist and how to suppress it.
> As I understand it, in the same situation before we would have accept
> the certificate with a warning so this seems like a proper evolution of
> the behavior.


So... TIL that distro packagers don't care about Python security either.

On Debian Jessie and Ubuntu 16.04:

$ /usr/bin/python2.7
>>> import ssl
>>> ctx = ssl.create_default_context()
>>> ctx.load_default_certs()
>>> ctx.cert_store_stats()
{'x509': 0, 'x509_ca': 0, 'crl': 0}
>>> ctx.load_verify_locations(cafile='/etc/ssl/certs/ca-certificates.crt')
>>> ctx.cert_store_stats()
{'x509': 173, 'x509_ca': 173, 'crl': 0}

The whole point of ssl.SSLContext.load_default_certs() is it is supposed to
load the default/system CA certificates from default locations. As you can
see, this flat out doesn't work on Debian and Ubuntu 16.04. The result is
that /usr/bin/python2.7 on these distros is unable to load any CA
certificates.

Some very brief perusing of the Internets recommends using non-stdlib
modules [which know the locations of system CA certs on various
distributions] to load CA certs. Seriously.

Fortunately, RHEL 7 and CentOS 7 have a better outcome:

$ /usr/bin/python2.7
>>> import ssl
>>> ctx = ssl.create_default_context()
>>> ctx.load_default_certs()
>>> ctx.cert_store_stats()
{'x509': 167, 'x509_ca': 167, 'crl': 0}

Kudos to RedHat for configuring their Python package to load the system CA
certificates and provide working CA verification when using the standard
library.

Anyway, given the popularity of Debian-based distros, I think we'll need to
do something to detect the system CA certs file
(/etc/ssl/certs/ca-certificates.crt) since Debian's Python package isn't
willing to load it for us. Expect a v2 of this RFC patch from me eventually.
Yuya Nishihara - June 30, 2016, 2:40 p.m.
On Wed, 29 Jun 2016 20:14:43 -0700, Gregory Szorc wrote:
> Anyway, given the popularity of Debian-based distros, I think we'll need to
> do something to detect the system CA certs file
> (/etc/ssl/certs/ca-certificates.crt) since Debian's Python package isn't
> willing to load it for us. Expect a v2 of this RFC patch from me eventually.

Maybe you know, but the Debian package sets up web.cacerts. Users are safe
as long as they install Mercurial as a package.
Gregory Szorc - June 30, 2016, 3:06 p.m.
On Wed, Jun 29, 2016 at 8:14 PM, Gregory Szorc <gregory.szorc@gmail.com>
wrote:

> On Sat, Jun 25, 2016 at 7:06 PM, Pierre-Yves David <
> pierre-yves.david@ens-lyon.org> wrote:
>
>>
>>
>> On 06/25/2016 06:37 PM, Gregory Szorc wrote:
>> > # HG changeset patch
>> > # User Gregory Szorc <gregory.szorc@gmail.com>
>> > # Date 1466866076 25200
>> > #      Sat Jun 25 07:47:56 2016 -0700
>> > # Node ID fb5033513e39b96777f9794b9542c3632a64fa75
>> > # Parent  13818ab440e16c575b09a6e4583f9a17550a6e52
>> > [RFC] sslutil: try to find and use system CA file when appropriate
>>
>> I like the idea of trying to reduce the pain of user still on 2.6 by
>> looking the CA semi-automatically. However, I agree with you that doing
>> it might be suboptimal (especially security wise), I do not have a
>> strong opinion about doing it or not in absolute yet, but we should
>> probably at least issue a warning when doing so. With a pointer to why
>> the warning exist and how to suppress it.
>> As I understand it, in the same situation before we would have accept
>> the certificate with a warning so this seems like a proper evolution of
>> the behavior.
>
>
> So... TIL that distro packagers don't care about Python security either.
>
> On Debian Jessie and Ubuntu 16.04:
>
> $ /usr/bin/python2.7
> >>> import ssl
> >>> ctx = ssl.create_default_context()
> >>> ctx.load_default_certs()
> >>> ctx.cert_store_stats()
> {'x509': 0, 'x509_ca': 0, 'crl': 0}
> >>> ctx.load_verify_locations(cafile='/etc/ssl/certs/ca-certificates.crt')
> >>> ctx.cert_store_stats()
> {'x509': 173, 'x509_ca': 173, 'crl': 0}
>
> The whole point of ssl.SSLContext.load_default_certs() is it is supposed
> to load the default/system CA certificates from default locations. As you
> can see, this flat out doesn't work on Debian and Ubuntu 16.04. The result
> is that /usr/bin/python2.7 on these distros is unable to load any CA
> certificates.
>
> Some very brief perusing of the Internets recommends using non-stdlib
> modules [which know the locations of system CA certs on various
> distributions] to load CA certs. Seriously.
>

The situation on Debian and Ubuntu doesn't appear to be as bad as initially
thought.

ssl.SSLContext.load_verify_locations() specifies 2 ways to load certs: via
a single file and via a path containing files with hashes of CAs.

When you load a directory, the certs are loaded on demand as they are
encountered through connection activity. Only then do the certs show up in
cert_store_stats().

Debian and Ubuntu appear to be using this directory-based cert loading.
RHEL is using the file-based one. And FWIW Mercurial only supports
specifying a config option for loading certs from a file. We /could/
support loading from a directory if we wanted. Although I haven't heard
anybody clamor for this, so I'm not going to do it.

Anyway, what this means is that "are any CAs loaded" is a bit difficult to
implement on systems loading certs from directories because CA count == 0
could mean either no CAs loaded or the CA wanted by the server connection
just isn't present locally. I'll likely have to tweak the warning message
for "no CAs loaded" that I added a few hours ago and Yuya recently queued.
I'll also have to look into behavior on Windows and OS X to see if they are
reporting counts in cert_store_stats(). Perhaps it would be best to prune
that last patch from the committed repo until I get a handle on things. If
not, I'll send follow-up commits.


>
> Fortunately, RHEL 7 and CentOS 7 have a better outcome:
>
> $ /usr/bin/python2.7
> >>> import ssl
> >>> ctx = ssl.create_default_context()
> >>> ctx.load_default_certs()
> >>> ctx.cert_store_stats()
> {'x509': 167, 'x509_ca': 167, 'crl': 0}
>
> Kudos to RedHat for configuring their Python package to load the system CA
> certificates and provide working CA verification when using the standard
> library.
>
> Anyway, given the popularity of Debian-based distros, I think we'll need
> to do something to detect the system CA certs file
> (/etc/ssl/certs/ca-certificates.crt) since Debian's Python package isn't
> willing to load it for us. Expect a v2 of this RFC patch from me eventually.
>
Augie Fackler - June 30, 2016, 4:12 p.m.
> On Jun 29, 2016, at 11:14 PM, Gregory Szorc <gregory.szorc@gmail.com> wrote:
> 
> On Sat, Jun 25, 2016 at 7:06 PM, Pierre-Yves David <pierre-yves.david@ens-lyon.org> wrote:
> 
> 
> On 06/25/2016 06:37 PM, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc@gmail.com>
> > # Date 1466866076 25200
> > #      Sat Jun 25 07:47:56 2016 -0700
> > # Node ID fb5033513e39b96777f9794b9542c3632a64fa75
> > # Parent  13818ab440e16c575b09a6e4583f9a17550a6e52
> > [RFC] sslutil: try to find and use system CA file when appropriate
> 
> I like the idea of trying to reduce the pain of user still on 2.6 by
> looking the CA semi-automatically. However, I agree with you that doing
> it might be suboptimal (especially security wise), I do not have a
> strong opinion about doing it or not in absolute yet, but we should
> probably at least issue a warning when doing so. With a pointer to why
> the warning exist and how to suppress it.
> As I understand it, in the same situation before we would have accept
> the certificate with a warning so this seems like a proper evolution of
> the behavior.
> 
> So... TIL that distro packagers don't care about Python security either.
> 
> On Debian Jessie and Ubuntu 16.04:
> 
> $ /usr/bin/python2.7
> >>> import ssl
> >>> ctx = ssl.create_default_context()
> >>> ctx.load_default_certs()
> >>> ctx.cert_store_stats()
> {'x509': 0, 'x509_ca': 0, 'crl': 0}
> >>> ctx.load_verify_locations(cafile='/etc/ssl/certs/ca-certificates.crt')
> >>> ctx.cert_store_stats()
> {'x509': 173, 'x509_ca': 173, 'crl': 0}
> 
> The whole point of ssl.SSLContext.load_default_certs() is it is supposed to load the default/system CA certificates from default locations. As you can see, this flat out doesn't work on Debian and Ubuntu 16.04. The result is that /usr/bin/python2.7 on these distros is unable to load any CA certificates.
> 
> Some very brief perusing of the Internets recommends using non-stdlib modules [which know the locations of system CA certs on various distributions] to load CA certs. Seriously.
> 
> Fortunately, RHEL 7 and CentOS 7 have a better outcome:
> 
> $ /usr/bin/python2.7
> >>> import ssl
> >>> ctx = ssl.create_default_context()
> >>> ctx.load_default_certs()
> >>> ctx.cert_store_stats()
> {'x509': 167, 'x509_ca': 167, 'crl': 0}
> 
> Kudos to RedHat for configuring their Python package to load the system CA certificates and provide working CA verification when using the standard library.
> 
> Anyway, given the popularity of Debian-based distros, I think we'll need to do something to detect the system CA certs file (/etc/ssl/certs/ca-certificates.crt) since Debian's Python package isn't willing to load it for us. Expect a v2 of this RFC patch from me eventually.

For this case, perhaps we should sniff for https://pypi.python.org/pypi/certifi and if it’s installed use it to find certs? Then we could just recommend that distort packagers use that if they don’t want to (or know how to) point at the system store.

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Yuya Nishihara - July 1, 2016, 2:02 p.m.
On Thu, 30 Jun 2016 08:06:25 -0700, Gregory Szorc wrote:
> ssl.SSLContext.load_verify_locations() specifies 2 ways to load certs: via
> a single file and via a path containing files with hashes of CAs.
> 
> When you load a directory, the certs are loaded on demand as they are
> encountered through connection activity. Only then do the certs show up in
> cert_store_stats().
> 
> Debian and Ubuntu appear to be using this directory-based cert loading.
> RHEL is using the file-based one. And FWIW Mercurial only supports
> specifying a config option for loading certs from a file. We /could/
> support loading from a directory if we wanted. Although I haven't heard
> anybody clamor for this, so I'm not going to do it.

Ah, that's why I saw nothing returned by get_ca_certs() until a socket
connected to somewhere.

On Windows, CA certificates seem to be loaded by create_default_context(),
but it might be an implementation detail.

> Anyway, what this means is that "are any CAs loaded" is a bit difficult to
> implement on systems loading certs from directories because CA count == 0
> could mean either no CAs loaded or the CA wanted by the server connection
> just isn't present locally. I'll likely have to tweak the warning message
> for "no CAs loaded" that I added a few hours ago and Yuya recently queued.
> I'll also have to look into behavior on Windows and OS X to see if they are
> reporting counts in cert_store_stats(). Perhaps it would be best to prune
> that last patch from the committed repo until I get a handle on things. If
> not, I'll send follow-up commits.

The patch has been published due to the lag of the list?
Gregory Szorc - July 1, 2016, 2:41 p.m.
> On Jul 1, 2016, at 07:02, Yuya Nishihara <yuya@tcha.org> wrote:
> 
>> On Thu, 30 Jun 2016 08:06:25 -0700, Gregory Szorc wrote:
>> ssl.SSLContext.load_verify_locations() specifies 2 ways to load certs: via
>> a single file and via a path containing files with hashes of CAs.
>> 
>> When you load a directory, the certs are loaded on demand as they are
>> encountered through connection activity. Only then do the certs show up in
>> cert_store_stats().
>> 
>> Debian and Ubuntu appear to be using this directory-based cert loading.
>> RHEL is using the file-based one. And FWIW Mercurial only supports
>> specifying a config option for loading certs from a file. We /could/
>> support loading from a directory if we wanted. Although I haven't heard
>> anybody clamor for this, so I'm not going to do it.
> 
> Ah, that's why I saw nothing returned by get_ca_certs() until a socket
> connected to somewhere.
> 
> On Windows, CA certificates seem to be loaded by create_default_context(),
> but it might be an implementation detail.
> 
>> Anyway, what this means is that "are any CAs loaded" is a bit difficult to
>> implement on systems loading certs from directories because CA count == 0
>> could mean either no CAs loaded or the CA wanted by the server connection
>> just isn't present locally. I'll likely have to tweak the warning message
>> for "no CAs loaded" that I added a few hours ago and Yuya recently queued.
>> I'll also have to look into behavior on Windows and OS X to see if they are
>> reporting counts in cert_store_stats(). Perhaps it would be best to prune
>> that last patch from the committed repo until I get a handle on things. If
>> not, I'll send follow-up commits.
> 
> The patch has been published due to the lag of the list?

Don't worry about it. I have a series in the works to overhaul default CA loading. I'll address wording changes there, if necessary.

Patch

diff --git a/mercurial/sslutil.py b/mercurial/sslutil.py
--- a/mercurial/sslutil.py
+++ b/mercurial/sslutil.py
@@ -188,20 +188,20 @@  def _hostsettings(ui, hostname):
             cafile = ui.config('web', 'cacerts')
 
             if cafile:
                 cafile = util.expandpath(cafile)
                 if not os.path.exists(cafile):
                     raise error.Abort(_('could not find web.cacerts: %s') %
                                       cafile)
             else:
-                # No global CA certs. See if we can load defaults.
+                # CAs not defined in config. Try to find system bundles.
                 cafile = _defaultcacerts()
                 if cafile:
-                    ui.debug('using %s to enable OS X system CA\n' % cafile)
+                    ui.debug('using %s for CA file\n' % cafile)
 
             s['cafile'] = cafile
 
         # Require certificate validation if CA certs are being loaded and
         # verification hasn't been disabled above.
         if cafile or (_canloaddefaultcerts and s['allowloaddefaultcerts']):
             s['verifymode'] = ssl.CERT_REQUIRED
         else:
@@ -338,23 +338,42 @@  def _plainapplepython():
       cacerts file
     """
     if sys.platform != 'darwin' or util.mainfrozen() or not sys.executable:
         return False
     exe = os.path.realpath(sys.executable).lower()
     return (exe.startswith('/usr/bin/python') or
             exe.startswith('/system/library/frameworks/python.framework/'))
 
+_systemcacertpaths = [
+    # RHEL, CentOS, and Fedora
+    '/etc/pki/tls/certs/ca-bundle.trust.crt',
+    # Debian, Ubuntu, Gentoo
+    '/etc/ssl/certs/ca-certificates.crt',
+]
+
 def _defaultcacerts():
     """return path to default CA certificates or None."""
+    # When using Apple Python, always attempt to use the dummy cert to load
+    # Apple's system store.
     if _plainapplepython():
         dummycert = os.path.join(os.path.dirname(__file__), 'dummycert.pem')
         if os.path.exists(dummycert):
             return dummycert
 
+    # If Python can load system default certificates, defer to it by
+    # returning None.
+    if _canloaddefaultcerts:
+        return None
+
+    # Else Python is unable to system certs. Try to find them ourselves.
+    for path in _systemcacertpaths:
+        if os.path.isfile(path):
+            return path
+
     return None
 
 def validatesocket(sock):
     """Validate a socket meets security requiremnets.
 
     The passed socket must have been created with ``wrapsocket()``.
     """
     host = sock._hgstate['hostname']