Patchwork [STABLE] encoding: alias cp65001 to utf-8 on Windows

Submitter Yuya Nishihara
Date July 1, 2018, 3 p.m.
Message ID <b0f8bd19072c425b76e7.1530457248@mimosa>
Permalink /patch/32549/
State New


# HG changeset patch
# User Yuya Nishihara <>
# Date 1530455813 -32400
#      Sun Jul 01 23:36:53 2018 +0900
# Branch stable
# Node ID b0f8bd19072c425b76e7df268b4c25ae41f701bc
# Parent  0b63a6743010dfdbf8a8154186e119949bdaa1cc
encoding: alias cp65001 to utf-8 on Windows

As far as I can tell, cp65001 is the Windows name for UTF-8. I don't know
how much it differs from standard UTF-8, but Python 3 appears to have
introduced a new codec for cp65001, so the alias is enabled only on
Python 2.

This patch is untested, but hopefully fixes the following issue.


diff --git a/mercurial/ b/mercurial/
--- a/mercurial/
+++ b/mercurial/
@@ -72,6 +72,11 @@  else:
     '646': lambda: 'ascii',
     'ANSI_X3.4-1968': lambda: 'ascii',
+# cp65001 is a Windows variant of utf-8, which isn't supported on Python 2.
+# No idea if it should be rewritten to the canonical name 'utf-8' on Python 3.
+if pycompat.iswindows and not pycompat.ispy3:
+    _encodingfixers['cp65001'] = lambda: 'utf-8'
     encoding = environ.get("HGENCODING")
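
For illustration only, here is a minimal standalone sketch of the aliasing idea in the patch. It is not Mercurial's code: ENCODING_FIXERS, add_cp65001_alias and fix_encoding_name are hypothetical names, the platform and Python version checks are passed in as plain booleans instead of reading pycompat, and the lookup step is an assumption about how such a table would be consulted.

```python
# Hypothetical stand-in for the table the patch extends: each entry maps an
# encoding name reported by the environment to a callable that returns the
# name to use instead.
ENCODING_FIXERS = {
    '646': lambda: 'ascii',
    'ANSI_X3.4-1968': lambda: 'ascii',
}


def add_cp65001_alias(fixers, is_windows, is_py3):
    """Register the cp65001 alias only under the condition the patch uses."""
    if is_windows and not is_py3:
        fixers['cp65001'] = lambda: 'utf-8'
    return fixers


def fix_encoding_name(name, fixers):
    """Return the registered replacement name, or the name unchanged."""
    fixer = fixers.get(name)
    return fixer() if fixer is not None else name


# Simulated Windows + Python 2: the alias is registered.
py2_windows = add_cp65001_alias(dict(ENCODING_FIXERS), is_windows=True, is_py3=False)
# Simulated Windows + Python 3: the name is left for Python's own codec lookup.
py3_windows = add_cp65001_alias(dict(ENCODING_FIXERS), is_windows=True, is_py3=True)

print(fix_encoding_name('cp65001', py2_windows))
print(fix_encoding_name('cp65001', py3_windows))
```

Under these assumptions the first call yields the canonical name 'utf-8' and the second returns 'cp65001' untouched, which mirrors the patch's choice to leave Python 3 alone until it is clear whether rewriting the name there is appropriate.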