=pod

=encoding utf8

=head1 NAME

passphrase-encoding
- How diverse parts of OpenSSL treat pass phrase character encoding

=head1 DESCRIPTION

In a modern world with all sorts of character encodings, the treatment of pass
phrases has become increasingly complex.
This manual page attempts to give an overview of how this problem is
currently addressed in different parts of the OpenSSL library.

=head2 The general case

As a general rule, the OpenSSL library doesn't treat pass phrases in any
special way, and trusts the application or user to choose a suitable character
set and stick to it throughout the lifetime of affected objects.
This means that an object that was encrypted using a pass phrase encoded in
ISO-8859-1 needs to be decrypted using a pass phrase encoded in
ISO-8859-1.
Using the wrong encoding is expected to cause a decryption failure.
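
As a rough illustration of why the encoding matters, the following Python
sketch encodes the same pass phrase in two character sets and derives a key
from each byte string.
PBKDF2 from Python's C<hashlib> is used purely as a stand-in; the salt, the
iteration count and the helper C<derive_key> are assumptions made for this
example and say nothing about which key derivation a given OpenSSL format uses.

```python
import hashlib

# The same human readable pass phrase, encoded in two character sets.
passphrase = "na\u00efve"  # "naive" with a diaeresis on the i
latin1_bytes = passphrase.encode("iso-8859-1")
utf8_bytes = passphrase.encode("utf-8")

# Hypothetical salt and iteration count, chosen only for this illustration.
salt = b"example-salt"
iterations = 1000


def derive_key(secret: bytes) -> bytes:
    """Derive a 32-byte key from raw pass phrase bytes (stand-in KDF)."""
    return hashlib.pbkdf2_hmac("sha256", secret, salt, iterations, dklen=32)


key_latin1 = derive_key(latin1_bytes)
key_utf8 = derive_key(utf8_bytes)

# Different byte sequences give unrelated keys, so decrypting with the
# "wrong" encoding would be expected to fail.
print(latin1_bytes.hex(" "))
print(utf8_bytes.hex(" "))
print(key_latin1 == key_utf8)
```

The point of the sketch is only that a key derivation sees bytes, not
characters, so two encodings of one pass phrase behave like two different
pass phrases.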

=head2 PKCS#12

PKCS#12 is a bit different regarding pass phrase encoding.
The standard stipulates that the pass phrase shall be encoded as an ASN.1
BMPString, which consists of the code points of the basic multilingual plane,
encoded in big endian (UCS-2 BE).

OpenSSL tries to adapt to this requirement in one of the following manners:

=over 4

=item 1.

Treats the received pass phrase as UTF-8 encoded and tries to re-encode it to
UTF-16 (which is the same as UCS-2 for characters U+0000 to U+D7FF and U+E000
to U+FFFF, but becomes an expansion for any other character), or failing that,
proceeds with step 2.

=item 2.

Assumes that the pass phrase is encoded in ASCII or ISO-8859-1 and
opportunistically prepends each byte with a zero byte to obtain the UCS-2
encoding of the characters, which it stores as a BMPString.

Note that since there is no check of your locale, this may produce UCS-2 /
UTF-16 characters that do not correspond to the original pass phrase characters
for other character sets, such as any ISO-8859-X encoding other than
ISO-8859-1 (or for Windows, CP 1252 with the exception of the extra "graphical"
characters in the 0x80-0x9F range).

=back

OpenSSL versions older than 1.1.0 do variant 2 only, which is why OpenSSL
still supports it: to be able to read files produced with older versions.
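
The two variants can be sketched in Python as follows.
This is a minimal model of the behaviour described above, not OpenSSL's
actual code; the function name C<pkcs12_passphrase_to_bmp> is hypothetical,
and any terminator that a real BMPString conversion may append is left out.

```python
def pkcs12_passphrase_to_bmp(raw: bytes) -> bytes:
    """Model of the two variants described above (hypothetical helper)."""
    try:
        # Variant 1: treat the input as UTF-8 and re-encode it as UTF-16 BE.
        return raw.decode("utf-8").encode("utf-16-be")
    except UnicodeError:
        # Variant 2: assume ASCII / ISO-8859-1 and prepend a zero byte to
        # every input byte.
        return b"".join(bytes([0, b]) for b in raw)


# Plain ASCII takes variant 1 and ends up zero-extended either way.
print(pkcs12_passphrase_to_bmp(b"abc").hex(" "))

# A character outside the basic multilingual plane expands to a
# surrogate pair (four bytes instead of two).
print(pkcs12_passphrase_to_bmp("\U0001F600".encode("utf-8")).hex(" "))

# ISO-8859-1 input with a byte that is not valid UTF-8 falls back to
# variant 2.
print(pkcs12_passphrase_to_bmp(b"na\xefve").hex(" "))
```

For pass phrases made only of characters in the U+0000 to U+00FF range, both
variants give the same bytes when the input encoding is guessed correctly,
which is why the fallback keeps older files readable.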

It should be noted that this approach isn't entirely fault free.

A pass phrase encoded in ISO-8859-2 could very well have a sequence such as
0xC3 0xAF (which is the two characters "LATIN CAPITAL LETTER A WITH BREVE"
and "LATIN CAPITAL LETTER Z WITH DOT ABOVE" in ISO-8859-2 encoding), but would
be misinterpreted as the perfectly valid UTF-8 encoded code point U+00EF (LATIN
SMALL LETTER I WITH DIAERESIS) I<if the pass phrase doesn't contain anything that
would be invalid UTF-8>.
A pass phrase that contains this kind of byte sequence will give a different
outcome in OpenSSL 1.1.0 and newer than in OpenSSL older than 1.1.0.

 0x00 0xC3 0x00 0xAF                    # OpenSSL older than 1.1.0
 0x00 0xEF                              # OpenSSL 1.1.0 and newer

By the same token, anything encoded in UTF-8 that was given to OpenSSL older
than 1.1.0 was misinterpreted as ISO-8859-1 sequences.
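
The ambiguity of the 0xC3 0xAF sequence can be reproduced with Python's
standard codecs.
The variable names are illustrative only, and the last value,
C<intended_bmp>, is added for comparison: it is what a conversion that knew
the input was ISO-8859-2 would produce, which neither variant does.

```python
import unicodedata

raw = bytes([0xC3, 0xAF])

# What the user meant: two ISO-8859-2 characters.
intended = raw.decode("iso8859_2")
print([unicodedata.name(ch) for ch in intended])

# What a UTF-8 decoder sees: a single, perfectly valid code point.
as_utf8 = raw.decode("utf-8")
print(unicodedata.name(as_utf8))

# Byte-wise zero extension, as described for OpenSSL older than 1.1.0.
old_style = b"".join(bytes([0, b]) for b in raw)

# UTF-8 to UTF-16 BE re-encoding, as described for OpenSSL 1.1.0 and newer.
new_style = as_utf8.encode("utf-16-be")

# For comparison only: the UCS-2 BE form of the characters actually meant.
intended_bmp = intended.encode("utf-16-be")

print(old_style.hex(" "))
print(new_style.hex(" "))
print(intended_bmp.hex(" "))
```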

=head2 OSSL_STORE

L<ossl_store(7)> acts as a general interface to access all kinds of objects,
potentially protected with a pass phrase, a PIN or something else.
This API stipulates that pass phrases should be UTF-8 encoded, and that any
other pass phrase encoding may give undefined results.
This API relies on the application to ensure UTF-8 encoding, and doesn't check
that this is the case, so whatever it receives is passed on unchanged to the
underlying loader.
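
Since the check is left to the application, one possible application-side
guard is to verify that the bytes decode as UTF-8 before handing them over.
The helper C<is_valid_utf8> below is a hypothetical sketch and not part of
any OpenSSL API.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8("na\u00efve".encode("utf-8")))       # UTF-8 input
print(is_valid_utf8("na\u00efve".encode("iso-8859-1")))  # ISO-8859-1 input
```

A successful check only shows that the bytes are well-formed UTF-8; it cannot
prove that they were meant as UTF-8, because some byte sequences from other
encodings are valid UTF-8 by coincidence.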

=head1 RECOMMENDATIONS

This section assumes that you know what pass phrase was used for encryption,
but that it may have been encoded in a different character encoding than the
one used by your current input method.
For example, the pass phrase may have been used at a time when your default
encoding was ISO-8859-1 (i.e. "naïve" resulting in the byte sequence 0x6E 0x61
0xEF 0x76 0x65), and you're now in an environment where your default encoding
is UTF-8 (i.e. "naïve" resulting in the byte sequence 0x6E 0x61 0xC3 0xAF 0x76
0x65).
Whenever it's mentioned that you should use a certain character encoding, it
should be understood that you either change the input method to use the
mentioned encoding when you type in your pass phrase, or use some suitable tool
to convert your pass phrase from your default encoding to the target encoding.
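
Such a conversion can be as small as a decode followed by an encode.
The Python sketch below uses a hypothetical helper, C<reencode>, to turn the
UTF-8 bytes from the example above back into the ISO-8859-1 bytes; the codec
names are those of Python's standard library.

```python
def reencode(raw: bytes, source: str, target: str) -> bytes:
    """Convert pass phrase bytes from one character encoding to another.

    Raises UnicodeError if the bytes are not valid in the source encoding
    or a character cannot be represented in the target encoding.
    """
    return raw.decode(source).encode(target)


# The example pass phrase as typed in a UTF-8 environment.
typed = bytes.fromhex("6e 61 c3 af 76 65")

# The byte sequence that was used back when the default was ISO-8859-1.
original = reencode(typed, "utf-8", "iso-8859-1")
print(original.hex(" "))  # prints "6e 61 ef 76 65"
```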

Also note that the sub-sections below discuss human readable pass phrases.
This is particularly relevant for PKCS#12 objects, where human readable pass
phrases are assumed.
For other objects, it's just as legitimate to use any byte sequence (such as a
sequence of bytes from F</dev/urandom> that's been saved away), which makes any
character encoding discussion irrelevant; in such cases, simply use the same
byte sequence as it is.

=head2 Creating new objects

For creating new pass phrase protected objects, make sure the pass phrase is
encoded using UTF-8.
This is the default on most modern Unixes, but may involve an effort on other
platforms.
Specifically for Windows, setting the environment variable
B<OPENSSL_WIN32_UTF8> will have anything entered on the Windows console prompt
converted to UTF-8 (command line and separately prompted pass phrases alike).

=head2 Opening existing objects

For opening pass phrase protected objects where you know what character
encoding was used for the encryption pass phrase, make sure to use the same
encoding again.

For opening pass phrase protected objects where the character encoding that was
used is unknown, or where the producing application is unknown, try one of the
following:

=over 4

=item 1.

Try the pass phrase that you have as it is in the character encoding of your
environment.
It's possible that its byte sequence is exactly right.

=item 2.

Convert the pass phrase to UTF-8 and try with the result.
Specifically with PKCS#12, this should open up any object that was created
according to the specification.

=item 3.

Do a naïve (i.e. purely mathematical) ISO-8859-1 to UTF-8 conversion and try
with the result.
This differs from the previous attempt because ISO-8859-1 maps directly to
U+0000 to U+00FF, which other non-UTF-8 character sets do not.

This also takes care of the case when a UTF-8 encoded string was used with
OpenSSL older than 1.1.0.
(For example, C<ï>, which is 0xC3 0xAF when encoded in UTF-8, would become 0xC3
0x83 0xC2 0xAF when re-encoded in the naïve manner.
The conversion to BMPString would then yield 0x00 0xC3 0x00 0xAF 0x00 0x00, the
erroneous/non-compliant encoding used by OpenSSL older than 1.1.0.)

=back
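
The three attempts above can be generated mechanically.
The following Python sketch builds the candidate byte sequences in that
order; the function C<candidate_passphrases> and its parameters are
hypothetical, and actually trying each candidate against an object is left
to the application.

```python
from typing import List


def candidate_passphrases(passphrase: str, env_encoding: str) -> List[bytes]:
    """Return the byte sequences to try, in the order of the steps above."""
    candidates: List[bytes] = []

    # Step 1: the pass phrase as it is in the environment's encoding.
    as_is = passphrase.encode(env_encoding)
    candidates.append(as_is)

    # Step 2: the pass phrase converted to UTF-8.
    as_utf8 = passphrase.encode("utf-8")
    if as_utf8 not in candidates:
        candidates.append(as_utf8)

    # Step 3: naive ISO-8859-1 to UTF-8 conversion of the step 1 bytes,
    # i.e. every byte 0xNN is treated as code point U+00NN.
    naive = as_is.decode("iso-8859-1").encode("utf-8")
    if naive not in candidates:
        candidates.append(naive)

    return candidates


# In a UTF-8 environment, steps 1 and 2 coincide, so two candidates remain.
for candidate in candidate_passphrases("\u00ef", "utf-8"):
    print(candidate.hex(" "))
```

Duplicates are dropped so that each byte sequence is tried only once; for a
pure ASCII pass phrase all three steps collapse into a single candidate.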

=head1 SEE ALSO

L<evp(7)>,
L<ossl_store(7)>,
L<EVP_BytesToKey(3)>, L<EVP_DecryptInit(3)>,
L<PEM_do_header(3)>,
L<PKCS12_parse(3)>, L<PKCS12_newpass(3)>,
L<d2i_PKCS8PrivateKey_bio(3)>

=head1 COPYRIGHT

Copyright 2018-2021 The OpenSSL Project Authors. All Rights Reserved.

Licensed under the Apache License 2.0 (the "License").  You may not use
this file except in compliance with the License.  You can obtain a copy
in the file LICENSE in the source distribution or at
L<https://www.openssl.org/source/license.html>.

=cut