=pod

=encoding utf8

=head1 NAME

passphrase-encoding
- How diverse parts of OpenSSL treat pass phrases character encoding

=head1 DESCRIPTION

In a modern world with all sorts of character encodings, the treatment of pass
phrases has become increasingly complex.
This manual page attempts to give an overview of how this problem is
currently addressed in different parts of the OpenSSL library.

=head2 The general case

The OpenSSL library doesn't treat pass phrases in any special way as a general
rule, and trusts the application or user to choose a suitable character set
and stick to that throughout the lifetime of affected objects.
This means that for an object that was encrypted using a pass phrase encoded in
ISO-8859-1, that object needs to be decrypted using a pass phrase encoded in
ISO-8859-1.
Using the wrong encoding is expected to cause a decryption failure.
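
As a rough illustration of why the encoding matters, the following Python
sketch encodes the same pass phrase in two character sets and derives a key
from each.
PBKDF2 from Python's C<hashlib> is used only as a stand-in for whatever key
derivation a given object format applies; the salt and iteration count are
arbitrary assumptions, not values taken from OpenSSL.

```python
import hashlib

PASS_PHRASE = "na\u00efve"  # the word "naive" with a diaeresis on the i


def derive_key(pass_bytes: bytes) -> bytes:
    # Stand-in key derivation: the salt and iteration count are arbitrary
    # values chosen for this illustration only.
    return hashlib.pbkdf2_hmac("sha256", pass_bytes, b"example-salt", 1000)


latin1_bytes = PASS_PHRASE.encode("iso-8859-1")
utf8_bytes = PASS_PHRASE.encode("utf-8")

print(latin1_bytes.hex(" "))  # 6e 61 ef 76 65
print(utf8_bytes.hex(" "))    # 6e 61 c3 af 76 65
print(derive_key(latin1_bytes) == derive_key(utf8_bytes))  # False
```

The two byte sequences differ, so any key derived from them differs as well,
which is why decryption fails when the encoding changes.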

=head2 PKCS#12

PKCS#12 is a bit different regarding pass phrase encoding.
The standard stipulates that the pass phrase shall be encoded as an ASN.1
BMPString, which consists of the code points of the basic multilingual plane,
encoded in big endian (UCS-2 BE).

OpenSSL tries to adapt to this requirement in one of the following ways:

=over 4

=item 1.

Treats the received pass phrase as UTF-8 encoded and tries to re-encode it to
UTF-16 (which is the same as UCS-2 for characters U+0000 to U+D7FF and U+E000
to U+FFFF, but becomes an expansion for any other character), or failing that,
proceeds with step 2.

=item 2.

Assumes that the pass phrase is encoded in ASCII or ISO-8859-1 and
opportunistically prepends each byte with a zero byte to obtain the UCS-2
encoding of the characters, which it stores as a BMPString.

Note that since there is no check of your locale, this may produce UCS-2 /
UTF-16 characters that do not correspond to the original pass phrase characters
for other character sets, such as any ISO-8859-X encoding other than
ISO-8859-1 (or for Windows, CP 1252 with the exception of the extra "graphical"
characters in the 0x80-0x9F range).

=back

OpenSSL versions older than 1.1.0 do variant 2 only, and that is the reason why
OpenSSL still does this, to be able to read files produced with older versions.
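
The two variants can be sketched in Python as follows.
This is only an approximation of the behaviour described above, not OpenSSL's
code: the function names are hypothetical, Python's strict UTF-8 decoder may
accept or reject some edge cases differently, and the trailing two zero bytes
that terminate the string are left out.

```python
def legacy_bmp_pass_phrase(pass_bytes: bytes) -> bytes:
    """Variant 2: put a zero byte in front of every input byte."""
    return b"".join(b"\x00" + bytes([value]) for value in pass_bytes)


def pkcs12_bmp_pass_phrase(pass_bytes: bytes) -> bytes:
    """Variant 1 with fallback to variant 2 (hypothetical helper)."""
    try:
        # Variant 1: interpret the input as UTF-8, re-encode as UTF-16 BE.
        return pass_bytes.decode("utf-8").encode("utf-16-be")
    except UnicodeDecodeError:
        # Not valid UTF-8: assume ASCII or ISO-8859-1.
        return legacy_bmp_pass_phrase(pass_bytes)


utf8_input = "na\u00efve".encode("utf-8")
latin1_input = "na\u00efve".encode("iso-8859-1")

print(pkcs12_bmp_pass_phrase(utf8_input).hex(" "))
print(pkcs12_bmp_pass_phrase(latin1_input).hex(" "))
```

Both calls print C<00 6e 00 61 00 ef 00 76 00 65>: the UTF-8 input goes through
variant 1, while the ISO-8859-1 input is not valid UTF-8 and falls back to
variant 2.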

It should be noted that this approach isn't entirely fault free.

A pass phrase encoded in ISO-8859-2 could very well have a sequence such as
0xC3 0xAF (which is the two characters "LATIN CAPITAL LETTER A WITH BREVE"
and "LATIN CAPITAL LETTER Z WITH DOT ABOVE" in ISO-8859-2 encoding), but would
be misinterpreted as the perfectly valid UTF-8 encoded code point U+00EF (LATIN
SMALL LETTER I WITH DIAERESIS) I<if the pass phrase doesn't contain anything that
would be invalid UTF-8>.
A pass phrase that contains this kind of byte sequence will give a different
outcome in OpenSSL 1.1.0 and newer than in OpenSSL older than 1.1.0.

 0x00 0xC3 0x00 0xAF                    # OpenSSL older than 1.1.0
 0x00 0xEF                              # OpenSSL 1.1.0 and newer

By the same token, anything encoded in UTF-8 that was given to OpenSSL older
than 1.1.0 was misinterpreted as ISO-8859-1 sequences.
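
The ambiguity of the 0xC3 0xAF sequence can be reproduced with Python's
standard codecs.
The sketch below only mimics the two interpretations described above; it does
not call OpenSSL.

```python
import unicodedata

raw = b"\xc3\xaf"

# Read as ISO-8859-2, the bytes are two separate letters.
iso2_names = [unicodedata.name(char) for char in raw.decode("iso-8859-2")]
print(iso2_names)

# Read as UTF-8, the same bytes are one code point.
as_utf8 = raw.decode("utf-8")
print(f"U+{ord(as_utf8):04X}", unicodedata.name(as_utf8))

# Older than 1.1.0: a zero byte in front of every input byte.
old_style = raw.decode("iso-8859-1").encode("utf-16-be")
# 1.1.0 and newer: UTF-8 interpretation, then 16-bit big endian.
new_style = as_utf8.encode("utf-16-be")

print(old_style.hex(" "))  # 00 c3 00 af
print(new_style.hex(" "))  # 00 ef
```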

=head2 OSSL_STORE

L<ossl_store(7)> acts as a general interface to access all kinds of objects,
potentially protected with a pass phrase, a PIN or something else.
This API stipulates that pass phrases should be UTF-8 encoded, and that any
other pass phrase encoding may give undefined results.
This API relies on the application to ensure UTF-8 encoding, and doesn't check
that this is the case, so what it gets, it will also pass to the underlying
loader.
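
Since the check is left to the application, one possible application-side
guard is sketched below.
C<require_utf8> is a hypothetical helper, not an OpenSSL function.
It can only reject byte sequences that are invalid UTF-8; it cannot detect
input from another character set that happens to be valid UTF-8.

```python
def require_utf8(pass_bytes: bytes) -> bytes:
    """Return pass_bytes unchanged if it is valid UTF-8, else raise ValueError."""
    try:
        pass_bytes.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("pass phrase is not valid UTF-8") from exc
    return pass_bytes


print(require_utf8(b"na\xc3\xafve").hex(" "))  # 6e 61 c3 af 76 65
```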

=head1 RECOMMENDATIONS

This section assumes that you know what pass phrase was used for encryption,
but that it may have been encoded in a different character encoding than the
one used by your current input method.
For example, the pass phrase may have been used at a time when your default
encoding was ISO-8859-1 (so that "naïve" resulted in the byte sequence 0x6E
0x61 0xEF 0x76 0x65), and you're now in an environment where your default
encoding is UTF-8 (so that "naïve" results in the byte sequence 0x6E 0x61 0xC3
0xAF 0x76 0x65).
Whenever it's mentioned that you should use a certain character encoding, it
should be understood that you either change the input method to use the
mentioned encoding when you type in your pass phrase, or use some suitable tool
to convert your pass phrase from your default encoding to the target encoding.
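
A conversion of this kind could look like the following Python sketch, which
turns the UTF-8 bytes from the example above back into the ISO-8859-1 bytes.
The C<reencode> helper is hypothetical and only wraps Python's own codecs.

```python
def reencode(pass_bytes: bytes, source: str, target: str) -> bytes:
    """Decode pass_bytes from the source encoding and encode to the target."""
    return pass_bytes.decode(source).encode(target)


typed_now = bytes([0x6E, 0x61, 0xC3, 0xAF, 0x76, 0x65])  # UTF-8 environment
original = reencode(typed_now, "utf-8", "iso-8859-1")

print(original.hex(" "))  # 6e 61 ef 76 65
```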

Also note that the sub-sections below discuss human readable pass phrases.
This is particularly relevant for PKCS#12 objects, where human readable pass
phrases are assumed.
For other objects, it's just as legitimate to use any byte sequence (such as a
sequence of bytes from F</dev/urandom> that's been saved away), which makes any
character encoding discussion irrelevant; in such cases, simply use the same
byte sequence as it is.

=head2 Creating new objects

For creating new pass phrase protected objects, make sure the pass phrase is
encoded using UTF-8.
This is the default on most modern Unixes, but may involve an effort on other
platforms.
Specifically for Windows, setting the environment variable
C<OPENSSL_WIN32_UTF8> will have anything entered on the Windows console prompt
converted to UTF-8 (command line and separately prompted pass phrases alike).

=head2 Opening existing objects

For opening pass phrase protected objects where you know what character
encoding was used for the encryption pass phrase, make sure to use the same
encoding again.

For opening pass phrase protected objects where the character encoding that was
used is unknown, or where the producing application is unknown, try one of the
following:

=over 4

=item 1.

Try the pass phrase that you have as it is in the character encoding of your
environment.
It's possible that its byte sequence is exactly right.

=item 2.

Convert the pass phrase to UTF-8 and try with the result.
Specifically with PKCS#12, this should open up any object that was created
according to the specification.

=item 3.

Do a naïve (i.e. purely mathematical) ISO-8859-1 to UTF-8 conversion and try
with the result.
This differs from the previous attempt because ISO-8859-1 maps directly to
U+0000 to U+00FF, which other non-UTF-8 character sets do not.

This also takes care of the case when a UTF-8 encoded string was used with
OpenSSL older than 1.1.0.
(For example, C<ï>, which is 0xC3 0xAF when encoded in UTF-8, would become 0xC3
0x83 0xC2 0xAF when re-encoded in the naïve manner.
The conversion to BMPString would then yield 0x00 0xC3 0x00 0xAF 0x00 0x00, the
erroneous/non-compliant encoding used by OpenSSL older than 1.1.0.)

=back
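
The three attempts can be generated mechanically.
The following Python sketch is one way to do it; the function name and its
parameters are hypothetical, and it assumes the pass phrase can be encoded in
the environment's character encoding.

```python
from typing import List


def candidate_pass_phrases(typed: str, environment_encoding: str) -> List[bytes]:
    """Return the byte sequences to try, in the order of the attempts above."""
    # Attempt 1: the pass phrase as it is in the environment's encoding.
    as_is = typed.encode(environment_encoding)
    # Attempt 2: the pass phrase converted to UTF-8.
    as_utf8 = typed.encode("utf-8")
    # Attempt 3: treat every byte as a code point U+0000 to U+00FF and
    # encode the result in UTF-8.
    naive = as_is.decode("iso-8859-1").encode("utf-8")

    candidates: List[bytes] = []
    for candidate in (as_is, as_utf8, naive):
        if candidate not in candidates:  # skip duplicates, keep the order
            candidates.append(candidate)
    return candidates


for candidate in candidate_pass_phrases("\u00ef", "utf-8"):
    print(candidate.hex(" "))
```

In a UTF-8 environment, attempts 1 and 2 coincide, so the loop prints C<c3 af>
followed by C<c3 83 c2 af>, matching the byte sequences in the example above.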

=head1 SEE ALSO

L<evp(7)>,
L<ossl_store(7)>,
L<EVP_BytesToKey(3)>, L<EVP_DecryptInit(3)>,
L<PEM_do_header(3)>,
L<PKCS12_parse(3)>, L<PKCS12_newpass(3)>,
L<d2i_PKCS8PrivateKey_bio(3)>

=head1 COPYRIGHT

Copyright 2018 The OpenSSL Project Authors. All Rights Reserved.

Licensed under the OpenSSL license (the "License").  You may not use
this file except in compliance with the License.  You can obtain a copy
in the file LICENSE in the source distribution or at
L<https://www.openssl.org/source/license.html>.

=cut