=pod

=encoding utf8

=head1 NAME

passphrase-encoding
- How different parts of OpenSSL treat pass phrase character encoding

=head1 DESCRIPTION

In a modern world with all sorts of character encodings, the treatment of pass
phrases has become increasingly complex.
This manual page attempts to give an overview of how this problem is
currently addressed in different parts of the OpenSSL library.

=head2 The general case

The OpenSSL library doesn't treat pass phrases in any special way as a general
rule, and trusts the application or user to choose a suitable character set
and stick to that throughout the lifetime of affected objects.
This means that for an object that was encrypted using a pass phrase encoded in
ISO-8859-1, that object needs to be decrypted using a pass phrase encoded in
ISO-8859-1.
Using the wrong encoding is expected to cause a decryption failure.

=head2 PKCS#12

PKCS#12 is a bit different regarding pass phrase encoding.
The standard stipulates that the pass phrase shall be encoded as an ASN.1
BMPString, which consists of the code points of the basic multilingual plane,
encoded in big endian (UCS-2 BE).

OpenSSL tries to adapt to these requirements in one of the following manners:

=over 4

=item 1.

Treats the received pass phrase as UTF-8 encoded and tries to re-encode it to
UTF-16 (which is the same as UCS-2 for characters U+0000 to U+D7FF and U+E000
to U+FFFF, but becomes an expansion for any other character), or failing that,
proceeds with step 2.

=item 2.

Assumes that the pass phrase is encoded in ASCII or ISO-8859-1 and
opportunistically prepends each byte with a zero byte to obtain the UCS-2
encoding of the characters, which it stores as a BMPString.

Note that since there is no check of your locale, this may produce UCS-2 /
UTF-16 characters that do not correspond to the original pass phrase characters
for other character sets, such as any ISO-8859-X encoding other than
ISO-8859-1 (or for Windows, CP 1252 with the exception of the extra "graphical"
characters in the 0x80-0x9F range).

=back

OpenSSL versions older than 1.1.0 do variant 2 only, and that is the reason why
OpenSSL still does this, to be able to read files produced with older versions.

It should be noted that this approach isn't entirely fault free.

A pass phrase encoded in ISO-8859-2 could very well have a sequence such as
0xC3 0xAF (which is the two characters "LATIN CAPITAL LETTER A WITH BREVE"
and "LATIN CAPITAL LETTER Z WITH DOT ABOVE" in ISO-8859-2 encoding), but would
be misinterpreted as the perfectly valid UTF-8 encoded code point U+00EF (LATIN
SMALL LETTER I WITH DIAERESIS) I<if the pass phrase doesn't contain anything that
would be invalid UTF-8>.
A pass phrase that contains this kind of byte sequence will give a different
outcome in OpenSSL 1.1.0 and newer than in OpenSSL older than 1.1.0.

 0x00 0xC3 0x00 0xAF                    # OpenSSL older than 1.1.0
 0x00 0xEF                              # OpenSSL 1.1.0 and newer

By the same token, anything encoded in UTF-8 that was given to OpenSSL older
than 1.1.0 was misinterpreted as ISO-8859-1 sequences.

=head2 OSSL_STORE

L<ossl_store(7)> acts as a general interface to access all kinds of objects,
potentially protected with a pass phrase, a PIN or something else.
This API stipulates that pass phrases should be UTF-8 encoded, and that any
other pass phrase encoding may give undefined results.
This API relies on the application to ensure UTF-8 encoding, and doesn't check
that this is the case, so what it gets, it will also pass to the underlying
loader.
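
Returning to the PKCS#12 case, the two conversion variants described above can
be sketched in Python.
This is only an illustration of the byte-level behaviour, not OpenSSL's actual
code: the function names C<to_bmpstring_old> and C<to_bmpstring_new> are
hypothetical, Python's UTF-8 decoder stands in for OpenSSL's own, and no
terminator is appended, matching the byte listings above.

```python
def to_bmpstring_old(passphrase: bytes) -> bytes:
    """Variant 2: prepend a zero byte to every input byte (ISO-8859-1 assumed)."""
    return b"".join(bytes([0, b]) for b in passphrase)


def to_bmpstring_new(passphrase: bytes) -> bytes:
    """Variant 1: try UTF-8 to UTF-16BE first, fall back to variant 2 on failure."""
    try:
        return passphrase.decode("utf-8").encode("utf-16-be")
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat the bytes as ISO-8859-1 (hypothetical fallback).
        return to_bmpstring_old(passphrase)


# 0xC3 0xAF is both two ISO-8859-2 characters and valid UTF-8 for U+00EF.
ambiguous = bytes([0xC3, 0xAF])
print(to_bmpstring_old(ambiguous).hex(" "))  # behaviour before 1.1.0
print(to_bmpstring_new(ambiguous).hex(" "))  # behaviour from 1.1.0 on

# A lone 0xEF is not valid UTF-8, so the fallback applies.
print(to_bmpstring_new(bytes([0xEF])).hex(" "))
```

The two functions disagree exactly when the input happens to be valid UTF-8
containing non-ASCII characters, which is the ambiguity discussed above.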

=head1 RECOMMENDATIONS

This section assumes that you know what pass phrase was used for encryption,
but that it may have been encoded in a different character encoding than the
one used by your current input method.
For example, the pass phrase may have been used at a time when your default
encoding was ISO-8859-1 (i.e. "naïve" resulting in the byte sequence 0x6E 0x61
0xEF 0x76 0x65), and you're now in an environment where your default encoding
is UTF-8 (i.e. "naïve" resulting in the byte sequence 0x6E 0x61 0xC3 0xAF 0x76
0x65).
Whenever it's mentioned that you should use a certain character encoding, it
should be understood that you either change the input method to use the
mentioned encoding when you type in your pass phrase, or use some suitable tool
to convert your pass phrase from your default encoding to the target encoding.

Also note that the sub-sections below discuss human readable pass phrases.
This is particularly relevant for PKCS#12 objects, where human readable pass
phrases are assumed.
For other objects, it's just as legitimate to use any byte sequence (such as a
sequence of bytes from F</dev/urandom> that's been saved away), which makes any
character encoding discussion irrelevant; in such cases, simply use the same
byte sequence as it is.
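
As a small illustration of the "naïve" example earlier in this section, the
following Python sketch prints the two byte sequences and converts a pass
phrase from one encoding to the other.
Python is used here only as one example of "some suitable tool", and the
helper name C<reencode> is hypothetical.

```python
def reencode(passphrase: bytes, source: str, target: str) -> bytes:
    """Re-encode pass phrase bytes from the source to the target encoding."""
    return passphrase.decode(source).encode(target)


# "na\u00efve" is "naive" with U+00EF (i with diaeresis) as the third letter.
latin1 = "na\u00efve".encode("iso-8859-1")
utf8 = "na\u00efve".encode("utf-8")

print(latin1.hex(" "))  # bytes as typed in an ISO-8859-1 environment
print(utf8.hex(" "))    # bytes as typed in a UTF-8 environment

# Converting the UTF-8 bytes back gives the ISO-8859-1 sequence again.
print(reencode(utf8, "utf-8", "iso-8859-1") == latin1)
```

Note that C<reencode> raises an error if the pass phrase contains a character
that the target encoding cannot represent, which is a useful signal that the
assumed encoding is wrong.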

=head2 Creating new objects

For creating new pass phrase protected objects, make sure the pass phrase is
encoded using UTF-8.
This is the default on most modern Unixes, but may involve an effort on other
platforms.
Specifically for Windows, setting the environment variable
C<OPENSSL_WIN32_UTF8> will have anything entered on the Windows console prompt
converted to UTF-8 (command line and separately prompted pass phrases alike).

=head2 Opening existing objects

For opening pass phrase protected objects where you know what character
encoding was used for the encryption pass phrase, make sure to use the same
encoding again.

For opening pass phrase protected objects where the character encoding that was
used is unknown, or where the producing application is unknown, try one of the
following:

=over 4

=item 1.

Try the pass phrase that you have as it is in the character encoding of your
environment.
It's possible that its byte sequence is exactly right.

=item 2.

Convert the pass phrase to UTF-8 and try with the result.
Specifically with PKCS#12, this should open up any object that was created
according to the specification.

=item 3.

Do a naïve (i.e. purely mathematical) ISO-8859-1 to UTF-8 conversion and try
with the result.
This differs from the previous attempt because ISO-8859-1 maps directly to
U+0000 to U+00FF, which other non-UTF-8 character sets do not.

This also takes care of the case when a UTF-8 encoded string was used with
OpenSSL older than 1.1.0.
For example, C<ï>, which is 0xC3 0xAF when encoded in UTF-8, would become 0xC3
0x83 0xC2 0xAF when re-encoded in the naïve manner.
The conversion to BMPString would then yield 0x00 0xC3 0x00 0xAF, the
erroneous/non-compliant encoding used by OpenSSL older than 1.1.0.

=back

=head1 SEE ALSO

L<evp(7)>,
L<ossl_store(7)>,
L<EVP_BytesToKey(3)>, L<EVP_DecryptInit(3)>,
L<PEM_do_header(3)>,
L<PKCS12_parse(3)>, L<PKCS12_newpass(3)>,
L<d2i_PKCS8PrivateKey_bio(3)>

=head1 COPYRIGHT

Copyright 2018 The OpenSSL Project Authors. All Rights Reserved.

Licensed under the OpenSSL license (the "License").  You may not use
this file except in compliance with the License.  You can obtain a copy
in the file LICENSE in the source distribution or at
L<https://www.openssl.org/source/license.html>.

=cut