=pod

=encoding utf8

=head1 NAME

passphrase-encoding
- How diverse parts of OpenSSL treat pass phrase character encoding

=head1 DESCRIPTION

In a modern world with all sorts of character encodings, the treatment of pass
phrases has become increasingly complex.
This manual page attempts to give an overview of how this problem is
currently addressed in different parts of the OpenSSL library.

=head2 The general case

As a general rule, the OpenSSL library doesn't treat pass phrases in any
special way; it trusts the application or user to choose a suitable character
set and stick to it throughout the lifetime of affected objects.
This means that an object that was encrypted using a pass phrase encoded in
ISO-8859-1 needs to be decrypted using a pass phrase encoded in ISO-8859-1.
Using the wrong encoding is expected to cause a decryption failure.

=head2 PKCS#12

PKCS#12 is a bit different regarding pass phrase encoding.
The standard stipulates that the pass phrase shall be encoded as an ASN.1
BMPString, which consists of the code points of the basic multilingual plane,
encoded in big endian (UCS-2 BE).

OpenSSL tries to adapt to this requirement in one of the following manners:

=over 4

=item 1.

Treats the received pass phrase as UTF-8 encoded and tries to re-encode it to
UTF-16 (which is the same as UCS-2 for characters U+0000 to U+D7FF and U+E000
to U+FFFF, but becomes an expansion for any other character), or failing that,
proceeds with step 2.

=item 2.

Assumes that the pass phrase is encoded in ASCII or ISO-8859-1 and
opportunistically prepends each byte with a zero byte to obtain the UCS-2
encoding of the characters, which it stores as a BMPString.

Note that since there is no check of your locale, this may produce UCS-2 /
UTF-16 characters that do not correspond to the original pass phrase characters
for other character sets, such as any ISO-8859-X encoding other than
ISO-8859-1 (or for Windows, CP 1252 with the exception of the extra "graphical"
characters in the 0x80-0x9F range).

=back

OpenSSL versions older than 1.1.0 do variant 2 only, which is why OpenSSL
still does this: to be able to read files produced with older versions.

It should be noted that this approach isn't entirely fault free.
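
The two steps above can be illustrated with a minimal Python sketch.
It is not OpenSSL's implementation: the helper names C<passphrase_to_bmpstring>
and C<legacy_bmpstring> are hypothetical, Python's UTF-8 decoder may accept or
reject slightly different inputs than OpenSSL's, and any terminator that the
file format requires is left out.

```python
def legacy_bmpstring(raw: bytes) -> bytes:
    """Step 2 only: prepend a zero byte to every input byte.

    Hypothetical helper; models the behaviour described for OpenSSL
    older than 1.1.0 (input assumed to be ASCII or ISO-8859-1).
    """
    return b"".join(bytes((0, b)) for b in raw)


def passphrase_to_bmpstring(raw: bytes) -> bytes:
    """Step 1 with fallback to step 2 (hypothetical helper)."""
    try:
        # Step 1: treat the bytes as UTF-8 and re-encode as UTF-16 big endian.
        return raw.decode("utf-8").encode("utf-16-be")
    except UnicodeDecodeError:
        # Step 2: not valid UTF-8, so fall back to the byte-wise expansion.
        return legacy_bmpstring(raw)


if __name__ == "__main__":
    # 0xEF followed by "v" is not valid UTF-8, so step 2 is used.
    print(passphrase_to_bmpstring(b"na\xefve").hex(" "))
    # 0xC3 0xAF is valid UTF-8, so step 1 is used.
    print(passphrase_to_bmpstring(b"\xc3\xaf").hex(" "))
    print(legacy_bmpstring(b"\xc3\xaf").hex(" "))
```

The last two calls receive the same input bytes but return different results,
because only the first one attempts the UTF-8 interpretation.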

A pass phrase encoded in ISO-8859-2 could very well have a sequence such as
0xC3 0xAF (which is the two characters "LATIN CAPITAL LETTER A WITH BREVE"
and "LATIN CAPITAL LETTER Z WITH DOT ABOVE" in ISO-8859-2 encoding), which
would be misinterpreted as the perfectly valid UTF-8 encoded code point U+00EF
(LATIN SMALL LETTER I WITH DIAERESIS) I<if the pass phrase doesn't contain
anything that would be invalid UTF-8>.
A pass phrase that contains this kind of byte sequence will give a different
outcome in OpenSSL 1.1.0 and newer than in OpenSSL older than 1.1.0.

 0x00 0xC3 0x00 0xAF                    # OpenSSL older than 1.1.0
 0x00 0xEF                              # OpenSSL 1.1.0 and newer

By the same token, anything encoded in UTF-8 that was given to OpenSSL older
than 1.1.0 was misinterpreted as ISO-8859-1 sequences.

=head2 OSSL_STORE

L<ossl_store(7)> acts as a general interface to access all kinds of objects,
potentially protected with a pass phrase, a PIN or something else.
This API stipulates that pass phrases should be UTF-8 encoded, and that any
other pass phrase encoding may give undefined results.
This API relies on the application to ensure UTF-8 encoding, and doesn't check
that this is the case, so what it gets, it will also pass to the underlying
loader.
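
Since the check is left to the application, one possible application-side
guard is sketched below in Python.
The function name C<require_utf8> is hypothetical and not part of any OpenSSL
API; the sketch only verifies that the bytes are well-formed UTF-8 before they
would be handed on.

```python
def require_utf8(passphrase: bytes) -> bytes:
    """Return the pass phrase unchanged if it is well-formed UTF-8.

    Hypothetical application-side helper; raises ValueError otherwise.
    """
    try:
        passphrase.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("pass phrase is not valid UTF-8") from exc
    return passphrase


if __name__ == "__main__":
    print(require_utf8("na\u00efve".encode("utf-8")).hex(" "))
    try:
        require_utf8(b"na\xefve")  # ISO-8859-1 bytes, not valid UTF-8
    except ValueError as err:
        print("rejected:", err)
```

Such a check cannot tell whether well-formed bytes were I<meant> as UTF-8:
a sequence like 0xC3 0xAF typed in ISO-8859-2 passes it as well.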

=head1 RECOMMENDATIONS

This section assumes that you know what pass phrase was used for encryption,
but that it may have been encoded in a different character encoding than the
one used by your current input method.
For example, the pass phrase may have been used at a time when your default
encoding was ISO-8859-1 (i.e. "naïve" resulting in the byte sequence 0x6E 0x61
0xEF 0x76 0x65), and you're now in an environment where your default encoding
is UTF-8 (i.e. "naïve" resulting in the byte sequence 0x6E 0x61 0xC3 0xAF 0x76
0x65).
Whenever it's mentioned that you should use a certain character encoding, it
should be understood that you either change the input method to use the
mentioned encoding when you type in your pass phrase, or use some suitable tool
to convert your pass phrase from your default encoding to the target encoding.

Also note that the sub-sections below discuss human readable pass phrases.
This is particularly relevant for PKCS#12 objects, where human readable pass
phrases are assumed.
For other objects, it's as legitimate to use any byte sequence (such as a
sequence of bytes from F</dev/urandom> that's been saved away), which makes any
character encoding discussion irrelevant; in such cases, simply use the same
byte sequence as it is.
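
The "naïve" example and the conversion between encodings can be reproduced
with a short Python sketch.
The helper name C<reencode> is hypothetical; it stands in for whatever
conversion tool is available on your platform.

```python
def reencode(raw: bytes, source: str, target: str) -> bytes:
    """Convert pass phrase bytes from one character encoding to another.

    Hypothetical helper; raises an exception if the bytes are not valid in
    the source encoding or cannot be represented in the target encoding.
    """
    return raw.decode(source).encode(target)


passphrase = "na\u00efve"  # "naïve", written with an escape for U+00EF

latin1 = passphrase.encode("iso-8859-1")
utf8 = passphrase.encode("utf-8")

if __name__ == "__main__":
    print(latin1.hex(" "))  # 6e 61 ef 76 65
    print(utf8.hex(" "))    # 6e 61 c3 af 76 65
    # Going from the current UTF-8 environment back to the old encoding.
    print(reencode(utf8, "utf-8", "iso-8859-1") == latin1)  # True
```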

=head2 Creating new objects

For creating new pass phrase protected objects, make sure the pass phrase is
encoded using UTF-8.
This is the default on most modern Unixes, but may involve an effort on other
platforms.
Specifically for Windows, setting the environment variable
B<OPENSSL_WIN32_UTF8> will have anything entered on the Windows console prompt
converted to UTF-8 (command line and separately prompted pass phrases alike).

=head2 Opening existing objects

For opening pass phrase protected objects where you know what character
encoding was used for the encryption pass phrase, make sure to use the same
encoding again.

For opening pass phrase protected objects where the character encoding that was
used is unknown, or where the producing application is unknown, try one of the
following:

=over 4

=item 1.

Try the pass phrase that you have as it is in the character encoding of your
environment.
It's possible that its byte sequence is exactly right.

=item 2.

Convert the pass phrase to UTF-8 and try with the result.
Specifically with PKCS#12, this should open up any object that was created
according to the specification.

=item 3.

Do a naïve (i.e. purely mathematical) ISO-8859-1 to UTF-8 conversion and try
with the result.
This differs from the previous attempt because ISO-8859-1 maps directly to
U+0000 to U+00FF, which other non-UTF-8 character sets do not.

This also takes care of the case when a UTF-8 encoded string was used with
OpenSSL older than 1.1.0.
(For example, C<ï>, which is 0xC3 0xAF when encoded in UTF-8, would become 0xC3
0x83 0xC2 0xAF when re-encoded in the naïve manner.
The conversion to BMPString would then yield 0x00 0xC3 0x00 0xAF 0x00 0x00
(including the terminating null character), the erroneous/non-compliant
encoding used by OpenSSL older than 1.1.0.)

=back

=head1 SEE ALSO

L<evp(7)>,
L<ossl_store(7)>,
L<EVP_BytesToKey(3)>, L<EVP_DecryptInit(3)>,
L<PEM_do_header(3)>,
L<PKCS12_parse(3)>, L<PKCS12_newpass(3)>,
L<d2i_PKCS8PrivateKey_bio(3)>

=head1 COPYRIGHT

Copyright 2018-2021 The OpenSSL Project Authors. All Rights Reserved.

Licensed under the Apache License 2.0 (the "License").  You may not use
this file except in compliance with the License.  You can obtain a copy
in the file LICENSE in the source distribution or at
L<https://www.openssl.org/source/license.html>.

=cut