This directory contains a mechanism for GCC to have its own internal
implementation of wcwidth functionality (cpp_wcwidth () in libcpp/charset.c).

The idea is to produce the necessary lookup table
(../../libcpp/generated_cpp_wcwidth.h) in a reproducible way, starting from the
following files that are distributed by the Unicode Consortium:

ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt
ftp://ftp.unicode.org/Public/UNIDATA/EastAsianWidth.txt
ftp://ftp.unicode.org/Public/UNIDATA/PropList.txt

These three files have been added to source control in this directory;
please see unicode-license.txt for the relevant copyright information.

In order to keep in sync with glibc's wcwidth as much as possible, it is
desirable for the logic that processes the Unicode data to be the same as
glibc's.  To that end, the from_glibc/ subdirectory holds the glibc Python
code that implements that logic.  This code was copied verbatim from glibc,
and it can be updated at any time from the glibc source code repository.
The files copied from that repository are:

localedata/unicode-gen/unicode_utils.py
localedata/unicode-gen/utf8_gen.py

The most recent versions added to GCC are from glibc git commit:
2a764c6ee848dfe92cb2921ed3b14085f15d9e79

Finally, the script gen_wcwidth.py found here contains the GCC-specific code to
map glibc's output to the lookup tables we require.  This script should not need
to change, unless there are structural changes to the Unicode data files or to
the glibc code.
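
To illustrate the general idea of turning Unicode data into a compact lookup
table, here is a minimal sketch in Python.  It is not the logic of
gen_wcwidth.py or of glibc's utf8_gen.py: it reads only EastAsianWidth.txt-style
lines, treats the W and F classes as two columns wide and everything else as
one column, and ignores zero-width characters entirely.  All function names
are hypothetical.

```python
import bisect


def parse_east_asian_width(lines):
    """Yield (first, last, prop) from EastAsianWidth.txt-style lines."""
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        code_range, prop = (field.strip() for field in line.split(";")[:2])
        first, _, last = code_range.partition("..")
        first = int(first, 16)
        last = int(last, 16) if last else first
        yield first, last, prop


def _append(table, last, width):
    # Merge with the previous range when the width is unchanged.
    if table and table[-1][1] == width:
        table[-1] = (last, width)
    else:
        table.append((last, width))


def build_table(entries):
    """Collapse entries into sorted (last_code_point, width) pairs.

    Simplifying assumption: W and F map to width 2, anything else
    (including code points not listed at all) maps to width 1.
    """
    table = []
    next_cp = 0
    for first, last, prop in sorted(entries):
        if first > next_cp:
            _append(table, first - 1, 1)  # unlisted gap, default width
        _append(table, last, 2 if prop in ("W", "F") else 1)
        next_cp = last + 1
    return table


def lookup(table, code_point):
    """Binary-search the range ends; default to 1 past the last range."""
    ends = [end for end, _ in table]
    index = bisect.bisect_left(ends, code_point)
    return table[index][1] if index < len(table) else 1
```

Storing only the end of each range keeps the table small, and a binary search
over the range ends answers each query in logarithmic time.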

The procedure to update GCC's wcwidth tables is the following:

1.  Update the three Unicode data files from the above URLs.

2.  Update the two glibc files in from_glibc/ from glibc's git.  Update
    the commit number above in this README.

3.  Run ./gen_wcwidth.py X.Y > ../../libcpp/generated_cpp_wcwidth.h
    (where X.Y is the version of the Unicode standard corresponding to the
    Unicode data files being used, most recently, 12.1).

After that, GCC's wcwidth will match the most recent glibc.
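
Step 3 of the procedure can be wrapped in a small helper that first checks
that the inputs from steps 1 and 2 are present.  The following is only a
sketch: the helper names are hypothetical, and it assumes the glibc files
keep their original base names inside from_glibc/.

```python
import subprocess
import sys
from pathlib import Path

# Inputs named in this README; the from_glibc/ file names are an assumption.
REQUIRED = (
    "UnicodeData.txt",
    "EastAsianWidth.txt",
    "PropList.txt",
    "from_glibc/unicode_utils.py",
    "from_glibc/utf8_gen.py",
    "gen_wcwidth.py",
)


def missing_inputs(directory):
    """Return the required files that are absent from DIRECTORY."""
    base = Path(directory)
    return [name for name in REQUIRED if not (base / name).is_file()]


def regenerate(directory, unicode_version, output):
    """Run gen_wcwidth.py and write its standard output to OUTPUT."""
    missing = missing_inputs(directory)
    if missing:
        raise FileNotFoundError("missing inputs: " + ", ".join(missing))
    result = subprocess.run(
        [sys.executable, "gen_wcwidth.py", unicode_version],
        cwd=directory,
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with an error
    )
    # Write only after a successful run, so that a failure does not
    # leave a truncated header behind.
    Path(output).write_text(result.stdout)
```

For example, calling regenerate (".", "12.1",
"../../libcpp/generated_cpp_wcwidth.h") from this directory would correspond
to the command shown in step 3.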