History log of /llvm-project/llvm/lib/Support/UnicodeNameToCodepoint.cpp (Results 1 – 12 of 12)
Revision Date Author Comments
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# 03e43cf1 17-Jan-2024 cor3ntin <corentinjabot@gmail.com>

[Clang] Update Unicode version to 15.1 (#77147)

This updates all of our Unicode tables to Unicode 15.1. This is a minor
version, so only a relatively small number of characters are added,
mainly ideographs.

https://www.unicode.org/versions/Unicode15.1.0/#Appendices_nb


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5
# bcb685e1 03-Nov-2023 Simon Pilgrim <llvm-dev@redking.me.uk>

[Support] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.

startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)


Revision tags: llvmorg-17.0.4
# d11c4542 19-Oct-2023 Kazu Hirata <kazu@google.com>

[Support] Use StringRef::contains_insensitive (NFC)


Revision tags: llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1
# 68410fbe 28-Jul-2023 Corentin Jabot <corentinjabot@gmail.com>

Fix handling of medial hyphens in Unicode Names.

If a Unicode name was stored in a way that caused
a medial hyphen to be at the end of a chunk, it would not
be properly ignored by the loose matching algorithm.

For example if `LEFT-TO-RIGHT OVERRIDE` was stored as
`LEFT-` [...], the `-` would not be ignored.

The generator now ensures nodes are not cut across
medial hyphen boundaries.
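
To see why the chunk boundary matters, here is a minimal sketch of loose-matching normalization. It is an illustration only and not LLVM's code: `normalizeLoose` is a hypothetical helper, it treats a hyphen as medial when a letter or digit sits on both sides, and it omits the special case for U+1180 HANGUL JUNGSEONG O-E.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not part of LLVM's API: drops spaces, underscores,
// and medial hyphens, and uppercases everything else.
std::string normalizeLoose(std::string_view Name) {
  auto IsAlnum = [](char C) {
    return std::isalnum(static_cast<unsigned char>(C)) != 0;
  };
  std::string Out;
  for (std::size_t I = 0; I < Name.size(); ++I) {
    char C = Name[I];
    if (C == ' ' || C == '_')
      continue;
    // A hyphen only counts as medial when both neighbours are visible
    // in the text being normalized.
    if (C == '-' && I > 0 && I + 1 < Name.size() &&
        IsAlnum(Name[I - 1]) && IsAlnum(Name[I + 1]))
      continue;
    Out += static_cast<char>(std::toupper(static_cast<unsigned char>(C)));
  }
  return Out;
}
```

Under these assumptions, normalizing the whole name `LEFT-TO-RIGHT OVERRIDE` drops both hyphens, whereas normalizing the chunk `LEFT-` on its own keeps the trailing hyphen because its right-hand neighbour is not visible. That mirrors the kind of mismatch described above.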

Fixes #64161

Differential Revision: https://reviews.llvm.org/D156518


Revision tags: llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init
# cfdba8b7 14-Jan-2023 Fangrui Song <i@maskray.me>

Fix some -Wconstant-conversion warnings for future Clang (D139114)


# 2ca4b4f8 13-Jan-2023 Vitaly Buka <vitalybuka@google.com>

[NFC] Suppress warning after D139114


Revision tags: llvmorg-15.0.7
# b1df3a2c 16-Dec-2022 Fangrui Song <i@maskray.me>

[Support] llvm::Optional => std::optional

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716


# aadaafac 03-Dec-2022 Kazu Hirata <kazu@google.com>

[llvm] Use std::nullopt instead of None (NFC)

This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
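
As a rough picture of what code looks like after the migration, the sketch below returns `std::optional` and signals "no result" with `std::nullopt`. The function `lookupName` and its two-entry table are hypothetical and exist only for this example.

```cpp
#include <optional>
#include <string_view>

// Hypothetical lookup over a tiny hard-coded table, for illustration only.
std::optional<char32_t> lookupName(std::string_view Name) {
  if (Name == "SPACE")
    return char32_t{0x20};
  if (Name == "DIGIT ZERO")
    return char32_t{0x30};
  // With llvm::Optional this was typically spelled `return None;`.
  return std::nullopt;
}
```

Callers can then use the standard accessors, for example `lookupName("SPACE").has_value()` or `lookupName(Name).value_or(0xFFFD)`.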


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1
# c932cef3 13-Sep-2022 Corentin Jabot <corentinjabot@gmail.com>

Update Unicode to 15.0

Unicode 15.0 adds 4,489 characters, for a total of 149,186 characters.
These additions include 2 new scripts along with 20 new emoji characters,
and 4,193 CJK ideographs.

This change modifies most existing tables, including
- XID_Start/XID_Continue in Clang
- The character name database (used by \N{} in Clang)
- The list of formattable/printable codepoints
- The case folding algorithm (which we had not updated since Unicode 9)
- The list of nonspacing/enclosing marks used by the column width
computation algorithm. The rest of the column width algorithm
is not updated.

Reviewed By: tahonermann

Differential Revision: https://reviews.llvm.org/D133807


Revision tags: llvmorg-15.0.0
# 89f14332 03-Sep-2022 Kazu Hirata <kazu@google.com>

Use llvm::lower_bound (NFC)


Revision tags: llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# f5cd172e 26-Jun-2022 Benjamin Kramer <benny.kra@googlemail.com>

[Support] Work around an issue when building with old versions of libstdc++

llvm/lib/Support/UnicodeNameToCodepoint.cpp:189:12: error: chosen constructor is explicit in copy-initialization
return {N, false, 0};
^~~~~~~~~~~~~
/usr/include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
constexpr tuple(_UElements&&... __elements)
^
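
For context, one common way to sidestep this diagnostic is to avoid copy-list-initialization of the tuple and name the type directly. The sketch below shows that pattern; it is not necessarily what the commit does, and `makeResult` with its element types is a hypothetical stand-in for the real function.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical signature; the real function in the file is not shown here.
std::tuple<int, bool, std::uint32_t> makeResult(int N) {
  // `return {N, false, 0};` needs a non-explicit tuple constructor, which
  // old libstdc++ (such as the 5.4.0 headers in the error) does not offer.
  // Spelling out the type uses direct-initialization, which is accepted.
  return std::tuple<int, bool, std::uint32_t>(N, false, 0);
}
```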


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# c92056d0 04-Apr-2022 Corentin Jabot <corentinjabot@gmail.com>

[Clang][C++23] P2071 Named universal character escapes

Implements [[ https://wg21.link/p2071r1 | P2071 Named Universal Character Escapes ]] as an extension in all language modes. The change to stop warning in C++23 mode will be done later, once this paper is plenary-approved (in July).

We add

* A code generator that transforms `UnicodeData.txt` and `NameAliases.txt` into a space-efficient data structure that can be queried in `O(NameLength)`
* A set of functions in `Unicode.h` to query that data, including

* A function to find an exact match of a given Unicode character name
* A function to perform loose matching (ignoring case, space, underscore, medial hyphen)
* A function returning the best matching codepoint for a given string per edit distance

* Support of `\N{}` escape sequences in string and character literals, with loose-match and typo diagnostics/fixits
* Support of `\N{}` as UCN with loose matching diagnostics/fixits.

Loose matching is considered an error to match closely the semantics of P2071.
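
The "best match per edit distance" lookup listed above can be pictured with a minimal sketch like the following. It assumes a plain Levenshtein distance and a linear scan over a flat table. `NameEntry`, `editDistance`, and `nearestName` are hypothetical names for this example, and LLVM's generated data structure and search are not shown in this log and may differ substantially.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical entry type, used only for this illustration.
struct NameEntry {
  std::string_view Name;
  char32_t CodePoint;
};

// Classic Levenshtein distance using a single rolling row.
std::size_t editDistance(std::string_view A, std::string_view B) {
  std::vector<std::size_t> Row(B.size() + 1);
  for (std::size_t J = 0; J <= B.size(); ++J)
    Row[J] = J;
  for (std::size_t I = 1; I <= A.size(); ++I) {
    std::size_t Diag = Row[0]; // distance for prefixes (I-1, J-1)
    Row[0] = I;
    for (std::size_t J = 1; J <= B.size(); ++J) {
      std::size_t Above = Row[J]; // distance for prefixes (I-1, J)
      std::size_t Subst = Diag + (A[I - 1] == B[J - 1] ? 0 : 1);
      Row[J] = std::min({Above + 1, Row[J - 1] + 1, Subst});
      Diag = Above;
    }
  }
  return Row[B.size()];
}

// Returns the entry whose name is closest to Query, or nullptr if the
// table is empty. Ties go to the earliest entry.
const NameEntry *nearestName(std::string_view Query,
                             const std::vector<NameEntry> &Table) {
  const NameEntry *Best = nullptr;
  std::size_t BestDist = 0;
  for (const NameEntry &E : Table) {
    std::size_t D = editDistance(Query, E.Name);
    if (!Best || D < BestDist) {
      Best = &E;
      BestDist = D;
    }
  }
  return Best;
}
```

With a table holding `GREEK SMALL LETTER ALPHA` (U+03B1) and `GREEK SMALL LETTER BETA` (U+03B2), the misspelled query `GREEK SMALL LETER ALPHA` is one edit away from the first entry, so a typo diagnostic could suggest it as a fix-it.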

The generated data adds 280kB to the binaries.

`UnicodeData.txt` and `NameAliases.txt` are not committed to the repository in this patch, and regenerating the data is a manual process.

Reviewed By: tahonermann

Differential Revision: https://reviews.llvm.org/D123064
