From: "Luke A. Guest"
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
Date: Mon, 19 Apr 2021 12:50:40 +0100
Organization: Aioe.org NNTP Server
References: <607b5b20$0$27442$426a74cc@news.free.fr> <660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com>

On 19/04/2021 12:15, Simon Wright wrote:
> Maxim Reznik writes:
>
>> Sunday, 18 April 2021 at 01:03:14 UTC+3, DrPi:
>>>
>>> Any way to use source code encoded in UTF-8 ?
>>
>> Yes, with GNAT just use "-gnatW8" for compiler flag (in command line
>> or your project file):
>
> But don't use unit names containing international characters, at any
> rate if you're (interested in compiling on) Windows or macOS:

There's no such thing as a "character" any more, and we need to move away from that term. Unicode has the concept of a code point, which fits in a 32-bit value (the range is U+0000 to U+10FFFF), and any "character" as we know it, or glyph, can consist of multiple code points. In my lib, which is nowhere near ready (whether it ever will be, I don't know), I define octets; Unicode_String (a UTF-8 string), which is an array of octets; and Code_Points, which an iterator produces as it iterates over those strings.
I was intending to have an iterator for grapheme clusters and other units.
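
To make the octets-versus-code-points distinction concrete, here is a minimal sketch in Python (not Ada, and not the library described above). The function name code_points is hypothetical and chosen only for this illustration. It assumes the input is mostly well-formed UTF-8: it checks lead and continuation octets but does not reject overlong encodings or surrogate values.

```python
from typing import Iterator


def code_points(octets: bytes) -> Iterator[int]:
    """Yield code points decoded from a sequence of UTF-8 octets.

    Hypothetical helper for illustration only. It validates lead and
    continuation octets, but does NOT reject overlong forms or surrogates.
    """
    i = 0
    n = len(octets)
    while i < n:
        lead = octets[i]
        if lead < 0x80:               # 0xxxxxxx: one octet
            length, value = 1, lead
        elif lead >> 5 == 0b110:      # 110xxxxx: two octets
            length, value = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:     # 1110xxxx: three octets
            length, value = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:    # 11110xxx: four octets
            length, value = 4, lead & 0x07
        else:
            raise ValueError(f"invalid lead octet 0x{lead:02X} at offset {i}")
        if i + length > n:
            raise ValueError(f"truncated sequence at offset {i}")
        for cont in octets[i + 1 : i + length]:
            if cont >> 6 != 0b10:     # continuation octets are 10xxxxxx
                raise ValueError(f"invalid continuation octet 0x{cont:02X}")
            value = (value << 6) | (cont & 0x3F)
        yield value
        i += length


# One on-screen "character" (e with an acute accent) built from two code
# points: LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT.
sample = "e\u0301".encode("utf-8")
print(len(sample), [f"U+{cp:04X}" for cp in code_points(sample)])
```

The sample is three octets long but decodes to two code points, and a reader sees only one accented letter. Grouping those code points into a single grapheme cluster would need a further iterator on top of this one.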