From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: GtkAda and €
Date: Sat, 11 Sep 2021 16:13:07 +0200

On 2021-09-11 15:51, AdaMagica wrote:

> Being German, I need umlauts and € together in strings to write them to some labels.
> Using Character'Val (16#E2#) & Character'Val (16#82#) & Character'Val (16#AC#)
> complicates things, since umlauts are above 255 and need transformation to UTF8,
> whereas the euro sequence above is already in UTF8 and must not again be transformed.
>
> What a mess!

Huh, the mess here is Latin-1, introduced by Ada 95; no such thing should ever have been supported. This happened because in the 90s UTF-8 was not yet established, so Ada 95 made Character Latin-1 and added Wide_Character for UCS-2. This was a huge mistake with wide-reaching (pun intended), nasty consequences.
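
To make the clash from the quote concrete, here is a minimal sketch. It assumes a hypothetical helper, To_UTF8, that treats every Character as a Latin-1 code point and encodes it as UTF-8 octets. The names Euro_Mess, To_UTF8, Euro, A_Umlaut, Good and Bad are all illustrative, and nothing here is taken from GtkAda. The umlaut has to pass through the helper. The euro octets must not, or they are encoded a second time.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Euro_Mess is

   --  Hypothetical helper: encode a Latin-1 String as UTF-8 octets.
   --  Every Latin-1 code point needs at most two octets.
   function To_UTF8 (S : String) return String is
      R : String (1 .. 2 * S'Length);
      N : Natural := 0;
   begin
      for C of S loop
         if Character'Pos (C) < 16#80# then
            --  ASCII is copied unchanged.
            N := N + 1;
            R (N) := C;
         else
            --  16#80# .. 16#FF# becomes a two-octet sequence.
            R (N + 1) := Character'Val (16#C0# + Character'Pos (C) / 64);
            R (N + 2) := Character'Val (16#80# + Character'Pos (C) mod 64);
            N := N + 2;
         end if;
      end loop;
      return R (1 .. N);
   end To_UTF8;

   --  Named constant: the euro sign, already UTF-8 (three octets).
   Euro : constant String :=
     Character'Val (16#E2#) & Character'Val (16#82#) & Character'Val (16#AC#);

   --  Named constant: a-umlaut as a single Latin-1 Character.
   A_Umlaut : constant Character := Character'Val (16#E4#);

   --  Correct: only the Latin-1 part is converted.
   Good : constant String := To_UTF8 ("K" & A_Umlaut & "se 5 ") & Euro;

   --  Wrong: the euro octets are converted again.
   Bad : constant String := To_UTF8 ("K" & A_Umlaut & "se 5 " & Euro);

begin
   Put_Line ("Good:" & Natural'Image (Good'Length) & " octets");  --  11
   Put_Line ("Bad: " & Natural'Image (Bad'Length) & " octets");   --  14
end Euro_Mess;
```

In Bad, each of the three euro octets is 16#80# or above, so each is expanded to two octets and the label no longer contains a valid euro sign. Nothing in the type String distinguishes the two cases, so the compiler cannot catch the mistake.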

Since the Ada type system is too weak to handle encodings, Strings should simply be UTF-8 and Character an octet, with the lower 7 bits corresponding to ASCII. Anyway, for anything that is not ASCII I use a named constant.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de