From: Thomas
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
References: <607b5b20$0$27442$426a74cc@news.free.fr>
Date: Sun, 03 Apr 2022 18:51:56 +0200

Vadim Godunko wrote:

> On Sunday, April 18, 2021 at 1:03:14 AM UTC+3, DrPi wrote:
> > What's the way to manage Unicode correctly ?
> >
>
> Ada doesn't have good Unicode support. :( So, you need to find suitable set
> of "workarounds".
>
> There are few different aspects of Unicode support need to be considered:
>
> 1. Representation of string literals. If you want to use non-ASCII characters
> in source code, you need to use -gnatW8 switch and it will require use of
> Wide_Wide_String everywhere.
> 2. Internal representation during application execution. You are forced to
> use Wide_Wide_String at previous step, so it will be UCS4/UTF32.

> It is hard to say that it is reasonable set of features for modern world.

I don't think Ada is that far from having good UTF-8 support.

The key point is being able to initialize an
Ada.Strings.UTF_Encoding.UTF_8_String from a literal.
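
As far as I know, the closest we can get today is to write a
Wide_Wide_String literal and encode it with the standard
Ada.Strings.UTF_Encoding.Wide_Wide_Strings package. A minimal sketch,
assuming GNAT, a source file saved as UTF-8 and the -gnatW8 switch (the
procedure name and the sample text are made up for illustration):

```ada
with Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

--  Hypothetical demo; assumes a UTF-8 source file compiled with -gnatW8.
procedure UTF8_Literal_Demo is

   --  The literal itself has to be a Wide_Wide_String (one element per
   --  code point), not a UTF_8_String.
   Text : constant Wide_Wide_String := "Ελληνικά";

   --  Encode is an ordinary function call that returns the UTF-8 bytes
   --  in a UTF_8_String (which is a subtype of String).
   Encoded : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (Text);

begin
   --  Only the lengths are printed, so the output stays plain ASCII.
   Ada.Text_IO.Put_Line
     ("code points:" & Natural'Image (Text'Length)
      & ", UTF-8 bytes:" & Natural'Image (Encoded'Length));
end UTF8_Literal_Demo;
```

This works, but the literal is still a Wide_Wide_String and the
conversion is an explicit call, so it is a workaround rather than a real
UTF-8 literal.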
(Once you have that, trying to put a non-Latin-1 character into a
Standard.String will be an error, and I think that's fine :-) )

Does Ada 202x allow it?

If not, it would probably have been easier if the declaration were

   type UTF_8_String is new String;

instead of

   subtype UTF_8_String is String;

For the subprograms it's quite easy: we just have to duplicate them with
the new type and mark the old ones as obsolescent.

But now that "subtype UTF_8_String" exists, I don't know what we can do
about the type itself. (Is the only way to choose a new name?)

> To
> fix some of drawbacks of current situation we are developing new text
> processing library, know as VSS.
>
> https://github.com/AdaCore/VSS

(Are you working at AdaCore?)

-- 
RAPID maintainer
http://savannah.nongnu.org/projects/rapid/