From: Maxim Reznik
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
Date: Mon, 19 Apr 2021 01:29:35 -0700 (PDT)
Message-ID: <660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com>
In-Reply-To: <607b5b20$0$27442$426a74cc@news.free.fr>

On Sunday, 18 April 2021 at 01:03:14 UTC+3, DrPi wrote:
>
> Any way to use source code encoded in UTF-8 ?

Yes. With GNAT, just pass "-gnatW8" as a compiler flag (on the command line or in your project file):

-- main.adb:
with Ada.Wide_Wide_Text_IO;

procedure Main is
   --  Both the identifier and the literal contain Cyrillic characters.
   Привет : constant Wide_Wide_String := "Привет";
begin
   Ada.Wide_Wide_Text_IO.Put_Line (Привет);
end Main;

$ gprbuild -gnatW8 main.adb
$ ./main
Привет

> In some languages, it is possible to set a tag at the beginning of the
> source file to direct the compiler which encoding to use.

You can do this by putting the Wide_Character_Encoding pragma (a GNAT-specific pragma) at the top of the file:

-- main.adb:
--  GNAT-specific: marks this source file as UTF-8 encoded.
pragma Wide_Character_Encoding (UTF8);

with Ada.Wide_Wide_Text_IO;

procedure Main is
   Привет : constant Wide_Wide_String := "Привет";
begin
   Ada.Wide_Wide_Text_IO.Put_Line (Привет);
end Main;

$ gprbuild main.adb
$ ./main
Привет

> What's the way to manage Unicode correctly ?
>

You can use the Wide_Wide_String and Unbounded_Wide_Wide_String types to process Unicode strings, but this is not very handy. I use the Matreshka library for Unicode strings. It has a lot of features (regexp, string vectors, XML, JSON, databases, Web Servlets, template engine, etc.).

URL: https://forge.ada-ru.org/matreshka

> Regards,
> Nicolas