comp.lang.ada
From: Thomas <fantome.forums.tDeContes@free.fr.invalid>
Subject: Re: Ada and Unicode
Date: Tue, 04 Apr 2023 02:02:03 +0200
Message-ID: <642b68fb$0$3206$426a34cc@news.free.fr>
In-Reply-To: fantome.forums.tDeContes-079FD6.18515603042022@news.free.fr

In article 
<fantome.forums.tDeContes-079FD6.18515603042022@news.free.fr>,
 Thomas <fantome.forums.tDeContes@free.fr.invalid> wrote:

> In article <f9d91cb0-c9bb-4d42-a1a9-0cd546da436cn@googlegroups.com>,
>  Vadim Godunko <vgodunko@gmail.com> wrote:
> 
> > On Sunday, April 18, 2021 at 1:03:14 AM UTC+3, DrPi wrote:
> 
> > > What's the way to manage Unicode correctly ? 


> > Ada doesn't have good Unicode support. :( So, you need to find suitable set 
> > of "workarounds".
> > 
> > There are few different aspects of Unicode support need to be considered:
> > 
> > 1. Representation of string literals. If you want to use non-ASCII 
> > characters 
> > in source code, you need to use -gnatW8 switch and it will require use of 
> > Wide_Wide_String everywhere.
> > 2. Internal representation during application execution. You are forced to 
> > use Wide_Wide_String at previous step, so it will be UCS4/UTF32.
> 
> > It is hard to say that it is reasonable set of features for modern world.
> 
> I don't think Ada would be lacking that much, for having good UTF-8 
> support.
> 
> the cardinal point is to be able to fill a 
> Ada.Strings.UTF_Encoding.UTF_8_String with a litteral.
> (once you got it, when you'll try to fill a Standard.String with a 
> non-Latin-1 character, it'll make an error, i think it's fine :-) )
> 
> does Ada 202x allow it ?


Hi!

I think I found quite a nice solution!
(after reading <t3lj44$fh5$1@dont-email.me> again)
(not tested yet)


It is not perfect, not done by the book,
but it is:

- Ada 2012 compatible
- better than writing UTF-8 Ada source and then telling GNAT it is Latin-1
  (that way GNAT would take UTF_8_String for what it is,
  an array of octets, but it would not detect an invalid UTF-8 string,
  and if someone tells it the source is really UTF-8, everything goes wrong)
- better than being limited to ASCII in string literals
- no need to ever declare a Wide_Wide_String explicitly:
  it is always implicit and short-lived,
  and AFAIK eligible for optimization



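with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

package UTF_Encoding is

   subtype UTF_8_String is Ada.Strings.UTF_Encoding.UTF_8_String;

   --  Encode has a second parameter (Output_BOM), so "+" cannot simply
   --  rename it; an Ada 2012 expression function wraps the call instead.
   function "+" (A : in Wide_Wide_String) return UTF_8_String is
     (Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (A));

end UTF_Encoding;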


then we can do:


with UTF_Encoding;

package User is

   use UTF_Encoding;

   My_String : UTF_8_String := + "Greek characters + smileys";

end User;


If you want to avoid "use UTF_Encoding;",
I think "use type UTF_Encoding.UTF_8_String;" doesn't work
(this "+" is not a primitive operation of String),
but this should work:


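with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

package UTF_Encoding is

   subtype UTF_8_String is Ada.Strings.UTF_Encoding.UTF_8_String;

   type Literals_For_UTF_8_String is new Wide_Wide_String;

   --  Encode takes a Wide_Wide_String plus an Output_BOM parameter, so
   --  "+" wraps it in an expression function with a type conversion
   --  rather than renaming it.
   function "+" (A : in Literals_For_UTF_8_String) return UTF_8_String is
     (Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode
        (Wide_Wide_String (A)));

end UTF_Encoding;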


with UTF_Encoding;

package User is

   use type UTF_Encoding.Literals_For_UTF_8_String;

   My_String : UTF_Encoding.UTF_8_String
               := + "Greek characters + smileys";

end User;
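

As a quick sanity check of what Encode produces, here is a minimal
sketch. The procedure name Check_Encode and the two sample characters
are only my illustration. It builds the characters with 'Val, so the
file stays plain ASCII and does not depend on -gnatW8:

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure Check_Encode is

   --  U+03B1 (Greek small alpha) and U+1F600 (a smiley), built with 'Val
   --  so that this source file contains only ASCII characters.
   Alpha  : constant Wide_Wide_Character :=
     Wide_Wide_Character'Val (16#03B1#);
   Smiley : constant Wide_Wide_Character :=
     Wide_Wide_Character'Val (16#1F600#);

   Source  : constant Wide_Wide_String := (Alpha, Smiley);

   --  The expected type selects the UTF-8 overload of Encode.
   Encoded : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (Source);

begin
   --  Alpha takes 2 octets in UTF-8 and the smiley 4, so 6 are expected.
   Ada.Text_IO.Put_Line ("Octets:" & Integer'Image (Encoded'Length));

   --  Decoding should give back the original two characters.
   if Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Decode (Encoded) = Source
   then
      Ada.Text_IO.Put_Line ("Round trip OK");
   end if;
end Check_Encode;
```

The same check should apply to a value built with "+", since "+" is
only meant to be a thin wrapper around Encode.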



What do you think about that? Good idea or not? :-)

-- 
RAPID maintainer
http://savannah.nongnu.org/projects/rapid/
