comp.lang.ada
From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Subject: Re: Ada and Unicode
Date: Mon, 19 Apr 2021 14:52:43 +0200
Message-ID: <s5juep$1lbu$1@gioia.aioe.org>
In-Reply-To: s5jr59$1tkq$1@gioia.aioe.org

On 2021-04-19 13:56, Luke A. Guest wrote:
> On 19/04/2021 10:08, Stephen Leake wrote:
>>> What's the way to manage Unicode correctly ?
>>
>> There are two issues: Unicode in source code, that the compiler must
>> understand, and Unicode in strings, that your program must understand.
> 
> And this is where the Ada standard gets it wrong, in the encodings 
> package re UTF-8.
> 
> Unicode is a superset of 7-bit ASCII not Latin 1. The high bit in the 
> leading octet indicates whether there are trailing octets. See 
> https://github.com/Lucretia/uca/blob/master/src/uca.ads#L70 for the data 
> layout. The first 128 "characters" in Unicode match that of 7-bit ASCII, 
> not 8-bit ASCII, and certainly not Latin 1. Therefore this:
> 
> package Ada.Strings.UTF_Encoding
>    ...
>    subtype UTF_8_String is String;
>    ...
> end Ada.Strings.UTF_Encoding;
> 
> Was absolutely and totally wrong.

It is a practical solution. The Ada type system cannot express string, 
array, or vector subtypes that differ in representation or constraints. 
Ignoring Latin-1 and using String as if it were an array of octets is 
the best available solution.
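
To illustrate what "String as an array of octets" means in practice, 
here is a minimal sketch using the standard Ada 2012 package 
Ada.Strings.UTF_Encoding.Wide_Wide_Strings. The procedure name 
UTF8_Octets and the sample text are only assumptions made for the 
example. It encodes four code points and then walks the result one 
String element at a time, treating each element as a raw octet rather 
than as a Latin-1 character.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure UTF8_Octets is
   package UTF renames Ada.Strings.UTF_Encoding;

   --  Four code points: 'c', 'a', 'f' and U+00E9 (e with acute accent).
   Text : constant Wide_Wide_String :=
     "caf" & Wide_Wide_Character'Val (16#E9#);

   --  UTF_8_String is a subtype of String, so this is an ordinary String
   --  whose elements hold the UTF-8 octets (no BOM by default).
   Encoded : constant UTF.UTF_8_String :=
     UTF.Wide_Wide_Strings.Encode (Text);
begin
   Ada.Text_IO.Put_Line ("Code points:" & Integer'Image (Text'Length));
   Ada.Text_IO.Put_Line ("Octets:     " & Integer'Image (Encoded'Length));

   --  Character'Pos gives the octet value; U+00E9 occupies two octets,
   --  so Encoded is one element longer than Text.
   for C of Encoded loop
      Ada.Text_IO.Put (Integer'Image (Character'Pos (C)));
   end loop;
   Ada.Text_IO.New_Line;
end UTF8_Octets;
```

Nothing in the type system stops such a value from being passed to a 
subprogram that expects Latin-1 text in a String, which is the 
trade-off being discussed here.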

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
