From: DrPi <314@drpi.fr>
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
Date: Sat, 9 Apr 2022 20:59:59 +0200
Message-ID: <6251d7b1$0$3427$426a74cc@news.free.fr>
References: <86mttuk5f0.fsf@stephe-leake.org> <62515f7a$0$25324$426a74cc@news.free.fr>

On 09/04/2022 at 18:46, Dennis Lee Bieber wrote:
> On Sat, 9 Apr 2022 12:27:04 +0200, DrPi <314@drpi.fr> declaimed the
> following:
>
>>
>> In Python-3, a string is a character (glyph?) array. The internal
>> representation is hidden from the programmer.
>
>>
>> On the Ada side, I've still not understood how to correctly deal with
>> all this stuff.
>
>     One thing to take into account is that Python strings are immutable.
> Changing the contents of a string requires constructing a new string from
> parts that incorporate the change.
>
Right. I forgot to mention it.
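
For anyone following along, here is a minimal sketch of the immutability point. The helper name `replace_char` is only something I made up for illustration, not a standard function: in-place assignment is rejected, so a "change" is really a new string built from slices of the old one.

```python
def replace_char(text: str, index: int, new_char: str) -> str:
    """Return a NEW string with the character at `index` replaced.

    Hypothetical helper for illustration; `text` itself is never modified.
    """
    return text[:index] + new_char + text[index + 1:]


original = "Ada"

# Item assignment on a str raises TypeError because strings are immutable.
try:
    original[0] = "a"
    mutated_in_place = True
except TypeError:
    mutated_in_place = False

# The supported way: build a new string out of parts of the old one.
changed = replace_char(original, 0, "a")
```

After this runs, `original` still holds its initial value and `changed` is a separate string object.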
>     That allows for the second aspect -- even if not visible to a
> programmer, Python (3) strings are not a fixed representation: If all
> characters in the string fit in the 8-bit UTF range, that string is stored
> using one byte per character. If any character uses a 16-bit UTF
> representation, the entire string is stored as 16-bit characters (and
> similar for 32-bit UTF points). Thus, indexing into the string is still
> fast -- just needing to scale the index by the character width of the
> entire string.
>
Thanks for clarifying.
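
A rough way to observe this from Python itself, assuming CPython 3.3 or later (where PEP 393 introduced this flexible storage; other implementations may behave differently). The helper `bytes_per_char` is my own illustrative name. It compares the reported size of two strings of different lengths made of the same character, so the fixed object header cancels out and only the per-character width remains.

```python
import sys


def bytes_per_char(ch: str) -> int:
    """Estimate the storage width CPython uses for a string made only of `ch`.

    Hypothetical helper: subtracting two sizes cancels the fixed header,
    leaving 1000 times the per-character width.
    """
    short = ch * 1000
    longer = ch * 2000
    return (sys.getsizeof(longer) - sys.getsizeof(short)) // 1000


widths = {
    "a": bytes_per_char("a"),                    # ASCII
    "\u00e9": bytes_per_char("\u00e9"),          # e-acute, still below U+0100
    "\u20ac": bytes_per_char("\u20ac"),          # euro sign, needs 16 bits
    "\U0001F600": bytes_per_char("\U0001F600"),  # emoji, beyond U+FFFF
}

# Whatever width is chosen, indexing yields whole code points,
# never half of a multi-byte sequence.
mixed = "a\u20ac\U0001F600"
third = mixed[2]
```

In practice the "8-bit range" here means code points up to U+00FF (Latin-1), not UTF-8: the widest code point in the string decides whether every character takes 1, 2 or 4 bytes, which is why indexing only needs to scale the index by that width.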