Newsgroups: comp.lang.ada
Date: Thu, 29 Jun 2023 01:49:12 -0700 (PDT)
Message-ID: <0162bf97-8a37-4244-a368-1bf7ae00077bn@googlegroups.com>
Subject: Re: [ANN] Release of UXStrings 0.5.0
From: "Vincent D."

Hello Pascal,

Thank you for this contribution. Here are some comments:

- Since UTFString is a class ("a tagged record type"), why don't you create an abstract root "UXString" and then derive specialized object types? Like UTF_8_XString, UTF_16_XString, ASCII_XString, Win_1252_XString, Latin_XString, etc.
- The default format to convert between different encodings should be UTF-8, as it is now ubiquitous.

> [...]
> moreover in the case of strings accentuated in French and strings containing emojis the process times are also improved (factor 7 to 8 by compared to UXStrings1

- I find it quite astonishing to see a factor of 8 compared to UTF-8 encoding. Do you have an explanation? This looks like a poor implementation, because UTF-8 encoding is fast and allows direct manipulation in most cases. Maybe it is because random access is treated as sequential access for UTF-8 encoded strings, but that again would be a poor implementation.

Kind regards,
Vincent
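
P.S. To make the last point concrete, here is a minimal sketch of what I mean by random access being treated as sequential access. It is written in Python only for brevity. It is not UXStrings code, and the names char_offset, char_at and char_at_fast are made up for this illustration. The first two functions find the N-th character of a UTF-8 buffer by scanning from the start, which costs O(n) per lookup. The third assumes the caller has checked once that the buffer is pure ASCII, in which case indexing is direct.

```python
def char_offset(buf: bytes, index: int) -> int:
    """Byte offset of the code point at 0-based position `index`.

    Walks the buffer from the start, so each lookup is O(n).
    """
    if index < 0:
        raise IndexError(index)
    seen = 0
    for offset, byte in enumerate(buf):
        # Continuation bytes have the form 0b10xxxxxx; any other byte
        # starts a new code point.
        if byte & 0xC0 != 0x80:
            if seen == index:
                return offset
            seen += 1
    raise IndexError(index)


def char_at(buf: bytes, index: int) -> str:
    """Decode the single code point at position `index` (sequential scan)."""
    start = char_offset(buf, index)
    end = start + 1
    # Extend over the continuation bytes that belong to this code point.
    while end < len(buf) and buf[end] & 0xC0 == 0x80:
        end += 1
    return buf[start:end].decode("utf-8")


def char_at_fast(buf: bytes, index: int, is_ascii: bool) -> str:
    """Same result as char_at, with an O(1) path for pure-ASCII buffers.

    `is_ascii` is assumed to have been computed once, for example with
    buf.isascii(), and stored alongside the buffer.
    """
    if is_ascii:
        if index < 0 or index >= len(buf):
            raise IndexError(index)
        return chr(buf[index])
    return char_at(buf, index)
```

For example, with buf = "été 😀 ok".encode("utf-8"), char_at(buf, 4) returns "😀", but only after walking over the six bytes that precede it. A loop that calls such a function for every index is quadratic in the string length, which is one possible explanation for a large factor. Whether that is what actually happens in the measured code is only a guess on my part.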