From: "Randy Brukardt"
Newsgroups: comp.lang.ada
Subject: Re: strange behaviour of utf-8 files
Date: Thu, 21 Nov 2013 18:53:32 -0600
Organization: Jacob Sparre Andersen Research & Innovation
References: <73e0853b-454a-467f-9dc7-84ca5b9c29b2@googlegroups.com> <1ghx537y5gbfq.17oazom68d4n6.dlg@40tude.net> <5bf1b290-70bc-4240-b27c-120ce6b0b840@googlegroups.com>

"Peter C. Chapin" wrote in message
news:alpine.DEB.2.02.1311161503000.6074@whirlwind...
> On Sat, 16 Nov 2013, Stoik wrote:
>
>> By the way, nothing changes if I use wide_character and wide_string
>> instead of character and string. Even if character=octet, certainly
>> wide_character is not an octet!
>
> It sounds like you want something like
>
>    function UTF8_String_To_Wide_String(S : String) return Wide_String;
>
> UTF-8 is a variable length encoding and thus not the same beast as
> Wide_String. String literals are going to be encoded in the same manner as
> the rest of the source text, of course.

Ada 2012 has Ada.Strings.UTF_Encoding for run-time encoding conversions. (See
A.4.11.) We might be able to do better in the next version of Ada (whenever
that is), but I wouldn't hold my breath.

                                   Randy.
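
For what it's worth, here is a minimal sketch of the conversion Peter describes, written against the Decode function in Ada.Strings.UTF_Encoding.Wide_Strings (A.4.11). The wrapper name is the one Peter suggested; the procedure name Decode_Demo and the sample octets are made up purely for illustration, so treat this as one possible way to do it rather than the definitive one.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Strings;

procedure Decode_Demo is

   --  Thin wrapper with the profile Peter suggested.  Decode raises
   --  Ada.Strings.UTF_Encoding.Encoding_Error if S is not valid UTF-8.
   function UTF8_String_To_Wide_String (S : String) return Wide_String is
     (Ada.Strings.UTF_Encoding.Wide_Strings.Decode (S));

   --  Sample input (an assumption for illustration): U+00E9, 't', U+00E9,
   --  spelled out as raw UTF-8 octets so the result does not depend on
   --  how this source file happens to be encoded.
   E_Acute : constant String :=
     Character'Val (16#C3#) & Character'Val (16#A9#);
   Encoded : constant String := E_Acute & "t" & E_Acute;

   Decoded : constant Wide_String := UTF8_String_To_Wide_String (Encoded);

begin
   --  Five octets go in, three Wide_Characters come out.
   Ada.Text_IO.Put_Line ("Octets:    " & Integer'Image (Encoded'Length));
   Ada.Text_IO.Put_Line ("Characters:" & Integer'Image (Decoded'Length));
end Decode_Demo;
```

The point of printing the two lengths is that they differ: a String holding UTF-8 is just a sequence of octets, and only after Decode does each array element correspond to one character. Going the other way would use Encode from the same package.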