comp.lang.ada
From: Simon Wright <simon@pushface.org>
Subject: Re: XMLAda & unicode symbols
Date: Mon, 21 Jun 2021 16:26:01 +0100
Message-ID: <lybl7zgrxy.fsf@pushface.org>
In-Reply-To: b4c0edbd-7567-47cb-ba75-2fa27d75a788n@googlegroups.com

"196...@googlemail.com" <1963bib@googlemail.com> writes:

> Asking for the degree sign, was probably a slight mistake. There is
> Degree_Celsius and also Degree_Fahrenheit for those who have not yet
> embraced metric. These are the "correct" symbols.

You might equally have meant angular degrees.

> Both of these exist in Unicode.Names.Letterlike_Symbols, and probably
> elsewhere, but trying to shoehorn these in seems impossible.

A scan through XML/Ada shows that the only uses of Unicode_Char are in
the SAX subset. I don't see any way in the DOM subset of XML/Ada of
using them - someone please prove me wrong!

You could build a Unicode_Char to UTF_8_String converter using
Ada.Strings.UTF_Encoding.Wide_Wide_Strings, ARM A.4.11(30)
http://www.ada-auth.org/standards/rm12_w_tc1/html/RM-A-4-11.html#p30
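A minimal, untested sketch of such a converter follows. The function
name To_UTF_8 is made up for illustration, and it assumes that
Unicode.Unicode_Char is an integer (modular) type, so that its value
can be passed to Wide_Wide_Character'Val as a code point.

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Unicode;

--  Hypothetical helper: encode a single code point as UTF-8.
function To_UTF_8
  (C : Unicode.Unicode_Char)
   return Ada.Strings.UTF_Encoding.UTF_8_String
is
   --  A one-character Wide_Wide_String holding the code point.
   --  'Val raises Constraint_Error if C is outside the range of
   --  Wide_Wide_Character.
   W : constant Wide_Wide_String := (1 => Wide_Wide_Character'Val (C));
begin
   --  Output_BOM defaults to False, so no byte order mark is added.
   return Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (W);
end To_UTF_8;
```

UTF_8_String is a subtype of String, so the result can be concatenated
with ordinary ASCII text before it is handed to XML/Ada.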

> I just wish XMLAda could just accept whatever we throw at it, and if
> we need to convert it, then let us do so outside of it.

That is *exactly* what you have to do: convert outside, rather than
throwing any old sequence of octets and 32-bit values somehow mashed
together at it. It wants a UTF-8-encoded string (though XML/Ada doesn't
seem to say so; RFC 3076 implies it, and RFC 7303 (8.1) recommends it).

OK, Text_IO might not prove the point to you, but what about this?

   with Ada.Characters.Latin_1;
   with DOM.Core.Documents;
   with DOM.Core.Elements;
   with DOM.Core.Nodes;
   with DOM.Core;
   with Unicode.CES;
   with Unicode.Encodings;

   procedure Utf is
      Impl : DOM.Core.DOM_Implementation;
      Doc : DOM.Core.Document;
      Dummy, Element : DOM.Core.Node;
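      --  "50" followed by the Latin-1 degree sign (16#B0#)
      Fifty_Degrees_Latin1 : constant String
        := "50" & Ada.Characters.Latin_1.Degree_Sign;
      --  Convert outside the DOM calls. Latin_1 is ISO-8859-1, but the
      --  degree sign has the same code in ISO-8859-15.
      Fifty_Degrees_UTF8 : constant Unicode.CES.Byte_Sequence
        := Unicode.Encodings.Convert
          (Fifty_Degrees_Latin1,
           From => Unicode.Encodings.Get_By_Name ("iso-8859-15"),
           To => Unicode.Encodings.Get_By_Name ("utf-8"));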
   begin
      Doc := DOM.Core.Create_Document (Impl);

      Element := DOM.Core.Documents.Create_Element (Doc, "utf");
      DOM.Core.Elements.Set_Attribute (Element, "temp", Fifty_Degrees_UTF8);
      Dummy := DOM.Core.Nodes.Append_Child (Doc, Element);

      DOM.Core.Nodes.Print (Doc);
   end Utf;
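
The same approach ought to work for a character outside Latin-1, such
as the Degree_Celsius you mention. This is an untested sketch: the
procedure name Utf_Celsius is made up, and it assumes that
Unicode.Names.Letterlike_Symbols.Degree_Celsius is a Unicode_Char
constant and that Unicode_Char is an integer type. The conversion uses
the standard library rather than Unicode.Encodings.

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with DOM.Core.Documents;
with DOM.Core.Elements;
with DOM.Core.Nodes;
with DOM.Core;
with Unicode.CES;
with Unicode.Names.Letterlike_Symbols;

procedure Utf_Celsius is
   Impl : DOM.Core.DOM_Implementation;
   Doc : DOM.Core.Document;
   Dummy, Element : DOM.Core.Node;
   --  One-character string holding the DEGREE CELSIUS code point
   Celsius : constant Wide_Wide_String
     := (1 => Wide_Wide_Character'Val
                (Unicode.Names.Letterlike_Symbols.Degree_Celsius));
   --  Convert to UTF-8 outside XML/Ada, then prepend the ASCII digits
   Fifty_Celsius_UTF8 : constant Unicode.CES.Byte_Sequence
     := "50" & Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (Celsius);
begin
   Doc := DOM.Core.Create_Document (Impl);

   Element := DOM.Core.Documents.Create_Element (Doc, "utf");
   DOM.Core.Elements.Set_Attribute (Element, "temp", Fifty_Celsius_UTF8);
   Dummy := DOM.Core.Nodes.Append_Child (Doc, Element);

   DOM.Core.Nodes.Print (Doc);
end Utf_Celsius;
```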
