comp.lang.ada
From: briot.emmanuel@gmail.com
Subject: Re: Latest suggestion for 202x
Date: Wed, 19 Jun 2019 04:45:40 -0700 (PDT)
Message-ID: <6be082a3-b8aa-4e7d-825c-bd998894f077@googlegroups.com>
In-Reply-To: <800240ae-4c5f-424e-869f-2791e07a50d2@googlegroups.com>

> 1) Follow String and Unbounded_String, by having a static length Unicode_String which would be UTF8. Then have a number of iterators which act on the basic array:
> 
> a) The normal array iterator, built-in.
> b) Code-point iterator which returns, 32-bit code points.
> c) Grapheme cluster iterators.
> d) Other iterators, i.e. words.
> 
> 2) Then the unbounded version which utilises the static stuff, same set of iterators.
> 
> 3) The character database with access via unicode names and index numbers.
> 
> 4) Unicode regular expression engine.

It seems to me that all of this can be implemented as a library, and doesn't need to be in the language itself. The nice thing with libraries is that users can provide their own implementation tailored to their needs.

When I implemented GNATCOLL.Strings, I was careful to make Unicode support optional via various formal parameters: internally, we store an array of code points. Encoding and decoding to UTF-8, UTF-16 and others is orthogonal to string manipulation (and right now you would have to use some other package for the encoding).
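
To illustrate the separation between storage and encoding, here is a minimal language-neutral sketch in Python. It is not the GNATCOLL.Strings API; the names `decode_utf8`, `encode` and `truncate` are hypothetical and exist only for this example. The string is held as a plain list of code points, and the byte encoding is only dealt with at the boundaries.

```python
from typing import List


def decode_utf8(data: bytes) -> List[int]:
    """Turn UTF-8 bytes into a list of integer code points (hypothetical helper)."""
    return [ord(ch) for ch in data.decode("utf-8")]


def encode(code_points: List[int], encoding: str = "utf-8") -> bytes:
    """Turn a list of code points back into bytes in the requested encoding."""
    return "".join(chr(cp) for cp in code_points).encode(encoding)


def truncate(code_points: List[int], count: int) -> List[int]:
    """String manipulation works on code points and never sees the encoding."""
    return code_points[:count]


raw = "na\u00efve".encode("utf-8")   # 6 bytes: U+00EF takes two bytes in UTF-8
cps = decode_utf8(raw)               # 5 code points
head = truncate(cps, 3)              # cut after the third code point, not the third byte
```

Slicing `raw` directly at byte 3 would cut the two-byte sequence for U+00EF in half, whereas `truncate` cannot produce an invalid result because it never touches encoded bytes.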

The various iterators you suggest are nice, but can also be implemented on top of it (iterating by word, sentence, paragraph,... is tricky, and irrelevant for most applications). You would provide a `Word_Iterator_Type`, with a GNAT `Iterable` aspect to use, and a function that takes a string and returns that iterator.
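
The shape of such a library-level iterator can be sketched as follows, again in Python rather than Ada, so this shows the design and not the GNAT `Iterable` aspect itself. The names `WordIterator` and `words` are hypothetical, and the word rule used here (maximal runs of alphanumeric characters) is a deliberate simplification, nowhere near full Unicode word segmentation.

```python
class WordIterator:
    """Cursor over the words of a string (simplified rule: alphanumeric runs)."""

    def __init__(self, text: str) -> None:
        self._text = text
        self._pos = 0

    def __iter__(self) -> "WordIterator":
        return self

    def __next__(self) -> str:
        text, end = self._text, len(self._text)
        # Skip separators (anything that is not alphanumeric).
        while self._pos < end and not text[self._pos].isalnum():
            self._pos += 1
        if self._pos >= end:
            raise StopIteration
        # Consume one maximal run of alphanumeric characters.
        start = self._pos
        while self._pos < end and text[self._pos].isalnum():
            self._pos += 1
        return text[start:self._pos]


def words(text: str) -> WordIterator:
    """The function that takes a string and returns the iterator."""
    return WordIterator(text)


result = list(words("The more, the merrier!"))
```

Because the iterator lives entirely outside the string type, a different library could ship a stricter or a cheaper word rule without any change to the string package.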

Regexps are very difficult to implement for Unicode, so I would suggest a binding to an existing library like PCRE.

I would love to see more such libraries, and this is why I started GNATCOLL in the first place. The more the merrier, even when they compete with each other: users will have more choices. If this is part of the language, it is harder to provide competitors.
And distributions can package the compiler along with a number of such libs to make things easier for newcomers to the language.

Project Alire (https://github.com/alire-project/alire) might be a nice way to contribute such libs. GNATCOLL has the same drawback as other libraries regularly mentioned here: it tends to be too monolithic.
