Lower bounds of Strings

comp.lang.ada

* Lower bounds of Strings
@ 2021-01-05 11:04 Stephen Davies
  2021-01-05 11:57 ` Dmitry A. Kazakov
                   ` (5 more replies)
  0 siblings, 6 replies; 66+ messages in thread
From: Stephen Davies @ 2021-01-05 11:04 UTC



I'm sure this must have been discussed before, but the issue doesn't
seem to have been resolved and I think it makes Ada code look ugly and
frankly reflects poorly on the language.

I'm referring to the fact that any subprogram with a String
parameter, e.g. Expiration_Date, has to use something like
Expiration_Date (Expiration_Date'First .. Expiration_Date'First + 1)
to refer to the first two characters rather than simply saying
Expiration_Date (1..2).

Not only is it ugly, but it's potentially dangerous if code uses the
latter and works for ages until one day somebody passes a slice instead
of a string starting at 1 (yes, compilers might generate warnings,
but that doesn't negate the issue, imho).
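
A minimal sketch of the hazard (Month_Of, Month_Of_Fragile and the Card
data are made up for illustration, not taken from any real code). The
slice Card (4 .. 7) keeps its bounds when passed as a String, so the
hard-coded 1 .. 2 is out of range:

```ada
with Ada.Text_IO;

procedure Lower_Bound_Demo is

   --  Hypothetical helper: bound-relative indexing, works for any lower bound.
   function Month_Of (Date : String) return String is
   begin
      return Date (Date'First .. Date'First + 1);
   end Month_Of;

   --  Hypothetical helper: hard-coded indices, only correct if Date'First = 1.
   function Month_Of_Fragile (Date : String) return String is
   begin
      return Date (1 .. 2);
   end Month_Of_Fragile;

   Card : constant String := "ID:0125";  --  made-up data, bounds 1 .. 7
begin
   Ada.Text_IO.Put_Line (Month_Of (Card (4 .. 7)));   --  slice bounds 4 .. 7
   Ada.Text_IO.Put_Line (Month_Of_Fragile ("0125"));  --  literal starts at 1
   begin
      --  The actual keeps its bounds 4 .. 7, so Date (1 .. 2) fails the check.
      Ada.Text_IO.Put_Line (Month_Of_Fragile (Card (4 .. 7)));
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line ("Constraint_Error: 1 .. 2 not in 4 .. 7");
   end;
end Lower_Bound_Demo;
```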

There must be many possible solutions, without breaking compatibility
for those very rare occasions where code actually makes use of the
lower bound of a string.

e.g. Perhaps the following could be made legal and added to Standard:

subtype Mono_String is String (1 .. <>);

One question with this would be whether or not to allow procedure bodies
to specify parameters as Mono_String when the corresponding procedure
declaration uses String.


* Re: Lower bounds of Strings
  2021-01-05 11:04 Lower bounds of Strings Stephen Davies
@ 2021-01-05 11:57 ` Dmitry A. Kazakov
  2021-01-05 12:32   ` Jeffrey R. Carter
  2021-01-05 12:24 ` Luke A. Guest
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-05 11:57 UTC


On 2021-01-05 12:04, Stephen Davies wrote:
> 
> I'm sure this must have been discussed before, but the issue doesn't
> seem to have been resolved and I think it makes Ada code look ugly and
> frankly reflects poorly on the language.

There is no issue, it must be this way.

> I'm referring to the fact that any subprogram with a String
> parameter, e.g. Expiration_Date, has to use something like
> Expiration_Date (Expiration_Date'First .. Expiration_Date'First + 1)
> to refer to the first two characters rather than simply saying
> Expiration_Date (1..2).

This is a different operation. See function Head in Ada.Strings.Fixed. It
does what you want.
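
A short sketch of that call, assuming the standard profile
Head (Source, Count, Pad). The string S and the slice bounds are just
example data. Head returns the first Count characters whatever the lower
bound of the argument is, and pads with Pad if the source is shorter:

```ada
with Ada.Strings.Fixed;
with Ada.Text_IO;

procedure Head_Demo is
   S : constant String := "abcdefgh";
begin
   --  The argument has bounds 3 .. 8; Head still yields its first two
   --  characters, here 'c' and 'd'.
   Ada.Text_IO.Put_Line (Ada.Strings.Fixed.Head (S (3 .. 8), 2));
end Head_Demo;
```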

> Not only is it ugly, but it's potentially dangerous if code uses the
> latter and works for ages until one day somebody passes a slice instead
> of a string starting at 1 (yes, compilers might generate warnings,
> but that doesn't negate the issue, imho).

Yes, slicing using constant values is dangerous. Slicing using indices
is safe and consistent.

> There must be many possible solutions, without breaking compatibility
> for those very rare occasions where code actually makes use of the
> lower bound of a string.
> 
> e.g. Perhaps the following could be made legal and added to Standard:
> 
> subtype Mono_String is String (1 .. <>);

This is a constraint, a meaningless and dangerous one:

    procedure Foo (X : Mono_String);
    S : String := "abcdefgh";
begin
    Foo (S (2..S'Last)); -- Boom! Constraint_Error

> One question with this would be whether or not to allow procedure bodies
> to specify parameters as Mono_String when the corresponding procedure
> declaration uses String.

There are two separate issues.

1. Explicit forced index sliding. There is no operation for that, though 
it could be easily implemented, e.g.

    function "abs" (S : String) return String is
       Result : constant String (1..S'Length) := S;
    begin
       return Result;
    end "abs";

2. Position-based indices and slices. One could add [] brackets for that
stuff to make the C crowd happy.

    S [1] - First array element
    S (1) - Array element at index 1

Same with slices. Unlike #1, this would not require excessive
string copying.

------------------
The real problem, though, is that integer types have traditionally been
used for array indices since the very beginning of computing.

They never should be because indices are not additive:

    A'First + A'First

is garbage. It is the same as with Time and Duration. Time is an index (a
fixed point on the time axis). Duration is a position, an offset from some
unspecified epoch. You can do Time + Duration and Duration + Duration,
but not Time + Time. So addition must apply to an index and a position, or
to two positions only.
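
The Time/Duration half of the analogy can be sketched with Ada.Calendar
(the object names are invented for the example). The commented-out line
is the analogue of A'First + A'First and would be rejected at compile
time:

```ada
with Ada.Calendar; use Ada.Calendar;
with Ada.Text_IO;

procedure Time_Demo is
   Start : constant Time     := Clock;
   Step  : constant Duration := 1.5;

   Later : constant Time     := Start + Step;   --  Time + Duration -> Time
   Twice : constant Duration := Step + Step;    --  Duration + Duration
   Gap   : constant Duration := Later - Start;  --  Time - Time -> Duration

   --  Oops : constant Time := Start + Later;
   --  Not legal: Ada.Calendar declares no "+" taking two Time operands.
begin
   Ada.Text_IO.Put_Line (Duration'Image (Twice));
   Ada.Text_IO.Put_Line (Duration'Image (Gap));
end Time_Demo;
```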

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-05 11:04 Lower bounds of Strings Stephen Davies
  2021-01-05 11:57 ` Dmitry A. Kazakov
@ 2021-01-05 12:24 ` Luke A. Guest
  2021-01-05 12:49 ` Simon Wright
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 66+ messages in thread
From: Luke A. Guest @ 2021-01-05 12:24 UTC


On 05/01/2021 11:04, Stephen Davies wrote:
> 
> I'm sure this must have been discussed before, but the issue doesn't
> seem to have been resolved and I think it makes Ada code look ugly and
> frankly reflects poorly on the language.

Wrong. It highlights how poor programmers are, especially from other 
languages which love to hard code numbers.

> I'm referring to the fact that any subprogram with a String
> parameter, e.g. Expiration_Date, has to use something like
> Expiration_Date (Expiration_Date'First .. Expiration_Date'First + 1)
> to refer to the first two characters rather than simply saying
> Expiration_Date (1..2).

What if the length changes? By using a constant, say it was 'Last being 
used, if that last changes, the program won't crash.

Luke.


* Re: Lower bounds of Strings
  2021-01-05 11:57 ` Dmitry A. Kazakov
@ 2021-01-05 12:32   ` Jeffrey R. Carter
  2021-01-05 13:40     ` Dmitry A. Kazakov
  0 siblings, 1 reply; 66+ messages in thread
From: Jeffrey R. Carter @ 2021-01-05 12:32 UTC


On 1/5/21 12:57 PM, Dmitry A. Kazakov wrote:
> 
> This is a constraint, a meaningless and dangerous one:
> 
>     procedure Foo (X : Mono_String);
>     S : String := "abcdefgh";
> begin
>     Foo (S (2..S'Last)); -- Boom! Constraint_Error

Surely the slice would slide, as it does in

with Ada.Text_IO;

procedure Slider is
    subtype S7 is String (1 .. 7);

    procedure Foo (X : in S7);

    procedure Foo (X : in S7) is
       -- Empty
    begin -- Foo
       Ada.Text_IO.Put (Item => X (7) );
       Ada.Text_IO.New_Line;
    end Foo;

    S : constant String := "abcdefgh";
begin -- Slider
    Foo (X => S (2 .. 8) );
end Slider;

~/Code$ gnatmake -gnatan -gnato2 -O2 -fstack-check slider.adb
x86_64-linux-gnu-gcc-9 -c -gnatan -gnato2 -O2 -fstack-check slider.adb
x86_64-linux-gnu-gnatbind-9 -x slider.ali
x86_64-linux-gnu-gnatlink-9 slider.ali -O2 -fstack-check
~/Code$ ./slider
h

-- 
Jeff Carter
"I'm a vicious jungle beast!"
Play It Again, Sam
131


* Re: Lower bounds of Strings
  2021-01-05 11:04 Lower bounds of Strings Stephen Davies
  2021-01-05 11:57 ` Dmitry A. Kazakov
  2021-01-05 12:24 ` Luke A. Guest
@ 2021-01-05 12:49 ` Simon Wright
  2021-01-05 12:51 ` Jeffrey R. Carter
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 66+ messages in thread
From: Simon Wright @ 2021-01-05 12:49 UTC

Stephen Davies <joviangm@gmail.com> writes:

> Not only is it ugly, but it's potentially dangerous if code uses the
> latter and works for ages until one day somebody passes a slice
> instead of a string starting at 1 (yes, compilers might generate
> warnings, but that doesn't negate the issue, imho).

I suppose you could use a precondition.
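
A sketch of what such a precondition might look like in Ada 2012.
Put_Month and the Line data are hypothetical, and whether the check is
actually performed depends on the assertion policy in force, here set
locally with pragma Assertion_Policy:

```ada
with Ada.Assertions;
with Ada.Text_IO;

procedure Pre_Demo is
   pragma Assertion_Policy (Check);  --  enable contract checks in this scope

   --  Hypothetical subprogram: the contract states the assumption that
   --  callers pass a string whose lower bound is 1.
   procedure Put_Month (Expiration_Date : String)
     with Pre => Expiration_Date'First = 1
                 and then Expiration_Date'Length >= 2;

   procedure Put_Month (Expiration_Date : String) is
   begin
      Ada.Text_IO.Put_Line (Expiration_Date (1 .. 2));
   end Put_Month;

   Line : constant String := "exp=0125";  --  made-up data
begin
   Put_Month ("0125");  --  a literal actual has lower bound 1
   begin
      Put_Month (Line (5 .. 8));  --  lower bound 5: precondition fails
   exception
      when Ada.Assertions.Assertion_Error =>
         Ada.Text_IO.Put_Line ("precondition failed: lower bound is not 1");
   end;
end Pre_Demo;
```

This turns the latent Constraint_Error into an explicit contract
violation at the call, but it does not remove the need for callers to
pass a string starting at 1.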

I had a similar problem with the Ada 2005 Math Extensions[1], where the Ada
code supports arbitrary ranges on the input & output matrices (well, I
did insist that the output ranges match the inputs!) but the
underlying Fortran code assumes that ranges start at 1 .. hmm. I see
from a perfunctory search that F90 allows arbitrary ranges. Still,
LAPACK seems not to ...[2]

I see that current GNATs give a lot of warnings about use of an
anonymous access type allocator.

[1] http://gnat-math-extn.sourceforge.net/index.html/
[2] http://www.netlib.org/lapack/explore-html/d3/dfb/group__real_g_eeigen_ga6176eadcb5a027beb0b000fbf74f9e35.html#ga6176eadcb5a027beb0b000fbf74f9e35


* Re: Lower bounds of Strings
  2021-01-05 11:04 Lower bounds of Strings Stephen Davies
                   ` (2 preceding siblings ...)
  2021-01-05 12:49 ` Simon Wright
@ 2021-01-05 12:51 ` Jeffrey R. Carter
  2021-01-06  3:08 ` Randy Brukardt
  2021-01-14 11:38 ` AdaMagica
  5 siblings, 0 replies; 66+ messages in thread
From: Jeffrey R. Carter @ 2021-01-05 12:51 UTC


On 1/5/21 12:04 PM, Stephen Davies wrote:
> 
> subtype Mono_String is String (1 .. <>);

This already exists, although with a slightly different syntax:

type String_From_1 (Length : Natural) is record
    Value : String (1 .. Length);
end record;

But perhaps it would be a good idea for a new language to have separate sequence 
types with positions but no indices? Such a language would only need constrained 
array types to serve as maps.

-- 
Jeff Carter
"I'm a vicious jungle beast!"
Play It Again, Sam
131


* Re: Lower bounds of Strings
  2021-01-05 12:32   ` Jeffrey R. Carter
@ 2021-01-05 13:40     ` Dmitry A. Kazakov
  2021-01-05 14:31       ` Stephen Davies
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-05 13:40 UTC


On 2021-01-05 13:32, Jeffrey R. Carter wrote:
> On 1/5/21 12:57 PM, Dmitry A. Kazakov wrote:
>>
>> This is a constraint, a meaningless and dangerous one:
>>
>>     procedure Foo (X : Mono_String);
>>     S : String := "abcdefgh";
>> begin
>>     Foo (S (2..S'Last)); -- Boom! Constraint_Error
> 
> Surely the slice would slide, as it does in
> 
> with Ada.Text_IO;
> 
> procedure Slider is
>     subtype S7 is String (1 .. 7);
> 
>     procedure Foo (X : in S7);
> 
>     procedure Foo (X : in S7) is
>        -- Empty
>     begin -- Foo
>        Ada.Text_IO.Put (Item => X (7) );
>        Ada.Text_IO.New_Line;
>     end Foo;
> 
>     S : constant String := "abcdefgh";
> begin -- Slider
>     Foo (X => S (2 .. 8) );
> end Slider;
> 
> ~/Code$ gnatmake -gnatan -gnato2 -O2 -fstack-check slider.adb
> x86_64-linux-gnu-gcc-9 -c -gnatan -gnato2 -O2 -fstack-check slider.adb
> x86_64-linux-gnu-gnatbind-9 -x slider.ali
> x86_64-linux-gnu-gnatlink-9 slider.ali -O2 -fstack-check
> ~/Code$ ./slider
> h

Yes, but here S7 is definite, it is a quite different case. Sliding 
indefinite subtypes without copies, with access types allowed?

And even with definite subtypes it is broken:

    with Ada.Text_IO;  use Ada.Text_IO;

    procedure Main is
       subtype S7 is String (1..7);
       S : constant String := "abcdefgh";
       V : S7 renames S (2..8);
    begin
       Put_Line ("Is it broken? " & Boolean'Image (S (7) = V (7)));
    end Main;

This will print: Is it broken? TRUE

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-05 13:40     ` Dmitry A. Kazakov
@ 2021-01-05 14:31       ` Stephen Davies
  2021-01-05 17:24         ` Stephen Davies
  0 siblings, 1 reply; 66+ messages in thread
From: Stephen Davies @ 2021-01-05 14:31 UTC


On Tuesday, 5 January 2021 at 12:24:52 UTC, Luke A. Guest wrote:
> On 05/01/2021 11:04, Stephen Davies wrote: 
> > 
> > I'm referring to the fact that any subprogram with a String 
> > parameter, e.g. Expiration_Date, has to use something like 
> > Expiration_Date (Expiration_Date'First .. Expiration_Date'First + 1) 
> > to refer to the first two characters rather than simply saying 
> > Expiration_Date (1..2).
> What if the length changes? By using a constant, say it was 'Last being 
> used, if that last changes, the program won't crash. 
> 
That was simply an example, though it does highlight that you currently
cannot have global constants "Month_First : constant Positive := 1" and
"Month_Last : constant Positive := 2".

On Tuesday, 5 January 2021 at 11:57:18 UTC, Dmitry A. Kazakov wrote:
>
> Position-based indices and slices. One could add [] brackets for that
> stuff to make the C crowd happy.
> 
> S [1] - First array element 
> S (1) - Array element at index 1 
> 
> Same with slices. Unlike #1, this would not require excessive
> string copying.
Yes, I like that idea as a possible alternative solution.


* Re: Lower bounds of Strings
  2021-01-05 14:31       ` Stephen Davies
@ 2021-01-05 17:24         ` Stephen Davies
  2021-01-05 18:28           ` Jeffrey R. Carter
  0 siblings, 1 reply; 66+ messages in thread
From: Stephen Davies @ 2021-01-05 17:24 UTC


On Tuesday, 5 January 2021 at 14:31:27 UTC, Stephen Davies wrote:
> On Tuesday, 5 January 2021 at 11:57:18 UTC, Dmitry A. Kazakov wrote: 
> > S [1] - First array element 
> > S (1) - Array element at index 1 
> Yes, I like that idea as a possible alternative solution.

To take this a bit further, suppose we have an array on "Natural range <>"
rather than "Positive range <>", I guess S[0] would now be the first
element of slice S, which still seems reasonable.

Alternatively, thinking about my original suggestion, I guess the modern
Ada way would be to allow an aspect along the lines of:
subtype Mono_String is String with First => 1;


* Re: Lower bounds of Strings
  2021-01-05 17:24         ` Stephen Davies
@ 2021-01-05 18:28           ` Jeffrey R. Carter
  2021-01-05 21:02             ` Stephen Davies
  0 siblings, 1 reply; 66+ messages in thread
From: Jeffrey R. Carter @ 2021-01-05 18:28 UTC


On 1/5/21 6:24 PM, Stephen Davies wrote:
> 
> Alternatively, thinking about my original suggestion, I guess the modern
> Ada way would be to allow an aspect along the lines of:
> subtype Mono_String is String with First => 1;

You can get the effect of this with a predicate, but slices won't slide to match.
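
A sketch of the predicate approach, assuming Ada 2012 and a compiler
with predicate checks enabled (Put_First_Two and Text are invented for
the example). The predicate only checks the lower bound; the slice keeps
its bounds 2 .. 8, so the second call is expected to fail rather than
slide:

```ada
with Ada.Assertions;
with Ada.Text_IO;

procedure Predicate_Demo is
   pragma Assertion_Policy (Check);  --  enable predicate checks in this scope

   --  The predicate requires a lower bound of 1. It is a check only and
   --  does not make actual parameters slide.
   subtype Mono_String is String
     with Dynamic_Predicate => Mono_String'First = 1;

   procedure Put_First_Two (S : Mono_String) is
   begin
      Ada.Text_IO.Put_Line (S (1 .. 2));
   end Put_First_Two;

   Text : constant String := "abcdefgh";  --  bounds 1 .. 8
begin
   Put_First_Two (Text);  --  lower bound is 1, predicate holds
   begin
      Put_First_Two (Text (2 .. 8));  --  bounds stay 2 .. 8
   exception
      when Ada.Assertions.Assertion_Error =>
         Ada.Text_IO.Put_Line ("predicate failed: no sliding took place");
   end;
end Predicate_Demo;
```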

-- 
Jeff Carter
"He didn't get that nose from playing ping-pong."
Never Give a Sucker an Even Break
110


* Re: Lower bounds of Strings
  2021-01-05 18:28           ` Jeffrey R. Carter
@ 2021-01-05 21:02             ` Stephen Davies
  2021-01-07 10:38               ` Stephen Davies
  0 siblings, 1 reply; 66+ messages in thread
From: Stephen Davies @ 2021-01-05 21:02 UTC


On Tuesday, 5 January 2021 at 18:28:37 UTC, Jeffrey R. Carter wrote:
> On 1/5/21 6:24 PM, Stephen Davies wrote: 
> > subtype Mono_String is String with First => 1;
> You can get the effect of this with a predicate, but slices won't slide to match.
Sliding is the (my) main goal.
Perhaps a different aspect name would be clearer:
subtype Mono_String is String with Slide => True;


* Re: Lower bounds of Strings
  2021-01-05 11:04 Lower bounds of Strings Stephen Davies
                   ` (3 preceding siblings ...)
  2021-01-05 12:51 ` Jeffrey R. Carter
@ 2021-01-06  3:08 ` Randy Brukardt
  2021-01-06  9:13   ` Dmitry A. Kazakov
  2021-01-14 11:38 ` AdaMagica
  5 siblings, 1 reply; 66+ messages in thread
From: Randy Brukardt @ 2021-01-06  3:08 UTC


IMHO, "String" shouldn't be an array at all. In a UTF-8 world, it makes 
little sense to index into a string - it would be expensive to do it based 
on characters (since they vary in size), and dangerous to do it based on 
octets (since you could get part of a character).

The only real solution is to never use String in the first place. A number 
of people are building UTF-8 abstractions to replace String, and I expect 
those to become common in the coming years.

Indeed, (as I've mentioned before) I would go further and abandon arrays 
altogether -- containers cover the same ground (or could easily) -- the vast 
complication of operators popping up much after type declarations, 
assignable slices, and supernull arrays all waste resources and cause 
oddities and dangers. It's a waste of time to fix arrays in Ada -- just 
don't use them.

                                                        Randy.



"Stephen Davies" <joviangm@gmail.com> wrote in message 
news:1cc09f04-98f2-4ef3-ac84-9a9ca5aa3fd5n@googlegroups.com...
>
> I'm sure this must have been discussed before, but the issue doesn't
> seem to have been resolved and I think it makes Ada code look ugly and
> frankly reflects poorly on the language.
>
> I'm referring to the fact that any subprogram with a String
> parameter, e.g. Expiration_Date, has to use something like
> Expiration_Date (Expiration_Date'First .. Expiration_Date'First + 1)
> to refer to the first two characters rather than simply saying
> Expiration_Date (1..2).
>
> Not only is it ugly, but it's potentially dangerous if code uses the
> latter and works for ages until one day somebody passes a slice instead
> of a string starting at 1 (yes, compilers might generate warnings,
> but that doesn't negate the issue, imho).
>
> There must be many possible solutions, without breaking compatibility
> for those very rare occasions where code actually makes use of the
> lower bound of a string.
>
> e.g. Perhaps the following could be made legal and added to Standard:
>
> subtype Mono_String is String (1 .. <>);
>
> One question with this would be whether or not to allow procedure bodies
> to specify parameters as Mono_String when the corresponding procedure
> declaration uses String. 



* Re: Lower bounds of Strings
  2021-01-06  3:08 ` Randy Brukardt
@ 2021-01-06  9:13   ` Dmitry A. Kazakov
  2021-01-07  0:17     ` Randy Brukardt
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-06  9:13 UTC

On 2021-01-06 04:08, Randy Brukardt wrote:
> IMHO, "String" shouldn't be an array at all. In a UTF-8 world, it makes
> little sense to index into a string - it would be expensive to do it based
> on characters (since they vary in size), and dangerous to do it based on
> octets (since you could get part of a character).

It will not work. There are no useful integral operations defined on
strings. It is like arguing that an image is not an array of pixels because
you could distort objects in it when altering individual pixels.

> The only real solution is to never use String in the first place. A number
> of people are building UTF-8 abstractions to replace String, and I expect
> those to become common in the coming years.

This will never happen. The Ada standard library already has lots of
integral operations defined on strings. They are practically never used.
The UTF-8 (or whatever encoding) abstraction thing simply does not exist.

> Indeed, (as I've mentioned before) I would go further and abandon arrays
> altogether -- containers cover the same ground (or could easily) -- the vast
> complication of operators popping up much after type declarations,
> assignable slices, and supernull arrays all waste resources and cause
> oddities and dangers. It's a waste of time to fix arrays in Ada -- just
> don't use them.

How are these containers supposed to be implemented? As linked lists?
How is Stream_Element_Array supposed to be an opaque container? How is a
file read operation supposed to assign part of a container?

You cannot get rid of the array interface with all the types involved:
index, index set (range), element, element set (array). A container
without the array interface cannot replace an array. Any language must
support them. The problem is that Ada has array interfaces twice: once
built-in and once as an ugly, lame monstrosity of helper tagged types, a
mockery of an array.

Array implementation is a fundamental building block of computing. That
does not go away either. Of course you could have two languages, one with
arrays to implement containers and one without them for end users. But
this is neither the Ada philosophy nor a concept for any good
universal-purpose language.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-06  9:13   ` Dmitry A. Kazakov
@ 2021-01-07  0:17     ` Randy Brukardt
  2021-01-07  9:57       ` Dmitry A. Kazakov
  0 siblings, 1 reply; 66+ messages in thread
From: Randy Brukardt @ 2021-01-07  0:17 UTC

"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:rt3uv2$1nrd$1@gioia.aioe.org...
> On 2021-01-06 04:08, Randy Brukardt wrote:
>> IMHO, "String" shouldn't be an array at all. In a UTF-8 world, it makes
>> little sense to index into a string - it would be expensive to do it 
>> based
>> on characters (since they vary in size), and dangerous to do it based on
>> octets (since you could get part of a character).
>
> It will not work. There is no useful integral operations defined on 
> strings. It is like arguing that image is not an array of pixels because 
> you could distort objects in there when altering individual pixels.
>
>> The only real solution is to never use String in the first place. A 
>> number
>> of people are building UTF-8 abstractions to replace String, and I expect
>> those to become common in the coming years.
>
> This will never happen. Ada standard library already has lots of integral 
> operations defined on strings. They are practically never used. The UTF-8 
> (or whatever encoding) abstraction thing simply does not exist.
>
>> Indeed, (as I've mentioned before) I would go further and abandon arrays
>> altogether -- containers cover the same ground (or could easily) -- the 
>> vast
>> complication of operators popping up much after type declarations,
>> assignable slices, and supernull arrays all waste resources and cause
>> oddities and dangers. It's a waste of time to fix arrays in Ada -- just
>> don't use them.
>
> How these containers are supposed to be implemented?

Built-in to the implementation, of course. Implementing these things in Ada 
is a nice capability, because that allows simple quick-and-dirty 
implementations. But for things that are commonly used, that necessarily 
leads to lousy performance. One has to have at least some special cases even 
for the Ada.Containers to get adequate performance, so there's no problem 
extending that.

...
> How Stream_Element_Array is supposed to be an opaque container?

It should already be an opaque container. You use language-defined stream 
attributes to implement user-defined stream attributes - not unportable 
direct byte twiddling.

> How file read operation is supposed to assign part of a container?

??? Why would you want to do that? Streaming a bounded vector (which almost 
all existing arrays should be) naturally would read only the active part of 
the vector. Non-streaming reading is left over Ada 83 nonsense; all I/O 
should be built on top of streams (as a practical matter, the vast majority 
is anyway).

> You cannot get rid of the array interface with all the types involved:
> index, index set (range), element, element set (array). A container
> without the array interface cannot replace an array. Any language must
> support them. The problem is that Ada has array interfaces twice: once
> built-in and once as an ugly, lame monstrosity of helper tagged types, a
> mockery of an array.

There is no reason that a container interface cannot have those things --
Ada.Containers.Vectors does. The things that Vectors is missing (mainly the 
ability to use enumeration and modular indexes) was a mistake that I 
complained about repeatedly during the design, but I lost on that.

> Array implementation is a fundamental building block of computing.

Surely. But one does not need the nonsense of requiring an underlying 
implementation (which traditional arrays do) in order to get that building 
block. You always talk about this in terms of an "interface", which is 
essentially the same idea. One cannot have any sort of non-contiguous or
persistent arrays with the Ada interface, since operations like assigning 
into slices are impossible in such representations. One has to give those 
things up in order to have an "interface" rather than the concrete form for 
Ada arrays.

I prefer to not call the result an array, since an array implies a 
contiguous in-memory representation. Of course, some vectors will have such 
a representation, but that needs to be a requirement only for vectors used 
for interfacing. (And those should be used rarely.)

> That does not go either. Of course you could have two languages, one with 
> arrays to implement containers and one without them for end users. But 
> this is neither Ada philosophy nor a concept for any good 
> universal-purpose language.

Compilers implement arrays in Ada; there is no possibility of a user doing it.
I see no difference between that and having the compiler implement a bounded 
vector instead as the fundamental building block. You seem fixated on the 
form of declaration (that is a generic package vs. some sort of built-in 
syntax) -- there's no fundamental difference. There are many Ada packages 
that are built-in to compilers (for Janus/Ada, these include System and 
Ada.Exceptions and Ada.Assertions) -- there's no body or even source of 
these to be seen.

We're not even talking about different syntax for the use of vectors (and it 
would be easy to have some syntax sugar for declarations - we already have a 
proposal on those lines for Ada). Indeed, in a new language, one would 
certainly call these "array" containers (couldn't do that in Ada as the word 
"array" is reserved).

Sometimes, one has to step back and look at the bigger picture and not 
always at the way things have always been done. Arrays (at least as defined 
in Ada) have outlived their usefulness.

                          Randy.


* Re: Lower bounds of Strings
  2021-01-07  0:17     ` Randy Brukardt
@ 2021-01-07  9:57       ` Dmitry A. Kazakov
  2021-01-07 22:03         ` Randy Brukardt
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-07  9:57 UTC


On 2021-01-07 01:17, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:rt3uv2$1nrd$1@gioia.aioe.org...
>> On 2021-01-06 04:08, Randy Brukardt wrote:
>>> IMHO, "String" shouldn't be an array at all. In a UTF-8 world, it makes
>>> little sense to index into a string - it would be expensive to do it
>>> based
>>> on characters (since they vary in size), and dangerous to do it based on
>>> octets (since you could get part of a character).
>>
>> It will not work. There is no useful integral operations defined on
>> strings. It is like arguing that image is not an array of pixels because
>> you could distort objects in there when altering individual pixels.
>>
>>> The only real solution is to never use String in the first place. A
>>> number
>>> of people are building UTF-8 abstractions to replace String, and I expect
>>> those to become common in the coming years.
>>
>> This will never happen. Ada standard library already has lots of integral
>> operations defined on strings. They are practically never used. The UTF-8
>> (or whatever encoding) abstraction thing simply does not exist.
>>
>>> Indeed, (as I've mentioned before) I would go further and abandon arrays
>>> altogether -- containers cover the same ground (or could easily) -- the
>>> vast
>>> complication of operators popping up much after type declarations,
>>> assignable slices, and supernull arrays all waste resources and cause
>>> oddities and dangers. It's a waste of time to fix arrays in Ada -- just
>>> don't use them.
>>
>> How these containers are supposed to be implemented?
> 
> Built-in to the implementation, of course. Implementing these things in Ada
> is a nice capability, because that allows simple quick-and-dirty
> implementations. But for things that are commonly used, that necessarily
> leads to lousy performance. One has to have at least some special cases even
> for the Ada.Containers to get adequate performance, so there's no problem
> extending that.

OK, they cannot be implemented in this new Ada. How is that different from
the present status? Dropping features and making the syntax ugly, is that all?

> ...
>> How Stream_Element_Array is supposed to be an opaque container?
> 
> It should already be an opaque container. You use language-defined stream
> attributes to implement user-defined stream attributes - not unportable
> direct byte twiddling.

No, you are talking about streams here; I am talking about Stream_Element_Array.

>> How file read operation is supposed to assign part of a container?
> 
> ??? Why would you want to do that?

Well, to read a chunk of data from the socket. Where does it go?

> Streaming a bounded vector (which almost
> all existing arrays should be) naturally would read only the active part of
> the vector.

What is the active part of a vector? Does it have some type? How do I pass
it to a subprogram?

> Non-streaming reading is left over Ada 83 nonsense; all I/O
> should be built on top of streams (as a practical matter, the vast majority
> is anyway).

No physical I/O is a stream, with very rare exceptions.

>> You cannot get rid of the array interface with all the types involved:
>> index, index set (range), element, element set (array). A container
>> without the array interface cannot replace an array. Any language must
>> support them. The problem is that Ada has array interfaces twice: once
>> built-in and once as an ugly, lame monstrosity of helper tagged types, a
>> mockery of an array.
> 
> There is no reason that a container interface cannot have those things --
> Ada.Containers.Vectors does.

As an array it is unusable.

> The things that Vectors is missing (mainly the
> ability to use enumeration and modular indexes) was a mistake that I
> complained about repeatedly during the design, but I lost on that.

It has all the disadvantages of not being an array and none of the
advantages of being a tagged type. I do not see how two bads could make
one good.

>> Array implementation is a fundamental building block of computing.
> 
> Surely. But one does not need the nonsense of requiring an underlying
> implementation (which traditional arrays do) in order to get that building
> block. You always talk about this in terms of an "interface", which is
> essentially the same idea. One cannot have any sort of non-contiguous or
> persistent arrays with the Ada interface, since operations like assigning
> into slices are impossible in such representations. One has to give those
> things up in order to have an "interface" rather than the concrete form for
> Ada arrays.

No, one should have interfaces for such operations as well. You cannot
do that with a single type and single dispatch. That is another reason
why you cannot replace built-in arrays with anything before you resolve
the type system issues. The elephant in the room is that you cannot
spell an array type in Ada. So your solution is supposed to be "let's have
no arrays". That is no solution at all.

> I prefer to not call the result an array, since an array implies a
> contiguous in-memory representation. Of course, some vectors will have such
> a representation, but that needs to be a requirement only for vectors used
> for interfacing. (And those should be used rarely.)

And how will these special vectors differ from other vectors? If Read
takes a special vector (does it?), can I pass a non-special vector instead?
It leaks. Ada 83 resolved all this by compiler magic. Modern Ada has
nothing more to offer.

>> That does not go either. Of course you could have two languages, one with
>> arrays to implement containers and one without them for end users. But
>> this is neither Ada philosophy nor a concept for any good
>> universal-purpose language.
> 
> Compilers implement arrays in Ada; there is no possibility of a user doing it.
> I see no difference between that and having the compiler implement a bounded
> vector instead as the fundamental building block.

See above. Is bounded vector a vector? No, you cannot make it. So you 
are right back with ugly half-baked arrays named bounded vectors and 
even uglier unbounded vectors with zillion generic functions to convert 
one to another.

> You seem fixated on the
> form of declaration (that is a generic package vs. some sort of built-in
> syntax) -- there's no fundamental difference.

If syntax were the only problem it would be easy to resolve by adding it 
to generics. The problem is not syntax but lacking functionality and 
types and subtypes involved. Generics simply cannot do anything 
resembling Ada 83 arrays.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-05 21:02             ` Stephen Davies
@ 2021-01-07 10:38               ` Stephen Davies
  2021-01-07 21:39                 ` Randy Brukardt
  0 siblings, 1 reply; 66+ messages in thread
From: Stephen Davies @ 2021-01-07 10:38 UTC (permalink / raw)


Stephen Davies wrote: 
> subtype Mono_String is String (1 .. <>);
> subtype Mono_String is String with First => 1;
> subtype Mono_String is String with Slide => True;

Dmitry A. Kazakov wrote:
> Position-based indices and slices. One could add [] brackets for that 
> stuff to make C crowd happy. 
> S [1] - First array element 
> S (1) - Array element at index 1 

Alternatively, to make the distinction more explicit, perhaps:
S[Slide:1] - First array element

Or maybe a new attribute:
S'Slide(1)

Reminder, my justification is not only for strings being indexed by
numeric literals, e.g.:
subtype Some_Range is Positive range 4..5;
...
Some_String(Some_Range) -- fails if Some_String'First /= 1
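
A minimal sketch of the hazard and of two workarounds available today. The
names Slice_Bounds_Demo, First_Two, First_Two_Slid and the sample data are
hypothetical, made up for illustration only:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Slice_Bounds_Demo is

   --  Index relative to S'First, so the result does not depend on the
   --  lower bound of the actual parameter.
   function First_Two (S : String) return String is
   begin
      return S (S'First .. S'First + 1);
   end First_Two;

   --  Alternative: slide the parameter into a 1-based constrained
   --  subtype once, then use literal indices on the local copy.
   function First_Two_Slid (S : String) return String is
      subtype Slid is String (1 .. S'Length);
      T : constant Slid := S;   --  bounds slide to 1 .. S'Length
   begin
      return T (1 .. 2);
   end First_Two_Slid;

   Date : constant String := "Exp: 0125";
begin
   --  Date (6 .. 9) has 'First = 6, so writing S (1 .. 2) inside a
   --  callee would raise Constraint_Error for this actual parameter.
   Put_Line (First_Two (Date (6 .. 9)));
   Put_Line (First_Two_Slid (Date (6 .. 9)));
end Slice_Bounds_Demo;
```

Both helpers assume the string has at least two characters. The sliding
variant copies the data, which is the price paid for 1-based literals.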


* Re: Lower bounds of Strings
  2021-01-07 10:38               ` Stephen Davies
@ 2021-01-07 21:39                 ` Randy Brukardt
  2021-01-07 22:38                   ` Stephen Davies
  0 siblings, 1 reply; 66+ messages in thread
From: Randy Brukardt @ 2021-01-07 21:39 UTC (permalink / raw)


"Stephen Davies" <joviangm@gmail.com> wrote in message 
news:2ef3d694-b382-4274-adc1-92345f31a602n@googlegroups.com...
...
> Reminder, my justification is not only for strings being indexed by
> numeric literals, e.g.:
> subtype Some_Range is Positive range 4..5;
> ...
> Some_String(Some_Range) -- fails if Some_String'First /= 1

I don't follow this at all. You say your justification "is not only for 
strings" but only give a string example. And what you want is essentially 
the semantics of a vector container (more specifically, a bounded vector 
container), but want to add a pile of additional complication to something 
that is already far too complicated. What's wrong with building around a 
bounded vector??? Ada (or its successor) needs to be simpler, not more 
complicated.

                           Randy. 



* Re: Lower bounds of Strings
  2021-01-07  9:57       ` Dmitry A. Kazakov
@ 2021-01-07 22:03         ` Randy Brukardt
  2021-01-08  9:04           ` Dmitry A. Kazakov
  2021-01-08 17:23           ` Shark8
  0 siblings, 2 replies; 66+ messages in thread
From: Randy Brukardt @ 2021-01-07 22:03 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:rt6ltg$922$1@gioia.aioe.org...
...
> No, one should have interfaces for such operations as well.

You're again losing sight of the ultimate goal here, which is to have a 
conventionally compiled language that works close to the metal for high 
performance. (And is compatible with embedded systems, runnable from ROM, 
etc.)

Interfaces of any kind are completely counter to that goal, *especially* in 
the case of arrays/vector containers.

Any sort of multiple inheritance (not to mention multiple dispatch) requires 
searching a lookup table for the appropriate interface. That is unacceptably 
expensive for an operation as basic as array indexing. One could use 
just-in-time compilation and similar techniques to reduce those costs, but 
those sorts of things are not usable in a ROM environment and are much more 
appropriate for a language like Python.

If you aren't using the interface as such (that is, with some form of 
dispatching), then it is simply a complication with no semantic meaning. 
There's no problem thinking about concepts as some sort of logical 
interface, but it is a distinction without meaning in that case. (By that 
logic, Ada 202x has interfaces for Image, indexing, dereferencing, literals, 
and streaming that can be applied to any tagged type.)

So either you are talking about a complication without value, or an 
extremely expensive implementation that doesn't meet the goals for a 
language like Ada. What's the point?

BTW, you have yet to show me any useful example that you can't reasonably do 
with a bounded vector (assuming that vector supports any discrete index 
type). For almost any possible language feature, there's always some example 
where it works better than the alternative using the basic features of the 
language. But that's not the question, the question is whether it is worth 
it in terms of language complexity, opportunity cost (time spent 
implementing feature X is time not spent on features A, B, and C), and 
usability (too many special case features make it harder to learn and use a 
language).

Slices fall on the wrong side of this boundary for me; the nonsense required 
to implement them seems reasonable at first, but that impression rapidly 
disappears because of the many other things that cannot be done in their 
presence. And they're mainly useful for strings, which are not arrays to 
begin with.

> You cannot do that with a single type and single dispatch.

Exactly my point. The implementation of multiple inheritance and multiple 
dispatch is simply too expensive for a language like Ada, and that's 
especially true in the case of basic building blocks like arrays/vectors.

The language you want is not feasible to implement IMHO. A language without 
a feasible implementation doesn't exist practically, and there's little 
sense in talking about it.

                                         Randy.



* Re: Lower bounds of Strings
  2021-01-07 21:39                 ` Randy Brukardt
@ 2021-01-07 22:38                   ` Stephen Davies
  0 siblings, 0 replies; 66+ messages in thread
From: Stephen Davies @ 2021-01-07 22:38 UTC (permalink / raw)


On Thursday, 7 January 2021 at 21:39:26 UTC, Randy Brukardt wrote:
> "Stephen Davies" <jovi...@gmail.com> wrote:
> ...
> > Reminder, my justification is not only for strings being indexed by 
> > numeric literals, e.g.: 
> > subtype Some_Range is Positive range 4..5; 
> > ... 
> > Some_String(Some_Range) -- fails if Some_String'First /= 1
> I don't follow this at all. You say your justification "is not only for 
> strings" but only give a string example.

I said "not only for strings being indexed by *numeric literals*",
then gave an example where the index was a subtype.

> And what you want is essentially the semantics of a vector container

For the foreseeable future, people are going to use arrays,
especially Strings. I want a small change that will make code
more readable and could prevent erroneous behaviour.


* Re: Lower bounds of Strings
  2021-01-07 22:03         ` Randy Brukardt
@ 2021-01-08  9:04           ` Dmitry A. Kazakov
  2021-01-08 17:23           ` Shark8
  1 sibling, 0 replies; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-08  9:04 UTC (permalink / raw)

On 2021-01-07 23:03, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:rt6ltg$922$1@gioia.aioe.org...
> ...
>> No, one should have interfaces for such operations as well.
> 
> You're again losing sight of the ultimate goal here, which is to have a
> conventionally compiled language that works close to the metal for high
> performance. (And is compatible with embedded systems, runnable from ROM,
> etc.)
> 
> Interfaces of any kind are completely counter to that goal, *especially* in
> the case of arrays/vector containers.

Ada 83 achieved that goal easily while having most of the array interfaces 
implemented. The missing interfaces were subarrays and planes of 
multidimensional arrays and proper types of index ranges.

> Any sort of multiple inheritance (not to mention multiple dispatch) requires
> searching a lookup table for the appropriate interface. That is unacceptably
> expensive for an operation as basic as array indexing. One could use
> just-in-time compilation and similar techniques to reduce those costs, but
> those sort of things are not usable in a ROM environment and are much more
> appropriate for a language like Python.

You are talking here about having run-time classes of array types and 
dynamic dispatch on them. That is another story.

[ Though Ada was not shy of introducing ugly tagged iterators, if one 
can do one, why not another? ]

> If you aren't using the interface as such (that is, with some form of
> dispatching), then it is simply a complication with no semantic meaning.

The semantic meaning is defining the operations the type has and the 
class it belongs to. With arrays you need that because there are lots of 
type conversions between array and array-related types. An interface is 
a mere formalization of what can be converted into what. You need that 
regardless of the method you use to implement arrays.

The point is that you cannot throw that away without making your 
containers totally useless, you will end up with the notorious 
"Pascal-arrays" pretty soon.

> So either you are talking about a complication without value, or an
> extremely expensive implementation that doesn't meet the goals for a
> language like Ada. What's the point?

It was not expensive in Ada 83. Speaking of the goals, the goal is to 
have user-defined arrays intermixable with the built-in ones.

> BTW, you have yet to show me any useful example that you can't reasonably do
> with a bounded vector (assuming that vector supports any discrete index
> type).

About 90% of anything is not possible in the existing form. Without 
slices you must pass array bounds (or offset/length) down to each call 
as in C.

+ 100% of bindings are impossible.

>> You cannot do that with a single type and single dispatch.
> 
> Exactly my point. The implementation of multiple inheritance and multiple
> dispatch is simply too expensive for a language like Ada, and that's
> especially true in the case of basic building blocks like arrays/vectors.

Remember, we are talking about static cases. Your containers with or 
without generics cannot do dynamic dispatch. So everything is static, 
thus, voilà, it costs nothing.

The problem I see is not performance costs, but the fact nobody knows how 
to implement multiple dispatch consistently.

> The language you want is not feasible to implement IMHO. A language without
> a feasible implementation doesn't exist practically, and there's little
> sense in talking about it.

I am pretty much satisfied with Ada 83 arrays. What I want is compiler 
magic open to user-defined types. You propose to kill the first without 
offering the second.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-07 22:03         ` Randy Brukardt
  2021-01-08  9:04           ` Dmitry A. Kazakov
@ 2021-01-08 17:23           ` Shark8
  2021-01-08 20:19             ` Dmitry A. Kazakov
  2021-01-09  2:31             ` Randy Brukardt
  1 sibling, 2 replies; 66+ messages in thread
From: Shark8 @ 2021-01-08 17:23 UTC (permalink / raw)


On Thursday, January 7, 2021 at 3:03:54 PM UTC-7, Randy Brukardt wrote:
> "Dmitry A. Kazakov" wrote in message
> news:rt6ltg$922$... 
> ...
> > No, one should have interfaces for such operations as well.
> You're again losing sight of the ultimate goal here, which is to have a 
> conventionally compiled language that works close to the metal for high 
> performance. (And is compatible with embedded systems, runnable from ROM, 
> etc.) 
> 
> Interfaces of any kind are completely counter to that goal, *especially* in 
> the case of arrays/vector containers.
Why?
An interface, in abstract, says very little about the implementation; it's essentially equivalent to the "set of operations" side of the working-definition of a Type: "A set of possible values and a set of operations on those values." — And aren't arrays commonly used in high-performance, "close to the metal" computing all the time?

It's also easy to imagine a wizard/database system which selects the proper structure/set-of-structures for you given the constraints requested; in pseudo-SQL:
Select from Structures where Look_up.Maximum <= Linear and Deletion.Average <= Linearithmic;

Combining these two: you could, with an abstract interface, have a system that chose an underlying/implementation structure given the requisite properties, of which the interface could be a part. Similar to Ada's Generic parameter "Type X is private" indicating any type which has assignment.

> Any sort of multiple inheritance (not to mention multiple dispatch) requires 
> searching a lookup table for the appropriate interface.
Ah.
It appears you're confusing the Ada concept underlying keyword INTERFACE with the general/abstract notion.
It appears to me that Dmitry is referring to the latter, not the Ada-construct of INTERFACE, which requires a tagged type.

I think what he's getting at is something that I considered/proposed here on C.L.A some years ago, adding the concept of an "ABSTRACT INTERFACE" to Ada. (IIRC, the proposal I had in mind was to be able to model the notional meta type-hierarchy; eg: Number ⊃ Universal_Integer ⊃ System.Integer.)

> If you aren't using the interface as such (that is, with some form of 
> dispatching), then it is simply a complication with no semantic meaning. 
> There's no problem thinking about concepts as some sort of logical 
> interface, but it is a distinction without meaning in that case. (By that 
> logic, Ada 202x has interfaces for Image, indexing, dereferencing, literals, 
> and streaming that can be applied to any tagged type.) 
I don't think he is talking about dispatching.
I also don't think that we need to *have* dispatching to have classes-of-types; the generic formal parameters exemplify that.

> So either you are talking about a complication without value, or an 
> extremely expensive implementation that doesn't meet the goals for a 
> language like Ada. What's the point? 
In the matter of the "ABSTRACT INTERFACE" idea, we could define things in such a way at the language level that:
 (1) we could specify the notional types (eg Universal_Float, Universal_Fixed, etc) in this system, providing the interface of that type for the usage of the language;
 (2) the specification of these notional-types would allow for less verbiage in the standard, and possibly allow proving on the classes they encompass.

I think it would be interesting to also have a sort of "ABSTRACT TYPE" that could allow an Ada compiler/wizard-system (as mentioned above) fill in the gap with some appropriately matching type; this would be good for the usage of prototyping, and probably also optimization, to have a way to say "I need something that I can insert elements into, delete them, and index into, but will never exceed 2**8 items." -- though this might be a bit too much for some people/applications, especially those in fields where you commonly specify bit-patterns and the like (there are times for those, but unless you're doing the low-level stuff, you probably should let the compiler work things out).
 
> BTW, you have yet to show me any useful example that you can't reasonably do 
> with a bounded vector (assuming that vector supports any discrete index 
> type). For almost any possible language feature, there's always some example 
> where it works better than the alternative using the basic features of the 
> language. But that's not the question, the question is whether it is worth 
> it in terms of language complexity, opportunity cost (time spent 
> implementing feature X is time not spent on features A, B, and C), and 
> usability (too many special case features make it harder to learn and use a 
> language). 
True, but making an appropriately abstract/general feature can allow you to subsume/replace older features, defining them in terms of the new feature.
If the ARG were more open to the idea of removing language features, we could be a bit more aggressive here; but that would mean breaking backwards-compatibility, which is rather unpopular.

> Slices fall on the wrong side of this boundary for me; the nonsense required 
> to implement them seems reasonable at first, but that impression rapidly 
> disappears because of the many other things that cannot be done in their 
> presence.
Can you give some examples here?

> And they're mainly useful for strings, which are not arrays to begin with.
The proliferation of strings in Ada is lamentable; I would love to be able to have a set of generics so you could instantiate with [[Wide_]Wide_]Character and get the current-Ada hierarchy.

> > You cannot do that with a single type and single dispatch.
> Exactly my point. The implementation of multiple inheritance and multiple 
> dispatch is simply too expensive for a language like Ada, and that's 
> especially true in the case of basic building blocks like arrays/vectors.
Some of the MI/MD motivation seems to be unnecessarily complicated, in addition to the inherent complications.
Though I do like the PRINT: PRINTER x IMAGE example, where you have the class of images and the class of printers for the parameters of Print, and there are good reasons for choosing to resolve first the printer or the image.


* Re: Lower bounds of Strings
  2021-01-08 17:23           ` Shark8
@ 2021-01-08 20:19             ` Dmitry A. Kazakov
  2021-01-09  2:18               ` Randy Brukardt
  2021-01-09  2:31             ` Randy Brukardt
  1 sibling, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-08 20:19 UTC (permalink / raw)


On 2021-01-08 18:23, Shark8 wrote:
> On Thursday, January 7, 2021 at 3:03:54 PM UTC-7, Randy Brukardt wrote:

>> Any sort of multiple inheritance (not to mention multiple dispatch) requires
>> searching a lookup table for the appropriate interface.
> Ah.
> It appears you're confusing the Ada concept underlying keyword INTERFACE with the general/abstract notion.
> It appears to me that Dmitry is referring to the latter, not the Ada-construct of INTERFACE, which requires a tagged type.
> 
> I think what he's getting at is something that I considered/proposed here on C.L.A some years ago, adding the concept of an "ABSTRACT INTERFACE" to Ada. (IIRC, the proposal I had in mind was to be able to model the notional meta type-hierarchy; eg: Number ⊃ Universal_Integer ⊃ System.Integer.)

Right, though I do not think that tags can inflict any cost. The 
situation is the same as with array bounds. You do not keep bounds when the 
array is statically constrained. A tag is just another constraint like 
bounds. It must be handled in just the same way, removed when statically known. 
No penalty, unless classes are actually used. I do not know why people 
always bring dispatch into discussions about static cases.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-08 20:19             ` Dmitry A. Kazakov
@ 2021-01-09  2:18               ` Randy Brukardt
  2021-01-09 10:53                 ` Dmitry A. Kazakov
  0 siblings, 1 reply; 66+ messages in thread
From: Randy Brukardt @ 2021-01-09  2:18 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:rtaeok$l8b$1@gioia.aioe.org...
> On 2021-01-08 18:23, Shark8 wrote:
>> On Thursday, January 7, 2021 at 3:03:54 PM UTC-7, Randy Brukardt wrote:
>
>>> Any sort of multiple inheritance (not to mention multiple dispatch) 
>>> requires
>>> searching a lookup table for the appropriate interface.
>> Ah.
>> It appears you're confusing the Ada concept underlying keyword INTERFACE 
>> with the general/abstract notion.
>> It appears to me that Dmitry is referring to the latter, not the 
>> Ada-construct of INTERFACE, which requires a tagged type.
>>
>> I think what he's getting at is something that I considered/proposed here 
>> on C.L.A some years ago, adding the concept of an "ABSTRACT INTERFACE" to 
>> Ada. (IIRC, the proposal I had in mind was to be able to model the 
>> notional meta type-hierarchy; eg: Number ⊃ Universal_Integer ⊃ 
>> System.Integer.)
>
> Right, though I do not think that tags can inflict any cost. The situation 
> is the same as with array bounds. You do not keep bounds when the array is 
> statically constrained. Tag is just another constraint like bounds. It 
> must be handled just same way, removed when statically known. No penalty, 
> unless classes are actually used. I do not know why people always bring 
> dispatch into discussions about static cases.

The possibility of dynamic dispatch (in some code that doesn't exist yet, 
but *could*) is what is so expensive and is virtually impossible to remove 
after the fact (that is, in an optimizer). If you wanted to include a 
declaration that you *never* were going to use any dynamic dispatch, then 
you could try to ignore the possibility. But without some sort of 
dispatch, a defined interface buys you nothing other than complication. It 
doesn't simplify the description and would substantially complicate the 
implementation. What's the point??

                              Randy.



* Re: Lower bounds of Strings
  2021-01-08 17:23           ` Shark8
  2021-01-08 20:19             ` Dmitry A. Kazakov
@ 2021-01-09  2:31             ` Randy Brukardt
  2021-01-09 14:52               ` Why UTF-8 (was Re: Lower bounds of Strings) Jeffrey R. Carter
  2021-01-11 21:35               ` Lower bounds of Strings Shark8
  1 sibling, 2 replies; 66+ messages in thread
From: Randy Brukardt @ 2021-01-09  2:31 UTC (permalink / raw)

"Shark8" <onewingedshark@gmail.com> wrote in message 
news:37ada5ff-eee7-4082-ad20-3bd65b5a2778n@googlegroups.com...
On Thursday, January 7, 2021 at 3:03:54 PM UTC-7, Randy Brukardt wrote:
....
>> So either you are talking about a complication without value, or an
>> extremely expensive implementation that doesn't meet the goals for a
>> language like Ada. What's the point?
>In the matter of the "ABSTRACT INTERFACE" idea, we could define
>things in such a way at the language level that:
> (1) we could specify the notional types (eg Universal_Float, 
> Universal_Fixed, etc)
>in this system, providing the interface of that type for the usage of the 
>language;
> (2) the specification of these notional-types would allow for less 
> verbiage in the
> standard, and possibly allow proving on the classes they encompass.

I again don't see the point. The thing that you can't do with the Ada 
universals is name them, because they don't have a dynamic presence. There's 
no problem proving anything with them, they're already explicitly defined in 
the RM. All you would be doing is changing the syntax in the subclauses of 
4.5. For a totally new language, that might be useful, but I fail to see any 
benefit for Ada -- it's a lot of work without changing much of anything.

...
>> Slices fall on the wrong side of this boundary for me; the nonsense required
>> to implement them seems reasonable at first, but that impression rapidly
>> disappears because of the many other things that cannot be done in their
>> presence.
>Can you give some examples here?

All of the super-null nonsense occurs because of slices, which slows down 
any code that uses them. And they are designed to be a "window" into an 
existing contiguous single-dimensioned array. But syntactically, they appear 
to be some sort of general concept. Which is why you always hear people 
talking about slices of multi-dimensional arrays. But those would require 
dealing with discontiguous chunks of an array, which would have a heavy 
distributed overhead (since you can assign into them, just making a copy 
isn't an implementation option). So you would have to pass an assignment 
"thunk" (compiler-generated subprogram) with any writable array parameter --  
even if you never, ever passed a slice to that parameter. The cost would be 
massive.

Any sort of consistent slice interface would run into the same problem, 
since you always want the invariant that

   A(1..3) = A(1) & A(2) & A(3)

and that is extremely difficult to do with writable slices.
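
For the built-in one-dimensional case the invariant does hold, and a slice
assignment writes straight through to the underlying array. A small sketch of
that (Slice_Invariant_Demo and the sample data are hypothetical; it says
nothing about how a generalized slice interface would behave):

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Slice_Invariant_Demo is
   A : String (1 .. 5) := "abcde";
begin
   --  The invariant for an ordinary contiguous array.
   if A (1 .. 3) /= A (1) & A (2) & A (3) then
      raise Program_Error;
   end if;

   --  A slice is a writable window into A, not a copy.
   A (2 .. 4) := "XYZ";

   if A /= "aXYZe" then
      raise Program_Error;
   end if;

   --  The invariant still holds after writing through the slice.
   if A (1 .. 3) /= A (1) & A (2) & A (3) then
      raise Program_Error;
   end if;

   Put_Line (A);
end Slice_Invariant_Demo;
```

Explicit checks are used instead of pragma Assert so that the result does not
depend on whether assertions are enabled.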

If you could only *read* a slice (which would cover 90% of the uses), things 
would be better. But I doubt anyone wants to go in that direction.

>> And they're mainly useful for strings, which are not arrays to begin with.
>The proliferation of strings in Ada is lamentable; I would love to be able to
>have a set of generics so you could instantiate with [[Wide_]Wide_]Character
>and get the current-Ada hierarchy.

That wouldn't work for UTF-8 and UTF-16, which makes it pointless at this 
point. The default String should be UTF-8, the others should be reserved for 
special cases (interfacing in particular). You don't want the default string 
type to restrict the contents, and you don't want it to waste a lot of 
space. As Dmitry has said in the past, operations that go at a character at 
a time through a string are rare, and most of those are properly implemented 
in Ada.Strings -- users shouldn't be reinventing those wheels in the first 
place. (That's especially true as many of them can do octet operations 
rather than character indexing.) So UTF-8 is the best trade-off for general 
use.

                           Randy.


* Re: Lower bounds of Strings
  2021-01-09  2:18               ` Randy Brukardt
@ 2021-01-09 10:53                 ` Dmitry A. Kazakov
  2021-01-12  8:19                   ` Randy Brukardt
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-09 10:53 UTC (permalink / raw)


On 2021-01-09 03:18, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:rtaeok$l8b$1@gioia.aioe.org...
>> On 2021-01-08 18:23, Shark8 wrote:
>>> On Thursday, January 7, 2021 at 3:03:54 PM UTC-7, Randy Brukardt wrote:
>>
>>>> Any sort of multiple inheritance (not to mention multiple dispatch)
>>>> requires
>>>> searching a lookup table for the appropriate interface.
>>> Ah.
>>> It appears you're confusing the Ada concept underlying keyword INTERFACE
>>> with the general/abstract notion.
>>> It appears to me that Dmitry is referring to the latter, not the
>>> Ada-construct of INTERFACE, which requires a tagged type.
>>>
>>> I think what he's getting at is something that I considered/proposed here
>>> on C.L.A some years ago, adding the concept of an "ABSTRACT INTERFACE" to
>>> Ada. (IIRC, the proposal I had in mind was to be able to model the
>>> notional meta type-hierarchy; eg: Number ⊃ Universal_Integer ⊃
>>> System.Integer.)
>>
>> Right, though I do not think that tags can inflict any cost. The situation
>> is the same as with array bounds. You do not keep bounds when the array is
>> statically constrained. Tag is just another constraint like bounds. It
>> must be handled just same way, removed when statically known. No penalty,
>> unless classes are actually used. I do not know why people always bring
>> dispatch into discussions about static cases.
> 
> The possibility of dynamic dispatch (in some code that doesn't exist yet,
> but *could*) is what is so expensive and is virtually impossible to remove
> after the fact (that is, in an optimizer).

The representation without the tag must be mandatory.

> If you wanted to include a
> declaration that you *never* were going to use any dynamic dispatch,

This declaration is already in the language. Only A'Class dispatches. 
Again, it is like with bounds, only array (... range <>) of ... has 
dynamic bounds. To enforce dispatch one would have to convert to A'Class 
first, that will add the tag to the array's dope vector.
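
A sketch of that rule as it exists today for tagged types (arrays cannot be
tagged in current Ada, so records stand in here; Dispatch_Demo, Shapes, Shape,
Circle and Name are hypothetical names used only for illustration):

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Dispatch_Demo is

   package Shapes is
      type Shape is tagged null record;
      function Name (S : Shape) return String;

      type Circle is new Shape with null record;
      overriding function Name (S : Circle) return String;
   end Shapes;

   package body Shapes is
      function Name (S : Shape) return String is
      begin
         return "Shape";
      end Name;

      overriding function Name (S : Circle) return String is
      begin
         return "Circle";
      end Name;
   end Shapes;

   use Shapes;
   C : Circle;
begin
   Put_Line (Name (C));                --  specific type: statically bound
   Put_Line (Name (Shape (C)));        --  view conversion: still static
   Put_Line (Name (Shape'Class (C)));  --  class-wide: dispatches on the tag
end Dispatch_Demo;
```

Only the third call consults the tag at run time; the first two are resolved
by the compiler from the specific type of the operand.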

> then
> you could try to ignore the possibility. But without some sort of
> dispatch, a defined interface buys you nothing other than complication. It
> doesn't simplify the description and would substantially complicate the
> implementation. What's the point??

Of course it will simplify everything. E.g. our beloved generics. Compare

    generic
       type I is (<>);
       type E is private;
       type A is array (I range <>) of E;

with

    generic
       type A is new Root_Array_Type with private;

or

    package Ada.Containers.Vectors is
       ...
       type Vector is tagged private
          with Constant_Indexing => Constant_Reference,
               Variable_Indexing => Reference,
               Default_Iterator  => Iterate,
               Iterator_Element  => Element_Type;
       -- hundreds of operation declarations

with

    type Vector is new array (Index_Type) of Element_Type with private;
       -- nothing else to declare

Everything in Ada could be formalized in interfaces:

   type Index is new Discrete_Type; -- Same as <>
   type Discrete_Type is abstract new Ordered_Type and Copyable_Type;

and so on.
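
For reference, the first form above (three generic formals I, E and A) is
legal Ada today. A minimal complete sketch of how it is used, where
Generic_Array_Demo, Last_Element, Day and Hours are hypothetical names:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Generic_Array_Demo is

   --  Index, element and array type must each be passed separately.
   generic
      type I is (<>);
      type E is private;
      type A is array (I range <>) of E;
   function Last_Element (Item : A) return E;

   function Last_Element (Item : A) return E is
   begin
      return Item (Item'Last);  --  Constraint_Error if Item is empty
   end Last_Element;

   type Day is (Mon, Tue, Wed);
   type Hours is array (Day range <>) of Natural;

   function Last_Hours is new Last_Element (Day, Natural, Hours);
   function Last_Char  is new Last_Element (Positive, Character, String);

   H : constant Hours := (Mon => 8, Tue => 6, Wed => 7);
begin
   Put_Line (Natural'Image (Last_Hours (H)));
   Put (Last_Char ("Ada"));
   New_Line;
end Generic_Array_Demo;
```

Each instantiation has to repeat the index and element types even though the
array type already determines them, which is the verbosity the
Root_Array_Type form is meant to remove. That second form is a proposal, not
something a current compiler accepts.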

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Why UTF-8 (was Re: Lower bounds of Strings)
  2021-01-09  2:31             ` Randy Brukardt
@ 2021-01-09 14:52               ` Jeffrey R. Carter
  2021-01-09 18:08                 ` Dmitry A. Kazakov
  2021-01-11 21:35               ` Lower bounds of Strings Shark8
  1 sibling, 1 reply; 66+ messages in thread
From: Jeffrey R. Carter @ 2021-01-09 14:52 UTC (permalink / raw)

On 1/9/21 3:31 AM, Randy Brukardt wrote:
> The default String should be UTF-8, the others should be reserved for
> special cases (interfacing in particular). You don't want the default string
> type to restrict the contents, and you don't want it to waste a lot of
> space.

I don't understand this. I presume there was a time when the extra complexity of 
UTF-8 was a reasonable price to pay for the larger than 1-byte character range 
it provided, and there may be systems where it still makes sense, but with most 
systems these days having GB of memory and TB of storage, the simplicity of 
using 2 bytes per character seems worth the wasted space. On my 4-yr-old 
computer I could do everything with 4-byte characters and not have a problem.

-- 
Jeff Carter
"Unix and C are the ultimate computer viruses."
Richard Gabriel
99


* Re: Why UTF-8 (was Re: Lower bounds of Strings)
  2021-01-09 14:52               ` Why UTF-8 (was Re: Lower bounds of Strings) Jeffrey R. Carter
@ 2021-01-09 18:08                 ` Dmitry A. Kazakov
  2021-01-12  7:58                   ` Randy Brukardt
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-09 18:08 UTC (permalink / raw)


On 2021-01-09 15:52, Jeffrey R. Carter wrote:
> On 1/9/21 3:31 AM, Randy Brukardt wrote:
>> The default String should be UTF-8, the others should be reserved for
>> special cases (interfacing in particular). You don't want the default 
>> string
>> type to restrict the contents, and you don't want it to waste a lot of
>> space.
> 
> I don't understand this. I presume there was a time when the extra 
> complexity of UTF-8 was a reasonable price to pay for the larger than 
> 1-byte character range it provided, and there may be systems where it 
> still makes sense, but with most systems these days having GB of memory 
> and TB of storage, the simplicity of using 2 bytes per character seems 
> worth the wasted space. On my 4-yr-old computer I could do everything 
> with 4-byte characters and not have a problem.

Because there is no complexity in UTF-8. String characters are always 
accessed consecutively. So UCS-4 has no advantage over UTF-8.

As for interfaces, any string always has two: a representation array 
interface (encoding) and a character array interface (view). The language 
should better support proper abstractions, and then having whatever 
encoding will be no problem.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-09  2:31             ` Randy Brukardt
  2021-01-09 14:52               ` Why UTF-8 (was Re: Lower bounds of Strings) Jeffrey R. Carter
@ 2021-01-11 21:35               ` Shark8
  2021-01-12  8:12                 ` Randy Brukardt
  1 sibling, 1 reply; 66+ messages in thread
From: Shark8 @ 2021-01-11 21:35 UTC (permalink / raw)


On Friday, January 8, 2021 at 7:31:40 PM UTC-7, Randy Brukardt wrote:
> "Shark8" wrote in message 
> news:37ada5ff-eee7-408...
> On Thursday, January 7, 2021 at 3:03:54 PM UTC-7, Randy Brukardt wrote:
> ....
> >> So either you are talking about a complication without value, or an 
> >> extremely expensive implementation that doesn't meet the goals for a 
> >> language like Ada. What's the point? 
> >In the matter of the "ABSTRACT INTERFACE" idea, we could define 
> >things in such a way at the language level that: 
> > (1) we could specify the notional types (eg Universal_Float, 
> > Universal_Fixed, etc) 
> >in this system, providing the interface of that type for the usage of the 
> >language; 
> > (2) the specification of these notional-types would allow for less 
> > verbiage in the 
> > standard, and possibly allow proving on the classes they encompass.
> I again don't see the point. The thing that you can't do with the Ada 
> universals is name them, because they don't have a dynamic presence.
But that's the point, it's not about them having a dynamic presence, it's about the ability to classify and categorize.
Kind of like how SPARK's Ghost-code is there to help the prover, not be part of the actual program. Likewise, these ABSTRACT INTERFACES are about making an interface into a type, and doing so consistently, language-wide; kind of like how the containers use special aspects, and also like how generic formal parameters provide minimal information [to the implementation] the less-definite you get.

Perhaps a pseudo-Ada example would work.

Abstract Type DISCRETE_ITEM is (<>);
Abstract Type CONSTRAINED_ITEM is private;
Abstract Type ARRAY_ITEM( Index : DISCRETE_ITEM; Element : CONSTRAINED_ITEM ) is private
  with Access => Element At ARRAY_ITEM(Index); -- Get an ELEMENT by accessing through Index.

>There's 
> no problem proving anything with them, they're already explicitly defined in 
> the RM. All you would be doing is changing the syntax in the subclauses of 
> 4.5. For a totally new language, that might be useful, but I fail to see any 
> benefit for Ada -- it's a lot of work without changing much of anything.
The idea here is to build a system wherein you can (a) define the interfaces of any type, (b) give names to the notional entities [like Universal_Integer], (c) have a construct that is itself robust/provable enough that you can use it to describe/redefine Ada's type-system in its terms, allowing the subclause to be shortened [hopefully], machine translatable, and machine provable.

This is much like how moving from:
  Function Foo( I : Integer ) return Natural;
  * Foo's parameter must be in the range -1 .. Integer'Last;
  * Exception Program_Error is raised on any negative number, except -1.
to:
  Function Foo(I : Integer) return Natural
    with Pre => I in -1..Integer'Last or else raise Program_Error;
eliminates English verbiage in favor of language-defined constructs, which can [hopefully] reduce the size/complexity of the standard, as well as increase the machine-translatability.

> 
> ...
> >> Slices fall on the wrong side of this boundary for me; the nonsense 
> >> required 
> >> to implement them seems reasonable at first but rapidly disappears as of 
> >> the 
> >> many other things that cannot be done in their presence. 
> >Can you give some examples here?
> All of the super-null nonsense occurs because of slices, which slows down 
> any code that uses them.
What do you mean "super-null" nonsense?

>And they are designed to be a "window" into an 
> existing contiguous single-dimensioned array. But syntactically, they appear 
> to be some sort of general concept.
Pretty much the only times I've used slicing [aside from "copy me this substring, and continue on"] have been windowing, though that's typically with a renames.

> Which is why you always hear people 
> talking about slices of multi-dimensional arrays. But those would require 
> dealing with discontiguous chunks of an array, which would have a heavy 
> distributed overhead (since you can assign into them, just making a copy 
> isn't an implementation option). So you would have to pass an assignment 
> "thunk" (compiler-generated subprogram) with any writable array parameter -- 
> even if you never, ever passed a slice to that parameter. The cost would be 
> massive.
Ah, I think I see now.

> >The proliferation of strings in Ada is lamentable; I would love to be able 
> >to 
> >have a set of generics so you could instantiate with 
> >[[Wide_]Wide_]Character 
> >and get the current-Ada hierarchy.
> That wouldn't work for UTF-8 and UTF-16, which makes it pointless at this 
> point. The default String should be UTF-8, the others should be reserved for 
> special cases (interfacing in particular).
Why wouldn't it?
UTF-8 and UTF-16 use 8-bit and 16-bit units respectively, so their underlying "sequence of code units" should be 8/16 bits wide... but then, Unicode makes everyone's lives harder while pretending to be helpful.

>You don't want the default string 
> type to restrict the contents, and you don't want it to waste a lot of 
> space. As Dmitry has said in the past, operations that go at a character at 
> a time through a string are rare, and most of those are properly implemented 
> in Ada.Strings -- users shouldn't be reinventing those wheels in the first 
> place.
Right, which is why if I could I'd 'collapse' the [[Wide_]Wide_]String into a single generic, then we would have less to maintain.
GNAT, for example, is *still* missing Wide_Wide_Equal_Case_Insensitive.

> (That's especially true as many of them can do octet operations 
> rather than character indexing.) So UTF-8 is the best trade-off for general 
> use. 
True, but my usage is typically either ASCII, LATIN-1, or Wide_Wide_Character.


* Re: Why UTF-8 (was Re: Lower bounds of Strings)
  2021-01-09 18:08                 ` Dmitry A. Kazakov
@ 2021-01-12  7:58                   ` Randy Brukardt
  0 siblings, 0 replies; 66+ messages in thread
From: Randy Brukardt @ 2021-01-12  7:58 UTC (permalink / raw)

"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:rtcre8$1bug$1@gioia.aioe.org...
> On 2021-01-09 15:52, Jeffrey R. Carter wrote:
>> On 1/9/21 3:31 AM, Randy Brukardt wrote:
>>> The default String should be UTF-8, the others should be reserved for
>>> special cases (interfacing in particular). You don't want the default 
>>> string
>>> type to restrict the contents, and you don't want it to waste a lot of
>>> space.
>>
>> I don't understand this. I presume there was a time when the extra 
>> complexity of UTF-8 was a reasonable price to pay for the larger than 
>> 1-byte character range it provided, and there may be systems where it 
>> still makes sense, but with most systems these days having GB of memory 
>> and TB of storage, the simplicity of using 2 bytes per character seems 
>> worth the wasted space. On my 4-yr-old computer I could do everything 
>> with 4-byte characters and not have a problem.
>
> Because there is no complexity in UTF-8. String characters are always 
> accessed consecutively. So UCS-4 has no advantage over UTF-8.

I wouldn't go so far as to say *no* complexity, but the cases where the 
complexity is a major issue are fairly rare. As Dmitry says, most operations 
are scans, and UTF-8 was designed so that scans don't need to identify the 
starts and ends of characters (you can't get mismatches when doing pattern 
matching, for instance).
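
A minimal sketch of that scan property, assuming the text is held in an 
ordinary String as raw UTF-8 octets (the procedure name and the sample text 
are made up for illustration). Lead octets and continuation octets occupy 
disjoint value ranges, so a plain octet-wise search with 
Ada.Strings.Fixed.Index can only match at a character boundary:

```ada
with Ada.Strings.Fixed;
with Ada.Text_IO;

procedure UTF8_Scan_Demo is
   --  Hypothetical sample data: the UTF-8 octets are written out by hand
   --  so the example does not depend on the source file's encoding.
   E_Acute : constant String :=
     Character'Val (16#C3#) & Character'Val (16#A9#);  --  U+00E9
   I_Diaer : constant String :=
     Character'Val (16#C3#) & Character'Val (16#AF#);  --  U+00EF

   Text : constant String := "na" & I_Diaer & "ve caf" & E_Acute;

   --  Plain octet search; the text is never decoded into characters.
   Pos : constant Natural := Ada.Strings.Fixed.Index (Text, E_Acute);
begin
   --  The two characters share the lead octet 16#C3#, but only the
   --  second one matches the whole pattern, at octet 11.
   Ada.Text_IO.Put_Line ("Match starts at octet" & Natural'Image (Pos));
end UTF8_Scan_Demo;
```

The result is an octet index, not a character count, which is all a scan 
or a later slice of Text needs.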

And wasting memory will remain an issue for the foreseeable future. The 
number of data structures that fit in the various caches has a substantial 
effect on performance. Similarly, the amount of data read/written to 
disk/nonvolatile memory also has a big effect on the cost of those 
operations. While a programmer can ignore those issues (avoiding premature 
optimization is usually a good thing), it's not as clear that a programming 
language design can. If the default representation is slow and large, there 
can be costs in switching to a better representation (see Ada's support for 
UTF-8 for an example!), and also in the perception of the language.

                                         Randy.


* Re: Lower bounds of Strings
  2021-01-11 21:35               ` Lower bounds of Strings Shark8
@ 2021-01-12  8:12                 ` Randy Brukardt
  2021-01-12 20:51                   ` Shark8
  2021-01-13 14:08                   ` Jeffrey R. Carter
  0 siblings, 2 replies; 66+ messages in thread
From: Randy Brukardt @ 2021-01-12  8:12 UTC (permalink / raw)

"Shark8" <onewingedshark@gmail.com> wrote in message 
news:26cac901-b901-4c4f-aba9-eab6cbd2a525n@googlegroups.com...
On Friday, January 8, 2021 at 7:31:40 PM UTC-7, Randy Brukardt wrote:
...
> >Can you give some examples here?
> All of the super-null nonsense occurs because of slices, which slows down
> any code that uses them.
What do you mean "super-null" nonsense?

Any range that is null is treated as the same by Ada. Any null range where 
the bounds are more than 1 apart is known as a "super-null" range. For 
instance, one typically writes:

       N : String(1..0);

to define a null string object. But you can write *any* null range here:

      N2 : String(314 .. 25);

And it takes code to figure this out at runtime if either bound is nonstatic 
(which is usually the case with slices). And you still have to store the 
bounds (you can still ask for the bounds of N2, and one better get 314 and 
25, but the length is still zero). And N = N2, so compares are complicated 
(one has to check the length, but that's more expensive to figure out than 
just a subtract). You end up generating a lot of code to deal with a rather 
unlikely case.
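
A small sketch of the behaviour described above (the procedure name is 
made up, and the declarations are given initial values only so that they 
can be constants). The bounds of the super-null string are preserved, its 
length is zero, and it compares equal to the ordinary null string:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Super_Null_Demo is
   N  : constant String (1 .. 0)    := (others => ' ');
   N2 : constant String (314 .. 25) := (others => ' ');  --  super-null
begin
   Put_Line ("N2'First  =" & Integer'Image (N2'First));   --  314
   Put_Line ("N2'Last   =" & Integer'Image (N2'Last));    --  25
   Put_Line ("N2'Length =" & Integer'Image (N2'Length));  --  0
   Put_Line ("N = N2    : " & Boolean'Image (N = N2));    --  TRUE
end Super_Null_Demo;
```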

One of the reasons for suggesting using the bounded vector as a model is all 
of this sort of stuff vanishes, as the lower bound never changes in the 
vector packages. Only the upper bound moves, which is what is needed the 
vast majority of the time.

Both the early Pascal compilers we used in the very early days of Janus/Ada, 
and Janus/Ada itself, used a form of bounded strings rather than the more 
complex Ada model. (The CS 701 language that we originally were tasked to 
implement used bounded strings, and we kept it on our early commercial 
compilers so we could concentrate on other functionality like packages and 
private types.) That was a lot easier to use than the Ada model; it was a 
huge adjustment to switch to the Ada form and it bloated the code as well.

                                   Randy.


* Re: Lower bounds of Strings
  2021-01-09 10:53                 ` Dmitry A. Kazakov
@ 2021-01-12  8:19                   ` Randy Brukardt
  2021-01-12  9:37                     ` Dmitry A. Kazakov
  0 siblings, 1 reply; 66+ messages in thread
From: Randy Brukardt @ 2021-01-12  8:19 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:rtc1vu$1cfj$1@gioia.aioe.org...
...
> Of course it will simplify everything. E.g. our beloved generics. Compare
>
>    generic
>       type I is (<>);
>       type E is private;
>       type A is array (I range <>) of E;
>
> with
>
>    generic
>       type A is new Root_Array_Type with private;

Surely, it simplifies it to uselessness. You haven't declared the index and 
component types here. And if they are implicit in Root_Array_Type, you've 
now run into the fundamental problem of interfaces: you need a generic or 
everything has to be related in order to use an interface.

For instance, you can't have a useful container interface because the 
interface has to somehow contain the element type. The only way to do that 
is to add additional generics (which is madness), or require all of the 
elements to be descendants of some root type (which is a different form of 
madness).

Interfaces aren't a free lunch, they just add complexity (or, at best, move 
it around). One has to look at the entire set of declarations needed, not 
just one use (it would be a sorry language feature indeed if you couldn't 
find any example that it simplified!).

                                       Randy.



* Re: Lower bounds of Strings
  2021-01-12  8:19                   ` Randy Brukardt
@ 2021-01-12  9:37                     ` Dmitry A. Kazakov
  0 siblings, 0 replies; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-12  9:37 UTC (permalink / raw)


On 2021-01-12 09:19, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:rtc1vu$1cfj$1@gioia.aioe.org...
> ...
>> Of course it will simplify everything. E.g. our beloved generics. Compare
>>
>>     generic
>>        type I is (<>);
>>        type E is private;
>>        type A is array (I range <>) of E;
>>
>> with
>>
>>     generic
>>        type A is new Root_Array_Type with private;
> 
> Surely, it simplifies it to uselessness. You haven't declared the index and
> component types here.

It is a different issue, the same as with T and T'Class. You can get the 
T'Class type from T. So there should be A'Index and A'Element type-valued 
attributes to get the companion types. In the traditional syntax:

    generic
       type A is array (<>) of <>;
    package Foo is
       subtype I is A'Index;
       subtype E is A'Element;

> And if they are implicit in Root_Array_Type, you've
> now run into the fundamental problem of interfaces: you need a generic or
> everything has to be related in order to use an interface.

Not fundamental, we just need Ada 83 "new" back:

    type S is new new T with private;
       -- Ada 83 new on top of Ada 95 new (:-))

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-12  8:12                 ` Randy Brukardt
@ 2021-01-12 20:51                   ` Shark8
  2021-01-12 22:56                     ` Randy Brukardt
  2021-01-13 14:08                   ` Jeffrey R. Carter
  1 sibling, 1 reply; 66+ messages in thread
From: Shark8 @ 2021-01-12 20:51 UTC (permalink / raw)


On Tuesday, January 12, 2021 at 1:12:35 AM UTC-7, Randy Brukardt wrote:
> "Shark8" wrote in message
> news:26cac901-b901-4c4f
> On Friday, January 8, 2021 at 7:31:40 PM UTC-7, Randy Brukardt wrote:
> ...
> > >Can you give some examples here? 
> > All of the super-null nonsense occurs because of slices, which slows down 
> > any code that uses them. 
> What do you mean "super-null" nonsense?
> Any range that is null is treated as the same by Ada. Any null range where 
> the bounds are more than 1 apart is known as a "super-null" range. For 
> instance, one typically writes: 
> 
> N : String(1..0) 
> 
> to define a null string object. But you can write *any* null range here: 
> 
> N2 : String(314 .. 25); 
> 
> And it takes code to figure this out at runtime if either bound is nonstatic 
> (which is usually the case with slices). And you still have to store the 
> bounds (you can still ask for the bounds of N2, and one better get 314 and 
> 25, but the length is still zero). And N = N2, so compares are complicated 
> (one has to check the length, but that's more expensive to figure out than 
> just a subtract). You end up generating a lot of code to deal with a rather 
> unlikely case. 
I understand now.
You mention the case where the difference in the indices is 1 as being separate; why?
Also, would the sometimes talked about idea of a "null range" have helped the situation out?
(I don't think so, since the one is the required implementation, and the other is a syntax- and partially semantic-issue.)

> One of the reasons for suggesting using the bounded vector as a model is all 
> of this sort of stuff vanishes, as the lower bound never changes in the 
> vector packages. Only the upper bound moves, which is what is needed the 
> vast majority of the time.
True.
I always thought the C model of strings/arrays was stupid; even accepting array=address/ equating 'offset' and 'index' is just asking for trouble, and it forces you to discard half your possible length, given a signed "int".
Having an unsigned "int" and indexing off 1 gets rid of the idiocy of having a single negative value of importance (-1) for signalling "not here" to string handling functions...

> 
> Both the early Pascal compilers we used in the very early days of Janus/Ada, 
> and Janus/Ada itself, used a form of bounded strings rather than the more 
> complex Ada model. (The CS 701 language that we originally were tasked to 
> implement used bounded strings, and we kept it on our early commercial 
> compilers so we could concentrate on other functionality like packages and 
> private types.) That was a lot easier to use than the Ada model, it was a 
> huge adjustment to switch to the Ada form and it bloated the code as well. 
Interesting information.
I was thinking of implementing the thick-pointer as something like a private type simply being defined as

 -- Object: some 'tagged' type representing the bounds of an array.
 Type Object is tagged private;
[...]
Function Length( Item : Object ) return Natural is
  (if Item.High < Item.Low then 0 else (Item.High-Item.Low)+1);


* Re: Lower bounds of Strings
  2021-01-12 20:51                   ` Shark8
@ 2021-01-12 22:56                     ` Randy Brukardt
  2021-01-13 12:00                       ` Dmitry A. Kazakov
  0 siblings, 1 reply; 66+ messages in thread
From: Randy Brukardt @ 2021-01-12 22:56 UTC (permalink / raw)


"Shark8" <onewingedshark@gmail.com> wrote in message 
news:50f68100-8909-4fdb-ad26-14bcbc010775n@googlegroups.com...
> On Tuesday, January 12, 2021 at 1:12:35 AM UTC-7, Randy Brukardt wrote:
>> "Shark8" wrote in message
>> news:26cac901-b901-4c4f
>> On Friday, January 8, 2021 at 7:31:40 PM UTC-7, Randy Brukardt wrote:
>> ...
>> > >Can you give some examples here?
>> > All of the super-null nonsense occurs because of slices, which slows 
>> > down
>> > any code that uses them.
>> What do you mean "super-null" nonsense?
>> Any range that is null is treated as the same by Ada. Any null range 
>> where
>> the bounds are more than 1 apart is known as a "super-null" range. For
>> instance, one typically writes:
>>
>> N : String(1..0)
>>
>> to define a null string object. But you can write *any* null range here:
>>
>> N2 : String(314 .. 25);
>>
>> And it takes code to figure this out at runtime if either bound is 
>> nonstatic
>> (which is usually the case with slices). And you still have to store the
>> bounds (you can still ask for the bounds of N2, and one better get 314 
>> and
>> 25, but the length is still zero). And N = N2, so compares are 
>> complicated
>> (one has to check the length, but that's more expensive to figure out 
>> than
>> just a subtract). You end up generating a lot of code to deal with a 
>> rather
>> unlikely case.
> I understand now.
> You mention the case where the difference in the indices is 1 as being 
> separate; why?

Because the "natural" implementation of the length of an array works when 
the indices are one apart; you don't need extra code to deal with 
String(1..0) - subtracting the bounds and adding 1 (the usual length 
formula) works fine.
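
To make the difference concrete, here is a sketch with two hypothetical 
helper functions (neither is taken from any compiler's runtime). The plain 
formula is right for 1 .. 0 but goes negative for a super-null range, so a 
conforming length needs the extra comparison:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Length_Formula_Demo is
   --  The "natural" formula: correct for non-null ranges and for null
   --  ranges whose bounds are exactly one apart, such as 1 .. 0.
   function Naive_Length (First, Last : Integer) return Integer is
     (Last - First + 1);

   --  What the language rules require: every null range has length 0,
   --  which costs a comparison on top of the subtraction.
   function Checked_Length (First, Last : Integer) return Natural is
     (if Last < First then 0 else Last - First + 1);
begin
   Put_Line (Integer'Image (Naive_Length (1, 0)));       --  0
   Put_Line (Integer'Image (Naive_Length (314, 25)));    --  -288
   Put_Line (Integer'Image (Checked_Length (314, 25)));  --  0
end Length_Formula_Demo;
```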

> Also, would the sometimes talked about idea of a "null range" have helped 
> the situation out?
> (I don't think so, since the one is the required implementation, and the 
> other is a syntax- and partially semantic-issue.)

Right, the problem is that the syntax allows too much, and any rule to avoid 
the problem necessarily will take code to make some sort of check.

...

                                       Randy.



* Re: Lower bounds of Strings
  2021-01-12 22:56                     ` Randy Brukardt
@ 2021-01-13 12:00                       ` Dmitry A. Kazakov
  2021-01-13 13:27                         ` AdaMagica
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-13 12:00 UTC (permalink / raw)


On 2021-01-12 23:56, Randy Brukardt wrote:
> "Shark8" <onewingedshark@gmail.com> wrote in message
> news:50f68100-8909-4fdb-ad26-14bcbc010775n@googlegroups.com...

>> You mention the case where the difference in the indices is 1 as being
>> separate; why?
> 
> Because the "natural" implementation of the length of an array works when
> the indices are one apart; you don't need extra code to deal with
> String(1..0) - subtracting the bounds and adding 1 (the usual length
> formula) works fine.

Nothing prevents an implementation from using one of the bounds and the 
length in the array's dope vector. It is a question of optimization.

Then, maybe I am wrong, but I do not see anything that must prevent 
bounds on a null-array from sliding. They could slide to some canonical 
range making the "natural" implementation safe.

>> Also, would the sometimes talked about idea of a "null range" have helped
>> the situation out?
>> (I don't think so, since the one is the required implementation, and the
>> other is a syntax- and partially semantic-issue.)
> 
> Right, the problem is that the syntax allows too much, and any rule to avoid
> the problem necessarily will take code to make some sort of check.

Syntax can never be the problem, only semantics can.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-13 12:00                       ` Dmitry A. Kazakov
@ 2021-01-13 13:27                         ` AdaMagica
  2021-01-13 13:53                           ` Dmitry A. Kazakov
  0 siblings, 1 reply; 66+ messages in thread
From: AdaMagica @ 2021-01-13 13:27 UTC (permalink / raw)


On Wednesday, 13 January 2021 at 13:00:19 UTC+1, Dmitry A. Kazakov wrote:
> Nothing prevents an implementation from using one of the bounds and the 
> length in the array's dope vector. It is a question of optimization. 

type UC is array (Integer range <>) of Something;
procedure Proc (X: UC);

Here, no sliding is allowed in a call to Proc.
And, as Randy said before, X: UC (+1234..-1234); must return the correct values for 'First and 'Last.
So an optimization would need three values, First, Last, Length. Why not?


* Re: Lower bounds of Strings
  2021-01-13 13:27                         ` AdaMagica
@ 2021-01-13 13:53                           ` Dmitry A. Kazakov
  0 siblings, 0 replies; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-13 13:53 UTC (permalink / raw)


On 2021-01-13 14:27, AdaMagica wrote:
> On Wednesday, 13 January 2021 at 13:00:19 UTC+1, Dmitry A. Kazakov wrote:
>> Nothing prevents an implementation from using one of the bounds and the
>> length in the array's dope vector. It is a question of optimization.
> 
> type UC is array (Integer range <>) of Something;
> procedure Proc (X: UC);
> 
> Here, no sliding is allowed in a call to Proc.
> And, as Randy said before, X: UC (+1234..-1234); must return the correct values for 'First and 'Last.

I am not sure that the invariant

    (I1..I2 => 0)'Last = I2

must be true while

    I1 + (I1..I2 => 0)'Length - 1 = I2

is false.

Yes, it is kind of broken. One of them does not hold.

If we wanted to fix it we should respect mathematics where intervals of 
negative length are not empty intervals. So

    (0..-100 => 0)'Length = -99

or else

    Constraint_Error in the aggregate.

[And yes, that would break a lot of sloppy legacy code if we attempted 
to fix it]

> So an optimization would need three values, First, Last, Length. Why not?

Possibly. However, I see little problem with indices of null arrays. 
Semantically an index in the null array is meaningless and could be any 
[or none], because the whole idea of an index is to point to an element. No 
elements, nothing to point to. Logically, Null_Array'Last should be 
Constraint_Error.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-12  8:12                 ` Randy Brukardt
  2021-01-12 20:51                   ` Shark8
@ 2021-01-13 14:08                   ` Jeffrey R. Carter
  1 sibling, 0 replies; 66+ messages in thread
From: Jeffrey R. Carter @ 2021-01-13 14:08 UTC (permalink / raw)


On 1/12/21 9:12 AM, Randy Brukardt wrote:
>        N2 : String(314 .. 25);
> 
> And it takes code to figure this out at runtime if either bound is nonstatic
> (which is usually the case with slices). And you still have to store the
> bounds (you can still ask for the bounds of N2, and one better get 314 and
> 25, but the length is still zero).

I don't think I have ever cared what values 'First and 'Last return for a null 
array. While I was aware that such super-null ranges were possible, I presumed 
that some canonical values were used for the bounds of a null array, regardless 
of the actual bounds given.

For a 1-D array X with X'Length = 0, my uses of the bounds of X are usually one of

I in X'range
-----
I := X'First;
followed by comparison to X'Last
-----
comparison of X'Length to zero
-----
if X is a string type, comparison of X to ""

Are there real-world cases where retrieving the actual bounds given is necessary?

-- 
Jeff Carter
"Your mother was a hamster and your father smelt of elderberries."
Monty Python & the Holy Grail
06


* Re: Lower bounds of Strings
  2021-01-05 11:04 Lower bounds of Strings Stephen Davies
                   ` (4 preceding siblings ...)
  2021-01-06  3:08 ` Randy Brukardt
@ 2021-01-14 11:38 ` AdaMagica
  2021-01-14 12:27   ` Dmitry A. Kazakov
                     ` (3 more replies)
  5 siblings, 4 replies; 66+ messages in thread
From: AdaMagica @ 2021-01-14 11:38 UTC (permalink / raw)


On Tuesday, 5 January 2021 at 12:04:33 UTC+1, Stephen Davies wrote:
> I'm referring to the fact that any subprogram with a String 
> parameter, e.g. Expiration_Date, has to use something like 
> Expiration_Date (Expiration_Date'First .. Expiration_Date'First + 1) 
> to refer to the first two characters rather than simply saying 
> Expiration_Date (1..2). 
> 
> Not only is it ugly, but it's potentially dangerous if code uses the 
> latter and works for ages until one day somebody passes a slice instead 
> of a string starting at 1 (yes, compilers might generate warnings, 
> but that doesn't negate the issue, imho). 

I really do not see the problem here. If I want the first element, I write X(X'First).
Where's the problem?
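
A minimal sketch of what I mean (First_Two and the sample string are 
hypothetical names, chosen only for illustration). Written against 'First, 
the same body works for a whole string and for a slice that does not start 
at 1:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure First_Two_Demo is
   --  Hypothetical helper: returns the first two characters of S,
   --  whatever S'First happens to be. Assumes S'Length >= 2.
   function First_Two (S : String) return String is
     (S (S'First .. S'First + 1));

   Card : constant String := "ID:0125";
begin
   --  The slice keeps its bounds 4 .. 7 inside First_Two, so S (1 .. 2)
   --  would raise Constraint_Error there; indexing from S'First works.
   Put_Line (First_Two (Card (4 .. 7)));  --  01
   Put_Line (First_Two (Card));           --  ID
end First_Two_Demo;
```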

In his paper about model railroads
http://www.cs.uni.edu/~mccormic/RealTime/
John McCormick came to the conclusion that one of the reasons why Ada was so successful was the fact that indices did not have to start at 0 or 1, i.e. they may carry meaning. In such cases, it is absolute nonsense to slide slices to the first index value.

Also for enumeration indices, sliding does not make sense.

So why is the bad habit dangerous to think that the first element must have index one (or zero)? For me, this is a non sequitur.


* Re: Lower bounds of Strings
  2021-01-14 11:38 ` AdaMagica
@ 2021-01-14 12:27   ` Dmitry A. Kazakov
  2021-01-14 13:31   ` AdaMagica
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-14 12:27 UTC (permalink / raw)

On 2021-01-14 12:38, AdaMagica wrote:

> Also for enumeration indices, sliding does not make sense.

Sliding does not make sense for any type of index.

Again, people are confusing indices (cardinal) with positions (ordinal). 
These are distinct concepts and different types. E.g. A'Length is an 
ordinal numeral and thus has the type Universal_Integer. A'First is a 
cardinal numeral and is of the index type.
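
A sketch of the distinction with an enumeration index, where the two 
cannot be mixed up by accident (the types and values are invented for the 
example). 'First yields a value of the index type, 'Length a plain count, 
and a position has to be computed explicitly:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Index_Vs_Position_Demo is
   type Day is (Mon, Tue, Wed, Thu, Fri);
   type Hours is array (Day range <>) of Natural;

   Mid_Week : constant Hours (Tue .. Thu) := (8, 6, 7);
begin
   --  'First is a value of the index type Day ...
   Put_Line (Day'Image (Mid_Week'First));       --  TUE
   --  ... while 'Length is a universal_integer count of elements.
   Put_Line (Integer'Image (Mid_Week'Length));  --  3
   --  The position of Wed within the array is derived from the indices.
   Put_Line
     (Integer'Image (Day'Pos (Wed) - Day'Pos (Mid_Week'First) + 1));  --  2
end Index_Vs_Position_Demo;
```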

> So why is the bad habit dangerous to think that the first element must have index one (or zero)? For me, this is a non sequitur.

The first element may have no index at all, e.g. the first element of a 
list, the first character read from the input stream etc.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Lower bounds of Strings
  2021-01-14 11:38 ` AdaMagica
  2021-01-14 12:27   ` Dmitry A. Kazakov
@ 2021-01-14 13:31   ` AdaMagica
  2021-01-14 14:02   ` Jeffrey R. Carter
  2021-01-15 10:24   ` Stephen Davies
  3 siblings, 0 replies; 66+ messages in thread
From: AdaMagica @ 2021-01-14 13:31 UTC (permalink / raw)


On Thursday, 14 January 2021 at 12:38:29 UTC+1, AdaMagica wrote:
> So why is the bad habit dangerous to think that the first element must have index one (or zero)? For me, this is a non sequitur.

Er, what I really wanted to say: it is a bad and dangerous habit to think indices must start with 0 or 1.


* Re: Lower bounds of Strings
  2021-01-14 11:38 ` AdaMagica
  2021-01-14 12:27   ` Dmitry A. Kazakov
  2021-01-14 13:31   ` AdaMagica
@ 2021-01-14 14:02   ` Jeffrey R. Carter
  2021-01-14 14:34     ` Dmitry A. Kazakov
  2021-01-15 10:24   ` Stephen Davies
  3 siblings, 1 reply; 66+ messages in thread
From: Jeffrey R. Carter @ 2021-01-14 14:02 UTC (permalink / raw)

On 1/14/21 12:38 PM, AdaMagica wrote:
> 
> I really do not see the problem here. If I want the first element, I write X(X'First).
> Where's the problem?
> 
> In his paper about model railroads
> http://www.cs.uni.edu/~mccormic/RealTime/
> John McCormick came to the conclusion that one of the reasons why Ada was so successful was the fact that indices did not have to start at 0 or 1, i.e. they may carry meaning. In such cases, it is absolute nonsense to slide slices to the first index value.
> 
> Also for enumeration indices, sliding does not make sense.

The trouble is that this is not really discussing arrays. It's discussing 
sequences, implemented by arrays, such as String.

1-D arrays are often used to implement sequences. In arrays used as sequences, 
the indices are meaningless, and slicing, sliding, and sorting are often 
appropriate. As the indices are meaningless, it makes sense for them to be 
integers with a fixed lower bound of 1, since that is how we typically talk 
about positions in sequences. However, there are also many cases when it's 
useful to be able to have slices of sequences with a different lower bound, so 
remembering to use 'First is still important. Array types used as sequences are 
often unconstrained.

The other use of arrays (1- and multidimensional) is as maps. In arrays as maps, 
the indices are meaningful, and slicing, sliding, and sorting are usually 
inappropriate. Array types used as maps are usually constrained.
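
To make the two roles concrete, here is a minimal sketch. All names in it (Day, Hours_Per_Day, Worked, Word, Tail) are made up for illustration and do not come from any library.

```ada
with Ada.Text_IO;

procedure Sequence_Vs_Map is
   --  Array as a map: the index carries meaning, so the type is
   --  constrained and sliding would make no sense.
   type Day is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
   type Hours_Per_Day is array (Day) of Natural;

   Worked : constant Hours_Per_Day := (Sat | Sun => 0, others => 8);

   --  Array as a sequence: only relative position matters, so the
   --  type is unconstrained and slices are natural.
   Word : constant String := "sliding";
   Tail : constant String := Word (4 .. Word'Last);
begin
   Ada.Text_IO.Put_Line ("Hours on Fri:" & Natural'Image (Worked (Fri)));
   --  A slice keeps the bounds of its source, so Tail'First is 4 here.
   Ada.Text_IO.Put_Line
     ("Tail = " & Tail & ", Tail'First =" & Integer'Image (Tail'First));
end Sequence_Vs_Map;
```

The second line of output shows why 'First still matters for sequences: the slice holds "ding" but its lower bound is 4, not 1.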

Ada's Vector containers are really variable-length sequences.

In designing a new language, it might be useful to keep these two concepts separate.

-- 
Jeff Carter
"Nobody expects the Spanish Inquisition!"
Monty Python's Flying Circus
22

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-14 14:02   ` Jeffrey R. Carter
@ 2021-01-14 14:34     ` Dmitry A. Kazakov
  2021-01-14 15:28       ` Shark8
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-14 14:34 UTC (permalink / raw)

On 2021-01-14 15:02, Jeffrey R. Carter wrote:

> The other use of arrays (1- and multidimensional) is as maps. In arrays 
> as maps, the indices are meaningful, and slicing, sliding, and sorting 
> are usually inappropriate.

More than appropriate in linear algebra, image processing, major 
application areas of multidimensional arrays.

> In designing a new language, it might be useful to keep these two 
> concepts separate.

I do not see how a sequence is less a map position->element than 
anything else.

In order to keep them separate there are types and interfaces. The 
language must simply support them. A typical 1-D array would implement 
both. A typical n-D array would possibly implement only one.

BTW, in image processing, for segmenting images I used to have them 
sequenced. The sequence looked this way:

  1  2  5  6 17 18 21 22
  3  4  7  8 19 20 23 24
  9 10 13 14 25 26 29 30
 11 12 15 16 27 28 31 32

and so on. There are many sorts of algorithmically interesting mappings 
from n-D to sequence.
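
One such mapping, the Z-order (Morton) numbering that the table illustrates, can be sketched as follows. The sketch assumes zero-based row and column numbers and positions counted from 1. Z_Index is a hypothetical helper, not part of any library.

```ada
with Ada.Text_IO;
with Interfaces; use Interfaces;

procedure Z_Order_Demo is
   --  Interleave the low 16 bits of Row and Column: column bits land on
   --  even bit positions, row bits on odd ones.
   function Z_Index (Row, Column : Unsigned_32) return Unsigned_32 is
      Result : Unsigned_32 := 0;
   begin
      for Bit in 0 .. 15 loop
         Result := Result
           or Shift_Left (Shift_Right (Column, Bit) and 1, 2 * Bit)
           or Shift_Left (Shift_Right (Row, Bit) and 1, 2 * Bit + 1);
      end loop;
      return Result + 1;  --  positions are numbered from 1
   end Z_Index;
begin
   --  Print the position of every cell in a 4 x 8 grid.
   for Row in Unsigned_32 range 0 .. 3 loop
      for Column in Unsigned_32 range 0 .. 7 loop
         Ada.Text_IO.Put (Unsigned_32'Image (Z_Index (Row, Column)));
      end loop;
      Ada.Text_IO.New_Line;
   end loop;
end Z_Order_Demo;
```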

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-14 14:34     ` Dmitry A. Kazakov
@ 2021-01-14 15:28       ` Shark8
  2021-01-14 15:41         ` Dmitry A. Kazakov
  0 siblings, 1 reply; 66+ messages in thread
From: Shark8 @ 2021-01-14 15:28 UTC (permalink / raw)


On Thursday, January 14, 2021 at 7:34:12 AM UTC-7, Dmitry A. Kazakov wrote:
> I do not see how a sequence is less a map position->element than 
> anything else. 
Oh, that's simple.
A sequence, at its most abstract, resembles a Stream.
Basically needing only a "Next"-function, "End_Error"-exception, and possibly a "Current"-function; assuming a one-element buffer.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-14 15:28       ` Shark8
@ 2021-01-14 15:41         ` Dmitry A. Kazakov
  2021-01-19 21:02           ` G.B.
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-14 15:41 UTC (permalink / raw)


On 2021-01-14 16:28, Shark8 wrote:
> On Thursday, January 14, 2021 at 7:34:12 AM UTC-7, Dmitry A. Kazakov wrote:
>   I do not see how a sequence is less a map position->element than
>> anything else.
> Oh, that's simple.
> A sequence, at its most abstract, resembles a Stream.
> Basically needing only a "Next"-function, "End_Error"-exception, and possibly a "Current"-function; assuming a one-element buffer.

That would be a sequential access interface. A common sequence as in 
mathematics has nth-element random access on top of it.
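
The two views in this exchange can be put side by side in a small sketch. The Sequence type and the operations Current, Next and Element are hypothetical names chosen for illustration. Next is written as a procedure here for simplicity, and End_Error is borrowed from Ada.IO_Exceptions.

```ada
with Ada.Text_IO;
with Ada.IO_Exceptions;

procedure Access_Styles is
   End_Error : exception renames Ada.IO_Exceptions.End_Error;

   --  Hypothetical sequence with a one-element "cursor".
   type Sequence (Length : Natural) is record
      Items    : String (1 .. Length);
      Position : Positive := 1;
   end record;

   --  Sequential access: look at the current element ...
   function Current (S : Sequence) return Character is
   begin
      if S.Position > S.Length then
         raise End_Error;
      end if;
      return S.Items (S.Position);
   end Current;

   --  ... and advance by one.
   procedure Next (S : in out Sequence) is
   begin
      if S.Position > S.Length then
         raise End_Error;
      end if;
      S.Position := S.Position + 1;
   end Next;

   --  Random access: the nth element, independent of Position.
   function Element (S : Sequence; N : Positive) return Character is
     (S.Items (N));

   Seq : Sequence := (Length => 3, Items => "abc", Position => 1);
begin
   loop
      Ada.Text_IO.Put (Current (Seq));
      Next (Seq);
   end loop;
exception
   when End_Error =>
      Ada.Text_IO.New_Line;
      Ada.Text_IO.Put_Line ("2nd element: " & Element (Seq, 2));
end Access_Styles;
```

The loop consumes the sequence through Current and Next only. Element answers a positional query without touching the cursor, which is the extra capability being argued for.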

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-14 11:38 ` AdaMagica
                     ` (2 preceding siblings ...)
  2021-01-14 14:02   ` Jeffrey R. Carter
@ 2021-01-15 10:24   ` Stephen Davies
  2021-01-15 11:41     ` J-P. Rosen
                       ` (2 more replies)
  3 siblings, 3 replies; 66+ messages in thread
From: Stephen Davies @ 2021-01-15 10:24 UTC (permalink / raw)


On Thursday, 14 January 2021 at 11:38:29 UTC, AdaMagica wrote:
> On Tuesday, 5 January 2021 at 12:04:33 UTC+1, Stephen Davies wrote: 
> I really do not see the problem here. If I want the first element,
> I write X(X'First). Where's the problem? 

Long_String_Name(1..2) is much nicer than
Long_String_Name(Long_String_Name'First..Long_String_Name'First+1)

subtype Some_Range is Positive range 4..5;
Some_String(Some_Range) -- erroneous if Some_String'First/=1

I think the root of the problem is that Ada Strings almost always
start at 1 (note that the functions in Ada.Strings.Fixed all
return Strings that start at 1), so the cases when they don't
are at best annoying, and potentially erroneous.

I'd say my favourite solution proposed so far is the following,
which surely can't be that hard to implement:
Some_String'Slide(Some_Range)
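
A minimal sketch of the hazard, assuming a made-up eight-character Expiration_Date value. Show_Prefix is a hypothetical helper. The literal slice is wrapped in a handler so that the failure is visible instead of terminating the program.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Slice_Hazard is
   Expiration_Date : constant String := "20210115";

   procedure Show_Prefix (S : String) is
   begin
      --  Portable form: always relative to S'First.
      Put_Line ("Relative: " & S (S'First .. S'First + 1));

      --  Literal form: only correct when S'First = 1.
      begin
         Put_Line ("Literal:  " & S (1 .. 2));
      exception
         when Constraint_Error =>
            Put_Line ("Literal:  Constraint_Error");
      end;
   end Show_Prefix;
begin
   Show_Prefix (Expiration_Date);           --  bounds 1 .. 8
   Show_Prefix (Expiration_Date (5 .. 8));  --  bounds 5 .. 8
end Slice_Hazard;
```

When the literal range lies wholly outside the slice, as here, the error is at least reported. When it happens to fall inside the slice's bounds (for example 4 .. 5 applied to a slice 3 .. 8), the wrong characters are selected silently.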

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 10:24   ` Stephen Davies
@ 2021-01-15 11:41     ` J-P. Rosen
  2021-01-15 17:35       ` Stephen Davies
  2021-01-15 11:48     ` Jeffrey R. Carter
  2021-01-16  9:30     ` G.B.
  2 siblings, 1 reply; 66+ messages in thread
From: J-P. Rosen @ 2021-01-15 11:41 UTC (permalink / raw)


On 15/01/2021 at 11:24, Stephen Davies wrote:
> I'd say my favourite solution proposed so far is the following,
> which surely can't be that hard to implement:
> Some_String'Slide(Some_Range)
Well, it's not that difficult to write a "slide" function, here it is as 
an expression function (with test program):

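with Ada.Text_IO; use Ada.Text_IO;
procedure Test_Slide is
    --  Character & String yields a String whose lower bound is
    --  Positive'First, i.e. 1. Note: S must not be empty, since
    --  S (S'First) raises Constraint_Error for a null string.
    function Slide (S : String) return String is
           (S (S'First) & S (S'First + 1 .. S'Last));

    S1 : String (3 .. 5) := "abc";
    S2 : String := Slide (S1);
begin
    Put ("S1"); Put (S1'First'Image); Put (S1'Last'Image); New_Line;
    Put ("S2"); Put (S2'First'Image); Put (S2'Last'Image); New_Line;
end Test_Slide;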
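
As an alternative sketch (not the formulation above), the same effect can be had from a conversion to a constrained subtype, which slides the bounds and is also defined for an empty string. The names Slid, Source, Part and Empty are illustrative only.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Slide_By_Conversion is
   --  Converting to a constrained array subtype slides the bounds.
   --  No element is indexed, so an empty S is handled as well.
   function Slide (S : String) return String is
      subtype Slid is String (1 .. S'Length);
   begin
      return Slid (S);
   end Slide;

   Source : constant String := "Foobar";
   Part   : constant String := Slide (Source (4 .. 6));
   Empty  : constant String := Slide (Source (4 .. 3));
begin
   --  Prints "bar 1 3".
   Put_Line (Part & Integer'Image (Part'First) & Integer'Image (Part'Last));
   --  Prints " 1 0": the null string now has bounds 1 .. 0.
   Put_Line (Integer'Image (Empty'First) & Integer'Image (Empty'Last));
end Slide_By_Conversion;
```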


-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 10:24   ` Stephen Davies
  2021-01-15 11:41     ` J-P. Rosen
@ 2021-01-15 11:48     ` Jeffrey R. Carter
  2021-01-15 13:34       ` Dmitry A. Kazakov
                         ` (2 more replies)
  2021-01-16  9:30     ` G.B.
  2 siblings, 3 replies; 66+ messages in thread
From: Jeffrey R. Carter @ 2021-01-15 11:48 UTC (permalink / raw)

On 1/15/21 11:24 AM, Stephen Davies wrote:
> 
> I think the root of the problem is that Ada Strings almost always
> start at 1 (note that the functions in Ada.Strings.Fixed all
> return Strings that start at 1), so the cases when they don't
> are at best annoying, and potentially erroneous.

There are many cases where having String values with a lower bound other than 1 
is more convenient, clearer, and less error prone than if all String values have 
a lower bound of 1. For example

loop
    exit when End_Of_File;

    declare
       Line : constant String := Get_Line;
    begin
       Idx := 0;

       loop
          Idx := Index (Line (Idx + 1 .. Line'Last), Pattern);

          exit when Idx = 0;

          Put_Line (Item => Idx'Image);
       end loop;
    end;
end loop;

where Index is Ada.Strings.Fixed.Index. Even without comments and descriptive 
loop and block names, this is reasonably clear.

Compare that to a language where the slice slides to have a lower bound of 1 
(because Index takes a String, which always has a lower bound of 1), and you'll 
see that it is more complex, less clear, and has more opportunities for error 
than current Ada.
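
For readers who want to run the idea, here is a self-contained sketch of the same loop over a fixed string instead of input lines. The Source and Pattern values are made up for the example.

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed; use Ada.Strings.Fixed;

procedure Find_All is
   Source  : constant String := "abracadabra";
   Pattern : constant String := "abra";
   Idx     : Natural := 0;
begin
   loop
      --  The slice keeps Source's index values, so the result of Index
      --  is directly a position in Source; no offset arithmetic needed.
      Idx := Index (Source (Idx + 1 .. Source'Last), Pattern);

      exit when Idx = 0;

      Ada.Text_IO.Put_Line (Natural'Image (Idx));
   end loop;
end Find_All;
```

This prints 1 and 8, the two positions of "abra" in the source string. With slices that slid to a lower bound of 1, each result would have to be re-based by the number of characters already skipped.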

A string, being a sequence, should usually have a lower bound of 1, but a decent 
language needs to also allow string values with other lower bounds. Maybe 
something like

type String_Base is array (Positive range <>) of Character;
subtype String is String_Base (Positive range 1 .. <>);

Slices would be String_Base, not String, and Index would take String_Base.

-- 
Jeff Carter
"[I]t is more important to make the purpose
of the code unmistakable than to display
virtuosity. Even storage requirements and
execution time are unimportant by
comparison ..."
Elements of Programming Style
184

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 11:48     ` Jeffrey R. Carter
@ 2021-01-15 13:34       ` Dmitry A. Kazakov
  2021-01-15 13:56       ` Stephen Davies
  2021-01-15 14:00       ` Stephen Davies
  2 siblings, 0 replies; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-15 13:34 UTC (permalink / raw)


On 2021-01-15 12:48, Jeffrey R. Carter wrote:

> A string, being a sequence, should usually have a lower bound of 1, but 
> a decent language needs to also allow string values with other lower 
> bounds. Maybe something like
> 
> type String_Base is array (Positive range <>) of Character;
> subtype String is String_Base (Positive range 1 .. <>);
> 
> Slices would be String_Base, not String, and Index would take String_Base.

    S     : String_Base (1..1024);
    First : Integer := 1;
    Last  : Integer;
begin
    loop
       Get_Line (S (First..S'Last), Last);
       First := Last + 1;
       exit when ...;
    end loop;

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 11:48     ` Jeffrey R. Carter
  2021-01-15 13:34       ` Dmitry A. Kazakov
@ 2021-01-15 13:56       ` Stephen Davies
  2021-01-15 15:12         ` Jeffrey R. Carter
  2021-01-15 14:00       ` Stephen Davies
  2 siblings, 1 reply; 66+ messages in thread
From: Stephen Davies @ 2021-01-15 13:56 UTC (permalink / raw)


On Friday, 15 January 2021 at 11:48:27 UTC, Jeffrey R. Carter wrote:
> type String_Base is array (Positive range <>) of Character; 
> subtype String is String_Base (Positive range 1 .. <>); 

I wish it had been this way since the beginning. That way, in those
rare instances where code is really using the variable lower-bound,
the use of String_Base would make the intention clear. Alas, making that
change now would break that code.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 11:48     ` Jeffrey R. Carter
  2021-01-15 13:34       ` Dmitry A. Kazakov
  2021-01-15 13:56       ` Stephen Davies
@ 2021-01-15 14:00       ` Stephen Davies
  2 siblings, 0 replies; 66+ messages in thread
From: Stephen Davies @ 2021-01-15 14:00 UTC (permalink / raw)


On Friday, 15 January 2021 at 11:48:27 UTC, Jeffrey R. Carter wrote:
> type String_Base is array (Positive range <>) of Character; 
> subtype String is String_Base (Positive range 1 .. <>); 

I wish it had been this way since the beginning. That way, in those
rare instances where code is really using the variable lower-bound,
the use of String_Base would make the intention clear. Alas, adopting
this now would break that code.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 13:56       ` Stephen Davies
@ 2021-01-15 15:12         ` Jeffrey R. Carter
  2021-01-15 17:22           ` Stephen Davies
  0 siblings, 1 reply; 66+ messages in thread
From: Jeffrey R. Carter @ 2021-01-15 15:12 UTC (permalink / raw)


On 1/15/21 2:56 PM, Stephen Davies wrote:
> On Friday, 15 January 2021 at 11:48:27 UTC, Jeffrey R. Carter wrote:
>> type String_Base is array (Positive range <>) of Character;
>> subtype String is String_Base (Positive range 1 .. <>);
> 
> I wish it had been this way since the beginning. That way, in those
> rare instances where code is really using the variable lower-bound,
> the use of String_Base would make the intention clear. Alas, making that
> change now would break that code.

We have that now, with the substitutions

String_Base => String
String      => type S1 (Length : Natural) is record
                   Value : String (1 .. Length);
                end record;
             or
                subtype S1 is String with Dynamic_Predicate => S1'First = 1;
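
A small sketch of how the two substitutions behave, assuming an Ada 2012 compiler. String_1, Slid_String and To_Text are hypothetical names. The membership tests are used because they evaluate the predicate regardless of the assertion policy in force.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Lower_Bound_One is
   --  Predicate version: it checks the lower bound but does not slide.
   subtype String_1 is String
     with Dynamic_Predicate => String_1'First = 1;

   --  Record version: the component always has bounds 1 .. Length.
   type Slid_String (Length : Natural) is record
      Value : String (1 .. Length);
   end record;

   --  Hypothetical conversion helper; the aggregate slides S into Value.
   function To_Text (S : String) return Slid_String is
     ((Length => S'Length, Value => S));

   Source : constant String      := "Foobar";
   T      : constant Slid_String := To_Text (Source (4 .. 6));
begin
   Put_Line (Boolean'Image (Source in String_1));           --  TRUE
   Put_Line (Boolean'Image (Source (4 .. 6) in String_1));  --  FALSE
   Put_Line (T.Value & Integer'Image (T.Value'First));      --  bar 1
end Lower_Bound_One;
```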

-- 
Jeff Carter
"[I]t is more important to make the purpose
of the code unmistakable than to display
virtuosity. Even storage requirements and
execution time are unimportant by
comparison ..."
Elements of Programming Style
184

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 15:12         ` Jeffrey R. Carter
@ 2021-01-15 17:22           ` Stephen Davies
  2021-01-15 21:10             ` Jeffrey R. Carter
  0 siblings, 1 reply; 66+ messages in thread
From: Stephen Davies @ 2021-01-15 17:22 UTC (permalink / raw)


On Friday, 15 January 2021 at 15:12:39 UTC, Jeffrey R. Carter wrote:
> subtype S1 is String with Dynamic_Predicate => S1'First = 1;
Like I said before, I want Sliding, not bounds checking. I guess
most Usenet discussions eventually end up going around in circles.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 11:41     ` J-P. Rosen
@ 2021-01-15 17:35       ` Stephen Davies
  2021-01-15 19:36         ` Egil H H
                           ` (2 more replies)
  0 siblings, 3 replies; 66+ messages in thread
From: Stephen Davies @ 2021-01-15 17:35 UTC (permalink / raw)


On Friday, 15 January 2021 at 11:41:24 UTC, J-P. Rosen wrote:
> function Slide (S : String) return String is 
> (S(S'First) & S (S'First+1 .. S'Last)); 
To me, this is a fundamental enough operation that, in the absence of
being able to specify a subtype where it happens automatically, it at
least deserves to be an attribute. But then again, I also think that
the attribute Trim_Image should be added, and I know you disagree
with me on that too :-)

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 17:35       ` Stephen Davies
@ 2021-01-15 19:36         ` Egil H H
  2021-01-16 12:57           ` Stephen Davies
  2021-01-17 14:10         ` Stephen Davies
  2021-01-19  6:13         ` Gautier write-only address
  2 siblings, 1 reply; 66+ messages in thread
From: Egil H H @ 2021-01-15 19:36 UTC (permalink / raw)


On Friday, January 15, 2021 at 6:35:54 PM UTC+1, Stephen Davies wrote:
> On Friday, 15 January 2021 at 11:41:24 UTC, J-P. Rosen wrote: 
> > function Slide (S : String) return String is 
> > (S(S'First) & S (S'First+1 .. S'Last));
> To me, this is a fundamental enough operation that, in the absence of 
> being able to specify a subtype where it happens automatically, it at 
> least deserves to be an attribute. But then again, I also think that 
> the attribute Trim_Image should be added, and I know you disagree 
> with me on that too :-)

Not entirely automatically, but you can do something like this:

...

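   procedure Foo (Item : in out String) is
      --  Constrained subtype with the same length as Item but lower bound 1.
      subtype Mono_String is String (1 .. Item'Length);

      --  The actual parameter is slid to 1 .. Item'Length on the call.
      procedure Inner_Foo (Item : in out Mono_String) is
      begin
         Ada.Text_IO.Put_Line (Positive'Image (Item'First) & Positive'Image (Item'Last));
         Item (1) := 'C';
      end Inner_Foo;

   begin
      Inner_Foo (Item);
   end Foo;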

 ...

   Foobar : String := "Foobar";
begin
   Foo(Foobar(4..5));
   Ada.Text_IO.Put_Line(Foobar);
...
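
For completeness, a self-contained sketch of the same technique that can be compiled as is. Mark_First, Inner and Slid are illustrative names, and the length guard is an addition so that an empty actual does not raise Constraint_Error.

```ada
with Ada.Text_IO;

procedure Mono_Demo is
   procedure Mark_First (Item : in out String) is
      subtype Mono_String is String (1 .. Item'Length);

      --  Passing Item to a constrained formal slides its bounds, and
      --  because the mode is "in out" the caller's data is updated.
      procedure Inner (Slid : in out Mono_String) is
      begin
         if Slid'Length > 0 then
            Slid (1) := 'C';
         end if;
      end Inner;
   begin
      Inner (Item);
   end Mark_First;

   Foobar : String := "Foobar";
begin
   Mark_First (Foobar (4 .. 5));
   Ada.Text_IO.Put_Line (Foobar);  --  prints "FooCar"
end Mono_Demo;
```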

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 17:22           ` Stephen Davies
@ 2021-01-15 21:10             ` Jeffrey R. Carter
  0 siblings, 0 replies; 66+ messages in thread
From: Jeffrey R. Carter @ 2021-01-15 21:10 UTC (permalink / raw)


On 1/15/21 6:22 PM, Stephen Davies wrote:
> On Friday, 15 January 2021 at 15:12:39 UTC, Jeffrey R. Carter wrote:
>> subtype S1 is String with Dynamic_Predicate => S1'First = 1;
> Like I said before, I want Sliding, not bounds checking. I guess
> most Usenet discussion eventually end up going around in circles.

Then you would probably prefer the record version. Neither is perfect, but both, 
with appropriate conversion functions, give you the effect you want with current 
Ada.

-- 
Jeff Carter
"[I]t is more important to make the purpose
of the code unmistakable than to display
virtuosity. Even storage requirements and
execution time are unimportant by
comparison ..."
Elements of Programming Style
184

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 10:24   ` Stephen Davies
  2021-01-15 11:41     ` J-P. Rosen
  2021-01-15 11:48     ` Jeffrey R. Carter
@ 2021-01-16  9:30     ` G.B.
  2021-01-16 13:13       ` Stephen Davies
  2 siblings, 1 reply; 66+ messages in thread
From: G.B. @ 2021-01-16  9:30 UTC (permalink / raw)


On 15.01.21 11:24, Stephen Davies wrote:
> On Thursday, 14 January 2021 at 11:38:29 UTC, AdaMagica wrote:
>> On Tuesday, 5 January 2021 at 12:04:33 UTC+1, Stephen Davies wrote:
>> I really do not see the problem here. If I want the first element,
>> I write X(X'First). Where's the problem?
> 
> Long_String_Name(1..2) is much nicer than
> Long_String_Name(Long_String_Name'First..Long_String_Name'First+1)

Avoid literals for indexing.

Of course, that makes them all the more popular.
"On which side are you on 1 vs 0 for The First?"
(Discussion starts...)

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 19:36         ` Egil H H
@ 2021-01-16 12:57           ` Stephen Davies
  0 siblings, 0 replies; 66+ messages in thread
From: Stephen Davies @ 2021-01-16 12:57 UTC (permalink / raw)


On Friday, 15 January 2021 at 19:36:03 UTC, ehh.p...@gmail.com wrote:
> procedure Foo(Item :in out String) is 
> subtype Mono_String is String (1..Item'Length); 
> procedure Inner_Foo(Item : in out Mono_String) is 
> ...
That works, but I'd rather the language directly supported
something that achieved the same thing.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-16  9:30     ` G.B.
@ 2021-01-16 13:13       ` Stephen Davies
  0 siblings, 0 replies; 66+ messages in thread
From: Stephen Davies @ 2021-01-16 13:13 UTC (permalink / raw)


On Saturday, 16 January 2021 at 09:30:19 UTC, G.B. wrote:
> On 15.01.21 11:24, Stephen Davies wrote: 
> > Long_String_Name(1..2) is much nicer than 
> > Long_String_Name(Long_String_Name'First..Long_String_Name'First+1)
> Avoid literals for indexing. 
My very next example used a subtype for indexing,
which you omitted from your reply.
It makes me wonder if people like X(X'First) because it
gives the comfort of not using literals, even though it's
basically equivalent to X[0] in C, Python, etc.

> "On which side are you on 1 vs 0 for The First?" 
I like that Ada gives the choice of "Positive range <>" or
"Natural range <>".

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 17:35       ` Stephen Davies
  2021-01-15 19:36         ` Egil H H
@ 2021-01-17 14:10         ` Stephen Davies
  2021-01-19  5:48           ` Randy Brukardt
  2021-01-19  6:13         ` Gautier write-only address
  2 siblings, 1 reply; 66+ messages in thread
From: Stephen Davies @ 2021-01-17 14:10 UTC (permalink / raw)


On Friday, 15 January 2021 at 17:35:54 UTC, Stephen Davies wrote:
> On Friday, 15 January 2021 at 11:41:24 UTC, J-P. Rosen wrote: 
> > function Slide (S : String) return String is 
> > (S(S'First) & S (S'First+1 .. S'Last));
> To me, this is a fundamental enough operation that, in the absence of 
> being able to specify a subtype where it happens automatically, it at 
> least deserves to be an attribute. 
Also, a Slide function does not work for "out" and "in out" parameters.
Admittedly, ehh.p...'s workaround does solve this, but I would prefer
a proper solution in the language, e.g. a Slide attribute that acts as a
view conversion.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-17 14:10         ` Stephen Davies
@ 2021-01-19  5:48           ` Randy Brukardt
  0 siblings, 0 replies; 66+ messages in thread
From: Randy Brukardt @ 2021-01-19  5:48 UTC (permalink / raw)

"Stephen Davies" <joviangm@gmail.com> wrote in message 
news:1ea70290-a1a8-4a63-8f38-922e966084c9n@googlegroups.com...
> On Friday, 15 January 2021 at 17:35:54 UTC, Stephen Davies wrote:
>> On Friday, 15 January 2021 at 11:41:24 UTC, J-P. Rosen wrote:
>> > function Slide (S : String) return String is
>> > (S(S'First) & S (S'First+1 .. S'Last));
>> To me, this is a fundamental enough operation that, in the absence of
>> being able to specify a subtype where it happens automatically, it at
>> least deserves to be an attribute.
> Also, a Slide function does not work for "out" and "in out" parameters.
> Admittedly, ehh.p...'s workaround does solve this, but I would prefer
> a proper solution in the language, e.g. a Slide attribute that acts as a
> view conversion.

Thank god. Slices passed as in out parameters are the bane of the 
compiler writer's existence, and outside of types like String, have a very 
expensive implementation. On common machines like the x86, copying an 
arbitrary bit string from one location to another is not an easy operation 
to perform. (Remember, one can slice packed arrays, arrays of controlled 
objects, and other nasty cases. And with the sort of interface others here 
are proposing, you'd have to do it for various discontiguous 
representations, too.)

This way leads to madness -- at least of compiler implementers. ;-)

                             Randy.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-15 17:35       ` Stephen Davies
  2021-01-15 19:36         ` Egil H H
  2021-01-17 14:10         ` Stephen Davies
@ 2021-01-19  6:13         ` Gautier write-only address
  2 siblings, 0 replies; 66+ messages in thread
From: Gautier write-only address @ 2021-01-19  6:13 UTC (permalink / raw)


On Friday, January 15, 2021 at 6:35:54 PM UTC+1, Stephen Davies wrote:
> But then again, I also think that 
> the attribute Trim_Image should be added, and I know you disagree 
> with me on that too :-)

Just use the Image function here:
https://github.com/zertovitch/hac/blob/master/src/hal.ads#L161
...and don't worry again anymore :-)

There is also a user-friendly Image for floating-points, and both are used for the concatenation "&" operator in the same package, so usually you don't need to type " 'Trim_Image" or "Image" at all.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-14 15:41         ` Dmitry A. Kazakov
@ 2021-01-19 21:02           ` G.B.
  2021-01-19 22:27             ` Dmitry A. Kazakov
  0 siblings, 1 reply; 66+ messages in thread
From: G.B. @ 2021-01-19 21:02 UTC (permalink / raw)

On 14.01.21 16:41, Dmitry A. Kazakov wrote:
>
> That would be a sequential access interface. A common sequence as in mathematics has nth-element random access on top of it.

Mathematical sequences are not real. Their description is:

The finite variants are tuples. Component access is "at random".
The infinite variants tend to be construed so that randomness is a useful fiction;
a program can access an element if successors are produced one by one.
Order exists only after the fact.
The other sequences cannot be computed for the most part.

So, therefore, with regard to what should characterize sequences in
programming, I find sequential access to be the defining feature.

function Whop (S : Sequence) return Sequence with Impure;

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-19 21:02           ` G.B.
@ 2021-01-19 22:27             ` Dmitry A. Kazakov
  2021-01-20 20:10               ` G.B.
  0 siblings, 1 reply; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-19 22:27 UTC (permalink / raw)

On 2021-01-19 22:02, G.B. wrote:
> On 14.01.21 16:41, Dmitry A. Kazakov wrote:
>>
>> That would be a sequential access interface. A common sequence as in 
>> mathematics has nth-element random access on top of it.
> 
> Mathematical sequences are not real. Their description is:

Nope, only mathematics is real, the rest is fiction! (:-))

> The finite variants are tuples. Component access is "at random".
> The infinite variants tend to be construed so that randomness is a 
> useful fiction;

The term is countably infinite. Note the word countable: it guarantees 
that each element has an integer number, ergo it can be randomly accessed 
by this number.

> a program can access an element if successors are produced one by one.
> Order exists only after the fact.

Order exists per definition of sequential access. More constrained 
interfaces could be sequences that cannot be enumerated backwards. E.g. 
there is Sequence.Next, but no Sequence.Previous, e.g. a queue end's 
interface.

> The other sequences cannot be computed for the most part.
> 
> So, therefore, with regard to what should characterize sequences in
> programming, I find sequential access to be the defining feature.

Nope. That is no feature, that is an *interface* with certain 
operations, like Next and Previous. This interface inherits the one-way 
interface. Array interface inherits both and defines further operations.

This is the standard procedure how mathematical structures interact, the 
language types must follow the principle. There is no other way.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-19 22:27             ` Dmitry A. Kazakov
@ 2021-01-20 20:10               ` G.B.
  2021-01-20 20:25                 ` Dmitry A. Kazakov
  0 siblings, 1 reply; 66+ messages in thread
From: G.B. @ 2021-01-20 20:10 UTC (permalink / raw)


On 19.01.21 23:27, Dmitry A. Kazakov wrote:

>> The infinite variants tend to be construed so that randomness is a useful fiction;
> 
> The term is countable infinite. Note the word countable, that guaranties that each element has an integer number, ergo randomly accessed by this number.

I'll happily leave it at that except for one thing. Let's count and access randomly:

    Nth_Prime (Exabyte'Last / 2) = ?

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lower bounds of Strings
  2021-01-20 20:10               ` G.B.
@ 2021-01-20 20:25                 ` Dmitry A. Kazakov
  0 siblings, 0 replies; 66+ messages in thread
From: Dmitry A. Kazakov @ 2021-01-20 20:25 UTC (permalink / raw)


On 2021-01-20 21:10, G.B. wrote:
> On 19.01.21 23:27, Dmitry A. Kazakov wrote:
> 
>>> The infinite variants tend to be construed so that randomness is a 
>>> useful fiction;
>>
>> The term is countable infinite. Note the word countable, that 
>> guaranties that each element has an integer number, ergo randomly 
>> accessed by this number.
> 
> I'll happily leave it at that except for one thing. Let's count and 
> access randomly:
> 
>     Nth_Prime (Exabyte'Last / 2) = ?

You didn't specify the type of Nth_Prime. But in any case, where is the 
problem?

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 66+ messages in thread
