comp.lang.ada
* Problem with Unbounded Strings
@ 2015-12-04 15:49 Laurent
  2015-12-04 16:04 ` Dmitry A. Kazakov
  2015-12-04 18:08 ` Jeffrey R. Carter
  0 siblings, 2 replies; 22+ messages in thread
From: Laurent @ 2015-12-04 15:49 UTC (permalink / raw)


Hi

I have a problem with unbounded strings. When I append a String(1..10) to my Buffer which is an unbounded string it works. 

So my Buffer has the length 10. 

If I use this buffer of length 10 and append again a String(1..10) the length of the buffer should increase to 20 and I should have the content of both strings?

Somehow in the procedure I have written that doesn't happen. The Buffer seems to get fixed to the length of the first string and further appending doesn't change it. The new string gets appended but truncated at its beginning, and the buffer at its end.

I have no idea why that happens. I suppose something silly as always.

There is no difference between using append or "&". The behaviour is the same.

If someone could enlighten me I would be happy. 

Thanks

Laurent

   package U_B renames Ada.Strings.Unbounded;
   subtype Ub_S is U_B.Unbounded_String;

  function "+" (Right : U_B.Unbounded_String) return String
                 renames U_B.To_String;

   function "+" (Right : String) return U_B.Unbounded_String
   is (U_B.To_Unbounded_String (Right));

---------------

   procedure Read_Message (File         : in Ada.Text_IO.File_Type;
                           Buffer       : out Ub_S;
                           Has_Finished : out Boolean)
   is
   begin -- Read_Message
      Has_Finished := False;
      Buffer := +"";

      while not Ada.Text_IO.End_Of_File (Log_File) loop
         declare
            S     : String := Ada.Text_IO.Get_Line (File => File);
            Line : String (1 .. S'Length - TStamp);
            -- removes this useless timestamp

         begin -- declare

            Line (1 .. Line'Last) := S (38 .. S'Last);
            
            if Line (1 .. 5) = EOT then
               -- we have reached the end of the message so we can exit the loop
               -- and hand over the Buffer for treatment
               -- Has_Finished has to be set to True

               Has_Finished := True;
               exit; -- end of message

            elsif Line (1 .. 5) = ETX
              or Line (1 .. 5) = ACK
              or Line (1 .. 5) = ENQ
              or Line (1 .. 5) = STX
              or Line (1 .. 4) = GS
            -- if Line contains ETX, ACK, ENQ, STX or GS commands,
            -- we don't need them, so just ignore
            then
               null;

            elsif Line (1 .. 4) = RS then
               -- so Line contains RS command which indicates the begin of
               -- a message so we have to take care of those lines

                 U_B.Append (Source   => Buffer,
                            New_Item => Line (5 .. Line'Last));

               -- Buffer := U_B."&" (Buffer,Line ( 5 .. Line'Last));

            else
               -- Line is something else, like some info about the success of
               -- the backup engine. Who cares?
               null;
            end if;
         end;
      end loop;
   end Read_Message;


* Re: Problem with Unbounded Strings
  2015-12-04 15:49 Problem with Unbounded Strings Laurent
@ 2015-12-04 16:04 ` Dmitry A. Kazakov
  2015-12-04 16:53   ` Laurent
  2015-12-04 18:08 ` Jeffrey R. Carter
  1 sibling, 1 reply; 22+ messages in thread
From: Dmitry A. Kazakov @ 2015-12-04 16:04 UTC (permalink / raw)


On Fri, 4 Dec 2015 07:49:28 -0800 (PST), Laurent wrote:

> I have a problem with unbounded strings. When I append a String(1..10) to
> my Buffer which is an unbounded string it works. 

You should not use Unbounded_String for I/O and parsing.
 
> procedure Read_Message (File         : in Ada.Text_IO.File_Type;
>                                            Buffer       : out Ub_S;
>                                       Has_Finished : out Boolean)

Pass a String buffer inside. Read into that buffer (Get_Line procedure, not
function). Handle buffer overrun (Last = Buffer'Last) as a protocol error. 

Don't make any dynamic allocations inside. If you have to return the
payload data, do that through the indices to the beginning and end of the
data in the buffer. Better yet to call a semantic callback with the buffer
slice.
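A minimal sketch of the fixed-buffer approach described above, not code from this thread. It assumes the Text_IO Get_Line procedure, a 256-character limit, and the hypothetical names Protocol_Error, Max_Line and Handle; "Test.log" stands in for the real input file.

```ada
with Ada.Text_IO;

procedure Fixed_Buffer_Read is

   Protocol_Error : exception;        --  hypothetical name
   Max_Line       : constant := 256;  --  assumed upper bound on line length

   File   : Ada.Text_IO.File_Type;
   Buffer : String (1 .. Max_Line);
   Last   : Natural;

   --  Stands in for the "semantic callback": it is handed a slice of the
   --  raw buffer, so no explicit copy or allocation is made here.
   procedure Handle (Line : String) is
   begin
      Ada.Text_IO.Put_Line
        ("Got" & Integer'Image (Line'Length) & " characters");
   end Handle;

begin
   Ada.Text_IO.Open (File => File,
                     Mode => Ada.Text_IO.In_File,
                     Name => "Test.log");

   while not Ada.Text_IO.End_Of_File (File) loop
      Ada.Text_IO.Get_Line (File => File, Item => Buffer, Last => Last);

      --  A completely filled buffer is treated as an overrun, as suggested
      --  above; a line of exactly Max_Line characters is rejected too.
      if Last = Buffer'Last then
         Ada.Text_IO.Close (File);
         raise Protocol_Error with "line longer than buffer";
      end if;

      Handle (Buffer (1 .. Last));
   end loop;

   Ada.Text_IO.Close (File);
end Fixed_Buffer_Read;
```

Whether rejecting an over-long line is the right reaction depends on the input; the point of the sketch is only that the line length is bounded up front.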

>          begin -- declare
>             Line (1 .. Line'Last) := S (38 .. S'Last);

Don't copy anything, use indices in the raw buffer or pass a slice
downward. The only case where copying comes in question is when decoding
data.
             
>             elsif Line (1 .. 5) = ETX
>               or Line (1 .. 5) = ACK
>               or Line (1 .. 5) = ENQ
>               or Line (1 .. 5) = STX
>               or Line (1 .. 4) = GS

Always use a case statement instead of cascaded IFs. (I have no idea how
ETX could be 5 characters long...)

P.S. Unbounded_String Append certainly works OK. You just don't use it.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



* Re: Problem with Unbounded Strings
  2015-12-04 16:04 ` Dmitry A. Kazakov
@ 2015-12-04 16:53   ` Laurent
  2015-12-04 18:00     ` Niklas Holsti
  2015-12-04 20:34     ` Dmitry A. Kazakov
  0 siblings, 2 replies; 22+ messages in thread
From: Laurent @ 2015-12-04 16:53 UTC (permalink / raw)


I don't use Append? Then what is this doing? Where U_B = ada.strings.unbound...

                 U_B.Append (Source   => Buffer, 
                            New_Item => Line (5 .. Line'Last)); 

ETX, like the other commands, is 5 characters long because in the text file I read from it appears as <ETX>. Too lazy to strip the <>. So ETX : constant String := "<ETX>";...

I tried a case statement at the beginning but got an error I don't remember now. Something about needing to be a real variable?

For the rest I have to wait until I'm home again and see if I can figure out what you tried to explain.

Thanks



* Re: Problem with Unbounded Strings
  2015-12-04 16:53   ` Laurent
@ 2015-12-04 18:00     ` Niklas Holsti
  2015-12-04 21:05       ` Laurent
  2015-12-04 20:34     ` Dmitry A. Kazakov
  1 sibling, 1 reply; 22+ messages in thread
From: Niklas Holsti @ 2015-12-04 18:00 UTC (permalink / raw)


On 15-12-04 18:53 , Laurent wrote:
> I don't use append? What is this then doing? Where U_B = ada.strings.unbound...
>
>                   U_B.Append (Source   => Buffer,
>                              New_Item => Line (5 .. Line'Last));

Dmitry is (as usual) trying to teach (or preach) his own style of 
coding, and is not really looking at your problem.

Meanwhile, are you sure that the statement

    while not Ada.Text_IO.End_Of_File (Log_File) loop

should be using "Log_File", and not the "File" parameter?

(To be honest, I did not detect this error until I tried to make a 
compilable version of your code.)

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .



* Re: Problem with Unbounded Strings
  2015-12-04 15:49 Problem with Unbounded Strings Laurent
  2015-12-04 16:04 ` Dmitry A. Kazakov
@ 2015-12-04 18:08 ` Jeffrey R. Carter
  2015-12-04 21:21   ` Laurent
  1 sibling, 1 reply; 22+ messages in thread
From: Jeffrey R. Carter @ 2015-12-04 18:08 UTC (permalink / raw)


I would ignore Kazakov's comments. While he prefers to avoid unbounded strings,
many others don't, so if the characteristics of Unbounded_String are suitable
for your project there is no reason not to use them.

You can't use a case statement because you're comparing strings, and case
statements can only be used for discrete types.
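A minimal sketch of one way around that restriction, under stated assumptions: the line prefix is first mapped onto an enumeration, and the case statement then works on the enumeration value. The type Marker and the functions Starts_With and Classify are hypothetical names, not anything from the original code.

```ada
with Ada.Text_IO;

procedure Case_Demo is

   --  Hypothetical enumeration of the markers seen in the log; Other
   --  covers every line that starts with none of them.
   type Marker is (ENQ, ACK, STX, ETX, EOT, RS, GS, Other);

   --  Returns True when Line begins with Prefix.
   function Starts_With (Line, Prefix : String) return Boolean is
   begin
      return Line'Length >= Prefix'Length
        and then Line (Line'First .. Line'First + Prefix'Length - 1) = Prefix;
   end Starts_With;

   --  Maps the start of a line onto a discrete value usable in a case.
   --  Marker'Image (RS) is "RS", so the text tested for is "<RS>".
   function Classify (Line : String) return Marker is
   begin
      for M in ENQ .. GS loop
         if Starts_With (Line, "<" & Marker'Image (M) & ">") then
            return M;
         end if;
      end loop;
      return Other;
   end Classify;

begin
   case Classify ("<RS>mtmpr|pi2015120412345|") is
      when EOT =>
         Ada.Text_IO.Put_Line ("end of message");
      when RS =>
         Ada.Text_IO.Put_Line ("payload line");
      when ENQ | ACK | STX | ETX | GS =>
         Ada.Text_IO.Put_Line ("ignored");
      when Other =>
         Ada.Text_IO.Put_Line ("not a marker");
   end case;
end Case_Demo;
```

The string comparisons still happen, only inside Classify; what the case statement adds is the compiler's check that every Marker value is handled.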

On 12/04/2015 08:49 AM, Laurent wrote:
> procedure Read_Message (File         : in Ada.Text_IO.File_Type;
>  ...
>       while not Ada.Text_IO.End_Of_File (Log_File) loop

What is Log_File?

>                  U_B.Append (Source   => Buffer,
>                             New_Item => Line (5 .. Line'Last));

I see nothing wrong with this use of append. If it's possible that 5 > Line'Last
then you would be appending a null string, which would not change the value of
Buffer. It would be nice if you could reduce your example to eliminate the
irrelevant parts and provide some input and corresponding output (and expected
output) that demonstrates your problem. Otherwise I'm unable to offer any
advice. Certainly I've made extensive use of Unbounded_String and never
encountered any problem with Append, so you probably have a logic problem.

-- 
Jeff Carter
"Well, a gala day is enough for me. I don't think
I can handle any more."
Duck Soup
93



* Re: Problem with Unbounded Strings
  2015-12-04 16:53   ` Laurent
  2015-12-04 18:00     ` Niklas Holsti
@ 2015-12-04 20:34     ` Dmitry A. Kazakov
  2015-12-04 21:43       ` Laurent
  1 sibling, 1 reply; 22+ messages in thread
From: Dmitry A. Kazakov @ 2015-12-04 20:34 UTC (permalink / raw)


On Fri, 4 Dec 2015 08:53:54 -0800 (PST), in comp.lang.ada you wrote:

> I don't use append?

No. There is nothing that is not in the buffer read already. Why would you
append anything?

In some protocols payload data are accumulated from several packets.
Unbounded_String is not used for that either. You want to limit the size of
the accumulated data to prevent DoS attacks unless the protocol limits that
already, as most protocols do. Simply keep the buffer in the communication
object, that is all.

> What is this then doing? Where U_B = ada.strings.unbound...
> 
>                  U_B.Append (Source   => Buffer, 
>                             New_Item => Line (5 .. Line'Last)); 
> 
> ETX is as the other commands 5 chars long because in the txt file I read
> from, it is  used as <ETX> . To lazy to rewrite the <>. So ETX : String
> constant := "<ETX>";...

In such cases I usually match the current string prefix against a table
of tokens (table-driven recursive descent).

But if you want an explicit form it is like this (a recursive descent
parser):

   case Buffer (Index) is -- Index is the current position
      when '<' =>
         Index := Index + 1;
         case Buffer (Index) is
            when 'A' =>
               Index := Index + 1;
               ...
            when 'E' =>
               Index := Index + 1;
               ...
            when 'G' =>
               Index := Index + 1;
               ...
            when 'R' =>
               Index := Index + 1;
               ...
            when 'S' =>
               Index := Index + 1;
               ...
            when others =>
               raise Protocol_Error with
                     "ACK, ETX, ENQ, STX or GS is expected at" &
                     Integer'Image (Index);
         end case;
      when others =>
         raise Protocol_Error with
                "< is expected at" & Integer'Image (Index);
   end case;

The point is to visit each source character just once and prevent excessive
copying. That is also the reason why you practically never need, and should
not use, Unbounded_String for I/O. It won't make your life easier anyway.
You would have to convert it to String any time you use its contents,
as your code demonstrates. Without it the code will be cleaner, safer and
many times faster.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Problem with Unbounded Strings
  2015-12-04 18:00     ` Niklas Holsti
@ 2015-12-04 21:05       ` Laurent
  0 siblings, 0 replies; 22+ messages in thread
From: Laurent @ 2015-12-04 21:05 UTC (permalink / raw)


On Friday, 4 December 2015 19:00:46 UTC+1, Niklas Holsti  wrote:

> Meanwhile, are you sure that the statement
> 
>     while not Ada.Text_IO.End_Of_File (Log_File) loop
> 
> should be using "Log_File", and not the "File" parameter?
> 
> (To be honest, I did not detect this error until I tried to make a 
> compilable version of your code.)

Yes, indeed it should be File. I didn't get a compile error: Log_File is defined somewhere else, so the compiler didn't complain.



* Re: Problem with Unbounded Strings
  2015-12-04 18:08 ` Jeffrey R. Carter
@ 2015-12-04 21:21   ` Laurent
  2015-12-04 21:59     ` Simon Wright
                       ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Laurent @ 2015-12-04 21:21 UTC (permalink / raw)


On Friday, 4 December 2015 19:08:11 UTC+1, Jeffrey R. Carter  wrote:
> I would ignore Kazakov's comments. While he prefers to avoid unbounded strings,
> many others don't, so if the characteristics of Unbounded_String are suitable
> for your project there is no reason not to use them.
> 
> You can't use a case statement because you're comparing strings, and case
> statements can only be used for discrete types.
> 
> On 12/04/2015 08:49 AM, Laurent wrote:
> > procedure Read_Message (File         : in Ada.Text_IO.File_Type;
> >  ...
> >       while not Ada.Text_IO.End_Of_File (Log_File) loop
> 
> What is Log_File?
> 
> >                  U_B.Append (Source   => Buffer,
> >                             New_Item => Line (5 .. Line'Last));
> 
> I see nothing wrong with this use of append. If it's possible that 5 > Line'Last
> then you would be appending a null string, which would not change the value of
> Buffer. It would be nice if you could reduce your example to eliminate the
> irrelevant parts and provide some input and corresponding output (and expected
> output) that demonstrates your problem. Otherwise I'm unable to to offer any
> advice. Certainly I've made extensive use of Unbounded_String and never
> encountered any problem with Append, so you probably have a logic problem.

OK, I have reduced the whole thing to the minimum and still get the same behaviour, plus an additional glitch.

I have reduced the log file to the minimum too:

0123456789
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ

I expect my code to do this:

1st read: 0123456789
append to the buffer which is empty so content of buffer is: 0123456789

2nd read: abcdefghijklmnopqrstuvwxyz
append to buffer so it becomes: 0123456789abcdefghijklmnopqrstuvwxyz

3rd read: ABCDEFGHIJKLMNOPQRSTUVWXYZ
append to buffer so it would be: 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

But I get this:

1st read:
Line: 0123456789
Buffer after append: 0123456789

2nd read:
Line: abcdefghijklmnopqrstuvwxyz
abcdefghijklmnopqrstuvwxyz56789

3rd read:
Line: ABCDEFGHIJKLMNOPQRSTUVWXYZ
ABCDEFGHIJKLMNOPQRSTUVWXYZ56789

I first thought that I could have swapped Source and New_Item. But the relevant Appends in Ada.Strings.Unbounded are one which takes two Unbounded_Strings and one where Source is an Unbounded_String and New_Item a String. So I can't swap them; the compiler would complain. But the output looks a bit like they were swapped. So no idea.

And as a bonus, after the first loop iteration
Ada.Text_IO.Put_Line ("Buffer after append: " & (+Buffer)); no longer prints the "Buffer after append: " part. No idea why.

So here is the reduced code, which compiles.

with Ada.Text_IO;
with Ada.Strings.Unbounded;

procedure Test_UB is
   
   package U_B renames Ada.Strings.Unbounded;
   subtype Ub_S is U_B.Unbounded_String;

   function "+" (Right : U_B.Unbounded_String) return String
                 renames U_B.To_String;

   function "+" (Right : String) return U_B.Unbounded_String
   is (U_B.To_Unbounded_String (Right));

   Log_File    : Ada.Text_IO.File_Type;
   Message_Buf : Ub_S;

   ------------------
   -- Read_Message --
   ------------------
   procedure Read_Message (File         : in Ada.Text_IO.File_Type;
                           Buffer       : out Ub_S)
   is
   begin -- Read_Message

      while not Ada.Text_IO.End_Of_File (File) loop
         declare
            Line     : String := Ada.Text_IO.Get_Line (File => File);

         begin -- declare

            Ada.Text_IO.Put_Line ("Line: " & Line);

            U_B.Append (Source   => Buffer,
                        New_Item => Line);

            Ada.Text_IO.Put_Line ("Buffer after append: " & (+Buffer));
            Ada.Text_IO.New_Line;
            
         end;
      end loop;
   end Read_Message;

begin -- Main

   Ada.Text_IO.Open (File => Log_File,
                     Mode => Ada.Text_IO.In_File, Name => "Test.log");

   Read_Message (File         => Log_File,
                 Buffer       => Message_Buf);

   Ada.Text_IO.Close (File => Log_File);

end Test_UB;


* Re: Problem with Unbounded Strings
  2015-12-04 20:34     ` Dmitry A. Kazakov
@ 2015-12-04 21:43       ` Laurent
  2015-12-04 22:23         ` Dmitry A. Kazakov
  0 siblings, 1 reply; 22+ messages in thread
From: Laurent @ 2015-12-04 21:43 UTC (permalink / raw)


On Friday, 4 December 2015 21:34:59 UTC+1, Dmitry A. Kazakov  wrote:
> On Fri, 4 Dec 2015 08:53:54 -0800 (PST), in comp.lang.ada you wrote:
> 
> > I don't use append?
> 
> No. There is nothing that is not in the buffer read already. Why would you
> append anything?
> 
> In some protocols payload data are accumulated from several packets.
> Unbounded_String is not used for that either. You want to limit the size of
> the accumulated data to prevent DoS attacks unless the protocol limits that
> already, as most protocols do. Simply keep the buffer in the communication
> object, that is all.

DoS attacks? Why should I care in this case? It is more of an exercise, for fun. If the guy from the company which installed the communication apps is right, then I won't need to pull this information out of the log files. In case he is wrong and the information hasn't been stored somewhere on the server, the devs can write their own thing. I don't get paid for this. It is still better than watching the crap on TV.

> The point is to visit each source character just once and prevent excessive
> copying. That is also the reason why you practically never need or should
> not use Unbounded_String for I/O. It won't make your life easier anyway.
> You would have to convert it to String anytime you would use its contents
> as your code demonstrates. The code will be cleaner, safer and many times
> faster.

I don't have the necessary skills/experience to argue about this.

I use Unbounded_String because I don't know in advance how many lines long the message will be. So I collect everything in a buffer, appending the following lines so that it makes sense, and when the message is complete I feed it to the next procedure, which slices off the interesting parts and uses them to generate a message object which I store somewhere, or whatever, if I should ever get that far.

A short example from the log file. Every new line begins with the time stamp. The only things I want are the lines containing an <RS>, and what comes after it. Because the length of the lines is fixed to 121 chars, everything which is too long gets hard-wrapped. If I don't put everything together I will lose information, won't I?

[19/09/2015 00:05:39:526][INFO ]  <- <ENQ>
[19/09/2015 00:05:39:564][INFO ]  -> <ACK>
[19/09/2015 00:05:39:565][INFO ]  <- <STX>
[19/09/2015 00:05:39:565][INFO ]  <- <RS>mtmpr|pi2015120412345|pnTest_Name|pb08/08/1990|psF|si|ssURINES|stUrines spon
[19/09/2015 00:05:39:650][INFO ]  <- <RS>tanees|sp903775|pp903775|pa18/09/2015|s119/09/2015|s212:00|s319/09/2015|s412:04|
[19/09/2015 00:05:39:735][INFO ]  <- <RS>ci501509190001|ctAERO|sl100115|pl100115|pda|sxA|zz|
[19/09/2015 00:05:39:787][INFO ]  <- <GS>53
[19/09/2015 00:05:39:793][INFO ]  <- <ETX>
[19/09/2015 00:05:39:834][INFO ]  -> <ACK>
[19/09/2015 00:05:39:839][INFO ]  <- <EOT>
[19/09/2015 01:37:43:096][WARN ]  Handle event: TIMEOUT_EVENT HostResponseTimeout (3000 milliseconds)
[19/09/2015 01:37:43:096][INFO ]  <- <ENQ>
[19/09/2015 01:37:46:109][WARN ]  Handle event: TIMEOUT_EVENT HostResponseTimeout (3000 milliseconds)
[19/09/2015 01:37:46:109][ERROR]  Report Enq failure
[19/09/2015 01:37:46:109][ERROR]  Signal Enq failure
[19/09/2015 01:37:46:109][INFO ]  <- <ENQ>
[19/09/2015 01:37:49:119][WARN ]  Handle event: TIMEOUT_EVENT HostResponseTimeout (3000 milliseconds)
[19/09/2015 01:37:49:119][ERROR]  Report Enq failure
[19/09/2015 01:37:49:119][ERROR]  Signal Enq failure
[19/09/2015 01:37:49:119][INFO ]  <- <ENQ>
[19/09/2015 01:38:00:088][INFO ]  <- <ENQ>
[19/09/2015 01:38:03:096][WARN ]  Handle event: TIMEOUT_EVENT HostResponseTimeout (3000 milliseconds)
[19/09/2015 01:38:03:096][INFO ]  <- <ENQ>
[19/09/2015 07:09:32:624][INFO ]  -> <ENQ>
[19/09/2015 07:09:32:624][INFO ]  <- <ACK>
[19/09/2015 07:09:32:662][INFO ]  -> <STX>
[19/09/2015 07:09:32:782][INFO ]  -> <RS>mtrsl|pi2015120412345|p2185|pp906347|p5906347|si|s044587|ssHEMO|s5HEMO|ci5015091
[19/09/2015 07:09:32:879][INFO ]  -> <RS>70001|c044587|ctHEMOAE|ta|rtAST-N264|rr11437321|t11|o1esccol|ra|a1tem|a3<=4|a4S|
[19/09/2015 07:09:32:976][INFO ]  -> <RS>ra|a1am|a3>=32|a4R|ra|a1amc|a38|a4S|ra|a1tzp|a3<=4|a4S|ra|a1rox|a34|a4S|ra|a1tax
[19/09/2015 07:09:33:072][INFO ]  -> <RS>|a3<=1|a4S|ra|a1taz|a3<=1|a4S|ra|a1fep|a3<=1|a4S|ra|a1etp|a3<=0,5|a4S|ra|a1mem|a
[19/09/2015 07:09:33:169][INFO ]  -> <RS>3<=0,25|a4S|ra|a1an|a3<=2|a4S|ra|a1gm|a3<=1|a4S|ra|a1cip|a3<=0,25|a4S|ra|a1lev|a
[19/09/2015 07:09:33:266][INFO ]  -> <RS>3<=0,12|a4S|ra|a1tgc|a3<=0,5|a4S|ra|a1fos|a3<=16|a4S|ra|a1ftn|a3<=16|a4S|ra|a1sx
[19/09/2015 07:09:33:309][INFO ]  -> <RS>t|a3<=20|a4S|zz|
[19/09/2015 07:09:33:309][INFO ]  -> <GS>1c
[19/09/2015 07:09:33:332][INFO ]  -> <ETX>
[19/09/2015 07:09:33:337][INFO ]  <- <ACK>
[19/09/2015 07:09:33:375][INFO ]  -> <EOT>


* Re: Problem with Unbounded Strings
  2015-12-04 21:21   ` Laurent
@ 2015-12-04 21:59     ` Simon Wright
  2015-12-04 23:19       ` Laurent
  2015-12-04 22:02     ` Dmitry A. Kazakov
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Simon Wright @ 2015-12-04 21:59 UTC (permalink / raw)


Laurent <lutgenl@icloud.com> writes:

> The Log_File I have reduced it to the minimum too:
>
> 0123456789
> abcdefghijklmnopqrstuvwxyz
> ABCDEFGHIJKLMNOPQRSTUVWXYZ
>
> I expect my code to do this:
>
> 1st read: 0123456789
> append to the buffer which is empty so content of buffer is: 0123456789
>
> 2nd read: abcdefghijklmnopqrstuvwxyz
> append to buffer so it becomes: 0123456789abcdefghijklmnopqrstuvwxyz
>
> 3rd read: ABCDEFGHIJKLMNOPQRSTUVWXYZ
> append to buffer so it would be: 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

This is exactly what I get with OS X El Capitan & FSF GCC 5.2.0. Also
GNAT GPL 2015.


* Re: Problem with Unbounded Strings
  2015-12-04 21:21   ` Laurent
  2015-12-04 21:59     ` Simon Wright
@ 2015-12-04 22:02     ` Dmitry A. Kazakov
  2015-12-04 22:28     ` Jeffrey R. Carter
  2015-12-04 23:00     ` Ben Bacarisse
  3 siblings, 0 replies; 22+ messages in thread
From: Dmitry A. Kazakov @ 2015-12-04 22:02 UTC (permalink / raw)


On Fri, 4 Dec 2015 13:21:55 -0800 (PST), Laurent wrote:

[...]

Works just fine with GNAT GPL 2015.

Try to print Length (Buffer) as well to see how long the string actually
is. You could have problems with control characters.
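A minimal sketch of that check, assuming a stand-in buffer that already contains two carriage returns (the real program would inspect its own Buffer instead). The procedure name Inspect_Buffer is hypothetical.

```ada
with Ada.Characters.Handling;
with Ada.Characters.Latin_1;
with Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Inspect_Buffer is

   package U_B renames Ada.Strings.Unbounded;

   CR : constant Character := Ada.Characters.Latin_1.CR;

   --  Stand-in for a buffer built from lines that kept a trailing CR.
   Buffer : constant U_B.Unbounded_String :=
     U_B.To_Unbounded_String ("0123456789" & CR & "abc" & CR);

begin
   --  The true length (15 here) shows that nothing was truncated.
   Ada.Text_IO.Put_Line ("Length:" & Natural'Image (U_B.Length (Buffer)));

   --  Report every control character instead of sending it to the
   --  terminal, where it would move the cursor.
   for I in 1 .. U_B.Length (Buffer) loop
      declare
         C : constant Character := U_B.Element (Buffer, I);
      begin
         if Ada.Characters.Handling.Is_Control (C) then
            Ada.Text_IO.Put_Line
              ("Control character, code" & Integer'Image (Character'Pos (C))
               & ", at index" & Integer'Image (I));
         end if;
      end;
   end loop;
end Inspect_Buffer;
```

With this stand-in data the two control characters are reported at indices 11 and 15, both with code 13 (CR).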

1. When dealing with character files, it is better to use Stream I/O
instead of Text_IO and recognize line ends manually instead of Get_Line.
The way the protocol defines line termination might be incompatible with
one of Get_Line and/or the OS.

2. Buffer should be an "in out" parameter. It is not a big difference in
this case, but cleaner.

3. You should never use End_Of_File. It might be quite expensive or not
working, e.g. some protocols may use explicit terminators, like EOF.
Exit file read loops with End_Error exception.
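A minimal sketch of point 3, assuming the reduced test program's "Test.log" and its Append-based buffer; the procedure name Read_Until_End_Error is hypothetical. The loop has no End_Of_File test and ends when Get_Line raises End_Error.

```ada
with Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Read_Until_End_Error is

   package U_B renames Ada.Strings.Unbounded;

   File   : Ada.Text_IO.File_Type;
   Buffer : U_B.Unbounded_String;

begin
   Ada.Text_IO.Open (File => File,
                     Mode => Ada.Text_IO.In_File,
                     Name => "Test.log");

   begin
      loop
         --  The Get_Line function raises End_Error once no data is left.
         U_B.Append (Source   => Buffer,
                     New_Item => Ada.Text_IO.Get_Line (File));
      end loop;
   exception
      when Ada.Text_IO.End_Error =>
         null;  --  reaching the end of the file is the normal way out
   end;

   Ada.Text_IO.Close (File);

   Ada.Text_IO.Put_Line
     ("Characters read:" & Natural'Image (U_B.Length (Buffer)));
end Read_Until_End_Error;
```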

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Problem with Unbounded Strings
  2015-12-04 21:43       ` Laurent
@ 2015-12-04 22:23         ` Dmitry A. Kazakov
  2015-12-05 10:22           ` Laurent
  0 siblings, 1 reply; 22+ messages in thread
From: Dmitry A. Kazakov @ 2015-12-04 22:23 UTC (permalink / raw)


On Fri, 4 Dec 2015 13:43:22 -0800 (PST), Laurent wrote:

> On Friday, 4 December 2015 21:34:59 UTC+1, Dmitry A. Kazakov  wrote:
>> On Fri, 4 Dec 2015 08:53:54 -0800 (PST), in comp.lang.ada you wrote:
>> 
>>> I don't use append?
>> 
>> No. There is nothing that is not in the buffer read already. Why would you
>> append anything?
>> 
>> In some protocols payload data are accumulated from several packets.
>> Unbounded_String is not used for that either. You want to limit the size of
>> the accumulated data to prevent DoS attacks unless the protocol limits that
>> already, as most protocols do. Simply keep the buffer in the communication
>> object, that is all.
> 
> DoS attacks? Why should I care in this case.

Corrupt files?

> I use the Unbounded Strings because I don't know in advance how many lines
> the message will be long. So I collect everything in a buffer, appending
> the following lines, so that it makes sense and when the message is
> complete, feed it to next procedure which slices of the interesting parts
> and uses them to generate a message object which I store somewhere or
> whatever if I should ever get so far.

No, the strategy is to pass lines or other parts of the message further
instead of accumulating it in the memory. Instead of this "hamster" design
I would do something like:

type Log_Parser
     (  Stream : not null access Root_Stream_Type'Class;
        Size : Positive
     )  is new Ada.Finalization.Limited_Controlled with
record
   Buffer : String (1..Size); -- Input is accumulated here
   Last : Natural; -- Current message line last character in Buffer
   Stamp : Time; -- Current message's time stamp
   Line_No : Positive; -- Current message's line number
   ...
end record;

function Get_Line (Source : not null access Log_Parser) return String;
   -- reads from stream and returns Source.Buffer (1..Source.Last)

This will work with files as well. The client will call Get_Line for each
message line.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: Problem with Unbounded Strings
  2015-12-04 21:21   ` Laurent
  2015-12-04 21:59     ` Simon Wright
  2015-12-04 22:02     ` Dmitry A. Kazakov
@ 2015-12-04 22:28     ` Jeffrey R. Carter
  2015-12-04 23:35       ` Laurent
  2015-12-04 23:00     ` Ben Bacarisse
  3 siblings, 1 reply; 22+ messages in thread
From: Jeffrey R. Carter @ 2015-12-04 22:28 UTC (permalink / raw)


On 12/04/2015 02:21 PM, Laurent wrote:
> 
> 1st read: 0123456789
> append to the buffer which is empty so content of buffer is: 0123456789
> 
> 2nd read: abcdefghijklmnopqrstuvwxyz
> append to buffer so it becomes: 0123456789abcdefghijklmnopqrstuvwxyz
> 
> 3rd read: ABCDEFGHIJKLMNOPQRSTUVWXYZ
> append to buffer so it would be: 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

This is what your code does for me, compiled with GNAT 4.9.2 under Linux Mint
Debian Edition.

One possible explanation is that your file was produced by a Windows system, and
you're running your program on a Unix-based platform. In that case, Get_Line
will return a string with a CR at the end. When displayed, the CR will return
the cursor to the beginning of the line. So you begin by outputting

Buffer after append: 0123456789

You then output the CR, and the cursor returns to the beginning of the line.
Then you output

abcdefghijklmnopqrstuvwxyz

which overwrites everything already displayed through the '4'.

-- 
Jeff Carter
"Well, a gala day is enough for me. I don't think
I can handle any more."
Duck Soup
93


* Re: Problem with Unbounded Strings
  2015-12-04 21:21   ` Laurent
                       ` (2 preceding siblings ...)
  2015-12-04 22:28     ` Jeffrey R. Carter
@ 2015-12-04 23:00     ` Ben Bacarisse
  3 siblings, 0 replies; 22+ messages in thread
From: Ben Bacarisse @ 2015-12-04 23:00 UTC (permalink / raw)


Laurent <lutgenl@icloud.com> writes:
<snip>
> Ok I have reduced the whole thing to the minimum and still get the
> same behaviour + an additional glitch
>
> The Log_File I have reduced it to the minimum too:
>
> 0123456789
> abcdefghijklmnopqrstuvwxyz
> ABCDEFGHIJKLMNOPQRSTUVWXYZ
<snip>
> But I get this:
>
> 1st read:
> Line: 0123456789
> Buffer after append: 0123456789
>
> 2nd read:
> Line: abcdefghijklmnopqrstuvwxyz
> abcdefghijklmnopqrstuvwxyz56789
>
> 3rd read:
> Line: ABCDEFGHIJKLMNOPQRSTUVWXYZ
> ABCDEFGHIJKLMNOPQRSTUVWXYZ56789

Just a data point.  I get:

Line: 0123456789
Buffer after append: 0123456789

Line: abcdefghijklmnopqrstuvwxyz
Buffer after append: 0123456789abcdefghijklmnopqrstuvwxyz

Line: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Buffer after append: 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

just as you expect.  Using GNATMAKE 4.9.3 on latest Ubuntu.

<snip>
-- 
Ben.


* Re: Problem with Unbounded Strings
  2015-12-04 21:59     ` Simon Wright
@ 2015-12-04 23:19       ` Laurent
  0 siblings, 0 replies; 22+ messages in thread
From: Laurent @ 2015-12-04 23:19 UTC (permalink / raw)


On Friday, 4 December 2015 22:59:19 UTC+1, Simon Wright  wrote:
> 
> > The Log_File I have reduced it to the minimum too:
> >
> > 0123456789
> > abcdefghijklmnopqrstuvwxyz
> > ABCDEFGHIJKLMNOPQRSTUVWXYZ
> >
> > I expect my code to do this:
> >
> > 1st read: 0123456789
> > append to the buffer which is empty so content of buffer is: 0123456789
> >
> > 2nd read: abcdefghijklmnopqrstuvwxyz
> > append to buffer so it becomes: 0123456789abcdefghijklmnopqrstuvwxyz
> >
> > 3rd read: ABCDEFGHIJKLMNOPQRSTUVWXYZ
> > append to buffer so it would be: 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
> 
> This is exactly what I get with OS X El Capitan & FSF GCC 5.2.0. Also
> GNAT GPL 2015.

El Capitan and Yosemite produce the same garbage using GNAT GPL 2015. I compile with GPS and gprbuild. No idea if that matters. I didn't try FSF. Later, when I have had some sleep and am back from work, I will try that one. And Windows.



* Re: Problem with Unbounded Strings
  2015-12-04 22:28     ` Jeffrey R. Carter
@ 2015-12-04 23:35       ` Laurent
  2015-12-04 23:59         ` Jeffrey R. Carter
  0 siblings, 1 reply; 22+ messages in thread
From: Laurent @ 2015-12-04 23:35 UTC (permalink / raw)


On Friday, 4 December 2015 23:28:46 UTC+1, Jeffrey R. Carter  wrote:

> One possible explanation is that your file was produced by a Windows system, and
> you're running your program on a Unix-based platform. In that case, Get_Line
> will return a string with a CR at the end. When displayed, the CR will return
> the cursor to the beginning of the line. So you begin by outputting

Bingo. The real log file is from a Windows system. I made the simplified file on my Mac, but with BBEdit, which for some reason also used Windows line endings. After changing that, everything works as expected. I should have checked that first. I think I already had a problem with that once; otherwise I wouldn't have put the "file" and "od -c" commands into a text file with an explanation.

Is there no function in the text io libraries to check which line endings are used?

Thanks


* Re: Problem with Unbounded Strings
  2015-12-04 23:35       ` Laurent
@ 2015-12-04 23:59         ` Jeffrey R. Carter
  2015-12-05 10:13           ` Laurent
  2015-12-08  1:53           ` Randy Brukardt
  0 siblings, 2 replies; 22+ messages in thread
From: Jeffrey R. Carter @ 2015-12-04 23:59 UTC (permalink / raw)


On 12/04/2015 04:35 PM, Laurent wrote:
> 
> Is there no function in the text io libraries to check which line endings are used?

Afraid not. If you're processing files from several platforms, you can check for
and trim off a CR at the end. This will work with files from either platform.
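A minimal sketch of that check, assuming a hypothetical helper named Without_Trailing_CR and a hypothetical input file "log.txt" (neither appears in the original code):

```ada
with Ada.Characters.Latin_1;
with Ada.Text_IO;

procedure Strip_CR_Demo is

   --  Return Line without a single trailing carriage return, if present.
   --  Lines from files with plain LF endings are returned unchanged.
   function Without_Trailing_CR (Line : String) return String is
      CR : constant Character := Ada.Characters.Latin_1.CR;
   begin
      if Line'Length > 0 and then Line (Line'Last) = CR then
         return Line (Line'First .. Line'Last - 1);
      else
         return Line;
      end if;
   end Without_Trailing_CR;

   File : Ada.Text_IO.File_Type;

begin
   --  "log.txt" is a placeholder name for this sketch.
   Ada.Text_IO.Open (File, Ada.Text_IO.In_File, "log.txt");

   while not Ada.Text_IO.End_Of_File (File) loop
      declare
         Line : constant String :=
           Without_Trailing_CR (Ada.Text_IO.Get_Line (File));
      begin
         Ada.Text_IO.Put_Line (Line);
      end;
   end loop;

   Ada.Text_IO.Close (File);
end Strip_CR_Demo;
```

The "and then" keeps the index check safe for empty lines, and the slice is empty when the line consists of the CR alone.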

However, all bets are off if you might get a file with the old Mac CR line
terminators. I worked on a system that used the Get_Line function and dealt with
CRs as above since many of the files we processed were produced on Windows
systems. One day we got a 5-MB file saved with CRs for line terminators.
Get_Line attempted to read a 5-MB line. That was not pretty.

-- 
Jeff Carter
"Well, a gala day is enough for me. I don't think
I can handle any more."
Duck Soup
93


* Re: Problem with Unbounded Strings
  2015-12-04 23:59         ` Jeffrey R. Carter
@ 2015-12-05 10:13           ` Laurent
  2015-12-08  1:53           ` Randy Brukardt
  1 sibling, 0 replies; 22+ messages in thread
From: Laurent @ 2015-12-05 10:13 UTC (permalink / raw)


If I had gotten some run-time error I would have checked the line endings much earlier. So I have to put a Post-it somewhere.


* Re: Problem with Unbounded Strings
  2015-12-04 22:23         ` Dmitry A. Kazakov
@ 2015-12-05 10:22           ` Laurent
  2015-12-05 12:19             ` Dmitry A. Kazakov
  0 siblings, 1 reply; 22+ messages in thread
From: Laurent @ 2015-12-05 10:22 UTC (permalink / raw)


> No, the strategy is to pass lines or other parts of the message further
> instead of accumulating it in the memory. Instead of this "hamster" design
> I would do something like:

>   Buffer : String (1..Size); -- Input is accumulated here

Eh isn't that a contradiction? There is still a buffer and it is even combined with those evil accesses!?

<sarcasm>
Perhaps the company which developed our LIS should have used more hamster code instead of trusting in Java's garbage creator. The company went bankrupt this year. The repo guys were removing the chairs from under their asses. Only because it is software in the medical domain is there another company which will guarantee technical support for 3 years.
</sarcasm>


* Re: Problem with Unbounded Strings
  2015-12-05 10:22           ` Laurent
@ 2015-12-05 12:19             ` Dmitry A. Kazakov
  2015-12-05 13:15               ` Laurent
  0 siblings, 1 reply; 22+ messages in thread
From: Dmitry A. Kazakov @ 2015-12-05 12:19 UTC (permalink / raw)


On Sat, 5 Dec 2015 02:22:36 -0800 (PST), Laurent wrote:

>> No, the strategy is to pass lines or other parts of the message further
>> instead of accumulating it in the memory. Instead of this "hamster" design
>> I would do something like:
>
>>   Buffer : String (1..Size); -- Input is accumulated here
> 
> Eh isn't that a contradiction? There is still a buffer

For just one line. You are trying to merge an unlimited number of them. This
is a waste of resources and a potential problem with latencies. For example,
modern poorly designed text editors load the whole file first. What happens
when the file is large? The editor gets blocked.
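A rough sketch of the "pass each line on" idea, using a plain callback rather than the mix-in with a fixed buffer discussed in this thread. The names For_Each_Line and Count_And_Echo and the file name "log.txt" are invented for the illustration:

```ada
with Ada.Text_IO;

procedure Streaming_Demo is

   --  Hand every line of the file Name to Process as soon as it is
   --  read. Only one line is held in memory at any time.
   procedure For_Each_Line
     (Name    : String;
      Process : not null access procedure (Line : String))
   is
      File : Ada.Text_IO.File_Type;
   begin
      Ada.Text_IO.Open (File, Ada.Text_IO.In_File, Name);
      while not Ada.Text_IO.End_Of_File (File) loop
         Process (Ada.Text_IO.Get_Line (File));
      end loop;
      Ada.Text_IO.Close (File);
   end For_Each_Line;

   Count : Natural := 0;

   --  Example consumer: counts the lines and echoes them.
   procedure Count_And_Echo (Line : String) is
   begin
      Count := Count + 1;
      Ada.Text_IO.Put_Line (Line);
   end Count_And_Echo;

begin
   --  "log.txt" is a placeholder name for this sketch.
   For_Each_Line ("log.txt", Count_And_Echo'Access);
   Ada.Text_IO.Put_Line ("Lines:" & Natural'Image (Count));
end Streaming_Demo;
```

The consumer decides what to keep, so memory use does not grow with the size of the file unless the consumer itself accumulates the lines.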

> and it is even combined with those evil accesses!?

Access is needed for the mix-in pattern. It is a language design bug that
the discriminant must be an access even if the target is a by-reference
type. Instead of the mix-in you can pass the stream to each call explicitly, which
is not better. Or you can pack the access type into a handle type. BTW,
Unbounded_String is just that, a handle type wrapping some access to the
dynamically allocated body.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



* Re: Problem with Unbounded Strings
  2015-12-05 12:19             ` Dmitry A. Kazakov
@ 2015-12-05 13:15               ` Laurent
  0 siblings, 0 replies; 22+ messages in thread
From: Laurent @ 2015-12-05 13:15 UTC (permalink / raw)


My "studies" haven't gone far enough for me to feel confident about using more advanced OO concepts or to understand how they work. I don't like to use things I don't understand. That would explain why I still haven't made an effort to advance with Gnoga.

The human brain is also an expert in laziness: it first tries the techniques which provided a satisfying solution to similar problems before. Only if those fail will it try to acquire new ways to find a solution. Unfortunately, too much specific knowledge kills creativity, so you no longer recognize the forest because there are too many trees.

So that is the reason why I used unbounded strings and still only read the information from text files. I am not motivated to advance to XML or to use SQLite.


* Re: Problem with Unbounded Strings
  2015-12-04 23:59         ` Jeffrey R. Carter
  2015-12-05 10:13           ` Laurent
@ 2015-12-08  1:53           ` Randy Brukardt
  1 sibling, 0 replies; 22+ messages in thread
From: Randy Brukardt @ 2015-12-08  1:53 UTC (permalink / raw)


"Jeffrey R. Carter" <spam.jrcarter.not@spam.not.acm.org> wrote in message 
news:n3t98j$5pj$1@dont-email.me...
> On 12/04/2015 04:35 PM, Laurent wrote:
>>
>> Is there no function in the text io libraries to check which line endings 
>> are used?
>
> Afraid not. If you're processing files from several platforms, you can check for
> and trim off a CR at the end. This will work with files from either platform.

It also depends on your vendor's implementation of Text_IO. For Janus/Ada, 
we treated LF as the line ending and discarded a single CR preceding it (if 
one exists). That way, we didn't have to have separate versions of 
Text_IO.Get_Line for the different platforms, and it essentially 
auto-converts between them. (Of course, Text_IO.Put_Line has to output 
different line terminators on the different platforms.) IMHO, this is the best 
approach for Text_IO, but it does have one drawback...

> However, all bets are off if you might get a file with the old Mac CR line
> terminators. I worked on a system that used the Get_Line function and dealt with
> CRs as above since many of the files we processed were produced on Windows
> systems. One day we got a 5-MB file saved with CRs for line terminators.
> Get_Line attempted to read a 5-MB line. That was not pretty.

..and that's it. Doesn't seem worse than any other possibility, though.
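For readers whose Text_IO does not behave this way, the same policy can be approximated in user code. The following is only a sketch of the idea, not the Janus/Ada implementation: it reads the file character by character through an instance of Ada.Sequential_IO, treats LF as the terminator and drops one CR directly before it. The file name "log.txt" and all identifiers are assumptions made for the example.

```ada
with Ada.Characters.Latin_1;
with Ada.Sequential_IO;
with Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure LF_Lines_Demo is
   package Char_IO is new Ada.Sequential_IO (Character);
   package U_B renames Ada.Strings.Unbounded;

   CR : constant Character := Ada.Characters.Latin_1.CR;
   LF : constant Character := Ada.Characters.Latin_1.LF;

   File : Char_IO.File_Type;
   Line : U_B.Unbounded_String;
   C    : Character;

   --  Print the collected line without one trailing CR (if any),
   --  then reset the collector for the next line.
   procedure Emit is
      Len : constant Natural := U_B.Length (Line);
   begin
      if Len > 0 and then U_B.Element (Line, Len) = CR then
         U_B.Delete (Line, Len, Len);
      end if;
      Ada.Text_IO.Put_Line (U_B.To_String (Line));
      Line := U_B.Null_Unbounded_String;
   end Emit;

begin
   --  "log.txt" is a placeholder name for this sketch.
   Char_IO.Open (File, Char_IO.In_File, "log.txt");

   while not Char_IO.End_Of_File (File) loop
      Char_IO.Read (File, C);
      if C = LF then
         Emit;                    --  LF ends the line on every platform
      else
         U_B.Append (Line, C);
      end if;
   end loop;

   --  A last line without a terminator is still emitted
   --  (and also loses one trailing CR, if it has one).
   if U_B.Length (Line) > 0 then
      Emit;
   end if;

   Char_IO.Close (File);
end LF_Lines_Demo;
```

As noted above, a file that uses bare CR terminators contains no LF at all, so this loop would still collect the whole file as one line.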

                                   Randy.

