comp.lang.ada
From: Vadim Godunko <vgodunko@gmail.com>
Subject: Re: String Buffer
Date: Fri, 3 Dec 2021 00:11:56 -0800 (PST)
Message-ID: <cf1181b6-4aaf-4056-8a46-67b9f8d7a99en@googlegroups.com>
In-Reply-To: <c80ebd43-1a92-4a05-a0db-7cb707191315n@googlegroups.com>

On Thursday, December 2, 2021 at 9:17:38 PM UTC+3, kevc...@gmail.com wrote:
> In this thread bounded and unbounded get quite a bashing. 
> 
> "https://groups.google.com/g/comp.lang.ada/c/NINmFln-YS4/m/5De5DeUAAAAJ" 
> 
> I thought bounded looked useful but then I realised that it allocates the max immediately anyway. It may be useful in constrained environments but then I do not use Strings in constrained environments. 
> 
> Unbounded is said to be inefficient because it re-allocates. 
> 
> In Go they have strings.Builder. I assume that is what Text_Buffer is aimed to be. (Actually Go seems to have copied a lot from Ada such as AWS API, unless they both are similar to something else like JAVA). 
> 
> Is Text_Buffer usable today with GCC 11? 
> 
> strings.Builder in Go behaves similarly to unbounded in that it doubles the allocation as required but it only returns a string when needed and does not have string operations. You can Grow the builder to avoid re-allocations. 
> 
> "https://pkg.go.dev/strings#Builder" 
> 
> If possible without breaking all of the string functions (length and separate capacity) and Unbounded Strings had a Grow function, then wouldn't that relieve the efficiency issue? 
> 
> In any case avoiding unbounded strings is almost certainly in the realm of premature optimisation most of the alleged 10% of the time that it useful, but it would be nice to know of and use something akin to strings.Builder, preferably from the standard library, if it is available?

For VSS.Strings.Virtual_String we used two kinds of optimization:

1. Short strings are stored without any memory allocation, which saves a lot of time. The gain is not very visible in multithreaded applications because of the run-time cost of controlled objects; however, it is very visible on manycore systems, because fewer atomic operations are involved. How short a string must be depends on the underlying encoding, the content, and the machine architecture: on modern 64-bit systems with UTF-8 encoding the limit is 17 ASCII characters, or 8 Cyrillic characters, or 4 math characters. In the context of the Ada Language Server, most strings are this small.

2. It is possible to set the capacity of a particular string object. Memory will be reallocated on the next operation that modifies the string object. This may be useful for large strings when the approximate size is known, since it saves a few allocate/move cycles on append. I don't know of any real use cases for this feature right now.


