From mboxrd@z Thu Jan  1 00:00:00 1970
Path: utzoo!utgpu!watmath!clyde!ima!think!barmar
From: barmar@think.COM (Barry Margolin)
Newsgroups: comp.lang.ada
Subject: Re: Garbage Collection
Message-ID: <35340@think.UUCP>
Date: 11 Jan 89 18:21:36 GMT
References: <35328@think.UUCP> <4066@hubcap.UUCP>
Sender: news@think.UUCP
Reply-To: barmar@kulla.think.com.UUCP (Barry Margolin)
Organization: Thinking Machines Corporation, Cambridge MA, USA
List-Id: <comp.lang.ada>

In article <4066@hubcap.UUCP> billwolf@hubcap.clemson.edu writes:
>From article <35328@think.UUCP>, by barmar@think.COM (Barry Margolin):
>> First of all, whether the file is locked is immaterial to the
>> discussion (I never actually said that the compiler compiles files --
>> in Lisp the compiler can also be invoked on in-core interpreted
>> functions, and many Lisp programming environments allow in-core editor
>> buffers to be compiled).
>
>    Whatever is being processed, be it a file or something else,
>    would have to be locked.  If it is already inaccessible to
>    all other processes, then it is continuously locked already.

I said it was immaterial because we are discussing garbage collection
and structure sharing, not file management.

>
>> [discussion of copying vs. read-locking]
>
>    All editors I know of work by copying the targeted file into
>    memory, holding it there while it's being modified, and then
>    writing it back to the file.  Another approach is the locking
>    mechanism.  Since compiler warnings generally amount to very
>    small text files, I'd probably go the copying route if locking
>    /unlocking consumed too much time.  However, a good argument 
>    can also be made that there should not be 3 million people 
>    simultaneously editing and/or compiling the same file anyway, 
>    so it probably makes little difference which method is chosen. 

And you're still harping on how the editor manages the file.  The
whole point of my example was how the editor, compiler, and command
processor, running in the same address space (the compiler could be
invoked as a subroutine by the editor, for instance), share the
in-memory warnings database.  3 million people aren't editing and
compiling the same file, but one user is.  He may tell the command
processor to delete the compiler warnings before he has finished
scanning through them in the editor, because he doesn't want them
cluttering up the database when he asks for other compiler warnings
to be displayed.
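
To make the sharing scenario concrete, here is a minimal sketch in
Python (a garbage-collected language, chosen only for illustration).
The names WarningSet, WarningsDatabase, record, lookup and delete are
hypothetical, not taken from any real system.  The editor's reference
stays valid after the command processor deletes the database entry,
and the storage becomes reclaimable only once the editor lets go.

```python
import gc
import weakref


class WarningSet:
    """Warnings produced by one compilation (hypothetical structure)."""

    def __init__(self, messages):
        self.messages = list(messages)


class WarningsDatabase:
    """In-memory database shared by editor, compiler and command processor."""

    def __init__(self):
        self._by_unit = {}

    def record(self, unit, messages):    # called by the compiler
        self._by_unit[unit] = WarningSet(messages)

    def lookup(self, unit):              # called by the editor
        return self._by_unit[unit]

    def delete(self, unit):              # called by the command processor
        del self._by_unit[unit]


def demo():
    db = WarningsDatabase()
    db.record("foo", ["unused variable X", "unreachable code"])

    editor_view = db.lookup("foo")       # editor starts scanning
    probe = weakref.ref(editor_view)     # lets us observe reclamation

    db.delete("foo")                     # user deletes the warnings mid-scan
    still_readable = list(editor_view.messages)  # editor's reference is intact

    del editor_view                      # editor is finished with them
    gc.collect()                         # ask the collector to run
    reclaimed = probe() is None          # storage is gone only now
    return still_readable, reclaimed
```

Neither the editor nor the command processor has to know about the
other's use of the warnings; with manual deallocation one of them
would have to.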

>> Most modern GC schemes have time overhead that is a function (often
>> linear) of the frequency of allocation.  Since assignments are
>> always more frequent than allocations, and I suspect usually MUCH more
>> frequent, this difference is important.
>
>    No, not just the frequency of allocation.  GC's performance also
>    depends upon the frequency of running out of memory.  Furthermore,
>    GC is a global mechanism, and it wastes much time scanning space
>    which is already being properly managed.  

Real-time GC mechanisms are defined to do a bounded amount of work PER
ALLOCATION.  Decent GC mechanisms can be told to ignore manually
managed space.  And generational and ephemeral GC mechanisms
concentrate their efforts on memory most likely to contain garbage.
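
As a rough illustration of "bounded work per allocation", here is a
toy incremental mark-sweep collector in Python.  It is a sketch under
heavy assumptions, not how any production collector is built: there
is a single root object, one unit of work means processing one
object, the program holds no references outside the toy heap, and all
names (Obj, IncrementalHeap, budget, store) are hypothetical.

```python
class Obj:
    """A toy heap object: a mark bit plus outgoing references."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


class IncrementalHeap:
    def __init__(self, budget=2):
        self.budget = budget          # collector work units per allocation
        self.root = Obj("root")
        self.objects = [self.root]    # objects not currently queued for sweep
        self.pending = []             # grey objects (mark) or sweep queue
        self.phase = "idle"
        self.steps = 0
        self.allocations = 0

    def _shade(self, obj):
        if not obj.marked:
            obj.marked = True
            self.pending.append(obj)

    def store(self, holder, child):
        """Pointer store; the write barrier is active only while marking."""
        holder.refs.append(child)
        if self.phase == "mark":
            self._shade(child)

    def _step(self):
        """Perform exactly one unit of collector work."""
        self.steps += 1
        if self.phase == "idle":              # start a cycle at the root
            self.phase = "mark"
            self._shade(self.root)
        elif self.phase == "mark":
            if self.pending:                  # scan one grey object
                for child in self.pending.pop().refs:
                    self._shade(child)
            else:                             # marking done, begin sweeping
                self.phase = "sweep"
                self.pending, self.objects = self.objects, []
        else:
            if self.pending:                  # examine one object
                obj = self.pending.pop()
                if obj.marked:                # survivor: reset for next cycle
                    obj.marked = False
                    self.objects.append(obj)
                # an unmarked object is dropped here, i.e. reclaimed
            else:
                self.phase = "idle"

    def allocate(self, name):
        for _ in range(self.budget):          # the only place GC work happens
            self._step()
        self.allocations += 1
        obj = Obj(name)
        obj.marked = self.phase == "mark"     # allocate black while marking
        self.objects.append(obj)
        return obj

    def all_objects(self):
        queued = self.pending if self.phase == "sweep" else []
        return self.objects + queued
```

Collector work is triggered only from allocate, never from store, so
its total cost tracks the number of allocations rather than the
number of assignments.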

>     Why was each line read into a newly allocated monolithic string,
>     with pointers into this string?  It would seem far more sensible
>     to read each *field* into a newly allocated string; then when we
>     need to revise a field to a larger value, deallocate the old string 
>     and allocate a new one.  Flags are not necessary.

Probably because the language's runtime library provides operations
for reading by lines and for searching strings.  A program that uses
standard library routines is easier to read than one that uses its
own.

I admit that the database package was not written wonderfully (the guy
who wrote it is primarily a Lisp programmer, but he was one of the few
people in our company willing to write C utilities at the time).  It's
a locally-used facility, not part of any product, so high quality was
not the top priority.

>     Which is why application programmers should make use of ADTs,
>     which encapsulate and hide the details of storage management.

No.  The application programmer must still know when to call the ADT's
DESTROY procedure, or know to call the LOCK and UNLOCK procedures,
etc.
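
A small sketch of what I mean, in Python for brevity.  The Buffer
type and its read and destroy operations are hypothetical stand-ins
for any ADT with explicit storage management.  The ADT hides how the
storage is laid out, but not the decision of when it may be released:

```python
class Buffer:
    """Toy ADT whose storage must be released explicitly (hypothetical API)."""

    def __init__(self, text):
        self._data = list(text)
        self._alive = True

    def read(self):
        if not self._alive:
            raise RuntimeError("use after destroy")
        return "".join(self._data)

    def destroy(self):
        self._data = None
        self._alive = False


def run_clients():
    shared = Buffer("warning: unused variable")
    editor = shared          # two clients hold the same object
    shell = shared
    shell.destroy()          # one client decides it is done
    try:
        editor.read()        # the other client was not done
    except RuntimeError as exc:
        return str(exc)
    return "ok"
```

The implementation details are encapsulated, yet each client still
needs to know whether anyone else is using the object before calling
destroy.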

>    Deallocation of storage is a one-time cost (and a small one at that)
>    if done by the programmer.  Given the implicit destruction of local
>    environments and the use of ADTs, application programmers will 
>    practically never have to do any explicit deallocation anyway.  
>    When it is necessary, it's not that difficult.  In the database example,
>    a single line of code would suffice.

You continue with this "one-time cost" fallacy.  It is a cost that
must be paid at least every time an ADT is designed and implemented,
and it also adds complexity that must be borne by the users of the
ADT.  GC is truly a one-time cost (per runtime implementation).


Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar