From: Emmanuel Briot
Newsgroups: comp.lang.ada
Subject: Re: Single-Instance Executable, TSR-style programs, "lockfiles" and the DSA
Date: Fri, 10 Sep 2021 00:10:49 -0700 (PDT)
Message-ID: <71d68159-3f72-4949-bc4f-ef83f8cd7067n@googlegroups.com>

When I wrote the program that processes all incoming email on the mailing lists at AdaCore (in particular to manage tickets), we were indeed using lockfiles to coordinate between all the instances of the program (one per incoming email). The lockfile contained an expiration date and the PID of the process that took the lock, and a program was allowed to break the lock when that date was reached (something like 10 minutes, I forgot the value we came up with, when processing one message takes a few milliseconds), or when the process no longer existed (i.e. it had crashed).
So at least the system could not totally break and would eventually recover.

This is of course far from perfect, since during those 10 minutes no email could be processed or delivered, and if the timeout is incorrect we could end up with two programs executing concurrently (in practice, this was not a major issue for us and we could deal with the once-a-year duplicate ticket generated).

Years later, we finally moved to an actual database (PostgreSQL) and we were able to remove the locks altogether by taking advantage of transactions there. This is of course a much better approach.

When I look at systems like Kafka (multi-node exchange of messages), they have an external program (ZooKeeper) in charge of monitoring the various instances. Presumably a similar approach could be used, where the external program is much simpler and only in charge of synchronizing things. Being simpler and fully written in Ada, it would be easier to ensure this one doesn't crash (famous last words...).
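
For readers who want to see the lockfile scheme described at the top of this message in concrete form, here is a minimal sketch in Python. It is not the AdaCore code: the JSON record layout, the function names (`try_acquire`, `release`) and the 600-second default are all assumptions made for illustration, and the liveness check via `os.kill(pid, 0)` only works on POSIX systems. Breaking a stale lock is still racy here, which matches the "far from perfect" caveat above.

```python
import json
import os
import time

LOCK_TTL_SECONDS = 600  # assumed value; the post only says "something like 10 minutes"


def _pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def _is_stale(path):
    """A lock is stale if it has expired or its owner process is gone."""
    try:
        with open(path) as f:
            info = json.load(f)
    except FileNotFoundError:
        return True  # the owner released it in the meantime; caller retries
    except ValueError:
        return False  # record not fully written yet; treat the lock as held
    return time.time() >= info["expires"] or not _pid_alive(info["pid"])


def try_acquire(path, ttl=LOCK_TTL_SECONDS):
    """Try once to take the lock, breaking it first if it is stale."""
    record = json.dumps({"pid": os.getpid(), "expires": time.time() + ttl})
    for _ in range(2):  # the second pass only runs after breaking a stale lock
        try:
            # O_CREAT | O_EXCL makes creation fail if the file already exists.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            if not _is_stale(path):
                return False
            # Racy: another process may break and retake the lock right here.
            try:
                os.remove(path)
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(record)
        return True
    return False


def release(path):
    """Remove the lockfile; ignore the case where it was already broken."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

A caller would wrap the per-message work in `if try_acquire(path): try: ... finally: release(path)`. The expiry only bounds how long a hung owner can block everyone else; it does not prevent two instances from running concurrently if the timeout is too short.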