From: jgrosser@gmail.com
Newsgroups: comp.lang.ada
Subject: Re: Searchable comp.lang.ada archive dating back to 1982
Date: Thu, 18 Jun 2020 01:34:50 -0700 (PDT)
Message-ID: <427b7007-1214-404c-9243-3d907f11c5a4o@googlegroups.com>
References: <29e8f766-410c-466f-9048-e86cf57b05fbo@googlegroups.com>

On Thursday, June 18, 2020 at 1:15:01 AM UTC-7, Nasser M. Abbasi wrote:
> On 6/17/2020 9:39 PM, Jeremy Grosser wrote:
> > I'm not fond of Google Groups, so I built my own archive.
> >
> > https://archive.legitdata.co/comp.lang.ada/
> >
> > Sources:
> >
> > UTZOO tapes
> > 1982 - 1991
> > https://archive.org/details/utzoo-wiseman-usenet-archive
> >
> > Usenet Historical Collection
> > 1993 - 2013
> > https://archive.org/details/usenethistorical
> >
> > Eternal September NNTP
> > 2012 - Current
> > http://www.eternal-september.org/
> >
> > The earliest messages here were copied from the net.lang.ada group, which was renamed to comp.lang.ada in 1986. If you have messages from either of these groups that aren't in the archive, I'd love to include them.
> >
> > Where practical, an additional Date header has been added to each message in ISO 8601 format to aid in chronological sorting. Where no timezone was given, UTC is assumed. Early messages routed via UUCP were often delayed by days as indicated by the difference between the Posted and Date-Received timestamps. In most cases, I use the value from the Posted timestamp.
> >
> > A spam filter has been applied to the archive. Many thousands of advertisements for prescription drugs, sex acts, spiritual salvation, and prejudice have been removed. I do not wish to host this type of content and am actively working to train better filters and remove spam that slipped through.
> >
> > This archive is updated hourly via NNTP.
> >
> > --
> > Jeremy Grosser
> >
>
>
> Very nice and good job. Thanks for doing this.
>
> Did you use Ada to do the above? Or just pure JavaScript or
> other software?
>
> --Nasser

At the moment, I'm using public-inbox [1], which is a horrifying mess of Perl scripts. Now that I've got all of the archives in a reasonably consistent format, I do plan to rewrite it in Ada.

The only JavaScript involved at the moment is for obfuscating email addresses from naive crawlers, which the public-inbox maintainers felt was necessary.

[1] https://public-inbox.org/README.html
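
For illustration, the date normalization described in the quoted announcement (an added ISO 8601 Date header, with UTC assumed where no timezone was given) might look roughly like the Python sketch below. This is not the archive's actual code: the function name normalize_date is hypothetical, and converting every timestamp to UTC is an assumption made here so that the resulting strings sort chronologically.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def normalize_date(header_value: str) -> str:
    """Return an ISO 8601 UTC timestamp for an RFC 5322-style Date value.

    Hypothetical helper for illustration only.
    """
    parsed = parsedate_to_datetime(header_value)
    if parsed.tzinfo is None:
        # No usable timezone in the header: assume UTC, as the announcement describes.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Assumption: convert to UTC so the resulting strings sort chronologically.
    return parsed.astimezone(timezone.utc).isoformat()


if __name__ == "__main__":
    # Date header of this message, with an explicit offset.
    print(normalize_date("Thu, 18 Jun 2020 01:34:50 -0700 (PDT)"))  # 2020-06-18T08:34:50+00:00
    # Same timestamp with the timezone missing, so UTC is assumed.
    print(normalize_date("Thu, 18 Jun 2020 01:34:50"))  # 2020-06-18T01:34:50+00:00
```

The sketch only handles dates that Python's standard parser understands. Older date formats from the early UUCP-era messages may need extra handling that is not shown here.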