From mboxrd@z Thu Jan  1 00:00:00 1970
Path: eternal-september.org!reader02.eternal-september.org!news.misty.com!border2.nntp.dca1.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!buffer1.nntp.dca1.giganews.com!news.giganews.com.POSTED!not-for-mail
NNTP-Posting-Date: Mon, 27 Dec 2021 11:41:48 -0600
From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Newsgroups: comp.lang.ada
Subject: Re: Some advice required [OT]
Date: Mon, 27 Dec 2021 12:41:43 -0500
Organization: IISS Elusive Unicorn
Message-ID: <cjsjsg1r74m5euhqmsd64lre5rc43dpf2n@4ax.com>
References: <7bede061-4b0f-4029-beb1-1056637e57d6n@googlegroups.com> <j2tlk8FneraU1@mid.individual.net> <49538254-21ed-4fd0-8316-1bccc7d3c635n@googlegroups.com>
User-Agent: ForteAgent/8.00.32.1272
X-No-Archive: yes
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Usenet-Provider: http://www.giganews.com
X-Trace: sv3-53bovxQPVSBV2FM33I59r1EQnefoLQCzNsYX+/bwYR/fSnxOQRPxwREipWuuq5r+u9LtMgPfnwc5PXa!nwfA8qu+j0sMaltj6IoDpH/3DG42JGguSqXo1nC0Ug3mP/eJpjbnVB7AArszOvE2B3Qlkxw2
X-Complaints-To: abuse@giganews.com
X-DMCA-Notifications: http://www.giganews.com/info/dmca.html
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
X-Original-Bytes: 3346
Xref: reader02.eternal-september.org comp.lang.ada:63281
List-Id: <comp.lang.ada>

On Mon, 27 Dec 2021 04:29:06 -0800 (PST), Laurent <lutgenl@icloud.com>
declaimed the following:

>Why I would choose result from strain B over the result from strain A.
>
>strain A: SSSRSS
>strain B: SSRRRS
>
>Simply counting the number of S, I and R doesn't work. Checksum with/without weight for the column number doesn't
>work either.

	I wouldn't expect a checksum to be of any use, since the idea of most
checksums (and CRCs) is to be able to verify that a data sequence has not
been corrupted. Checksums don't "rank" data.
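
	As a rough illustration (a sketch only: the function name and the
second code table are made up for the example, and Python is used just
because it is compact), a column-sum "checksum" of the two quoted rows gives
an ordering that depends entirely on which numbers one happens to assign to
S, I and R:

```python
def checksum(row, codes, weighted=False):
    """Sum per-column codes, optionally multiplied by the 1-based column number."""
    return sum(codes[ch] * (col if weighted else 1)
               for col, ch in enumerate(row, start=1))

# Two equally arbitrary ways of turning the letters into numbers.
ascii_codes = {ch: ord(ch) for ch in "SIR"}   # S=83, I=73, R=82
other_codes = {"S": 1, "I": 2, "R": 3}        # hypothetical alternative

strain_a, strain_b = "SSSRSS", "SSRRRS"       # rows from the quoted post

for name, codes in (("ascii", ascii_codes), ("other", other_codes)):
    a = checksum(strain_a, codes)
    b = checksum(strain_b, codes)
    print(name, a, b, "A higher" if a > b else "B higher")
# ascii 497 495 A higher
# other 8 12 B higher
```

	The same two rows swap places as soon as the code table changes, so
the number says nothing about which result is "better".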

>
>Even if I get a correct result I have still the same problem as before why result B over result A.
>

	Unfortunately, until you CAN describe why one result is preferred over
another, no one will be able to suggest algorithm(s) that may work (of
course, once you can explain it, you may not need assistance translating it
to code). For all we know, the cost of the various compounds might be a
factor affecting which of two similar result rows might be desired.

	Actually, I'm still perplexed at the idea that the solution is picking
microbe strains that are most resistant to drugs -- unless one is trying to
reduce test candidates for yet undeveloped drugs ("if our new concoction
kills this strain, /then/ we will test it against the rest of the
strains").

	I'm tempted to suggest R (or other statistical software) and
experimenting with various presentation/partitioning operations to see if
something reasonable pops out. Your data is NOT numerical (so don't bother
assigning numbers to your <null>SIR -- after all, you could just as easily
assign the ordinal position in the ASCII alphabet to them), so statistical
operations that work on non-numeric "factors" make as much sense, if not
more. (I've only toyed with R, so I don't know if it has partitioning
ability for three factors -- it would be a bit tedious to have to specify, say,

	compound-X factor = R
(true)					(false)
						compound-X factor = S
						(true)				(false)
)
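
	The nested test above could be sketched roughly as follows. This is
Python rather than R, the function name is hypothetical, "strain C" is
invented purely to exercise the third branch, and the column index stands in
for "compound-X"; treat it as an illustration of the branching, not as a
recommended analysis.

```python
def partition(rows, column):
    """Split rows on one compound column: R first, then S among the non-R rows."""
    groups = {"R": [], "S": [], "other": []}
    for name, row in rows.items():
        value = row[column]
        if value == "R":            # compound-X factor = R (true)
            groups["R"].append(name)
        elif value == "S":          # (false) -> compound-X factor = S (true)
            groups["S"].append(name)
        else:                       # (false) again: I, or no result
            groups["other"].append(name)
    return groups

rows = {
    "strain A": "SSSRSS",   # from the quoted post
    "strain B": "SSRRRS",   # from the quoted post
    "strain C": "SIIRSS",   # made up for this example
}

print(partition(rows, 2))   # third compound, 0-based index
# {'R': ['strain B'], 'S': ['strain A'], 'other': ['strain C']}
```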


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed@ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/