From: Dennis Lee Bieber
Newsgroups: comp.lang.ada
Subject: Re: Some advice required [OT]
Date: Mon, 27 Dec 2021 12:41:43 -0500
Organization: IISS Elusive Unicorn

On Mon, 27 Dec 2021 04:29:06 -0800 (PST), Laurent declaimed the following:

>Why I would choose result from strain B over the result from strain A.
>
>strain A: SSSRSS
>strain B: SSRRRS
>
>Simply counting the number of S, I and R doesn't work. Checksum with/without weight for the column number doesn't
>work either.

	I wouldn't expect a checksum to be of any use, since the idea of most checksums (and CRCs) is to be able to verify that a data sequence has not been corrupted. Checksums don't "rank" data.

>
>Even if I get a correct result I have still the same problem as before why result B over result A.
>

	Unfortunately, until you CAN describe why one result is preferred over another, no one will be able to suggest algorithm(s) that may work (of course, once you can explain it, you may not need assistance translating it to code). For all we know, the cost of the various compounds might be a factor affecting which of two similar result rows might be desired.

	Actually, I'm still perplexed at the idea that the solution is picking microbe strains that are most resistant to drugs -- unless one is trying to reduce test candidates for yet undeveloped drugs ("if our new concoction kills this strain, /then/ we will test it against the rest of the strains").

	I'm tempted to suggest R (or other statistical software) and experimenting with various presentation/partitioning operations to see if something reasonable pops out. Your data is NOT numerical (so don't bother assigning numbers to your SIR -- after all, you could just as easily assign the ordinal position in the ASCII alphabet to them), so statistical operations that work on non-numeric "factors" make as much, if not more, sense.

	(I've only toyed with R, so I don't know if it has partitioning ability for three factors -- it would be a bit tedious to have to specify, say,

		compound-X factor = R	(true) (false)
		compound-X factor = S	(true) (false)
	)

-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed@ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/
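
	As a rough illustration of the two points in the reply (a checksum only detects change, and S/I/R are labels rather than numbers), here is a minimal Python sketch. Python is used only for illustration; the helper names crc and category_counts are made up for this example and are not from the thread. The two profiles are the ones quoted from Laurent's post.

```python
import zlib
from collections import Counter

# Strain profiles quoted in the thread (one S/I/R letter per compound).
profiles = {"A": "SSSRSS", "B": "SSRRRS"}


def crc(profile: str) -> int:
    """CRC-32 of the profile text: an integrity check, not a score."""
    return zlib.crc32(profile.encode("ascii"))


def category_counts(profile: str) -> Counter:
    """Tally S/I/R as unordered labels, without mapping them to numbers."""
    return Counter(profile)


for name, profile in profiles.items():
    # The CRC values differ, but their order carries no meaning;
    # the tallies describe each row and still do not say which is "better".
    print(name, profile, hex(crc(profile)), dict(category_counts(profile)))
```

	The CRC only shows that the two rows are different. The tallies describe each row, but neither one encodes a preference for B over A; that rule still has to come from whoever knows why one result is wanted.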