comp.lang.ada
 help / color / mirror / Atom feed
* Ada and software testing
@ 2021-07-12  0:49 Paul Rubin
  2021-07-12  8:40 ` Dmitry A. Kazakov
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Paul Rubin @ 2021-07-12  0:49 UTC (permalink / raw)


I wonder if there is good guidance around for software testing in the
Ada world, or if it depends too closely on the application area.  I'm
aware of DO-178B and DO-178C in the aviation world, though I haven't
studied either of them.  Sqlite's document about its testing procedure
is also interesting and maybe a cautionary tale.  Sqlite is a really
nice SQL database whose main misfortune from the Ada perspective is that
it is written in C.  Its testing doc is here:

   https://sqlite.org/testing.html

and a little more info can be found in this interview with the author:

  https://corecursive.com/066-sqlite-with-richard-hipp/#testing-and-aviation-standards


Overview:

1. Sqlite originally had only ad hoc testing.  Then the author
(Dr. Richard Hipp) did some work with Rockwell, heard about DO-178B
there, and embarked on a large effort to strenghten Sqlite's testing in
accordance with DO-178B.  Particularly, the Sqlite team created an
enormous suite of unit tests aiming to get 100% MC/DC test coverage.
That is, for any "if" statement, there must be tests that exercise both
branches of the "if".  This seemingly got Sqlite to be very reliable.

2. Later on, fuzz testing came into vogue, so they started fuzzing
Sqlite.  This in fact found a bunch of crashes and vulnerabilities that
were duly fixed, and nonstop fuzzing was added to the test setup.  But
the testing document (section 4.1.6) notes a tension between MC/DC and
fuzzing: MC/DC requires deep parts of the code to be reachable by test
inputs, while fuzz protection tends to use defensive programming against
"impossible" inputs, resulting in seemingly unreachable code.  Fuzz
testing has been effective enough at finding bugs in C programs that it
has now displaced a lot of static analysis in the C world.

3. Sqlite uses a little bit of static analysis (section 11) but the
document says it has not helped much.  Ada on the other hand uses static
analysis extensively, both in its fine grained type system (compared
with C's) and using tools like SPARK.

4. Bugs found by fuzz testing C programs are typically the standard C
hazards like buffer overflows, undefined behaviour (UB) from bad
arithmetic operands, etc.  I'm of the impression that Ada is less
susceptible to these bugs because of mandatory range checking and less
UB in the language.

Well, that went on for longer than I expected.  My questions are
basically:

    Q1. Are there good recommendations for Ada testing strategies?  Do
    the tests resemble the stuff in the Sqlite doc?

    Q2. Is fuzz testing an important part of Ada testing, and does it
    tend to find many bugs?

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-12  0:49 Ada and software testing Paul Rubin
@ 2021-07-12  8:40 ` Dmitry A. Kazakov
  2021-07-14 19:56   ` Paul Rubin
  2021-07-12 16:14 ` Gautier write-only address
  2021-07-16 10:01 ` G.B.
  2 siblings, 1 reply; 12+ messages in thread
From: Dmitry A. Kazakov @ 2021-07-12  8:40 UTC (permalink / raw)


On 2021-07-12 02:49, Paul Rubin wrote:

> Well, that went on for longer than I expected.  My questions are
> basically:
> 
>      Q1. Are there good recommendations for Ada testing strategies?  Do
>      the tests resemble the stuff in the Sqlite doc?
> 
>      Q2. Is fuzz testing an important part of Ada testing, and does it
>      tend to find many bugs?

I do not think so.

Here is a war story of a bug I fixed recently. A network protocol 
implementation used a callback to send the next portion of data, when 
the transport becomes available.

The callback implementation peeks a portion of data from the outgoing 
queue and *asynchronously* sends it away. *If* initiation of sending is 
successful, the queue is popped.

OK?

No, it is a bug that almost never shows itself because initiation of I/O 
would normally deprive the task of the processor. But if it does not and 
I/O completes without losing the processor, the callback is called 
recursively *before* popping the queue and the *same* portion of data is 
sent again.

Now, nether 100% coverage, nor fuzz, not even 100% black box testing can 
detect this, arguably trivial bug.

[The fix is to make recursive calls void]

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-12  0:49 Ada and software testing Paul Rubin
  2021-07-12  8:40 ` Dmitry A. Kazakov
@ 2021-07-12 16:14 ` Gautier write-only address
  2021-07-12 16:41   ` Dmitry A. Kazakov
  2021-07-14 19:32   ` Paul Rubin
  2021-07-16 10:01 ` G.B.
  2 siblings, 2 replies; 12+ messages in thread
From: Gautier write-only address @ 2021-07-12 16:14 UTC (permalink / raw)


Le lundi 12 juillet 2021 à 02:50:17 UTC+2, Paul Rubin a écrit :

> Q2. Is fuzz testing an important part of Ada testing, and does it 
> tend to find many bugs? 

You can combine the power of fuzzing with the power of Ada's strong typing, implying standard Ada run-time checks (e.g. range checks), plus a compiler's own checks (e.g. GNAT's validity checks).

Read the following article for details: https://blog.adacore.com/running-american-fuzzy-lop-on-your-ada-code

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-12 16:14 ` Gautier write-only address
@ 2021-07-12 16:41   ` Dmitry A. Kazakov
  2021-07-14 19:32   ` Paul Rubin
  1 sibling, 0 replies; 12+ messages in thread
From: Dmitry A. Kazakov @ 2021-07-12 16:41 UTC (permalink / raw)


On 2021-07-12 18:14, Gautier write-only address wrote:

> You can combine the power of fuzzing with the power of Ada's strong typing, implying standard Ada run-time checks (e.g. range checks), plus a compiler's own checks (e.g. GNAT's validity checks).

Before the Dark Age of Computing, testing was not arbitrary. You knew 
things about your implementation and even, God forbid, foresaw some of them.

E.g. if the implementation was "linear" (the case for all buffer 
overflow stuff) you would simply test the end points (extremes) and one 
point inside instead of wasting time on anything else.

Of course, to make such considerations and techniques work, the programs 
needed to be designed very differently, which was one of the motivations 
behind Ada constrained subtypes, ranges etc.

This is also one of the reasons why unbounded strings, dynamic memory 
allocation etc must be avoided as you leave some upper bounds undefined 
making lot of things non-testable.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-12 16:14 ` Gautier write-only address
  2021-07-12 16:41   ` Dmitry A. Kazakov
@ 2021-07-14 19:32   ` Paul Rubin
  2021-07-14 19:51     ` Dmitry A. Kazakov
  1 sibling, 1 reply; 12+ messages in thread
From: Paul Rubin @ 2021-07-14 19:32 UTC (permalink / raw)


Gautier write-only address <gautier_niouzes@hotmail.com> writes:
> You can combine the power of fuzzing with the power of Ada's strong
> typing, implying standard Ada run-time checks (e.g. range checks),
> Read the following article for details:
> https://blog.adacore.com/running-american-fuzzy-lop-on-your-ada-code

Thanks, this is pretty interesting.  He runs AFL on three Ada programs:
Zip-Ada, and Ada libraries for reading YAML and JSON.  It finds bugs in
all three, though not very many.  It fits my picture that Ada programs
are less susceptible than C programs are, to the types of bugs that
fuzzing uncovers.

I do have to say that errors thrown by runtime checks on range types are
still program bugs, in the sense that they are type errors, that in
principle we should want to catch at compile time.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-14 19:32   ` Paul Rubin
@ 2021-07-14 19:51     ` Dmitry A. Kazakov
  2021-07-14 20:02       ` Paul Rubin
  0 siblings, 1 reply; 12+ messages in thread
From: Dmitry A. Kazakov @ 2021-07-14 19:51 UTC (permalink / raw)


On 2021-07-14 21:32, Paul Rubin wrote:

> I do have to say that errors thrown by runtime checks on range types are
> still program bugs,

No, it depends on the contract.

> in the sense that they are type errors,

A type error cannot happen at run-time per definition of strong typing. 
Constraint violation is not a type error.

> that in
> principle we should want to catch at compile time.

If you can. In reality it is impossible to enforce validity per type 
system, because such contracts are often not enforceable.

So the trick is to relax the contract by including exceptions, which is 
what Ada constrained subtypes do. But then Constraint_Error becomes a 
legal "value" function + would "return" on overflow.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-12  8:40 ` Dmitry A. Kazakov
@ 2021-07-14 19:56   ` Paul Rubin
  0 siblings, 0 replies; 12+ messages in thread
From: Paul Rubin @ 2021-07-14 19:56 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
> But if it does not and I/O completes without losing the processor, the
> callback is called recursively *before* popping the queue and the
> *same* portion of data is sent again.

This is a garden variety concurrency bug that you're right, wouldn't
normally be found with conventional fuzzing, but might be findable with
stress testing.  A more rigorous approach would involve model checking.

This type of problem happens in C programs all the time as well, and
doesn't really signify anything about the effectiveness of fuzz testing.
Fuzzing is very effective against C programs, but tentatively maybe less
so against Ada programs, because of Ada's more thorough type checking.

> [The fix is to make recursive calls void]

Hopefully there would be some locks between the tasks, though in that
case the problem would show up as deadlock.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-14 19:51     ` Dmitry A. Kazakov
@ 2021-07-14 20:02       ` Paul Rubin
  2021-07-15  7:27         ` Dmitry A. Kazakov
  0 siblings, 1 reply; 12+ messages in thread
From: Paul Rubin @ 2021-07-14 20:02 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
>> I do have to say that errors thrown by runtime checks on range types are
>> still program bugs,
> No, it depends on the contract.

If a contract is broken by either the caller or the callee, it is a
program bug either way, I would have thought.

>> in the sense that they are type errors,
> A type error cannot happen at run-time per definition of strong
> typing. Constraint violation is not a type error.

Hmm ok, if out of range for a range type is considered a constraint
error rather than a type error, then it's ok to say the compiler can't
check it even in principle, and it becomes the responsibility of the
application user or environment.  Inputs that trigger a constraint error
might be considered invalid in some situations.

> If you can. In reality it is impossible to enforce validity per type
> system, because such contracts are often not enforceable.

Yep.  SPARK tries to enforce such constraints at compile time, but it's
not always possible to use it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-14 20:02       ` Paul Rubin
@ 2021-07-15  7:27         ` Dmitry A. Kazakov
  0 siblings, 0 replies; 12+ messages in thread
From: Dmitry A. Kazakov @ 2021-07-15  7:27 UTC (permalink / raw)


On 2021-07-14 22:02, Paul Rubin wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
>>> I do have to say that errors thrown by runtime checks on range types are
>>> still program bugs,
>> No, it depends on the contract.
> 
> If a contract is broken by either the caller or the callee, it is a
> program bug either way, I would have thought.

If the contract includes exception, then nothing is broken.

>>> in the sense that they are type errors,
>> A type error cannot happen at run-time per definition of strong
>> typing. Constraint violation is not a type error.
> 
> Hmm ok, if out of range for a range type is considered a constraint
> error rather than a type error, then it's ok to say the compiler can't
> check it even in principle, and it becomes the responsibility of the
> application user or environment.  Inputs that trigger a constraint error
> might be considered invalid in some situations.

Yes, and the tests must include the cases when exceptions are 
propagated, which is frequently ignored, though in my view such tests 
are even more important than the "normal" cases. Exceptions are not 
likely to happen. So the code not handling contracted exceptions tend to 
slip into production with catastrophic results.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-12  0:49 Ada and software testing Paul Rubin
  2021-07-12  8:40 ` Dmitry A. Kazakov
  2021-07-12 16:14 ` Gautier write-only address
@ 2021-07-16 10:01 ` G.B.
  2021-07-16 10:21   ` Paul Rubin
  2 siblings, 1 reply; 12+ messages in thread
From: G.B. @ 2021-07-16 10:01 UTC (permalink / raw)


On 12.07.21 02:49, Paul Rubin wrote:

> Well, that went on for longer than I expected.  My questions are
> basically:
> 
>      Q1. Are there good recommendations for Ada testing strategies?  Do
>      the tests resemble the stuff in the Sqlite doc?
> 
>      Q2. Is fuzz testing an important part of Ada testing, and does it
>      tend to find many bugs?

I'd like to add, if I may, a third question, perhaps a a follow-up question,
after having been bitten by a bug that was hidden behind assumptions.

Is there a way of systematically looking for hiding places of bugs
specifically in places external to the program text? And, then, what kind
of mock-ups could establish typical testing patterns? I/O is mentioned
in in the sqlite examples, but what if you do not assume that there
is going to be X .equiv. I/O?

Example: Some external library, of very closed source nature,
exposes an unforeseen behavior. It turns out that a library function
uses a lock, and while waiting for it, the function call times out,
the client program reports failure and terminates normally - with
side effects...

After the fact, after some reading and then some testing, in an adjusted
setup, it all seems plausible. "But, I didn't think of that!".
Educated guesses about what the library might do need to be based
on a vast set of documents, plus the seller of the library
also sells expensive training. Programs need a quick fix, though.

So, what is a proper testing strategy once the programmers have found
that the transitive closure of some call might sometimes incur externally
caused behavior? Such as timeout, or ordering effect due to concurrency?


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-16 10:01 ` G.B.
@ 2021-07-16 10:21   ` Paul Rubin
  2021-07-28 15:28     ` Paul Butcher
  0 siblings, 1 reply; 12+ messages in thread
From: Paul Rubin @ 2021-07-16 10:21 UTC (permalink / raw)


"G.B." <bauhaus@notmyhomepage.invalid> writes:
> So, what is a proper testing strategy once the programmers have found
> that the transitive closure of some call might sometimes incur externally
> caused behavior? Such as timeout, or ordering effect due to concurrency?

Depending on the situation, this may be an area to try model checking.
I've been wanting to try Alloy (alloytools.org) but so far have only
clicked around its web site a little.  It looks interesting.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Ada and software testing
  2021-07-16 10:21   ` Paul Rubin
@ 2021-07-28 15:28     ` Paul Butcher
  0 siblings, 0 replies; 12+ messages in thread
From: Paul Butcher @ 2021-07-28 15:28 UTC (permalink / raw)


Hi Paul,

If you haven't done already you may also want to have a look at: https://blog.adacore.com/advanced-fuzz-testing-with-aflplusplus-3-00

It's a follow blog to to the original R&D work around fuzz testing Ada programs and goes into more detail. It also contains an example of why fuzz testing Ada applications over C can actually identify more program anomalies (again by leveraging the power of the Ada runtime checks).

We're actually seeing a lot of interest in fuzz testing Ada programs and a commercial need for an industrial grade fuzz testing solution for Ada.

You may also want to have a look at ED-203A "Airworthiness Security Methods and Considerations" which is a set of guidelines around ED-202A "Airworthiness Security Process Specification". This report explicitly mentions fuzz testing as a means of identifying vulnerabilities and challenging security measures within airborne software.

In addition (and following on from a previous comment) one aspect we are very interested in exploring is being able to bolster existing unit test input data with a fuzzing campaign. Here we would take the existing test inputs and feed them into the fuzzer as the starting corpus (in an automated fashion).

Fuzz testing Ada programs may not currently be a thing, but it soon will be... ;-)

Regards,
Paul Butcher
AdaCore

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-07-28 15:28 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-12  0:49 Ada and software testing Paul Rubin
2021-07-12  8:40 ` Dmitry A. Kazakov
2021-07-14 19:56   ` Paul Rubin
2021-07-12 16:14 ` Gautier write-only address
2021-07-12 16:41   ` Dmitry A. Kazakov
2021-07-14 19:32   ` Paul Rubin
2021-07-14 19:51     ` Dmitry A. Kazakov
2021-07-14 20:02       ` Paul Rubin
2021-07-15  7:27         ` Dmitry A. Kazakov
2021-07-16 10:01 ` G.B.
2021-07-16 10:21   ` Paul Rubin
2021-07-28 15:28     ` Paul Butcher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox