Regression testing in OpenBSD
Sergey Bronnikov
2014-03-13 18:08:40 UTC
Hi, all

I have read the OpenBSD mailing lists for several years and keep running into reports
like "I have tested your patch and it works fine". Such reports are useless.
We don't know _how_ the patch was tested, and we can't retest it the same way.
We need more tests to cover as much of the OpenBSD code as possible.
OpenBSD developers have unit and regression tests in the source tree (src/regress),
but you cannot use them without a CVS checkout on your computer.

I want to port as many as possible of the open source tests used on Linux and FreeBSD to OpenBSD
and give developers and volunteers a simple, easy way to run these tests on OpenBSD.

Here is what I have so far:
- kyua and dependencies (atf, kyua-testers, lutok)
It is a test framework used in NetBSD and FreeBSD, maintained by Julio Merino.
NetBSD and FreeBSD developers plan to switch all regression tests in base to that
framework. It would be nice to have kyua ported to OpenBSD. It would allow us
to run tests from the NetBSD/FreeBSD trees which are absent in OpenBSD CVS.
A port of kyua and its dependencies has already been sent to ***@.
I hope it will be committed to the official ports tree.
- fsx
A good old filesystem stress test. Already ported to OpenBSD and sent to ***@.
- piglit (http://cgit.freedesktop.org/piglit)
OpenGL test suite. The port is a work in progress.
- glean
Good OpenGL tests. Already ported and sent to ***@.
More tests are planned.

The tests themselves are good to run from time to time. For regular regression testing
we need test automation, and Tapper is a good choice from my point of view.

Tapper is a test infrastructure whose key features are:
- support for the TAP format for reports (Test Anything Protocol is the de facto standard for test reports;
if you are a Perl programmer you are probably familiar with it)
- written in Perl (so it can run on the different platforms supported by the OpenBSD project)
- flexible query language for test result evaluation
- integration with Power Distribution Units
- easy way to send a report (just run 'cat regression_report | nc tapper_server 7357'); no special tools are needed.
- integration with CodeSpeed to detect regressions in performance
- integration with the Linux test framework 'autotest'
- project components are distributed under the BSD license, of course :)
Read more on the project site: http://tapper-testing.org
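
To make the reporting step concrete, here is a minimal sketch of producing a TAP report in plain shell. The helper name run_tap and the test commands are made up for illustration; the file name, host and port in the usage comment are the ones from the example above. Tapper may expect extra metadata lines in a report, so treat this as a sketch of the TAP format only, not of what Tapper requires.

```shell
#!/bin/sh
# Sketch only: run_tap is a hypothetical helper, not part of Tapper.
# It runs each argument as a shell command and prints one TAP line per command.

run_tap() {
	# A TAP stream starts with the plan: "1..N" for N tests.
	echo "1..$#"
	n=0
	for cmd in "$@"; do
		n=$((n + 1))
		# Exit status 0 means the test passed; output of the command is discarded.
		if sh -c "$cmd" >/dev/null 2>&1; then
			echo "ok $n - $cmd"
		else
			echo "not ok $n - $cmd"
		fi
	done
}

# Usage (not executed here; host and port as in the example above):
#   run_tap "true" "test -d /usr/src/regress" > regression_report
#   cat regression_report | nc tapper_server 7357
```

The report is plain text, so it can be read and checked by eye before it is sent anywhere.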

If you are interested in the things I have described above, please help me by testing the ports,
reviewing my patches and committing my work to the official tree.

Thanks.
Ingo Schwarze
2014-03-13 21:19:57 UTC
Hi Sergey,
Post by Sergey Bronnikov
We need more and more tests to cover OpenBSD code as much as possible.
Sure, so improve or write some tests.
Post by Sergey Bronnikov
OpenBSD developers have unit and regression tests in the source tree
(src/regress), but you cannot use them without a CVS checkout
on your computer.
That's a non-issue. Without a source tree, you can't do anything
in the first place, in particular not change the code, so the code
is already safe from your interference.
Post by Sergey Bronnikov
I want to port as many as possible of the open source tests used on Linux
and FreeBSD to OpenBSD and give developers and volunteers a simple,
easy way to run these tests on OpenBSD.
When i looked at test suites elsewhere, they were often overengineered
to the point that i wouldn't want to use them at all. Given a
typical framework, it can be terribly difficult to find out what
actually went wrong when the framework reports an error.

Then again, if somebody finds some tests that are really good *and
simple*, sure, bring them in to src/regress.
Post by Sergey Bronnikov
- kyua and dependencies (atf, kyua-testers, lutok)
It is a test framework used in NetBSD and FreeBSD maintained
by Julio Merino.
I heard about that one during BSDCan 2011:

http://www.bsdcan.org/2011/schedule/events/223.en.html

It was incredibly bloated and overengineered already at that point
in time. I didn't look since then, but usually, projects that are
overengineered to begin with get worse as time goes by, not
better.
Post by Sergey Bronnikov
Would be nice to have kyua ported to OpenBSD.
Sure, having a port can do little harm, even if only a few people
use a port, it may be useful. Maybe somebody will use it and find
and fix a few bugs.

However, to get on with OpenBSD unit testing, that is irrelevant.
The test framework we have now is quite simple, and even that one
is used too infrequently, except in very few areas that are actively
maintained. If people aren't even using *that*, a more complicated
framework will get used even less. Anything involving ports has no
chance to get used at all by a relevant number of developers, as far
as i can see.

Regarding test automation: That's completely pointless in my opinion.
It's yet more layers of complexity and abstraction, the reports are
almost invariably ignored and unintelligible, and if you try to enforce
its usage, you teach developers to provide pseudo-tests providing
formal coverage but not actually testing what's relevant.


To summarize, if you are interested in improving OpenBSD regression
tests, i'd suggest working on *actual tests*, not bloated frameworks.
One could look at the existing tests, clean them up such that they
actually run through, do not generate bogus errors, improve the style
in some places such that they are as simple as possible. One could
also write new tests for areas not yet covered, trying to maintain
a simple, if possible uniform style.
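
As a rough sketch of what "as simple as possible" could look like, a single check can be a few lines of shell that print exactly what was run, what was expected and what came back. The helper name check and the basename example are hypothetical; this is not how bsd.regress.mk or any existing test in src/regress is written.

```shell
#!/bin/sh
# Sketch only: check is a hypothetical helper for illustration.
# Usage: check NAME EXPECTED COMMAND [ARGS...]

check() {
	name=$1
	expected=$2
	shift 2
	# Capture stdout and stderr of the command; its exit status is ignored,
	# only the output is compared.
	actual=$("$@" 2>&1) || true
	if [ "$actual" = "$expected" ]; then
		echo "PASS $name"
	else
		# On failure, say precisely what went wrong.
		echo "FAIL $name"
		echo "  command:  $*"
		echo "  expected: $expected"
		echo "  actual:   $actual"
		return 1
	fi
}

# Usage (not executed here):
#   check basename-strips-dir "c" basename /a/b/c
```

Because the whole check is visible in one place, it can be copied out, extended with extra lines, or run under ktrace without touching anything else.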

For a framework, bsd.regress.mk (122 lines of code right now) is about
the right size. Someone working on actual tests will almost certainly
come up with some improvements to bsd.regress.mk, too, removing some
bloat here and there and maybe, sparingly, add a missing feature
now and then. Just like rc.subr(8): Start small, stay small, and
people will understand what's going on. That's the OpenBSD way.

Yours,
Ingo
Sergey Bronnikov
2014-03-14 15:36:16 UTC
Hi, Ingo
Post by Ingo Schwarze
Hi Sergey,
Post by Sergey Bronnikov
We need more and more tests to cover OpenBSD code as much as possible.
Sure, so improve or write some tests.
The number of OpenBSD developers is about 70, and it is really silly not to reuse
existing tests from other projects. It would free up developers' time
for other work, for example new features :)

If you are right, then OpenBSD developers would have to rewrite the piglit OpenGL tests
to get regression testing for Intel/ATI DRM. I believe none of the developers
wants to do double work, and it is better to reuse piglit.
Post by Ingo Schwarze
Post by Sergey Bronnikov
OpenBSD developers have unit and regression tests in the source tree
(src/regress), but you cannot use them without a CVS checkout
on your computer.
That's a non-issue. Without a source tree, you can't do anything
in the first place, in particular not change the code, so the code
is already safe from your interference.
You are looking at this from a point of view where the user is the same as the developer.
But what about the case where someone can help by running tests on specific hardware
but cannot help fix potential problems himself?
According to you, he should download the sources and then run the tests. That is too complicated
for the task of running tests. Lowering the learning curve here can help the project
get more feedback from users.

IMO it is the same as the integration of bsd.ports.mk with the tests from ports:
you run 'make test' and don't dig into the guts of the tests inside the port as long as they pass.
But it is a criterion for whether a port works: it should not be committed while its tests fail.
Post by Ingo Schwarze
Post by Sergey Bronnikov
I want to port as many as possible of the open source tests used on Linux
and FreeBSD to OpenBSD and give developers and volunteers a simple,
easy way to run these tests on OpenBSD.
When i looked at test suites elsewhere, they were often overengineered
to the point that i wouldn't want to use them at all. Given a
typical framework, it can be terribly difficult to find out what
actually went wrong when the framework reports an error.
Ingo, it usually depends on the developers who implemented the framework,
doesn't it? I totally agree that simpler is better. But sometimes
frameworks are a necessary evil. Look at the 'smtpscript' framework (https://github.com/poolpOrg/smtpscript).
It doesn't look like a complex test. But without the framework itself, an SMTP functional test
can be more complex and less flexible.
- trinity syscall fuzzer (http://codemonkey.org.uk/projects/trinity/)
or bloat by design:
- stress2 (https://people.freebsd.org/~pho/stress)
We need to weigh the cost of porting and maintaining a test
against the benefit we get from that test.
Post by Ingo Schwarze
Then again, if somebody finds some tests that are really good *and
simple*, sure, bring them in to src/regress.
Post by Sergey Bronnikov
- kyua and dependencies (atf, kyua-testers, lutok)
It is a test framework used in NetBSD and FreeBSD maintained by Julio Merino.
http://www.bsdcan.org/2011/schedule/events/223.en.html
It was incredibly bloated and overengineered already at that point
in time. I didn't look since then, but usually, projects that are
overengineered to begin with get worse as time goes by, not
better.
Post by Sergey Bronnikov
Would be nice to have kyua ported to OpenBSD.
Sure, having a port can do little harm, even if only a few people
use a port, it may be useful. Maybe somebody will use it and find
and fix a few bugs.
However, to get on with OpenBSD unit testing, that is irrelevant.
The test framework we have now is quite simple, and even that one
is used too infrequently, except in very few areas that are actively
maintained. If people aren't even using *that*, a more complicated
framework will get used even less. Anything involving ports has no
chance to get used at all by a relevant number of developers, as far
as i can see.
Regarding test automation: That's completely pointless in my opinion.
It's yet more layers of complexity and abstraction, the reports are
almost invariably ignored and unintelligible, and if you try to enforce
its usage, you teach developers to provide pseudo-tests providing
formal coverage but not actually testing what's relevant.
I don't know how developers run tests on different machines, but I suppose
it is currently a more or less manual process, and it becomes a chore once you have
more than a few machines. I don't see anything bad in having automation there.
Test automation is a necessary evil here.

An elevator is also more complicated than a ladder, but in some cases you
use a ladder and in others an elevator :) The same goes for automation.
Post by Ingo Schwarze
To summarize, if you are interested in improving OpenBSD regression
tests, i'd suggest working on *actual tests*, not bloated frameworks.
One could look at the existing tests, clean them up such that they
actually run through, do not generate bogus errors, improve the style
in some places such that they are as simple as possible. One could
also write new tests for areas not yet covered, trying to maintain
a simple, if possible uniform style.
For a framework, bsd.regress.mk (122 lines of code right now) is about
the right size. Someone working on actual tests will almost certainly
come up with some improvements to bsd.regress.mk, too, removing some
bloat here and there and maybe, sparingly, add a missing feature
now and then. Just like rc.subr(8): Start small, stay small, and
people will understand what's going on. That's the OpenBSD way.
Totally agree. These are the things for which I love OpenBSD.

Ingo, thank you for your opinions. I will adjust my efforts
to improve tests in OpenBSD according to your comments.
Post by Ingo Schwarze
Yours,
Ingo
Theo de Raadt
2014-03-13 21:38:01 UTC
Post by Ingo Schwarze
To summarize, if you are interested in improving OpenBSD regression
tests, i'd suggest working on *actual tests*, not bloated frameworks.
One could look at the existing tests, clean them up such that they
actually run through, do not generate bogus errors, improve the style
in some places such that they are as simple as possible. One could
also write new tests for areas not yet covered, trying to maintain
a simple, if possible uniform style.
I hold the same view.

Over the decades, when the regression suites have spotted problems I
have often had to scratch my head. I typically cannot discern why the
test program failed, because quite often it is too grand in scope.

It often does not spot the narrow failure case. Yes, yes yes, better
test programs should be there, but watch what happens next.

The exact failure condition is not clear, and so I have to re-learn
the framework to tweak the test to give narrow and detailed reporting.

But say I have to run the test program in an environment where it is
statically compiled, or where it can be ktrace'd, or new lines of code
can be added to it without disrupting the framework, it becomes
obvious this is a pain.

The heavy frameworks get in the way, or require additional learning...

So what often happens is that we write our own quick test programs
from scratch, fix the underlying problem, and then throw that quick
test program away. We do not have time to learn the heavy frameworks
and integrate the new, assuredly more narrow, test code.

There are two solutions to this problem, either

(1) Give the people working in the development environment more time
or an inclination to write these tests

or

(2) Make the frameworks increasingly more obvious and simple to
operate in.

These bulky frameworks fail in the open source world. And quite
honestly, looking at the commercial products we see out there, they
fail there too. Academically, they should be great successes.....