- It seems XWiki committers don't care that much about our pitest build failures. Past records show that a build can stay broken with pitest failures for more than a month without anyone reacting, and we even release with failing pitest jobs. Note that this is not the case for coverage failures, which get fixed much faster.
- It seems I'm mostly the only one spending time fixing these issues, and I'm getting a bit fed up with it. A lot of the time (but not always) it's about handling flickers (i.e. reporting them as GitHub issues on the Descartes project, adding a TODO in the pom, and lowering the threshold).
- I'm not convinced that the benefit we get (a small increase in test quality) is worth the effort. In theory it's interesting, but I have the feeling that most of our tests are written well enough that the coverage percentage and the mutation percentage are similar. There may be some modules where that's not the case, but we're not improving them with the "break-on-threshold" strategy anyway.
- The test coverage metric/threshold is easier and faster to compute and doesn't suffer from flickering. I wonder if it isn't enough for us.
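For readers less familiar with the workflow mentioned above, the "add a TODO in the pom and lower the threshold" step looks roughly like the sketch below. This is illustrative only: the plugin/engine versions, the threshold numbers, and the issue link are placeholders, not our actual values.

```xml
<!-- Sketch of a pitest + Descartes setup in a module pom.
     Versions and numbers are illustrative, not XWiki's actual config. -->
<plugin>
  <groupId>org.pitest</groupId>
  <artifactId>pitest-maven</artifactId>
  <version>1.15.3</version>
  <configuration>
    <!-- Use the Descartes extreme-mutation engine instead of the default Gregor -->
    <mutationEngine>descartes</mutationEngine>
    <!-- TODO: lowered from 80 to 76 because of a flicker,
         see the corresponding pitest-descartes GitHub issue -->
    <mutationThreshold>76</mutationThreshold>
  </configuration>
  <dependencies>
    <dependency>
      <groupId>eu.stamp-project</groupId>
      <artifactId>descartes</artifactId>
      <version>1.3.2</version>
    </dependency>
  </dependencies>
</plugin>
```

Each flicker means touching this block again: file the upstream issue, leave the TODO, and drop `mutationThreshold` a few points, which is exactly the recurring busywork in question.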
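By comparison, the coverage threshold alternative is a deterministic check: the same sources always produce the same ratio, so it can't flicker. A minimal sketch of the kind of rule I have in mind, using a standard JaCoCo check (the minimum value is illustrative):

```xml
<!-- Sketch of a deterministic coverage gate; the 0.80 minimum is illustrative. -->
<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <version>0.8.11</version>
  <executions>
    <execution>
      <goals>
        <goal>prepare-agent</goal>
        <goal>check</goal>
      </goals>
      <configuration>
        <rules>
          <rule>
            <element>BUNDLE</element>
            <limits>
              <limit>
                <counter>INSTRUCTION</counter>
                <value>COVEREDRATIO</value>
                <!-- Fails the build below 80% instruction coverage,
                     and always gives the same result for the same code -->
                <minimum>0.80</minimum>
              </limit>
            </limits>
          </rule>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```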
Thus, I'm questioning our usage of pitest/descartes. I like the concept, and I like that we're one of the few projects pushing beyond plain test coverage; I think it would be worth it if we didn't have the flickers.
In addition, the pitest/descartes projects are not very active and I doubt anyone is going to fix these issues any time soon, which means we'd need to do the work ourselves (and I don't see us having the motivation or time for it) or keep lowering the thresholds.
I’m not proposing anything at this stage, just wanted to check your POV on the topic.