Mutation Analysis:

What Code Coverage Doesn't Tell Us

#devclub #pitest

Gleb Smirnov, Plumbr

me@gvsmirnov.ru

Sometimes, The Tests Suck

public class SampleSmallClassTest {

    @Test
    public void testStuff() {
        // The boss says we should write tests.
        // Doesn't seem that hard...

        assertTrue(true);
    }

}

So We Use Code Coverage

$ gradle test --tests *testStuff jacocoTestReport
Element                 Missed Instructions   Coverage
doSomeMath(int, int)                          0%

 

@PHP_CEO

YOU GET THAT CODE COVERAGE TO 100%...
OR YOU'RE FIRED !! !!!!

It Makes Us Write Better Tests

@Test
public void testStuffWithCoverage() {
    // Need to get that coverage thing high...

    for(int a = -5; a <= 5; a ++) {
        for (int b = -5; b <= 5; b++) {
            SampleSmallClass.doSomeMath(a, b);
        }
    }
}

Which Give Better Metrics

$ gradle test --tests *testStuffWithCoverage jacocoTestReport
Element                 Missed Instructions   Coverage
doSomeMath(int, int)                          100%

But Still Suck

We Can Do Better Than That

  1. Run the tests
  2. Mess with the code
  3. Run the tests again
  4. If tests start failing, then the change is covered
  5. If no tests start failing, the change is not covered
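
The five steps above can be tried by hand before reaching for a tool. The sketch below is only an illustration: `isAdult`, its hand-written mutant, and the two "test suites" are hypothetical names made up for this example, and the mutant is typed out manually rather than generated.

```java
import java.util.function.IntPredicate;

final class ManualMutationDemo {

    // Original production code (hypothetical example).
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Step 2, "mess with the code": a hand-made mutant where >= became >.
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    // A weak suite: it executes the code but never checks the boundary.
    static boolean weakSuitePasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5);
    }

    // A stronger suite: it also pins down the behaviour at exactly 18.
    static boolean strongSuitePasses(IntPredicate impl) {
        return weakSuitePasses(impl) && impl.test(18) && !impl.test(17);
    }

    // Steps 4 and 5: the change is covered only if some test starts failing.
    static String verdict(boolean passesOnOriginal, boolean passesOnMutant) {
        if (!passesOnOriginal) {
            return "INVALID"; // the suite has to be green on the original code first
        }
        return passesOnMutant ? "SURVIVED" : "KILLED";
    }

    public static void main(String[] args) {
        System.out.println("weak suite:   " + verdict(
                weakSuitePasses(ManualMutationDemo::isAdult),
                weakSuitePasses(ManualMutationDemo::isAdultMutant)));
        System.out.println("strong suite: " + verdict(
                strongSuitePasses(ManualMutationDemo::isAdult),
                strongSuitePasses(ManualMutationDemo::isAdultMutant)));
    }
}
```

Both suites execute the same line, so line coverage cannot tell them apart; only the second one notices the change.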

The Idea Is Not New At All

We Need Automated Tools

A Fancy Table

Tool           Java  Ant  Maven  Gradle  CI  TestNG  Mocks*  Last Update
Simple Jester  5-6   N    N      N       N   N       ?       2009
Javalanche     5-6   N    N      N       N   N       ?       2011
Jumble         4-6   Y    N      N       N   Y       3/6     2013
Pitest         5-8   Y    Y      Y       Y   Y       6/6     Yesterday

* Powermock, JMock, JMock2, Mockito, JMockit, EasyMock

Yay Pitest!

Let's Check It Out!

$ gradle test --tests *testStuffWithCoverage pitest
Name Line Coverage Mutation Coverage
Total 100% 0%
SampleSmallClass.java

Some Very Advanced Business Logic

public static int doSomeMath(int a, int b) {
    long result = b;

    result += a;
    result -= b;
    result -= a;

    return (int) result;
}

Pit Has Messed With Your Code (1)

public static int doSomeMath(int a, int b) {
    long result = b;

    result -= a;
    result -= b;
    result -= a;

    return (int) result;
}

Replaced long addition with subtraction → SURVIVED

Pit Has Messed With Your Code (2)

public static int doSomeMath(int a, int b) {
    long result = b;

    result += a;
    result -= b;
    result += a;

    return (int) result;
}

Replaced long subtraction with addition → SURVIVED

Pit Has Messed With Your Code (3)

public static int doSomeMath(int a, int b) {
    long result = b;

    result += a;
    result -= b;
    result -= a;

    return ((int) result == 0 ? 1 : 0);
}

Replaced return value with (x == 0 ? 1 : 0) → SURVIVED

So We Need To Assert Something

@Test
public void testProperly() {
    final int expected = 0;

    for(int a = -5; a <= 5; a ++) {
        for(int b = -5; b <= 5; b ++) {
            assertEquals(expected, doSomeMath(a, b));
        }
    }
}

The Test Seems Fine Now

$ gradle test --tests *testProperly pitest
Name Line Coverage Mutation Coverage
Total 100% 100%
SampleSmallClass.java
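
To see why the assertion makes the difference, the three mutants from the previous slides can be written out by hand and run against both tests. This is a sketch, not Pitest output: the class name and the helper methods are invented for the example, and the two "tests" are plain methods that mimic `testStuffWithCoverage` and `testProperly`.

```java
import java.util.function.IntBinaryOperator;

final class DoSomeMathMutants {

    // Mutant 1: the long addition is replaced with subtraction.
    static int mutant1(int a, int b) {
        long result = b;
        result -= a;
        result -= b;
        result -= a;
        return (int) result;
    }

    // Mutant 2: the last long subtraction is replaced with addition.
    static int mutant2(int a, int b) {
        long result = b;
        result += a;
        result -= b;
        result += a;
        return (int) result;
    }

    // Mutant 3: the return value is replaced with (x == 0 ? 1 : 0).
    static int mutant3(int a, int b) {
        long result = b;
        result += a;
        result -= b;
        result -= a;
        return ((int) result == 0 ? 1 : 0);
    }

    // Mimics testStuffWithCoverage: calls the code and ignores the result.
    static boolean coverageOnlyTestPasses(IntBinaryOperator impl) {
        for (int a = -5; a <= 5; a++) {
            for (int b = -5; b <= 5; b++) {
                impl.applyAsInt(a, b);
            }
        }
        return true;
    }

    // Mimics testProperly: every result has to be 0.
    static boolean properTestPasses(IntBinaryOperator impl) {
        for (int a = -5; a <= 5; a++) {
            for (int b = -5; b <= 5; b++) {
                if (impl.applyAsInt(a, b) != 0) {
                    return false;
                }
            }
        }
        return true;
    }

    static String status(boolean testPasses) {
        return testPasses ? "SURVIVED" : "KILLED";
    }

    public static void main(String[] args) {
        IntBinaryOperator[] mutants = {
            DoSomeMathMutants::mutant1, DoSomeMathMutants::mutant2, DoSomeMathMutants::mutant3
        };
        for (int i = 0; i < mutants.length; i++) {
            System.out.println("mutant " + (i + 1)
                    + ": coverage-only test -> " + status(coverageOnlyTestPasses(mutants[i]))
                    + ", asserting test -> " + status(properTestPasses(mutants[i])));
        }
    }
}
```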

The Ultimate Guide To Mutation Analysis

  1. Pick classes to mutate
  2. Mutate the chosen classes
  3. Run the tests against the mutants

Picking Mutation Targets

  1. Compile the sources to bytecode
  2. Run selected tests
  3. Remember which instructions are executed by which tests
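
Step 3 is what keeps the later stages affordable: a mutant only has to be run against the tests that actually execute the mutated code. A rough sketch of that bookkeeping follows. It assumes hypothetical string identifiers for tests and code locations (the tool itself works on bytecode instructions), and `CoverageMap` is an illustrative name, not a Pitest class.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CoverageMap {

    // Code location -> names of the tests that executed it.
    private final Map<String, Set<String>> testsByLocation = new HashMap<>();

    // Called while the selected tests run, once per (test, location) hit.
    void record(String testName, String location) {
        testsByLocation.computeIfAbsent(location, key -> new TreeSet<>()).add(testName);
    }

    // The only tests worth running against a mutant at this location.
    Set<String> testsCovering(String location) {
        return Collections.unmodifiableSet(
                testsByLocation.getOrDefault(location, Collections.emptySet()));
    }

    public static void main(String[] args) {
        CoverageMap coverage = new CoverageMap();
        coverage.record("testProperly", "SampleSmallClass.doSomeMath:3");
        coverage.record("testStuffWithCoverage", "SampleSmallClass.doSomeMath:3");

        System.out.println(coverage.testsCovering("SampleSmallClass.doSomeMath:3"));
        // A location no test reached: nothing to run, the mutant is simply uncovered.
        System.out.println(coverage.testsCovering("SampleSmallClass.neverCalled:1"));
    }
}
```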

Building Up Your Team Of Mutants

What we want to have:

  • Stability: a re-run produces identical results
  • Recall: all the possible errors are made
  • Precision: there are no false positives
    (aka Equivalent Mutations: mutations which result in no observable change in behavior)
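
An equivalent mutation is easiest to understand from an example. The sketch below uses a hand-written mutant that replaces `<` with `!=` in a loop condition. That particular change is an assumption picked for illustration (it is not claimed to be one of Pitest's mutators), and the class and method names are hypothetical.

```java
final class EquivalentMutantDemo {

    // Original: sums all elements of the array.
    static int sumOriginal(int[] values) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        return sum;
    }

    // Hand-made mutant: "<" became "!=". Since i starts at 0 and grows by
    // exactly 1, both conditions turn false at the same moment, so no test
    // can observe a difference.
    static int sumMutant(int[] values) {
        int sum = 0;
        for (int i = 0; i != values.length; i++) {
            sum += values[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] inputs = { {}, {7}, {1, 2, 3}, {-4, 4, 10, -10} };
        for (int[] input : inputs) {
            System.out.println(sumOriginal(input) + " vs " + sumMutant(input));
        }
    }
}
```

A mutant like this always shows up as SURVIVED, which is why it counts as a false positive rather than as a missing test.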

NEGATE_CONDITIONALS ON by default

Before:

if(isAllowed(action, user)) {
    action.execute();
}

After:

if(!isAllowed(action, user)) {
    action.execute();
}

CONDITIONALS_BOUNDARY ON by default

Before:

if(remainingHp <= 0) {
    kill(player);
}

After:

if(remainingHp < 0) {
    kill(player);
}

REMOVE_CONDITIONALS OFF by default

Before:

if(cacheEnabled()) {
    cache.store(result);
}

After:

cache.store(result);

Before:

if(isValid(authData)) {
    login(authData);
} else {
    throw new GtfoException();
}

After:

throw new GtfoException();

Conditionals

MATH ON by default

Before:

int sum = a + b;

After:

int sum = a - b;

NB: Also affects field increments and decrements

private int total;
void add(double sample) {
    total ++;
}

0: aload_0
1: dup
2: getfield
5: iconst_1
6: iadd # ← addition
7: putfield

INCREMENTS ON by default

Before:

int total = 0;
for(...) {
    total ++;
}

After:

int total = 0;
for(...) {
    total --;
}

INVERT_NEGS ON by default

Before:

if(profit < 0) {
    loss = -profit;
}

After:

if(profit < 0) {
    loss = profit;
}

REMOVE_INCREMENTS OFF by default

Before:

int total = 0;
for(...) {
    total ++;
}

After:

int total = 0;
for(...) {

}

Math

VOID_METHOD_CALLS ON by default

Before:

for(Listener l : listeners) {
    l.onStuffHappened(stuff);
}

After:

for(Listener l : listeners) {

}

NON_VOID_METHOD_CALLS OFF by default

Before:

double len = hypot(a, b);

After:

double len = 0.0;

CONSTRUCTOR_CALLS OFF by default

Before:

Product p = new Product(...);
//...
double result = calculate(p);

After:

Product p = null;
//...
double result = calculate(p);

RETURN_VALS ON by default

Before:

User findOwner(...) {
  //...
  return possibleOwner;
}

After:

//...
if(possibleOwner == null) throw new RuntimeException(); else return null;

Calls

INLINE_CONSTS OFF by default

Before:

void consume(Object obj) {
    int tlr = (this.tlr =
      (this.tlr * 1664525 +
        1013904223));
    //...
}

After:

void consume(Object obj) {
    int tlr = (this.tlr =
      (this.tlr * 1664525 +
        1013904224));
    //...
}

EXPERIMENTAL_MEMBER_VARIABLE OFF by default

Before:

void stop() {
    this.shouldStop = true;
}

After:

void stop() {

}

EXPERIMENTAL_SWITCH OFF by default

Before:

switch(condition) {
    case 1: doFirst(); break;
    case 2: doSecond(); break;
    default: doDefault(); break;
}

After:

switch(condition) {
    case 1: doDefault(); break;
    case 2: doDefault(); break;
    default: doFirst(); break;
}

REMOVE_SWITCH OFF by default

Before:

switch(condition) {
    case 1: doFirst(); break;
    case 2: doSecond(); break;
    default: doDefault(); break;
}

After:

doDefault();

Misc

Running The Analysis

  1. Pick one mutant at a time
  2. Inject it via Instrumentation API
  3. Run the relevant tests until one fails
  4. Watch out for:
    • Infinite loops
    • OOM and other failures
    • RuntimeExceptions

The Rules Of TDD

 

  1. Do not write any production code unless it is to make a failing test pass
  2. Do not write any more unit tests than is sufficient to fail (compilation failures are failures, too)
  3. Do not write any more production code than is sufficient to pass the one failing unit test

Pitest Grants You Confidence

A success story from TheLadders.com

Okay, but... uh...

Let's Test It On Commons Math

$ svn checkout http://svn.apache.org/repos/asf/comm [...]
$ cd commons-math3
$ patch commons-math3/pom.xml commons-math3-pom-patch.diff

Benchmarking

You're Doing It Wrong

Baseline: Just Run The Tests

$ mvn clean test

[...]

Tests run: 4901, Failures: 0, Errors: 0, Skipped: 43

[INFO] ---------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ---------------------------------------------
[INFO] Total time: 05:17 min

Go pitest!

$ mvn clean test pitest:mutationCoverage
[...]
======================================================
- Timings
======================================================
> scan classpath : < 1 second
> coverage and dependency analysis : 27 minutes and 31 seconds
> build mutation tests : 7 seconds
> run mutation analysis : 3 hours, 47 minutes and 7 seconds
======================================================
> Total  : 4 hours, 14 minutes and 46 seconds

Mutation Coverage: 79%

Throw In More Threads, Maybe?

$ mvn clean test pitest:mutationCoverage -Dpit.threads=2
[...]
======================================================
- Timings
======================================================
> scan classpath : < 1 second
> coverage and dependency analysis : 27 minutes and 28 seconds
> build mutation tests : 3 seconds
> run mutation analysis : 2 hours, 2 minutes and 1 seconds
======================================================
> Total  : 2 hours, 29 minutes and 33 seconds

Pitest Performance Over Time

Still, An Hour?

Incremental analysis

Store the following info between runs:

Incremental analysis: Heuristics

Do not check a mutant, if in the previous run:

Incremental Analysis: Overhead

$ mvn ... pitest:mutationCoverage -P pit-history -Dpit.threads=4
[...]
======================================================
- Timings
======================================================
> scan classpath : < 1 second
> coverage and dependency analysis : 7 minutes and 15 seconds
> build mutation tests : 2 seconds
> run mutation analysis : 55 minutes and 41 seconds
======================================================
> Total  : 1 hours, 2 minutes and 59 seconds

o.a.c.math3.linear.MatrixUtils

public static <T extends FieldElement<T>> FieldMatrix<T>
    createFieldIdentityMatrix(final Field<T> field, final int dimension) {
        final T zero = field.getZero();
        final T one  = field.getOne();
        final T[][] d = MathArrays.buildArray(field, dimension, dimension);
        for (int row = 0; row < dimension; row++) {
            final T[] dRow = d[row];
            Arrays.fill(dRow, zero);
            dRow[row] = one;
        }
        return new Array2DRowFieldMatrix<T>(field, d, false);
}

// Mutant: the call to Arrays.fill(dRow, zero) has been removed
public static <T extends FieldElement<T>> FieldMatrix<T>
    createFieldIdentityMatrix(final Field<T> field, final int dimension) {
        final T zero = field.getZero();
        final T one  = field.getOne();
        final T[][] d = MathArrays.buildArray(field, dimension, dimension);
        for (int row = 0; row < dimension; row++) {
            final T[] dRow = d[row];
            dRow[row] = one;
        }
        return new Array2DRowFieldMatrix<T>(field, d, false);
}

200: removed call to java/util/Arrays::fill : SURVIVED

Incremental Analysis: Profit

$ mvn ... pitest:mutationCoverage -P pit-history -Dpit.threads=4
[...]
======================================================
- Timings
======================================================
> scan classpath : < 1 second
> coverage and dependency analysis : 7 minutes and 3 seconds
> build mutation tests : 1 minutes and 1 seconds
> run mutation analysis : 56 seconds
======================================================
> Total  : 9 minutes and 0 seconds

Get Pitest NOW!

Nothing Is Perfect

Pitest vs. Defensive Programming

public int evaluate(final int distance) {
    if(distance == 0) {
        throw new IllegalArgumentException("Distance should be non-zero");
    }

    int result = 0;
    if(distance > 0) {
        result += calculateDistanceAdjustment(distance);
    }

    return result;
}

11: changed conditional boundary → SURVIVED
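
The boundary mutant survives here because the guard has already thrown for `distance == 0`, so `>` and `>=` can never disagree. The sketch below checks that claim by brute force over a small range. The body of `calculateDistanceAdjustment` is not shown on the slide, so the one used here is a hypothetical stand-in, and the mutant is written by hand.

```java
import java.util.function.IntUnaryOperator;

final class DefensiveBoundaryDemo {

    // Hypothetical stand-in: the real method is not shown on the slide.
    static int calculateDistanceAdjustment(int distance) {
        return 2 * distance;
    }

    static int evaluate(int distance) {
        if (distance == 0) {
            throw new IllegalArgumentException("Distance should be non-zero");
        }
        int result = 0;
        if (distance > 0) {
            result += calculateDistanceAdjustment(distance);
        }
        return result;
    }

    // Hand-written mutant: conditional boundary changed from > to >=.
    static int evaluateMutant(int distance) {
        if (distance == 0) {
            throw new IllegalArgumentException("Distance should be non-zero");
        }
        int result = 0;
        if (distance >= 0) {
            result += calculateDistanceAdjustment(distance);
        }
        return result;
    }

    // Captures either the returned value or the fact that the guard fired.
    static String outcome(IntUnaryOperator impl, int distance) {
        try {
            return String.valueOf(impl.applyAsInt(distance));
        } catch (IllegalArgumentException e) {
            return "IllegalArgumentException";
        }
    }

    static boolean behavesIdentically(int from, int to) {
        for (int d = from; d <= to; d++) {
            String expected = outcome(DefensiveBoundaryDemo::evaluate, d);
            String actual = outcome(DefensiveBoundaryDemo::evaluateMutant, d);
            if (!expected.equals(actual)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("identical on [-1000, 1000]: " + behavesIdentically(-1000, 1000));
    }
}
```

No test can kill this mutant, so the report will keep flagging a line that is in fact fine.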

Pitest vs. Dangerous Code

public class Minitrue {
    public static void rectify(String path) {
        ensurePathIsSecure(path);
        rmRf(path);
    }

    private static final void rmRf(String path) {
        String command = "rm -rf " + path;
        System.out.println("Executing: " + command);
        exec(command);
    }

    // ...
}

Pitest vs. Dangerous Code

@Test
public void testRectifySecurity() {
    boolean rejected = false;

    try {
        Minitrue.rectify("/");
    } catch(SecurityException e) {
        rejected = true;
    }

    Assert.assertTrue(rejected);
    Assert.assertFalse(Minitrue.isUnpath("/"));
}

What Could Possibly Go Wrong?

stderr  : PIT >> Running mutation  [...]
          mutator=VoidMethodCallMutator,
          description=removed call to Minitrue::ensurePathIsSecure

stderr  : PIT >> mutating method rectify
stderr  : PIT >> 2 relevant test for rectify
stderr  : PIT >> Running 1 units

stdout  : Executing: rm -rf /

¯\_(ツ)_/¯
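
The same failure can be reproduced without any risk by swapping the real `exec` for a recorder. The sketch below is an assumption-laden illustration: `SafeMinitrue`, the injected `Consumer<String>`, and the body of `ensurePathIsSecure` are all hypothetical (the slide elides the original), and the mutant is written by hand to mimic what removing the void call does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class SafeMinitrue {

    // Whatever actually runs the command; injected so it can be faked.
    private final Consumer<String> exec;

    SafeMinitrue(Consumer<String> exec) {
        this.exec = exec;
    }

    // Hypothetical guard: the original body is not shown on the slide.
    static void ensurePathIsSecure(String path) {
        if ("/".equals(path)) {
            throw new SecurityException("Refusing to touch " + path);
        }
    }

    void rectify(String path) {
        ensurePathIsSecure(path);
        exec.accept("rm -rf " + path);
    }

    // Hand-written mutant: the void call to ensurePathIsSecure is gone.
    void rectifyMutant(String path) {
        exec.accept("rm -rf " + path);
    }

    public static void main(String[] args) {
        List<String> executed = new ArrayList<>();
        SafeMinitrue minitrue = new SafeMinitrue(executed::add);

        try {
            minitrue.rectify("/");
        } catch (SecurityException e) {
            System.out.println("original rejected: " + e.getMessage());
        }
        System.out.println("commands after original: " + executed);

        minitrue.rectifyMutant("/");
        System.out.println("commands after mutant:   " + executed);
    }
}
```

With a recorder in place the mutant only adds a string to a list; with the real `exec`, running the test suite against it is what wipes the disk.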

See Also