Anagram Code Kata Part 2 – Mocking and SRP
This post is part of a series on coding Kata, BDD, MSpec, and SOLID principles. Feel free to visit the above link, which points to the introductory post and contains an index of all posts in the series.
In this post, we will rewrite our first specification/test from the previous post, where we didn't feel very confident about the direction we were heading. We got some great feedback and will now focus on test driving logic rather than just data; we will also be more careful in extracting the requirements from the problem statement and in giving our classes a Single Responsibility (the first of the SOLID principles of Object-Oriented Design). As we are test driving our design from a "top-down" perspective, we will encounter the need for dependencies not yet implemented while creating the higher level classes. Instead of halting our rhythm and progress, we will utilize a mocking framework to stub out non-existent implementations according to interfaces. This also encourages us to use Dependency Inversion (the last of the SOLID principles; same link as above), and it allows us to prove the validity of our modules by testing them in complete isolation from their dependencies.
Don’t Focus on Data (Yet) and SRP (Single Responsibility Principle)
I had a feeling I was focusing too much on testing data, and I also knew it seemed odd to write my very first test to depend on solving a specific instance of the problem (especially a test case with thousands of words in it) instead of working on solving the problem generally. After some terrific feedback from David Tchepak in the comments of my last post, I now understand that wasn't my only mistake. I was also trying to give my AnagramsFinder class too many responsibilities (especially with just one ParseTextFile method), namely extracting the words out of the text file and then grouping the anagrams together. Not only that, I was also returning a count of anagram sets when the problem statement specifically asked for output of the anagram sets themselves. Even though this is a simple problem scenario, it can be foolish to make assumptions or simplifications, as I did, that are not part of the requirements.
Let us first focus on the responsibilities required of our AnagramsFinder class. It needs to parse a list of words out of a text file given the file path, and it also needs to group words together into sets that are anagrams of each other. However, we just named two responsibilities; implementing both in the same class would make it less cohesive and therefore less maintainable. Our class would have more than one responsibility that could change in the future, and because these responsibilities live in the same class, they are coupled: one could be affected by modifications to the other. This train of thought falls under the Single Responsibility Principle from the SOLID principles (see the links at the beginning of the post).
To solve this predicament, we will make AnagramsFinder a managing class whose dependencies individually address these separate concerns. Each of these dependencies will adhere to the Single Responsibility Principle, as does the higher abstraction manager class itself: its one responsibility can be summarized as managing the inputs, outputs, and execution order of its dependencies. We shall name the dependencies directly after their responsibilities, namely FileParser and AnagramGrouper. However, I don't want to go implement these dependencies now and throw off the flow of fleshing out the design and logic of my manager class; we are trying to design from a more "top-down" approach, instead of focusing on the lower level data concerns of the solution too early in the design process. To complete the design of our manager class without actually implementing the dependencies, we will code to interfaces (namely IFileParser and IAnagramGrouper).
DIP (Dependency Inversion Principle)
The advantage of using interfaces is that we abstract away the actual implementation of a dependency, making our code more modular and our classes less tightly coupled to their dependencies. A class is not strongly tied to a specific implementation of its dependency, but rather to the general contract made available by the interface's defined method signatures. What this really means is that we can swap out actual implementations of the dependency without modifying the consuming class in the least. This makes for truly maintainable code, especially when requirements change down the road after version one of the application is up and running. This is what the Dependency Inversion Principle (again, from SOLID) is all about.
We can take this a step further by passing the necessary dependencies into our consuming class at construction. This allows us to "inject" into the object any implementation of the interface we see fit; the technique is aptly named Dependency Injection (DI). Our manager class doesn't have to fret about any concrete details of its dependencies' actual implementations, nor about any of the construction ceremony associated with "newing" up and initializing those dependencies. We won't be using any DI containers (our dependency needs are very light in this exercise), but following the Dependency Inversion Principle does set us up nicely to use a mocking framework, so that we can finish fleshing out our tests without implementing any of the dependencies yet.
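To make this concrete, here is a minimal sketch of the idea. Note that TextFileParser, AnagramGrouper, TestFileParser, and TestAnagramGrouper are hypothetical class names used only for illustration; they are not (yet) part of the kata code:

// In production code, hypothetical concrete implementations are injected:
var finder = new AnagramsFinder(new TextFileParser(), new AnagramGrouper());

// In a test, the very same class accepts dummy implementations instead,
// without a single line of AnagramsFinder changing:
var testFinder = new AnagramsFinder(new TestFileParser(), new TestAnagramGrouper());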
Mocking Dependencies
We would like to finish the design of our top level AnagramsFinder manager class without fully committing to how its dependencies will be implemented yet. We have a general idea of how we want our consuming class to interact with its dependencies via interfaces. Let's go ahead and look at the interfaces we named earlier:
public interface IFileParser
{
    IEnumerable<string> ExtractWordListFromFile(string filePath);
}

public interface IAnagramGrouper
{
    IEnumerable<string[]> FindAnagramSets(IEnumerable<string> wordList);
}
Now, we could create concrete implementations of the interfaces to be used in our tests that return simple dummy results. An example could look like this:
public class TestFileParser : IFileParser
{
    public IEnumerable<string> ExtractWordListFromFile(string filePath)
    {
        return new[] { "wordA", "wordB", "Aword", "wordC" };
    }
}
This is helpful because it keeps the class we are interested in from depending on any real logic in its dependencies. We want to be able to test our modules in isolation from any dependencies, so that no secondary or outside influences skew the test results. We are basically providing our tests with dummy dependencies that have no logic that could cause side effects.
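For completeness, a matching hand-rolled stub for IAnagramGrouper (again, purely illustrative and never shipped with the application) might look like this:

public class TestAnagramGrouper : IAnagramGrouper
{
    public IEnumerable<string[]> FindAnagramSets(IEnumerable<string> wordList)
    {
        // A canned result with no real grouping logic, so nothing here
        // can skew the outcome of the test.
        return new[] { new[] { "anagram", "gramana" } };
    }
}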
However, the software development community provides tools called mocking frameworks that can create these dependency stubs for you, saving you from writing concrete interface implementations that have no use outside of testing. The mocking framework I will try out is Rhino.Mocks from Ayende. To get set up, you merely need to download the latest build zip archive, extract the Rhino.Mocks.dll library to the Libraries folder we created in the last blog post, and add a reference to the DLL in our Visual Studio solution. I will show you how to use Rhino.Mocks as we rewrite our first specification/test.
Rewrite of Our First Specification/Test
Armed with SRP, DIP, mocking, a more strict adherence to the actual problem statement, and a renewed focus on “favoring test driving logic over just testing data,” we now rewrite our first specification as follows:
[Subject(typeof(AnagramsFinder), "Finding Anagrams")]
public class when_given_text_file_with_word_on_each_line
{
    static AnagramsFinder sut;
    static IEnumerable<string[]> result;
    static string filePath = "dummy_file_path.txt";
    static IEnumerable<string> wordListFromFile = new[] { "wordA", "wordB", "Aword", "wordC" };
    static IEnumerable<string[]> expected = new[] { new[] { "anagram", "gramana" } };

    Establish context = () =>
    {
        var fileParser = MockRepository.GenerateStub<IFileParser>();
        fileParser.Stub(x => x.ExtractWordListFromFile(filePath)).Return(wordListFromFile);

        var anagramGrouper = MockRepository.GenerateStub<IAnagramGrouper>();
        anagramGrouper.Stub(x => x.FindAnagramSets(wordListFromFile)).Return(expected);

        sut = new AnagramsFinder(fileParser, anagramGrouper);
    };

    Because of = () =>
    {
        result = sut.ExtractAnagramSets(filePath);
    };

    It should_result_in_list_of_anagram_sets = () =>
    {
        result.ShouldEqual(expected);
    };
}
I should mention a few notes about the code. First, we need to add a using Rhino.Mocks; statement at the top of our specification class file. Second, the name of our AnagramsFinder instance variable, sut, stands for "Subject Under Test," a mannerism I picked up from David Tchepak, whom I've mentioned several times throughout this post series.
Our use of Rhino.Mocks is found in the calls to the MockRepository.GenerateStub method against an interface. We then proceed to tell the stubbed object how to behave by specifying dummy return values for given methods called with given parameters. The last line of our "context" setup then injects these two newly generated, stubbed dependencies into our manager class for testing. It's also interesting to note that none of my parameters and expected outputs really make much sense. This is done on purpose to show that the data really doesn't matter to this class, as we are not testing the logic of any data processing by the dependencies (remember, test in isolation). It is true that this test isn't testing any real logic at this point, and it may never evolve into testing any real logic either. However, I think the important point is that it aided us in fleshing out the design, which we previously didn't know how we were going to implement. Not all tests are created equal in regard to validating our logic, but they all play a part in driving the design of our code. Hopefully these statements are true, and I would love to hear feedback on this topic in the comments.
I also want to touch on the Specification pattern of Behavior Driven Development and testing. Specification tests are commonly organized into a workflow of establishing context, performing the behavior, and asserting on the results, and the MSpec framework is designed to encourage exactly this organization.
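In MSpec terms, that workflow maps directly onto the Establish, Because, and It delegates. Stripped of all detail, every spec written in this style takes roughly the following shape (the names here are placeholders):

public class when_something_happens
{
    Establish context = () =>
    {
        // arrange: construct the subject under test and stub its dependencies
    };

    Because of = () =>
    {
        // act: perform the single behavior being specified
    };

    It should_have_some_result = () =>
    {
        // assert: verify the outcome of the behavior
    };
}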
Red, Green, Refactor and MSpec Test Runner Output
To make our first test run, we need to create the basic skeleton of our AnagramsFinder class as outlined in our test. This includes a constructor that takes our two dependencies and an ExtractAnagramSets method:
public class AnagramsFinder
{
    private IFileParser fileParser;
    private IAnagramGrouper anagramGrouper;

    public AnagramsFinder(IFileParser fileParser, IAnagramGrouper anagramGrouper)
    {
        this.fileParser = fileParser;
        this.anagramGrouper = anagramGrouper;
    }

    public IEnumerable<string[]> ExtractAnagramSets(string filePath)
    {
        throw new NotImplementedException();
    }
}
I am using the ConsoleRunner according to the method described in Rob Conery’s introductory MSpec and BDD post, including output to an HTML report. After running our specifications, we get the following output:
Specs in AnagramCodeKata:

AnagramsFinder Finding Anagrams, when given text file with word on each line
» should result in list of anagram sets (FAIL)
System.NotImplementedException: The method or operation is not implemented.
... Stack Trace here ...

Contexts: 1, Specifications: 1
0 passed, 1 failed
…and the HTML report:
A highly encouraged tenet of Test Driven Development is the practice of "Red, Green, Refactor," which describes the evolution of the state of your tests. You write the test first and then do the minimum work necessary to get the code base to compile and run, so that you are first running your test in a control state where you know it should fail. Most test runners show failed tests in red, hence the name of the first step. The next stage is to implement the code that makes the test pass and turns the output green, the color commonly indicating passing tests. The Refactor stage is a pause to see if any code can be reorganized or simplified. Then "rinse and repeat as necessary," an applicable instruction from shampoo bottles.
To make our test pass, we implement real logic to coordinate the calls into our dependencies, like so:
public IEnumerable<string[]> ExtractAnagramSets(string filePath)
{
    var wordList = fileParser.ExtractWordListFromFile(filePath);
    return anagramGrouper.FindAnagramSets(wordList);
}
…and the successful outputs of our test runner:
Specs in AnagramCodeKata:

AnagramsFinder Finding Anagrams, when given text file with word on each line
» should result in list of anagram sets

Contexts: 1, Specifications: 1
Final Thoughts and Questions
So what do you think of the new test? Are we better able to see the Single Responsibility of the AnagramsFinder manager class? Are we more in line with solving the actual problem statement? Your feedback would be much appreciated.
In the next post, we will begin work on test driving the design and implementation of the dependencies…unless of course I get feedback that I should be focusing on a different area first or that this last rewrite still needs more help.
Comments
Right now I'm struggling with the same kata, and I have a problem when looking at your AnagramsFinder and IAnagramGrouper: the result of both is the same, and the tests (apart from setting up an unneeded IFileParser for the AnagramsFinder) will be the same too.
I think this is because you're using the pipes-and-filters pattern (unlike David, who combines both results in "focus on test driving logic"). Shouldn't testing just the filters be sufficient here?
What you could do instead is a behavioural test: rather than testing with stubs, use mock objects and test whether the methods were called.
What do you think about that?
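For illustration, a behavioural version of the spec along these lines might use Rhino.Mocks' AssertWasCalled to verify the interactions rather than the return value. This is only a sketch of the commenter's suggestion, reusing the AnagramsFinder from the post:

[Subject(typeof(AnagramsFinder), "Finding Anagrams")]
public class when_extracting_anagram_sets_behaviourally
{
    static AnagramsFinder sut;
    static IFileParser fileParser;
    static IAnagramGrouper anagramGrouper;
    static string filePath = "dummy_file_path.txt";
    static IEnumerable<string> wordListFromFile = new[] { "wordA" };

    Establish context = () =>
    {
        fileParser = MockRepository.GenerateMock<IFileParser>();
        fileParser.Stub(x => x.ExtractWordListFromFile(filePath)).Return(wordListFromFile);
        anagramGrouper = MockRepository.GenerateMock<IAnagramGrouper>();

        sut = new AnagramsFinder(fileParser, anagramGrouper);
    };

    Because of = () => sut.ExtractAnagramSets(filePath);

    // Verify the interactions with the dependencies instead of the result.
    It should_ask_the_file_parser_for_the_word_list = () =>
        fileParser.AssertWasCalled(x => x.ExtractWordListFromFile(filePath));

    It should_pass_that_word_list_to_the_grouper = () =>
        anagramGrouper.AssertWasCalled(x => x.FindAnagramSets(wordListFromFile));
}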
I wouldn't call the previous spec a mistake or "foolish" though. I've seen working on simple cases like Count advocated in lots of places for doing TDD; I've just always struggled with getting from those cases to the main responsibilities of the class.
Looking forward to the rest of the series.