Tag: clean code

Software Craftsmanship and Heroism

A certain restaurant in the Silicon Valley has a sign which says “This is a bad place for a diet. This is a good place for a diet.” There is no contradiction – just 2 different contexts.

So it is with Software Craftsmanship and heroism. The opposite of heroism in software development is a smooth operation high up the CMMI ladder, where everyone knows what to do and no one wastes time or stresses out because the process is moving forward. On the bottom of the CMMI ladder, on the other hand, it’s sometimes chaos, and a developer can be put under pressure to deliver in a hurry because of bad communication or poorly designed and/or poorly tested code encountering demanding clients.

Software craftsmanship is bad for heroism because it reduces the chaos that requires a heroic developer to step in and fix things in a hurry. There’s a software factory pipeline that doesn’t let logic failures reach the client.

Software craftsmanship is good for heroism because sometimes even with a solid agile process and a finely-tuned software factory pipeline and clean, unit-tested code, misconceptions can slip in and a client can do something that the unit tests did not foresee and (rightly) complain about it. And why is craftsmanship good for heroism in this case? Because the code with the error is clean, extensible and unit-testable. If you have an idea for a fix, you can try it out in minutes in a unit test instead of having to code blindly, build, run and test manually and then repeat the process until it’s debugged or you decide to take a different approach. Ultimately, the software craftsperson has to build and run and test manually, but only AFTER the key idea for a solution has been tested and debugged. So with craftsmanship, you may have some shining moments of glory, but the heroic struggle happens gradually, every day, maintaining the quality of the code and the software factory.

Does TDD “damage” your design?

I recently came across a couple articles that challenged some of my beliefs about best practices.

In this article, Simon Brown makes the case for components tightly coupling a service with its data access implementation and for testing each component as a unit rather than testing the service with mocked-out data access. Brown also cites David Heinemeir Hansson, the creator of Rails, who has written a couple of incendiary articles discouraging isolated tests and even TDD in general. Heinemeir Hansson goes so far as to suggest that TDD results in “code that is warped out of shape solely to accomodate testing objectives.Ouch.

These are thought-provoking articles written by smart, accomplished engineers, but I disagree with them.

For those unfamiliar with the (volatile and sometimes confusing and controversial) terminology, isolated tests are tests which mock out dependencies of the unit under test. This is done both for performance reasons (which Heinemeir Hansson calls into question) and for focus on the unit (if a service calls the database and the test fails, is the problem in the service or the SQL or the database tables or the network connection?). There’s also a question of the difficulty of setting up and maintaining tests with database dependencies. There are tools for that, but there’s a learning curve and some set-up required (which hopefully can be Dockerized to make life easier). And there’s one more very important reason which I’ll get to later…

Both Brown and Heinemeir Hansson argue against adding what they consider unnecessary layers of indirection. If your design is test-driven, the need for unit tests will nudge you to de-couple things that Brown and Heinemeir Hansson think should remain coupled. The real dilemma is where should we put the inevitable complexity in any design? As an extreme example, to avoid all sorts of “unnecessary” code you could just put all your business logic into stored procedures in the database.

“Gang of Four” member Ralph Johnson described a paradox:

There is no theoretical reason that anything is hard to change about software. If you pick any one aspect of software then you can make it easy to change, but we don’t know how to make everything easy to change. Making something easy to change makes the overall system a little more complex, and making everything easy to change makes the entire system very complex. Complexity is what makes software hard to change. That, and duplication.

TDD, especially the “mockist” variety, nudges us to add layers of indirection to separate responsibilities cleanly. Johnson seems to be implying that doing this systematically can add unnecessary complexity to the system, making it harder to change, paradoxically undermining one of TDD’s goals.

I do not think that lots of loose coupling makes things harder to change. It does increase the number of interfaces, but it makes it easier to swap out implementations or to limit behavior changes to a single class.

And what about the complexity of the test code? Brown and Heinemeir Hansson seem to act as if reducing the complexity of the test code does not matter. Or rather, that you don’t need to write tests for code that’s hard to test because you should just expand the scope of the tests to do verification at the level of whole components.

Here’s where I get back to that other important reason why “isolated” tests are necessary: math. J.B. Rainsberger simply destroys the arguments of the kind that Brown and Heinemeir Hansson make and their emphasis on component-level tests. He points out that there’s an explosive multiplicative effect on the number of tests needed when you test classes in combination. For an oversimplified example, if your service class has 10 execution paths and its calls to your storage class have 10 execution paths on average, testing them as a component, you may need to write as may as 100 tests to get full coverage of the component. Testing them as separate units, you only need 20 tests to get the same coverage. Imagine your component has 10 interdependent classes like that… Do you have the developer bandwidth to write all those tests? If you do write them all, how easy is it to change something in your component? How many of those tests will break if you make one simple change?

So I reject the idea that TDD “damages” the design. If you think TDD would damage your design, maybe you just don’t know how bad your design is, because most of your code is not really tested.

As for Heinemeir Hansson’s contention that it’s outdated thinking to isolate tests from database access, he may be right about performance issues (not everyone has expensive development machines with fancy SSD drives, but there should be a way to run a modest number of database tests quickly). If a class’s single responsibility is closely linked to the database, I’m in favor of unit-testing it against a real database, but any other test that hits a real database should be considered an integration test. Brown proposes a re-shaped, “architecturally-aligned” testing pyramid with fewer unit tests and more integrated component tests. Because of the aforementioned combinatorial effect of coupling classes in the component, that approach would seem to require either writing (and frequently running) a lot more tests or releasing components which are not exhaustively tested.

The Boy Scout Rule vs. the Pottery Barn Rule

Scenario 1 (The Blind Side)

You’re fairly new here, but it you thought it would be a good idea to rename the method “doProcessing” to “updateSalesStatistics”. It’s true, the code is much more readable now. Did you know that our largest paying customer loads our jar file and calls “doProcessing” via reflection? They’re not happy.

Scenario 2 (Danger UXB):

We had to revert your last commit. We’re 2 weeks from code freeze, and you refactored the doProcessing method. That method is 400 lines long and contains little mountains of indentations, and there’s no way to unit test it. It’s too important to risk breaking it now.

With his Boy Scout Rule, Robert C. Martin encourages us as software craftsmen and craftswomen to be brave and clean up the messes we find, at least a little bit at a time. Unless you have a team experienced in and fanatically devoted to clean code, design, and TDD plus a clear shared vision of the functional specifications, technical debt is added to your project on a daily basis. So the Boy Scout Rule proposes that each of us reduce technical debt on a daily basis, hopefully at least as fast as it accumulates.

This excellent approach sometimes runs afoul of its corporate nemesis, which is best summed up as the Pottery Barn Rule: You break it, you own it.

It can create a real dilemma for a software craftsman or craftswoman who wants to do the right thing but can not promise that every improvement poses no risk. We see in the first scenario above that we can be blindsided by undocumented usages of the code which are not enforced by the compiler. This kind of problem is technical debt that keeps on giving. In the second scenario above we see technical debt which protects itself using fear. It’s like an unexploded bomb, but there’s no bomb squad, just you and the people who will blame you if it explodes after you’ve touched it.  If it explodes when left alone, responsibility for the explosion is diffused.

Overcoming these sorts of blockages requires courage, an effective code review process, and what I might call “good management”. According to an extensive internal study done by Google, the number one key to having an effective team is “psychological safety: Can we take risks on this team without feeling insecure or embarrassed?”. That’s partially down to management, but I think it’s also linked to the number 2 key: “Dependability: Can we count on each other to do high quality work on time?” That’s where the developer can make an impact. If most of the time you get it right when you take a risk, the team will likely cut you some slack on the rare occasions when you don’t.


Is your code a dodecahedron?

One of the key metrics of software quality is cyclomatic complexity. It gives an idea of the number of tests needed to cover all the code in a method (the number of “execution paths”), which often correlates inversely with the readability and maintainability of the code (i.e. lower is better). There are a lot of tools which calculate it for you.

But did you know that McCabe’s formula for cyclomatic complexity is basically the same as Euler’s formula to calculate the number of faces of a polyhedron?

For calculating cyclomatic complexity, each statement in the code is considered a vertex in a graph, and an edge in the graph is added from that vertex to each possible statement which could follow it (including the implicit return if there isn’t an explicit one). Thus an if statement is a vertex with 2 outgoing edges (the true case and the false case), for example.

Where E and V are respectively the number of edges and the number of vertices:

Complexity = E – V + 2

In Euler’s formula, rearranged, the number of faces F = E – V + 2, which is the same as the formula for cyclomatic complexity.

So that really ugly legacy method you’d rather not have to modify might actually be a dodecahedron…

A (regular) dodecahedron

Digression: My browser’s spell-checker suggested for this post the title “Is your code a rhododendron?”. If you have ever written or seen rhododendron-like code, please leave a comment about it. 🙂

Fear of Streams

There’s an article in the new issue of the French magazine Programmez which got my attention. It chronicles the efforts of some programmers for a B2C site to investigate converting their old for loops into more readable and maintainable Java-8 streams. The article is kind of a mess, actually, but to a software craftsman or craftswoman who works on Java legacy code, the message is a bit frightening.  The programmers did performance tests of the methods they changed, but because some of the more complex methods they modified took twice as long to execute as they did before, their employer was not convinced and does not want them to proceed with a refactor.

If you thought that it was difficult to sell your product owner on the idea of adding days to the schedule to clean up legacy code, imagine having to reveal that the clean-up also involves the risk of performance issues!

Not all of their changes slowed down execution – some even speeded it up a bit.  What troubled me about the article, though, is that I could not predict, looking at the examples they gave of their modifications, which ones would run faster and which ones would run slower than before.

I turned to StackOverflow to see if somebody knows how to do that in general. Is there any way, in a code review, for example, to see if a stream will run significantly slower (a constant-order slowdown, but enough to annoy a user or kill a refactoring initiative) based on just looking at the code?  Of course, it’s possible to code stupidly in a way that will obviously cause a slowdown.  But, apart from some useful guidance about when to use parallel streams and when not to, the message I got from some smart, informed developers is that you can’t, or else it’s not worth the trouble for what amounts to at worst a constant-order slowdown. Of course you should review code changes to make sure they don’t affect the correctness or the complexity of the algorithm, and maybe you can tell whether going parallel would help or hurt performance, but that’s all you can usefully detect.

I think the biggest mistake that the developers in the Programmez article made was to evaluate the performance based on isolated unit tests (microtests).  Performance of a method is not interesting to the end-user.  A  constant-order decrease in performance of a method will frighten a product owner, but maybe the end-user won’t even notice it.  For example, it may be that doubling the execution time of a loop results in a 0.001% decrease in performance of a web request because it’s a loop to process results of a heavy database query.

If the code you work on is for a financial trading application where every nanosecond counts, then you’re not going to be using Java streams at all – you likely have customized collections that run super-fast with no syntactic frills.  For the rest of us, we need to do automated end-to-end performance tests for the scenarios where performance might be an issue.

If you convert for loops to streams, don’t convert everything on the same day. Space out these kinds of changes enough that end-to-end tests will catch only one performance issue at a time.  Then you can reassure the product owner that there will be no “noticeable” decrease in performance from the refactor, but that the code will be more readable and understandable, and thus easier to change and less bug-prone.  For certain iterations which can be parallelized efficiently, there may even be noticeable performance gains.