Our Blog

Legacy Code – costs you a hell of a lot of money

Have you ever stopped to think about how Legacy Code impacts your time and money? How it impacts your product’s quality and, by extension, your customers? How it affects your employees?

 

In this article I will cover the harmful impacts of Legacy Code – impacts that cost the organization real money, and quite a lot of it. It is not based on empirical data or academic observations; rather, it is based on common sense and on our personal experience mentoring dozens of teams over the years.

 

Great, so what is Legacy Code?

 

The definition of Legacy Code

In his book, “Working Effectively with Legacy Code”, Michael Feathers gives a simple, formal definition of Legacy Code:

 

To me, Legacy Code is simply code without tests

 

For those of you who have ever worked with Legacy Code, this might sound strange – intuitively, you know that Legacy Code is simply rotten code. Well, you’re right, but Feathers’ definition still holds: when code is not covered by tests, it rots. Why is that, exactly?

Uncle Bob explains this very elegantly in one of his lectures on YouTube:

 

How many of you have looked at a piece of code on your screen and your first thought is: “My God, this is a mess, someone needs to clean it”? And your second thought is: “It’s not gonna be me. I am not going to be the one to clean this code. Because I know that if I clean it, I might break it, and if I break it – it becomes mine…” So you walk away from this code, you will not touch it. This is why code rots.

 

The impacts of Legacy Code

Now that we understand what Legacy Code is and what causes it, let’s talk about its worst impacts.

 

Long manual test cycles during development

Think about code you’re working on that is not covered by tests. If you’re a programmer with good instincts, you probably stop every few lines you add or edit to test what you just did. If you don’t have unit tests in place, you have to run the whole application to test your latest changes, over and over again. Each run consists of:

  • Compiling (in statically typed languages, of course)
  • Running the application, which usually involves some initialization phase
  • Going through a login step, most of the time
  • Stepping through a few screens with a few clicks, sometimes filling in some data
  • Once you’re in the screen you need to test, running some regression tests on it just to make sure you haven’t broken anything

 

Think about it: the process I just described takes half a minute at best, and in most cases no less than 2 minutes. And remember, as a good programmer, you run this process every few lines you change – that means dozens of times a day. Do the math (2 minutes times 30 runs is already an hour) and you will see that these manual tests consume a big portion of your time – I would say around 1 hour out of a working day (8.5 hours).

 

Unit tests can reduce this by an order of magnitude.
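
For comparison, here is the kind of feedback loop a unit test gives you. This is a minimal sketch – the DiscountCalculator class, its method and its numbers are hypothetical, invented purely for illustration – but the point is the cycle time, not the domain:

  import static org.junit.Assert.assertEquals;
  import org.junit.Test;

  public class DiscountCalculatorTest {
    @Test
    public void discount_isCappedAtFiftyPercent() {
      // No full-application compile, no start-up, no login, no clicking
      // through screens - just instantiate the unit and exercise it directly:
      DiscountCalculator calculator = new DiscountCalculator();
      assertEquals(0.5, calculator.calcDiscount(12 /* years as a customer */), 0.0001);
    }
  }

A run like this takes milliseconds, so testing your last few lines of work costs you practically nothing.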

 

Long development time

  • Since the code is rotten, it’s probably not clear (the understatement of the year), and when code is not clear, developers spend a lot more time trying to understand it.
  • Rotten code is also fragile, and since the code is unclear and fragile, developers spend more time on testing – just to be on the safe side (and if they don’t, the product is in danger).
  • Rotten code is also very rigid. When the code is rigid, developers spend more time on almost every change they make, trying to fit their change in.

 

Long QA time

  • Since there are no automated tests, the QA process is manual and consumes quite a lot of time (human hours instead of machine hours).
  • Since the code is fragile, more bugs sneak into QA and the number of QA cycles grows.

 

Low quality product

  • When the code is rigid and not reusable, developers often have no choice but to duplicate pieces of code. When there are code duplications, it isn’t a question of whether you’ll remember to update all of them when needed; it’s a question of when you’ll forget to update one of them. Once you forget – you produce a bug.
  • When the code is fragile and unclear, more bugs are produced at development time. The more bugs produced at development time – the more bugs get to QA. The more bugs get to QA – the more bugs sneak into production. The more bugs sneak into production – the lower the quality of your product.

 

Hard merges

  • Often, in a Legacy Code environment, Continuous Integration is partial or nonexistent. By the way, having a build server, Jenkins or the like, doesn’t necessarily mean Continuous Integration. A build server is just a tool; Continuous Integration is a discipline.

When that’s the case, developers commit their changes quite rarely and in large batches, which leads to hard, risky merges that consume precious time and tend to produce bugs.

 

Unhappy customers

  • The longer your development cycles, the fewer releases you have in a given period. The fewer releases you have in a given period – the less happy your customers are.
  • When the code is rigid, fragile, unclear and not covered by tests, you’ll find yourself saying ‘no’ to customers’ requests far too often.
  • The lower the quality of your product – the less happy your customers are.

 

The impact on your employees

Good employees like to be part of a winning team. Stack Overflow ran a survey that shows a clear correlation between job satisfaction and how often developers check in code: the more frequently developers check in code, the more satisfied they are. In other words, developers like to work in a fluent, agile environment – and Legacy Code isn’t that environment.

 

When your employees aren’t satisfied, retention drops. Moreover, it becomes harder and harder to find good new employees.

 

Bottom line

If your code can be defined as Legacy Code, keeping the status quo is hardly an option.


Warning! Coderoaches!

Imagine a cool startup office. It has top-notch workstations, huge monitors, the latest keyboards. Programmers have an open cheque to choose their machines: Mac Pro, Symphonia, ThinkStation – you name it, you got it. Vintage pinball and Pac-Man machines alongside an Xbox One and a PS4 Pro. You know, The Works.

You also see pizza trays and noodle boxes lying around, some of them two weeks old. A bit of mold here and there. There are marshmallows and jelly beans on the floor, clearly stepped on numerous times, with scores of ants nibbling at them. Well, you get the picture. It’s a cool startup, but also dirty and badly neglected.

Rarely does a startup’s physical environment look like that. But all too often the codebase does – in startups, in medium-sized companies, and most notably in mature, well-established products, typically in large organizations.

It’s kind of funny that when I meet highly skilled programmers who write clean code, they tend to say that they “get their hands dirty”. Well, I guess the same is true when you clean your home – you wash your hands once you are done, don’t you?
As for the not-clean-coders, it’s not that they intend to write bad code, nor that they like bad code. It’s just that they don’t have the habit of writing code hygienically. Like the pizza trays in the office, they either expect someone else to take care of it, or they simply don’t notice that it’s there.

Here’s an example of a hygienic coding habit. It’s not the only way, just one possible example. Let’s say that I picked a task to complete: enhance Foo.Bar(pub) to validate that pub is a good bar for foo to go into.
I would (a rough code sketch follows the list):

  1. Inherit FooSub from Foo, and make all my changes on it until it’s safe to merge back into Foo. Meanwhile, all existing tests pass and all my teammates can safely work on their code.
    I write my tests against FooSub for as long as I am in my development cycle.
  2. I override Bar() in FooSub, so I can try out my changes. When my tests fail, I undo my bad changes. When my tests pass, I move on to making the next test pass.
  3. I commit, merge and push my code frequently. I am part of the team’s build cycle, and on the same branch.
  4. Once I am happy that the task is complete, I merge my changes to Bar() back into Foo. Again, if I merged correctly, all tests pass. Otherwise, I undo the last change and retry.
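
Here is roughly what steps 1 and 2 might look like in code. This is only a sketch under assumed signatures – Foo, Bar(Pub) and the validation method are placeholders for whatever your real task involves:

  // Pub is a placeholder type from the task description above.
  class Pub { }

  // The original class stays untouched while I work:
  public class Foo {
    public void Bar(Pub pub) {
      // existing production behavior
    }
  }

  // All my in-progress changes live here, so Foo and its existing
  // tests stay green for the rest of the team:
  public class FooSub extends Foo {
    @Override
    public void Bar(Pub pub) {
      validateGoodBarForFoo(pub); // the new validation, driven by tests on FooSub
      super.Bar(pub);             // then delegate to the original behavior
    }

    private void validateGoodBarForFoo(Pub pub) {
      // new logic under development
    }
  }

Once the task is done and all tests pass, the new behavior is merged into Foo itself and FooSub is deleted.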

Michael Feathers has a nice example of this in Working Effectively With Legacy Code, chapter 8, “How Do I Add a Feature?”
This book, by the way, is a must-read for clean coders who want to disinfect their code from coderoaches.

Seems like a lot of extra motion for just a few code changes, doesn’t it?
Yes, it does. So is washing your hands after you visit the loo, or adjusting your seat and mirrors as you enter your car, or bringing all the right tools before you change a fly-net on your window. You can skip these steps and, in most cases, you might come out OK. But the once-in-a-while dysentery, car crash or ruined fly-net frame is so costly that you dare not skip the mundane “overheads”.

Also, the more complex and error-prone the domain, the more you should practice safety measures as you progress.
Take medicine, for example. Would you trust a doctor who doesn’t wash her hands? A nurse who’s careless with his sterile tools? Of course not!

In The Checklist Manifesto, Atul Gawande distinguishes between two types of errors we make:
errors of ignorance – mistakes we make because we don’t know enough – and errors of ineptitude – mistakes we make because we fail to apply what we know.

Heck, if you read text messages while you drive, you are a bad driver – you exhibit ineptitude – you ignore the available knowledge telling you that you are a danger to yourself and to fellow drivers. Needless to say, you are much worse if you also type text messages while driving.

By the same token, you exhibit ineptitude as a programmer if you do not: refactor frequently, cover your code with a high percentage of unit tests, build frequently (including tests, of course), and measure your code with static code analysis.
I can’t stress enough how important learning is (for example, from books like Working Effectively With Legacy Code) for programmers developing their own competence.

If you don’t, you probably have all kinds of old pizza trays and sticky marshmallows in your code. The kind that attract coderoaches. The kind that mean, eventually, you might need to call in pest control and a cleaning agency, shut down your code scene for a while until you get rid of the coderoaches, and only then get back to the now-clean code.
And hopefully keep it clean and hygienic from then on, with tests and all.



Composite objects – (how) do you assert them?

Let me start with a code sample (in Java) that demonstrates what I am going to talk about:

  ...
  private IMAPService _mapService;

  public RealEstateField createField(List<Coordinate> coordinates) {
    RealEstateField field = new RealEstateField();
    field.setCoordinates(coordinates);
    field.setAddress(_mapService.findAddress(coordinates));
    double surface = _mapService.calcSurface(coordinates);
    field.setSurface(surface);
    field.setCenter(_mapService.findCenter(coordinates));
    field.setPrice(surface * _pricePerMeter);

    //...
    //Many other attributes are being set here
    //...

    return field;
  }

  ....

 

Without getting into the code itself and what it really does, I would just say that it’s quite “simple” – it has no loops, no complicated if-else statements, and no sophisticated algorithms. It simply creates a RealEstateField object, while the more “sophisticated” code resides in its dependency, _mapService, which already has tests of its own.

Questions arise: why should we test the createField method if it does nothing special anyway? Moreover, even if we decide to test it, how should we assert its outcome – a RealEstateField object, which is quite a big object?

Let’s try to answer these two questions.

Should I test such methods?

Many people tend to think that only code consisting of complicated if-else statements, loops and algorithms is worth testing. To them, it makes no sense to write a test that lets _mapService.findAddress() return the address ’11 Wall Street’ and then asserts that field.getAddress() really holds the value ’11 Wall Street’.

 

Here is a reminder of how field.address is filled:

field.setAddress(_mapService.findAddress(coordinates));
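
For concreteness, here is roughly what such a test looks like. This sketch assumes Mockito, and assumes FieldsFactory receives its map service through the constructor – both assumptions are mine, for illustration only:

  import static org.junit.Assert.assertEquals;
  import static org.mockito.Mockito.*;
  import org.junit.Test;

  public class FieldsFactoryTest {
    @Test
    public void createField_fillsAddressFromMapService() {
      // Stub the dependency, run the method, assert the attribute:
      IMAPService mapService = mock(IMAPService.class);
      when(mapService.findAddress(anyList())).thenReturn("11 Wall Street");
      FieldsFactory fieldsFactory = new FieldsFactory(mapService);

      RealEstateField field = fieldsFactory.createField(createCoordinates());

      assertEquals("11 Wall Street", field.getAddress());
    }
  }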

 

Well, I tend to disagree, mainly for the following reasons:

Simple code can break too.

When developers work on a piece of code like the one above, many things can go wrong: they can mistakenly swap lines; they can remove lines completely; they can put a value in the wrong attribute; lines can get lost when merging with other developers’ commits; and so forth.

Let’s remember that the real purpose of automated tests in general, and unit tests in particular, is to preserve the current behavior of the code – that is, if someone breaks the code’s behavior by mistake, some tests should fail.

 

Simple code doesn’t mean “not critical”.

If code is “simple” and unsophisticated, that doesn’t mean it’s not critical. Consider the example above: what if someone messes up the price calculation for the real estate fields? That could be a fatal bug!

Moreover, IMO, the worst kind of bugs are the ones that throw no exceptions – they corrupt the data silently, and when this corrupted data eventually triggers an exception elsewhere, it’s hard to trace the problem back to its origin, the code that corrupted the data in the first place. My conclusion: there is no escape from verifying that the object was created correctly – hence, asserting the whole object with all of its attributes.

 

 

Ok, but how should I assert such a big object?

I will demonstrate three ways to do that:

An explicit assertion for every attribute.

@Test
public void createField_happyPath(){
  FieldsFactory fieldsFactory = new FieldsFactory();
  List<Coordinate> coordinates = createCoordinates();

  //
  //Set the polygon here...
  //

  RealEstateField returned = fieldsFactory.createField(coordinates);
  assertEquals("some address", returned.getAddress());
  assertEquals(1.76, returned.getSurface(), 0.001);
  assertEquals(51.24232, returned.getCenter().getLatitude(), 0.001);
  assertEquals(22.2097, returned.getCenter().getLongitude(), 0.001);
}

The problems with this kind of test are quite obvious. First, it is a code smell called Obscure Test. Second, writing such a long list of assertions is quite exhausting. Third, real objects tend to change – more and more attributes get added or removed over time – so this list of assertions becomes incomplete the first time we forget to update it accordingly. For example, suppose we are asked to add a new attribute called ‘tax’ to the RealEstateField object and fill it in the createField method. There is a good chance that we will forget to add an assertion for this new attribute, and the test will still pass, obviously.

 

Deep comparison libraries.

A better option is to use a library like AssertJ that does deep comparison, meaning it takes two objects and, using reflection, compares their attributes one by one. This is how you use such a library:

@Test
public void createField_happyPath(){
  FieldsFactory fieldsFactory = new FieldsFactory();
  List<Coordinate> coordinates = createCoordinates();

  //
  //Set the polygon here...
  //

  RealEstateField expected = new RealEstateField();
  expected.setAddress("some address");
  expected.setSurface(1.76);
  expected.getCenter().setLatitude(51.24232);
  expected.getCenter().setLongitude(22.2097);

  RealEstateField returned = fieldsFactory.createField(coordinates);
  assertThat(returned).isEqualToComparingFieldByFieldRecursively(expected);
}

With this method, the danger that the test becomes incomplete as new attributes are added still exists. See the following example:

 

  ...
  private IMAPService _mapService;

  public RealEstateField createField(List<Coordinate> coordinates) {
    RealEstateField field = new RealEstateField();
    field.setCoordinates(coordinates);
    field.setAddress(_mapService.findAddress(coordinates));
    double surface = _mapService.calcSurface(coordinates);
    field.setSurface(surface);
    field.setCenter(_mapService.findCenter(coordinates));
    field.setPrice(surface * _pricePerMeter);
    field.setCity(_mapService.getCity(coordinates)); //the new attribute
    //...
    //Many other attributes are being set here
    //...

    return field;
  }

  ....

Look at the new setCity line (marked with a comment): the code now fills a new attribute – city – whose value is taken from _mapService.getCity(). The test is unaware of this new method on _mapService, so the mock returns null (the default behavior of most mocking frameworks is to return the default value for every method unless stated otherwise); hence, the city attribute of the returned RealEstateField is also null. And since the city attribute of the expected RealEstateField is null as well (the test is unaware of it too and never sets it), the expected object is equal to the returned object and the test passes. This is bad, of course – the test should have failed.

Another problem with this option is that it does not save us from the exhausting work – we still need to set a value for each and every attribute of the expected object.

This leads me to the third option, and the one I like the most:

Serialize and compare.

The idea behind this method is to serialize the returned object into a JSON/XML string and then compare it with a pre-made JSON/XML string that represents the expected object. It looks like this:

@Test
public void createField_happyPath() {
  FieldsFactory fieldsFactory = new FieldsFactory();
  List<Coordinate> coordinates = createCoordinates();

  //
  // Set the polygon here...
  //

  RealEstateField returned = fieldsFactory.createField(coordinates);
  Gson gson = new GsonBuilder().serializeNulls().create();
  String result = gson.toJson(returned);

  String expected = "{\"address\":\"11 Wall Street\",\"surface\":33.3,\"center\":{\"latitude\":51.24232,\"longitude\":22.2097,\"altitude\":0.0,\"id\":0}}";
  
  assertEquals(expected, result);
}

This method has two big pros:

  1. Unlike with the other two methods, if a new attribute is added to the object as time goes by and you forget to change your test to assert it, the test will fail, because the JSON of the returned object now contains the new attribute (even though it is null) while the JSON of the expected object doesn’t. This is the desired behavior: the test is now incomplete, so it had better fail.
  2. It’s extremely easy to build the expected JSON string: you simply run the test for the first time with an empty string as the expected JSON and let JUnit tell you where you got it “wrong” and what the real JSON string should look like. You do it like this:
@Test
public void createField_happyPath() {
  FieldsFactory fieldsFactory = new FieldsFactory();
  List<Coordinate> coordinates = createCoordinates();

  //
  // Set the polygon here...
  //

  RealEstateField returned = fieldsFactory.createField(coordinates);
  Gson gson = new GsonBuilder().serializeNulls().create();
  String result = gson.toJson(returned);

  String expected = "";
  
  assertEquals(expected, result);
}

But this method has a flaw – you can’t easily build the expected JSON string before you have working code (as I described a few lines above), and that violates a fundamental principle of TDD: you are not allowed to write any production code unless it is to make a failing unit test pass.

When needed, I choose to use this method even at the cost of violating this TDD rule once in a while. After all, most of the time we are not required to use this method anyway – most of the time, the outcome of the CUT (class under test) is much simpler to verify.


Unit tests – are they worth it?

I often hear people doubting the effectiveness of unit tests. They usually say things like “most bugs we encounter are ‘integration bugs’” or “they usually happen due to integration flaws with the DB, with some external API, with some configuration, etc.”

Are they right? Well, it depends.

Consider the following scenario, which I am sure you’ve been in many times before: you’ve been asked to make a change in the system, say to add a small feature. Unfortunately, the area of the code in which you are about to make your changes does not fit the new requirements well. You have two options here:

  1. Redesign/refactor the code so that it fits the new requirements.
  2. Implement the requirements without refactoring, even at the cost of making the code uglier.

Option 1 is the preferred one, but it’s dangerous; thus, teams that don’t have unit tests covering their code usually avoid it and immediately turn to the easier option (which is, of course, the bad one) – option 2.

If you want long-lasting software, you need to make significant changes to your code (cleanups and redesigns), and you should make them frequently.

 

A case I had a few days ago

A few days ago I had to do some refactoring. Nothing huge – I just had to break a class with 2 responsibilities into 2 smaller classes with a single responsibility each. This meant moving lines from one place to another, splitting methods into pieces, changing the constructor’s signature, changing the order of some lines, and so on. You probably understand how dangerous this is – something was bound to go wrong, and something did go wrong. 3 times. That means 3 bugs!

Luckily, I had covered this class with unit tests just before I started tearing it apart, and those tests saved my a** 3 times. They prevented 3 bugs.
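
The actual class isn’t shown here, but schematically, such a split looks something like this (all names below are hypothetical, invented for illustration):

  // Minimal placeholder types, just so the sketch is self-contained:
  class Report { }
  class Data { }

  // Before: one class with two responsibilities -
  // building reports and delivering them:
  public class ReportService {
    public Report build(Data data) { /* assemble the report */ return new Report(); }
    public void send(Report report) { /* email the report */ }
  }

  // After: each responsibility gets a class of its own:
  public class ReportBuilder {
    public Report build(Data data) { /* assemble the report */ return new Report(); }
  }

  public class ReportSender {
    public void send(Report report) { /* email the report */ }
  }

Every call site that used ReportService now has to be rewired to the right class – exactly the kind of mechanical-but-risky change where a safety net of tests pays off.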

 

Ok, so why not manual or integration tests?

A huge advantage of unit tests over manual or integration tests, especially when you are refactoring code, is that unit tests are very fast (if you followed the rules), and that’s crucial for refactoring – you’d better progress in small steps and test your code after each step. That way, if you break something, you can revert to the last known working state of the code very quickly and without diving into debugging.

Remember the refactoring operation I told you about? Since running the relevant tests took exactly 2 seconds, I literally ran the unit tests every few minutes, and when I did break something, I didn’t even have to debug to find the cause – I simply reverted the code to a state I knew had worked a few minutes earlier and started again, with much more attention this time.

Think about how it would have looked if I had to test manually – I would have had to run the whole application every few minutes, log in to it, take 3-5 steps to reach the screen where I could test my changes, and only then start testing – very time consuming. In that case, I guess I would have given up on running it every few minutes and instead run it for batches of changes every half an hour or so, but that would have led me to spend much more time on debugging whenever something went wrong.

Automated integration tests are usually a lot faster than manual ones, and yet even they take too long to run (and I am talking about running only the tests that cover the area I was working on) – long enough to keep me from running them as often.

 

Much better protection with unit tests

2 of the 3 bugs I told you about earlier were very considerate – they caused an exception to be thrown. Such bugs are very easy to find with either manual or integration tests. The third bug was an evil one: it threw no exception; instead, it corrupted the data produced by the class, and it did so without any warning. I call this kind of bug a “silent killer”. Would I have found it with manual or integration tests? I doubt it! Usually, these kinds of tests focus on the “happy path”, looking for major failures or exceptions being thrown.

Luckily, since my unit tests are very specific and cover every behavior of the unit, they managed to catch this vicious bug.

 

Conclusion

So, are they worth it? In my opinion – hell yeah!

If you don’t have unit tests in place, you will almost always choose the easiest option – driven by fear, you will avoid the needed redesign and most likely go for the quick and dirty solution.

Unit tests are there to help you be more aggressive with your code when needed. They are there to help you make more frequent and more significant changes to your code, because if you don’t, your code will eventually become a mess.


AlgoTrader

Refactoring katas

As part of our series of blog posts about refactoring katas, we now publish a second kata: AlgoTrader.

This kata is written in Java, and the refactoring operations described here were done using IntelliJ IDEA.

The code can be found here (look for the “AlgoTrader” folder).

Second kata: AlgoTrader

 

Background

In an algo trading system (a bot that trades in the stock markets), there is a service called StockMarketAPI whose responsibility is to abstract the web requests to the stock market API. It looks like this:

