Abhishek Gautam, June 19, 2021

What are code smells & how to avoid them?

Bad code works, but as programmers, we always want to write CLEAN quality code

Bad code works, but as programmers, we always want to write clean, quality code. If I were to define a criterion for quality code, the only valid measure for me would be WTFs/minute. I feel one of the first things we should pick up early in our programming careers is a “code nose”.

What is CLEAN code?

“Any code that reads like a story, reveals its intent, reduces guesswork, and is maintainable can safely be categorized as clean code.”

Your code should look like it was written by a single person, not by 50 programmers over 10 years, and it should not be opaque.

Robert C. Martin, in his book “Clean Code”, describes the following characteristics of clean code:

  1. It should be elegant and pleasing to read. It should make you smile the same way a well-crafted music box does.
  2. Each component of the code should be focused and should show a single-minded attitude.
  3. It should be designed with simplicity in mind and should contain no duplication.
  4. It should be consistent in design and implementation.

Sharpening our sense of code smell

A code smell is not a bug, but it makes the code vulnerable to bugs. It is poorly written code that violates fundamental software engineering principles, which results in poor-quality code (more WTFs/minute). It not only adds technical debt but also hurts maintainability.

One of the sore points is that people don’t fix smells right away and say, “We will fix it later.” But let’s face it: you’ll never have the time to clean it up later. To get rid of smells we should refactor our code. We can use built-in IDE features or external plugins like ReSharper to help us through this process.

Let’s talk about some of the most common code smells and how to get rid of them:

Large Functions

The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.

As humans, we can consciously hold only about 7 things in mind at once. As a best practice, our functions should not be longer than about 10 lines and should have a single responsibility. Long functions are hard to understand, change, and reuse.

We should apply the cohesion principle to split long functions into smaller, more meaningful functions that each handle a single responsibility.
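As a rough illustration, here is a minimal sketch of an order calculation split into small, single-purpose functions. The names (OrderLine, subtotal, apply_discount, add_tax, order_total) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class OrderLine:
    # Hypothetical domain type used only for this illustration.
    unit_price: float
    quantity: int


def subtotal(lines: list[OrderLine]) -> float:
    """Sum price times quantity over all order lines."""
    return sum(line.unit_price * line.quantity for line in lines)


def apply_discount(amount: float, discount_rate: float) -> float:
    """Reduce the amount by a fractional discount, e.g. 0.1 for 10%."""
    return amount * (1 - discount_rate)


def add_tax(amount: float, tax_rate: float) -> float:
    """Increase the amount by a fractional tax rate, e.g. 0.25 for 25%."""
    return amount * (1 + tax_rate)


def order_total(lines: list[OrderLine], discount_rate: float, tax_rate: float) -> float:
    """Reads like a story: subtotal, then discount, then tax."""
    discounted = apply_discount(subtotal(lines), discount_rate)
    return add_tax(discounted, tax_rate)
```

Each helper can be read, tested, and reused on its own, and order_total stays short enough to take in at a glance.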

Long parameter list functions

A function should have no more than 3 arguments. Since 3 is the upper limit, we should try to keep the count as low as possible.

To refactor such functions, we should try to encapsulate logically related parameters in a class. For example, if a method takes to_date and from_date, we can encapsulate them in a DateRange class. Another example: if we query data and apply some filters to it, we can encapsulate the query parameters in a single class, which can also be reused if required.

We should be careful about our method signatures. Don’t use flag arguments, as they are another code smell. Instead, split the method into multiple independent methods.
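A minimal sketch of the DateRange idea follows. The contains method, the validation in __post_init__, and the filter_by_date helper are assumptions added for illustration and are not prescribed by the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    """Groups two related parameters into one value object."""
    from_date: date
    to_date: date

    def __post_init__(self):
        # Assumed rule for this sketch: a range must not run backwards.
        if self.from_date > self.to_date:
            raise ValueError("from_date must not be after to_date")

    def contains(self, day: date) -> bool:
        return self.from_date <= day <= self.to_date


def filter_by_date(days: list[date], period: DateRange) -> list[date]:
    """Takes one DateRange instead of separate from_date and to_date arguments."""
    return [day for day in days if period.contains(day)]
```

Besides shortening the signature, the class gives the validation rule a single home, so every caller gets it for free.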
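To make the flag-argument advice concrete, here is a small sketch. The function names are hypothetical; the “before” version is shown only as a comment.

```python
# Before (smell): one function whose behavior is switched by a boolean flag.
#   def format_report(rows, as_html):
#       if as_html: ...
#       else: ...
# A call such as format_report(rows, True) does not reveal what True means.


def format_report_as_text(rows: list[str]) -> str:
    """Render one row per line."""
    return "\n".join(rows)


def format_report_as_html(rows: list[str]) -> str:
    """Render the rows as an HTML unordered list."""
    items = "".join(f"<li>{row}</li>" for row in rows)
    return f"<ul>{items}</ul>"
```

Each call site now states its intent in the function name instead of in an unexplained True or False.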

Nested Conditionals

Having nested conditionals makes our program hard to read, test, and modify. We should use any combination of the following strategies to refactor nested conditionals.

  1. Use ternary operators, but be careful not to abuse them: use at most one per statement, so that we don’t end up with something like c = a ? b : d ? e : f.
  2. Try to combine nested blocks by using logical operators, but use this technique in moderation too.
  3. Use an early exit strategy.
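The early exit strategy can be sketched as follows. Both functions and the user dictionary layout (the "active" and "orders" keys) are hypothetical; the two versions are meant to behave identically.

```python
from typing import Optional


def discount_nested(user: Optional[dict]) -> float:
    """Nested version: the happy path is buried three levels deep."""
    if user is not None:
        if user["active"]:
            if user["orders"] >= 10:
                return 0.2
            else:
                return 0.1
        else:
            return 0.0
    else:
        return 0.0


def discount_early_exit(user: Optional[dict]) -> float:
    """Same logic with guard clauses: handle special cases first, then return."""
    if user is None:
        return 0.0
    if not user["active"]:
        return 0.0
    if user["orders"] >= 10:
        return 0.2
    return 0.1
```

The second version reads top to bottom without the reader having to track which else belongs to which if.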

In an object-oriented language, we can often use polymorphic dispatch instead of conditional constructs like if/else and switch.
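Here is a minimal sketch of polymorphic dispatch replacing a type check. The Shape, Circle, and Rectangle classes are hypothetical examples, not taken from any particular codebase.

```python
import math
from abc import ABC, abstractmethod

# Before (smell): branching on a type tag at every call site.
#   if shape["kind"] == "circle": ...
#   elif shape["kind"] == "rectangle": ...


class Shape(ABC):
    @abstractmethod
    def area(self) -> float:
        """Each concrete shape supplies its own formula."""


class Circle(Shape):
    def __init__(self, radius: float):
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


class Rectangle(Shape):
    def __init__(self, width: float, height: float):
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


def total_area(shapes: list[Shape]) -> float:
    # No if/else here: the call dispatches on the object's class.
    return sum(shape.area() for shape in shapes)
```

Adding a new shape then means adding a class, rather than editing every conditional that inspects the type.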

Duplicate Code

The problem with duplicate code is not only that it violates DRY (Don’t Repeat Yourself). It also makes changes harder, because the same logic may live in more than one place, and we need to find and fix it everywhere.
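One common fix is to extract the repeated logic into a single function. The sketch below assumes a hypothetical case where email normalization was previously copied into two places.

```python
def normalize_email(raw: str) -> str:
    """The single home for logic that used to be duplicated in both callers."""
    return raw.strip().lower()


def register_user(users: set[str], raw_email: str) -> None:
    users.add(normalize_email(raw_email))


def is_registered(users: set[str], raw_email: str) -> bool:
    return normalize_email(raw_email) in users
```

If the normalization rule changes, there is now exactly one place to update.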

Variable declarations at the top

We should always declare variables close to where they are used and not at the top. Declaring them at the top was common practice because some early C compilers couldn’t handle variables declared in the middle of a block.

This matters because if we declare our variables at the top, the reader has to spend more brainpower keeping track of them, and as clean coders we should avoid that.

Imagine it like this: you meet a friend and ask, “How are you?”, and he says, “I’m fine, and the world population is 7 billion.” The second part is simply unnecessary and not meaningful at that moment. The same applies to variable declarations.
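A small sketch of the idea, using a hypothetical summarize function: each variable is introduced right before the line that needs it, instead of in a block at the top.

```python
def summarize(scores: list[int]) -> str:
    if not scores:
        return "no scores"

    # Introduced only once we know there is something to average.
    average = sum(scores) / len(scores)

    # Introduced right before the line that uses it.
    highest = max(scores)
    return f"average={average:.1f}, highest={highest}"
```

For example, summarize([1, 2, 3]) returns "average=2.0, highest=3".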

Magic Numbers

It’s hard for readers to understand magic numbers, and instead of fixing the problem, people end up adding a comment to explain them.

To get rid of them, we can either use a constant with a proper name or use an enumeration. An enumeration can be reused in multiple places. When naming an enumeration, we should use the singular form.
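Both options can be sketched briefly. The constant MAX_LOGIN_ATTEMPTS, the OrderStatus enumeration, and the helper functions are hypothetical names used only for illustration.

```python
from enum import Enum

# Before (smell): if failed_attempts >= 3 ... / if status == 1 ...

MAX_LOGIN_ATTEMPTS = 3  # named constant instead of a bare 3


class OrderStatus(Enum):  # singular name, as suggested above
    PENDING = 1
    SHIPPED = 2
    DELIVERED = 3


def is_locked_out(failed_attempts: int) -> bool:
    return failed_attempts >= MAX_LOGIN_ATTEMPTS


def can_cancel(status: OrderStatus) -> bool:
    return status is OrderStatus.PENDING
```

The comparison now explains itself, so no comment is needed to say what 3 or 1 stands for.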

Comments

Comments that state the obvious are a smell and create noise in the code. Comments that clarify confusing code are also a smell: instead of adding a clarification, we should fix the real issue in the code. We should never keep version histories in comments, and we should not comment out dead code. Just remove it; you can always get it back from the version control system.

The ultimate comment for our code should be the code itself. We should try to write self-documenting code.

One good candidate for comments is the TODO comment. When we are building or fixing something and don’t want to get distracted, we can add a TODO comment. Lots of IDEs let us view all TODO comments, so we can easily work on them later.
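As a sketch of self-documenting code, the hypothetical pension rule below moves the explanation out of a comment and into names. The thresholds are invented for the example.

```python
# Before (smell): the comment has to explain what the condition means.
#   if a >= 65 and y >= 10:  # employee qualifies for full pension

RETIREMENT_AGE = 65          # assumed value for illustration
MIN_YEARS_OF_SERVICE = 10    # assumed value for illustration


def qualifies_for_full_pension(age: int, years_of_service: int) -> bool:
    return age >= RETIREMENT_AGE and years_of_service >= MIN_YEARS_OF_SERVICE
```

A caller writing if qualifies_for_full_pension(age, years_of_service): needs no comment at all.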

Naming

Poor naming is one of the most common code smells. We should not have mysterious or meaningless names. The reader should not have to go elsewhere in the code to understand the current line. We should also not have names with encodings. For example, back in the old days, IDEs were not powerful enough to show a variable’s type, so people used to prefix the name with the data type to make the type visible at a glance. This was called Hungarian notation, and we should no longer practice it.

Our names should be neither too short nor too long. They should be meaningful and reveal intent, should be chosen from the problem domain, and should follow the language’s naming guidelines and conventions.
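The difference is easiest to see side by side. In this sketch, the Invoice class and the overdue_invoices function are hypothetical names taken from an imagined billing domain; the “before” version appears only as a comment.

```python
from dataclasses import dataclass

# Before (smell): type prefixes and cryptic names hide the intent.
#   def chk(lst_inv, i_d):
#       return [x for x in lst_inv if x.i_days > i_d]


@dataclass
class Invoice:
    number: str
    days_outstanding: int


def overdue_invoices(invoices: list[Invoice], overdue_after_days: int) -> list[Invoice]:
    """Return the invoices that have been unpaid for longer than the limit."""
    return [
        invoice for invoice in invoices
        if invoice.days_outstanding > overdue_after_days
    ]
```

The second version uses words from the problem domain, so the line can be understood without looking anywhere else.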

A lot of code smells can be eliminated if we bring the expressiveness of the programming language closer to the business features, which results in more granular and readable code. We should write code keeping in mind that it should speak the language of the business.

Refactor, but don’t add another bunch of smells in the process.