In this chapter, we will look at problem code and how to refactor it. In the industry, problem code is normally termed code smell. It is code that compiles, runs, and does what it is supposed to do. The reason it is problem code is that it becomes unreadable, complex in nature, and makes the code base hard to maintain and extend further down the line. Such code should be refactored as soon as it's feasible to do so. It is technical debt, and in the long run, if you don't deal with it, it will bring the project to its knees. When this happens, you are looking at an expensive redesign and recoding of the application from scratch.

So what is refactoring? Refactoring is the process of taking existing code that works and rewriting it such that the code becomes clean. And as you have already discovered, clean code is easy to read, easy to maintain, and easy to extend.

In this chapter, we will cover the following topics:

Identifying application-level code smells and how we can address them
Identifying class-level code smells and how we can address them
Identifying method-level code smells and how we can address them

After working your way through this chapter, you will have gained the following skills:

Identifying different kinds of code smell
Understanding why the code is classed as code smell
Refactoring code smells so they become clean code

We'll start our look at refactoring code smells by looking at application-level code smells.

Technical requirements

You will need the following prerequisites for the chapter:

Visual Studio 2019
PostSharp

For the code files of the chapter, you can use this link: https://github.com/PacktPublishing/Clean-Code-in-C-/tree/master/CH13.

Application-level code smells

Application-level code smells are problem code scattered through the application and affect every layer. No matter what layer of the software you find yourself in, you will see the same problematic code appearing over and over again. If you don't address these issues now, then you will find that your software will start to die a slow and agonizing death.

In this section, we will look at the application-level code smells and how we can remove them. Let's start with Boolean blindness.

Boolean blindness

Boolean data blindness refers to the information loss as determined by functions that work on Boolean values. Using a better structure provides better interfaces and classes that keep data, making for a more pleasant experience in working with data.

Let's look at the problem of Boolean blindness via this code sample:

public void BookConcert(string concert, bool standing)
{
    if (standing)
   {
        // Issue standing ticket.
    }
    else
    {
        // Issue sitting ticket.
    }
}

This method takes a string for the concert name, and a Boolean value indicating whether the person is standing or seated. Now, we would call the code as follows:

private void BooleanBlindnessConcertBooking()
{
    var booking = new ProblemCode.ConcertBooking();
    booking.BookConcert("Solitary Experiments", true);
}

If someone new to the code saw the BooleanBlindnessConcertBooking() method, do you think they would know instinctively what true stands for? I think not. They would be blind to what it means. So they would have to either use IntelliSense or locate the method being referred to find the meaning. They are Boolean blind. So how can we cure them of this blindness?

Well, a simple solution would be to replace the Boolean with an enum. Let's start by adding our enum called TicketType:

[Flags]
internal enum TicketType
{
    Seated,
    Standing
}

Our enum identifies two types of ticket types. These are Seated and Standing. Now let's add our ConcertBooking() method:

internal void BookConcert(string concert, TicketType ticketType)
{
    if (ticketType == TicketType.Seated)
    {
        // Issue seated ticket.
    }
    else
    {
        // Issue standing ticket.
    }
}

The following code shows how to call the newly refactored code:

private void ClearSightedConcertBooking()
{
    var booking = new RefactoredCode.ConcertBooking();
    booking.BookConcert("Chrom", TicketType.Seated);
}

Now, if that new person came along and looked at this code, they would see that we are booking a concert to see the band Chrom, and that we want seated tickets.

Combinatorial explosion

Combinatorial explosion is a by-product of the same thing being performed by different pieces of code using different combinations of parameters. Let's look at an example that adds numbers:

public int Add(int x, int y)
{
    return x + y;
}

public double Add(double x, double y)
{
    return x + y;
}

public float Add(float x, float y)
{
    return x + y;
}

Here, we have three methods that all add numbers. The return types and parameters are all different. Is there a better way? Yes, through the use of generics. By using generics, you can have one single method that is capable of working with different types. And so, we will be using generics to solve our addition problem. This will allow us to have a single addition method that will accept either integers, doubles, or floats. Let's have a look at our new method:

public T Add<T>(T x, T y)
{
    dynamic a = x;
    dynamic b = y;
    return a + b;
}

This generic method is called with a specific type assigned to T. It performs the addition and returns the result. Only one version of the method is required for the different .NET types that can be added together. To call the code for int, double, and float values, we would do the following:

var addition = new RefactoredCode.Maths();
addition.Add<int>(1, 2);
addition.Add<double>(1.2, 3.4);
addition.Add<float>(5.6f, 7.8f);

We have just eliminated three methods and replaced them with a single method that performs the same task.

Contrived complexity

When you can develop code with simple architecture, but instead implement an advanced and rather complex architecture, this is known as contrived complexity. Unfortunately, I have suffered having to work on such systems and it is a proper pain and cause of stress. What you find with such systems is that they tend to have a high turnover of staff. They lack documentation, and no one seems to know the system or has the ability to answer questions by onboarders—the poor souls who have to learn the system to maintain and extend it.

My advice to all super-intelligent software architects is that when it comes to software, Keep It Simple, Stupid (KISS). Remember, the days of permanent employment with jobs for life appear to be a thing of the past now. Oftentimes, programmers are more for chasing the money than showing lifelong loyalty to the business. So with the business relying on the software for revenue, you need a system that is easy to understand, to onboard new staff, to maintain, and to extend. Ask yourself this question: If the systems that you are responsible for suddenly experienced yourself and all staff assigned to them walking out and finding new opportunities, would the new staff who take over be able to hit the ground running? Or would they be left stressed out and scratching their heads?

Also bear in mind that if you have only one person on the team who understands that system and they die, move on to a new location, or retire, where does that leave you and the rest of the team? And even more than that, where does it leave the business?

I cannot stress enough that you really are to KISS. The only reason for creating complex systems and not documenting them and sharing the architectural knowledge is to hold the business over a barrel so they keep you on and you can bleed them dry. Don't do it. In my experience, the more complicated a system is, the quicker it dies a death and has to be rewritten.

In Chapter 12, Using Tools to Improve Code Quality, you learned how to use Visual Studio 2019 tools to discover the cyclomatic complexity and Depth of Inheritance. You also learned how to produce dependency diagrams with ReSharper. Use these tools to discover problem areas in the code, then focus on those areas. Reduce cyclomatic complexity down to a value of 10 or less. And reduce the depth of inheritance on all objects down to no greater than 1.

Then, make sure all classes only perform the tasks that they are meant to. Aim to keep methods small. A good rule of thumb is to have no more than around 10 lines of code per method. As for method parameters, replace long parameter lists with parameter objects. And where you have a lot of out parameters, refactor the method to return a tuple or object. Identify any multithreading, and make sure that the code being accessed is thread-safe. You have seen in Chapter 9, Designing and Developing APIs, how to replace mutable objects with immutable ones to improve thread-safety.

Also, look for the Quick Tips icons. These will normally suggest one-click refactorings for the line of code they highlight. I recommend you use them. These were mentioned in Chapter 12, Using Tools to Improve Code Quality.

The next code smell to consider is the data clump.

Data clump

A data clump occurs when you see the same fields appearing together in different classes and parameter lists. Their names usually follow the same pattern. This is normally the sign that a class is missing from the system. The reduction in system complexity will come by identifying the missing class and generalizing it. Don't be put off by the fact that the class may only be small, and never think of a class as being unimportant. If there is a need for a class to simplify the code, then add it.

Deodorant comments

When a comment uses nice words to excuse bad code, this is known as a deodorant comment. If the code is bad, then refactor it to make it good and remove the comment. If you don't know how to refactor it to make it good, then ask for help. If there is no one to ask that can help you, then post your code on Stack Overflow. There are some very good programmers on that site that can be a real help to you. Just make sure to follow the rules when posting!

Duplicate code

Duplicate code is code that occurs more than once. Problems that arise from duplicate code include increased maintenance cost per duplication. When a developer is fixing a piece of code, it costs the business time and money. Fixing 1 bug is technical debt (programmer's pay) x 1. But if there are 10 duplications of that code, that's technical debt x 10. So the more that code is duplicated, the more expensive it is to maintain. Then there is the boredom factor of having to fix the same problem in multiple locations. And the fact that duplication may get overlooked by the programmer doing the bug fix.

It is best to refactor the duplicate code so that only one copy of the code exists. Often, the easiest way to do this is to add it to a new reusable class in your current project and place it in a class library. The benefit of placing reusable code in a class library is that other projects can use the same file.

In the present day, it is best to use the .NET Standard class library for building reusable code. The reason for this is that .NET Standard libraries can be accessed by all C# project types on Windows, Linux, macOS, iOS, and Android.

Another alternative for removing boilerplate code is to use Aspect-Oriented Programming (AOP). We looked at AOP in the previous chapter. You essentially move boilerplate code into an aspect. The aspect then decorates the method it is applied to. When the method is compiled, the boilerplate code is then weaved into place. This enables you only to write code that meets the business requirement inside the method. The aspect applied to the method hides the code that is essential, but not part of what the business has asked for. This coding technique is nice and clean, and it works really well.

You can also write decorators using the decorator pattern, as you also saw in the previous chapter. The decorator wraps concrete class operations in such a way that you can add new code without affecting the expected operation of the code. A simple example would be to wrap the operation in a try/catch block as you saw previously in Chapter 11, Addressing Cross-Cutting Concerns.

Lost intent

If you can't easily understand the intent of the source code, then it has lost its intent.

The first thing to do is look at the namespace and the class name. These should indicate the purpose of the class. Then, check the contents of the class, and look for code that looks out of place. Once you have identified such code, refactor the code and place it in the right location.

The next thing to do is to look at each of the methods. Are they only doing one thing well or doing multiple things not so well? If yes, then refactor them. For large methods, look for code that can be extracted out into a method. Aim to make the code of the class read like a book. Keep refactoring the code until the intent is clear, and only what is in the class needs to be in the class.

Don't forget to put the tools to work that you learned how to use in Chapter 12, Using Tools to Improve Code Quality. The mutation of variables is the code smell we will look at next.

The mutation of variables

The mutation of variables means they are hard to understand and reason about. This makes them difficult to refactor.

A mutable variable is one that gets changed multiple times by different operations. This makes reasoning about why is the value more difficult. Not only that, but because the variable is mutating from different operations, this makes it difficult to extract sections of code into other small and more readable methods. Mutable variables can also require more checking that adds complexity to the code.

Look to refactor small sections of code by extracting them out to methods. If there is a lot of branching and looping, see if there is an easier way to do things to remove the complexity. If you are using multiple out values, consider returning an object or tuple. Aim to remove the mutability of the variable to make it easier to reason about, and know why it is the value that it is, and from where it is getting set. Remember that the smaller the method is that holds a variable, the easier it will be to determine where the variable is getting set, and why.

Look at the following example:

[InstrumentationAspect]
public class Mutant
{
    public int IntegerSquaredSum(List<int> integers)
    {
        var squaredSum = 0;
        foreach (var integer in integers)
        {
            squaredSum += integer * integer;
        }
        return squaredSum;
    }
}

The method takes a list of integers. It then loops through the integers, squares them, and then adds them to the squaredSum variable that is returned when the method exits. Notice the iterations, and the fact that the local variable is getting updated in each iteration. We can improve on this using LINQ. The following code shows the improved, refactored version:

[InstrumentationAspect]
public class Function
{
    public int IntegerSquaredSum(List<int> integers)
    {
            return integers.Sum(integer => integer * integer);
    }
}

In our new version, we use LINQ. As you know from an earlier chapter, LINQ employs functional programming. As you can see here, there is no loop, and no local variable being mutated.

Compile and run the program, and you will see the following:

Both versions of the code produce the same output.

You will have noticed that both versions of the code have [InstrumentationAspect] applied to them. We added this aspect to our reusable library in Chapter 12, Addressing Cross-Cutting Concerns. When you run the code, you will find a Logs folder in the Debug folder. Open the Profile.log file in Notepad, and you will see the following output:

Method: IntegerSquaredSum, Start Time: 01/07/2020 11:41:43
Method: IntegerSquaredSum, Stop Time: 01/07/2020 11:41:43, Duration: 00:00:00.0005489
Method: IntegerSquaredSum, Start Time: 01/07/2020 11:41:43
Method: IntegerSquaredSum, Stop Time: 01/07/2020 11:41:43, Duration: 00:00:00.0000027

The output shows that the ProblemCode.IntegerSquaredSum() method was the slowest version, taking 548.9 nanoseconds to run. And that the RefactoredCode.IntegerSquaredSum() method was much faster, taking only 2.7 nanoseconds to run.

By refactoring the loop to use LINQ, we avoided mutating a local variable. And we also reduced the time it took to process the calculation by 546.2 nanoseconds. Such a small improvement is not noticeable to the human eye. But if you perform such calculations on big data, then you will experience a noticeable difference.

We'll now discuss the oddball solution.

The oddball solution

When you see a problem solved in a different way throughout the source code, this is known as an oddball solution. This can happen because of different programmers having their own style of programming, and no standards being put in place. It can also happen through ignorance of the system, in that the programmer does not realize a solution already exists.

A way to refactor oddball solutions is to write a new class that encompasses the behavior that is being repeated in different ways. Add the behavior to the class in the cleanest way that is the most performant. Then, replace the oddball solutions with the newly refactored behavior.

You can also unite different system interfaces using the Adapter Pattern:

The Target class is the domain-specific interface that is used by Client. An existing interface that needs adapting is called Adaptee. The Adapter class adapts the Adaptee class to the Target class. And finally, the Client class communicates objects that conform to the Target interface. Let's implement the adapter pattern. Add a new class called Adaptee:

public class Adaptee
{
    public void AdapteeOperation()
    {
        Console.WriteLine($"AdapteeOperation() has just executed.");
    }
}

The Adaptee class is very simple. It contains a method called AdapteeOperation() that prints out a message to the console. Now add the Target class:

public class Target
{
    public virtual void Operation()
    {
        Console.WriteLine("Target.Operation() has executed.");
    }
}

The Target class is also very simple and contains a virtual method called Operation() that prints out a message to the console. We'll now add the Adapter class that wires Target and Adaptee together:

public class Adapter : Target
{
    private readonly Adaptee _adaptee = new Adaptee();

    public override void Operation()
    {
        _adaptee.AdapteeOperation();
    }
}

The Adapter class inherits the Target class. We then create a member variable to hold our Adaptee object and initialize it. We then have a single method that is the overridden Operation() method of the Target class. Finally, we will add our Client class:

    public class Client
    {
        public void Operation()
        {
            Target target = new Adapter();
            target.Operation();
        }
    }

The Client class has a single method called Operation(). This method creates a new Adapter object and assigns it to a Target variable. It then calls the Operation() method on the Target variable. If you call a new Client().Operation() method and run the code, you will see the following output:

You can see from the screenshot that the method that gets executed is the Adaptee.AdapteeOperation() method. Now that you have successfully learned how to implement the adapter pattern to solve oddball solutions, we will move on to look at shotgun surgery.

Shotgun surgery

Making a single change that requires changes to multiple classes is known as shotgun surgery. This can sometimes be down to excessive refactoring of code due to divergent changes being encountered. This code smell increases the propensity for introducing bugs such as those caused by a missed chance. You also increase the possibility of merge conflicts, because the code needs to change in so many areas that programmers end up stepping on each other's toes. The code is that convoluted that it induces cognitive overload in programmers. And new programmers have a steep learning curve because of the nature of the software.

The version control history will provide a history of the changes made to the software over time. This can help you identify all the areas that are changed, every time a new piece of functionality is added or when a bug is encountered. Once these areas have been identified, then you can look to move the changes to a more localized area of the code base. This way, when a change is required, you only have to focus on one area of the program and not many areas. This makes the maintenance of the project a lot easier.

Duplicate code is a good candidate for refactoring into a single class that is appropriately named, and that is placed in the correct namespace. Also, consider all the different layers of your application. Are they really necessary? Can things be simplified? In a database-driven application, is it really necessary to have DTOs, DAOs, domain objects, and the like? Could database access be simplified in any way? These are just some ideas for reducing the size of the code base, and so reducing the number of areas that must be modified to effect a change.

Other things to look at are the level of coupling and cohesion. Coupling needs to be kept to an absolute minimum. One way to accomplish this is to inject dependencies via constructors, properties, and methods. The injected dependencies would be of a specific interface type. We will code a simple example. Add an interface called IService:

public interface IService
{
    void Operation();
}

The interface contains a single method called Operation(). Now, add a class called Dependency that implements IService:

public class Dependency : IService
{
    public void Operation()
    {
        Console.WriteLine("Dependency.Operation() has executed.");
    }
}

The Dependency class implements the IService interface. In the Operation() method, a message is printed to the console. Now let's add the LooselyCoupled class:

public class LooselyCoupled
{
    private readonly IService _service;

    public LooselyCoupled(IService service)
    {
        _service = service;
    }

    public void DoWork()
    {
        _service.Operation();
    }
}

As you can see, the constructor takes a type of IService and stores it in a member variable. The call to DoWork() calls the Operation() method within the IService type. The LooselyCoupled class is just that loosely coupled, and it is easy to test.

By reducing coupling, you make classes easier to test. By removing code that does not belong in a class and placing it where it does belong, you improve the readability, maintainability, and extensibility of the application. You lessen the learning curve for anyone coming on board, and there is less chance of introducing bugs when you perform maintenance or new development.

Let's now have a look at solution sprawl.

Solution sprawl

The single responsibility that is implemented within different methods, classes, and even libraries suffer from solution sprawl. This can make code really hard to read and understand. The result is that code becomes harder to maintain and extend.

To fix the problem, move the implementation of the single responsibility into the same class. This way the code is in just one location and does what it needs to. This makes code easy to read and understand. The result is that the code can be easily maintained and extended.

Uncontrolled side effects

Uncontrolled side effects are those issues that raise their ugly heads in production because the quality assurance tests are unable to capture them. When you encounter these problems, the only option you have is to refactor the code so that it is fully testable and variables can be viewed during debugging to make sure they are set appropriately.

An example is passing values by reference. Imagine two threads passing a person object by reference to a method that modifies the person object. A side effect is that unless proper locking mechanisms are in place, each thread can modify the other thread's person object invalidating the data. You saw an example of mutable objects in Chapter 8, Threading and Concurrency.

That concludes our look at application-level code smells. So, now we will move on to look at class-level code smells.

Class-level code smells

Class-level code smells are localized problems with the class in question. The kinds of problems that can plague a class are things like cyclomatic complexity and depth of inheritance, high coupling, and low cohesion. Your aim when writing a class is to keep it small and functional. The methods in the class should actually be there, and they should be small. Only do in the class what needs to be done – no more, no less. Work to remove class dependency and make your classes testable. Remove code that should be placed elsewhere to where it belongs. In this section, we address class-level code smells and how to refactor them, starting with cyclomatic complexity.

Cyclomatic complexity

When a class has a large number of branches and loops, it has an increased cyclomatic complexity. Ideally, the code should have a cyclomatic complexity value of between 1 and 10. Such code is simple and without risks. Code with a cyclomatic complexity of 11-20 is complex but low risk. When the cyclomatic complexity of the code is between 21-50, then the code requires attention as it is too complex and poses a medium risk to your project. And if the code has a cyclomatic complexity of more than 50, then such code is high risk and is not testable. A code that has a value above 50 must be refactored immediately.

The goal of refactoring will be to get the cyclomatic value down to between 1-10. Start by replacing switch statements followed by if expressions.

Replacing switch statements with the factory pattern

In this section, you will see how to replace a switch statement with the factory pattern. First, we will need a report enum:

[Flags]
public enum Report
{
    StaffShiftPattern,
    EndofMonthSalaryRun,
    HrStarters,
    HrLeavers,
    EndofMonthSalesFigures,
    YearToDateSalesFigures
}

The [Flags] attribute enables us to extract the name of the enum. The Report enum provides a list of reports. Now let's add our switch statement:

public void RunReport(Report report)
{
    switch (report)
    {
        case Report.EndofMonthSalaryRun:
            Console.WriteLine("Running End of Month Salary Run Report.");
            break;
        case Report.EndofMonthSalesFigures:
            Console.WriteLine("Running End of Month Sales Figures Report.");
            break;
        case Report.HrLeavers:
            Console.WriteLine("Running HR Leavers Report.");
            break;
        case Report.HrStarters:
            Console.WriteLine("Running HR Starters Report.");
            break;
        case Report.StaffShiftPattern:
            Console.WriteLine("Running Staff Shift Pattern Report.");
            break;
        case Report.YearToDateSalesFigures:
            Console.WriteLine("Running Year to Date Sales Figures Report.");
            break;
        default:
            Console.WriteLine("Report unrecognized.");
            break;
    }
}

Our method accepts a report and then decides on what report to execute. When I started off in 1999 as a junior VB6 programmer, I was responsible for building a report generator from scratch for the likes of Thomas Cook, ANZ, BNZ, Vodafone, and a few other big concerns. There were many reports, and I was responsible for writing a case statement that was massive that dwarfed this one. But my system worked really well. However, by today's standards, there are much better ways of performing this same code and I would do things very differently.

Let's use the factory method to run our reports without using a switch statement. Add a file called IReportFactory as shown:

public interface IReportFactory
{
    void Run();
}

The IReportFactory interface only has one method called Run(). This method will be used by the implementing classes to run their reports. We'll only add one report class, called StaffShiftPatternReport, which implements IReportFactory:

public class StaffShiftPatternReport : IReportFactory
{
    public void Run()
    {
        Console.WriteLine("Running Staff Shift Pattern Report.");
    }
}

The StaffShiftPatternReport class implements the IReportFactory interface. The implemented Run() method prints a message to the screen. Add a report called ReportRunner:

public class ReportRunner
{
    public void RunReport(Report report)
    {
        var reportName = $"CH13_CodeRefactoring.RefactoredCode.{report}Report, CH13_CodeRefactoring";
        var factory = Activator.CreateInstance(
            Type.GetType(reportName) ?? throw new InvalidOperationException()
        ) as IReportFactory;
        factory?.Run();
    }
}

The ReportRunner class has a method called RunReport. It accepts a parameter of type Report. With Report being an enum with the [Flags] attribute, we can obtain the name of the report enum. We use this to build the name of the report. Then, we use the Activator class to create an instance of the report. If the reportName returns null when getting the type, InvalidOperationException is thrown. The factory is cast to the IReportFactory type. We then call the Run method on the factory to generate the report.

This code is definitely much better than a very long switch statement. We need to know how to improve the readability of conditional checks within an if statement. We'll look at that next.

Improving the readability of conditional checks within an if statement

The if statements can break the single responsibility and the open/closed principles. See the following example:

public string GetHrReport(string reportName)
{
    if (reportName.Equals("Staff Joiners Report"))
        return "Staff Joiners Report";
    else if (reportName.Equals("Staff Leavers Report"))
        return "Staff Leavers Report";
    else if (reportName.Equals("Balance Sheet Report"))
        return "Balance Sheet Report";
}

The GetReport() class has three responsibilities: the staff joiners report, the staff leavers report, and the balance sheet report. This breaks the SRP because the method should only be concerned with HR reports and it is returning HR and Finance reports. As far as the open/closed principle is concerned, every time a new report is needed we will have to extend this method. Let's refactor the method so we no longer need the if statement. Add a new class called ReportBase:

public abstract class ReportBase
{
    public abstract void Print();
}

The ReportBase class is an abstract class with an abstract Print() method. We will add the NewStartersReport class, which inherits the ReportBase class:

    internal class NewStartersReport : ReportBase
    {
        public override void Print()
        {
            Console.WriteLine("Printing New Starters Report.");
        }
    }

The NewStartersReport class inherits the ReportBase class and overrides the Print() method. The Print() method prints a message to the screen. Now, we will add the LeaversReport class, which is pretty much the same:

    public class LeaversReport : ReportBase
    {
        public override void Print()
        {
            Console.WriteLine("Printing Leavers Report.");
        }
    }

The LeaversReport inherits the ReportBase class and overrides the Print() method. The Print() method prints a message to the screen. We can now call the reports as follows:

ReportBase newStarters = new NewStartersReport();
newStarters.Print();

ReportBase leavers = new LeaversReport();
leavers.Print();

Both reports inherit the ReportBase class, and so can be instantiated and assigned to a ReportBase variable. The Print() method can then be called on the variable, and the correct Print() method will be executed. The code now adheres to the single responsibility principle and the open/closed principle.

The next thing we will look at is a divergent change code smell.

Divergent change

When you need to make a change in one location and find yourself having to change many unrelated methods, then this is known as a divergent change. Divergent changes take place within a single class and are the result of a poor class structure. Copying and pasting code is another reason this problem arises.

To fix the problem, move the code causing the problem to its own class. If the behavior and state are shared between classes, then consider implementing inheritance using base classes and subclasses as appropriate.

The benefits of fixing divergent change-related problems include easier maintenance, as changes will be located within a single location. This makes supporting the application a whole load easier. It also removes duplicate code from the system, which just so happens to be the next thing we will be discussing.

Downcasting

When a base class is cast to one of its children, this is known as downcasting. This is clearly a code smell as the base class should not know about the classes that inherit it. For example, consider the Animal base class. Any type of animal can inherit the base class. But an animal can only be of one type. For example, felines are felines and canines are canines. It would be absurd to cast a feline to a canine and vice versa.

It is even more absurd to downcast an animal to one of its subtypes. That would be like saying a monkey is the same as a camel and is really good at transporting humans and cargo long distances through the desert. This just does not make sense. And so, you should never be downcasting. The upcasting of various animals such as monkeys and camels to the type Animal is valid because felines, canines, monkeys, and camels are all types of animals.

Excessive literal use

When using literals, it is very easy to introduce coding errors. An example would be a spelling mistake in a string literal. It is best to assign literals to constant variables. String literals should be placed in resource files for localization. Especially if you plan to deploy your software to different locations around the world.

Feature envy

When a method spends more time processing source code in classes other than the one that it is in, this is known as feature envy. We will see an example of this in our Authorization class. But before we do, let's have a look at our Authentication class:

public class Authentication
{
    private bool _isAuthenticated = false;

    public void Login(ICredentials credentials)
    {
        _isAuthenticated = true;
    }

    public void Logout()
    {
        _isAuthenticated = false;
    }

    public bool IsAuthenticated()
    {
        return _isAuthenticated;
    }
}

Our Authentication class is responsible for logging people in and out, as well as identifying whether they are authenticated or not. Add our Authorization class:

public class Authorization
{
    private Authentication _authentication;

    public Authorization(Authentication authentication)
    {
        _authentication = authentication;
    }

    public void Login(ICredentials credentials)
    {
        _authentication.Login(credentials);
    }

    public void Logout()
    {
        _authentication.Logout();
    }

    public bool IsAuthenticated()
    {
        return _authentication.IsAuthenticated();
    }

    public bool IsAuthorized(string role)
    {
        return IsAuthenticated && role.Contains("Administrator");
    }
}

As you can see with our Authorization class, it is doing more than it is supposed to. There is one method that validates whether the user is authorized to carry a role. The role passed in is checked to see whether it is the administrator role. If it is, then the person is authorized. But if the role is not the administrator role, then the person is not authorized.

However, if you look at the other methods, they are doing no more than calling the same methods in the Authentication class. So, in the context of this class, the authentication methods are an example of feature envy. Let's remove the feature envy from the Authorization class:

public class Authorization
{
    private ProblemCode.Authentication _authentication;

    public Authorization(ProblemCode.Authentication authentication)
    {
        _authentication = authentication;
    }

    public bool IsAuthorized(string role)
    {
        return _authentication.IsAuthenticated() && role.Contains("Administrator");
    }
}

You will see that the Authorization class is a lot smaller now, and only does what it needs to. There is no longer any feature envy.

Next up, we will look at an inappropriate intimacy code smell.

Inappropriate intimacy

A class engages in inappropriate intimacy when it relies on the implementation details held in a separate class. Does the class that has this reliance really need to exist? Can it be merged with the class that it relies on? Or is there shared functionality that is better off being extracted into its own class?

Classes should not rely on each other as this causes coupling, and it can also affect cohesion. A class should ideally be self-contained. And classes should really know as little about each other as possible.

Indecent exposure

When a class reveals its internal details, this is known as indecent exposure. This breaks the OOP principle of encapsulation. Only that which should be public should be public. All other implementations that don't need to be public should be hidden by using the appropriate access modifiers.

Data values should not be public. They should be private, and they should only be modifiable via constructors, methods, and properties. And they should only be retrievable via properties.

The large class (aka the God object)

The large class, also known as the God object, is all things to all parts of the system. It is a large, unwieldy class that simply does far too much. When you attempt to read the object, the intent of the code may be clear when you read the class name and see what namespace it is in, but then when you come to look at the code, the intent of the code can become lost.

A well-written class should have the name of its intent and should be placed in the appropriate namespace. The contents of the class should follow the company coding standards. Methods should be kept as small as possible, and method parameters should be kept to the absolute bare minimum. Only the methods that belong in the class should be in the class. Member variables, properties, and methods that don't belong in the class should be removed and placed in the correct files in the correct namespace.

To keep classes small and focused, don't inherit classes if there is no need. If there is a class that has five methods, and you will only ever use one of them, is it possible to move that method out into its own reusable class? Remember the single responsibility principle. A class should only have a single responsibility. For example, a file class should only handle operations and behaviors associated with files. A file class should not be performing database operations. You get the idea.

When writing a class, your aim is to make it as small, clean, and readable as you can.

The lazy class (aka the freeloader and the lazy object)

A freeloading class is one that hardly does anything to be useful. When you encounter such classes, you can merge their contents with other classes that have the same kind of intentions.

You can also attempt to collapse the inheritance hierarchy. Remember that the ideal depth of inheritance is 1. And so, if your classes have a larger value for their depth of inheritance, then they are good candidates for moving back up the inheritance tree. You may also want to consider using inline classes for really small classes.

The middleman class

The middleman class does no more than delegate functionality to other objects. In situations like this, you can get rid of the middleman and deal with the objects that carry out the responsibility directly.

Also, remember that you need to keep the depth of inheritance down. So if you cannot get rid of the class, look to merge it with existing classes. Look at the overall design of that area of code. Could it all be refactored in some way to reduce the amount of code and the number of different classes?

The orphan class of variables and constants

It is not really good practice to have a lone class that holds variables and constants for multiple different parts of the application. When you encounter such a situation, it can be hard for the variables to have any real meaning and their context can be lost. It is better to move constants and variables to areas that use them. If constants and variables will be used by multiple classes, then they should be assigned to a file within the root of the namespace they will be used in.

Primitive obsession

Source code that uses primitive values rather than objects for certain tasks such as range values and formatted strings such as credit cards, postcodes, and phone numbers suffers from primitive obsession. Other signs include constants used for field names, and information stored inappropriately stored in constants.

Refused bequest

When a class inherits from another class but does not use all its methods, then this is known as refused bequest. A common reason for this happening is when the subclass is completely different from the base class. For example, a building base class is used by different building types, but then a car object inherits building because it has properties and methods to do with windows and doors. This is clearly wrong.

When you encounter this, consider whether a base class is necessary. If it is, then create one and then inherit from it. Otherwise, add the functionality to the class that was inherited from the wrong type.

Speculative generality

A class that is programmed with functionality that is not needed now but may beneeded in the future is suffering from speculative generality. Such code is dead code and adds maintenance overhead as well as code bloat. It is best to remove these classes when you see them.

Tell, Don't Ask

The Tell, Don't Ask software principle informs us as programmers that we are to bundle data with the methods that will operate on that data. Our objects must not ask for data and then operate on it! They must tell the logic of an object to perform a specific task on that object's data.

If you find objects that contain logic and that ask other objects for data to carry out their operations, then combine the logic and the data into a single class.

Temporary fields

Temporary fields are member variables that are not needed for an object's entire lifetime.

You can perform refactoring by removing the temporary fields and the methods that operate upon them to their own class. You will end up with clearer code that is well organized.

Method-level smells

Method-level code smells are problems within the method itself. Methods are the work-horses that either make software function well or poorly. They should be well organized and do only what they are expected to do—no more and no less. It is important to know the kinds of problems and issues that can arise because of poorly constructed methods. We will address what to look out for in terms of method-level code smells, and what we can do to address them. We'll start with the black sheep method first.

The black sheep method

Out of all the methods in the class, a black sheep method will be noticeably different. When you encounter a black sheep method, you must consider the method objectively. What is its name? What is the method's intent? When you have answered these questions, then you can decide to remove the method and place it where it truly belongs.

Cyclomatic complexity

When a method has too many loops and branches, this is known as cyclomatic complexity. This code smell is also a class-level code smell, and we have already seen how we can reduce the problems with branching when we looked at replacing switch and if statements. As for loops, they can be replaced with LINQ statements. LINQ statements have the added benefit of being a functional code since LINQ is a functional query language.

Contrived complexity

When a method is unnecessarily complex and can be simplified, this complexity is termed contrived complexity. Simplify the method to make sure that its contents are human-readable and understandable. Then, look to refactor the method and reduce the size to the smallest number of lines that is practical.

Dead code

When a method exists but is not used, this is known as dead code. The same goes for constructors, properties, parameters, and variables. They should be identified and removed.

Excessive data return

When a method returns more data than is needed by each client that calls it, this code smell is known as excessive data return. Only the data that is required should be returned. If you find that there are groups of objects with different requirements, then you should maybe consider writing different methods that appeal to both groups and only return what is necessary to those groups.

Feature envy

A method that has feature envy spends more time accessing data in other objects than it does in its own object. We have already seen this in action when we looked at feature envy under class-level code smells.

A method should be kept small, and most of all, its main functionality should be localized to that method. If it is doing more in other methods than its own, then there is scope for moving some of the code out of the method and into its own method.

Identifier size

Identifiers can be either too short or too long. Identifiers should be descriptive and succinct. The main thing to consider when naming variables is the context and location. In a localized loop, a single letter may be appropriate. But if the identifier is at the class level, then it will need a human-understandable name to give it context. Avoid using names that lack context, and that are ambiguous or cause confusion.

Inappropriate intimacy

Methods that rely too heavily on implementation details in other methods or classes display inappropriate intimacy. These methods need to be refactored and possibly even removed. The main thing to bear in mind is that the methods use the internal fields and methods of another class.

To perform refactoring, you can move the methods and fields to where they actually need to be used. Alternatively, you can extract the fields and methods into a class of their own. Inheritance can replace delegation when the subclass is being intimate with the superclass.

Long lines (aka God lines)

Long lines of code can be very hard to read and decipher. This makes it difficult for programmers to debug and refactor such code. Where it is possible, the line can be formatted so that any periods and any code after a comma appears on a new line. But such code should also be refactored to make it small.

Lazy methods

A lazy method is one that does very little work. It may delegate its work to other methods, and it may simply call a method on another class that does what it is supposed to. If any of these are the case, then it may pay to get rid of the methods and place code within the methods where it is needed. You could, for instance, use an inline function such as a lambda.

Long methods (aka God methods)

A long method is one that has outgrown itself. Such methods may lose their intent and perform more tasks than they are expected to. You can use the IDE to select parts of the method, and then select extract method or extract class to move portions of the method to their own method and even their own class. A method should only be responsible for doing a single task.

Long parameter lists (aka too many parameters)

Three or more parameters are classed as the long parameter list code smell. You can tackle this problem by replacing the parameters with a method call. An alternative is to replace the parameters with a parameter object.

Message chains

A message chain occurs when a method calls an object that calls another object that calls another object and so on. Previously, you saw how to deal with message chains when we looked at the Law of Demeter. Message chains break this law, as a class should only communicate with its nearest neighbor. Refactor the classes to move the required state and behavior closer to where it is needed.

The middleman method

When all a method does is delegate work out to others to complete, it is a middleman method and can be refactored and removed. But if there is functionality that can't be removed, then merge it in the area that it is being used in.

Oddball solutions

When you see multiple methods doing the same thing but doing it differently, then this is an oddball solution. Choose the method that best implements the task, and then replace the method calls to the other methods with calls to the best method. Then, delete the other methods. This will leave only one method and one way of implementing the task that can be reused.

Speculative generality

A method that is not used anywhere in the code is known as a speculative generality code smell. It is essentially dead code, and all dead code should be removed from the system. Such code provides a maintenance overhead and also provides unnecessary code bloat.

Summary

In this chapter, you have been introduced to a variety of code smells and how to remove them through refactoring. We have stated that there are application-level code smells that permeate throughout all the layers of the application, class-level code smells that run throughout the class, and method-level code smells that affect the individual methods.

First of all, we covered the application-level code smells, which consisted of Boolean blindness, combinatorial explosion, contrived complexity, data clump, deodorant comments, duplicate code, lost intent, mutation of variables, oddball solutions, shotgun surgery, solution sprawl, and uncontrolled side effects.

We then went on to look at class-level code smells, including cyclomatic complexity, divergent change, downcasting, excessive literal use, feature envy, inappropriate intimacy, indecent exposure, and the large object, also known as the God object. We also covered the lazy class, also known as the freeloader and the lazy object; middleman; orphan classes of variables and constants; primitive obsession; refused bequest; speculative generality; Tell, Don't Ask; and temporary fields.

Finally, we moved on to method-level code smells. We discussed black sheep; cyclomatic complexity; contrived complexity; dead code; feature envy; identifier size; inappropriate intimacy; long lines, also known as the God lines; the lazy method; the long method, also known as the God method; the long parameter list, also known as too many parameters; message chains; middleman; oddball solutions; and speculative generality.

In the next chapter, we will be continuing our look at code refactoring with the use of ReSharper.

Questions

What are the three main categories of code smell?
Name the different types of application-level code smells.
Name the different types of class-level code smells.
Name the different types of method-level code smells.
What kinds of refactoring are you able to perform in order to clean up various code smells?
What is cyclomatic complexity?
How can we overcome cyclomatic complexity?
What is contrived complexity?
How can we overcome contrived complexity?
What is a combinatorial explosion?
How do we overcome a combinatorial explosion?
What should you do when you find deodorant comments?
If you have bad code but don't know how to fix it, what should you do?
What is a good place to ask questions and get answers when it comes to programming issues?
In what ways can a long parameter list be reduced?
How can a large method be refactored?
What is the maximum length for a clean method?
Within what range of numbers should your program's cyclomatic complexity be?
What is the ideal depth of inheritance value?
What is speculative generality and what should you do about it?
If you encounter an oddball solution, what course of action should you take?
What refactorings would you perform if you encountered a temporary field?

What is a data clump, and what should you do about it?
Explain the refused bequest code smell.
What law do message chains break?
How should message chains be refactored?
What is feature envy?
How do you remove feature envy?
What pattern can you use to replace switch statements that return objects?
How can we replace if statements that return objects?
What is solution sprawl, and what can be done to tackle it?
Explain the Tell, don't ask! principle.
How does the Tell, don't ask! principle get broken?
What are the symptoms of shotgun surgery, and how should they be addressed?
Explain lost intent and what can be done about it.
How can loops be refactored, and what benefits do the refactorings bring?
What is a divergent change, and how would you go about refactoring it?

Table of Contents for Refactoring C# Code &#x2013; Identifying Code Smells

Create new playlist

Sign In

Sign Up

Technical requirements

Application-level code smells

Boolean blindness

Combinatorial explosion

Contrived complexity

Data clump

Deodorant comments

Duplicate code

Lost intent

The mutation of variables

The oddball solution

Shotgun surgery

Solution sprawl

Uncontrolled side effects

Class-level code smells

Cyclomatic complexity

Replacing switch statements with the factory pattern

Improving the readability of conditional checks within an if statement

Divergent change

Downcasting

Excessive literal use

Feature envy

Inappropriate intimacy

Indecent exposure

The large class (aka the God object)

The lazy class (aka the freeloader and the lazy object)

The middleman class

The orphan class of variables and constants

Primitive obsession

Refused bequest

Speculative generality

Tell, Don't Ask

Temporary fields

Method-level smells

The black sheep method

Cyclomatic complexity

Contrived complexity

Dead code

Excessive data return

Feature envy

Identifier size

Inappropriate intimacy

Long lines (aka God lines)

Lazy methods

Long methods (aka God methods)

Long parameter lists (aka too many parameters)

Message chains

The middleman method

Oddball solutions

Speculative generality

Summary

Questions

Further reading

Table of Contents for
Refactoring C# Code – Identifying Code Smells