Andy Crouch - Code, Technology & Obfuscation ...

Readable Code Part 4 - Extending Languages For Readability

In this next post in the code readability series, I want to cover the use of language extensions. This is a powerful feature of many modern languages, from C-based languages through to dynamic languages such as Python, but it is almost never mentioned in relation to code readability. Usually, we use extension methods to wrap type utility methods or to create project-specific functionality. Here I want to take that a bit further and use them to improve readability at a solution level.

Like their forerunners, modern languages use a limited set of operators and keywords. A number of current languages have between 30 and 40 reserved keywords. Given the amount of research undertaken in language design, it amazes me that as a profession we use such limited and unfriendly syntax to develop code. I would argue that we have actually made it harder than it once was.

Take for instance:

IF ((A.GT.B).AND.(B.LT.C)) THEN

END IF
     
IF ((C.GT.D).OR.(B.EQ.C)) THEN

END IF

That is FORTRAN syntax, from a language first released in 1957 (the block IF form shown here arrived with FORTRAN 77). Compare this to:

if((a > b) && (b < c))
{

}

if((c > d) || (b == c))
{

}

The above could have been taken from any language in the C family from the last 40 years. FORTRAN’s design goal was “to create the first ‘user-centered’ programming language.” It focused on “the reduction of mechanistic instructions into simple English commands with algebraic formulas. Its language focused more on the problem the person using the computer wanted to solve than on the machine’s operations.” How often do you feel that the language you use has similar goals? Given that we are encouraging more people to become programmers, the question must be: why do we repeat the same obfuscation in each new language and variant?

Back to readability. How much easier is the FORTRAN code above to read than the C-derived code? Considerably, I would argue. While if statements are easy to read and generally make sense, no one would argue the same for the && or || operators. && is the more confusing of the two because in written English & is shorthand for ‘and’. In C-derived languages, though, a single & already has other jobs: as a unary operator it takes the address of a variable, and as a binary operator it is a bitwise AND. I assume the single-character form was defined before the logical ‘and’ operator. For most people && does not read as shorthand for ‘and’, and neither does || for ‘or’. I suspect most people would find the FORTRAN versions easier to read.

So how could we use extensions to improve readability? Using C# and its extension methods we could end up with the following:

if(a.IsGreaterThan(b).And(b.IsLessThan(c)))
{

}
	   
if(c.IsGreaterThan(d).Or(b.Equals(c)))
{

}

We can achieve this prose-like readability by using the following simple extensions:

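// Named wrappers around the logical operators so that conditions chain like prose.
public static class BooleanExtensions
{
    // Wraps && for two already-evaluated conditions.
    public static bool And(this bool firstCondition, bool secondCondition)
    {
        return firstCondition && secondCondition;
    }

    // Wraps || for two already-evaluated conditions.
    public static bool Or(this bool firstCondition, bool secondCondition)
    {
        return firstCondition || secondCondition;
    }
}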

public static class IntegerExtensions
{
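    // Named IsEqualTo rather than Equals: int already has an instance Equals(int),
    // and the compiler always prefers an instance method over an extension method,
    // so an extension called Equals would never be invoked. Calling b.Equals(c)
    // still compiles and uses the built-in method.
    public static bool IsEqualTo(this int number, int comparisonValue)
    {
        return number == comparisonValue;
    }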
    
    public static bool DoesNotEqual(this int number, int comparisonValue)
    {
        return number != comparisonValue;
    }
    
    public static bool IsGreaterThan(this int leftValue, int rightValue)
    {
        return leftValue > rightValue;
    }
    
    public static bool IsLessThan(this int leftValue, int rightValue)
    {
        return leftValue < rightValue;
    }
} 
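
One behaviour is worth knowing before swapping operators for methods. && and || short-circuit, but method arguments are evaluated before the call is made, so the And and Or extensions above always evaluate both conditions. The following is a minimal sketch of one way around that, assuming hypothetical overloads (LazyBooleanExtensions and ShortCircuitDemo are names I have made up for illustration) that take the second condition as a Func<bool>:

```csharp
using System;

public static class LazyBooleanExtensions
{
    // Hypothetical overload: the second condition is passed as a delegate,
    // so it only runs when the first condition has not already decided the result.
    public static bool And(this bool firstCondition, Func<bool> secondCondition)
    {
        return firstCondition && secondCondition();
    }

    // Hypothetical overload: the delegate only runs when firstCondition is false.
    public static bool Or(this bool firstCondition, Func<bool> secondCondition)
    {
        return firstCondition || secondCondition();
    }
}

public static class ShortCircuitDemo
{
    // With an eager And(bool, bool), total / count would be evaluated before
    // the call and throw DivideByZeroException whenever count is 0.
    public static bool AveragesAtLeast(int total, int count, int minimum)
    {
        return (count != 0).And(() => total / count >= minimum);
    }
}
```

The cost is the extra () => at the call site, which works against the readability goal a little, so I would only reach for the lazy form when the second condition is unsafe or expensive to evaluate.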

These extensions are an improvement over && and ||, and they simplify the reading of > and <. Most developers will say that this is a minor improvement that adds little benefit to code readability. But what if you apply the same approach, as you build your project, to your solution's domain logic and entities? You would end up with a solution-specific Domain Specific Language (DSL). This should make the code simple to read and, more importantly, to understand. You can essentially change your chosen language to be simpler to read and to work as you need. You are no longer constrained by the limited operators and keywords it has decided to implement. Routing logic through a shared set of named extensions also enforces consistency throughout your code base: there are far fewer developer-specific variations, and it is easier to automate your style guide. A secondary benefit, unrelated to readability, is that the extensions can be tested. We can write simple code once, test it, and move on to composing logic and objects.
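
As a rough illustration of what a solution-level vocabulary might look like, here is a small sketch. The Order entity, its properties, and both business rules are hypothetical and invented purely for this example. They are not taken from a real project.

```csharp
using System;

// Hypothetical domain entity, invented purely for illustration.
public class Order
{
    public decimal Total { get; set; }
    public DateTime DueDate { get; set; }
    public bool IsPaid { get; set; }
}

public static class OrderExtensions
{
    // Assumed business rule: an unpaid order is overdue once its due date has passed.
    public static bool IsOverdueOn(this Order order, DateTime date)
    {
        return !order.IsPaid && order.DueDate < date;
    }

    // Assumed business rule: orders of 50 or more ship for free.
    public static bool QualifiesForFreeShipping(this Order order)
    {
        return order.Total >= 50m;
    }
}
```

Calling code then reads as if (order.IsOverdueOn(DateTime.Today)), and each rule lives in one place where it can be unit tested once rather than being re-derived by every developer who needs it.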

If you implement the ideas here then you should follow the same rules that I outlined in the previous post. Your extension method/function names should follow a consistent theme. If you create an IsNullOrEmpty string extension then an extension that checks for null or empty Lists should be named likewise. You should also consider how you structure your extensions. They should exist at the lowest level possible. If your language supports generics then use them to handle common extensions across types.
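
The naming and generics advice above can be sketched as follows. The class names (StringExtensions, EnumerableExtensions, ComparableExtensions) are my own choices for illustration. The generic comparison methods are one possible way to cover int, decimal, DateTime and any other type that implements IComparable<T> with a single definition.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StringExtensions
{
    // Thin wrapper so the check reads the same way on strings and collections.
    public static bool IsNullOrEmpty(this string text)
    {
        return string.IsNullOrEmpty(text);
    }
}

public static class EnumerableExtensions
{
    // Same name as the string version; works for List<T>, arrays and any other sequence.
    public static bool IsNullOrEmpty<T>(this IEnumerable<T> items)
    {
        return items == null || !items.Any();
    }
}

public static class ComparableExtensions
{
    // One generic definition instead of one comparison extension per numeric type.
    public static bool IsGreaterThan<T>(this T leftValue, T rightValue) where T : IComparable<T>
    {
        return leftValue.CompareTo(rightValue) > 0;
    }

    public static bool IsLessThan<T>(this T leftValue, T rightValue) where T : IComparable<T>
    {
        return leftValue.CompareTo(rightValue) < 0;
    }
}
```

With these in place, names.IsNullOrEmpty() and title.IsNullOrEmpty() read identically, and price.IsGreaterThan(limit) works whether the values are int, decimal or DateTime.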

If you have any opinions on the ideas I have outlined in this post, or would like to discuss them in more depth, then please contact me via Twitter or email.