Automated code reviews. For clean code.

Krishnan Nair
Dec 6, 2019

We’re building an ML product that will review code for cleanliness.

It will review code just like someone from your senior tech team would.

It looks at the parameters that matter in the real world: readability, maintainability, object modelling and so on. As far as I know, this is the first such product in the world.

The story

We’re Geektrust, and we believe that developer hiring is hard all over the world because of a fundamental problem: resumes are a poor indicator of a developer’s coding prowess. In India, this is more pronounced because recruiters want candidates only from “premium” institutes or “tier 1” companies. There are thousands of great developers out there who don’t fall into this category, and most companies can’t find them.

To try and overcome this problem, several code assessment platforms have come up. They allow you to test for some aspects of coding. However, most of them have tests that force you to write (bad) code quickly to get the output right (or worse, ask you to answer multiple-choice questions). Sure, this is arguably better than resumes, but it is nowhere close to how you would want to assess real-world coding skills.

So, when we started Geektrust 4 years ago, we decided to solve for proper code assessment. Here is what we came up with: we manually go through a developer’s code (our team is made up of senior ex-ThoughtWorks and ex-ESPN people), and the submissions we evaluate as good code are pushed forward to companies for interviews. Companies have been able to hire good developers this way and have had good conversion ratios (interview : offer) with us.

Now the issue is this: for how long will we manually review code? It just gets boring after a while (4 years for us)! After reviewing about a million lines of code manually, we said, let’s automate this. But automating something as abstract as clean code was hard, and we realized that the machine learning route was probably the best. We had data from the manual evaluations we’d done in the past and we were curious whether this was possible, so we went ahead and did it. After 6 months of effort, we now have the first working version of our code evaluation product: Codu, which predicts whether code is clean.

Over the last month, we have been using it internally, and Codu predicted correctly 96% of the time!

Here’s the blog post on how we built Codu.

The future

Codu can evaluate any code. This opens up a lot of interesting opportunities for companies (from anywhere in the world) looking to hire developers who can write good code. Instead of asking for resumes, you could ask candidates for their solution to any coding problem they have solved, and Codu will assess it for you.

We will add more features to Codu (a generic plagiarism check and a generic input/output check are in the works), but the core value is this: we can evaluate any code for cleanliness. And we believe this will radically change the way tech hiring is done.

Apart from being our standalone product, Codu could be plugged into existing hiring platforms, sold by other assessment platforms as an add-on, integrated with applicant tracking systems and so on.

One question we’ve been asked: can this be used to evaluate my project code (pull requests and such)? Well, theoretically it could. But for now, we have built and tested it only for the candidate code evaluation space.

What now and how can you use it

We’re still working on further improvements to the current readability prediction model, and we’ve started working on models for more parameters. So the road map is to keep adding to the clean code prediction, and then build out a code assessment product on top of it.

But Codu is working right now, and we’re using it internally in a production setting. So if what I said above looks interesting, you value the ability to write clean code in your candidates, and you’d like to explore how you could use Codu at your company, do write in to us at codu@geektrust.in.

Check out Codu here: https://codu.ai/
