I have been doing a lot of code review lately and thought I could write up my process for doing so. In this blog post you will learn how to effectively review code and improve code quality in your project.
Here are the things I do during every merge request (MR), sometimes called a pull request.
- Get familiar with the work
- Read and review the tests
- Take a break (reading tests is hard)
- Read and review the production code
Get familiar with the work
The first task is to get familiar with the work. I do this by reading the MR description. What is this person trying to accomplish? What part of the code are we working in? Often this requires me to dig deeper and find the ticket. What are the acceptance criteria (AC) for this work?
Armed with some idea of what this story is about I next skim through the affected files. What language are we using? How many files are changed? What areas of the code are being changed? Sometimes something jumps out at me, and I look deeper. "What! Most of the changes are in engine, why a change in licensing?"
Often these things got committed by accident. Something left over from previous work. When this happens, I tend ask a question. "Did you mean to change this?" However, sometimes this sort of extraneous change reveals problems with the implementation of the story If your change in one area requires a change in a distant area of the code it could be there is a better way. Or sometimes this is caused by scope creep.
Read and review the tests
The first major bit of review I do is to read and review the unit tests. I like to read the tests first. I find this helps because the cause and effect of the change are often made clear by the new tests introduced. Also, looking at test case completeness is often tedious, and I find I do a better job of it if I look when I am fresh.
Often, I find that there are no tests or that the number of tests is not proportional to the code that is changed. At this point I take a minute to review the change again. Is there a good reason there are no tests? Maybe this area of the code is untestable for some reason. Is it directly coupled to a database? If not, I will comment "Can you write a test for this?" "These tests do not cover the changed code well."
When there are tests, I will read them for clarity.
- Do the tests make good use of Arrange Act Assert (AAA)?
- Can the code in test be more obvious?
- Are the tests distinct from one another such that it is easy to tell them apart?
- Do the test cover the full range of arguments?
Take a break
Its break time! Reading a bunch of tests can make you tired. If I have time, I will take a break and come back to read the production changes when I am more focused. It is hard to do good code review when you are tired.
Read and review the production code
When reading the production code I look for these things
- Coding standards
Most developers spend an order of magnitude more time reading code than writing it. So, it makes sense to help them out by taking a little extra time to write code that is easier to read. That’s why I focus on clarity before functionality.
Does the code do what the function/class names suggest it should? Often, I find people will add a bit of code to an existing function or class which muddles the water about the function/class is supposed to be doing. In these cases, I suggest a refactor into separate functions.
Does the code do to too much in one function? Often, I find long functions that have many blocks inside them, sometimes with comments between the blocks. When you do that, break the code there into separate functions with names. This makes the functions shorter and easier to read and test.
Are the variable names meaningful? Meaningful names can go a long way in helping people know what your code is doing. Don’t cut them short with single letter variable names. Also, abbreviations might be clear to you but not to anyone else, so in general, don’t abbreviate.
Is the branching clear? I often make comments about how to return early or put the error condition first to make branching clearer or reduce the indentation of the code. Branching and indentation makes code harder to read.
Often after I make clarity suggestions, I wait for the author to make those changes before moving on to read the code for functionality. This can speed things up a lot because the author often catches things during the refactor. Also, clearer code is easier to read. Reading for functionality requires you to really focus on what will happen when the code executes and that is hard to do.
It’s time to look at the functionality of the code. During this phase it is important to "be the compiler" and really understand how the code executes. This can be quite hard to do, and this is why I do clarity first. Key things to check for are
- Did the author miss edge cases?
- Does the code actually do what the author meant it to do?
- Are there any race conditions, memory leaks, exceptions, or other unintended consequences?
- Does this code do something we already have a library for?
Finally have a look at coding stands.
- Are they using m_ when they should?
- What about braces?
- Capitalization, etc.?
I expect that my emphasis on test and clarity first might surprise you. I think most people read the production code first and look for functionality right away. When I read other people's reviews, I see a lot of comments about coding standards and not a lot of comments about functionality.