
Documenting Codebases for Teams

February 24, 2017 By Chris Conlan

A properly prepared team can make or break an expensive development effort. When I prepare a team to work together on an application, I like to keep four distinct documents in the source code that detail how the application works, how it is programmed, how to use it, and what we are working on. In GitHub, this is easily accomplished with Markdown documents in the root directory of the project folder. In other repositories, text files and Word documents work fine as well. We will detail each document below.

The ReadMe

This file is common to almost all codebases. The way I use it is what makes it important to discuss in this post. In the ReadMe file, I will:

  1. Detail which subdirectories hold which parts of the codebase.
  2. Point readers to the remaining three documents.
  3. Explain the purpose, inputs, and outputs of the program.
  4. Delegate general responsibilities to programmers working in the repository.

This document functions as the overview of the codebase. I name it “readme.md” purely because GitHub will auto-render the document when users navigate its web GUI.
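
As a minimal sketch of how a team might confirm that the four documents are present, the Python snippet below looks for them in a repository root. Only readme.md is named in this post; the other three file names (conventions.md, instructions.md, todo.md) and the function name missing_documents are assumptions made for illustration.

```python
from pathlib import Path

# Only "readme.md" is named in the post; the other file names are hypothetical.
EXPECTED_DOCUMENTS = ("readme.md", "conventions.md", "instructions.md", "todo.md")


def missing_documents(repo_root, expected=EXPECTED_DOCUMENTS):
    """Return the expected document names that are absent from repo_root.

    The comparison is case-insensitive, so README.md counts as readme.md.
    """
    present = {p.name.lower() for p in Path(repo_root).iterdir() if p.is_file()}
    return [name for name in expected if name.lower() not in present]


if __name__ == "__main__":
    missing = missing_documents(".")
    if missing:
        print("Missing documents:", ", ".join(missing))
    else:
        print("All four documents are present.")
```

Run from the root of a repository, a check like this could serve as a reminder when one of the documents has been renamed or dropped.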

The Conventions

When a single programmer commits large portions of algorithmic code, it is often hard for others to understand the purpose and mathematical underpinnings of the code by simply reading it. The Conventions document in my repositories is reserved for identifying and explaining both common and hard-to-understand coding conventions used throughout the codebase.

I expect many programmers would dispute the necessity of a document like this on the principle that code should be readable and easy to understand. While I tend to agree, highly algorithmic code often benefits from the use of standardized single-letter variable names for very commonly accessed objects. This holds especially for large data objects in the parent environment (as is common in Python and R) and commonly used iteration variables in nested loops. For example, we may hold as a convention that M represents a particularly large and important data matrix, or that s is used when iterating over the characters of the object sillyString.
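
To make the idea concrete, here is a small Python sketch of code written against such conventions. The comment header stands in for what the Conventions document might say; the data values and the function names column_means and count_vowels are made up for illustration.

```python
# Conventions (as they might appear in the Conventions document):
#   M : the large, important data matrix held at module level
#   s : the loop variable for each character of a string
# The data values and helper names below are hypothetical.

M = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]


def column_means(M):
    """Return the mean of each column of the matrix M (a list of rows)."""
    n_rows = len(M)
    return [sum(column) / n_rows for column in zip(*M)]


def count_vowels(sillyString):
    """Count the vowels in sillyString, iterating with s by convention."""
    return sum(1 for s in sillyString if s.lower() in "aeiou")


print(column_means(M))              # [2.5, 3.5, 4.5]
print(count_vowels("sillyString"))  # 2
```

A reader who has seen the Conventions document knows what M and s mean everywhere they appear, so the short names cost little in readability.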

Overall, the existence of the document gives codebases a flavor of consistency and efficiency. I tend to prefer this to the oft-praised alternative of self-documentation via verbosity.

The Instructions

All programs need instructions, no matter how big or small, whether a command-line tool or a GUI. It is better to dedicate a separate document to this than to leave it for the footer of a ReadMe. If a program is complex enough to require a separate website or lengthy guide (think Microsoft Excel), then these are perfectly acceptable substitutes.

The To-Do List

This is the most self-explanatory item. Keep a shared document that acts as the to-do list for your team. Delegate tasks by marking names next to items, and move items that you have completed to the bottom of the list.

Better Tools Exist…

… but I don’t always use them.

In many ways, using Markdown documents or Word documents to handle codebase documentation can be messy and inefficient. Better tools exist for essentially all of these tasks. GitHub has its Issues feed, which is a superior alternative to the to-do list; most programming languages have auto-documentation features that marry the conventions and instructions documents nicely; and why use a readme.md if 99% of installers will never render it?

I don’t always use the superior tools because they are complicated. Complicated tools get underutilized and ignored. Programmers may spend a few additional minutes managing and updating these documents as Markdown or Word files, but doing so keeps the documents’ importance and maintenance at the front of their minds in a way that is easy to understand. Programmers who grow frustrated with complicated tools and GUIs will ignore them. I have seen many important pieces of documentation fall by the wayside because they were being managed with complicated tools. To name a few:

  • PyDoc queues and definitions
  • Eclipse synced task lists
  • GitHub Issues feeds
  • Bookdown for collections of Markdown files

I am always ultimately thankful that any programmer can open up my source and find four clearly labelled and well-organized documents that explain everything they need to know.
