Chris Conlan

Financial Data Scientist

Designing Python for Concurrent Development (for Teams)

December 14, 2016 By Chris Conlan

Overview

It is often impossible to meet a deadline with an army of one. For this reason, many codebases and frameworks are designed to allow multiple developers to modify and add components simultaneously and independently, without creating unexpected problems elsewhere in the codebase. We will classify such codebases as modular.

We will explore existing modular codebases and study the characteristics that permit concurrent development. From there, we will detail how to achieve modularity in arbitrary Python codebases using Python language constructs.

Note on Modularity

In the programming world, the term “modular” is a common descriptor for paradigms and languages. We will concentrate on the definition of modularity as it applies to codebases. A modular codebase is one that is carefully designed by a project manager or lead developer to create a smooth concurrent development experience for a team of programmers.

Examples of Modular Codebases

Modular codebases are everywhere in the world of web development, where the concept fits nicely into a team’s workflow. Simply put, developers can work on different pages of a website at the same time without breaking each other’s pages. In this light, modular codebases are practically a must-have for web development teams. Below are a few MVC (model-view-controller) frameworks and CMSs (content management systems) that are modular PHP codebases.

  • WordPress: CMS, create plugins and themes
  • Drupal: CMS, create modules and pages
  • Zend: MVC framework, create custom controllers and pages
  • Symfony: MVC framework, create templates and pages

When teams work in these CMSs and frameworks, they start with the core modular codebase and build outward towards their goals. The initial codebases both support and enforce modular development. Regardless of the reader’s understanding of PHP programming frameworks, it is important to understand just how simple concurrent development is with each of them.

Modular Project Example in PHP/WordPress

Say three people, Bob, Joe, and Phil, are working on a WordPress application. They are building a custom plugin for teachers to interact with their students. Bob is the project leader. Below is their workweek:

  1. Sunday night:
    1. Bob initializes a new WordPress plugin called wp-school.
  2. Monday morning:
    1. Bob begins building principal.php, the landing page for the principal, in wp-school.
    2. Joe begins building teacher.php, the landing page for the teachers, in wp-school.
    3. Phil begins building student.php, the landing page for the students, in wp-school.
  3. Monday afternoon:
    1. Joe ends up building a useful function for projecting students’ GPAs. He creates a shared toolkit, utils.php, and places his GPA function inside.
    2. Bob and Phil import utils.php into their pages, which saves them from duplicating code and work. Bob reminds Joe that utils.php must contain function declarations only, because it is a shared toolkit.
  4. etc…

Modular Codebases in Python

We’ve spent a lot of time talking about modular codebases without talking about Python. While Python is a modular language, built-from-scratch Python projects are rarely coded with concurrent development in mind. This is especially the case in data science and numerical programming, where complex number-crunching and algorithmic acrobatics are the focus of the project. Our goal is to learn how to build any Python codebase so that it permits concurrent development.

We will discuss a loose framework for modular Python codebases that relies on shared assets, dynamic loading, and dynamic calling.

The Main Script

This module is the typical main script that calls the other scripts. The file main.py will be called to execute the program.

The Shared Toolkit

Essential to a modular codebase is a shared toolkit where project-specific functions are stored for reuse throughout other modules. This toolkit can be a single module, tools.py, or a library named tools. We will refer to this toolkit generically as tools throughout this discussion.

The toolkit has a few important characteristics:

  • Can declare functions, will ultimately contain many
    def function(a, b, c):
        ...
  • Any developer can make additive modifications to the toolkit. This includes adding functions and non-destructively modifying existing functions.
    def function(a, b, c):
        ...  # some process
    # can change to
    def function(a, b, c, d=None):
        if d is None:
            ...  # the same process as before
        else:
            ...  # some similar process using d
  • Can declare constants
    SOME_THRESHOLD = 2.2e-16
  • Can import unchanging modules
    • import numpy as np
    • import os
    • etc.
  • Cannot import other project-specific modules
    • import mymodule as mm
    • import main
    • etc.
  • Can be imported by any other module in the project without consequence or side effect, by design
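
As a minimal sketch of the additive-modification rule above, suppose the GPA helper from the WordPress story lived in a Python toolkit. The name project_gpa, its arguments, and its formula are all hypothetical and chosen only for illustration. The point is that a new optional argument leaves every existing call site working.

```python
# tools.py (sketch): declarations only, no project-specific imports

# Constant shared across the project (value taken from the list above).
SOME_THRESHOLD = 2.2e-16


def project_gpa(grades, weights=None):
    """Hypothetical helper: average a list of grade points.

    `weights` was added later as an optional argument, so every call
    written against the original one-argument version keeps working.
    """
    if weights is None:
        # Original behavior: plain average.
        return sum(grades) / len(grades)
    # Added behavior: weighted average, for example by credit hours.
    return sum(g * w for g, w in zip(grades, weights)) / sum(weights)


# An existing call site, untouched by the change.
unweighted = project_gpa([4.0, 3.0, 2.0])

# A new call site that uses the added argument.
weighted = project_gpa([4.0, 3.0, 2.0], weights=[3, 3, 4])
```

Testing with `weights is None` rather than `not weights` keeps falsy but meaningful values (such as 0 or an empty list) from being mistaken for "argument not supplied".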

The Global Data Module

Important in many modular codebases is a module that declares and contains all global variables for the project. This will typically be a single file data.py. This module will be imported by other modules that need to access and share global data. These modules will modify the data rather than data.py modifying itself. A data.py may look like this:

data.py
counter = 0
sharedDict = {}
sharedList = []
log = ""
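
The sketch below shows one way other modules might read and modify this shared state. To keep it runnable as a single file, it builds a stand-in for data.py with types.ModuleType and registers it in sys.modules; in a real project the file on disk would play that role. The function record_login is hypothetical.

```python
import sys
import types

# Stand-in for data.py so the sketch runs as one file; in the real
# project this would simply be the module on disk.
data = types.ModuleType("data")
data.counter = 0
data.sharedList = []
data.log = ""
sys.modules["data"] = data


def record_login(username):
    """Hypothetical function living in some component module."""
    import data  # the same module object everywhere in the process

    data.counter += 1                 # rebinds the name on the module
    data.sharedList.append(username)  # mutates the shared list in place
    data.log += f"login: {username}\n"


record_login("bob")
record_login("joe")

# Pitfall: `from data import counter` copies the current value into a
# separate name, so later updates to data.counter are not seen through it.
from data import counter as counter_snapshot

record_login("phil")
```

After this runs, data.counter reflects all three logins while counter_snapshot still holds the value from the moment of the import. Accessing globals as attributes of the module (data.counter) avoids that surprise.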

The Component Modules

These modules contain only functions and function-classes. They will be dynamically loaded for use by main. The idea is to specify the same set of functions and classes in each module, so that each module can be treated as a variant of the others. In our school application example, each role (principal, teacher, and student) will have a login process, a logout process, a messaging process, and an account admin process. We will make three modules, principal.py, teacher.py, and student.py, each containing these four core functions.
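
A component module for the teacher role might be sketched as follows. Only login is named later in this article; the names logout, message, and admin, the zero-argument signatures, and the returned strings are assumptions made for illustration.

```python
# teacher.py (sketch): one component module, functions only

ROLE = "teacher"  # hypothetical constant, used only to build the strings


def login():
    """Run the login process for this role."""
    return f"{ROLE}: logged in"


def logout():
    """Run the logout process for this role."""
    return f"{ROLE}: logged out"


def message():
    """Run the messaging process for this role."""
    return f"{ROLE}: message sent"


def admin():
    """Run the account admin process for this role."""
    return f"{ROLE}: account settings opened"
```

principal.py and student.py would define the same four names with role-specific bodies, which is what lets the rest of the codebase treat the modules interchangeably.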

The Dynamic Loading Module

This module brings the codebase into the world of modular development. It relies on the fact that Python modules are objects that can be stored in variables, which lets us load them dynamically. Using the previous example of a school app with various user roles, the dynamic loading module may look like this in Python 3:

mod.py
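import importlib

# role names double as module names: principal.py, teacher.py, student.py
roles = ["principal", "teacher", "student"]

# map each role name to its loaded module object
f = {}
for role in roles:
    f[role] = importlib.import_module(role)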

This module will be imported by main for dynamic use. We will call this module mod.py in our discussion.
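
To try the loading pattern without a full project, the sketch below writes three tiny stand-in role modules into a scratch directory and loads them by name with importlib.import_module, the standard-library function for importing a module from a string. The demo_ filename prefix and the one-line login bodies are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

roles = ["principal", "teacher", "student"]

# Write three tiny stand-in role modules into a scratch directory.
# The "demo_" prefix is hypothetical and only avoids clashing with
# real modules that may already be importable.
scratch = Path(tempfile.mkdtemp())
for role in roles:
    source = f"def login():\n    return '{role} logged in'\n"
    (scratch / f"demo_{role}.py").write_text(source)

# Make the scratch directory importable, as the project root would be.
sys.path.insert(0, str(scratch))
importlib.invalidate_caches()

# Same idea as the loop in mod.py: map each role name to its module object.
f = {role: importlib.import_module(f"demo_{role}") for role in roles}

result = f["teacher"].login()
```

The call to importlib.invalidate_caches() matters only because the files are created while the interpreter is already running; modules that exist before startup do not need it.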

Revisiting The Main Module

The main module can now import mod.py to dynamically call modules given knowledge of the role of the user. To execute the login process for any user…

without modular codebase

main.py
import student
import teacher
import principal

# get role of user
user_role = get_current_user_role()

# execute login process for that role
if user_role == 'student':
    student.login()
elif user_role == 'teacher':
    teacher.login()
elif user_role == 'principal':
    principal.login()

with modular codebase

main.py
import mod

# get role of user
user_role = get_current_user_role()

# execute login process for that role
mod.f[user_role].login()
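
One difference between the two versions is worth noting: the if/elif chain silently does nothing for an unrecognized role, while the dictionary lookup raises a KeyError. A hedged sketch of a wrapper that reports unknown roles clearly is shown below. The stand-in dictionary f mimics mod.f, and the helper name login_for is hypothetical.

```python
import types

# Stand-ins for the loaded role modules; in the project these would be
# the module objects stored in mod.f.
f = {
    "student": types.SimpleNamespace(login=lambda: "student logged in"),
    "teacher": types.SimpleNamespace(login=lambda: "teacher logged in"),
}


def login_for(user_role):
    """Dispatch to the role's login(), failing clearly on unknown roles."""
    try:
        module = f[user_role]
    except KeyError:
        known = ", ".join(sorted(f))
        message = f"unknown role {user_role!r}; known roles: {known}"
        raise ValueError(message) from None
    return module.login()
```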

[Figure: simple modular Python codebase]

Adding Roles

We are about to realize the beauty of modular programming. Say, for example, we need to add two roles, janitor and superintendent. We will have one developer work on janitor.py and another work on superintendent.py. When the modules are finished to specification and include all of the core functions used in other role-based modules, we add them to the list of roles in mod.py like so…

mod.py
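import importlib

# the two new roles are the only change to this file
roles = ["principal", "teacher", "student", "janitor", "superintendent"]

# map each role name to its loaded module object
f = {}
for role in roles:
    f[role] = importlib.import_module(role)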

… and the codebase now supports the new roles. All the while, we have not modified any of the core functionalities or behaviors of the codebase, and our developers were able to work concurrently without causing problems for each other.
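
Before a new role is added to the list, a lead developer might want an automatic check that the module is "finished to specification". The sketch below is one possible approach, not part of the original design: it lists which core functions a module is missing. The names in CORE_FUNCTIONS are hypothetical, and types.SimpleNamespace objects stand in for real modules so the example runs as one file.

```python
import types

# Hypothetical names for the four core processes.
CORE_FUNCTIONS = ("login", "logout", "message", "admin")


def missing_functions(module):
    """Return the core function names that `module` lacks as callables."""
    return [
        name
        for name in CORE_FUNCTIONS
        if not callable(getattr(module, name, None))
    ]


# Stand-ins for two freshly written role modules.
janitor = types.SimpleNamespace(
    login=lambda: None,
    logout=lambda: None,
    message=lambda: None,
    admin=lambda: None,
)
superintendent = types.SimpleNamespace(
    login=lambda: None,
    logout=lambda: None,
)

janitor_gaps = missing_functions(janitor)                # complete module
superintendent_gaps = missing_functions(superintendent)  # unfinished module
```

A loader could run such a check on each module returned by importlib.import_module and refuse to register a role whose list of gaps is not empty.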
