Nowadays, programmers typically know more than one language. In this post, we will make a strict delineation between scripting languages and programming languages. Scripting languages are designed from the ground up for specific purposes and applications. Programming languages, on the other hand, are designed for broad applicability.
Experience with multiple scripting languages is better than experience with multiple programming languages. Programmers with skills in multiple scripting languages can prototype massively integrated systems very easily by exploiting the strengths of each language.
Case Study: Automated Data Enrichment System
A few months ago, I needed to acquire and manually enrich over 10,000 high-resolution pictures for a machine-vision database. I asked around about existing data enrichment software. I heard many mentions of Mechanical Turk and ClickWorker, as well as commercial platforms that wrapped and resold them. None of these platforms had the enrichment or management tools I required. I needed one set of personnel to upload specific pictures and another set of personnel to mark them up with advanced OpenCV tools. Both the inputs and outputs were complex, and each step required custom software.
I formulated my project requirements and got to work building my own data enrichment pipeline. My pipeline’s requirements:
- Accept uploaded pictures from photographers at a URL endpoint. Notify myself and the photographer of a successful upload. Create and email me logs to manage payments due to the photographers. Filter, decompress, sort, and rename pictures uploaded by photographers. Back up the pictures.
- Distribute software and pictures to data enrichment personnel at a URL endpoint. Manage workloads of multiple data enrichment personnel by evenly distributing pictures and software. Notify myself and personnel when workloads are created and updated.
- Accept uploaded metadata from data enrichment personnel. Notify myself and personnel by email of changes made to workloads. Send personnel new workloads. Manage logs about payments due to the personnel.
- Deliver completed pictures and metadata to machine learning personnel.
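To make the "evenly distributing pictures" requirement concrete, here is a minimal Python sketch of one way to do it: a round-robin split. It is an illustration only, not the pipeline's actual code (job data in the real pipeline was managed in R), and the function name `distribute_evenly`, the worker names, and the file names are all hypothetical.

```python
from itertools import cycle


def distribute_evenly(pictures, workers):
    """Assign pictures to workers round-robin.

    Workload sizes end up differing by at most one picture.
    Both arguments are plain lists of strings (hypothetical names).
    """
    if not workers:
        raise ValueError("at least one worker is required")
    workloads = {worker: [] for worker in workers}
    # Sort first so the same upload always produces the same split.
    for picture, worker in zip(sorted(pictures), cycle(workers)):
        workloads[worker].append(picture)
    return workloads


if __name__ == "__main__":
    pictures = [f"img_{i:05d}.jpg" for i in range(10)]  # placeholder file names
    for worker, batch in distribute_evenly(pictures, ["alice", "bob", "carol"]).items():
        print(worker, len(batch))
```

Round-robin keeps the split fair without tracking any state, which suits a quick prototype. A production version would probably weight the split by how much work each person still has outstanding.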
A combination of scripting languages allowed me to prototype this application in a day.
- Use PHP for emails and payment systems.
- Hook Dropbox’s API into Linux system calls for file management.
- Use the R language and R’s session persistence to manage job data and dispatch system calls.
- Use Python Tkinter to create OS-independent manual enrichment software.
- Use AWS CLI called through Bash, Cron, and R to manage networks.
Each language required dozens of dependencies, but the custom scripts run in each language were all relatively small. For example, the PHP email and payment systems totaled under 200 lines, but required over 1,000 dependency files. Such is the purpose of learning scripting languages. We can accomplish incredible things with very little effort using common languages with common dependencies. Incredibly enough, this pipeline used almost every scripting language that I know.
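To give a feel for how small such glue scripts can be, below is a hedged Python sketch of the filter, sort, and rename step for uploaded pictures. It is an assumption-laden illustration rather than the pipeline's real code: the function name `ingest_uploads`, the accepted file extensions, and the `photographer_00001.jpg` naming scheme are all invented for the example.

```python
import shutil
from pathlib import Path

# Assumed set of accepted picture formats; adjust to taste.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def ingest_uploads(upload_dir, archive_dir, photographer):
    """Copy picture files from upload_dir into archive_dir under sequential names.

    Non-image files are skipped. Originals are left in place, so the
    upload directory can double as a crude backup.
    """
    upload_dir, archive_dir = Path(upload_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    images = sorted(
        p for p in upload_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    renamed = []
    for index, source in enumerate(images, start=1):
        # Hypothetical naming scheme: <photographer>_<5-digit index><extension>
        target = archive_dir / f"{photographer}_{index:05d}{source.suffix.lower()}"
        shutil.copy2(source, target)  # copy2 also preserves file timestamps
        renamed.append(target)
    return renamed
```

Everything here comes from the standard library, so this particular step needs no dependencies at all. The heavy dependency counts mentioned above come from the email, payment, and cloud tooling.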
By the time I was finished debugging this pipeline, I was managing over 30 photographers, enrichment personnel, and data scientists simultaneously from my iPhone. An associate approached me and told me: “Now that’s how you f***ing use computers.”
The Best Mix of Languages
In my time programming, I have seen a handful of people with powerful skill sets of diverse languages. Below is a rough list of programming languages and scripting languages with different strengths. The strength of a programmer’s skill set is proportional to the number of categories that contain a language or tool they can use. Comment at the bottom if you would like to add to or correct this list.
- Complete Programming Languages
  - Python with NumPy/SciPy/Matplotlib
- GUI Building
  - Python with Tkinter, PyQt, Wx, or Django
  - PHP with WordPress, Drupal, or another web app framework
- Server Side
  - PHP with PHP CLI
  - Advanced Bash or MS-DOS scripting
  - Any OpenMP-enabled language
  - Any CUDA-enabled language
  - Python with Theano or TensorFlow
- Embedded Systems
  - Python with Raspberry Pi
- Advanced Visualization
  - Python with Blender
  - Any OpenGL-enabled language
  - Babylon.js or three.js