Latest tips and tools for Data Scientists and Data Analysts

Learn how to use autoreload to minimize your preprocessing steps

In my first year as a data scientist, I watched myself and others retype the same lines of code and retrace our work time and time again. Some of this did not warrant concern.

After all, how long does it take to type the standard imports,

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    %matplotlib inline

and the like?

Yet there were also plenty of real concerns, as my colleagues and I performed many of the same tasks repeatedly: filling null values, standardizing column names, and creating dummy variables. Shouldn’t we be able to standardize these rote processes instead of recoding the entire preprocessing pipeline every time?
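To make the idea concrete, here is a minimal sketch of what bundling those three rote steps into reusable functions might look like. The function names (`standardize_columns`, `fill_nulls`, `preprocess`) and the sample data are hypothetical, and the fill strategy (median for numeric columns, a "missing" label for everything else) is only an assumption; your own data will likely call for different choices.

```python
import pandas as pd


def standardize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with stripped, lowercase, snake_case column names."""
    out = df.copy()
    out.columns = (
        out.columns.str.strip()
        .str.lower()
        .str.replace(r"\s+", "_", regex=True)
    )
    return out


def fill_nulls(df: pd.DataFrame) -> pd.DataFrame:
    """Fill nulls: column median for numeric columns, 'missing' otherwise.

    The fill strategy is an illustrative assumption, not a recommendation.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            out[col] = out[col].fillna("missing")
    return out


def preprocess(df: pd.DataFrame, dummy_cols: list) -> pd.DataFrame:
    """Standardize column names, fill nulls, then one-hot encode dummy_cols."""
    cleaned = fill_nulls(standardize_columns(df))
    return pd.get_dummies(cleaned, columns=dummy_cols)


# Hypothetical sample data with messy column names and missing values.
raw = pd.DataFrame(
    {
        " Age ": [30.0, None, 50.0],
        "Home City": ["Austin", None, "Boston"],
    }
)

# dummy_cols uses the standardized name, since renaming happens first.
result = preprocess(raw, dummy_cols=["home_city"])
print(result)
```

Saved in a module of your own, functions like these can be imported into any notebook instead of being retyped in each one.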

Even worse, sometimes a day’s worth of exploratory analysis would surface fruitful insights, only for me to realize that the Jupyter notebook I’d been working in had become a jumbled mess after I had jumped around it repeatedly, fixing errors and rerunning cells. How on earth are you supposed to repeat that process?

It’s also funny to me that, despite proclaiming the immense value of object-oriented programming, none of my instructors showed how to work such a philosophy into a daily workflow in practice.

I hope this article helps you sidestep the pitfalls many of us have fallen into in order to develop a more productive and sensible workflow.

Chisel Analytics



Modular Jupyter Workflows With Autoreload
