Build Excel Automation Tools with Python

Design of Excel Automation Tools for Sales Analytics ready to be used by your colleagues without any prior knowledge of Python

Build Excel Automation Tools with Python

Design of Excel Automation Tools for Sales Analytics ready to be used by your colleagues without any prior knowledge of Python

Article originally published on Medium.

How many articles have you read stating that “Excel is Dead, long live Python”, “Python is the new Excel” or “Excel became obsolete”?

But when you look around you, in your team or other departments you can hardly find other colleagues using Python.

And when you mention it, it is seen as a black box that people can’t trust because “it’s too complicated”, “we can’t see formulas” or “I cannot run it on my computer”.

A first step to promoting it would be to give them the possibility of running your scripts on their computers without prior knowledge of python.

💌 New articles straight in your inbox for free: Newsletter

If you prefer watching, have a look at the video version of this article,

1. Scenario

You are working as Data Scientist for a major retail company and your colleagues from the Supply Chain team request your support to analyze sales data.

For this example, we’ll take a dataset from the Kaggle challenge: Store Item Demand Forecasting Challenge. I have been using this dataset for Store Demande Forecasting in another article (Link).

Data Set

  • Transactions from 2013–01–01 to 2017–12–31
  • 913,000 Sales Transactions
  • 50 unique SKU
  • 10 Stores

Objective: Your colleagues would like to build a pivot table of monthly sales by item for each store.

Issue: With nearly 1 million records, your colleagues are reaching the limits of Excel. Therefore, they ask for the support of the Data Scientist to find a magic solution.

2. Solution

We will design a python script to perform data processing, create the pivot table and export the results in an Excel file.

Your script is performing the following tasks:

  1. Import initial raw data from Kaggle Challenge (sales_data.csv)
  2. Date Formatting
  3. Pivot Table of Sales by Month for each pair Store, Item
  4. Sorting records by Store and Item
  5. Saving the final report in an Excel file

3. Share this script using an executable file (.exe)

Objective: Create an executable (.exe) that will run your script and can be used without having python installed on your computer.

New Python Library in your toolbox: Pyinstaller

PyInstaller bundles a Python application and all its dependencies into a single package. The user can run the packaged app without installing a Python interpreter or any modules.

Step 1: Install Pyinstaller

pip install pyinstaller

Step 2: Save your python script in the current directory

Save it in the same directory where you have put your sales records initial file (sales_data.csv). Name your file: “pivot.py”

Step 3: Create your executable file using Pyinstaller

Enter the following command in your command prompt

pyinstaller --onefile hello.py
Pyinstaller processing logs — (Image by Author)

When the process is completed you will see this message “Building EXE from EXE-00.toc completed successfully.” and your (.exe) fill will pop in a folder called “Build”.

Step 4: Use your executable file

Your new (.exe) file processing logs — (Image by Author)

Ask your colleagues to Cut and paste this (.exe) file in a folder with the initial raw data and click to start the script. They can follow the process thanks to the print messages put at each step of the script.

Result: At the end of the process, they can find your sales report (sales_report.xlsx) saved in your folder.

4. Conclusion

You have now a tool to share your work with other colleagues and build tools that will impact your company’s productivity and support people to automate boring tasks.

This can be coupled with cool visualization using seaborn, connections to external APIs to fetch data, a machine learning model to predict future sales or anything that could bring added value to your calculations.

About Me

Let’s connect on Linkedin and Twitter, I am a Supply Chain Engineer that is using data analytics to improve logistics operations and reduce costs.

References

[1] Kaggle Dataset, Store Item Demand Forecasting Challenge, Link