Multi-Tab Excel Nonsense Made Right with Python

Manipulating excel multiple tabs with python

Posted by Kyle R. Conway - January 23, 2019

Today I came across a beautiful spreadsheet that is in use where I work. I needed to pull some data from it but the actual structure of the sheet was problematic. Do you remember phone books? This particular sheet had as many tabs as there are letters in the alphabet (A, B, C... Z) and in each of those tabs there was a generally similar set of columns that looked like the following:

John, Smith | Jane Smith | Spouse

I knew that python would be able to parse this sheet and I went to the one book I had that I knew would have an example of the subject Automate the Boring Stuff with Python. Sure enough, Chapter 12 had enough examples that I could quickly import the excel file:

and loop through each tab in the workbook:

Then, within that loop, on each tab I loop through each row of the sheet to the max # of rows for that tab, and extract values in the A and C columns.

After that I split the values in the A column by comma into First and Last name values.

There's a couple of other things in there (a fill down function) to fill blank cells with the previous column A value if the value is `None` and a print function that checks against the last names and prints if the value matches.

The Benefit

This took some time to write, but I now have a pretty good tool set to quickly import an excel file with multiple tabs and extract common values from all tabs to put the data in something a little more robust. In general I continue to be amazed at the types of data structures people sometimes find helpful. I'll likely be working to improve the data structure used for that sheet (enabling a search function instead of the "click tab, then search the whole list" thing they have now. It's nice to know that improvements can be made. I'll likely even organize the data with a modified version of this script (to pull it all into one tab/sheet within a workbook).

The Full Script

The full script is below. Mostly for future reference for me, but if it's helpful to you by all means use it.