3 reasons to stop using Excel

8 Nov, 2023
Frederik Vanhee

I recently went down the rabbit hole of Excel automation… I don’t consider myself an Excel expert, but I thought I knew enough about Excel. I was wrong.

The challenge I had was to automate a rather complex Excel file. This Excel file contained about 20 sheets, with plenty of calculations and look-ups but also more exotic things like building a function from concatenating data in cells. This file was no longer a spreadsheet, but had become an application.

What I call Excel applications is something I see a lot in engineering. It starts with someone creating a spreadsheet performing some calculation for a project. Some of these are real beauties with input and output nicely separated, selection fields and nice layouts. This file becomes an application once it gets shared with others. Multiple versions start to co-exist (version control), you don’t know what changes have been made and whether this file can be used for your use case (traceability). All knowledge is in there for everyone to admire (IP protection) and every individual calculation results in a new file (automation).

So why do I think we need to stop using excel? While you can create applications in Excel, doing so will cause several issues down the road, which are the reasons not to do so:

Reason number 1: Version control and traceability

The go to version control for Excel file is naming the files in a folder. We all had a folder at one time or another with V2, V3, V3_improved and so on.

Most companies have a document management system, so often the solution is to create a project for “Excel calculation sheets” and store the Excel files there. This solves the version control / traceability issue only in theory. People (guilty) tend to just take the easy route and copy the file from another project. The solution to this is then the band-aid of procedures.

Currently there is a version control feature in Excel which does the job, but this still doesn’t solve the problem op local copies.

Reason number 2: Intellectual Property protection

The Excel file in my example contained the entire calculation for a range of products. The owner knew the importance of this file and was therefore very reluctant to share the file. Logical if you think that the companies entire knowledge of the product is in that file.

The owner wanted his clients to be able to perform simplified calculations, obviously without sharing his IP.

The way Excel works, does not allow for a separation between the input/output and the calculation.

Reason number 3: Automation

This was the reason for me to go down the rabbit hole in the first place. How do you perform Excel calculations automatically? The main two which come to mind (I’m sure you know many more) are vba scripts and executing Excel from code (eg Python, C#). Both solutions are fine, not considering my first 2 reasons, if you intend to use it only locally (on your pc). If at some point in time you want to include an automated Excel calculation in a workflow which is no longer executed on your pc but somewhere on a server, the road ends because Excel is only really intended for Windows PC’s, not (Linux) servers.

So in conclusion, by not creating applications in Excel, you will avoid many issues at a later stage. But that said, Excel is very easy to get started. Then again, so is Python. Investing a few afternoons in learning the basics will get you a long way already. So many people have gone before you that there’s a big chance that your problem has already been solved, or that the building blocks to do so are available. Not to forget the new natural language AI tools, which will get you started even faster.

Coming back to my automation challenge: I saw 2 routes to do so. This first is to write all Excel logic into code (eg. Python, C#) and then deploy that code to a server as an API from where it can be called. The downside is that I would have to re-write all Excel logic, which is a lot of work and does not add any value. Therefore I thought I’d be smart and take route 2: I just run Excel on the server and use code to change the input and get the results from Excel. This is what I learned going down route number 2…

Excel only runs on Windows (not considering the Mac version), while almost 80% of servers run on Linux (Wikipedia). So in order to run Excel, I would need a windows server. This prohibits me from using any cloud SaaS services like Azure app service, Google app engine, Google cloud run, Amazon ECS). Instead I need a Windows Server with the added overload of managing a server, just to run Excel. And that’s not all: Microsoft clearly states that you should not be doing this at the risk of unexpected behavior and also a violation of the End user license agreement (In essence, the users making a call to the server need their own Excel license). So I went further down the rabbit hole.

Searching the internet for how to automate Excel points in the direction of specific (open source) libraries like openpyxl or xlwings in Python and EPPLus or NPOI in dotnetcore. But there’s a subtlety in there: they can all read and write Excel file format, but only EPPLus is able to calculate the Excel sheet after the input changes without the need to have Excel installed. EPPlus supports a lot of Excel functions, it does not support all, so depending on the functions used in your Excel file it might not work for your case. FYI EPPLus has changed from free to licensed since version 5.

I would like to mention pycel, which is a python package also able to calculate the Excel sheet after, but very limited in supported functions.

I my specific case, the functions in my Excel file were supported by EPPLus, but mileage will depend on your specific Excel file.

#whyClickIfYouCanScript

Hello, I’m Frederik. Get In Touch Anytime!