Ever had an Excel file that grows to tremendous size and takes forever to load? Or seemingly triples in size overnight?
Here are a few tips and tricks we will cover to identify what’s causing Excel file size bloat:
- Convert file format from .XLS to .XLSX
- Delete Pivot Table saved data
- Remove unnecessary cell formatting
- Determine if file contains corrupted data
Note: The tips & tricks discussed here are written for Excel 2010. Most of them will also work in other versions of Excel but may require some modification to the instructions.
VERY IMPORTANT! Before modifying your files, make a backup copy in case you break something!
1. Convert File Format
Pre-Excel 2007, there was only one file format for Excel files, .XLS. With the advent of Excel 2007 and beyond, there is a new file format, .XLSX which is able to store more data in a smaller file. You can easily convert an old, XLS file into the new format by:
- Open the file
- Select the “File” tab on the Ribbon
- Select “Save As” from the options
- In the “Save as type:” drop box select “Excel Workbook (*.xlsx)”
- Select the “Save” button
2. Delete Pivot Table Saved Data
By default, pivot tables store their calculated data in the file. You can turn off this functionality and delete and stored data to free up space and compress the file size.
To delete pivot table stored data:
- Right-click on a pivot table and select “PivotTable Options ”
- Select the “Data” tab
- In the “PivotTable Data” section, uncheck the “Save source data with file” check box
- Select the “Refresh data when opening the file” check box
- In the “Retain items deleted from the data source” section, in the “Number of items to retain per field” drop box, select “None”
- Select the “OK” button to save your changes.
3. Remove Unnecessary Formatting
Everything you do to an Excel file increases it’s size. This includes not just the data entered into a cell but also the cell formatting. E.g. making a column bold or highlighted green etc.
To help reduce file size, remove any cell formatting that is not required. Common occurrences of unnecessary cell formatting include:
- Formatting on any tabs that are used for calculations but are not required to be viewed by the end user.
- Applying grid lines or white “fill” to all cells in a worksheet. Only apply formatting to cells which are actually being used or printed.
4. Corrupt Data
Any file on a computer can become corrupted due to many different factors. Excel files can not only become corrupted as a whole, but individual tabs within your worksheet can also become corrupted. The file may open and you may not even notice anything odd on your worksheet but the size of your file may skyrocket from a couple hundred kilobytes to hundreds of megabytes! It can be very difficult to figure out where the data in the file is corrupted but here are a couple of tricks to determine where the corruption is as well as how to potentially fix it.
- On each sheet, go up to the very first cell, A1 and press the Ctrl+End key combo. This will cause your cell selection to jump to the bottom-most cell where Excel believes it has data. If a file is corrupted, many times this will cause your cell selection to jump down hundreds or even thousands or lines below where your data is. In fact, you will probably be selecting a completely blank cell in a sea of blank cells. If this occurs, try deleting all rows and all columns that do not contain data in the sheet. Save your file and see if your file size is reduced. Do this for all sheets in your file.
- Another option to find corrupt data, is save your file as an HTML file. When Excel exports your data into HTML, it will create a new HTML document for each worksheet. You can then view the size of each HTML document to get a rough feel for which sheet is probably corrupt.
To export your file as HTML:
- Select “File” from the ribbon menu.
- Select “Save As” from the menu
- In the “Save as type:” drop box select “Web Page (*.htm;*.html)”
- Select the “Save” button
Once you have saved the file as HTML, open up the folder where you saved it and check the file sizes. The HTML files are named with the back-end worksheet names used in the VBA editor. If you have renamed any of your worksheets, you will need to open up the VBA editor to determine which back-end worksheet name refers to which specific worksheet.
To open the VBA editor:
- Select the “Developer” tab from the ribbon menu
- If you do not see a “Developer” tab, right click on the ribbon and select “Customize the Ribbon”
- On the left-hand side, in the “Main Tabs” box, make sure the check box is checked next to “Developer”
- Click the “OK” button
- Click the “Visual Basic” button in the “Code” menu section.
- On the left-hand side of the Visual Basic editor, there should be a tree view of the projects in your worksheet. If you do not see the tree view, press the Ctrl+R key combo.
- In the tree view, you should see an entry “VBAProject (Name of your excel file.xlsx)”. Under this entry, in the “Microsoft Excel Objects” folder, you should see each one of your tabs listed as “Sheet 1 (Sheet 1)”, etc. The name outside the parentheses is the back-end name of the worksheet. The name in parentheses is the name that you gave that sheet.
Once you have identified the sheet corresponding to the exported HTML file with the suspect large file size, you can try modifying formatting, deleting empty rows and columns, etc. on that sheet to reduce the overall file size.
If nothing else appears to be working, create a new spreadsheet and copy into it, the contents from the suspected sheet. Only copy the cells in which you have data. Do not copy the entire tab into the new workbook.
Once you have your data safely backed up into a new spreadsheet, delete the old tab from your original spreadsheet, save the file and then create a new tab and re-import the data from your new spreadsheet backup. This will hopefully result in deleting any corrupt data on the original spreadsheet tab.