If You Have To Use Excel, Automate It: Formatting with Python
Automate Excel Formatting with Python, Pandas, & XlsxWriter
Sometimes people need to download some data, format it in Excel, and distribute it. And they have to do this every day, and you look at it and think “I bet I could automate that.” And you are right! You could automate it and make someone’s life a lot easier. Don’t be a slacker, help them out.
How it Works
After writing a Pandas DataFrame to an Excel sheet using df.to_excel(writer, ...)
, you can access XlsxWriter’s workbook and worksheet objects. The library has a ton of formatting options I am not even touching here. If you want to review them all, check out the XlsxWriter documentation.
Let’s walk through a Python script that uses Pandas and XlsxWriter to create a well-formatted Excel report from a DataFrame (df1).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
import pandas as pd
# Assume df1 is a pre-existing Pandas DataFrame
# For example:
# data = {'ColA': [1, 2, 3], 'ColB': ['X', 'Y', 'Z'], 'ColC': [10.1, 20.2, 30.3], 'ColD': [True, False, True]}
# df1 = pd.DataFrame(data)
# 1. Initialize ExcelWriter with XlsxWriter engine
writer = pd.ExcelWriter('formatted_report.xlsx', engine='xlsxwriter')
# 2. Write DataFrame to a sheet
# index=False prevents writing the DataFrame index as a column in Excel
df1.to_excel(writer, sheet_name='Sheet1', index=False)
# 3. Access XlsxWriter workbook and worksheet objects
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# 4. Format the Header Row
# Define a format for the header cells
header_format = workbook.add_format({
'bold': True,
'italic': True,
'underline': True,
'font_size': 13,
'bottom': 2, # Medium border
'top': 2, # Medium border
'left': 2, # Medium border
'right': 2, # Medium border
'align': 'center',
'valign': 'vcenter',
'bg_color': '#DDEBF7' # A light blue background
})
# Apply the format to the header row (rewriting what Pandas initially wrote)
for col_num, value in enumerate(df1.columns.values):
worksheet.write(0, col_num, value, header_format)
# 5. Define Formats for Data Cells
# General format for data cells (thin borders on all sides)
format1_all_borders = workbook.add_format({
'bottom': 1, 'top': 1, 'left': 1, 'right': 1
})
# Specific format for the first row of data (medium top border, thin other borders)
format2_first_data_row = workbook.add_format({
'top': 2, # Medium top border
'bottom': 1, # Thin bottom border
'right': 1, # Thin right border
'left': 1 # Thin left border
})
# 6. Apply Conditional Formatting to Data Cells
# This applies 'format2_first_data_row' to the first row of data (row index 1)
# It checks if cells are not blank (criteria: '>=', value: '""' effectively means not blank for text/numbers)
if len(df1) > 0: # Ensure there is at least one data row
worksheet.conditional_format(1, 0, 1, df1.shape[1] - 1, {
'type': 'cell',
'criteria': '>=', # Applies to cells with any content (numbers or text)
'value': '""', # Compares against an empty string
'format': format2_first_data_row
})
# Apply 'format1_all_borders' to all data cells
worksheet.conditional_format(1, 0, df1.shape[0], df1.shape[1] - 1, {
'type': 'cell',
'criteria': '>=',
'value': '""',
'format': format1_all_borders
})
# 7. Add an Excel Table with AutoFilter and Style
if len(df1) > 0:
column_settings = [{'header': column} for column in df1.columns.values]
worksheet.add_table(0, 0, df1.shape[0], df1.shape[1] - 1, {
'columns': column_settings, # Use DataFrame headers for the table
'autofilter': True,
'style': 'Table Style Light 1'
})
else: # Handle empty DataFrame: create table with only headers
column_settings = [{'header': column} for column in df1.columns.values]
worksheet.add_table(0, 0, 0, df1.shape[1] - 1, {
'columns': column_settings,
'autofilter': True,
'style': 'Table Style Light 1'
})
# 8. Set Column Widths
# Adjust these character widths based on your data
worksheet.set_column('A:A', 10)
worksheet.set_column('B:B', 22)
worksheet.set_column('C:D', 11)
worksheet.set_column('E:K', 8.5) # Example range
# 9. Page Setup for Printing
worksheet.set_landscape() # Set page orientation to landscape
worksheet.repeat_rows(0) # Repeat header row (row 0) on each printed page
# 10. Save and Close the Excel File
try:
writer.close() # This also saves the file
print("Excel file 'formatted_report.xlsx' saved successfully!")
except Exception as e:
print(f"Error saving Excel file: {e}")
Explanation of the Code
-
Initialize ExcelWriter with XlsxWriter engine
1
writer = pd.ExcelWriter('formatted_report.xlsx', engine='xlsxwriter')
-
Write DataFrame to a sheet
index=False
prevents writing the DataFrame index as a column in Excel.1
df1.to_excel(writer, sheet_name='Sheet1', index=False)
-
Access XlsxWriter workbook and worksheet objects
1 2
workbook = writer.book worksheet = writer.sheets['Sheet1']
-
Format the Header Row
Define a format for the header cells.
1 2 3 4 5 6 7 8 9 10 11 12 13
header_format = workbook.add_format({ 'bold': True, 'italic': True, 'underline': True, 'font_size': 13, 'bottom': 2, # Medium border 'top': 2, # Medium border 'left': 2, # Medium border 'right': 2, # Medium border 'align': 'center', 'valign': 'vcenter', 'bg_color': '#DDEBF7' # A light blue background })
Apply the format to the header row (rewriting what Pandas initially wrote).
1 2
for col_num, value in enumerate(df1.columns.values): worksheet.write(0, col_num, value, header_format)
-
Define Formats for Data Cells
General format for data cells (thin borders on all sides).
1 2 3
format1_all_borders = workbook.add_format({ 'bottom': 1, 'top': 1, 'left': 1, 'right': 1 })
Specific format for the first row of data (medium top border, thin other borders).
1 2 3 4 5 6
format2_first_data_row = workbook.add_format({ 'top': 2, # Medium top border 'bottom': 1, # Thin bottom border 'right': 1, # Thin right border 'left': 1 # Thin left border })
-
Apply Conditional Formatting to Data Cells
This applies ‘format2_first_data_row’ to the first row of data (row index 1). It checks if cells are not blank (criteria: ‘>=’, value: ‘””’ effectively means not blank for text/numbers).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
if len(df1) > 0: # Ensure there is at least one data row worksheet.conditional_format(1, 0, 1, df1.shape[1] - 1, { 'type': 'cell', 'criteria': '>=', # Applies to cells with any content (numbers or text) 'value': '""', # Compares against an empty string 'format': format2_first_data_row }) # Apply 'format1_all_borders' to all data cells worksheet.conditional_format(1, 0, df1.shape[0], df1.shape[1] - 1, { 'type': 'cell', 'criteria': '>=', 'value': '""', 'format': format1_all_borders })
-
Add an Excel Table with AutoFilter and Style
The table range should include the header row (row 0) and all data rows.
df1.shape[0]
is the number of data rows, so the last data row index isdf1.shape[0]
.1 2 3 4 5 6 7
if len(df1) > 0: column_settings = [{'header': column} for column in df1.columns.values] worksheet.add_table(0, 0, df1.shape[0], df1.shape[1] - 1, { 'columns': column_settings, # Use DataFrame headers for the table 'autofilter': True, 'style': 'Table Style Light 1' })
Handle empty DataFrame: create table with only headers.
1 2 3 4 5 6 7
else: # Handle empty DataFrame: create table with only headers column_settings = [{'header': column} for column in df1.columns.values] worksheet.add_table(0, 0, 0, df1.shape[1] - 1, { 'columns': column_settings, 'autofilter': True, 'style': 'Table Style Light 1' })
-
Set Column Widths
Adjust these character widths based on your data.
1 2 3 4
worksheet.set_column('A:A', 10) worksheet.set_column('B:B', 22) worksheet.set_column('C:D', 11) worksheet.set_column('E:K', 8.5) # Example range
-
Page Setup for Printing
1 2
worksheet.set_landscape() # Set page orientation to landscape worksheet.repeat_rows(0) # Repeat header row (row 0) on each printed page
-
Save and Close the Excel File
1 2 3 4 5
try: writer.close() # This also saves the file print("Excel file 'formatted_report.xlsx' saved successfully!") except Exception as e: print(f"Error saving Excel file: {e}")