Automate Simple Tasks with Python: Excel Table to HTML Table using the python Dominate Module

 

Sometimes a database is really not necessary. Why not try a simple excel sheet

Coded this in an hour; sips on coffee though 9:06:34 PM


I generated the excel workbook using a website called Mockaroo. Today’s task is to convert a basic excel sheet table into an HTML table. I broke the task into sub tasks which were then later translated into functions.

PseudoCode

  1. Convert Excel Table into one dimensional List containing dictionaries
  2. Convert List into HTML document
  3. Save HTML string into a file with “.html” extension
  4. Done

The Excel Sheet looked like this:

MOCK_DATA

The code generated an HTML table (using the style.css – see below) that looked like this:

The power of automation
The excel table data was extracted and converted into a simple table.

The files: excel style

This is the code


#Create a function to process excel data
#create another function to create html page
#https://github.com/Knio/dominate/blob/master/README.md --> dominate module

import logging, dominate, openpyxl, sys, os
from dominate.tags import *

#setup logging Debug
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s - %(levelname)s - %(message)s")

#create excel sheet processor

def excel_sheet_processor(workbookfilepath: str):
    '''
        Converts Excel Sheet data into an array with dictionaries
        arguments:
            workbookfilepath: filepath to workbook
        returns:
            list
    '''
    #open workbook
    wb = openpyxl.load_workbook(workbookfilepath)
    #select active sheet
    ws = wb.active
    #define list
    workbook_list = []
    #define keys as a list
    my_keys = []
    for col in range(0, ws.max_column):
        my_keys.append(   ws.cell(row=1, column=col + 1).value  )
    #define loop to convert rows to dictionaries
    for row in range( 2,  ws.max_row ):
            #create dictionary
            dictionary = {}
            for pos in range(0, len(my_keys)):
                dictionary[my_keys[pos]] = ws.cell(row=row, column=pos+1).value
            workbook_list.append(dictionary)
    return workbook_list


def list_diction_to_html(workbook_list: list):
    '''
        Creates HTML Pages with table containing "list" data
        arguments:
            workbook_list --> a list of dictionaries
        returns:
            html file path --> str
    '''

    doc = dominate.document(title="Excel Spread Sheet") #sets html title tag
    with doc.head:
        link(rel="stylesheet", href="style.css")
    with doc:
        with div(id="excel_table").add(table()):

            with thead():
                #add header
                dictionary = workbook_list[0]
                for key in dictionary.keys():
                    table_header = td()
                    table_header.add(p(key))

            for dictionary in workbook_list:
                #loop through row --> create table row
                table_row = tr(cls="excel_table_row")
                #loop through each key in dictionary
                for key in dictionary:
                    with table_row.add(td()):
                        p(dictionary[key])
    return str(doc) #turns the document into a string

def save_dom_to_html(dom):
    '''
        Saves DOM string into newly generated HTML file
        arguments:
            dom --> str
        returns:
            filepath --> str
    '''
    filepath = os.path.abspath("excel.html")
    htmfile = open(filepath, "w")
    htmfile.write(dom)
    htmfile.close()
    return filepath
if __name__ == "__main__":
    filepath = os.path.abspath("MOCK_DATA.xlsx")
    list_work = excel_sheet_processor(filepath)
    if list_work:
        dom = list_diction_to_html(list_work)
        save_dom_to_html(dom)
        logging.info("HTML file saved")
<span data-mce-type="bookmark" style="display: inline-block; width: 0px; overflow: hidden; line-height: 0;" class="mce_SELRES_start"></span>

You could expand the code to make it more flexible like:

  • Getting the HTML file name from the command line
  • Create different tables based on different sheets in a workbook
  • Create different HTML files based on different workbook (in a given directory)
  • Convert HTML Table into Excel Sheet (well basically webscraping; using the python module: BeautifulSoup)

You don’t like text, then watch me code it live:



If you have any questions, write them in the comment section.

 

See you soon!