08/19/2018

This is the CSV data I will be working with (generated using the website:


Goal: Create a bar plot showing the number of emails grouped by the letter they begin with. For instance, how many emails begin with “a”, how many with “b”, and so on. This can be useful for visualizing emails from various email providers (e.g. yahoo, gmail, etc.).
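To make the goal concrete, here is a minimal sketch of just the counting step, using `collections.Counter` from the standard library. The function name `count_first_letters` and the sample addresses are made up for illustration, and lowercasing the first character is my own assumption (so that “A” and “a” land in the same bar).

```python
from collections import Counter


def count_first_letters(emails):
    """Return a Counter mapping each first character (lowercased) to its count."""
    # Skip empty strings so that indexing email[0] cannot fail.
    return Counter(email[0].lower() for email in emails if email)


# Made-up addresses, for illustration only.
sample_emails = ["alice@example.com", "adam@example.org", "bob@example.net"]
counts = count_first_letters(sample_emails)
print(counts)  # Counter({'a': 2, 'b': 1})
```

A `Counter` behaves like a dictionary, so its keys and values can be handed to a bar plot the same way as a hand-built dictionary.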

The code:


[sourcecode lang="python"]

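import csv

import matplotlib.pyplot as plt
import numpy as np

filename = "MOCK_DATA.csv"


def count_email(rows):
    """Count how many emails start with each character."""
    dictionary = {}
    for row in rows:
        # The email address is taken from the fourth column.
        email = row[3]
        if email[0] not in dictionary:
            dictionary[email[0]] = 0
        dictionary[email[0]] += 1
    return dictionary


def plot_hist(dictionary):
    """Draw one bar per starting character."""
    keys = list(dictionary.keys())
    data = []
    for key in keys:
        data.append(dictionary[key])
    x = np.arange(len(keys))
    plt.bar(x, height=data)
    plt.xticks(x + 0.1, keys)
    plt.xlabel("Emails start with…")
    plt.show()


data = []

with open(filename) as f:
    reader = csv.reader(f)
    # If the CSV starts with a header row, skip it first with next(reader, None).
    try:
        for row in reader:
            data.append(row)
    except csv.Error as e:
        print("Error while reading csv file ", e)

if data:
    soln = count_email(data)
    if soln:
        plot_hist(soln)
[/sourcecode]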


Note: This assumes the script and MOCK_DATA.csv are in the same directory (same level).
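If the script might be launched from a different working directory, one option is to build the CSV path relative to the script's own location instead of relying on the current directory. A minimal sketch, where `data_path` is a hypothetical helper and `project/plot_emails.py` is a made-up script path (neither is part of the script above):

```python
from pathlib import Path


def data_path(script_file, name="MOCK_DATA.csv"):
    """Return the path of `name` inside the directory that holds `script_file`."""
    return Path(script_file).parent / name


# Inside the real script this would typically be called as data_path(__file__).
# The path below is a made-up example.
print(data_path("project/plot_emails.py"))
```

The result can be passed straight to `open()`, since `open()` accepts `Path` objects.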

Running the code like this: 




