Create a word cloud
Goal¶
This post shows how to create a word cloud with the wordcloud library.
As the source of words, I use one of my posts on 200Wordsaday (a.k.a. 200WaD), a community for people who want to build a writing habit.
Library¶
In [1]:
import numpy as np
import pandas as pd
from os import path
from PIL import Image
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
import matplotlib.pyplot as plt
%matplotlib inline
# For fetching data using REST call
import requests
from IPython.display import HTML
# For cleaning html tags & non-ASCII characters
from bs4 import BeautifulSoup
import unidecode
Configuration¶
In [2]:
private_key = '{your private key}' # your private key
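Hard-coding the key in the notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the key could instead be read from an environment variable. The variable name `WAD_API_KEY` and the helper `load_private_key` are hypothetical names chosen for this illustration; they are not part of the 200WaD API.

```python
import os


def load_private_key(env_var="WAD_API_KEY"):
    """Return the API key stored in an environment variable.

    `WAD_API_KEY` is a hypothetical variable name used for illustration.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With the variable set in the shell that launches the notebook, `private_key = load_private_key()` would replace the hard-coded assignment.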
Load your words¶
In [3]:
# Get the latest post data
r = requests.get('https://200wordsaday.com/api/texts', params={'api_key': private_key})
r.raise_for_status()  # stop early if the request failed (e.g. an invalid key)
r_json = r.json()
print(f'# of posts: {len(r_json)}')
In [4]:
# Each post has the following key / parameters
r_json[0].keys()
Out[4]:
Create a word cloud¶
Load one post¶
In [5]:
# Pick one post; its raw content still contains HTML tags, which are cleaned up in the next step
words = r_json[2]['content']
words[:300]
Out[5]:
Clean words¶
In [6]:
# Strip HTML tags with BeautifulSoup (name the parser explicitly to avoid a warning)
soup = BeautifulSoup(words, 'html.parser')
all_text = soup.get_text()
# Convert non-ascii characters into ASCII equivalent
all_text = unidecode.unidecode(all_text)
# # Remove backslash
# all_text = all_text.replace("\'", "'")
all_text
Out[6]:
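To make the two cleaning steps concrete, here is a rough standard-library sketch of the same idea: drop the HTML tags, then reduce non-ASCII characters to ASCII. It is a stand-in for illustration, not how BeautifulSoup or unidecode work internally; the names `TextExtractor`, `strip_tags`, and `to_ascii` are made up for this example, and the ASCII step is cruder than unidecode because it simply drops characters that have no decomposed ASCII form.

```python
import unicodedata
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes entities like &amp;
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    """Return the text of an HTML fragment with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def to_ascii(text):
    """Decompose accented characters, then drop anything without an ASCII form."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


sample = "<p>Caf\u00e9 &amp; <b>na\u00efve</b> writing</p>"
cleaned = to_ascii(strip_tags(sample))
print(cleaned)
```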
Generate a word cloud with default parameters¶
In [7]:
# Create and generate a word cloud image:
wordcloud = WordCloud().generate(all_text)
# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
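By default, the size of each word in the cloud reflects how often it appears in the text once common stop words are removed. The sketch below illustrates that idea with the standard library only. The tiny stop-word set and the `word_frequencies` helper are assumptions for illustration, and the tokenization is simpler than what wordcloud does internally.

```python
import re
from collections import Counter

# A tiny illustrative stop-word set; wordcloud ships a much longer STOPWORDS set.
SAMPLE_STOPWORDS = {"a", "the", "to", "and", "of", "is", "i"}


def word_frequencies(text, stopwords=SAMPLE_STOPWORDS):
    """Count lowercase words, skipping stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


freqs = word_frequencies("I write to build a habit, and the habit is to write daily.")
print(freqs.most_common(2))
```

A mapping like `freqs` could also be passed to `WordCloud().generate_from_frequencies(freqs)` when you want to control the counting yourself instead of calling `generate` on raw text.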
Generate a word cloud with specific parameters¶
In [8]:
# Create and generate a word cloud image:
param_wordcloud = {'max_font_size': 30,
                   'max_words': 80,
                   'background_color': 'white'}
wordcloud = WordCloud(**param_wordcloud).generate(all_text)
# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
Overlay on 200WaD logo¶
In [9]:
# 200WaD Logo to overlay
image_filename = '200WaD_400x400.jpg'
Image.open(f"../images/{image_filename}")
Out[9]:
In [13]:
# Create a mask from image
mask = np.array(Image.open(f"../images/{image_filename}"))
# Set parameters
param_wordcloud = {'max_font_size': 80,
                   'max_words': 200,
                   'background_color': 'white',
                   'mask': mask}
# Create and generate a word cloud
wordcloud = WordCloud(**param_wordcloud).generate(all_text)
# Display the generated image
plt.figure(figsize=[6,6])
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
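wordcloud treats pure white pixels in the mask as off-limits and draws words on everything else. Because JPEG compression can leave the background slightly off-white, it may help to push near-white pixels to exactly 255 before building the cloud. The following is a minimal sketch on a synthetic array; the `whiten_background` helper and the threshold of 240 are assumptions for illustration, and the code assumes an RGB mask with shape (height, width, 3).

```python
import numpy as np


def whiten_background(mask, threshold=240):
    """Return a copy of an RGB mask with near-white pixels set to pure white."""
    cleaned = mask.copy()
    # A pixel counts as background when all of its channels exceed the threshold.
    near_white = (cleaned > threshold).all(axis=-1)
    cleaned[near_white] = 255
    return cleaned


# Synthetic 2x2 RGB image: one dark pixel, three off-white pixels.
demo = np.full((2, 2, 3), 250, dtype=np.uint8)
demo[0, 0] = (10, 20, 30)
result = whiten_background(demo)
```

If words spill into the logo's background, calling `mask = whiten_background(mask)` before passing the mask to `WordCloud` is one thing to try.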
Adjust the color based on the logo¶
In [14]:
# Create coloring from image
image_colors = ImageColorGenerator(mask)
plt.figure(figsize=[10,10])
plt.imshow(wordcloud.recolor(color_func=image_colors), interpolation="bilinear");
plt.axis("off");