Amazon Food Reviews (Word Cloud) with an example using Python
Word Cloud example using Python NLTK libraries

About the dataset:
This dataset consists of reviews of fine foods from Amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plain text review. It also includes reviews from all other Amazon categories.
Amazon Food Review using NLTK:
A word cloud is a visualization often used in text mining to show the most frequent words in a dataset, typically by drawing more frequent words in a larger font. It’s great for quick, intuitive insights but doesn’t perform any analysis by itself.
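To make "most frequent words" concrete, here is a minimal sketch that counts word frequencies in a few made-up review snippets using only the standard library. The sample sentences and the `top_words` helper are illustrative assumptions, not part of the dataset or of the `wordcloud` package.

```python
import re
from collections import Counter

# Made-up snippets standing in for review text (not taken from the dataset)
sample_reviews = [
    "Great taffy at a great price",
    "The taffy was soft and the price was fair",
    "Great coffee, great aroma",
]

def top_words(texts, n=3):
    """Return the n most common lowercase words across all texts."""
    counts = Counter()
    for text in texts:
        # Split on anything that is not a letter or apostrophe
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(n)

print(top_words(sample_reviews))
```

Here "great" comes out on top with four occurrences, so it would be the largest word in a cloud built from these snippets.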
Step 1: Importing the required libraries and packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from PIL import Image
from wordcloud import WordCloud
from nltk.corpus import stopwords
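One practical note: NLTK's stop word list is a separate corpus that has to be downloaded once with `nltk.download('stopwords')`, otherwise `stopwords.words('english')` raises a `LookupError`. The helper below is a hedged sketch of one way to handle that. The name `load_english_stopwords` and the tiny `FALLBACK_STOPWORDS` set are assumptions made for illustration, used only if NLTK or its corpus is unavailable.

```python
# Hypothetical fallback, used only when NLTK or its corpus cannot be loaded
FALLBACK_STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "this"}

def load_english_stopwords():
    """Return English stop words from NLTK, downloading the corpus if needed."""
    try:
        import nltk
        from nltk.corpus import stopwords
    except ImportError:
        return set(FALLBACK_STOPWORDS)
    try:
        return set(stopwords.words("english"))
    except LookupError:
        # Corpus not installed yet: try a one-time download, then retry
        try:
            nltk.download("stopwords", quiet=True)
            return set(stopwords.words("english"))
        except Exception:
            # Download failed (for example, no network): use the fallback
            return set(FALLBACK_STOPWORDS)

stop_words_demo = load_english_stopwords()
print(len(stop_words_demo))
```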
Step 2: Reading the dataset using Pandas:
df = pd.read_csv('/Users/Sample Datasets Kaggle/AmazonFine/Reviews.csv')
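With roughly 500,000 rows, it may help to load only the columns you plan to use. The sketch below shows the `usecols` parameter of `pd.read_csv` on a tiny in-memory CSV. The sample rows are made up, and the choice of `Score` and `Text` is an assumption about which columns matter for a word cloud.

```python
import io
import pandas as pd

# Tiny in-memory stand-in for Reviews.csv (values are made up)
csv_text = (
    "Id,ProductId,Score,Summary,Text\n"
    "1,B001,5,Good,Great dog food\n"
    "2,B002,1,Bad,Not as advertised\n"
)

# Load only the columns needed later, which reduces memory use
sample_df = pd.read_csv(io.StringIO(csv_text), usecols=["Score", "Text"])
print(sample_df.columns.tolist())
```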
Step 3: Describing the dataset using pandas library:
df.describe()
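Note that `describe()` summarizes only numeric columns by default, so text columns such as `Text` do not appear in its output. A small sketch on a made-up frame (the `toy` data is an assumption for illustration) shows the difference, using `Series.describe()` to summarize a text column:

```python
import pandas as pd

# Made-up rows mirroring two of the dataset's columns
toy = pd.DataFrame({
    "Score": [5, 1, 4],
    "Text": ["Great dog food", "Not as advertised", "Great dog food"],
})

numeric_summary = toy.describe()       # count, mean, std, min, quartiles, max
text_summary = toy["Text"].describe()  # count, unique, top, freq

print(numeric_summary)
print(text_summary)
```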

Step 4: Removing the NA values using dropna()
df_clean = df.dropna()
Step 5: Reviewing the head of the dataset
df_clean.head()

Step 6: Check whether any NA values remain in the dataset
df_clean.isna().sum()
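The cleaning and verification steps can be tried on a toy frame first. This is a minimal sketch with made-up rows. The `subset=["Text"]` variant is an alternative I am suggesting, not something the original walkthrough uses: it keeps rows whose only missing values are in columns the word cloud never reads.

```python
import pandas as pd

# Made-up rows with a few missing values
toy = pd.DataFrame({
    "ProfileName": ["ann", None, "cy"],
    "Summary": ["Good", "Bad", None],
    "Text": ["Great dog food", "Not as advertised", "Tasty taffy"],
})

# Drop every row that has at least one missing value
toy_clean = toy.dropna()

# Per-column count of missing values, before and after cleaning
print(toy.isna().sum())
print(toy_clean.isna().sum())

# Alternative: only require the column actually used for the word cloud
toy_text_only = toy.dropna(subset=["Text"])
print(len(toy_clean), len(toy_text_only))
```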

df_clean.columns
Index(['Id', 'ProductId', 'UserId', 'ProfileName', 'HelpfulnessNumerator',
'HelpfulnessDenominator', 'Score', 'Time', 'Summary', 'Text'],
dtype='object')
Step 7: Extract the Text column from the cleaned dataset df_clean
df_clean_text = df_clean['Text']
df_clean_text.head()
0 I have bought several of the Vitality canned d...
1 Product arrived labeled as Jumbo Salted Peanut...
2 This is a confection that has been around a fe...
3 If you are looking for the secret ingredient i...
4 Great taffy at a great price. There was a wid...
Name: Text, dtype: object
Step 8: Generate the Word Cloud using stop words
# Combine all text into a single string
text = " ".join(df_clean_text.astype(str))
# Generate the word cloud, removing stopwords
stop_words = set(stopwords.words('english'))
wc = WordCloud(stopwords=stop_words, background_color='white', width=800, height=400).generate(text)
# Display the word cloud
plt.figure(figsize=(15, 7))
plt.imshow(wc, interpolation='bilinear')
plt.axis('off')
plt.show()
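Joining every review into one large string works, but it holds the entire corpus in memory at once. As a hedged alternative, the sketch below counts terms review by review and hands the counts to `WordCloud.generate_from_frequencies`. The helper names `count_terms` and `render_from_counts` and the simple regex tokenizer are assumptions for illustration. `WordCloud.generate` applies its own tokenization, so the resulting picture may differ somewhat from the one above.

```python
import re
from collections import Counter

def count_terms(texts, stop_words):
    """Count lowercase word tokens across texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z']+", str(text).lower()):
            # Ignore stop words and single-character tokens
            if token not in stop_words and len(token) > 1:
                counts[token] += 1
    return counts

def render_from_counts(counts):
    """Build a word cloud from precomputed counts (needs the wordcloud package)."""
    from wordcloud import WordCloud
    cloud = WordCloud(background_color="white", width=800, height=400)
    return cloud.generate_from_frequencies(counts)

# Small demonstration with made-up text and a made-up stop word set
demo_counts = count_terms(
    ["Great taffy at a great price", "The taffy was soft"],
    {"at", "a", "the", "was"},
)
print(demo_counts.most_common(2))
```

With the real data, this would be called as `render_from_counts(count_terms(df_clean_text, stop_words))` and displayed with the same `plt.imshow` code as above.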

Conclusion:
Text mining lets us extract meaningful insights from unstructured text data, and word clouds serve as an intuitive visual entry point into that process. By highlighting the most frequent terms in a dataset, word clouds make it easy to spot dominant themes, recurring patterns, and potential areas for deeper analysis. While they don't replace statistical rigor, they offer a useful first look that connects raw data with human understanding in an accessible way.