Positive Reviews: 52.8% · 1,056 of 2,000 reviews · Majority class
Negative Reviews: 34.7% · 694 require attention · Action needed
Priority Flagged: 596 · Safety / Legal / Fraud signals · Auto-detected
Model Accuracy: 87.6% · Held-out test set only · DistilBERT

Sentiment Distribution
Ground-truth label proportions across all 2,000 UK reviews
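Proportions like these fall out of a single pandas aggregation. A minimal sketch, using an inline stand-in frame built from the dashboard's own counts (the `sentiment` column name is an assumption):

```python
import pandas as pd

# Stand-in for the labelled dataset, using the dashboard's counts:
# 1,056 positive + 694 negative + 250 neutral = 2,000 reviews
reviews = pd.DataFrame({
    "sentiment": ["positive"] * 1056 + ["negative"] * 694 + ["neutral"] * 250
})

# Ground-truth label proportions as percentages of all reviews
dist = reviews["sentiment"].value_counts(normalize=True).mul(100).round(1)
print(dist)  # positive 52.8 · negative 34.7 · neutral 12.5
```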
Sentiment by Product Category
Which categories generate the most negative reviews
Monthly Sentiment Trend — 2022 to 2024
% positive vs % negative reviews per month — tracking shifts in customer satisfaction over time
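A monthly trend of this shape can be computed by bucketing review dates into calendar months and normalising sentiment counts within each bucket. A sketch with toy dates (all values are assumptions, not the real data):

```python
import pandas as pd

# Hypothetical review timestamps and labels (column names are assumptions)
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-10", "2022-01-20", "2022-02-05", "2022-02-25"]),
    "sentiment": ["positive", "negative", "positive", "positive"],
})

# % of each sentiment per calendar month
monthly = (
    df.assign(month=df["date"].dt.to_period("M"))
      .groupby("month")["sentiment"]
      .value_counts(normalize=True)
      .unstack(fill_value=0)
      .mul(100)
)
print(monthly)
```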
Topic Distribution
Delivery dominates at 60.9% — the single biggest driver of review volume
Negative Sentiment Rate by Topic
Which topics are most associated with negative customer experience
Topic × Sentiment Breakdown
Stacked positive / neutral / negative for each topic — safety and returns/refunds have highest negative rates
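A stacked breakdown like this is a row-normalised cross-tabulation. A sketch with toy labels (the topic and sentiment values below are assumptions):

```python
import pandas as pd

# Hypothetical topic/sentiment pairs standing in for the real labels
df = pd.DataFrame({
    "topic": ["delivery", "delivery", "returns", "returns", "safety"],
    "sentiment": ["positive", "negative", "negative", "neutral", "negative"],
})

# normalize="index" gives the share of each sentiment *within* each topic,
# i.e. the per-topic stacked-bar proportions
breakdown = pd.crosstab(df["topic"], df["sentiment"], normalize="index").mul(100).round(1)
print(breakdown)
```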
Company % Positive Sentiment — League Table
All 30 companies ranked by positive review rate · Red = below 50% · Green = above 60%
Average Star Rating by Company
Top 15 companies by mean star rating
Priority Complaints by Company
Flagged safety / legal / fraud complaints per company
Priority Complaint Queue — Auto-Flagged Reviews
Sorted by priority tier · Critical = safety / legal / fraud signals · detected by rule-based keyword scoring · independent of sentiment model
| Tier | Company | Category | Topic | Review Extract | Signals |
|---|---|---|---|---|---|
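Rule-based keyword scoring of the kind described above can be sketched in a few lines. The keyword lists and the two-tier scheme here are assumptions for illustration; the real scorer lives in the pipeline scripts:

```python
# Minimal sketch of rule-based priority flagging, independent of the
# sentiment model. Keyword lists and tier names are assumptions.
SIGNAL_KEYWORDS = {
    "safety": ["caught fire", "electric shock", "choking hazard"],
    "legal": ["solicitor", "trading standards", "legal action"],
    "fraud": ["scam", "unauthorised charge", "never refunded"],
}

def flag_priority(review: str) -> tuple[str, list[str]]:
    """Return a priority tier plus the signal categories that fired."""
    text = review.lower()
    signals = [
        category
        for category, keywords in SIGNAL_KEYWORDS.items()
        if any(kw in text for kw in keywords)
    ]
    tier = "Critical" if signals else "Routine"
    return tier, signals

tier, signals = flag_priority("The charger caught fire; reporting to Trading Standards.")
print(tier, signals)  # Critical ['safety', 'legal']
```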
Confusion Matrix — Test Set Only
Rows = actual label · Cols = predicted · Neutral class hardest due to fewest samples (250 total)
Per-Class F1 Score
Neutral underperforms due to class imbalance — addressed with class weighting on train set
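Both the confusion matrix and the per-class F1 scores come straight from scikit-learn. A sketch with toy labels standing in for the 300-review test set (the label values below are assumptions):

```python
from sklearn.metrics import confusion_matrix, f1_score

labels = ["negative", "neutral", "positive"]  # fixed row/column order
y_true = ["positive", "positive", "negative", "neutral", "negative", "positive"]
y_pred = ["positive", "positive", "negative", "negative", "negative", "neutral"]

# Rows = actual label, columns = predicted, in the order given above
cm = confusion_matrix(y_true, y_pred, labels=labels)

# average=None returns one F1 per class instead of a single aggregate
per_class_f1 = f1_score(y_true, y_pred, labels=labels, average=None)
print(cm)
print(dict(zip(labels, per_class_f1.round(3))))
```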
Training Curve — Loss per Epoch (Simulated from Demo Run)
Train loss vs validation loss · Small gap (~0.03) confirms no overfitting · Early stopping monitors val loss, never test loss
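The "monitor val loss, never test loss" rule can be sketched as a simple patience loop. The patience value and loss sequence below are assumptions, not the demo run's actual values:

```python
# Sketch of early stopping driven by validation loss only
def early_stop_epoch(val_losses: list[float], patience: int = 2) -> int:
    """Return the 0-based epoch at which training would stop,
    or the last epoch if patience is never exhausted."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch  # no improvement for `patience` epochs
    return len(val_losses) - 1

# Val loss bottoms out at epoch 2, then creeps upward: stop at epoch 4
print(early_stop_epoch([0.60, 0.45, 0.40, 0.41, 0.42, 0.43]))  # 4
```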
Model Card
Architecture
Base model: distilbert-base-uncased
Parameters: 66.4M
Task: Sequence classification (3 classes)
Max length: 128 tokens
Dropout: 0.1 (DistilBERT default)
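A setup fragment matching this card, using the standard Hugging Face loaders (model identifier and class count from the card; running it requires the transformers library and a network connection or local cache):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=3,  # positive / neutral / negative
)

# Reviews are truncated/padded to the card's 128-token maximum
batch = tokenizer(
    ["Arrived on time, great quality."],
    max_length=128, truncation=True, padding="max_length",
    return_tensors="pt",
)
logits = model(**batch).logits  # shape: (1, 3)
```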
Training Config
Optimizer: AdamW
Learning rate: 2e-5
Weight decay (L2): 0.01
Gradient clipping: 1.0
Warmup ratio: 10%
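The warmup schedule implied by this config (linear ramp to the 2e-5 peak over the first 10% of steps, then linear decay) can be written as a small function; the step counts below are assumptions:

```python
# Linear warmup then linear decay, matching the card's peak LR and warmup ratio
def lr_at_step(step: int, total_steps: int, peak_lr: float = 2e-5,
               warmup_ratio: float = 0.10) -> float:
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps           # ramp up from 0
    remaining = total_steps - warmup_steps
    return peak_lr * (total_steps - step) / remaining  # decay to 0

# Peak LR is reached exactly when warmup ends (step 100 of 1,000 here)
print(lr_at_step(100, 1000))  # 2e-05
```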
Data Splits
Strategy: Stratified 70 / 15 / 15
Train: 1,400 reviews
Validation: 300 reviews
Test: 300 reviews (held out)
Class weights: Computed on train only
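A stratified 70 / 15 / 15 split with train-only class weights can be sketched with scikit-learn: split off 30% first, then halve that remainder into validation and test. The labels below reuse the dataset's real class counts; everything else is an assumption:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight

# Toy labels in the dataset's proportions: 1,056 / 694 / 250
labels = np.array(["positive"] * 1056 + ["negative"] * 694 + ["neutral"] * 250)
idx = np.arange(len(labels))

# 70% train, then split the remaining 30% evenly into val and test,
# stratifying every split on the sentiment label
train_idx, rest_idx = train_test_split(
    idx, test_size=0.30, stratify=labels, random_state=42)
val_idx, test_idx = train_test_split(
    rest_idx, test_size=0.50, stratify=labels[rest_idx], random_state=42)

# Class weights are computed on the TRAIN split only, never on val/test
classes = np.unique(labels[train_idx])
weights = compute_class_weight("balanced", classes=classes, y=labels[train_idx])
print(len(train_idx), len(val_idx), len(test_idx))  # 1400 300 300
```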
Known Limitations
Neutral F1 is lowest (72.2%) — class is ambiguous and smallest
Trained on English UK text only — may not generalise to other dialects
Very short reviews (<5 words) may produce low-confidence outputs
Live Review Classifier
Type or paste any UK customer review · the pipeline classifies sentiment, extracts topics, and scores priority in real time using the rule-based demo classifier (run 02_train_model.py --demo and 04_inference.py --use_model for the full DistilBERT model)
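The rule-based demo path can be sketched as a pair of keyword lexicons, one for sentiment and one for topics. Every word list below is an assumption for illustration; the real lexicons live in the pipeline scripts:

```python
# Minimal sketch of the rule-based demo classifier (lexicons are assumptions)
POSITIVE = {"great", "excellent", "fast", "perfect", "love"}
NEGATIVE = {"broken", "late", "refund", "terrible", "awful"}
TOPICS = {"delivery": {"late", "courier", "arrived"},
          "returns": {"refund", "return", "exchange"}}

def classify(review: str) -> dict:
    words = [w.strip(".,!?") for w in review.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    sentiment = "Pos" if pos > neg else "Neg" if neg > pos else "Neu"
    topic = next((t for t, kws in TOPICS.items()
                  if any(w in kws for w in words)), "other")
    return {"sentiment": sentiment, "topic": topic, "words": len(words)}

print(classify("Parcel arrived late and the courier was terrible."))
```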
Output panels: Sentiment (Pos / Neg / Neu) · Primary Topic · Word Count / chars · Priority Tier
Quick Test Examples
Click any example to classify instantly