Sentiment Analysis of YouTube Comments Using Machine Learning

Authors

  • Mohammed Aman, Mehraj Khan, Faisal Ur Rhaman, BTech Students, Department of Computer Science and Engineering, Lords Institute of Engineering and Technology, Hyderabad, India
  • Mrs Mayuri R. Tone, Assistant Professor, Department of Computer Science and Engineering, Lords Institute of Engineering and Technology, Hyderabad, India

Keywords:

Sentiment Analysis · YouTube Comments · Machine Learning · TF-IDF · Logistic Regression · Naive Bayes · SVM · NLP · Flask · scikit-learn · Opinion Mining

Abstract

YouTube generates billions of user comments daily, representing an enormous corpus of public opinion, audience
feedback, and social discourse. Automated sentiment analysis of this data is essential for content creators, brand
managers, and researchers who need to gauge audience reception at scale. This paper presents a comprehensive
machine learning-based platform for three-class sentiment classification (Positive, Negative, Neutral) of YouTube
comments, achieving 95.0% accuracy with both Logistic Regression and Naive Bayes classifiers. The system employs
TF-IDF vectorization with 5,000 features and bigram support on a 15,000-comment training dataset. Six machine
learning algorithms are systematically trained, evaluated, and compared: Logistic Regression (95.0%), Naive Bayes
(95.0%), SVM (94.97%), Gradient Boosting (94.8%), KNN (94.8%), and Random Forest (94.53%). The
NLP preprocessing pipeline performs lowercasing, URL removal, mention/hashtag stripping, non-alphabetic
character removal, and NLTK stopword filtering. The system is deployed as a Flask web application (port 5014) with
a Bootstrap 5 dark purple theme, Chart.js interactive visualizations (pie charts, bar charts), dual-mode YouTube
comment fetching (real YouTube Data API v3 and mock generation with 200+ templates), per-user analysis history,
a nine-chart EDA gallery, secure Werkzeug-PBKDF2 authentication, and Docker containerization. All six models
achieve above 94.5% accuracy, validating TF-IDF with bigrams as a highly effective feature representation for social
media sentiment classification.
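
As a rough illustration of the preprocessing and feature-extraction steps named in the abstract, the following is a minimal sketch and not the authors' implementation. The function names `clean_comment` and `build_pipeline`, the regular expressions, and the short `STOPWORDS` set are assumptions made for this example; the paper uses NLTK's English stopword list, and the abstract does not say whether mention/hashtag stripping removes the whole tag or only the symbol (the sketch removes the whole tag). The scikit-learn settings `max_features=5000` and `ngram_range=(1, 2)` mirror the "5,000 features and bigram support" stated above; `max_iter=1000` is an arbitrary choice.

```python
import re

# Tiny stand-in stopword list (assumption). The paper uses NLTK's English list,
# which would be set(nltk.corpus.stopwords.words("english")) after
# nltk.download("stopwords").
STOPWORDS = {"a", "an", "the", "is", "are", "was", "this", "that",
             "it", "to", "of", "and", "in", "for", "on"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_HASHTAG_RE = re.compile(r"[@#]\w+")   # assumption: drop the whole tag
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_comment(text: str) -> str:
    """Apply the five cleaning steps in the order the abstract lists them."""
    text = text.lower()                       # lowercasing
    text = URL_RE.sub(" ", text)              # URL removal
    text = MENTION_HASHTAG_RE.sub(" ", text)  # mention/hashtag stripping
    text = NON_ALPHA_RE.sub(" ", text)        # non-alphabetic character removal
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)                   # stopword filtering


def build_pipeline():
    """TF-IDF (5,000 features, unigrams and bigrams) feeding Logistic Regression."""
    # Imported here so clean_comment can be used without scikit-learn installed.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    return make_pipeline(
        TfidfVectorizer(preprocessor=clean_comment,
                        max_features=5000,
                        ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )


print(clean_comment("Loved this video!!! Check https://example.com @user #fun 100%"))
# prints: loved video check
```

With scikit-learn installed, the pipeline could be trained with `build_pipeline().fit(comments, labels)`, where `comments` is a list of raw comment strings and `labels` holds the matching Positive, Negative, or Neutral tags; both names are hypothetical. Swapping `LogisticRegression` for another scikit-learn classifier would give the other models compared in the abstract under the same feature representation.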

Published

2026-04-22

Section

Articles

How to Cite

Sentiment Analysis of YouTube Comments Using Machine Learning. (2026). International Journal of Engineering and Science Research, 16(2), 424-439. https://ijesr.org/index.php/ijesr/article/view/1646
