IMDb Movie Review Sentiment Analysis

Model description

imdb-movie-review-sentiment-analysis is a machine learning model that predicts the sentiment (positive or negative) of movie reviews from IMDb.
It uses a TF-IDF vectorizer for feature extraction and an ensemble of two classic machine learning classifiers: Logistic Regression and Multinomial Naive Bayes.
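
To make the feature-extraction step concrete, here is a minimal pure-Python sketch of TF-IDF weighting. It is not the code behind this model. The function name `tfidf`, the smoothed IDF formula, and the L2 normalization are assumptions chosen to mirror common library defaults; the card does not state which settings were used.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (not confirmed by the model card):
    idf = ln((1 + N) / (1 + df)) + 1, followed by L2 normalization.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0
           for term, count in df.items()}

    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {term: tf[term] * idf[term] for term in tf}
        # Scale to unit length so long and short reviews are comparable.
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


docs = [["great", "movie"], ["bad", "movie"]]
vectors = tfidf(docs)
```

In this toy corpus "movie" appears in both documents, so it receives a lower weight than "great" or "bad", which each appear in only one.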

The model was trained on the official IMDb Large Movie Review Dataset (50,000 labeled reviews) with standard NLP preprocessing: lowercasing, special character removal, tokenization, stopword removal, and lemmatization.
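
The preprocessing steps listed above can be sketched roughly as follows. This is an illustration only: the stopword list is a tiny hypothetical sample, and `naive_lemmatize` is a crude suffix-stripping stand-in for a real lemmatizer, since the card does not name the tools actually used.

```python
import re

# Tiny illustrative stopword list (assumption); a real pipeline would
# likely use a much fuller English list.
STOPWORDS = {"a", "an", "and", "the", "is", "was", "it", "of", "to",
             "in", "this", "i"}


def naive_lemmatize(token):
    """Crude suffix stripping; a stand-in for a real lemmatizer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, drop special characters, tokenize, remove stopwords, lemmatize."""
    text = text.lower()
    # Replace anything that is not a lowercase letter or whitespace.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [naive_lemmatize(t) for t in tokens]


tokens = preprocess("This movie was absolutely fantastic! I loved every minute of it.")
```

Suffix stripping can produce non-words (for example, "loved" becomes "lov"), which is one reason a dictionary-based lemmatizer is usually preferred in practice.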

Intended use:

Sentiment analysis of English-language movie reviews
Educational and research purposes
As a baseline for more advanced NLP projects

Example usage

from inference import SentimentAnalyzer

# Initialize the analyzer (make sure model files are in 'saved_models/')
analyzer = SentimentAnalyzer(model_dir="saved_models")

# Predict sentiment for a new review
review = "This movie was absolutely fantastic! I loved every minute of it."
result = analyzer.predict(review)
print(result)
# Output example:
# {
#   'logistic_regression': {'prediction': 'positive', 'confidence': 0.98, ...},
#   'naive_bayes': {'prediction': 'positive', 'confidence': 0.95, ...}
# }
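
The output above reports each classifier separately, and the card does not say how, or whether, the two are merged into a single verdict. If one combined label is needed, a soft vote is one possible approach. In the sketch below, `soft_vote` is a hypothetical helper, not part of `inference`, and it assumes that `confidence` is the probability the model assigns to its predicted label.

```python
def soft_vote(result):
    """Average each model's probability of 'positive' (soft voting).

    Assumption: 'confidence' is the probability of the predicted label.
    """
    positive_probs = []
    for model_output in result.values():
        conf = model_output["confidence"]
        # Convert "confidence in predicted label" to "probability of positive".
        if model_output["prediction"] == "positive":
            positive_probs.append(conf)
        else:
            positive_probs.append(1.0 - conf)

    avg_positive = sum(positive_probs) / len(positive_probs)
    label = "positive" if avg_positive >= 0.5 else "negative"
    confidence = avg_positive if label == "positive" else 1.0 - avg_positive
    return {"prediction": label, "confidence": confidence}


# Same shape as the example output above, without the elided fields.
example = {
    "logistic_regression": {"prediction": "positive", "confidence": 0.98},
    "naive_bayes": {"prediction": "positive", "confidence": 0.95},
}
combined = soft_vote(example)
```

Averaging probabilities lets a very confident model outweigh a hesitant one when the two disagree, which a simple majority vote between two models cannot do.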

Metrics

Logistic Regression Accuracy: 88.47%
Naive Bayes Accuracy: 85.20%
Evaluated on a held-out test set (20% of the IMDb dataset, 10,000 reviews).
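
For reference, the arithmetic behind these figures can be checked with a few lines of Python. The `accuracy` helper is a generic illustration, not the project's evaluation code, and the implied counts of correct predictions assume the reported percentages are exact over the 10,000 test reviews.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


TOTAL_REVIEWS = 50_000
TEST_FRACTION = 0.20

# Size of the held-out test set described above.
test_size = round(TOTAL_REVIEWS * TEST_FRACTION)

# Correct predictions implied by the reported accuracies (assumes the
# percentages are exact over the whole test set).
implied_correct_lr = round(0.8847 * test_size)
implied_correct_nb = round(0.8520 * test_size)

# Toy check of the helper: 3 of 4 labels match.
toy_accuracy = accuracy(["pos", "neg", "pos", "neg"],
                        ["pos", "neg", "neg", "neg"])
```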

Limitations

Only works for English text.
Not robust to sarcasm, irony, or highly ambiguous reviews.
May not generalize well to domains outside of movie reviews.
Does not handle emojis, slang, or non-standard text well.
Uses classic ML models rather than deep learning, so it may underperform on linguistically complex reviews.

Training data

IMDb Large Movie Review Dataset: 50,000 movie reviews labeled as positive or negative.
Balanced: 25,000 positive and 25,000 negative reviews.
Reviews are preprocessed (lowercased, cleaned, tokenized, stopwords removed, lemmatized).
Dataset is widely used for benchmarking sentiment analysis models.

License

MIT
