Sentiment Analysis of Movie Reviews using Word Embeddings and Machine Learning Techniques
Keywords:
Sentiment analysis, Word embedding, Machine learning
Abstract
In this study, sentiment analysis of movie reviews was carried out using word embeddings and machine learning techniques. Sentiment analysis, an opinion mining technique, uses feature extraction methods to identify the opinions and emotions expressed in text, particularly in domains such as movie reviews, where public sentiment strongly shapes consumer decisions. For sentiment analysis to be effective, text must first be converted into a numerical form that a computer can process. This involves transforming words or documents into vectors using word embedding techniques. Common techniques include Bag of Words, TF-IDF, and Word2Vec. In this study, TF-IDF and Bidirectional Encoder Representations from Transformers (BERT) were selected to compare their effectiveness in analyzing sentiment in movie reviews. The research used the IMDb dataset, which is widely used in text mining tasks. Several machine learning models were applied, including Support Vector Machine (SVM), XGBoost, and Long Short-Term Memory (LSTM). The combination of TF-IDF and SVM produced the highest accuracy, outperforming more complex models such as BERT with LSTM. The findings suggest that simpler word embedding techniques, when paired with effective classifiers, can achieve strong performance in sentiment analysis.
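
To make the vectorization step concrete, the following is a minimal sketch of how documents can be turned into TF-IDF vectors. It is an illustration only and not the implementation used in this study: the function names `tokenize` and `tfidf_vectors`, the whitespace tokenizer, the smoothed IDF variant, and the two sample reviews are all assumptions introduced here.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    # A real pipeline would typically also strip punctuation and stop words.
    return text.lower().split()


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocabulary}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        # Term frequency (relative count) weighted by IDF.
        vectors.append([counts[t] / total * idf[t] for t in vocabulary])
    return vocabulary, vectors


# Hypothetical sample reviews, not taken from the IMDb dataset.
reviews = [
    "a great and moving film",
    "a dull and boring film",
]
vocabulary, vectors = tfidf_vectors(reviews)
```

Terms that appear in only one review, such as "great", receive a higher weight than terms shared by both, such as "film". Vectors of this kind are what a classifier such as SVM would take as input.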