Spam vs. Ham

Overview

In the Spam vs. Ham project, the goal is to create a classifier capable of distinguishing spam emails from legitimate ones (ham).

Description

  • The project combines feature engineering, logistic regression, and cross-validation to develop a spam detection model
  • Built a classifier that distinguishes spam from ham (non-spam) emails, using features engineered from the email text and logistic regression models fit with sklearn
  • Achieved 90% accuracy on a 1,000-email test set
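
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy emails, the KEYWORDS list, and the helper words_in_texts are all hypothetical, and the fold count is chosen only to suit the tiny dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy data invented for illustration; not the project's dataset.
emails = [
    "Win a FREE prize now, click here",
    "Meeting moved to 3pm, see agenda attached",
    "Cheap loans, free money, act now",
    "Can you review my draft before Friday?",
    "Free offer!!! Claim your prize today",
    "Lunch tomorrow? Let me see what works",
    "Click now to claim free money",
    "Here are the notes from today's lecture",
]
labels = np.array([1, 0, 1, 0, 1, 0, 1, 0])  # 1 = spam, 0 = ham

# Hypothetical keyword list; a real feature set would be chosen by exploring the data.
KEYWORDS = ["free", "prize", "click", "money", "now"]


def words_in_texts(words, texts):
    """Return an (n_texts, n_words) 0/1 matrix marking which words occur in which texts.

    Matching is by lowercase substring, so "now" would also match "know".
    """
    return np.array([[int(w in t.lower()) for w in words] for t in texts])


X = words_in_texts(KEYWORDS, emails)

# Estimate accuracy with 4-fold cross-validation, then fit on all of the data.
model = LogisticRegression()
scores = cross_val_score(model, X, labels, cv=4, scoring="accuracy")
model.fit(X, labels)

print("mean cross-validated accuracy:", scores.mean())
```

Cross-validated accuracy is reported alongside the final fit because accuracy measured on the training emails alone tends to overstate how well the model generalizes.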

Skills Developed

  • Extracting relevant features from text data
  • Cleaning and preprocessing text data for analysis
  • Implementing a classification algorithm for spam detection
  • Evaluating model performance and avoiding overfitting
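
As a small illustration of the first two skills, the sketch below cleans a raw email and derives a few surface features from it. The function names and the particular features are assumptions made for this example; the project's own preprocessing may differ.

```python
import re


def clean_email(text):
    """Lowercase, drop HTML tags, replace punctuation with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)       # strip HTML tags
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # punctuation and symbols become spaces
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace


def surface_features(raw):
    """Hypothetical hand-built features; some use the raw text, since cleaning
    discards signals such as capitalization and exclamation marks."""
    letters = [c for c in raw if c.isalpha()]
    upper_frac = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    return {
        "n_exclaim": raw.count("!"),
        "upper_frac": upper_frac,
        "n_words": len(clean_email(raw).split()),
    }


raw = "<p>WIN a FREE prize!!!</p>"
print(clean_email(raw))        # win a free prize
print(surface_features(raw))
```

Features like these can be stacked into a numeric matrix and passed to a classifier in place of, or alongside, word-presence indicators.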