Featured Project: Medical Insurance Fraud Detection

Built a machine learning model using Medicare datasets to detect fraudulent claims.

Objective

Develop a machine learning model on large-scale Medicare claims data to detect fraudulent activity, with the goal of reducing costs and improving compliance in the healthcare system.

Approach

Integrated multiple public datasets (CMS Part D, LEIE, and physician payment records), performed data cleansing and feature engineering, and applied classification algorithms including Random Forest, with emphasis on anomaly detection and geo-demographic analysis.
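
As an illustration of the integration and feature engineering step, here is a minimal sketch in plain Python. It assumes providers are matched across datasets by NPI and that a provider appearing on the LEIE exclusion list is treated as a fraud label; both are assumptions, not details confirmed by the project. The rows, field names (npi, total_claims, total_cost), and the helper build_provider_features are hypothetical stand-ins for the real files.

```python
from collections import defaultdict

# Hypothetical stand-ins for Part D rows; field names are assumptions.
part_d_rows = [
    {"npi": "1001", "drug": "A", "total_claims": 120, "total_cost": 5400.0},
    {"npi": "1001", "drug": "B", "total_claims": 30, "total_cost": 900.0},
    {"npi": "1002", "drug": "A", "total_claims": 15, "total_cost": 600.0},
    {"npi": "1003", "drug": "C", "total_claims": 400, "total_cost": 82000.0},
]

# Hypothetical set of NPIs found on the LEIE exclusion list.
leie_excluded_npis = {"1003"}


def build_provider_features(rows, excluded):
    """Aggregate claim rows per provider and attach an exclusion label."""
    totals = defaultdict(
        lambda: {"total_claims": 0, "total_cost": 0.0, "n_drugs": 0}
    )
    for row in rows:
        agg = totals[row["npi"]]
        agg["total_claims"] += row["total_claims"]
        agg["total_cost"] += row["total_cost"]
        agg["n_drugs"] += 1

    features = {}
    for npi, agg in totals.items():
        claims = agg["total_claims"]
        features[npi] = {
            **agg,
            # Engineered ratio feature; guarded against zero claims.
            "cost_per_claim": agg["total_cost"] / claims if claims else 0.0,
            # Label: 1 if the provider is on the exclusion list, else 0.
            "is_excluded": int(npi in excluded),
        }
    return features


provider_features = build_provider_features(part_d_rows, leie_excluded_npis)
```

Each entry in provider_features is one row of a per-provider table that a classifier such as Random Forest could be trained on, with is_excluded as the target.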

Results

Random Forest performed best among the classifiers tested, with an AUC of 0.72. The model identified fraud patterns across providers, patients, and regions, and highlighted a significant concentration of fraud in the Bay Area.
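
For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen fraudulent case receives a higher model score than a randomly chosen legitimate one. The sketch below computes it directly from that definition; the function name roc_auc and the sample labels and scores are illustrative only and are not taken from the project.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative data: 1 = fraud, 0 = legitimate; scores are model outputs.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)  # 3 of 4 pairs ordered correctly
```

This pairwise version is quadratic in the number of cases, so it suits small checks; a library routine based on ranking is the usual choice for full datasets.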

Project Files

PDF – Project Summary:
Comprehensive report outlining objectives, datasets, methodology, and key findings from the Medicare Fraud Detection project.

PPTX – Presentation:
Slide deck summarizing project motivation, approach, visual insights, and results for a professional audience.

IPYNB – Code File:
Interactive Jupyter Notebook containing the full data exploration, feature engineering, and machine learning implementation for fraud detection.

Links to both the PDF and the IPYNB file are attached.

How to View and Understand the Project:

  1. Start with the PDF Summary
    Read the Project Summary PDF to understand the objectives, datasets, methodology, and key findings. This document gives you the overall context before diving into details.

  2. Explore the Jupyter Notebook (IPYNB Code File)
    Open the IPYNB file to review the full data exploration, feature engineering, and machine learning implementation. For convenience, both a live notebook link and a PDF are available.

  3. Review the Presentation Slides (PPTX)
    Finally, go through the Presentation PPTX for a concise, visual overview of the project motivation, analytical approach, and main results. It is the quickest way to get a high-level understanding.

👉 Recommended order: PDF → IPYNB → PPTX for the most complete learning experience.