Vahdet Karataş
Data & Reporting Consultant
Available for 1–2 projects / month
  • Location: Prague, Czech Republic
Tools & Focus Areas
  • Dashboards & reporting
  • Data cleaning & analysis
  • Reporting workflows
  • Data pipelines & ETL
  • ML training → serving patterns
How I can help
  • Client work: dashboards, reports, automation
  • Applied ML & interactive apps — see main site
  • Data cleaning & trustworthy spreadsheets
Recent outcomes
  • Reporting deliverables folded into recurring review cycles
  • Automated prep cut manual steps for repeat exports
  • Notebook experiments packaged as lightweight APIs for stakeholders

This page is a live churn prediction demo (IBM Telco public dataset). Inference API only — training lives in the GitHub repo.

Churn prediction demo

IBM Telco Customer Churn — trained pipeline, F1-tuned threshold at inference. Try the form below or open /docs for the JSON schema.
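For readers who would rather call the API than use the form, the request could be built roughly as follows. This is a sketch under assumptions: the /predict route, the base URL, and the field names (borrowed from the IBM Telco column names) are all hypothetical, and the real schema is the one published at /docs.

```python
import json
from urllib import request


def build_predict_request(base_url: str, customer: dict) -> request.Request:
    """Build (but do not send) a JSON POST request for one customer profile.

    The "/predict" route is a hypothetical name; check /docs for the real one.
    """
    body = json.dumps(customer).encode("utf-8")
    return request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Illustrative payload; keys mirror IBM Telco columns and are assumptions.
example_customer = {
    "tenure": 5,
    "Contract": "Month-to-month",
    "MonthlyCharges": 79.5,
    "TotalCharges": 397.5,
    "InternetService": "Fiber optic",
}

req = build_predict_request("https://example.invalid", example_customer)
# To send it: with request.urlopen(req) as resp: result = json.load(resp)
```

Building the request separately from sending it keeps the payload easy to inspect before it reaches the service.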

About this project

What it does. Contract, tenure, charges, and service flags → churn probability, binary label (F1-tuned threshold), and a risk band.
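The last step of that mapping, from probability to label and risk band, might look like the sketch below. The band edges (0.3 and 0.6), the default threshold, and the names ChurnResult and score_to_result are illustrative assumptions, not values or identifiers taken from the demo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChurnResult:
    probability: float
    label: int        # 1 = predicted churn, 0 = predicted stay
    risk_band: str    # "low", "medium", or "high"
    threshold: float  # threshold that produced the label


def score_to_result(
    probability: float,
    threshold: float = 0.5,
    band_edges: tuple = (0.3, 0.6),
) -> ChurnResult:
    """Turn a churn probability into a binary label and a risk band.

    threshold and band_edges are assumed values for illustration only.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    low_edge, high_edge = band_edges
    if probability < low_edge:
        band = "low"
    elif probability < high_edge:
        band = "medium"
    else:
        band = "high"
    label = int(probability >= threshold)
    return ChurnResult(probability, label, band, threshold)
```

Under these assumed edges, score_to_result(0.72, threshold=0.4) gives label 1 and band "high". The label and the band are computed independently, so a tuned threshold can move without redefining the bands.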

How it was built. Python: feature engineering, then logistic regression, random forest, and gradient boosting compared, with the best chosen by ROC-AUC. The deploy is inference-only and ships model/churn_model.joblib.
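The "best by ROC-AUC" selection can be illustrated without any libraries. The sketch below computes ROC-AUC from its pairwise-ranking definition and picks the top scorer. The repo presumably uses scikit-learn for this, and the candidate names and scores here are toy values, not results from the project.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the chance a random positive outscores a random negative.

    Ties count as half a win. O(n_pos * n_neg), fine for small examples.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pick_best(y_true, scores_by_model):
    """Return (best_name, {name: auc}) for predicted scores per candidate."""
    aucs = {name: roc_auc(y_true, s) for name, s in scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Toy labels and scores, for illustration only.
y = [0, 0, 1, 1]
candidates = {
    "candidate_a": [0.1, 0.4, 0.35, 0.8],
    "candidate_b": [0.1, 0.2, 0.7, 0.9],
}
best_name, auc_table = pick_best(y, candidates)
```

ROC-AUC depends only on how the scores rank customers, which is why it can select a model before any decision threshold is chosen.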

Hold-out test metrics

Best model by test ROC-AUC. F1 is reported at a 0.5 threshold; the live API uses the F1-tuned threshold.

Key findings

From EDA and modeling — notebooks in the repo.

  • Class balance. ~27% churn; stratified split.
  • Who churns. Month-to-month and shorter tenure dominate.
  • Models. Three compared; API serves ROC-AUC winner.
  • Threshold. F1-tuned on the test set — not fixed 0.5.
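The threshold point above amounts to a simple sweep: score F1 at each candidate cutoff and keep the best. The sketch below is a plain-Python version under assumptions. The grid (0.05 to 0.95 in steps of 0.01), the function names, and the toy data are illustrative and not taken from notebook 04.

```python
def f1_at(y_true, probs, threshold):
    """F1 score when predicting churn for every prob >= threshold."""
    tp = fp = fn = 0
    for y, p in zip(y_true, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_threshold(y_true, probs, grid=None):
    """Return (threshold, f1) with the highest F1; ties keep the lowest threshold."""
    if grid is None:
        grid = [i / 100 for i in range(5, 96)]  # assumed grid: 0.05 .. 0.95
    best_t, best_f1 = None, -1.0
    for t in grid:
        score = f1_at(y_true, probs, t)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


# Toy labels and probabilities, for illustration only.
y = [0, 0, 0, 0, 1, 1, 1]
probs = [0.1, 0.2, 0.3, 0.45, 0.35, 0.4, 0.8]
tuned_t, tuned_f1 = tune_threshold(y, probs)
```

On this toy data the tuned cutoff lands below 0.5 and beats the F1 at 0.5, which is the usual pattern when churners are the minority class. The same function works on a validation split if the test set should stay untouched.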
ROC-AUC and F1 by model (hold-out)
Chart missing. Run python scripts/generate_figures.py, then copy reports/figures/model_comparison.png to vercel_demo/public/model_comparison.png.
Hold-out ROC-AUC and F1 (notebooks/03_model_experiments).

Notebooks on GitHub — 01 EDA, 02 features, 03 models, 04 threshold.

Limitations

  • Dataset. Public teaching data — not production distribution.
  • Churn definition. Snapshot label; real goals may need different cutoffs.
  • Ops. No retraining, drift, or monitoring in this demo.
Full pipeline & notebooks

Training code, EDA, model comparison, threshold analysis, and FastAPI source.

Customer profile

Prediction

Churn probability
Prediction
Risk band
Threshold used
FastAPI + scikit-learn · IBM Telco · OpenAPI · GitHub