@tvnisxq (Contributor) commented Dec 31, 2025

Description

This pull request adds the SmokingPredictionModel project to the projects/prediction directory. The project predicts smoking status from health indicators and demographic data. It trains XGBoost, Random Forest, and voting-ensemble classifiers on engineered features and tunes them with RandomizedSearchCV.

Project Overview

SmokingML classifies individuals as smokers or non-smokers from biological features such as:

  • Blood sugar levels
  • Cholesterol levels
  • Creatinine levels
  • And other health indicators

The models are trained and evaluated on two datasets, the ML Olympiad dataset and the Archive dataset, with results reported separately under Performance Metrics.

Key Features

Advanced Feature Engineering

  • BMI calculation and health risk indicators
  • Cardiovascular risk assessment
  • Liver function analysis
  • Metabolic indices
  • Polynomial feature interactions
  • Ratio-based features (HDL/LDL, AST/ALT, etc.)
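
To illustrate the BMI and ratio features listed above, here is a minimal pandas sketch. The column names (`height_cm`, `weight_kg`, `hdl`, `ldl`, `ast`, `alt`) and the helper `add_engineered_features` are assumptions made for illustration; they are not taken from the project's code.

```python
import numpy as np
import pandas as pd


def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with BMI and ratio features added.

    Column names are hypothetical; adapt them to the actual dataset.
    """
    out = df.copy()
    # BMI = weight (kg) / height (m)^2
    height_m = out["height_cm"] / 100.0
    out["bmi"] = out["weight_kg"] / height_m**2
    # Ratio features; a zero denominator becomes NaN instead of inf.
    out["hdl_ldl_ratio"] = out["hdl"] / out["ldl"].replace(0, np.nan)
    out["ast_alt_ratio"] = out["ast"] / out["alt"].replace(0, np.nan)
    return out


sample = pd.DataFrame(
    {
        "height_cm": [170, 160],
        "weight_kg": [70, 64],
        "hdl": [50, 60],
        "ldl": [100, 0],
        "ast": [20, 30],
        "alt": [25, 15],
    }
)
features = add_engineered_features(sample)
```

Mapping zero denominators to NaN keeps infinities out of the feature matrix; how the resulting missing values are imputed is left open here.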

Multiple Model Implementation

  • XGBoost Classifier
  • Random Forest Classifier
  • Ensemble Voting Classifier
  • SMOTE oversampling to handle class imbalance (a preprocessing step rather than a model)
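
A soft-voting ensemble of the kind listed above could be assembled roughly as follows. This is a sketch under stated assumptions: the data is synthetic, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example depends on scikit-learn alone. The hyperparameters are arbitrary and not the project's.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the real project uses health-indicator datasets.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        # Stand-in for the XGBoost classifier named in the list above.
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",  # average predicted probabilities across models
)
ensemble.fit(X_train, y_train)
probabilities = ensemble.predict_proba(X_test)
test_accuracy = ensemble.score(X_test, y_test)
```

If SMOTE is applied, it is normally fitted on the training split only, so that synthetic samples do not leak into the evaluation data.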

Comprehensive Model Optimization

  • Hyperparameter tuning using RandomizedSearchCV
  • Custom scoring metrics
  • Cross-validation strategies
  • Feature selection with importance analysis
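
The tuning setup described above might look something like the sketch below. The search space, the choice of an F2 score as the custom metric, and the synthetic data are all assumptions for illustration; the PR does not state which scorer or parameter ranges were used.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Custom metric (an assumption): F2 weights recall more heavily than precision.
f2_scorer = make_scorer(fbeta_score, beta=2)

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": randint(20, 60),  # upper bound is exclusive
        "max_depth": randint(2, 8),
        "min_samples_leaf": randint(1, 5),
    },
    n_iter=5,  # number of sampled parameter settings
    scoring=f2_scorer,
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
    random_state=0,
)
search.fit(X, y)
best_params = search.best_params_
best_score = search.best_score_
```

Stratified folds keep the class ratio similar in every split, which matters when smokers and non-smokers are unevenly represented.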

Robust Evaluation Framework

  • Accuracy, Precision, Recall, F1-score metrics
  • ROC-AUC analysis
  • Confusion matrices
  • Feature importance visualization
  • Detailed error analysis
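
For readers unfamiliar with these metrics, the sketch below derives accuracy, precision, and recall from confusion-matrix counts and computes ROC-AUC as a ranking probability, using only the standard library. The labels and scores are toy values, and the helper names are hypothetical; the project itself presumably relies on scikit-learn's metric functions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = smoker."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_auc(y_true, scores):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, not project results.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
auc = roc_auc(y_true, scores)
```

Accuracy, precision, and recall depend on the 0.5 decision threshold, while ROC-AUC is computed from the raw scores and is threshold-independent.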

📊 Performance Metrics

ML Olympiad Dataset

  • Accuracy: 0.777
  • Precision: 0.720
  • Recall: 0.798
  • F1-Score: 0.757
  • ROC-AUC: 0.860

Archive Dataset

  • Accuracy: 0.772
  • Precision: 0.696
  • Recall: 0.677
  • F1-Score: 0.686
  • ROC-AUC: 0.863
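
Because F1 is the harmonic mean of precision and recall, the reported figures can be cross-checked. The short sketch below recomputes F1 for both datasets from the precision and recall values listed above; the helper name `f1_from` is introduced here for illustration.

```python
def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the figures above.
reported = {
    "ML Olympiad": (0.720, 0.798, 0.757),
    "Archive": (0.696, 0.677, 0.686),
}

recomputed = {
    name: round(f1_from(p, r), 3) for name, (p, r, _) in reported.items()
}
```

Both recomputed values match the reported F1 scores to three decimal places.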

🛠️ Technical Stack

  • Language: Python 3.x
  • Data Processing: Pandas 2.1.3, NumPy
  • ML Libraries: Scikit-learn 1.3.2, XGBoost 2.0.2
  • API Framework: FastAPI 0.104.1
  • Utilities: Joblib 1.3.2, Python-dotenv 1.0
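
One way to confirm that an environment matches these pins is to compare them against installed package metadata. The sketch below uses only the standard library. The `find_mismatches` helper is hypothetical, and the PyPI distribution names are assumed to be the usual lowercase ones. NumPy and Python-dotenv are left out because the list above gives no exact three-part version for them.

```python
from importlib import metadata

# Pins copied from the stack listed above.
PINS = {
    "pandas": "2.1.3",
    "scikit-learn": "1.3.2",
    "xgboost": "2.0.2",
    "fastapi": "0.104.1",
    "joblib": "1.3.2",
}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every unmet pin."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

Calling `find_mismatches(PINS)` returns an empty dict when every pinned package is installed at exactly the listed version.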

🔧 Changes Made

  • Fixed hard-coded absolute paths to use relative paths for cross-platform compatibility
  • Converted submodule to regular directory structure
  • Ensured all files follow PEP 8 guidelines
  • Added comprehensive documentation
  • Validated all dependencies and versions
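
The move from hard-coded absolute paths to relative ones can be sketched with `pathlib`. The `project_path` helper and the `data/train.csv` layout are assumptions for illustration and may not match the files in this PR.

```python
from pathlib import Path
from typing import Optional


def project_path(*parts: str, base: Optional[Path] = None) -> Path:
    """Build a path under `base` (default: this file's folder, else the CWD)."""
    if base is None:
        here = globals().get("__file__")
        base = Path(here).resolve().parent if here else Path.cwd()
    # joinpath uses the correct separator on every operating system.
    return base.joinpath(*parts)


# Hypothetical layout: <project>/data/train.csv
train_csv = project_path("data", "train.csv")
```

Anchoring paths to the script's own location, rather than to a user-specific directory, lets the same code run unchanged on Windows, macOS, and Linux.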

Checklist

[ ] Code follows PEP 8 style guidelines
[ ] Self-review completed
[ ] Comments added for complex logic
[ ] Documentation updated
[ ] No new warnings generated
[ ] Project structure follows repository conventions
[ ] Cross-platform compatibility ensured

@github-actions
Thank you for submitting your pull request! 🙌 We'll review it as soon as possible. In the meantime, please ensure that your changes align with our CONTRIBUTING.md. If there are any specific instructions or feedback regarding your PR, we'll provide them here. Thanks again for your contribution! 😊

@sanjay-kv sanjay-kv merged commit e25bed6 into recodehive:main Jan 2, 2026
2 checks passed