
Wine Quality Predictor using K-Nearest Neighbors (KNN)

Project Description

This project builds a machine learning model that predicts the quality of red wine from its physicochemical properties. By analyzing chemical components such as acidity, sugar, and alcohol content, the model learns to assign a quality score (from 3 to 8). A model like this could help winemakers monitor batches and keep product quality consistent.

🛠 Technologies Used

Python - programming language
Pandas - data manipulation
Scikit-learn - machine learning modeling
Matplotlib & Seaborn - data visualization
Streamlit - web-based deployment

Step 1: Gathering the Data

We start with a dataset containing information about various red wines, including their chemical properties (like fixed acidity, volatile acidity, citric acid, etc.) and their quality ratings. This dataset acts as our "knowledge base" for the model to learn from.
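A minimal sketch of the loading step, assuming the data is stored as a semicolon-separated CSV (the layout used by the public UCI red-wine file; the source does not say which file was used). The inline rows and the shortened column list are illustrative stand-ins so the example runs on its own, and `load_wine` is a hypothetical helper name.

```python
import io

import pandas as pd

# Tiny inline sample standing in for the real file (illustrative values,
# abbreviated column list). Assumption: semicolon-separated columns.
SAMPLE_CSV = """fixed acidity;volatile acidity;citric acid;residual sugar;pH;alcohol;quality
7.4;0.70;0.00;1.9;3.51;9.4;5
7.8;0.88;0.00;2.6;3.20;9.8;5
11.2;0.28;0.56;1.9;3.16;9.8;6
7.3;0.65;0.00;1.2;3.39;10.0;7
"""


def load_wine(source):
    """Read the wine table from a path or file-like object and sanity-check it."""
    df = pd.read_csv(source, sep=";")
    if "quality" not in df.columns:
        raise ValueError("expected a 'quality' column")
    return df


wine = load_wine(io.StringIO(SAMPLE_CSV))
print(wine.shape)               # (4, 7)
print(wine.isna().sum().sum())  # 0 missing values in this sample
```

With the real data, the same function would take a file path instead of the in-memory sample.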

Step 2: Preparing the Data for Training

Raw data isn't always ready for a machine learning model. This step involves cleaning and transforming the data so the model can understand it better.

Exploratory Data Analysis (EDA)

EDA is a crucial initial step in any machine learning project. Before building the prediction model, I thoroughly explored the dataset:

Explored distribution of features like alcohol, acidity, and sugar
Visualized correlations between chemical properties and quality ratings
Identified outliers and patterns for better preprocessing
Understood the target (quality) class imbalance

These insights guided the data cleaning and feature scaling process, improving model performance.
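The checks listed above could look roughly like this in pandas. This is a sketch on made-up rows, not the project's actual analysis; the column names and the 1.5 x IQR outlier rule are assumptions.

```python
import pandas as pd

# Illustrative rows only; column names are assumed to match the dataset.
df = pd.DataFrame({
    "alcohol":          [9.4, 9.8, 9.8, 10.0, 12.5, 11.0, 9.5, 14.9],
    "volatile acidity": [0.70, 0.88, 0.76, 0.65, 0.30, 0.40, 0.60, 0.35],
    "quality":          [5, 5, 5, 5, 7, 6, 5, 8],
})

# Class balance of the target: how many wines fall in each quality score.
class_counts = df["quality"].value_counts().sort_index()

# Linear (Pearson) correlation of each feature with quality.
corr_with_quality = df.corr()["quality"].drop("quality")

# Flag outliers in one feature with the 1.5 * IQR rule (an assumed choice).
q1, q3 = df["alcohol"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["alcohol"] < q1 - 1.5 * iqr) | (df["alcohol"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

print(class_counts)
print(corr_with_quality)
print(outliers)
```

In this toy sample, score 5 dominates the target and a single high-alcohol row is flagged, which is the kind of imbalance and outlier pattern the EDA step is meant to surface.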

Distribution of wine quality.

Histogram of all features.

Correlation matrix of features and quality.

Scatter plot.

Normal distribution.

Fixed acidity versus quality.

Data Splitting & Feature Scaling

After EDA, I split the dataset into:

Features (X): All chemical properties (like alcohol, pH, etc.)
Target (y): The wine quality rating

Then, the dataset was divided into training and testing sets (typically 80/20) to evaluate model performance on unseen data.

To keep all features on comparable scales, I applied feature scaling. This matters especially for distance-based models like K-Nearest Neighbors (KNN), where a feature with a large numeric range would otherwise dominate the distance calculation.
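A minimal sketch of the split, scaling, and KNN steps with scikit-learn. The data here is synthetic, and the choices of `StandardScaler`, `n_neighbors=5`, a stratified split, and the random seeds are assumptions; the source only states an 80/20 split and the use of feature scaling with KNN.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the wine table: 200 rows, 3 features whose
# numeric ranges differ a lot (the situation scaling is meant to fix).
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(10.0, 1.0, 200),   # mid-range feature
    rng.normal(0.5, 0.15, 200),   # small-valued feature
    rng.normal(45.0, 30.0, 200),  # large-valued feature
])
# Toy target derived from the first feature only (two classes, 5 and 6).
y = np.where(X[:, 0] > 10.0, 6, 5)

# 80/20 split; stratify keeps the class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# The pipeline fits the scaler on the training data only, so no
# information from the test set leaks into the scaling parameters.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Wrapping the scaler and the classifier in one pipeline means the same scaling is applied automatically at prediction time.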

Performance metrics.
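The source does not say which metrics were reported. As a hedged illustration, accuracy, macro-averaged F1, and a confusion matrix are common choices for an imbalanced multi-class target like wine quality. The labels below are hand-made for the example and are not results from this project.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# Hand-made labels for illustration only; not the project's predictions.
y_true = [5, 5, 5, 6, 6, 6, 7, 7]
y_pred = [5, 5, 6, 6, 6, 5, 7, 6]

# Share of wines whose predicted score matches the true score.
acc = accuracy_score(y_true, y_pred)

# Macro F1 averages per-class F1, so rare quality scores count equally.
macro_f1 = f1_score(y_true, y_pred, average="macro")

# Rows are true scores, columns are predicted scores, in the order 5, 6, 7.
cm = confusion_matrix(y_true, y_pred, labels=[5, 6, 7])

print(acc)  # 0.625
print(round(macro_f1, 3))
print(cm)
```

Macro F1 is worth reporting next to accuracy here because a model can reach high accuracy by favoring the most common quality scores.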

Interactive Sales and Revenue Dashboard

Tool: Microsoft Power BI
Category: Data Analytics & Visualization

Project Overview

This interactive dashboard was built to analyze and present sales and revenue performance data in a clear, dynamic, and business-friendly format. It allows users to explore sales trends over time, evaluate regional and product-level performance, and derive actionable insights for better decision-making.

The project simulates a real-world business case where stakeholders need an efficient tool to monitor KPIs and understand market behavior in real time.

Key Features

Real-time data filtering using slicers for product category, region, and time period
Performance comparison between products and markets
Visual storytelling through bar charts, line graphs, and KPI indicators
Calculated DAX measures to derive metrics like growth rate, total revenue, and variance
Clean data model for seamless interactivity and responsiveness
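The dashboard's measures are written in DAX, which is not shown in the source. To illustrate the arithmetic behind metrics like total revenue, growth rate, and variance, here is an equivalent sketch in pandas. The figures are made up, and "variance" is assumed to mean actual revenue minus target, which may differ from the definition used in the dashboard.

```python
import pandas as pd

# Made-up monthly figures for illustration.
sales = pd.DataFrame({
    "month":   ["2024-01", "2024-02", "2024-03"],
    "revenue": [1000.0, 1200.0, 900.0],
    "target":  [1100.0, 1100.0, 1000.0],
})

# Total revenue across the selected period.
total_revenue = sales["revenue"].sum()

# Month-over-month growth rate: (current - previous) / previous.
# The first month has no previous value, so its growth rate is NaN.
sales["growth_rate"] = sales["revenue"].pct_change()

# Variance against target (assumed definition: actual minus target).
sales["variance"] = sales["revenue"] - sales["target"]

print(total_revenue)  # 3100.0
print(sales)
```

In Power BI, slicers would change which rows feed these calculations; filtering the DataFrame before computing plays the same role here.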

Why This Project Matters

This dashboard demonstrates my ability to:

Clean and transform datasets
Build robust data models in Power BI
Use DAX for meaningful insights
Create professional-level visuals that are both functional and user-friendly

It's designed for business leaders, data analysts, and decision-makers who need clear insights from complex data.

📺 Watch the Demo

👉 Click here to watch

Airport Network Analyzer

Project Description

This project visualizes and analyzes global airport connections using network analysis, interactive maps, and routing algorithms. It helps users explore airports by country, view route connections, find shortest paths between airports, and measure the efficiency of the air transport network.
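A minimal sketch of the routing and efficiency calculations with NetworkX, under stated assumptions: the airport codes and distances are invented, routes are treated as undirected, and "efficiency" is taken to mean NetworkX's global efficiency (the average of inverse hop counts between airport pairs). The project's actual graph construction may differ.

```python
import networkx as nx

# Hypothetical routes: (origin, destination, distance in km). Invented data.
routes = [
    ("AAA", "BBB", 500),
    ("BBB", "CCC", 700),
    ("AAA", "CCC", 1500),
    ("CCC", "DDD", 300),
]

# Undirected graph with the distance stored on each edge (an assumption;
# real route data is often directional).
G = nx.Graph()
G.add_weighted_edges_from(routes, weight="distance_km")

# Shortest path by total distance (Dijkstra is used when a weight is given).
path = nx.shortest_path(G, "AAA", "DDD", weight="distance_km")
length = nx.shortest_path_length(G, "AAA", "DDD", weight="distance_km")

# Global efficiency ignores edge weights and works on hop counts.
efficiency = nx.global_efficiency(G)

print(path)    # ['AAA', 'BBB', 'CCC', 'DDD']
print(length)  # 1500
print(round(efficiency, 3))
```

Here the three-leg route (1500 km) beats the two-leg route through the direct AAA-CCC link (1800 km), showing why the weight matters: by hop count alone the two-leg route would win.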

🛠 Technologies Used

Python - core programming language
Pandas - data handling and analysis
NetworkX - graph/network algorithms
GeoPandas - spatial/geographic data
Folium - interactive maps
Streamlit - web app interface
Matplotlib & Seaborn - data visualization
Shapefiles (.shp) - country boundaries for plotting world maps