This project predicts the weather using several machine learning models (LR, SVM, KNN, NB, DTR, XGBoost, LightGBM, CatBoost, and RF) and compares which model gives the best accuracy.
The dataset used for this project contains daily weather observations for Seattle, Washington, spanning the years 2012 to 2015, inclusive (1461 days). It is provided in CSV format and can be found in the repository. The dataset includes six variables for each observation:
- Date: The date of the observation.
- Precipitation: The amount of rain or other precipitation on the given date, in millimeters (mm).
- Maximum Temperature (temp_max): The highest temperature recorded on the given date, in degrees Celsius (°C).
- Minimum Temperature (temp_min): The lowest temperature recorded on the given date, in degrees Celsius (°C).
- Average Wind Speed (wind): The average wind speed on the given date, in meters per second (m/s).
- Weather: A textual description of the weather conditions on the given date, such as sunny, cloudy, rainy, or snowy.
Number of Observations: 1461
Date Range: Daily observations from 2012 through 2015, inclusive.
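As a rough illustration of the dataset layout described above, here is a minimal sketch of loading and sanity-checking the CSV with pandas. The lowercase column names `date`, `precipitation`, and `weather`, the helper `load_weather`, and the sample rows are assumptions made for this example (only `temp_max`, `temp_min`, and `wind` are named above), so adjust them to match the actual file.

```python
import io

import pandas as pd

# Assumed column names; only temp_max, temp_min, and wind are confirmed by the README.
EXPECTED_COLUMNS = ["date", "precipitation", "temp_max", "temp_min", "wind", "weather"]


def load_weather(source):
    """Read the weather CSV (path or file-like object) and check its columns."""
    df = pd.read_csv(source, parse_dates=["date"])
    missing = set(EXPECTED_COLUMNS) - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return df[EXPECTED_COLUMNS]


# Made-up rows for illustration only; these are NOT values from the real dataset.
sample_csv = io.StringIO(
    "date,precipitation,temp_max,temp_min,wind,weather\n"
    "2012-01-01,0.0,12.8,5.0,4.7,sun\n"
    "2012-01-02,10.9,10.6,2.8,4.5,rain\n"
    "2012-01-03,0.8,11.7,7.2,2.3,rain\n"
)
sample = load_weather(sample_csv)
print(sample.dtypes)

# Four full calendar years including the 2012 leap day give 1461 days,
# assuming the data runs from 2012-01-01 through 2015-12-31.
expected_days = len(pd.date_range("2012-01-01", "2015-12-31"))
print(expected_days)
```

With the real file, passing its path to `load_weather` and comparing `len(df)` against `expected_days` is a quick way to spot missing or duplicated days.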
I applied several machine learning algorithms to this dataset and obtained the following accuracy scores on the test data:
- Logistic Regression (LR): 76.23%
- Support Vector Machine (SVM): 79.51%
- K-Nearest Neighbors (KNN): 67.76%
- Naive Bayes (NB): 84.15%
- Decision Tree Regressor (DTR): 72.40%
- XGBoost: 79.51%
- LightGBM: 79.51%
- CatBoost: 80.87%
- Random Forest (RF): 79.51%
On this test split, Naive Bayes scored highest (84.15%) and K-Nearest Neighbors lowest (67.76%), while SVM, XGBoost, LightGBM, and Random Forest tied at 79.51%. Further details and analysis can be found in the .ipynb file.
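For readers who want to try a comparison like this one, the following is a minimal sketch using scikit-learn. It is not the notebook's code: the helper `compare_models`, the split ratio, the default hyperparameters, and the synthetic data are all assumptions for illustration. It uses `DecisionTreeClassifier` because accuracy is a classification metric, and it leaves out XGBoost, LightGBM, and CatBoost to keep the example dependency-free (each of those libraries provides a scikit-learn-style classifier that could be added to the dictionary).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, test_size=0.25, seed=0):
    """Fit several classifiers on one train/test split and return test accuracies."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    models = {
        # Scale features for the models that are sensitive to feature magnitude.
        "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "NB": GaussianNB(),
        "DT": DecisionTreeClassifier(random_state=seed),
        "RF": RandomForestClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in data (NOT the Seattle dataset): four numeric features
# and a binary label, just to show the function running end to end.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

scores = compare_models(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2%}")
```

With the real data, `X` would hold the numeric columns (precipitation, temp_max, temp_min, wind) and `y` the weather labels. Scores from a single split can shift with the random seed, so cross-validation may give a steadier ranking.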
Still looking to improve the models' performance.
Contributions are welcome from anyone! <3 <3