Machine learning Codes.

This repository holds the code for different machine learning techniques learned/revisited with Siddharth Dhar (dharsiddharth7).

Setup

Clone the Repository.
Make a conda or pip virtual environment.
In the terminal with the virtual environment activated run "pip install -r requirements.txt"
Start the jupyter server.
For a particular notebook dataset links are provided (in table below) if the code does not download the dataset automatically. Download those datasets and put it in "./Datasets/{Dataset_name}/" ( Replace the space with _ in dataset name)
Run the code in the notebook.

Please note: I am using a conda environment on an Apple M1 chip based computer. I do not use the GPU. Some parts of code might need some change to run with GPU. I will try to make the notebooks machine independent but not providing any guarantee of the same.

Current status of planned work:

Topic/Technique	Dataset	Jupyter Notebook File
Linear Regression	Linear Regression	Linear_Regrression_Linear_Regrression.ipynb
Multiple Linear Regression	Startup	Startup_Multiple_Linear_Regression.ipynb
Multiple Linear Regression	Car Price Prediction	Car_Price_Prediction_Multiple_Linear_Regression.ipynb
Linear Classification	Fish Market	Fish_Market_Linear_Classification.ipynb
Linear Classification	Credit Card Fraud Detection	Not Started
Convolutional Neural Network (CNN)	MNIST	MNIST_Convolutional_Neural_Networks.ipynb
Convolutional Neural Network (CNN)	Fashion MNIST	Fashion_MNIST_Convolutional_Neural_Networks.ipynb
Convolutional Neural Network (CNN)	CIFAR10	CIFAR10_Convolutional_Neural_Networks.ipynb
Convolutional Neural Network (CNN)	A Large Scale Fish Dataset	Not Started
Recurrent Neural Networks (RNN)	Cos Function	Cos_Function_Recurrent_Neural_Networks.ipynb
Recurrent Neural Networks (RNN)	Sin Function	Sin_Function_Recurrent_Neural_Networks.ipynb
Recurrent Neural Networks (RNN)	Federal Reserve Economic Database Sale Of Bevrages	Federal_Reserve_Economic_Database_Sale_Of_Bevrages_Recurrent_Neural_Networks.ipynb
Recurrent Neural Networks (RNN)	Netflix Stock Price	Netflix_Stock_Price_Recurrent_Neural_Networks.ipynb
Natural Language Processing (NLP)	New York Times Comments	Not Started
Polynomial Regression		Not Started
Support Vector Regression (SVR)		Not Started
Decision Tree Reggresion (DTR)		Not Started
Random Forest Regression		Not Started
Logistic Regression		Not Started
K nearest neighbors		Not Started
Support Vector Machine		Not Started
Kernel SVM		Not Started
Decision Tree Classification		Not Started
Random Forest Classification		Not Started
k Means Clustering		Not Started
hierarchical Clustering		Not Started
Apriori		Not Started
Eclat		Not Started
Reinforced Learning		Not Started
Deep Learning		Not Started
Distributed Application		Not Started

My notes:

df.isnull().sum() is used to find nulls in dataset.
For linear regression number of layers required around sqrt(number of features).
For CNN you can use this formula [(W-K+2P)/S]+1 W is the input volume - in your case 128 K is the Kernel size - in your case 5 P is the padding - in your case O i believe S is the stride - which you have not provided. you can also use (W - (k-1)/2) for one max_pool
For CNN input and output channels are 1 or 3, 1 is greyscale and 3 is rgb color.
Tensor need the image in a particular format of NCHW where N is number of images being passed, C is channel (1 or 3), h is height of image and w is the width of the image.
matplotlib takes image in the format of HWC. point 4 for what each stand for.
For nn.linear always pass a list of number. You need to flatten it.
for lstm you have a hidden state and LSTM means long short term mem. you have a seq of events and one prediction out for it, consider hidden state as short term and it needs to be reset for each individual input lstm takes both h_0 and c_0 (https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html).
For future do not normalize the complete train, test and val together(Split before normalizing. this leads to data leakage as mean and st deviation is generally used for normalizing.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
Jupyter_Notebooks		Jupyter_Notebooks
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Machine learning Codes.

Setup

Current status of planned work:

My notes:

About

Uh oh!

Languages

Sripathm2/Machine_Learning_Codes

Folders and files

Latest commit

History

Repository files navigation

Machine learning Codes.

Setup

Current status of planned work:

My notes:

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Languages