3-1 Data Science course
DS1
Exercise a : Perform linear regression and plot the result.
Exercise b : Calculate the mean, variance, and standard deviation, and draw a graph.
Exercise c : Generate 10000 random integers, print the occurrence count for each value, and draw a pie chart.
Exercise d : Given an integer n, generate a random n by n matrix M, and compute the product of M and its inverse, which should be the identity matrix.
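The computational part of the four exercises above can be sketched as follows. This is a minimal illustration, not the course solution: the synthetic data, the fixed seeds, the sample list, the 0 to 9 integer range, and n = 4 are all assumptions made for the example. Plotting is left out so the block runs without a display.

```python
import random
import statistics
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility

# Exercise a: least-squares line through synthetic points (y = 2x + 1 plus noise).
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=x.size)
slope, intercept = np.polyfit(x, y, deg=1)

# Exercise b: mean, population variance, and population standard deviation.
sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
mean = statistics.mean(sample)
variance = statistics.pvariance(sample)
std_dev = statistics.pstdev(sample)

# Exercise c: count how often each value occurs among 10000 random integers.
random.seed(0)
draws = [random.randint(0, 9) for _ in range(10000)]  # range 0..9 is an assumption
counts = Counter(draws)
for value in sorted(counts):
    print(value, counts[value])

# Exercise d: a random n-by-n matrix times its inverse is numerically the identity.
n = 4
M = rng.random((n, n))
product = M @ np.linalg.inv(M)
is_identity = np.allclose(product, np.eye(n))
print("M @ inv(M) is the identity:", is_identity)
```

The product in Exercise d is compared with `np.allclose` rather than `==` because floating-point rounding leaves tiny off-diagonal values. For the graphs, matplotlib's `plt.scatter`, `plt.plot`, and `plt.pie` would be one option.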
DS2
Problem 1 :
Read the CSV dataset file
Clean the dataset and fit a linear regression
Draw a scatter plot
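The read, clean, and fit steps above can be sketched as follows. The CSV content, the column names `height` and `weight`, and the rule of dropping any record that fails to parse are assumptions for illustration; the actual course dataset is not shown in this document.

```python
import csv
import io

import numpy as np

# Hypothetical CSV content standing in for the course dataset file.
raw = """height,weight
150,50
160,58
,60
170,abc
180,74
190,82
"""

rows = []
for record in csv.DictReader(io.StringIO(raw)):
    try:
        rows.append((float(record["height"]), float(record["weight"])))
    except (TypeError, ValueError):
        # Skip records with missing or non-numeric values.
        continue

data = np.array(rows)
heights, weights = data[:, 0], data[:, 1]

# Fit weight = slope * height + intercept by least squares.
slope, intercept = np.polyfit(heights, weights, deg=1)
print(f"weight = {slope:.3f} * height + {intercept:.3f}")
```

With a real file, `io.StringIO(raw)` would be replaced by an opened file handle. The cleaned points and the fitted line could then be drawn with a plotting library such as matplotlib.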
DS3
Problem 1 : Read the Excel dataset file and compute the linear regression equation E for the input dataset D. For each record, compute e = w - w', where w' is the value predicted from h using E. Normalize the e values and decide the value a.
More program : Divide the records into two groups and repeat Problem 1.
Problem 2 : Lab1 : Ex-1-a & 2-a & Minilab2 - problem1
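The residual computation in Problem 1 can be sketched as follows. The records are made up, z-score normalization is an assumption (the text does not name the normalization scheme), and treating a as a cutoff on the normalized residual is only one possible reading of the problem statement.

```python
import numpy as np

# Hypothetical records: h is the input, w the observed value.
h = np.array([150.0, 155.0, 160.0, 165.0, 170.0, 175.0, 180.0, 185.0])
w = np.array([50.0, 54.5, 57.0, 63.0, 66.0, 69.5, 75.0, 95.0])

# Regression equation E: w' = slope * h + intercept.
slope, intercept = np.polyfit(h, w, deg=1)
w_pred = slope * h + intercept

# Residual for each record, e = w - w'.
e = w - w_pred

# Z-score normalization: an assumption, the source does not name the scheme.
z = (e - e.mean()) / e.std()

# Hypothetical cutoff a: flag records whose normalized residual exceeds it.
a = 1.5
flagged = np.abs(z) > a
print("normalized residuals:", np.round(z, 2))
print("flagged records:", np.flatnonzero(flagged))
```

For the "More program" variant, the same steps would be run separately on each of the two groups of records.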
DS4
Problem 1 : Using a linear regression model, predict and evaluate the delivery time based on distance.
Problem 2 : Using a decision tree model, predict and evaluate whether applicants will be interviewed.
Problem 3 : Using a KNN model, predict and evaluate which programming language people use.
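As an illustration of the predict-and-evaluate pattern shared by the three problems above, here is a from-scratch KNN sketch for Problem 3. It is not the course solution, which may well rely on a library such as scikit-learn. The helper `knn_predict`, the two numeric features, the labels, and k = 3 are all hypothetical.

```python
from collections import Counter

import numpy as np


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    distances = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical data: two numeric features per person and a language label.
train_X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
train_y = np.array(["Python", "Python", "Python", "Java", "Java", "Java"])

# Held-out records used only for evaluation.
test_X = np.array([[1.1, 1.0], [5.1, 5.0]])
test_y = np.array(["Python", "Java"])

predictions = np.array([knn_predict(train_X, train_y, q) for q in test_X])
accuracy = float(np.mean(predictions == test_y))
print("accuracy:", accuracy)
```

Evaluating on records that were not used for fitting is the key design choice here; the same split applies to the linear regression and decision tree problems, with an error measure such as mean squared error replacing accuracy for the regression case.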
DS-Term Project
Objective Setting : Will our child adapt well to school, as reflected in grades?
Data Curation : https://www.kaggle.com/dipam7/student-grade-prediction
This dataset covers student achievement in secondary education at two Portuguese schools.
Data Inspection : Data Number - 4000 / Feature Number - 33
Missing Data / Wrong Data / Unusable Data / Outlier Data
Data Preprocessing :
1) Cleaning dirty data - wrong data & unusable data & missing data & outlier data
2) Data normalization
3) Encoding data
4) Feature creation
5) Feature removal
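Steps 1 to 3 above can be sketched on a toy table as follows. The records, the column meanings (age, absences, school), min-max scaling, and one-hot encoding are assumptions chosen for illustration; the project may use different cleaning rules, scalers, or encoders.

```python
import numpy as np

# Hypothetical records: (age, absences, school); None marks a missing value.
records = [
    (15, 2, "GP"),
    (16, None, "MS"),
    (17, 4, "GP"),
    (18, 75, "MS"),
    (16, 6, "GP"),
]

# 1) Cleaning: drop records with missing values.
clean = [r for r in records if None not in r]

ages = np.array([r[0] for r in clean], dtype=float)
absences = np.array([r[1] for r in clean], dtype=float)
schools = [r[2] for r in clean]


# 2) Normalization: min-max scaling to the range [0, 1].
def min_max(values):
    return (values - values.min()) / (values.max() - values.min())


ages_scaled = min_max(ages)
absences_scaled = min_max(absences)

# 3) Encoding: one-hot encode the categorical column.
categories = sorted(set(schools))
one_hot = np.array([[1.0 if s == c else 0.0 for c in categories] for s in schools])

# Final feature matrix: two scaled numeric columns plus the one-hot columns.
features = np.column_stack([ages_scaled, absences_scaled, one_hot])
print(features)
```

Note that min-max scaling is sensitive to outliers such as the value 75 in this toy data, which is one reason outlier handling is listed under cleaning before normalization.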
Evaluation : Method - Random Forest / Metric - Confusion Matrix
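The confusion matrix metric can be sketched as follows. The labels are hard-coded and hypothetical; in the project they would come from the Random Forest predictions on held-out data. The helper `confusion_matrix` is written out here for illustration, and libraries such as scikit-learn provide an equivalent function.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix


# Hypothetical labels: 0 = fail, 1 = pass.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=2)
accuracy = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal
print(cm)
print("accuracy:", accuracy)
```

Reading the off-diagonal cells separately shows which kind of error dominates, which a single accuracy number hides.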