For my dissertation, I wrote and tested a natural language processing model in Python. The study comprised 80 Excel files that were cleaned, pre-processed, processed, and explored, which enabled me to build a Python model that predicts Kickstarter campaign success through sentiment analysis (machine learning) of crowdfunding project blurbs. In addition, regression analysis was used to determine which machine learning model gives the best prediction accuracy for this type of study. As a result of this study, the most successful and failed blurbs (descriptions) associated with each category on the platform were produced to provide a guideline for future fund seekers, which was unprecedented in this field of research.
The Kickstarter dataset, covering one year's worth of reward-based crowdfunding projects (June 2019 to June 2020), was collected from the Web Robots website and stored in 80 separate CSV files. The files were all stored in the same directory so that a Python function could read them all in one pass instead of analysing each one separately, which would have been impractical. The dataset contains 15 unique project categories: art, crafts, comics, dance, design, fashion, film & video, food, games, journalism, music, photography, publishing, technology, and theatre. It also contains 147 unique sub-categories. In total, the data frame contains 287,135 Kickstarter projects launched from 23 different countries.
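The directory-wide read described above could be sketched roughly as follows. This is a minimal illustration and not the code used in the study: it assumes pandas is available, and the function name load_kickstarter_csvs and the example path are hypothetical.

```python
from pathlib import Path

import pandas as pd


def load_kickstarter_csvs(directory: str) -> pd.DataFrame:
    """Read every .csv file in `directory` into a single data frame.

    Hypothetical helper: the name and the *.csv pattern are assumptions.
    """
    csv_paths = sorted(Path(directory).glob("*.csv"))
    if not csv_paths:
        raise FileNotFoundError(f"No CSV files found in {directory!r}")
    # One data frame per file, then stack them row-wise.
    frames = [pd.read_csv(path) for path in csv_paths]
    # ignore_index=True renumbers the rows 0..n-1 across all files.
    return pd.concat(frames, ignore_index=True)


# Example call (the path is a placeholder, not the study's directory):
# projects = load_kickstarter_csvs("data/kickstarter")
```

Sorting the paths keeps the row order reproducible between runs, since the order returned by glob is not guaranteed.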
The sentiment scores are generated from the Kickstarter dataset by comparing two columns: state and blurb.
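To make the state and blurb comparison concrete, here is a toy sketch. The text does not say which sentiment tool was used, so the tiny word lists, the scoring rule, the function names, and the sample rows below are all assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon; a real study would use an established sentiment tool.
POSITIVE = {"amazing", "beautiful", "best", "fun", "love"}
NEGATIVE = {"bad", "boring", "hard", "problem", "worst"}


def sentiment_score(blurb: str) -> float:
    """Return (positive words - negative words) / total words, in [-1, 1]."""
    words = [w.strip(".,!?;:\"'()").lower() for w in blurb.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    positive = sum(w in POSITIVE for w in words)
    negative = sum(w in NEGATIVE for w in words)
    return (positive - negative) / len(words)


def mean_score_by_state(rows):
    """Average the blurb score for each distinct value of `state`."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["state"]].append(sentiment_score(row["blurb"]))
    return {state: mean(values) for state, values in scores.items()}


# Made-up sample rows, not taken from the dataset.
sample = [
    {"blurb": "A fun, beautiful card game", "state": "successful"},
    {"blurb": "Solving a hard problem", "state": "failed"},
]
print(mean_score_by_state(sample))
```

Grouping the scores by state is only one simple way to set the two columns side by side; it is not a claim about how the study's model was trained.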