Explain Knowledge of Univariate Predictive Analysis to Address a Problem Statement Assignment

  1. Perform research on the different methods used for predictive analytics on univariate data. Provide a detailed narrative on the general purpose of each method and the techniques used to obtain the insights desired by the data science professional.
  2. Develop a problem statement for the dataset provided. Create one descriptive and one predictive analytics research question. Assume the goal is to perform binary classification and that you need to determine which machine learning models perform the best for univariate predictive modeling (Support Vector Machine, Random Forest, and Naïve Bayes models) and the specific target variable. Justify the use of your approach based on the problem statement. Clearly show the method used to determine feature importance and the testing of each selected univariate explanatory variable as a predictor of the target variable. Based on the quantitative results, make a recommendation for using a specific feature as the explanatory univariate data for predictive modeling.
  3. Using Google Colab/Python, write the necessary code to create each of the three predictive analytics models and run the analysis to train and test each model (in a Notebook (ipynb) file) using the given dataset (cleaned). Create appropriate annotated data visualizations that support the communication of the results to both technical and non-technical audiences.
  4. Interpret the statistical output by providing annotated screenshots and a narrative detailing the quantitative conclusions derived from each step of the analysis. Specifically, provide a comparison of performance using appropriate metrics for each. Based on the quantitative results, make a recommendation for using a specific model. Verify that the narrative supports both technical and non-technical audiences.
  5. Provide a detailed summative conclusion of both the statistical insights and the applied actionable insights that lead to solving the prediction problem. Provide support for your conclusions while meeting the needs of both technical and non-technical audiences. Justify your assumptions using references.
  6. Based on the performance metrics for each machine learning model tested, make a recommendation for which model and related method is best suited for the dataset and research question from the problem statement.
  7. Perform research into how predictive analytics models are placed into production. Assume that you need to support a healthcare-related use case and the handling of personally identifiable information (PII) of patients. Determine the most appropriate handling of data sources based on legitimacy, ethical access, and authority. Make some recommendations for handling the data based on best practices, scholarly sources, and industry thought leaders.

Length: 8-11 page paper, not including title and reference pages, plus the Python Notebook (ipynb file)
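
As a rough illustration of the workflow the steps above describe (rank features, test each one alone as a predictor, and compare the three model types), here is a minimal Python sketch using scikit-learn. It is not the assignment's solution: the data is a synthetic stand-in generated with `make_classification`, the names `feature_0` to `feature_3` are hypothetical, and the choice of random forest impurity importance for ranking and of accuracy and F1 as metrics are assumptions that should be justified against the real dataset and problem statement.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the cleaned assignment dataset (assumption:
# numeric features and a binary target; replace with the real data).
X, y = make_classification(
    n_samples=400, n_features=4, n_informative=2, n_redundant=0,
    random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def make_models():
    """Return fresh, unfitted instances of the three model types."""
    return {
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
        "Naive Bayes": GaussianNB(),
    }


# Step 1: rank features with random forest impurity importance,
# fitted on the training split only to avoid leaking test data.
ranker = RandomForestClassifier(n_estimators=100, random_state=0)
ranker.fit(X_train, y_train)
importances = dict(zip(feature_names, ranker.feature_importances_))

# Step 2: train and test every model on each feature by itself.
results = []
for j, name in enumerate(feature_names):
    X_train_1d = X_train[:, [j]]  # keep 2-D shape: (n_samples, 1)
    X_test_1d = X_test[:, [j]]
    for model_name, model in make_models().items():
        model.fit(X_train_1d, y_train)
        pred = model.predict(X_test_1d)
        results.append({
            "feature": name,
            "model": model_name,
            "accuracy": accuracy_score(y_test, pred),
            "f1": f1_score(y_test, pred, zero_division=0),
        })

# Step 3: report, best accuracy first.
for row in sorted(results, key=lambda r: r["accuracy"], reverse=True):
    print(f"{row['feature']:<10} {row['model']:<14} "
          f"importance={importances[row['feature']]:.3f} "
          f"accuracy={row['accuracy']:.3f} f1={row['f1']:.3f}")
```

A single train/test split is used here for brevity; cross-validation would likely give more stable comparisons on a small dataset, and metrics such as precision, recall, or ROC AUC may matter more than accuracy if the target classes are imbalanced.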