Any topic (writer’s choice)

The Hepatitis.arff data set contains information about patients affected by hepatitis. The task is to generate a classification model to predict hepatitis histology: Yes/No.

Submit a report based on the answers for the following questions:

a)    Select a suitable decision tree model for predicting Histology.
–    Which model evaluation method did you use (CV: cross-validation, H-O: hold-out)? Provide an overview of this model. Why was it preferred?
–    Interpret the classification outputs: the tree topology and the accuracy rates.
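
To make the difference between the two evaluation methods concrete, here is a minimal Python sketch of how hold-out and k-fold cross-validation partition the rows. It is an illustration only, not the procedure any particular tool uses: the function names, the row count of 155, the 34% test fraction, and the seed are all assumptions chosen for the example.

```python
import random

def hold_out_split(n_rows, test_fraction=0.34, seed=1):
    """Shuffle the row indices once and split them into (train, test)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_splits(n_rows, k=10, seed=1):
    """Yield (train, test) index lists; every row is tested exactly once."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

N_ROWS = 155  # assumed row count, for illustration only

train_rows, test_rows = hold_out_split(N_ROWS)
folds = list(k_fold_splits(N_ROWS, k=10))

print(len(train_rows), len(test_rows))  # one train/test partition
print(len(folds))                       # ten train/test partitions
```

Hold-out gives one accuracy estimate that depends on a single random split. Cross-validation averages over k splits, which usually matters more when the data set is small.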

b)    Provide a detailed description of the classification model:
–    The tree induction algorithm
–    The attributes selection criteria.
–    The pruning method
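
As an illustration of an attribute selection criterion, the sketch below computes entropy, information gain, and gain ratio for a nominal attribute. It assumes a C4.5-style tree is being described, which the brief does not state explicitly. The attribute values and labels are invented toy data, not rows from the Hepatitis file.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of nominal values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())

def gain_ratio(attribute, labels):
    """Information gain of splitting on `attribute`, divided by its split information."""
    total = len(labels)
    partitions = {}
    for value, label in zip(attribute, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(attribute)
    return info_gain / split_info if split_info > 0 else 0.0

# Invented toy data: four patients, two candidate attributes.
histology = ["yes", "yes", "no", "no"]
ascites = ["yes", "yes", "no", "no"]  # separates the classes perfectly
sex = ["m", "f", "m", "f"]            # carries no information about the class

ratio_ascites = gain_ratio(ascites, histology)
ratio_sex = gain_ratio(sex, histology)
print(ratio_ascites, ratio_sex)
```

The attribute with the highest gain ratio would be chosen for the split at that node. Numeric attributes need an extra step (choosing a threshold) that this sketch leaves out.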

c)    Vary the model parameters and discuss the impact on the classification results:
–    Set the REP parameter (Reduced Error Pruning) to TRUE. Explain this tree pruning method. What impact did it have on the outputs, and why?
–    Set the parameter unpruned to TRUE. Report and explain any change in the accuracy of the results and in the tree structure.
–    Change the confidence factor to 15%. Report the impact on the classification outputs and explain the causes of the change.
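
One way to reason about the confidence factor is through the pessimistic error estimate used in confidence-based pruning. The sketch below uses the textbook normal approximation to the upper confidence bound on a leaf's error rate. It is an assumption that this matches the tool in use; actual implementations may compute the bound differently. The leaf counts (2 errors out of 6 cases) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def pessimistic_error(errors, n, confidence=0.25):
    """Upper confidence bound on the true error rate of a leaf
    that misclassifies `errors` of its `n` training cases."""
    f = errors / n                             # observed error rate
    z = NormalDist().inv_cdf(1 - confidence)   # smaller confidence -> larger z
    numerator = f + z * z / (2 * n) + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))
    return numerator / (1 + z * z / n)

# Hypothetical leaf: 2 errors in 6 training cases.
observed = 2 / 6
bound_25 = pessimistic_error(2, 6, confidence=0.25)
bound_15 = pessimistic_error(2, 6, confidence=0.15)
print(observed, bound_25, bound_15)
```

Lowering the confidence factor widens the bound, so each subtree looks worse relative to a single replacement leaf. The expected effect is more aggressive pruning and a smaller tree.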

d)    Visualise the tree and generate a set of rules along the subtree path: Varices – Ascites Spiders Bilirubin Sex Class No. If you were to generate association rules from the tree, how could you reduce the number of rules? (Hint: speculate about support and confidence.)
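
The hint about support and confidence can be explored with a small sketch like the one below, which scores each rule against the data and keeps only those above chosen thresholds. The rows, the two rules, and the thresholds are all hypothetical; they do not come from the Hepatitis file or from any generated tree.

```python
def rule_stats(rows, conditions, target, target_value):
    """Return (support, confidence) of: IF conditions THEN target = target_value."""
    covered = [r for r in rows if all(r[a] == v for a, v in conditions.items())]
    correct = [r for r in covered if r[target] == target_value]
    support = len(correct) / len(rows)
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence

# Hypothetical rows, for illustration only.
rows = [
    {"varices": "no", "ascites": "no", "histology": "no"},
    {"varices": "no", "ascites": "no", "histology": "no"},
    {"varices": "no", "ascites": "yes", "histology": "yes"},
    {"varices": "yes", "ascites": "yes", "histology": "yes"},
    {"varices": "no", "ascites": "no", "histology": "yes"},
]

# Hypothetical rules: name -> (conditions, predicted histology value).
rules = {
    "A": ({"varices": "no", "ascites": "no"}, "no"),
    "B": ({"varices": "yes", "ascites": "yes"}, "yes"),
}

MIN_SUPPORT, MIN_CONFIDENCE = 0.3, 0.6  # assumed thresholds

stats = {name: rule_stats(rows, cond, "histology", value)
         for name, (cond, value) in rules.items()}
kept = [name for name, (sup, conf) in stats.items()
        if sup >= MIN_SUPPORT and conf >= MIN_CONFIDENCE]
print(stats, kept)
```

Raising either threshold discards rules that cover few cases or that are often wrong, which is one way to shrink a rule set.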

e)    Perform predictions using two other classification models of your choice, e.g. ANN, SVM, or an ensemble learner. Report the accuracy metrics and discuss the superiority or inferiority of these models' performance compared to the decision tree.

f)    Create ROC and Lift charts and interpret them.
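
For interpreting the charts, it can help to see how the underlying numbers are derived from a classifier's scores. The sketch below builds ROC points, the area under the curve, and the lift for the top-scored fraction of cases. The scores and labels are invented, the function names are hypothetical, and the ROC helper assumes no tied scores.

```python
def roc_points(scores, labels):
    """(FPR, TPR) points obtained by lowering the threshold one case at a time."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

def lift(scores, labels, fraction):
    """Positive rate in the top-scored fraction divided by the overall positive rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_top = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    return top_rate / (sum(labels) / len(labels))

# Invented scores for the positive class (1 = histology "yes").
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
top_half_lift = lift(scores, labels, fraction=0.5)
print(points, area, top_half_lift)
```

An area close to 1 means positives are ranked above negatives almost everywhere, while 0.5 corresponds to random ranking. A lift above 1 for a given fraction means that targeting the top-scored cases finds more positives than random selection would.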
