Download the credit_Dataset.arff dataset and load it into Weka.

(5 Points) When presented with a dataset, it is usually a good idea to visualise it first. Click on any of the scatter plots to open a new window which shows the scatter plot for two selected attributes. Try visualising a scatter plot of age and duration. You can click on any data point to display all its values. Do you notice anything unusual?

(5 Points) In the previous point you should have found a data point which seems to be corrupted, as some of its values are nonsensical. Even a single point like this can significantly affect the performance of a classifier. A good way to check this is to test the performance of each classifier before and after removing this data point. How do you think it would affect decision trees?

(10 Points) To remove this instance from the dataset we will use a filter. We want to remove all instances where the age of an applicant is lower than 0 years, as this suggests that the instance is corrupted. In the Preprocess tab, click on Choose in the Filter pane. Select filters > unsupervised > instance > RemoveWithValues. Click on the text of this filter to change the parameters. Set the attribute index to 13 (Age) and set the split point to 0, so that instances with an age below 0 are removed.
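To make the effect of RemoveWithValues with a split point of 0 concrete, here is a minimal sketch in plain Python. It does not use Weka's API; the function name `remove_below_split` and the toy rows are invented for illustration and are not taken from credit_Dataset.arff. It only mirrors the idea of dropping every instance whose numeric attribute falls below the split point.

```python
def remove_below_split(rows, attribute, split_point):
    """Keep only rows whose numeric `attribute` is at or above `split_point`.

    Hypothetical helper: it imitates the behaviour described for the
    RemoveWithValues filter, but it is not part of Weka.
    """
    return [row for row in rows if row[attribute] >= split_point]


# Toy rows, invented for illustration (not real values from the dataset).
applicants = [
    {"age": 35, "duration": 12},
    {"age": -293, "duration": 24},  # nonsensical age, treated as corrupted
    {"age": 52, "duration": 6},
]

# Split point 0 on "age": every instance with age < 0 is dropped.
cleaned = remove_below_split(applicants, "age", 0)

print("instances before:", len(applicants))
print("instances after: ", len(cleaned))
```

Comparing the instance counts before and after is a quick sanity check that only the corrupted instance was removed before re-running the classifiers.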