The X axis of a lift chart shows


A) number of actual Class 1 records identified.
B) ratio of decile mean to overall mean.
C) the number of actual Class 1 records.
D) the ratio of the overall mean to the decile mean.

E) A) and B)
F) A) and C)

Separate error rates with respect to the false negative and false positive cases are computed to take into account the


A) asymmetric costs in misclassification.
B) symmetric weights of these two cases.
C) distortions due to outliers.
D) effect of sampling error.

E) C) and D)
F) B) and D)
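As a rough illustration of computing separate error rates for the two kinds of misclassification, here is a minimal Python sketch. The function name `class_error_rates` and all counts are hypothetical, chosen only for illustration; they do not come from the question.

```python
def class_error_rates(tp, fn, fp, tn):
    """Return (Class 1 error rate, Class 0 error rate) from confusion-matrix counts."""
    # Share of actual Class 1 records that were predicted as Class 0 (false negatives).
    false_negative_rate = fn / (tp + fn)
    # Share of actual Class 0 records that were predicted as Class 1 (false positives).
    false_positive_rate = fp / (fp + tn)
    return false_negative_rate, false_positive_rate


# Hypothetical counts, for illustration only.
fn_rate, fp_rate = class_error_rates(tp=40, fn=10, fp=30, tn=920)
overall = (10 + 30) / (40 + 10 + 30 + 920)

print(f"Class 1 error rate: {fn_rate:.3f}")
print(f"Class 0 error rate: {fp_rate:.3f}")
print(f"Overall error rate: {overall:.3f}")
```

With these made-up counts, an overall error rate of 4% sits alongside a 20% error rate on Class 1, which is the kind of detail a single overall figure can hide.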

____________ is a category of data-mining techniques in which an algorithm learns how to predict or classify an outcome variable of interest.


A) Supervised Learning
B) Unsupervised Learning
C) Dimension Reduction
D) Data Sampling

E) A) and C)
F) All of the above

Data-mining methods for predicting an outcome based on a set of input variables are referred to as


A) supervised learning.
B) unsupervised learning.
C) dimension reduction.
D) data sampling.

E) A) and D)
F) A) and C)

____________ is a method of extracting data relevant to the business problem under consideration. It is the first step in the data-mining process.


A) Data sampling
B) Data partitioning
C) Model construction
D) Model assessment

E) All of the above
F) A) and C)

Applying descriptive statistics and data visualization to the training set to understand the data and assist in the selection of an appropriate technique is a part of


A) data exploration.
B) data partitioning.
C) data preparation.
D) model assessment.

E) B) and D)
F) All of the above

Correct Answer

A

_____ refers to the scenario in which the analyst builds a model that does a great job of explaining the sample of data on which it is based but fails to accurately predict outside the sample data.


A) Underfitting
B) Overfitting
C) Oversampling
D) Undersampling

E) A) and C)
F) A) and D)

______________ is NOT a step of the data-mining process.


A) Data sampling
B) Data partitioning
C) Model construction
D) Supervised learning

E) A) and B)
F) A) and C)

______________ involves descriptive statistics, data visualization, and clustering.


A) Data exploration
B) Data partitioning
C) Data preparation
D) Model assessment

E) B) and C)
F) All of the above

Correct Answer

A

Given the following classification confusion matrix, what is the overall error rate?

\begin{array}{|l|c|c|}
\hline
\text{Classification Confusion Matrix} \\
\hline
 & \text{Predicted Class} \\
\hline
\text{Actual Class} & 1 & 0 \\
\hline
1 & 224 & 85 \\
\hline
0 & 28 & 3,258 \\
\hline
\end{array}
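The arithmetic behind this question can be sketched in Python. This is a minimal example, assuming the overall error rate is defined as misclassified records divided by all records; the dictionary layout and the function name `overall_error_rate` are illustrative choices, not part of the original question.

```python
# Counts taken from the confusion matrix in the question above.
# Keys are (actual class, predicted class).
confusion = {
    (1, 1): 224,   # actual 1, predicted 1
    (1, 0): 85,    # actual 1, predicted 0
    (0, 1): 28,    # actual 0, predicted 1
    (0, 0): 3258,  # actual 0, predicted 0
}


def overall_error_rate(matrix):
    """Misclassified records divided by all records."""
    total = sum(matrix.values())
    misclassified = sum(
        count for (actual, predicted), count in matrix.items() if actual != predicted
    )
    return misclassified / total


rate = overall_error_rate(confusion)
print(f"{rate:.4f}")  # prints 0.0314
```

The off-diagonal cells (85 and 28) are the misclassified records, so the rate is 113 out of 3,595.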

Misclassifying an actual ______ observation as a(n) ______ observation is known as a false positive.


A) Class 0, Class 1
B) Class 1, Class 0
C) error, accuracy
D) false, true

E) All of the above
F) A) and D)

The test set is the data set used to


A) build the data mining model.
B) estimate accuracy of candidate models on unseen data.
C) estimate accuracy of final model on unseen data.
D) show counts of actual versus predicted class values.

E) A) and D)
F) B) and C)

A _____ classifies a categorical outcome variable by splitting observations into groups via a sequence of hierarchical rules.


A) regression tree
B) scatter chart
C) classification tree
D) classification confusion matrix

E) A) and D)
F) A) and C)

A(n) _______________ is often displayed as a row of values in a spreadsheet or database in which the columns correspond to the variables.


A) record
B) data point
C) classification
D) location

E) B) and D)
F) B) and C)

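_______ compares the number of actual Class 1 observations identified when observations are considered in decreasing order of their estimated probability of being in Class 1 with the number of actual Class 1 observations identified if observations are randomly selected.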


A) Cumulative lift
B) Classification confusion
C) Decile-wise lift chart
D) ROC curve

E) B) and D)
F) C) and D)
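The comparison described in this question can be sketched in Python. This is only a sketch under stated assumptions: the function name `cumulative_lift_points` and the toy labels and probabilities are hypothetical, and the random-selection baseline is taken to be the expected number of Class 1 observations in a random sample of the same size.

```python
def cumulative_lift_points(actual, prob):
    """Return (n, Class 1 found in top n, expected Class 1 in n random picks)."""
    # Rank observations by estimated probability of Class 1, highest first.
    order = sorted(range(len(actual)), key=lambda i: prob[i], reverse=True)
    total_class1 = sum(actual)
    points = []
    found = 0
    for n, i in enumerate(order, start=1):
        found += actual[i]
        # Expected Class 1 count if n observations were picked at random.
        baseline = n * total_class1 / len(actual)
        points.append((n, found, baseline))
    return points


# Hypothetical toy data: 1 = actual Class 1, 0 = actual Class 0.
actual = [1, 0, 1, 0, 0, 1, 0, 0]
prob = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]

points = cumulative_lift_points(actual, prob)
for n, found, baseline in points:
    print(n, found, baseline)
```

In this toy data the top four ranked observations contain all three Class 1 records, while four random picks would be expected to contain 1.5.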

Estimation methods are also referred to as


A) prediction methods.
B) clustering methods.
C) association methods.
D) supervised methods.

E) All of the above
F) A) and D)

An observation classified as part of a group with a characteristic when it actually does not have the characteristic is termed a(n)


A) false negative.
B) false positive.
C) residual.
D) outlier.

E) All of the above
F) A) and B)

___________ is a generalization of linear regression for predicting a categorical outcome variable.


A) Multiple linear regression
B) Logistic regression
C) Discriminant analysis
D) Cluster analysis

E) B) and D)
F) B) and C)

The percentage of misclassified records out of the total records in the validation data is known as the


A) overall error rate.
B) error.
C) accuracy.
D) class.

E) None of the above
F) C) and D)

Correct Answer

A

Given the following classification confusion matrix, what is the accuracy?

\begin{array}{|l|c|c|}
\hline
\text{Classification Confusion Matrix} \\
\hline
 & \text{Predicted Class} \\
\hline
\text{Actual Class} & 1 & 0 \\
\hline
1 & 224 & 85 \\
\hline
0 & 28 & 3,258 \\
\hline
\end{array}
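A worked version of the calculation, assuming accuracy is defined as the share of correctly classified records (the diagonal cells of the matrix divided by the total number of records):

```latex
\text{accuracy}
  = \frac{224 + 3{,}258}{224 + 85 + 28 + 3{,}258}
  = \frac{3{,}482}{3{,}595}
  \approx 0.9686
```

Under that definition, accuracy is the complement of the overall error rate, so the two always sum to 1.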
