
JNTU Kakinada B-Tech 3-2 RT32052 I DATA WARE HOUSING AND MINING R13 April 2018 Question Paper

Code No: RT32052
III B. Tech II Semester Regular/Supplementary Examinations, April -2018
DATA WARE HOUSING AND MINING
(Common to Computer Science Engineering and Information Technology)
Time: 3 hours Max. Marks: 70
Note: 1. Question Paper consists of two parts (Part-A and Part-B)
2. Answering the question in Part-A is compulsory
3. Answer any THREE Questions from Part-B
*****
PART -A
1 a) Why is data mining required? [3M]
b) With an example, justify the need for data integration. [4M]
c) Compare and contrast ROLAP versus MOLAP. [3M]
d) Justify the need for attribute splitting rules. Where are they used? [4M]
e) What is pruning? Why is support-based pruning required? [4M]
f) Why is clustering called unsupervised classification? [4M]
PART -B
2 a) What is the difference between discrimination and classification? Between
characterization and clustering? Between classification and prediction? For each of
these pairs of tasks, how are they similar? [8M]
b) Briefly describe data mining functionalities. [8M]
3 a) What is preprocessing? Why do we need to preprocess the data? Briefly describe the
forms of data preprocessing. [8M]
b) What is data reduction? Describe the strategies for data reduction. [8M]
4 a) Briefly describe the available processes for data cube materialization. [8M]
b) With an example, describe efficient data cube computation. [8M]
5 a) What is an attribute selection measure? Briefly describe the attribute selection
measures for decision tree induction. [8M]
b) With an example, describe classification by decision tree induction. [8M]
6 a) Consider the following set of frequent 3-itemsets:
{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 3, 4}, {1, 3, 5}, {2, 3, 4}, {2, 3, 5}, {3, 4, 5}.
Assume that there are only five items in the data set.
i) List all candidate 4-itemsets obtained by the candidate generation procedure in
Apriori.
ii) List all candidate 4-itemsets obtained by a candidate generation procedure using
the $F_{k-1} \times F_1$ merging strategy. [8M]
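A minimal Python sketch of the two candidate-generation strategies named in question 6 a), applied to the frequent 3-itemsets listed there. The function names (`join_fk1_fk1`, `join_fk1_f1`, `prune`) are illustrative and not taken from the paper. The sketch assumes itemsets are stored as sorted tuples and that every item occurring in a frequent 3-itemset is itself frequent, which follows from the anti-monotone property of support. It is offered as a study aid, not as an official answer key.

```python
from itertools import combinations

# Frequent 3-itemsets copied from question 6 a); the items are 1..5.
F3 = [(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4),
      (1, 3, 5), (2, 3, 4), (2, 3, 5), (3, 4, 5)]


def join_fk1_fk1(frequent):
    """Merge pairs of sorted (k-1)-itemsets that share their first k-2 items."""
    out = set()
    for a, b in combinations(sorted(frequent), 2):
        if a[:-1] == b[:-1]:
            out.add(a + (b[-1],))
    return sorted(out)


def join_fk1_f1(frequent, items):
    """Extend each (k-1)-itemset with a frequent item larger than its last item."""
    out = set()
    for a in frequent:
        for i in items:
            if i > a[-1]:
                out.add(a + (i,))
    return sorted(out)


def prune(candidates, frequent):
    """Keep only candidates whose (k-1)-subsets are all frequent."""
    known = set(frequent)
    return [c for c in candidates
            if all(s in known for s in combinations(c, len(c) - 1))]


# Assumption: each item seen in a frequent 3-itemset is a frequent 1-itemset.
items = sorted({i for s in F3 for i in s})

c_join = join_fk1_fk1(F3)
c_ext = join_fk1_f1(F3, items)

print("prefix-join candidates:   ", c_join)
print("extend-by-item candidates:", c_ext)
print("after subset pruning:     ", prune(c_join, F3))
```

The pruning step is kept as a separate function so that the raw candidate lists from each strategy can be compared before and after subset-based pruning.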
b) Briefly describe the Apriori algorithm for frequent itemset generation. [8M]
7 a) How are empty clusters and outliers handled in k-means clustering? [8M]
b) Compare and contrast k-means clustering vs. hierarchical clustering. [8M]
*****
R13
SET – 1
Code No: RT32052
III B. Tech II Semester Regular/Supplementary Examinations, April -2018
DATA WARE HOUSING AND MINING
(Common to Computer Science Engineering and Information Technology)
Time: 3 hours Max. Marks: 70
Note: 1. Question Paper consists of two parts (Part-A and Part-B)
2. Answering the question in Part-A is compulsory
3. Answer any THREE Questions from Part-B
*****
PART -A
1 a) What are the challenges of data mining? [3M]
b) Justify the need for data reduction. [4M]
c) Briefly describe the key features of a data warehouse. [3M]
d) How is entropy used in classification? [4M]
e) Why is confidence-based pruning required? [4M]
f) Would the cosine measure be the appropriate similarity measure to use with
k-means clustering for time series data? Why or why not? [4M]
PART -B
2 a) What are the major challenges of mining a huge amount of data (such as
billions of tuples) in comparison with mining a small amount of data (such as a
data set of a few hundred tuples)? [8M]
b) Describe the differences between operational database systems and data
warehouses. [8M]
3 a) What is descriptive data summarization? Why is descriptive data summarization
used? What is dispersion? Describe the measures used to quantify the dispersion
of data. [8M]
b) What is attribute subset selection? Describe heuristic methods of attribute
subset selection. [8M]
4 a) Describe the various schemas used to design a multidimensional data model. [8M]
b) With an example, describe indexing OLAP data using bitmap indices. [8M]
5 a) Briefly describe the measures for selecting the best split. [6M]
b) What is cross validation? With an example, describe how cross validation can
be used for evaluating the performance of a classification model. [10M]
R13
SET – 2

6 a) Consider the market basket transactions shown in the above table:
i) What is the maximum number of association rules that can be extracted from
this data (including rules that have zero support)?
ii) What is the maximum size of frequent itemsets that can be extracted (assuming minsup > 0)? [8M]
b) Briefly describe the factors that can affect the computational complexity of
the Apriori algorithm. [8M]
7 a) For your own data, describe the step-by-step process of bisecting k-means
clustering. In what way is bisecting k-means clustering different from basic
k-means clustering? [8M]
b) Compare and contrast DBSCAN clustering vs. hierarchical clustering. [8M]

*****

R13
SET – 3

6 a) Consider the market basket transactions shown in the above table:
i) Write an expression for the maximum number of size-3 itemsets that
can be derived from this data set.
ii) Find an itemset (of size 2 or larger) that has the largest support. [8M]
b) Briefly describe the ways to reduce the computational complexity of frequent
itemset generation. [4M]
c) What is candidate generation? List the requirements for an effective candidate
generation procedure. [4M]
7 a) For suitable data, describe the step-by-step process of k-means clustering. [8M]
b) What is DBSCAN? In which situations would you suggest using DBSCAN
clustering? [8M]

*****

R13
SET – 4

6 a) Consider the market basket transactions shown in the above table:
i) What is the maximum size of frequent itemsets that can be extracted (assuming minsup > 0)?
ii) Find a pair of items, a and b, such that the rules {a} → {b} and {b} → {a}
have the same confidence. [8M]
b) Briefly describe the relation among frequent, maximal frequent and closed
frequent itemsets. [8M]
7 a) Highlight the strengths and weaknesses of the k-means clustering algorithm. [8M]
b) With an example, briefly describe the construction of dendrograms. [8M]

*****


JNTUK-R13-April-2018-3-2-Data-Warehousing-Mining-Common-to-CSE-and-IT.pdf

Team FirstRanker.in
