Exam text content

DATA.ML.320 Knowledge Mining and Big Data - 08.05.2025

The text is generated with Optical Image Recognition from the original exam file, and it can therefore contain erroneous or incomplete information. For example, mathematical symbols cannot be rendered correctly. The text is mainly used for generating search results.

Original exam
Tampere University
Faculty of Information Technology and Communication Sciences

DATA.ML.320 Knowledge Mining and Big Data (5 cr)
Exam 8.5.2025 / Ari Visa

Please do not forget the Kaiku feedback!

 

1. Define the terms Data Warehouse, Data Lake, and Knowledge Mining. (6p)

An example of a definition:

Data: Facts and things certainly known. Data are any
facts, numbers, or text that can be processed by a
computer.

 

2. Why do you use HADOOP? Are there alternatives to HADOOP? What is HADOOP? How do you use HADOOP when solving a clustering problem? What is the relation between cloud computing and HADOOP? (6p)

 

3. What do you know about associative analysis? (6p)

 

4. What is the difference between classification and prediction? You have only 2 labeled samples of type X = (x1, x2, x3)^T, consisting of real numbers. You should make a predictive model. What kind of model do you use? Motivate your answer! What is the robustness of your solution? (6p)

 

 

5. How do you define cluster analysis? How can you estimate the number of clusters? Motivate your answer. (6p)

You have 1T (= 10^12) samples of high-dimensional data (dimension > 100) available. What kind of clustering method do you use? Motivate your answer! How does computer architecture influence your proposal?

 

 

 

 

