Custom Python assignment | Advanced Databases and Applications (CP5520)

This Python assignment is on privacy-preserving data mining.
Some Possible Topics for Minor Research
Advanced Databases and Applications (CP5520)
1. Privacy Preserving Data Mining
Two parties owning confidential databases wish to run a data mining algorithm on the union
of their databases, without revealing any unnecessary information. This problem has many
practical and important applications, such as in medical research with confidential patient
records.
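One building block often used for this kind of problem is additive secret sharing. The sketch below is a toy illustration only, not a secure protocol: the three compute servers, the modulus, and the hospital counts are all assumptions made for the example. Each party splits its private count into random shares, and only the combined total is ever reconstructed.

```python
import secrets

# Assumed public modulus; it must be larger than any total that can occur.
MODULUS = 2**61 - 1


def make_shares(value, n_shares):
    """Split value into n_shares numbers that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values, n_servers=3):
    """Add up private values so that no single server sees any input."""
    per_server = [0] * n_servers
    for value in private_values:
        # Each party sends one random-looking share to each server.
        for i, share in enumerate(make_shares(value, n_servers)):
            per_server[i] = (per_server[i] + share) % MODULUS
    # Servers publish only their sub-totals; combining them gives the sum.
    return sum(per_server) % MODULUS


# Hypothetical confidential counts held by two hospitals.
hospital_a_cases = 120
hospital_b_cases = 85
total = secure_sum([hospital_a_cases, hospital_b_cases])
print(total)  # 205
```

With only two parties, the published total lets each party work out the other's count by subtraction, so the sketch shows the mechanics of sharing rather than a complete solution to the two-party setting.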
2. Private Information Retrieval
A Private Information Retrieval (PIR) protocol enables a user to retrieve a data item from
a database while hiding the identity of the item being retrieved. The main cost-measure of
such protocols is the communication complexity of retrieving a single bit of data.
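As a rough illustration of the idea, the following sketch implements the simple two-server XOR scheme, under the assumption that the database is replicated on two servers that do not collude. The function names and the sample database are hypothetical. Each server sees only a uniformly random subset of indices, yet the two answers XOR to the wanted bit.

```python
import secrets


def server_answer(database_bits, query_indices):
    """What each server computes: the XOR of the requested bits."""
    answer = 0
    for j in query_indices:
        answer ^= database_bits[j]
    return answer


def make_queries(n, wanted_index):
    """Build two index sets that differ only in wanted_index."""
    subset = {j for j in range(n) if secrets.randbits(1)}
    flipped = subset ^ {wanted_index}  # symmetric difference toggles one index
    return subset, flipped


def retrieve_bit(database_bits, wanted_index):
    q1, q2 = make_queries(len(database_bits), wanted_index)
    # Every index except wanted_index appears in both sets or in neither,
    # so its bit cancels out when the two answers are XORed.
    return server_answer(database_bits, q1) ^ server_answer(database_bits, q2)


database = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical replicated database
print(retrieve_bit(database, 3) == database[3])  # True
```

Describing a subset takes one bit per database position, so this basic scheme hides the index but does nothing to reduce communication; lowering that cost is the hard part of the problem.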
3. Overview of spatial databases and investigation into spatial databases used in
commercial Geographic Information Systems
It is believed that as much as 90% of commercial business data is geographic data, and the
importance of handling geographic data is ever increasing. Geographic Information
Systems (GIS) are information systems that work with geo-spatial and temporal databases to
solve geo-spatial problems. It is often noted that the application of GIS is limited only by
the imagination of its users. In this small project, you will investigate the special features
of spatial databases and how they are structured within commercial GIS such as ArcView,
ArcInfo, and Smallworld.
4. Investigation into Data warehouses vs. transactional databases
We have witnessed an explosion of data: the volume of stored data doubles every three years. We are
drowning in data and need intelligent ways of structuring it. Traditional databases are
not well suited to analyzing such large, dynamically growing data sets in order to make accurate
and prompt decisions. Data warehousing is one of the newer techniques designed specifically for
online analysis. It provides the infrastructure and functions that let business managers
systematically analyze their customer data and make strategic decisions, including OLAP
(On-Line Analytical Processing) tools for analyzing subject-oriented, integrated, time-variant,
and nonvolatile data.
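To make the contrast with row-at-a-time transactional access concrete, here is a minimal sketch of an OLAP-style roll-up using Python's built-in sqlite3 module. The sales table, its columns, and its rows are invented for illustration; a real warehouse would typically use a star schema with separate dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table: one row per sale, with year and region dimensions.
conn.execute(
    "CREATE TABLE sales (year INTEGER, region TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (2023, "North", "Widget", 100.0),
        (2023, "North", "Gadget", 150.0),
        (2023, "South", "Widget", 80.0),
        (2024, "North", "Widget", 120.0),
        (2024, "South", "Gadget", 200.0),
    ],
)

# Finer granularity: totals per (year, region).
by_year_region = conn.execute(
    "SELECT year, region, SUM(amount) FROM sales "
    "GROUP BY year, region ORDER BY year, region"
).fetchall()

# Roll-up: drop the region dimension and aggregate per year only.
by_year = conn.execute(
    "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year"
).fetchall()

print(by_year_region)
print(by_year)  # [(2023, 330.0), (2024, 320.0)]
```

The same data answers both questions; only the grouping changes, which is the kind of subject-oriented, aggregate query a warehouse is organized around.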
5. When data mining meets databases
Data mining is one of the steps in knowledge discovery in databases. It attempts to discover
novel, previously unknown, and ultimately understandable patterns in massive databases. This
task investigates how data mining can be improved by database techniques, or
how the database community can benefit from developments in data mining. You may
investigate data mining query languages, data mining query optimization, data mining system
architectures, etc.
6. Clustering in data mining
Clustering partitions a set of data items into homogeneous groups so as to minimize
within-group dissimilarities and maximize between-group dissimilarities. It is one of the most
popular techniques in the data mining community. This project will
investigate the various types of clustering in data mining. Note that clustering techniques have
been widely investigated within the statistics and machine learning communities. You need
to investigate what makes clustering in the data mining community differ from clustering in those
communities.
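As one concrete instance of partitioning by within-group dissimilarity, the sketch below implements plain k-means (Lloyd's algorithm) in pure Python. It is only one of many clustering methods, chosen here for brevity; the sample points and starting centroids are made up, and squared Euclidean distance is assumed as the dissimilarity measure.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D data with two visually obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = k_means(points, initial_centroids=[(1, 1), (9, 8)])
print(clusters)  # [[(1, 1), (1, 2), (2, 1)], [(8, 8), (8, 9), (9, 8)]]
```

The result depends on the starting centroids and on the number of clusters being fixed in advance, two limitations that motivate many of the alternative methods worth surveying.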
7. Association Rules Mining
Association rules mining is the second most widely used technique in data mining. It searches for
interesting relationships among items in a given data set, especially in transactional databases.
This project will investigate what association rules mining is, its application areas, variants, etc.
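The two standard measures of how interesting a rule is, support and confidence, can be computed directly from a list of transactions. The sketch below uses an invented market-basket data set and hypothetical helper names; it evaluates a single rule rather than searching for all rules as an algorithm such as Apriori would.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing antecedent, the fraction that also
    contain consequent. Assumes the antecedent occurs at least once."""
    joint = count(set(antecedent) | set(consequent), transactions)
    return joint / count(antecedent, transactions)


# Rule under test: {diapers} -> {beer}
print(support({"diapers", "beer"}, transactions))       # 0.6
print(confidence({"diapers"}, {"beer"}, transactions))  # 0.75
```

Here the rule holds in 3 of the 5 transactions overall, and in 3 of the 4 transactions that contain diapers.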
8. Temporal Databases
A wide range of database applications manage time-varying data. In contrast, existing
database technology provides little support for managing such data. The research area of
temporal databases aims to change this state of affairs by characterizing the semantics of
temporal data and providing expressive and efficient ways to model, store, and query temporal
data. This topic requires studying fundamental temporal database concepts, surveying
state-of-the-art solutions to challenging aspects of temporal data management, and looking
into the future of temporal database research.
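A common way to approximate temporal support on a conventional database is to store a validity interval with every row. The sketch below does this with Python's built-in sqlite3 module; the table, column names, sample rows, and the use of 9999-12-31 as an "until further notice" marker are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each row is valid over the half-open interval [valid_from, valid_to).
# ISO 8601 date strings compare correctly as plain text.
conn.execute(
    "CREATE TABLE salary_history ("
    "employee TEXT, salary INTEGER, valid_from TEXT, valid_to TEXT)"
)
conn.executemany(
    "INSERT INTO salary_history VALUES (?, ?, ?, ?)",
    [
        ("alice", 50000, "2020-01-01", "2022-01-01"),
        ("alice", 60000, "2022-01-01", "9999-12-31"),
        ("bob", 45000, "2021-06-01", "9999-12-31"),
    ],
)


def salary_as_of(conn, employee, date):
    """Return the salary that was valid on the given date, or None."""
    row = conn.execute(
        "SELECT salary FROM salary_history "
        "WHERE employee = ? AND valid_from <= ? AND ? < valid_to",
        (employee, date, date),
    ).fetchone()
    return row[0] if row else None


print(salary_as_of(conn, "alice", "2021-03-15"))  # 50000
print(salary_as_of(conn, "alice", "2022-01-01"))  # 60000
print(salary_as_of(conn, "bob", "2020-01-01"))    # None
```

Nothing in the schema stops two intervals for the same employee from overlapping, and every query has to repeat the interval predicate by hand; gaps like these are what dedicated temporal data models and query languages aim to close.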
Or choose a (database-relevant) topic of your own interest.