Abstract
Data mining is the search for relationships and global patterns that exist in large databases but are 'hidden' among the vast amounts of data, such as a relationship between patient data and their medical diagnosis. These relationships represent valuable knowledge about the database and the objects in it and, if the database is a faithful mirror, about the real world registered by the database. One of the main problems for data mining is that the number of possible relationships is very large, which prohibits searching for the correct ones by simply validating each of them. Hence, we need intelligent search strategies, as taken from the area of machine learning. Another important problem is that information in data objects is often corrupted or missing. Hence, statistical techniques should be applied to estimate the reliability of the discovered relationships. This report provides a survey of current data mining research; it presents the main underlying ideas, such as inductive l...
Publication Info
- Year: 1994
- Type: article
- Pages: 1-78
- Citations: 184
- Access: Closed