Efficient algorithms for data mining
Date of Award
Doctor of Philosophy (PhD)
Electrical Engineering and Computer Science
Efficient algorithms, Data mining
This thesis focuses on the computational aspects and the development of scalable and efficient algorithms for some important data-mining techniques. The emerging field of data mining requires the processing of very large datasets. This makes the development of efficient and scalable algorithms an essential step towards the success of data mining.
Data mining combines techniques and methods from other areas such as statistics, machine learning, and databases. Although some data-mining techniques have been known in other fields for more than a decade, most of the algorithms proposed in the literature were designed under the assumption of small datasets, either in terms of the number of examples/points or in terms of the number of dimensions/attributes of the examples. These limitations and other inherent properties of these techniques generally make them inappropriate for data-mining applications. Thus, there is a great need for efficient and scalable algorithms that allow these techniques to be applied to large datasets.
In this thesis we investigate the development of scalable and efficient algorithms for classification, clustering, and similarity join. We mainly follow two approaches to achieve this goal: (1) developing exact algorithms while reducing the computational and I/O requirements, and (2) developing approximation algorithms while reducing the computational and I/O requirements and maintaining good quality. The proposed algorithms are shown to be efficient, scalable, and robust across different datasets.
Surface provides the description only. Full text is available to ProQuest subscribers. Ask your librarian for assistance.
Alsabti, Khaled Abduallah, "Efficient algorithms for data mining" (1998). Electrical Engineering and Computer Science - Dissertations. Paper 174.