Title
Efficient algorithms for data mining
Date of Award
1998
Degree Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Electrical Engineering and Computer Science
Advisor(s)
Sanjay Ranka
Keywords
Efficient algorithms, Data mining
Subject Categories
Computer Sciences
Abstract
This thesis focuses on the computational aspects of some important data-mining techniques and on the development of scalable and efficient algorithms for them. The emerging field of data mining requires the processing of very large datasets, which makes the development of efficient and scalable algorithms essential to its success.
Data mining combines techniques and methods from other areas such as statistics, machine learning, and databases. Although some data-mining techniques have been known in other fields for more than a decade, most of the algorithms proposed in the literature were designed under the assumption of small datasets, either in terms of the number of examples/points or in terms of the number of dimensions/attributes of the examples. These limitations and other inherent properties of these techniques generally make them inappropriate for data-mining applications. Thus, there is a great need for efficient and scalable algorithms that allow these techniques to be applied to large datasets.
In this thesis we investigate the development of scalable and efficient algorithms for classification, clustering, and similarity join. We mainly follow two approaches to achieve this goal: (1) developing exact algorithms while reducing the computational and I/O requirements, and (2) developing approximation algorithms that reduce the computational and I/O requirements while maintaining good quality. The proposed algorithms are shown to be efficient, scalable, and robust across different datasets.
Access
Surface provides description only. Full text is available to ProQuest subscribers. Ask your Librarian for assistance.
Recommended Citation
Alsabti, Khaled Abduallah, "Efficient algorithms for data mining" (1998). Electrical Engineering and Computer Science - Dissertations. 174.
https://surface.syr.edu/eecs_etd/174
http://libezproxy.syr.edu/login?url=http://proquest.umi.com/pqdweb?did=733069371&sid=1&Fmt=2&clientId=3739&RQT=309&VName=PQD