Introduction
The World Wide Web (WWW) gives us access to many types of information, such as links, web pages, documents, images, videos, and other content, so the amount of stored data increases continually. The WWW has added an abundance of data, and this information has become complex. Because the information is complex and large in volume, it is not easy to find relevant information in a short time. Data mining addresses this problem: it is a process by which previously unknown information and patterns are extracted from large quantities of data. Here I describe the basic ideas of search engines and data mining.
Search engine:
With such a large volume of data on the Internet, it is difficult to find and extract information by hand. It has been said that if you spend only one minute per page, 10 hours a day, it would take about four and a half years to explore only 1 million web pages. So data mining is a real necessity. There are many search utilities, such as Google, Bing, Ask, AOL, and WebCrawler. Every search engine has a large database.
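The browsing-time estimate above can be checked with a few lines of arithmetic. This is only a rough sketch; the assumption that the reader works every day of a 365-day year is mine, not part of the original claim.

```python
# Back-of-the-envelope check of the browsing-time estimate.
PAGES = 1_000_000        # pages to explore
MINUTES_PER_PAGE = 1     # time spent on each page
HOURS_PER_DAY = 10       # hours spent browsing per day
DAYS_PER_YEAR = 365      # assumption: browsing every day of the year

total_hours = PAGES * MINUTES_PER_PAGE / 60
total_days = total_hours / HOURS_PER_DAY
total_years = total_days / DAYS_PER_YEAR

print(f"{total_days:.0f} days, about {total_years:.1f} years")
# prints "1667 days, about 4.6 years"
```

The result, roughly 4.6 years, agrees with the "four and a half years" figure.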
A search engine database typically contains information such as:
1. Title of the page
2. The URL
3. A short abstract of the content
4. Keywords to help the search engine
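One way to picture such a database entry is as a small record with the four fields listed above. The sketch below is only an illustration: the class name `PageRecord`, the `matches` helper, and the example URL are hypothetical, and real search engines store far more than this.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PageRecord:
    """One entry in a hypothetical search engine database."""
    title: str
    url: str
    abstract: str
    keywords: List[str] = field(default_factory=list)

    def matches(self, query: str) -> bool:
        """Return True if every word of the query is one of the page's keywords."""
        wanted = query.lower().split()
        known = {k.lower() for k in self.keywords}
        return all(word in known for word in wanted)

# Illustrative record; the URL is a placeholder, not a real page.
record = PageRecord(
    title="Introduction to Data Mining",
    url="https://example.com/data-mining",
    abstract="An overview of basic data mining techniques.",
    keywords=["data", "mining", "clustering"],
)

print(record.matches("data mining"))    # True
print(record.matches("search engine"))  # False
```

Keeping the keywords in the record lets the engine answer a query without reading the full page again.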
Web sites are indexed, scored, and ranked by each search engine. Ranking algorithms work from web site usability and the search frequency of keywords.
For example: Suppose that out of 15 users, 10 search for “Data mining” and the other 5 search for “Data mining and search engine”. The first 10 users are also interested in search engine related results. Here the frequency of “Data mining” related web pages increases. So the next time anyone types “Data Mining” to get results, the most browsed web sites will be shown first.
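The idea that the most browsed web sites are shown first can be sketched as a simple visit counter. This is a toy illustration under my own assumptions: the click log, the site names, and the `rank` function are hypothetical, and real ranking algorithms combine many more signals.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (query typed, site the user opened).
click_log = [
    ("data mining", "site-a.example"),
    ("data mining", "site-b.example"),
    ("data mining", "site-a.example"),
    ("data mining and search engine", "site-c.example"),
    ("data mining", "site-a.example"),
    ("data mining", "site-b.example"),
]

# Count how often each site was opened for each query.
visits = defaultdict(Counter)
for query, site in click_log:
    visits[query.lower()][site] += 1

def rank(query: str) -> list:
    """Return the sites seen for this query, most visited first."""
    counts = visits.get(query.lower(), Counter())
    return [site for site, _ in counts.most_common()]

print(rank("Data Mining"))
# prints ['site-a.example', 'site-b.example']
```

Queries are lowercased before counting, so “Data Mining” and “data mining” share the same statistics.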
Data Mining:
Data mining extracts relevant data for you from large databases by means of KDD (Knowledge Discovery in Databases).
KDD can be applied to many kinds of data sources:
1. Database
2. Relational database
3. Structured database
4. Unstructured database
5. Flat file
6. Transactional database
7. Object Oriented database
8. Data Warehouse
9. Multimedia database
10. Time series database
You can use association analysis and cluster analysis in a search engine algorithm to extract the required results.
1. Association Analysis:
Association analysis discovers patterns that describe strongly associated features in data. For example: users who search for “data mining” are most likely to also be interested in “data mining and search engine” related results.
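How strongly two search topics are associated is commonly measured with support and confidence. The sketch below computes both for the rule "data mining" => "search engine"; the session data and function names are made up for illustration.

```python
# Hypothetical search sessions: the set of topics each user looked at.
sessions = [
    {"data mining", "search engine"},
    {"data mining", "search engine"},
    {"data mining"},
    {"data mining", "search engine", "clustering"},
    {"search engine"},
]

def support(items: set) -> float:
    """Fraction of sessions that contain every topic in `items`."""
    hits = sum(1 for s in sessions if items <= s)
    return hits / len(sessions)

def confidence(antecedent: set, consequent: set) -> float:
    """Share of sessions containing the antecedent that also contain the consequent."""
    base = support(antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(antecedent | consequent) / base

rule_support = support({"data mining", "search engine"})
rule_confidence = confidence({"data mining"}, {"search engine"})
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
# prints "support=0.60, confidence=0.75"
```

Here 3 of the 5 sessions contain both topics (support 0.60), and 3 of the 4 sessions that contain "data mining" also contain "search engine" (confidence 0.75).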
2. Cluster Analysis:
Cluster analysis seeks to find groups of closely related observations, so that observations belonging to the same cluster are more similar to each other than to observations in other clusters.
For example: search results for data mining and data science may be closely related.
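As a rough illustration of grouping related searches, the sketch below clusters queries by word overlap (Jaccard similarity). It uses a simple single-pass, threshold-based grouping rather than a standard algorithm such as k-means; the queries, the threshold of 0.3, and the function names are assumptions chosen for the example.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two non-empty queries (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

def cluster(queries, threshold=0.3):
    """Put each query into the first cluster whose first member is similar enough."""
    clusters = []
    for query in queries:
        for group in clusters:
            if jaccard(query, group[0]) >= threshold:
                group.append(query)
                break
        else:
            clusters.append([query])  # nothing similar yet: start a new cluster
    return clusters

queries = [
    "data mining",
    "data science",
    "web search engine",
    "search engine ranking",
]
groups = cluster(queries)
print(groups)
# prints [['data mining', 'data science'], ['web search engine', 'search engine ranking']]
```

With these settings, "data mining" and "data science" fall into one cluster because they share one of their three distinct words (similarity 1/3).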