Semantics processing for search engines
Document Type
Conference Proceeding
Publication Date
8-15-2018
Abstract
This paper presents a study of extracting related content from the links on websites, using semantic keyword input and content analysis. The Java-based implementation semantically converts English keywords into candidate website names, as search engines do. Our work does not stop there, however: we use the URLs returned by a search engine to fetch content from those websites and match it against the keywords through semantic analysis such as Latent Semantic Indexing (LSI). Our differentiator is that we extract relevant content from all the sub-links as well as from the sites discovered by search engines. Our research has three parts: 1) Let the user input a list of keywords and convert them into a list of URLs through search engines. 2) Match the keyword information against the content of those URLs and extract the sub-links from it, then save the content in a new list, which is filtered through LSI analysis. 3) Create an interface that outputs the content list to users. Related research discussed in the paper includes the PageRank algorithm and the Hyperlink-Induced Topic Search (HITS) algorithm. Link extraction has been implemented, while the LSI part is ongoing.
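As an illustration of the sub-link extraction step described in the abstract, the following is a minimal Java sketch and not the authors' implementation. The class name `SubLinkExtractor`, the method `extractSubLinks`, the regex-based parsing, and the choice to keep only links on the same host as the starting page are all assumptions made for this example. It does not perform the LSI filtering.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects sub-links from the HTML of a page returned by a search engine.
public class SubLinkExtractor {

    // Matches the href value of anchor tags. A regex is a simplification here,
    // not a full HTML parser; malformed markup may be missed.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s[^>]*?href\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns absolute URLs of links found in html that stay on the same host as baseUrl
    // (an assumption about what counts as a "sub-link"), in order of first appearance.
    public static List<String> extractSubLinks(String baseUrl, String html) {
        URI base = URI.create(baseUrl);
        Set<String> links = new LinkedHashSet<>(); // removes duplicates, keeps order
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(2).trim();
            // Skip in-page anchors and non-navigational schemes.
            if (href.isEmpty() || href.startsWith("#")
                    || href.startsWith("javascript:") || href.startsWith("mailto:")) {
                continue;
            }
            try {
                URI resolved = base.resolve(href); // turns relative links into absolute ones
                if (base.getHost() != null
                        && base.getHost().equalsIgnoreCase(resolved.getHost())) {
                    links.add(resolved.toString());
                }
            } catch (IllegalArgumentException e) {
                // Ignore hrefs that are not valid URIs.
            }
        }
        return new ArrayList<>(links);
    }
}
```

In a pipeline like the one outlined, the returned list would be fetched page by page and the resulting text passed on to the keyword-matching and LSI filtering stage.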
Publication Title
ACM International Conference Proceeding Series
First Page Number
124
Last Page Number
126
DOI
10.1145/3243250.3243273
Recommended Citation
Wang, Qian and Jenny Li, J., "Semantics processing for search engines" (2018). Kean Publications. 1472.
https://digitalcommons.kean.edu/keanpublications/1472