Show simple item record

dc.contributor.advisor: Salim, Fahim Ahmed
dc.contributor.author: Chowdhury, S M Habibul Mursaleen
dc.description.abstract: The outcomes of this research have significant implications for enterprises seeking effective semantic search solutions that consider information beyond what is explicitly stated in raw documents from different inventories and formats, and that answer users’ inquiries with relevant results, reducing their workload, time, and effort. This master’s thesis investigates semantic search on heterogeneous documents in an enterprise-level context by exploring two distinct approaches: RDF ontology with RML, and entity extraction with vector embeddings. The thesis aims to evaluate the effectiveness of these approaches individually and to identify opportunities for their combined application as future research. The first experiment employs entity extraction techniques and vector embeddings with the support of Pinecone DB. By transforming documents into high-dimensional vectors, it captures semantic similarities and enables similarity-based search. The experiment specifically targets CSV, Excel, and datasets, offering a focused investigation into semantic search within specific formats. The second experiment focuses on RDF ontology with RML, using graph-based modeling and SPARQL querying. It demonstrates the ability to capture complex semantic relationships, hierarchies, and ontological concepts, providing a powerful framework for semantic search. The experiment handles structured (CSV, Excel) and unstructured (JSON, XML, DOCX, PDF) documents, enabling effective retrieval of information from diverse file formats. Through analysis and comparison of the results, the thesis concludes that the RDF ontology experiment outperforms the entity extraction experiment for semantic search on heterogeneous documents in an enterprise-level setting. The RDF ontology approach exhibits superior semantic representation, advanced querying capabilities, and greater semantic expressiveness, enabling more accurate and meaningful search results. Building on this conclusion, the thesis proposes future research on merging RDF ontology with RML and vector embeddings to leverage their respective strengths. This combined approach holds promise for a more comprehensive and powerful semantic search solution. Finally, this master’s thesis contributes to the advancement of semantic search on heterogeneous documents in an enterprise-level context, offering insights and paving the way for further research and development in this field.
dc.publisher: University of South-Eastern Norway
dc.title: Semantic Search on Equipment
dc.type: Master thesis
