Siamese Based Network for Detecting Quora Questions Similarity

Yan LI

KCI listed

Language: English
URL: https://db.koreascholar.com/Article/Detail/420401

Journal of The Korean Society for Computer Game

Vol. 35, No. 4 (December 2022)
pp. 57-68

Korean Society for Computer Game

Abstract

Quora's search engine redirects users to discussion pages based on the terms they search for. When semantically similar questions are searched, it sometimes sends users to different discussion pages even though a page dedicated to that question already exists. In such cases, the semantic similarity between questions carries the most weight. Traditional text-similarity methods usually treat text as a sequence of words: they count the words that occur in each sentence and apply a distance measure to those counts, which misses the semantic-level knowledge of the text. Such methods also require a large training set and considerable time to produce an accurate model. This paper instead uses a Siamese-based network that can train on a single example of each text to provide an accurate similarity output. I used pre-trained word embedding models such as word2vec and GloVe to capture the semantics of the question pairs in the Quora Question Pairs dataset. The paper introduces a new approach to calculating sentence similarity, Manhattan LSTM with an attention mechanism, and reports results that outperform current state-of-the-art Siamese-based LSTM models. Alongside this approach, a comparative analysis is performed on the embedded question pairs across different Siamese-based LSTM models, such as LSTM and Manhattan LSTM, to predict whether two questions are similar and to identify the best model combination for the Quora Question Pairs dataset.
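The contrast the abstract draws between word counting and a Manhattan-distance similarity can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption: the count-based baseline is plain cosine similarity over word counts, and the Manhattan similarity uses the exp(-L1 distance) form commonly associated with Manhattan LSTM. The function names are hypothetical, and the input vectors stand in for sentence encodings that an LSTM encoder would produce; no encoder or attention mechanism is implemented.

```python
import math
from collections import Counter


def bag_of_words_cosine(a: str, b: str) -> float:
    """Cosine similarity between raw word-count vectors (no semantics)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    # Words missing from cb count as 0, so only shared words contribute.
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def manhattan_similarity(h1, h2) -> float:
    """Assumed form exp(-L1 distance): 1.0 for identical vectors,
    approaching 0 as the vectors move apart."""
    return math.exp(-sum(abs(x - y) for x, y in zip(h1, h2)))


# Two paraphrases with no words in common: the count-based score is zero.
baseline = bag_of_words_cosine("how old are you", "what is your age")

# Made-up stand-ins for encoder outputs of two similar questions.
learned = manhattan_similarity([0.2, -0.1, 0.4], [0.25, -0.1, 0.35])
```

The paraphrase pair shows the limitation the abstract describes: with no shared words, the count-based score cannot register that the questions mean the same thing, whereas a similarity computed on learned encodings depends only on how close the two vectors are.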

Keywords

Siamese Neural Network, LSTM, Word Embedding, Attention Mechanism, Text Similarity, Quora

ABSTRACT
1. Deep Learning and Text Similarity
    1.1 Similarity between Questions Searched in Quora
    1.2 Relevance and Importance of the Research
2. Related Research
3. Research Methodology
    3.1 Siamese Neural Network Flow
    3.2 Model Architecture
    3.3 Dataset Description
    3.4 Algorithm Design and Components
4. Analysis and Evaluation
5. Conclusion
References

Author

Yan LI (School of Textile Garment and Design, Changshu Institute of Technology)