A Transformer-Enhanced System to Reverse Dictionary Technology
Ahmed Bahaaulddin A. Alwahhab1,*, Vian Sabeeh1, Ali Sami Al-Itbi1, Ali Abdulmunim Ibrahim Al-kharaz1
1Department of informatics, Technical College of Management, Middle Technical University, Baghdad, Iraq
Emails: ahmedbahaaulddin@mtu.edu.iq; viantalal@mtu.edu.iq; ali.sami@mtu.edu.iq; ali.al-kharaz@mtu.edu.iq
Abstract
Retrieving a word that sits at the edge of memory is often blocked by the well-documented Tip-of-the-Tongue (TOT) phenomenon, which can impede communication and learning. To address this, our study introduces a novel reverse dictionary framework, built on neural network architectures, that retrieves words from their definitions or descriptions. The research traces the development of several deep learning models for natural language that are designed to capture the semantics of text, and compares their effectiveness. The work began with collecting a new, linguistically rich dataset. A careful pre-processing step, including text normalization and contextual feature extraction, transformed the unstructured text into structured features suitable for model training. Dense vector representations of the text were extracted using the BERT embedding model. Three models (LSTM, FNN, and GRU) were tested and compared on scraped and benchmark data. The proposed model, which consists of BERT embeddings and an LSTM learner, was evaluated and showed notable performance under the cosine similarity and mean squared error metrics. The LSTM model exhibited excellent semantic coherence in its embeddings and accuracy in its predictions, which indicates that it is useful in real-world applications. This research also discusses the effectiveness of the pre-trained BERT model in enhancing vocabulary. In addition, this work sheds light on the crucial role that reverse dictionaries can play in many future NLP applications. Future research will focus on extending the multilingual capabilities of our methodology and investigating its suitability for other cognitive linguistic phenomena.
Keywords: Bidirectional Encoder Representations from Transformers; Long short-term memory networks; Natural language processing; Reverse Dictionary
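The two evaluation metrics named in the abstract, cosine similarity and mean squared error, can be illustrated with a minimal sketch. It assumes that a model maps a definition to a predicted vector and that candidate words are ranked by their similarity to that vector. The ranking step, the function names, and the toy three-dimensional vectors are hypothetical and chosen only for illustration. They are not the BERT embeddings or the trained LSTM described in the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_squared_error(u, v):
    """Average of the squared element-wise differences between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def rank_candidates(predicted, vocabulary):
    """Sort vocabulary words by descending cosine similarity to `predicted`.

    Assumption: `vocabulary` maps each word to its embedding vector.
    """
    return sorted(
        vocabulary,
        key=lambda word: cosine_similarity(predicted, vocabulary[word]),
        reverse=True,
    )


# Hypothetical toy embeddings; real BERT vectors have hundreds of dimensions.
vocab = {
    "river": [0.9, 0.1, 0.0],
    "bank": [0.6, 0.6, 0.1],
    "piano": [0.0, 0.2, 0.9],
}

# Hypothetical model output for a definition such as "a large natural stream of water".
predicted = [0.8, 0.2, 0.1]

ranking = rank_candidates(predicted, vocab)
error_to_best = mean_squared_error(predicted, vocab[ranking[0]])
```

In this sketch, cosine similarity compares only the direction of two vectors, while mean squared error also penalizes differences in magnitude, so the two metrics capture complementary aspects of how close a predicted embedding is to its target.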