<?xml version="1.0"?>
<journal>
 <journal_metadata>
  <full_title>Fusion: Practice and Applications</full_title>
  <abbrev_title>FPA</abbrev_title>
  <issn media_type="print">2692-4048</issn>
  <issn media_type="electronic">2770-0070</issn>
  <doi_data>
   <doi>10.54216/FPA</doi>
   <resource>https://www.americaspg.com/journals/show/3671</resource>
  </doi_data>
 </journal_metadata>
 <journal_issue>
  <publication_date media_type="print">
   <year>2018</year>
  </publication_date>
  <publication_date media_type="online">
   <year>2018</year>
  </publication_date>
 </journal_issue>
 <journal_article publication_type="full_text">
  <titles>
   <title>Over-Under Sampling Approach with Adaptive Synthetic and Tomek Links Methods to Handle Data Imbalance in Sentence Classification on Halal Assurance Certificate Documents</title>
  </titles>
  <contributors>
   <organization sequence="first" contributor_role="author">Doctoral Program of Information System School of Postgraduate Studies, Diponegoro University, Semarang, Indonesia; Faculty of Computer and Engineering, Department of Information System, Alma Ata University, Yogyakarta, Indonesia; Alma Ata Center for Medical Informatics, Alma Ata University, Yogyakarta, Indonesia</organization>
   <person_name sequence="first" contributor_role="author">
    <given_name>Dadang</given_name>
    <surname>Dadang</surname>
   </person_name>
   <organization sequence="first" contributor_role="author">Doctoral Program of Information System School of Postgraduate Studies, Diponegoro University, Semarang, Indonesia</organization>
   <person_name sequence="additional" contributor_role="author">
    <given_name>Rahmat</given_name>
    <surname>Gernowo</surname>
   </person_name>
   <organization sequence="first" contributor_role="author">Doctoral Program of Information System School of Postgraduate Studies, Diponegoro University, Semarang, Indonesia</organization>
   <person_name sequence="additional" contributor_role="author">
    <given_name>R. Rizal</given_name>
    <surname>Isnanto</surname>
   </person_name>
  </contributors>
  <jats:abstract xml:lang="en">
   <jats:p>Data imbalance is a common problem in machine learning classification: examples in a dominant class outnumber examples in a minority class many times over. The imbalance keeps a model from learning meaningful patterns for the minority class and so reduces performance, particularly Recall and F1-Score. This work addresses data imbalance in a sentence classification model for halal assurance certificate documents by combining an over-sampling technique, Adaptive Synthetic (ADASYN), with an under-sampling technique, Tomek Links. Text classification is used to identify sentences concerning halal assurance in these documents; because misclassifying such sentences is undesirable, it is important that no information about a halal product is missed. The over-sampling techniques considered are SMOTE, Borderline SMOTE, ADASYN, and SMOTENC, and the under-sampling techniques are Random Under-Sampler, Near Miss, and Tomek Links. In the comparison, ADASYN gives the best performance in terms of Accuracy (0.759), F1-Score (0.748), Recall (0.759), and Precision (0.768). For this use case, ADASYN combined with Tomek Links is effective; recall matters when classifying halal assurance certificate documents because relevant sentences must not be missed. The proposed approach markedly improves the accuracy of halal-related sentence identification and can be adopted in halal product checking systems in industries with halal requirements.</jats:p>
  </jats:abstract>
  <publication_date media_type="print">
   <year>2025</year>
  </publication_date>
  <publication_date media_type="online">
   <year>2025</year>
  </publication_date>
  <pages>
   <first_page>194</first_page>
   <last_page>210</last_page>
  </pages>
  <doi_data>
   <doi>10.54216/FPA.190215</doi>
   <resource>https://www.americaspg.com/articleinfo/3/show/3671</resource>
  </doi_data>
 </journal_article>
</journal>
