BALAGHAScore.com Arabic Word Tokenisation Scheme

Mandar Marathe

doi:10.64393/balagha-score.tokenisation-v0.1.0

Abstract

This document presents the BALAGHAScore.com Arabic Word Tokenisation Scheme, a custom set of rules for segmenting Arabic text into word units. These units serve as the basis for rhetorical density calculations such as the BALAGHA Score.

Publication Info

Year: 2025
Type: article
Citations: 0
Access: Closed

Cite This

APA Style

Mandar Marathe (2025). BALAGHAScore.com Arabic Word Tokenisation Scheme. https://doi.org/10.64393/balagha-score.tokenisation-v0.1.0
