Speakers: Pavel Kaganovich, Ophir Münz-Manor and Elishai Ezra-Tsur
Affiliation: 1, Reichman University, 8 Ha’universita St, Herzliya, 4610101, Israel;
2, The Open University of Israel, 1 University Road, Raanana, 43107, Israel
Title: Style Transfer of Modern Hebrew Literature Using Text Simplification and Generative Language Modeling
Abstract: The task of Style Transfer (ST) in Natural Language Processing (NLP) involves altering the style of a given sentence to match a target style while preserving its semantics. Hebrew models for NLP, and generative models in particular, are currently scarce. Developing such models is non-trivial because of the complex nature of Hebrew: the language poses notable challenges to NLP through its rich morphology, intricate inflectional structure, and orthography, which has undergone significant transformations throughout its history. Modern Hebrew, Biblical Hebrew, and other historical forms of the language differ from one another, which makes it difficult to create models that are robust across time periods and genres. In this work, we propose a generative ST model for Modern Hebrew that rewrites sentences in a target style in the absence of parallel style corpora. Our focus is on the domain of Modern Hebrew literature, which presents unique challenges for the ST task. To overcome the lack of parallel data, we first create a pseudo-parallel corpus using back translation (BT) as a means of text simplification. We then fine-tune a pre-trained Hebrew language model (LM) and apply a zero-shot learning (ZSL) approach to ST. In our evaluation, the model performs well in terms of transfer accuracy, semantic similarity, and fluency when transferring a source sentence to a target style. To the best of our knowledge, no prior research has focused on developing ST models specifically for Modern Hebrew literature. Our proposed model is therefore a novel contribution to Hebrew NLP, to the study of Modern Hebrew literature, and more generally to computational literary studies.
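The back-translation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_pseudo_parallel`, the two translator callables, the choice of pivot language, and the direction of the pairs (simplified sentence as input, original literary sentence as target) are all assumptions made for illustration. In practice the callables would wrap real machine-translation models.

```python
from typing import Callable, List, Tuple

# A translator is any function mapping a sentence to a sentence.
# Hypothetical: in a real pipeline these would wrap MT models.
Translator = Callable[[str], str]


def build_pseudo_parallel(
    sentences: List[str],
    to_pivot: Translator,
    from_pivot: Translator,
) -> List[Tuple[str, str]]:
    """Pair each sentence with its round-trip (back-translated) version.

    Assumption: the round trip through a pivot language tends to flatten
    stylistic features, so the result acts as a simplified paraphrase.
    """
    pairs: List[Tuple[str, str]] = []
    for original in sentences:
        simplified = from_pivot(to_pivot(original))
        # Skip empty outputs and sentences the round trip left unchanged,
        # since they carry no style signal for fine-tuning.
        if simplified.strip() and simplified != original:
            pairs.append((simplified, original))
    return pairs
```

Each resulting pair could then serve as one fine-tuning example, with the simplified sentence as the source and the original sentence as the stylistic target.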