OUT NOW: Liina Repo, Brett Hashimoto & Veronika Laippala look at extending register annotations from a corpus of historical documents using BERT-based deep learning models.
Considering text-internal variation, they find that beginnings tend to support better predictions.
doi.org/10.1075/ijcl...
In case you missed it:
OUT NOW: Jia Li and Xianyao Hu compare human- and machine-translated texts from Chinese to English to evaluate features of conservatism across registers.
Their investigation sheds light on the potential of human-machine collaborative translation models.
doi.org/10.1075/ijcl...
OUT NOW: Yiğit Savuran & Stefanie Wulff introduce the Turkish Learner Corpus (TURLEC), comprising written and spoken texts by L2 learners of Turkish across CEFR proficiency levels.
The corpus and metadata are available at: osf.io/bnv3p/files/...
#OpenScience #LearnerCorpus
doi.org/10.1075/ijcl...
Two new book reviews:
@mariannagracheva.bsky.social reviews Le Foll's (2024)
Textbook English: A multi-dimensional approach
doi.org/10.1075/ijcl...
and Holly Baker reviews Kaunisto & Schilk (2024)
Challenges in corpus linguistics: Rethinking corpus compilation and analysis
doi.org/10.1075/ijcl...
OUT NOW:
@timfeld.bsky.social, Fabian Barteld & Alexander Ziem present and evaluate an approach for predicting typical fillers for the slots of grammatical constructions using BERT.
They ask: How can language models be used to support the development of linguistic resources?
doi.org/10.1075/ijcl...
Abstract The present study investigates whether conservatism exists in human- and machine-translated texts from Chinese into English, and whether this tendency is consistently observable across differ...
Abstract This paper provides a detailed account of the Turkish Learner Corpus (TURLEC). Building on the first author’s doctoral dissertation project, which aimed to identify proficiency descriptors fo...
Abstract The starting point of this paper is a central problem in constructicography: on the one hand, constructicon projects aim at describing a broad spectrum of constructions based on usage data; o...
la semaine prochaine
but
la raison suivante
Looi, Riget, Boulton & Hassan discuss synonym alternation between French prochain and suivant, using corpus evidence and statistical methods to re-examine variables derived through introspection
#OnlineFirst #FrenchLinguistics
doi.org/10.1075/ijcl...
Abstract This paper presents a corpus-based study that evaluates variables identified introspectively by Berthonneau (2002) in relation to the alternation between two French synonyms: prochain (‘nex...