A Multi-Language Comparison of Influences on Author Verification using Character N-Grams
Institution: | Delft University of Technology |
---|---|
Year: | 2014 |
Keywords: | author verification; n-gram; character n-grams; wikipedia; corpus; multi-language; talkpage; common n-grams; topic; time |
Record ID: | 1262553 |
Full text PDF: | http://resolver.tudelft.nl/uuid:47d8d028-6ec2-4b75-b380-0c4c0ae58c5d |
We create a new multi-language corpus for author verification based on Wikipedia talkpages, and evaluate the influence that differences in topic and time have on character n-gram author profiles. Topic alignment between two texts is found to increase author verification precision. An author's writing style is found to change over time, but the change is not significantly greater after 3 years than after 1 year.
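To make the notion of a character n-gram author profile concrete, the following is a minimal sketch and not the thesis's implementation. It assumes a profile is the set of the most frequent character n-grams with their relative frequencies, and it compares two profiles with a relative-difference measure in the style of the common n-grams (CNG) approach. The function names, the defaults `n=3` and `top_k=100`, and the `threshold` parameter are hypothetical choices made for illustration.

```python
from collections import Counter


def char_ngram_profile(text: str, n: int = 3, top_k: int = 100) -> dict[str, float]:
    """Map the top_k most frequent character n-grams to their relative frequencies."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    total = len(grams)
    return {gram: count / total for gram, count in counts.most_common(top_k)}


def cng_dissimilarity(p1: dict[str, float], p2: dict[str, float]) -> float:
    """Sum of squared relative frequency differences over the union of both profiles.

    An n-gram missing from a profile is treated as having frequency 0.
    Identical profiles score 0.0; larger values mean less similar profiles.
    """
    score = 0.0
    for gram in set(p1) | set(p2):
        f1 = p1.get(gram, 0.0)
        f2 = p2.get(gram, 0.0)
        # f1 + f2 > 0 because the n-gram occurs in at least one profile.
        score += ((f1 - f2) / ((f1 + f2) / 2)) ** 2
    return score


def same_author(known: str, unknown: str, threshold: float,
                n: int = 3, top_k: int = 100) -> bool:
    """Accept the unknown text as the known author's if the profiles are close enough.

    The threshold is an assumption; in practice it would be tuned on held-out data.
    """
    known_profile = char_ngram_profile(known, n, top_k)
    unknown_profile = char_ngram_profile(unknown, n, top_k)
    return cng_dissimilarity(known_profile, unknown_profile) <= threshold
```

Under these assumptions, studying the influence of topic or time amounts to building the two profiles from texts that differ in topic or in writing date and observing how the dissimilarity, and hence verification precision, changes.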