Stylometric Analysis of Open-Source Literature
EECS 349 Machine Learning, Northwestern University
Our task is to train a machine learning model that matches a given input text to the well-known author whose writing style it most resembles. We chose 13 prominent authors of English literature and learned their writing patterns using stylometry, the study of linguistic style. The topic is interesting because its potential applications range from identifying the author of an anonymous text to detecting plagiarism in papers.
The 13 authors we analyzed and added to our database are:
Agatha Christie, Beatrix Potter, Charles Dickens, Charlotte Bronte, D H Lawrence, Edgar Allan Poe, Herman Melville, Jack London, Jane Austen, Louisa May Alcott, Mark Twain, Sir Arthur Conan Doyle, Virginia Woolf
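To make the matching task concrete, here is a minimal sketch of one way an input text could be compared against per-author writing samples. It is an illustration only: this section does not say which features or classifier the project used, so the function-word list `FUNCTION_WORDS`, the helper names `style_vector`, `cosine_similarity`, and `closest_author`, and the choice of cosine similarity are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Assumed feature set (not from the project): a few common English function words.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "i", "it", "was", "he"]


def style_vector(text):
    """Return the relative frequency of each function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def closest_author(text, samples):
    """Return the key of `samples` whose sample text is stylistically closest to `text`.

    `samples` maps an author name to a string of that author's writing.
    """
    query = style_vector(text)
    return max(
        samples,
        key=lambda author: cosine_similarity(query, style_vector(samples[author])),
    )
```

Calling `closest_author(text, samples)` with a dictionary that maps each author's name to a sample of their writing returns the name whose function-word profile is most similar to the input. A real system would likely use far more text per author and a richer feature set than this sketch does.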