Feminism, Authorship Attribution, and the Value of the Hidden Work of Data ‘Cleaning’

When I joined the Thomas Nashe Project three years ago to work on Nashe’s dubia,[1] I was new to attribution studies. The responses I got from colleagues and academic friends over the next few months involved a lot of raised eyebrows, comments such as ‘you’re brave getting involved in that’ or ‘isn’t it a bit…macho?’, and someone muttered something about a tape measure. Not being one for melodrama, I carried on with the labour-intensive marking up of texts for stylometric analysis regardless. As a sixteenth-century scholar working on poetics and the occult tradition, my thesis was dominated by male writers, both in terms of primary texts and parts of the critical field. In that sense, the macho environment of attribution studies didn’t bother me as much as it probably should have done: I have held my own as a female academic and saw no reason why I shouldn’t continue to do so, especially working for a project with such supportive colleagues, men and women. Now that the bulk of my work on Nashe’s dubia is finished, I have been able to reflect a little on the aggressive nature of attribution studies, a trait which is damaging on multiple fronts.

Andy Kesson hit the nail on the head in his Before Shakespeare blog series on attribution (2017) when he observed ‘I cannot imagine how difficult it must be for junior or female scholars or for anyone uncomfortable with the current state of attribution to engage critically with its practices’. His posts elicited a number of responses about attribution studies generally: some curious and hopeful of a constructive dialogue, others more confrontational.

To say that the field of attribution studies is an uncomfortable space is an understatement, and certain parts of it are particularly bad (*cough* Shakespeare *cough*).[2] The criticism of ‘rivals’ in attribution studies can be vituperative and targeted; some of the debate/shaming is carried on in national newspapers. If daring to publish something on attribution effectively means sticking a target on your back and declaring allegiance to one boys’ club or another, why would anyone – least of all a woman and an early career academic – bother?

There is of course nothing wrong with being protective of one’s work or constructively critical of someone else’s work. Perhaps in attribution studies there is more of a sense of a need to provide the answer, to crack the code, to prove something definitively, and it is this that encourages contest. It is easy to be seduced by the final result: the graph that proves that so-and-so may or may not have written a text, the charts that squash unfathomable amounts of data into clear, precise points that can be summed up in a neat visualisation. Perhaps it is this claim to have answers – real answers, not subjective arguments – that produces more aggressive commentary and defensive criticism than is considered acceptable in other academic fields that are more open to complex evidence and shades of meaning.

Take, for example, the methodology used by Brett Greatley-Hirsch for the Nashe Project: Principal Components Analysis. PCA is a data reduction method and its use in authorship attribution is due to the fact that ‘when analysing word-frequency counts across a mixed corpus of texts known to be of different authorship, the strongest factor that emerges in the relationship between the texts is generally authorial in nature’.[3] Like any other method, it has its critics.[4] The Nashe Project used PCA to help the editorial team confirm the texts we wanted to include from those uncertainly attributed to Nashe in the forthcoming edition of his works. But this is only one of the several methods the team has used. PCA is a footnote to the swathes of scholarship that have gone before, and to the archival scholarship conducted by members of the team themselves. Tempting as it is to be seduced by a few graphs, no single method can claim to hold the definitive truth.
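For readers who have not met PCA before, the idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not the Nashe Project's actual pipeline: the function-word list, the toy text segments, and the helper names `frequency_vector` and `pca_scores` are all invented for the example, and the tokenisation is deliberately naive.

```python
from collections import Counter

import numpy as np

# Hypothetical, very short function-word list (an assumption for this sketch);
# a real study would use a far longer, carefully curated list.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it"]


def frequency_vector(text, vocabulary=FUNCTION_WORDS):
    """Return the relative frequency of each vocabulary word in `text`."""
    tokens = text.lower().split()  # naive whitespace tokenisation
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # avoid dividing by zero on empty input
    return np.array([counts[word] / total for word in vocabulary])


def pca_scores(matrix, n_components=2):
    """Project the rows of `matrix` onto its first principal components."""
    centred = matrix - matrix.mean(axis=0)
    # Singular value decomposition of the centred data:
    # the rows of `vt` are the principal axes, strongest first.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T


# Toy segments, invented purely for illustration.
segments = {
    "writer_A_1": "the cat and the dog of the house went to the fair in a cart",
    "writer_A_2": "the man and the boy of the town ran to the shore in a storm",
    "writer_B_1": "it is said that it was a lie that a fool told to a friend",
    "writer_B_2": "it seems that it was a jest that a knave made to a lord",
}

matrix = np.vstack([frequency_vector(text) for text in segments.values()])
scores = pca_scores(matrix)

for label, (pc1, pc2) in zip(segments, scores):
    print(f"{label}: PC1={pc1:+.3f} PC2={pc2:+.3f}")
```

Each segment ends up as a single point on a two-dimensional plot, which is where the neat clusters come from. Everything that happens before `frequency_vector` is called (choosing, transcribing, and tagging the texts) is invisible in the output.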

But, if we remain fixated on answering who did or did not write a text, and bemused by the ferocity of the debate between some attribution scholars, what are we forgetting about?

Most obviously, from the point of view of the researchers at least, we miss the labour it takes to get a result. Data preparation or data cleaning is at least 80% of the work and it is often done by post-docs and students. The workload can be vast and tedious. It is also, unfortunately, often ignored or forgotten about. In their book Data Feminism (2020), Catherine D’Ignazio and Lauren F. Klein note that ‘seen’ labour has value, meaning that the hours of tedious research and data preparation that go into a visualisation often remain unseen and uncredited. Yet, that labour is crucial because it provides more than data to be processed. For a start, it creates experts, the people who can explain why the processed data behaves in particular ways, or why there might be a surprising result. It is easy to become protective of our work because of the blood, sweat, and tears – literally, I tagged all 72,000 words of Gabriel Harvey’s Pierces Supererogation (1593), the sheer length of which even Nashe complains about – that we put into processes like tagging or data cleaning. Without a valid result, the whole enterprise seems pointless. Failure or a non-result is simply not on our radar, whereas our colleagues in the sciences accept these things as a matter of course. In addition, there is often no opportunity for a double-blind study as current funding is not generous enough to cover the cost of running the same tests again to double-check; a lot rides, then, on getting it right first time, and having something to show for it.

I spent hundreds of hours marking up function words in a corpus of texts that I had researched, and I breathed a huge sigh of relief when the data was processed and produced graphs with neat clusters indicating different authors. Even I, the intrepid (and tired) tagger, only saw value in the visualisations of the processed data on the screen, not in the work done to get to that point.

Data-cleaning, data-preparation, and tagging sound like tedious processes. Let’s be honest, they sound dull. Boring. Drying paint suddenly seems interesting by comparison. I won’t lie: tagging has its dull moments (did I mention Pierces Supererogation?). But it is also incredibly exciting and in those hours of seemingly futile labour are moments of crystallisation, original discovery, and surprise, and I’d like to focus on these now.

Engaging so closely with those texts we ‘clean’ and tag to glean particular data can be intellectually challenging and reveal unknown information; it is also a surprisingly emotional experience. Emotional labour is often seen as feminine and somehow unscholarly.

These emotive responses are worthy of comment though. I had read Christs Teares over Jerusalem (1593) as a postgraduate and I knew it contained graphic descriptions of dead bodies, pain, and suffering. However, I was unprepared for the visceral reaction I felt when I tagged the section with Miriam’s cannibalism of her own child. It was nauseating. It was not the actual moment of decapitation and cannibalism which got to me, but the speeches either side that describe Miriam’s desperation through to the smell of the dead ‘rost’ child. I had a similar response when tagging the rape of Heraclide in The Unfortunate Traveller (1594). Again, it was not the brief description of the rape itself that caused a reaction, but the build-up to it and the exploration of the trauma after the event. This is more than gratuitous violence: Nashe makes something of the desperation of the characters seep through the layers of graphic description and revulsion which, when read closely, are truly affecting. The process of tagging focussed my attention differently as a reader, and allowed this desperation to seep into my consciousness, which surprised me. I wonder if reading that is active but not necessarily critically engaged, like tagging, might mean that there is little mental preparation for these graphic episodes; we become conscious in the midst of the violence and graphic description without processing it critically, leading to a more profound experience than sitting down and reading a text. This is something I’m still thinking about.

The experience of tagging texts is very different from other ‘reading’ experiences and constitutes a form of emotional labour – often coded as feminine and neither scholarly nor noteworthy – which is never reflected in a project’s output. When tagging a text, there is no escape from it. Every bit of it, no matter how gruesome, has to be treated in the same way. There is thus no opportunity to skim or block out portions of text even on a subconscious level. The engagement with the text is different: it transforms the reading experience and thus produces new insights and discoveries. The emotional labour that comes with it shouldn’t be ignored. In tagging and reacting to Miriam’s cannibalisation of her son and the rape of Heraclide, I was able to experience, I believe, something of Nashe’s intention as the writer and how he expected his text to affect the reader. I was more alert to the live ‘voices’ in the text that are intended to trouble us. As a consequence, I approach Nashe differently than I did before, which I can now turn into a contribution to studies on Nashe and give scholarly value to that emotional labour. My self-awareness as a tagger, I would like to suggest, has a place in the Nashe Project – it is confirming and shaping the impact we believe Nashe wanted to have.

In shifting the focus from the end-result to the labour that gets the result, attribution studies could open itself up to other conversations and engage with other disciplines and theories. By focussing more on the work that goes into the results, the heat could be taken out of the debate as it would allow those doing the work to contribute their insights. These contributions wouldn’t rely on the infallibility of methods but ask what else can be discovered through the many different processes of exploring authorship. Of course, I am not alone in seeing the value of the labour itself and using it as a research tool, but this approach is not usually part of the conversation in attribution studies. If attribution studies could embrace the messiness of the data behind its neat visualisations through showing its hidden labour – often done by women and ECAs and downplayed – then the field could change for the better. Perhaps some of the discomfort surrounding attribution studies would dissipate and new experiences and discoveries could be shared and celebrated as part of it. Let’s not get distracted by shiny visualisations but look beyond to the work – and the workers – that produced them. That is where the real discoveries lie.


With thanks to Andy Kesson, Alan Hogarth, Brett Greatley-Hirsch and the Nashe project team who were kind enough to comment on earlier versions of this piece.


[1] These are the texts that Nashe may have written pseudonymously or have had a hand in writing.

[2] Coughs are not Coronavirus related.

[3] Hugh Craig & Brett Greatley-Hirsch, Style, Computers and Early Modern Drama: Beyond Authorship (Cambridge: Cambridge University Press, 2017), 32.

[4] See, for example, Nan Z. Da, ‘The Computational Case Against Computational Literary Studies’, Critical Inquiry, 45:3 (2019), 601-639.

Nashe on Screen (after a fashion)

You’ve seen the plays: now watch the playwright do some gardening. In the new film All is True (trailer), director Kenneth Branagh and screenwriter Ben Elton adopt the contemporary title of William Shakespeare’s last play, Henry VIII, in order to tell the story of the Swan of Avon (not portrayed, sadly, by an actual swan, but by Branagh himself) returning to Stratford at the end of his career, after the burning of the Globe Theatre. Far from his successes in London, Shakespeare has to reckon with the hostility of his long-abandoned wife (Judi Dench) and daughters Susanna (Lydia Wilson) and Judith (Kathryn Wilder), as well as the snark and accusations of neighbours, and his long-held grief for the death of his son Hamnet. I was lucky enough to see the film courtesy of the lovely Viral History team (check out their work, it’s brilliant), and joined them and Dr Joanne Paul from the Sussex History department for a chat on their new podcast. You can listen to our episode via iTunes, Spotify, Audioboom, or wherever you like to source your listening material.



A very Thomas Nashe Christmas

I wanted to call this post 'No cheeses for the meeces, and other Nashean Christmas problems' but it felt rather too niche

Christmas is coming, the goose is getting fat, people on Twitter are doing festive display names and no-one can find the end of the sellotape. On the University of Sussex campus, the students that thronged the corridors for essay meetings last week have scattered. Lights glint amongst the branches of the Christmas tree in Library Square, the cafe is serving festive peppermint hot chocolates with enough sugar in them to stun a horse, and a researcher just announced to her empty shared office that ‘IT IS EMMYLOU HARRIS CHRISTMAS ALBUM TIME’ because, well, it is Emmylou Harris Christmas album time. It is difficult to be very Scrooge-like about Christmas when you work in an institution of higher education, at least until the marking comes in.

Thomas Nashe might, I think, have approved, at least of the sugary hot chocolate – or, perhaps, of the snacks brought in by tired lecturers for their tired students. Although he lived (of course) in a pre-chocolate, pre-Christmas-tree, pre-Emmylou-Harris world, he had (as Nashe tended to do) strong opinions on how one should celebrate Christmas, and particularly on generosity during the festive season.

Nashe’s Shopping List 4: not without mustard!

I might be suffering from the effects of rhenish wine, but I think this pickled herring needs something to make it a little more palatable. Pass the mustard!

The hot taste should help to cover up the strong or even rancid pickled herring, and may go some way to helping with the effects of all that alcohol (anyone else feel like something died in their mouth?). It has been suggested that in the past mustard seeds were chewed during meals to cover the taste of questionable food. The seeds themselves are not flavourful until crushed, when myronate and myrosin are released, which creates the hot taste. Does anyone in this drunken company fancy trying it? My bet is on Marlowe.


Nashe’s shopping list 3: a surfeit o’ pickled herring


I know, I know. We started imagining what would go into Nashe’s shopping basket, but I think we can safely assume that he wouldn’t be having his 450th birthday party at his place. Thomas Middleton certainly doesn’t think Nashe would have lived in salubrious surroundings. In The Blacke Booke (1604) Nashe’s persona Pierce Penilesse is renting a room in a brothel. The visitor

“stumbled up two payre of stayres in the darke, but at last caught in mine eyes the sullen blaze of a melancholy lampe, that burnt very tragically uppon the narrow Deske of a halfe Bedstead, which descryed all the pittifull Ruines throughout the whole chamber, the bare privities of the stone-walls were hid with two pieces of painted Cloth; but so ragged and tottred, that one might haue seene all neuerthelesse…The Testerne or the shadow over the bed was made of foure Elles of Cobwebs, and a number of small Spinners Ropes hung downe for Curtaines… in this unfortunate Tyring-house lay poore Pierce uppon a Pillow stuffed with horse meat, the sheets smudged so dirtily, as if they had been stolen by night out of Saint Pulcher’s churchyard when the sexton had left a grave open.” (sigs. D1r-v)


Choose a conference cocktail!

We’re planning drinks for our upcoming conference (Thomas Nashe and his contemporaries. Newcastle University, 12-14th July 2018), and we want your help! Do pop over to Twitter to vote for the cocktail you’d like to see there… And check out the Call for Papers at the project website!


Nashe’s shopping list 2: ale, beer, and cider

What would a 450th birthday party be without a well-stocked drinks table? Thankfully, Nashe refers to a good range of alcoholic beverages in his works…


Yesterday we heard about drinks for the high-rollers among us: the imported wines that (according to Thomas Dekker, at least) Nashe should have been plied with by his patrons. However, if your budget doesn’t quite run to sack and Rhenish, don’t worry: there’s plenty of cheaper booze to be had in Nashe’s works.


Nashe’s shopping list 1: wine and sugar

What would a 450th birthday party be without a well-stocked drinks table? Thankfully, Nashe refers to a good range of alcoholic beverages in his works…


I have a friend whose extremely generous wine-buff father caters for parties on the basis of one bottle of white wine and one bottle of red per guest – plus beer and spirits. Nashe would have appreciated this kind of largesse a great deal: Thomas Dekker imagines Nashe arriving in the underworld and complaining about ‘dry-fisted Patrons’ because ‘if they had given his Muse that cherishment which shee most worthily deserved, hee had fed to his dying day on fat Capons, burnt sack and Suger, and not so desperately have ventur’de his life, and shortend his dayes by keeping company with pickle herrings’ (Dekker, L1r).
