
I (sometimes) write like a girl!
or, Shameless self-promotion.

It’s a fun little tool, the Gender Genie. Rich (of Brain Squeezings) took the algorithm developed by Moshe Koppel, Bar-Ilan University in Israel, and Shlomo Argamon, Illinois Institute of Technology, to predict an author’s gender, and turned it into a webapp (available here, but also here). (The algorithm itself is almost embarrassingly simple.)

Results? When I write about webcomics and cartoonists, I’m a girl. When I reminisce about college (and not so much Robyn Hitchcock), I’m a girl. When I rant, though, about war or the aftereffects of war (as for instances), I’m a guy. Unless I’m ranting about Ann Coulter, in which case I’m (just barely) a girl. But my fiction—my art; the word games nearest and dearest my heart—well, I’m a guy. Pretty much astoundingly so.

The algorithm tries (simply) to calculate the “involvedness” and “informationalness” of a text. Women, you see, write involvedly—texts that show interaction between the speaker/writer and the listener/reader; men, on the other hand, tend to indicate or specify the things they write about. (I’m not entirely certain why that’s an other hand, but I’m summarizing a paper I’ve only just skimmed, and being cheeky to boot.) The basic flags are based on statistical analyses of texts drawn from the British National Corpus—texts from the BNC have already been labeled for genre, and each word is tagged as belonging to one of their 76 recognized parts of speech. 123 male documents—excuse me, texts generated by men—and 123 texts generated by women were used; these included 179 nonfiction pieces, drawn from the realms of natural science, applied science, social science, world affairs, commerce, the arts, belief/thought, and leisure. Average length was just above 42,000 words, for a total of 25 million words; no single author wrote more than six of the 246 texts.
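That embarrassing simplicity is easy enough to sketch: the simplified algorithm just sums per-word weights from a “female” list and a “male” list and declares for whichever total is bigger. A minimal sketch in Python (the word lists and weights below are illustrative stand-ins, not the Genie’s actual numbers):

```python
import re

# Illustrative weights only -- NOT the Genie's real lists. The actual
# algorithm works the same way: each keyword carries a weight, the
# weighted counts are summed, and the larger total wins.
FEMALE_WEIGHTS = {"with": 52, "if": 47, "not": 27, "where": 18, "be": 17,
                  "when": 17, "your": 17, "her": 9, "we": 8, "she": 6}
MALE_WEIGHTS = {"what": 35, "more": 34, "are": 28, "as": 23, "who": 19,
                "is": 8, "the": 7, "a": 6, "it": 6, "said": 5}

def genie_guess(text):
    """Return ('female'|'male', female_score, male_score) for a text."""
    words = re.findall(r"[a-z']+", text.lower())
    f = sum(FEMALE_WEIGHTS.get(w, 0) for w in words)
    m = sum(MALE_WEIGHTS.get(w, 0) for w in words)
    return ("female" if f > m else "male"), f, m
```

Feed it a paragraph and it hands back a verdict; no grammar, no context, no intent—just weighted keyword counting, which is rather the point of what follows.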

[Ed. note— My summary of the number of documents chosen is staggeringly wrong, as anyone who paused and took up a calculator could easily see. Please open the comments thread for further discussion by those more numerate than myself. The theory that follows, then, does not obtain as a criticism of the assumptions underlying the algorithm, which nonetheless continues not to live up to projections. Ah, well.]

In other words, I don’t doubt the analysis Koppel and Argamon performed is an accurate enough description of 25 million words of British English as it was used in the 20th century—reflecting the broad usage patterns of male and female speakers and writers. —But you’d think maybe something a little less narrowly focussed might be studied before proclaiming it a universal prescriptor. Eh?

There’s also the fact that correlations seem to be ignored utterly. It’s gender that’s the determinant, not the intended audience, not the school of writing, not the function to which it will be put. As a for instance: presume that texts written in the fields of natural science, applied science and commerce all require a higher degree than average of specificity, indicativeness, informativity. (A safe enough presumption.) Further presume, as one notes that the texts are drawn from those written or spoken in British English in the 20th century, that the sometimes extreme gender prejudice of that benighted age has resulted in the majority of those more specific, indicative, informational texts having been written by men—because women were disproportionately denied opportunities to advance in the fields of natural science, applied science, or commerce; their informativity isn’t represented in the sample not because they were women, but because they were Shakespeare’s sisters. —It’s far from settled, but that no attempt is made to correct for this sort of bias makes the prescriptive power of the algorithm and its underlying assumptions highly suspect. When coupled with the relatively tiny, focussed sample, it’s pretty much useless.

After all, my pieces on webcomics are about groups and relationships and schools of cartoonists, and so, involved; the bit on college is a memoir, and so personal, and so vague and unspecific and relational; political rants need to be specific and, one would hope, full of informativity (unless, it seems, they’re about Ann Coulter); and my fiction—at least, the two pieces cited, which, though one is first person and one is third person, both try for a specific, declarative, one doesn’t want to say clear or lucid or limpid or muscular (gack) style, but—well. Fiction is fiction.

Or is what’s between our legs more important to the shapes our words might take than the purposes to which we intend to put them?

(Overall, the Gender Genie’s running 60/40 in favor of Bzzt! I’m sorry. Try again—though I hasten to point out it’s an unscientific, self-reporting survey. Additional data points: the Spouse writes like a girl—even when she’s writing about strippers. Hmm.)

  1. sara    Aug 20, 01:45 pm    #
    man, you'd think the gender genie would do something more useful, like, say, conferring one's gender of choice upon one after waving hir wand, or other less phallic object of power.

  2. marydell    Aug 20, 07:44 pm    #
    Interesting post. The algorithm does need work, since Koppel and Argamon's claim of 80% seems to be very off. I wonder what kind of texts they used for their study, since my testing (before the application went live) on general fiction came up with a dismal 50%. The genie continues to manage only 40%.

    By the way, my friend Rich of Brain Squeezings, male, wrote http://www33.brinkster.com/echoloc8/Default.asp while I, female, wrote http://www.bookblog.net/gender/genie.html. Mine is prettier, don’t you think?

  3. --k.    Aug 20, 08:15 pm    #
    Apologies for any confusion; I was trying to source the original program--Rich did write that, right? Or did I totally flub the footer?--and neglected to say more about Bookblog itself because I filed it under the "Ooh! Looks interesting!" header and then got distracted by the rest of my own post. Apologies. (But ooh! Looks interesting! --Followups to the question of gender and writing styles here and here. And indeed, yours makes more effective use of ornamental design elements to catch and hold the user's attention than his.)

    Sara: Genies use winks, nose twitches, booming commands, lamps (of course), sly smiles, and the occasional magisterial fillip. Not wands. Not so far as I know. So I don't think there's a phallic conflict to worry about, there.

  4. Charles    Aug 20, 08:54 pm    #
    Actually, the training data set for the study was controlled for sex of author. Equal numbers of works by each sex were chosen from a wide variety of genre categories:

    (Fiction: 123 male documents, 123 female documents; Nonfiction: 179 each, including Nat Science: 2 documents each; Appl. Science: 13; Soc. Science: 60; World Affairs: 34; Commerce: 4; Arts: 31; Belief/Thought: 18; Leisure: 17)

    So the biasing effect of over-representation of (for example) men in hard science is corrected for.
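    As a quick arithmetic check (assuming the 42,000-word average quoted in the post), the per-genre nonfiction counts do sum to 179 per sex, and the corrected totals land right on the 25 million words reported:

```python
# Per-genre nonfiction counts from the paper, per sex.
nonfiction = {"Nat Science": 2, "Appl Science": 13, "Soc Science": 60,
              "World Affairs": 34, "Commerce": 4, "Arts": 31,
              "Belief/Thought": 18, "Leisure": 17}
per_sex_nonfiction = sum(nonfiction.values())   # 179
total_texts = 2 * (123 + per_sex_nonfiction)    # fiction + nonfiction, both sexes
approx_words = total_texts * 42_000             # average length from the post
print(per_sex_nonfiction, total_texts, approx_words)  # 179 604 25368000
```

    604 texts at roughly 42,000 words apiece is about 25.4 million words, which squares with the paper.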

    However, the paper is disturbingly unclear on what data set was used to test the algorithm. The fact that the list of materials used http://www.ir.iit.edu/~argamon/textlist.txt does not distinguish the training set from the testing set suggests that they used the training set as the testing set, which would be abysmal practice (particularly in a case like this where there is no shortage of available data (the entire body of late 20th century writing)). This might also explain why they are batting worse than 50/50 in the online interface.

    Charles

  5. --k.    Aug 21, 04:31 am    #
    Actually, Charles, I don't see that the bias was corrected for. They picked 123 texts generated by each gender, but don't say how many of those 179 nonfiction texts were generated by men, and how many by women. There could easily have been a disproportionate number of fictional texts by women--or more texts in the fields of the Arts, Belief/Thought, or Leisure, which could all more easily be classed as "involved" than "informational." (Or so says my highly unscientific gutcheck analysis.)

    Also, Koppel and Argamon discuss the ability of the algorithm to tell the difference between fiction and nonfiction--with a 98% success rate. First, this raises my eyebrows: I imagine they have to tune the algorithm slightly differently, or else the results are highly suspect; men (whose results match nonfiction) do write fiction, after all, and women (whose results match fiction) write nonfiction. But it also proves that context and intent do matter to the eventual shape of a piece--in fact, matter more (98%) than gender (80%). (That my own fiction reads as male/nonfiction--well, I did pick two pieces that were a little funky, stylistically, but still.) --To see this, not correct for it, and loudly proclaim you've found a test for gender is, well, bunk.

    The texts, again, are from the British National Corpus, which looks like a nice, clean, controlled set of texts with a bunch of the dirty work already done for you. Convenient. --What's weird (beyond double-dipping) is the BNC is a 100 million word set, comprising 4,124 texts, each of which is a sample of up to 45,000 words in length. Koppel and Argamon used 246 texts--less than 6 percent of the total--at an average length of 42,000 words, check--for a total of 25 million words, or one quarter of the total BNC. That doesn't seem to add up, or at the least suggests some sort of selection bias for longer works. That they also had multiple texts by the same authors within that set (no more than 6 texts by any one author) raises my eyebrows pretty much off my forehead.

    But this is statistical analysis, which is far from my strong suit, and which has the nasty habit of being counter-intuitive and uncommonly sensible. Thanks, though, for peeking again at the list of texts; when I tried looking at it while scribbling this entry down, I kept getting 404s. Someone could go through and see if there really is a bias towards texts generated by women from the more "involved" genera of writing; me, I'm going to drink more coffee.

  6. julia    Aug 21, 05:00 am    #
    Apparently you write like a boy when you write about writing like a girl.

    Just saying.

  7. marydell    Aug 21, 07:13 am    #
    Oh, yes, Rich wrote the original application and deserves full credit for it. I simply moved into his house (the idea), cleaned it (threw out the nasty ASP and put in happy PHP), renovated it (added stuff like the analysis breakdown), and redecorated (made it pretty). Now I need to write an application that determines the gender of a programmer, since there's obviously a difference. :)

    I'm impressed to see that some have actually made it through K&A's paper. It made me fall asleep.

  8. Kevin Moore    Aug 21, 10:16 am    #
    But, Kip, you are a girl, so what's the problem?

  9. Charles    Aug 21, 10:58 am    #
    Kip, I think you are misreading. Quoting directly from the article:

    For each genre we used precisely the same number of male- and female -authored documents (Fiction: 123 male documents, 123 female documents; Nonfiction: 179 each, including Nat Science: 2 documents each; Appl. Science: 13; Soc. Science: 60; World Affairs: 34 Commerce : 4; Arts: 31; Belief/Thought: 18; Leisure: 17). Documents were chosen in each genre by using all available documents in the smaller (male or female) set and randomly discarding the surplus in the larger set.

    Genre here fairly clearly means the narrow categories: e.g. Nat Science. And each means the same number for each sex. The last sentence confirms this interpretation. Natural science is under-represented in the study due to the under-representation of women in the natural sciences in the Corpus.

    The reason for not using the entire corpus is to allow testing of the hypothesis on unused data. D squared had an excellent essay on the danger of using these sorts of data mining techniques on entire data sets a while back (too lazy to hunt it down). Basically, coincidental correlations will occur quite frequently, and the way you recognize coincidences is to test the correlations that you stumble upon with new data. The bias towards long texts that you noticed does seem to be real, but I am not sure what significance it might have. It seems quite possible that it might have a beneficial influence in terms of smoothing out non-gender influences in the text.
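    That discipline is easy to sketch (the function name and split ratio here are just illustrative, not taken from the paper): fit your weights on one shuffled slice of the corpus, then score only the slice the model never saw.

```python
import random

def holdout_split(items, test_frac=0.2, seed=0):
    """Shuffle a dataset and split it so accuracy is measured on unseen data."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * (1 - test_frac))
    return items[:cut], items[cut:]

# Toy stand-in for a corpus of (document, author-sex) pairs.
docs = [(f"doc{i}", "male" if i % 2 else "female") for i in range(100)]
train, test = holdout_split(docs)
# Tune weights on `train` only; report accuracy on `test` only. Scoring
# on `train` itself is exactly the double-dipping worried about above.
print(len(train), len(test))  # 80 20
```

    Coincidental correlations will fit the training slice beautifully and then fall apart on the held-out slice, which is how you catch them.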

    I am honestly not sure how they are handling the fiction/non-fiction distinction, particularly since they say it aligns very strongly with the male/female distinction.

    Charles

  10. --k.    Aug 21, 11:32 am    #
    Eh heh. Whoops. I'm not at all certain how I missed the "each" in there, or the following sentence. (I was as close to nodding off as you were, marydell. Dangerous, when one then hares off to pontificate.) Well, gosh, to quote the Mayor.

    So my own hypothesis to account for the discrepancy is obliterated. I still maintain that noting a discrepancy in even a broad difference of intent (fiction v. nonfiction) makes the assertion that the discrepancy is due to gender at best shaky. --Any test for gender bias in this regard is going to be inextricably tied to a matrix of expectation and prejudice peculiar to the age and place when and where the texts were generated (and when and where they are studied, but); again, I don't doubt they've accurately measured certain characteristics inherent to British English texts from roughly 1974-1994. They are apparently working on expanding their data, to see if this bias extends into the past (a quip seen somewhere about whether George Eliot can continue to pretend to have been a man, which made little sense), but I'll bet the algorithm starts dissolving into static. --I rest assured that the Gender Genie tests so far don't live up to their claimed results, so ha! Battered, but unbowed, I sit me down.

    And I do remember that D^2 post; the mechanism whereby one could run regressions over and over on the same dataset until one got the results one wanted mystified the heck out of me. See above, re: counterintuitiveness of statistical analysis, and my own cheerful haplessness in the face thereof. Yet another good reason to sit me down. And so.

  11. Population: One    Aug 21, 12:27 pm    #
    Ya got breasts
    To my vast amusement, the Koppel-Argamon Gender Predictor believes Reese is female. Seriously! There’s some sort of gender-determining algorithm which...

Commenting is closed for this article.