
Gender Guesser – Determine an author’s gender from writing in emails or articles

March 24, 2012

Gender Guesser

The words you use can disclose identifying features. This tool attempts to determine an author’s gender based on the words used. Submitted text is evaluated based on two types of writing: formal and informal. Formal writing includes fiction and non-fiction stories, articles, and news reports. Informal writing includes blog and chat-room text. (Email can be formal, informal, or some combination.) You should view the results based on the appropriate type of writing.

[Screenshot: Gender Guesser result for a male writing sample]

A few quick notes:

  • The system generates a simple estimate (profiling). While Gender Guesser may be 60% to 70% accurate, it is not 100% accurate. This is better than random guessing (50%), but the result should not be interpreted as “fact”. In particular, men should not be offended if it says they write like a girl.
  • People write differently in different forums. For example, a single writing sample may appear MALE for informal writing but test as FEMALE for formal writing. Be sure to interpret the results based on the appropriate writing style. (These notes, for example, are more informal/blog than formal/non-fiction.)
  • Many factors can impact the interpretation from any single person’s writing. The content, knowledge of the material, age of the author, nationality, experience, occupation, and education level can all impact writing styles. For example, a woman who has spent 20 years working in a male-dominated field may write like her co-workers. Similarly, professional female writers (and experienced hobbyists) frequently use male writing styles. Gender Guesser does not take any of these factors into account.
  • Email can blur the lines between formal and informal writing styles. An informal email from a manager may have traces of formality, and a formal email from a 12-year-old is likely to be informal compared to a letter from a 40-year-old. Do not be surprised if email messages sent to public forums test incorrectly: when writing for an audience, people commonly use informal words, phrases, and slang within a formal writing style.
  • Quotations, block quotes, and included text usually carry the gender of the original author. Be sure to remove quoted text from any pasted content. Also, significant changes by a copy-editor can result in a different gender analysis. (A male editor may make a female author’s news article appear MALE or Weak MALE.)
  • Lyrics, lists, poems, and prose are special writing styles. This tool is unlikely to classify these texts correctly.
  • The system needs a paragraph or two of text in order to observe word repetition. A good sample should have 300 words or more. Fewer words can lead to more variation in accuracy, and a single sentence is unlikely to generate an accurate result. Pasting the same text multiple times will not change the results!
  • People tend to write with consistent styles. If the system misclassifies a particular author, then other writings by the same author will likely be misclassified the same way.
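
The notes above do not say how Gender Guesser scores text internally. As a rough sketch only, assuming a weighted-keyword approach, the snippet below shows one way a word-based estimate could be computed, and why pasting the same text multiple times would not change the result. The function name, the word lists, the weights, and the "Weak" threshold are all made up for illustration and are not taken from the tool.

```python
import re
from collections import Counter

# Hypothetical keyword weights, invented for this sketch.
# They are NOT the word lists or weights used by Gender Guesser.
MALE_WEIGHTS = {"the": 2, "around": 5, "what": 4}
FEMALE_WEIGHTS = {"with": 5, "if": 4, "not": 3}


def guess_gender(text, weak_margin=0.1):
    """Return (label, male_share) for a text sample.

    male_share is the fraction of the total keyword score that came from
    the male-weighted words. weak_margin is an assumed threshold for
    labelling a near-even split as "Weak".
    """
    # Count every lowercase word in the sample.
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    male = sum(weight * counts[word] for word, weight in MALE_WEIGHTS.items())
    female = sum(weight * counts[word] for word, weight in FEMALE_WEIGHTS.items())
    total = male + female
    if total == 0:
        return "UNKNOWN", 0.0  # no keyword matched, nothing to score
    male_share = male / total
    # An exact tie falls to MALE here; that choice is arbitrary.
    label = "MALE" if male_share >= 0.5 else "FEMALE"
    if abs(male_share - 0.5) < weak_margin:
        label = "Weak " + label
    return label, male_share


sample = "If it is not with the others, look around."
print(guess_gender(sample))
print(guess_gender(sample + " " + sample))
```

Both calls print the same label and share: repeating the text doubles the male and female totals alike, so their ratio stays the same. A longer, more varied sample matters because it gives more distinct keyword matches, not because it is longer.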

If you’d like to try it, go to: Guess the gender author from his/her writing
