What's Newsworthy?
Jul. 11th, 2006 07:21 pm
Some statistical analysis. Articles from dozens of online news sources. Over 18,000 articles sampled over a span of two weeks or so. Strip out advertising, "related stories", and other stuff, leaving only the article content.
Ignore the following words: and, but, or, the, an, a
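The stripping and counting described above could be sketched roughly as follows. This is only an illustration, not the code actually used for the analysis: the tokenizer regex, the `word_frequencies` name, and the sample sentences are all assumptions made up for the example.

```python
import re
from collections import Counter

# The words the post says to ignore.
IGNORED = {"and", "but", "or", "the", "an", "a"}

def word_frequencies(articles):
    """Count lowercase words across article texts, skipping IGNORED."""
    counts = Counter()
    for text in articles:
        # Crude tokenizer (an assumption): runs of letters, allowing
        # apostrophes inside a word, e.g. "don't".
        words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
        counts.update(w for w in words if w not in IGNORED)
    return counts

# Made-up sample text, not taken from the sampled articles.
sample = [
    "Police said he has no previous record, and his lawyer agreed.",
    "She said the car was hers, but he said it was his.",
]
freqs = word_frequencies(sample)
for word in ("said", "he", "his", "she", "her"):
    print(word, freqs[word])
```

Note that a whole-word count like this treats "hers" and "her" as different words, so "her" comes out as zero for the sample.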
Here are some word frequencies:
- he: ~50,000 instances
- his: ~34,000 instances
- she: ~11,000 instances
- her: ~11,000 instances
"He" was also the second-most common word, after "said". "Their" was more common than "she".
(no subject)
Date: 2006-07-12 12:25 am (UTC)
thanks.
(no subject)
Date: 2006-07-12 01:10 am (UTC)
It's worth noting that the parallel structures are used as follows:
1) Police say (he/she) has no previous criminal record.
2) Officers arrested (him/her) two days later.
3) Both of (his/her) cars were stolen.
4) All of the dogs were (his/hers).
Both "his" and "her" pull double duty: "his" covers uses 3 and 4, while "her" covers uses 2 and 3. Single-word searches can't really break those down to see whether the discrepancy still holds for every use.
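To make the point concrete, here is a rough sketch that compares the ratios using the approximate counts quoted in the post. The figures are rounded, so the ratios are only ballpark values, and the variable names are just for illustration.

```python
# Approximate counts quoted in the post (rounded).
counts = {"he": 50_000, "his": 34_000, "she": 11_000, "her": 11_000}

# Subject pronouns are a like-for-like comparison (use 1).
subject_ratio = counts["he"] / counts["she"]

# "his" vs. "her" is not like-for-like: "her" also includes object uses
# (use 2), which on the male side would be counted under "him".
mixed_ratio = counts["his"] / counts["her"]

print(f"he/she  = {subject_ratio:.1f}")
print(f"his/her = {mixed_ratio:.1f}")
```

The second ratio mixes grammatical roles, so on its own it doesn't say how the possessive uses compare.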
(no subject)
Date: 2006-07-12 02:13 am (UTC)
And also that the statement "men commit 100% of the crimes" is either wrong, or the women who get charged with crimes are innocent, or they are actually men in disguise.
(no subject)
Date: 2006-07-12 02:29 am (UTC)
And my community is one in which women have approached parity in the political, corporate, and sporting worlds to the best of my perception. But our news seems to be about what scares and angers us, and men seem to dominate that aspect of our perception of reality.
(no subject)
Date: 2006-07-12 03:05 am (UTC)
All of this is making me curious about your data. I have full faith in your methodology, but I wonder what percentage of news stories was about international diplomacy vs. local crime news vs. parliamentary wrangling vs. whatever else, and how the pronouns settle out in the midst of those subcategories.