Earlier I posted a map of Wikipedia contributors comparing the most active users with general Wikipedia users. In his essay Who Writes Wikipedia, Wikipedian Aaron Swartz proposes that the contributions of the “core” group of Wikipedia contributors are significantly overstated. He also suggests that focusing on this small but important portion of the Wikipedia community, to the exclusion of the broader group of contributors, is a big mistake.

If this is true, then my Wikipedia Contributor Map oversimplifies what is happening (as was pointed out in a reader comment).

Aaron Swartz wrote a script to analyze the entire history of a group of about 200 articles and concluded that while the majority of edits are made by the “core group”, the bulk of the words in the articles was written by a much broader group of contributors who “…generally had made less than 50 edits…”.

Some quotes:

When you put it all together, the story become clear: an outsider makes one edit to add a chunk of information, then insiders make several edits tweaking and reformatting it. In addition, insiders rack up thousands of edits doing things like changing the name of a category across the entire site — the kind of thing only insiders deeply care about. As a result, insiders account for the vast majority of the edits. But it’s the outsiders who provide nearly all of the content.

And when you think about it, this makes perfect sense. Writing an encyclopedia is hard. To do anywhere near a decent job, you have to know a great deal of information about an incredibly wide variety of subjects. Writing so much text is difficult, but doing all the background research seems impossible.

Michael Mace, in his post Why is Apple porting its browser to Windows?, predicts a coming battle between companies vying for control of the rich browser interface. He speculates that Apple is porting its Safari web browser to Windows as a kind of Trojan horse in order to get the full Apple OS layer onto Windows systems.

…The war to come. This could set up a brutal competition in software layers, between Adobe Apollo, Microsoft Silverlight, Sun’s revised Java, Firefox’s platform, and Apple. Google fits in there somewhere as well, but it’s not clear if they’ll try to create their own platform or work with several other players…

Adobe is certainly in the running. Adobe already benefits from a huge adoption rate of the Flash player, and a lot of new web applications are built to run in Flash. The other day, I got a demo from a friend who is building an application for a new web startup. He built it using OpenLaszlo, an open source development framework for Adobe Flash and native web browser Dynamic HTML. It is clean and snappy and has great interactive animations.

On the other hand, I know others who are building applications exclusively with JavaScript / DHTML, using libraries like and Prototype, and who intend to stay away from Flash or other “lock-in” frameworks.

I don’t like the idea of one company dominating – I believe it will be far better for everyone if none of the commercial vendors win this war.

The other night my nephew told me he doesn’t contribute to Wikipedia much. He feels that all of the good articles are taken and he doesn’t want to waste his time editing an article only to have an editor or administrator revert his work because they jealously guard that territory already. I wonder if this is a widespread feeling?

In the life sciences, the “carrying capacity” of a species is the population that an environment can support without significant negative impacts on the species and its environment. A common example is the white-tailed deer population in the United States. In wild areas, the normal predator-prey interaction keeps deer populations in balance. In areas where people have removed predators, deer populations can exceed the capacity of the local environment to the point where deer starve. Conversely, when the population drops below a certain point, it is unable to sustain itself and disappears. Of course, there’s more to it than that. Modeling populations of organisms is a popular and notoriously complex subject of systems theory.

Perhaps certain collaboration “environments” also have a carrying capacity for contributors. This seems especially applicable to collaborations like a Wikipedia article where many people are contributing to a finite set of tasks (as opposed to a social network where there are as many tasks as there are people). If this is so, then there is a threshold beyond which every contributor you add to an Article actually has a negative impact on the Article’s community. Likewise, when the population of contributors drops below a certain threshold, the health of the Article’s community suffers. Just as with animal populations, modeling this effect would be a complex task, since the “population” of contributors is very dynamic, as is the “environment” in which the contributors operate.
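To make the two thresholds concrete, here is a minimal sketch of one textbook way to model them: logistic growth with a lower viability threshold (often called a strong Allee effect). This is only an illustration, not a validated model of deer or of Wikipedia contributors. The function name, the parameter names and every numeric value below are assumptions chosen for the example.

```python
def simulate_population(n0, r=0.5, lower_threshold=20.0, capacity=100.0,
                        dt=0.01, steps=20000):
    """Integrate dN/dt = r * N * (N / A - 1) * (1 - N / K) with Euler steps.

    A (lower_threshold) is the size below which the population shrinks,
    K (capacity) is the carrying capacity. All values are illustrative.
    """
    n = float(n0)
    for _ in range(steps):
        # Growth is negative below A, positive between A and K,
        # and negative again above K.
        growth = r * n * (n / lower_threshold - 1.0) * (1.0 - n / capacity)
        n = max(n + growth * dt, 0.0)  # a population cannot go negative
    return n


if __name__ == "__main__":
    for start in (10, 30, 150):
        print(start, "->", round(simulate_population(start), 2))
```

With these made-up numbers, a group that starts below the lower threshold dwindles toward zero, while a group that starts anywhere above it settles at the carrying capacity, whether it began smaller or larger than that capacity.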

[Update 8-6-07: I've updated the Wikipedia contributor map based on my recent discoveries. Please see this post for a better contributor map.]

Here are some interesting factoids culled from Wikipedia contributor statistics.

Compare the population of world countries to the Wikipedia contributors. In the hierarchy of users, the vast majority of visitors to Wikipedia, 48 million of them, are readers; for the most part they don’t edit articles. Next are the regular contributors who contribute between 5 and 100 times per month. There are about 77,000 of those. Finally, there are the 10,000 anchor contributors (I’ve borrowed this phrase from retail marketing) who contribute more than 100 times per month.

[Image: wikipedia-contributor-math.png]

So if Wikipedia readers are like China, then the regular contributors are like Macedonia and the anchor contributors are like Barbados. To extend this analogy to absurd extremes, Barbados and Macedonia do all of the work, have the highest GDP and provide humanitarian aid to China!
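The arithmetic behind the analogy can be sketched as follows. The reader and contributor counts are the ones quoted above; the figure of roughly 1.3 billion for China's population is my assumption for 2007, and the function name is made up for this example.

```python
READERS = 48_000_000   # Wikipedia readers, from the statistics above
REGULAR = 77_000       # contributors with 5 to 100 edits per month
ANCHOR = 10_000        # contributors with more than 100 edits per month

# Assumption: China's population in 2007, rounded to 1.3 billion.
CHINA_POPULATION = 1_300_000_000


def scale_to_country(group_size, readers=READERS,
                     country_population=CHINA_POPULATION):
    """Scale a contributor group so that all readers map onto one country."""
    return group_size / readers * country_population


if __name__ == "__main__":
    print(f"regular contributors ~ {scale_to_country(REGULAR):,.0f} people")
    print(f"anchor contributors  ~ {scale_to_country(ANCHOR):,.0f} people")
    # prints:
    # regular contributors ~ 2,085,417 people
    # anchor contributors  ~ 270,833 people
```

About two million people is close to the population of Macedonia, and a bit under 300,000 is close to that of Barbados, which is where the comparison comes from.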

[Update 7-16-07: Here's the spreadsheet I used to calculate this data. The inspiration to make this map came from the Strange Maps blog.]

The Google Earth Blog recently mentioned an article by Michael Jones, Chief Technologist of Google Earth, in the IEEE “Computer Graphics and Applications” magazine. The article can be downloaded here.

Michael quotes from Rudyard Kipling:

I Keep six honest serving-men:
(They taught me all I knew);
Their names are What and Where and When
And How and Why and Who.

The rest of the article is devoted to Google’s vision of “Where”. But I think there is also a hidden meaning in that particular analogy. The poem is from The Elephant’s Child in Just So Stories, an allegorical children’s tale about the dangers and rewards of ’satiable curiosity (Kipling’s words). Here’s the full text:

I Keep six honest serving-men:
(They taught me all I knew)
Their names are What and Where and When
And How and Why and Who.
I send them over land and sea,
I send them east and west;
But after they have worked for me,
I give them all a rest.

I let them rest from nine till five.
For I am busy then,
As well as breakfast, lunch, and tea,
For they are hungry men:
But different folk have different views:
I know a person small —
She keeps ten million serving-men,
Who get no rest at all!
She sends ‘em abroad on her own affairs,
From the second she opens her eyes —
One million Hows, two million Wheres,
And seven million Whys!

The person small in this case was Rudyard Kipling’s daughter, but we could easily substitute “company large and ambitious” in its place. Perhaps ’satiable curiosity is at the heart of Google’s success.

The Wikipedia article on Neutral Point of View is an official policy statement, but it is not the kind of “policy” that is typically spewed by bureaucratic IT departments, corporate HR groups or local politicians. I find it inspiring.

NPOV policy is summarized as: “All Wikipedia articles and other encyclopedic content must be written from a neutral point of view, representing views fairly, proportionately and without bias.”

The reasoning behind the policy is beautifully written and thoroughly argued. Here are some of my favorite passages:

…A solution is that we accept, for the purposes of working on Wikipedia, that “human knowledge” includes all different significant theories on all different topics. We are committed to the goal of representing human knowledge in that sense, surely a well-established meaning of the word “knowledge”. What is “known” changes constantly with the passage of time, and so when we use the word “know,” we often enclose it in so-called scare quotes. Europeans in the Middle Ages “knew” that demons caused diseases; we now “know” otherwise….

…There is another reason to commit ourselves to this policy, that when it is clear to readers that we do not expect them to adopt any particular opinion, this leaves them free to make up their minds for themselves, thus encouraging intellectual independence. Totalitarian governments and dogmatic institutions everywhere might find reason to oppose Wikipedia, if we succeed in adhering to our non-bias policy: the presentation of many competing theories on a wide variety of subjects suggests that we, the editors of Wikipedia, trust readers to form their own opinions. Texts that present multiple viewpoints fairly, without demanding that the reader accept any particular one of them, are liberating. Neutrality subverts dogmatism. Nearly everyone working on Wikipedia can agree this is a good thing…

