Note: Transcriptions are usually slower in coming than the podcast itself. Please be patient. Previous episode transcripts are archived just below, and further down is the current episode's transcript. Help create the transcripts by using an application such as Express Scribe.
The following is complete; please copyedit or refine it, or help with other current transcriptions.
Tawker: This is Wikipedia Weekly, episode 7, for the week of November 27th, 2006.
Fuzheado: Welcome to another episode of Wikipedia Weekly. I'm your host, Andrew Lih, also known as User:Fuzheado on the English Wikipedia. Well, last week we got hooked onto the big panel using Skype, so this week we're going for even more with eight folks from around the world. So from Vancouver, Canada, we have user:Tawker,
Tawker: Hi there.
Fuzheado: From the Wikimedia Foundation homeland in Florida, we've got Danny, aka Danny Wool.
Danny Wool: Hi.
Fuzheado: And from New Jersey we've got user:Messedrocker.
MessedRocker: Good morning, afternoon, evening and night.
Fuzheado: And from the state of Illinois we've got Kelly Martin.
Kelly Martin: Hello out there.
Fuzheado: And from Wisconsin we've got user 1ne, or Sushigeek.
1ne: Hello.
Fuzheado: And this week we're happy to welcome back our Australian, Daveydweeb.
So this week we had three interesting milestones across the Wikipedias. On Wikipedia English we reached 1.5 million articles, at roughly the same time we had German, which reached 500,000 articles, and French which had 400,000 articles. MessedRocker, you know what the 1.5 millionth article is, don't you?
MessedRocker: Yes, the 1.5 millionth article is the Kanab Ambersnail. The article is about a type of endangered snail which lives in the Grand Canyon, and it's a pretty good article so far. It's a few paragraphs, there's a nice reference section going in, some really vivid photographs. All in all it's a pretty good article so far. I have a feeling that because of its popularity as being the one-point-five millionth article, it's going to get a lot more attention than normal. This could also be relevant to WikiSpecies, which is an attempt on making a directory of all the types of species of life in the world as we know it.
1ne: I have a question. Did Jordanhill railway station, the one millionth article, did it get attention for being the millionth article?
MessedRocker: Well yeah, it's received a lot of attention. In fact, if you go to the article, it's pretty comprehensive for an article about a railway station, and I believe it's because it's gotten so much attention for being article one million.
Kelly Martin: I believe they even put a plaque up at the railway station, acknowledging its status as the one millionth article, although that may only have been proposed.
1ne: Maybe the snail article could revive interest in WikiSpecies, because... as of a few months ago, Wikispecies didn't really have activity. So, maybe it could revive it, I don't know.
Daveydweeb: Well it has been receiving a lot of attention, this article. In the last three days it's had 112 edits made to it, unfortunately half of those were vandalism. And the other half were vandalism reverts.
Tawker: And then the whole possibility of protection, and needless to say it's-
Fuzheado: Oh, it's semi-protected now?
Tawker: Yeah, I protected it and then I got about five complaints about how we shouldn't protect anything linked from the main page, even if it's a vandal magnet.
Fuzheado: Well, we'll talk later about protection and how much protection there should be. But it was interesting, in another thing related to Wikipedia articles making it to the real world.
Daveydweeb: when the article about Belgrade in Serbia made it to featured article, and supposedly the mayor of Belgrade mentioned this in his press conference, that Wikipedia selected Belgrade as a featured article. It's interesting how Wikipedia articles being selected for something or making a milestone has suddenly become this big thing to boast about in the real world.
Tawker: Apparently the one thing, I think it was Danny brought up, was that Wikipedia's article on podcasting was actually used in the United States Patent and Trademark Office rejection of the word "Podcast" as a trademark.
Fuzheado: Really?
Tawker: Yeah, it was actually attached -- Wikipedia's entire article was attached to the document, they got a printout of Wikipedia's article.
Danny Wool: One of the things that's interesting about this article is that molluscs and snails are something we're pretty weak in at Wikipedia. Looking at the gastropod article, it said there's sixty thousand to seventy thousand known living species of gastropods, but we only categorised one hundred and forty-six. So hopefully this article can get us moving to write about all these other species, which really deserve mention in an encyclopedia.
Fuzheado: So what do you folks think? I know that Jimbo was emphasising quality over quantity.
Kelly Martin: We definitely have a problem with people adding a lot of content that we really, probably don't need. And in this case it's just pointless trivia, but we also talked a little while ago about the search engine optimisers who are adding content that not only do we not need, but is being added for purposes that aren't really in the encyclopedia's interests in any way.
Fuzheado: Yeah. Kelly, there's an old Chinese saying that predates even Wikipedia and the internet-
Daveydweeb: Wow. That's old.
Fuzheado: -that says, "Don't add legs when painting a snake." And I think that is something that is so apt for Wikipedia, that we've got a lot of... I wouldn't call them finished drawings, but we've got a lot of snakes, and we don't need to add all these crazy little trivia things to ruin the article, right? So I think that's a big challenge of Wikipedia, that you really do have to change your policies as we start to reach different levels of maturity.
Daveydweeb: That's going to be especially difficult, because as you start to work on pruning articles to make sure that we only have what we need, you risk crossing the line to becoming really hardcore deletionist, which is potentially a problem -- especially when new users and anonymous users add content to Wikipedia which is immediately removed. So it introduces the whole extra problem of, at what point do you become excessively deletionist?
Kelly Martin: One of the things is that we have another problem with recognising when an article is mature, and it's when an article covers most of the facts that need to be represented about the topic. It really comes up in biographies -- it comes up in other topics too -- but especially biographies of persons who are no longer living. I mean living persons have all sorts of biographical issues, not the least of which is the fact that they're still alive and therefore things about them change, but things change about dead people as new facts are discovered. It's a slower process.
But when articles mature and most of what needs to be said has been said, what often ends up happening over time is, people who go to the article and feel "I should change something" and they get this urge to do something and they add some trivia because everything else has been talked about, so they stick in the trivia. And we've seen articles, and I think this happens to JFK, where most of the article wasn't changing but there was this slow accretion over time of undifferentiated facts, no organisation to them and it read like people were just throwing... like they had darts with data on them, and they were just throwing them at the article and seeing where they stick.
1ne: What were they doing?
Kelly Martin: This has happened to so many articles that we've seen, where the article really is mature, and...
Fuzheado: Something related to that this week was that there was a massive unblocking spree, I wouldn't call it a spree, it was an admin who said "we shouldn't have this many protected or semiprotected articles," you know - User:Steel359, not to single one person out, a lot of people feel this way - just blanket-unprotected dozens and dozens of articles without even looking at why they were protected in the first place. And you know, some of these articles... anyone who's done RC patrol knows you should not be unprotecting articles like asshole or anus or poop, these are like magnets for vandalism, and there's not much to be added to these articles, right?
It just kind of drove a lot of people nuts that they had to go reprotect them hours later, because someone just went on this Wiki-fundamentalist streak of, "we shouldn't have protected articles, so we need to unprotect whatever we can." There's a lot of use for semiprotection, in other words not allowing anonymous or new users to edit them, and semiprotection might be used for more than just streaks of vandalism, it might be useful for other things.
1ne: Speaking of putting too much trivia in articles, there's an article on sea urchins and back when I had a school project on it, I started to do some vandalism patrolling and this wasn't vandalism, but I decided to do some patrolling on the article. And somebody added some trivia to it that said a way to take away the effects of a sea urchin's pricks on your skin is to pee on it, and I thought, "what the heck?" So I removed it, and it turns out that it's a legitimate way to get rid of the pricks, but it's too close to offering medical information so you had to remove it anyway because right there, you know you're kind of getting into a grey area if the trivia goes a bit too far.
Fuzheado: Right, right. Danny, you must get a lot of these type of queries, too, right?
Danny Wool: All the time, all the time. But I think one of the things about it is that, it's really a relevant discussion for this week, because we've just hit a million and a half articles and we're still tending to focus on the idea of increasing the number of articles; everyone wants to create a new article. And the real challenge should be to shift the focus, instead of stating on the front page that we have a million and a half articles, change the focus to saying featured articles, or ten thousand featured articles, and set goals on that level. Instead of putting it on to how many new articles we have, or the number count of articles - how many of those articles are great?
We'll see the articles really accumulate then, because people will start to work on improving them, rather than just adding stuff.
Fuzheado: Danny, do you know anything more about the German milestone of 500,000 articles, or the French milestones?
Danny Wool: No, I don't, actually.
Fuzheado: I'm trying to think of what other milestones...
Danny Wool: There's also the Commons milestone.
Fuzheado: Oh, what is the Commons milestone?
Danny Wool: We're about to reach a million images.
Fuzheado: A million images? Nice.
1ne: Wow.
Danny Wool: But we haven't hit a million yet on Commons, but it's 999,406 and counting. So, it's really increasing. It's becoming a fabulous repository.
Kelly Martin: The only question I have now is, how many times are we going to hit a million? Because how many of those are copyvio? I've been talking to Bastique, who is one of the admins and bureaucrats on Commons and is a very active member and very fine Wikimedian, he says he has a lot of concerns about copyright infringement on Commons and we obviously have to be much more aggressive in enforcing our copyright use policies there.
Daveydweeb: They've also chosen a very interesting way to celebrate those million images. At the English Wikipedia, all we did for 1.5 million articles was to put a thing on the main page to say "Congratulations, 1.5 million articles." At Commons, a large number of users are working to put together a collage of thousands and thousands of images to reproduce the Commons logo... or is it the Wikimedia Foundation logo?
Kelly Martin: It's the Wikimedia Foundation logo, and I have seen it, and there's a couple of technical glitches to work out. We think we might have actually discovered a bug in MediaWiki, because that page requires the generation of many, many, many thumbnails, and MediaWiki doesn't seem to handle that very well. So there's the possibility that won't happen, because MediaWiki tries to generate...
Fuzheado: They made this collage within Wikimarkup?
Kelly Martin: Yes.
Fuzheado: Oh, that's...
Kelly Martin: Rather than generating a collage by combining images, there is a document which has a hundred or thousands of image tags for appropriate size codes, and yes it does give MediaWiki quite the hernia.
Fuzheado: So this is one of the photomosaic things?
Kelly Martin: Yes, and it's really quite impressive. It started out with just a handful of images, and they've been slowly subbing out images for other ones so that each image is unique. And I saw a version in progress the other day, and I was quite impressed.
Fuzheado: But you blew up MediaWiki in the process?
Kelly Martin: I have yet to see a proper render of it...
Daveydweeb: Is this the MediaWiki equivalent of getting really drunk for your birthday and having a huge hangover?
Kelly Martin: Yes! I'm sure Brion is going to kill all of us for even talking about it.
Fuzheado: Well, moving to other projects that we've been talking about.
Tawker: Citizendium. Citizendium has been chugging along, has reached 300 users, and I believe several folks on this podcast are actually in the pilot program.
Daveydweeb: Yeah, Citizendium - the Wikipedia alternative founded by Larry Sanger, with whom we spoke in our podcast special a few weeks ago - is reportedly, in his words, "chugging along". When we spoke with Larry, the website had just reached 80 live articles, ones that had been substantially changed from their Wikipedia counterparts, and 230 usernames, not all of which were active. Since then, Citizendium has lowered the bar for new users and no longer demands that authors provide a link to their CV or resume in order to join. As a result, the site now boasts over 300 active usernames, including a number of our own panelists here at Wikipedia Weekly, and 300 live articles. In just a few weeks, they've grown substantially, and the rate of editing is increasing quite quickly.
I guess one of the reasons - apart from lowering the bar for new user applications - for this increasing growth is that Sanger recently created what he calls "Discipline Workgroups", which Wikipedians may know also as WikiProjects. These groups are more formal than their Wikipedia equivalent, they have a really rigid structure common to all Workgroups, and just like the rest of Citizendium, members of them are split between authors and editors. And as a result, Citizendium is starting to move in a more structured way towards producing and changing its own unique content.
They've started slowly, though: I'm a member of the Games Workgroup, which at the moment has just three members, and 77 articles for us to work on so far.
Fuzheado: Davey, talk about the Workgroup that you're involved with. There's three folks...
Daveydweeb: Well, Workgroups are mainly focused on the same kinds of things that WikiProjects at Wikipedia do. They collect a large number of articles that fall under their own particular discipline and work to improve them. There are some differences, though. One difference is that unlike the WikiProjects at Wikipedia, none of the Discipline Workgroups rate their articles in the same way. For instance, WikiProject Computer and video games, which I am also a member of at Wikipedia, rates its articles according to the Wikipedia 1.0 standards, so Featured Article, A-class, Good Article, B-class and so on and so forth.
Citizendium doesn't do that: the articles are just articles, and we have to work towards them. There are core articles for each Workgroup, but they're not rated by quality. The other difference would be that unlike Wikipedia, Citizendium isn't live yet. It's not open to the public, so users have the freedom to make major changes to Citizendium articles, which aren't necessarily improving them each time, but working towards a totally different article without having to worry about people seeing it in the meantime. There's no need for sandboxes at Citizendium.
1ne: There isn't vandalism yet?
Daveydweeb: There's no vandalism that I've seen, because people have to sign up with their first name and last name, and provide real email addresses.
1ne: Yeah, unless you want to be really daring.
Daveydweeb: Well, if you want to be a very short lived vandal.
Fuzheado: Yes, well. If you look at the RC, the Recent Changes list, it's pretty fascinating seeing only proper names, you don't see nicknames or any funky...
Kelly Martin: Can't Sleep, Clown Will Eat Me won't be showing up there any time soon.
Fuzheado: So, it's quite interesting seeing that, it's kind of like some cognitive dissonance. You're used to seeing the Wikipedia recent changes, then you see the Citizendium recent changes, it's very different.
Daveydweeb: There's a downside to that too, though, because although looking at recent changes reveals hundreds and hundreds of people using their real names - which is a nice thing - on the other hand, what you also see is very little being done to actually improve articles. Citizendium has a huge amount of work before it, and unfortunately most of the work being done at the moment is just setting up the rest of the work that they're eventually going to have to do as well. So, categorising articles, adding tags to them, that kind of thing - not major changes to content.
I think that's part of the reason why so few articles at Citizendium are actually live at the moment, because there's all this maintenance being done.
Fuzheado: I know that when I last looked at it, the pictures were all broken and the categories were a little broken. What is the status at this point?
Daveydweeb: At the moment, those images are still broken. Also, at Wikipedia every article has links to different language versions of the same article. At Citizendium those links still exist, but because there's only one language at Citizendium, they're also broken. There's a lot of red text at Citizendium at the moment, and I'm not even entirely sure what the policy on that is at the moment; if we're meant to remove those images, or if we're meant to remove those interwiki links. Nothing's been explained in any way that I really understand with that, and I think that's the kind of thing that we're going to have to work on at some point in the near future.
Kelly Martin: I have a couple of questions on that. Does Citizendium plan to offer other language editions besides English?
Daveydweeb: At the moment, as far as I can tell, that kind of thing would be a bad idea for Citizendium. Its real challenge at the moment would be to differentiate itself as much from Wikipedia as possible. If it wants to survive, I guess it's very important for Citizendium to look different from Wikipedia and offer something different. So it seems wise at the moment just to focus on an English-language version. I haven't heard anything from Larry on, you know, creating a French or German Citizendium, but I think that might happen if English is successful and works out well.
Kelly Martin: They have a lot of work on their hands, it's... I know I'd be the kind of person who'd go, "Oh this is too much work, I'm going to do something else", and I hope Larry and the other people who are trying to make this happen are a bit more stick-to-it with this than I am.
Daveydweeb: I would liken it to trying to raise ten thousand articles to Featured Article status at Wikipedia, for every member of Citizendium. It's an enormous amount of work, and you really have to improve the quality of articles. It's a very difficult thing to do, especially with such a small user-base because then there's no effective means of review, and so on and so forth. So yes, it's an enormous amount of work.
1ne: Two things that I was going to say about Citizendium: First off, yeah, you can't expect the community for Citizendium to grow as big as it did for Wikipedia because we got a lot bigger after the Jimbo vs Siegenthaler interview thing. And then, you know, my other question was: So if Citizendium's trying not to be like the Wikimedia Foundation, we're not going to see any sub-projects like Citizendium Species, or, Citizendium Incubator, we're not going to see anything like that?
Daveydweeb: Well, for the first one, I'd say you're absolutely on the mark there. Citizendium will grow slowly of course, it's not going to spike like Wikipedia did, until it reaches a public release. The problem being, it first has to be good enough to reach a public release, so all it has right now is a small dedicated core of users. On the other hand though, it's doing well: in the first month of Wikipedia's existence, from 10 January to 10 February 2001, there was only something like 250 edits made to the entire of the English Wikipedia. Some users make that many edits every day at Wikipedia, I made that many in the last fortnight. And at Citizendium the same applies: some people make a lot of edits there, so at the moment its rate of improvement is probably faster, much faster than Wikipedia when it first launched, but at the same time it has a much greater workload.
For the second question, I'd say Larry Sanger seems very keen on producing a lot of the same kind of projects that Wikipedia has. It's already got a rough equivalent to WikiProjects: Discipline Workgroups look very structured, organised, and when they grow to the correct size, effective. But obviously we won't see things like the Incubator being copied at any point in the near future, because there's no utility for it at the moment.
Wow, you guys are interviewing me, that's awesome...
Fuzheado: Yeah, we are. Well, we should keep our eye on Citizendium, I mean, there's a lot of interesting things we could learn from their processes.
The results have been announced for the most recent Danny's Contest, which has been sponsored by Danny Wool. This is the third such contest, and the field of military history proved to be one of the most fertile places where the winning articles came from.
1ne: Yeah, I noticed there's an article, Crawford Expedition, that went from one paragraph to a huge article. I'm pretty amazed at how that grew. I'm, I'm... I'm, I just can't explain it, but it turned from a paragraph into a sourced page, with lots of info and pictures and details and I'm just really impressed with how this contest turned out.
Fuzheado: Hey Danny, maybe you could explain quickly what your contest is about, and how you got the idea for it.
Danny Wool: Okay, this is the third contest. I did the first two a couple of years ago, and I decided to raise the whole idea of having the contest again, hopefully to get people to focus more on improving articles. I was a little sad that we didn't have quite as many people competing for this as we had last time, hopefully that will improve in the fourth contest.
But the results of the contest were great: the articles were fabulous, SushiGeek [1ne] mentioned Crawford Expedition, which was wonderful, but there were also so many other wonderful articles, including some of the ones that didn't win. I think, Andrew you can vouch for the judges that really had a tough choice to make there, and in the end I decided to give out three prizes instead of just one. I'm hoping to do this again, I want to do it again starting, actually, next week. So, wait for the fourth contest, which will be focused on integrating Wikipedia with other Wiki-projects, so taking an article, improving it, enhancing it, expanding it, sourcing it, but also using other Wiki-projects such as Wikibooks, Wikispecies, Wikisource or Commons, in innovative ways to improve the content.
Kelly Martin: It's definitely the kind of thing that helps to spur a lot of production. I've actually helped participate in Danny's Contest, there was an article on polar expedition, which was brought to my attention at the beginning of the contest, which had almost no content whatsoever. So I did what I've done on other occasions and I grabbed my 1966 New World Encyclopedia and opened it up, and then went to the appropriate articles in there and looked for content that I could pull in. Without of course violating copyright, because there's this little thing that you can't copy content straight across, but you can certainly draw content and I've done that to create articles in the past.
Fuzheado: I have to take some of the blame for the contest taking so long - because the articles were so good and so extensive, I didn't have the time to read over all of them - because I was part of the judging of this thing. But I'm glad that you guys chose three in the end, because they're all great, great examples.
Hey, Danny, let me ask you a question while we have the time: the polar expedition article that Kelly talked about, we talked about that probably a few episodes ago. Whatever became of that? Because I know there was a reporter who was going to look at that for an example of an article.
Danny Wool: He actually never did look at it, but the article really improved. Basically, just to correct a little bit of what Kelly said, I actually started with the polar expedition article before the contest. The idea was that, with the million and four articles that we had, there was no article on polar expedition. So we started with a sentence, and a group of people on IRC got together, created a channel called Wikipedia Spotlight, and started working together to see how they could expand that article.
The article still needs a lot of work. I think it's a great article now which is really going in the right direction, but there's still so much to be added to it. But it gave a really exciting overview of all of arctic expedition from ancient times until today. So it's really heading in the right direction, and what was really nice about it was the idea that you had so many different people working on it together, so it was really being written in a collaborative effort, and it led on to other articles being written in that same way: fashion photography - another topic we had nothing about, which became a very good article - and a few others as well, and I'm hoping that idea will continue.
Fuzheado: Right, so that channel is dedicated just to talking about the spotlight article of the week, or of that moment, right?
Danny Wool: Yeah, the channel is kind of quiet right now, but it could use some reinvigorating.
Fuzheado: Great, so on IRC that's #wikipedia-spotlight, is that right?
Danny Wool: Yes.
Daveydweeb: As people might be aware, there are various other ways in which Wikipedia editors can actually make a little bit of money from improving Wikipedia's articles. Danny's contest offers Amazon gift vouchers for the major contributors to the winning articles, whereas other projects like the Wikipedia bounty board also offer money to people who raise articles to featured article status within a set period of time, set by the individual offering the bounty.
It's a very simple thing. All that happens is that a user offers a bounty of any amount of money in a currency that they choose, to improve the quality of an article that they choose, to featured article status. If that's achieved, they donate the money to the Wikimedia Foundation, as a means of improving the quality of the articles and at the same time donating to the Wikimedia Foundation. It's a really good way for users who aren't able to donate to the Foundation itself, or simply don't want to, to still have a positive impact on the Foundation as well as improving articles at the same time. I've offered a bounty for the article on David Irving, the disgraced historian, as a little shameless plug if anyone wants to improve that.
Fuzheado: And what page is it that they track the bounties on?
Daveydweeb: It's at Wikipedia:Bounty board. There are lots of bounties being offered at the moment, I think 32. The article on Jimmy Wales has two bounties offered for its improvement, the article on inflation - the article on inflation, as in economic inflation, has a bounty of $1 which will expire in 2055.
Kelly Martin: One dollar! Oooooh!
Daveydweeb: And there are lots of other important articles there that could be improved, and if anybody wants to, there are a lot of low-hanging fruit there which could be really good for improving in order to donate a little money and expertise to the Foundation.
Fuzheado: Well Cro-Magnon man here has a CDN$100 bounty, that's pretty high compared to the other ones. Most of them are ten, twenty...
Kelly Martin: That's almost ten cents, right?
Fuzheado: We've got a pretty decent amount of bounties here, it's pretty impressive.
Kelly Martin: Some of them are impressive. I've looked at it in the past, and there's been some criticism of it because there are some people who think that paying for writing articles cheapens Wikipedia in some way, but I really don't agree with that. I think anything that gets quality articles written - because some of them are not all that demanding, some of them aren't... some of them are quite demanding, some of them will be significantly-
Fuzheado: And some of them are absolutely impossible, like the bounty on the George W Bush article for US$100, for bringing up to "certified Featured status". I think that's...
Kelly Martin: That'll probably happen in about 2040.
Fuzheado: Yes!
Daveydweeb: Some of these bounties don't even require that the articles be raised to Featured Article status. The majority do, but there are others like copyediting bounties, of which there is one currently running which offers three US cents per spelling, grammatical or formatting correction made to any article in a certain set, to a maximum of US$10. So, if you want to donate, say, six cents to the Wikimedia Foundation but can't bring yourself to look that cheap, that might be a good way to go for it.
Tawker: And just a fluke of a little IM person just came online, we have Joshbuddy from Toronto, Canada. Hi Josh!
Joshbuddy: Hey, how's it going?
Tawker: Not too bad.
Fuzheado: Well, you're just in time for us to talk about a wiki-book. The Wall Street Journal reported this week that Pearson Publishing in the UK is partnering with UPenn's Wharton School and MIT's Sloan School to create a business book that will be authored and edited using the Wiki processes by an online community committed to the project. Wikipedia is mentioned as the inspiration or inspiring the effort, and the name of the book is going to be called, "We Are Smarter Than Me."
And it's going to be produced by, hopefully, a community of business experts and managers and it says it will explore how business can use online communities, consumer-generated media such as blogs and other web content in their marketing, pricing research and service. So what do you folks think, is this going to work?
Tawker: Hey, throw it out, it's out there, it's a possibility. Give it a shot, let's see how it goes.
Kelly Martin: It all depends on how they, what kind of administrative structure they get. We've seen this in the past when the Los Angeles Times attempted that, what was it, Wikitorial? It was a total disaster because they didn't have the community required to do the kind of patrolling. It is really shown that you need to have a solid community that will ensure that... it depends on what you do. You do it like Citizendium, where everyone has to apply and be known and there's a bar to entry, or you can let everyone in like Wikipedia does, but then you have to have the structure behind the scenes that manages the free-for-all - that, you know, deals with the fact that there are going to be people throwing peanuts and provide peanut shields. It's an interesting concept, but-
Fuzheado: That's two issues. One is, if you put a bar there, how are you going to make sure you can get the community to be part of it, how to get the right folks in to it? It looks like they're going to be using a Creative Commons license.
And the other thing is, we had an experiment somewhat like this in wikipedia, where there was a journalistic article being edited within there. If you go look at the page Wikipedia:Improve this article about wikipedia, there was an Esquire Magazine reporter who put up his version, got Wikipedians to go in and edit his version of the history of wikipedia and the story. and although the resulting version that Wikipedians edited was more accurate, it was boring as, you know, nails. It was just not punchy, was not interesting.. factually perfect but really dry, and that's why I worry about with a book that's edited by a lot of folks. It could be really dry, but accurate, and I don't know what's going to happen.
Kelly Martin: That's one of the problems wen you have editing by committee, and people talk about this like a joke, but committees really don't have much in the way of style. It's one of the problems Wikipedia faces, that every articles on wikipedia is essentially edited by committee, and you don't get a feeling of style or of any kind of personality in the writing because it's all edited out through the compromise process.
Greg Maxwell and Kim Bruning did a review of articles, of editing of articles, discovered that most articles are edited by fewer than ten editors. The ones that are edited by less than five editors are actually the ones that have the most coherency to them because there's only a few people who are making sure that it reads like an article. When you get into edited articles like say Bill Clinton - this is a great example, because Bill Clinton is on the articles like what I talked about earlier, where there's just a zillion pieces of trivia smashed in with the articles - where you have everyone putting their, adding their little vegetable to the soup, before long you have such a mishmash of-
Fuzheado: Schizophrenia goulash, that's what you have.
Kelly Martin: Yeah! And you, exactly so, very good. You have this article that has no consistent style and the reader goes from reading something that's written at thirteenth-grade diction to something that's written at fifth-grade diction, and then there's seventeenth-grade diction and it doesn't... it's very confusing for the reader, and it really is a problem that Wikipedia has to come up with a solution for in the long run.
Danny Wool: Look, here's the thing: committee-based articles never work, Wikipedia is always by committee, and Wikipedia won't ever be pretty. It's just it can't be, it would be factually dense and have lots of colourful references, but it's never going to look beautiful, nothing you can do about it - just deal with it.
Fuzheado: Right. You said there's a starting point for the next product, which will make it nicer. There used to be an effort - maybe a year ago - of what they called the inverted pyramid writing, right, so make sure you tell 80% of the story in the first two sentences and leave all the weird details about "it used to be called this" further down in the article.
Kelly Martin: That's taken from journalism, which is a different editing, but it might be appropriate.
Danny Wool: In the past we used to actually have a group of people that really focused on copyediting. There was Vicky Rosenscheiz (?), for example, who goes back three or four years. And maybe we need to really encourage a new core of users who just go through the articles and copyedit them: make them really legible, readable, simple editing skill - based on basic editing skills - to improve what we have.
Fuzheado: Right.
Danny Wool: But I mean, new people approaching the Wikipedia project.. Wikipedia is, by its nature, byzantine and wholly complex. So for someone to join it and figure out what they should be doing is a hugely complex task, I mean, I remember when I first joined Wikipedia and being overawed by all the structure and committees and secret groups - it's scary and bizarre. So what you need is some kind of intro page: this is Wikipedia, and this is what you can do to help.
Kelly Martin: Yeah, well we need volunteer coordinators who will go out and, and, we sent these welcoming messages to people-
Fuzheado: This is actually a perfect segue into our next segment, which is these messages that people get on their user pages and everything, or user warnings. They are kind of all over the map, we've got templates for all kinds of things to, "you're doing a great job," "you're doing a horrible job," "you're vandalising this," "you're doing that".. I know that, SushiGeek, you looked into this, right?
1ne: Yeah, I looked into this and.. so there's a thing going on where they're trying to lower the number of warning templates used when people vandalise or when people do something else.. I think it's a good idea, because you know it's so complex right now, and you've got so many different templates that you have to use and memorise.
Daveydweeb: Some of us seem to be able to remember, you know, {{db-a7}}, so... I'm not sure you should be the ones complaining about how hard they are to remember.
But personally... personally, I think this is actually striking at the wrong problem. As nice as it is to see people trying to clean up user talk warnings and that kind of thing, so that it's easier for say, new page patrollers and recent change patrollers to remember them, I think the major problem here is the way that they're used in the first place. Mindspillage, aka Kat Walsh on English Wikipedia and various other projects, had a really good essay which she wrote a long time ago about how new users and vandals should be approached - and unfortunately the problem with writing essays on Wikipedia is that nobody ever reads them if they're in the userspace.
So this essay explained in good detail why you should always warn vandals appropriately, you should always start at the lowest end of a series of test warnings, say - you should always start with {{test1}}, then {{test2}} and {{test3}} and so on and so forth - you should always be nice to them, you should always welcome them if you have the chance - but nobody reads this because of course it's in the userspace. So the problem we have is that new page patrollers and recent change patrollers revert, they delete, they put up for AfD and that kind of thing without telling new users. So as nice as it is to see projects looking to make it a little easier for them to do it, there isn't nearly enough focus in my humble opinion on the importance of warning people in the first place, and ensuring that they're broken in gently, kind of thing.
Fuzheado: Yeah. It's just so easy to get jaded when you're doing RC patrol and fighting vandals, it's just hard to get the motivation up to tell that next vandal, "oh, by the way, you might not have meant to do that," and we're trying to be nice to them. It's just really hard, because sometimes you're just one hundred percent sure this person will never be receptive to being a productive member, but you've still got to do it, right?
Danny Wool: And it's so much work, too. I mean, you have to figure out, you know, is this an old vandal, is this a new vandal, what level am I supposed to leave him? It's a lot of effort. If you've ever done vandalism patrolling, you know that everyone there wants to be as fast as they possibly can be, and there's like a contest to see who can whack-a-mole fastest.
Daveydweeb: Yes! Yes.
Danny Wool: So, there's a whole concept of, you know, who can I beat down fastest? And so some of the attitudes have to change with vandalism patrolling before we get much relief there.
Kelly Martin: Well, especially-
Fuzheado: Yeah, for people who don't know, our own famous Tawker got interviewed and profiled very extensively by a local paper in Vancouver, is that correct?
Tawker: Yep!
Fuzheado: And you can find it online and you can find it in the show notes, it's a great description of this Canuck fanatic.
Tawker: Should we talk about... according to AntiVandalBot, since we renamed things, what are the most vandalised pages?
Fuzheado: Sure, let's get to that.
Daveydweeb: Sure.
Tawker: So, where should we start, from number ten? Okay, number ten would be the good old US of A, the United States of America, with 79 AntiVandalBot reverts. Followed up by yours truly, the big Bill Gates. Then Scientology, then let's see... George W Bush was apparently safe, mostly because it was semiprotected, but Tony Blair pops up as number six on our list. Five is football, like, the rest of the world's football, not North American football.
Kelly Martin: Isn't that surprising, though? I'm surprised that soccer gets more than football does.
Tawker: Guess who's up on number four?
Danny Wool: Guinness beer?
Tawker: No! God.
Fuzheado: God.
1ne: God!
Kelly Martin: God?
Tawker: God is... God has been...
Danny Wool: If God is number four, Jesus must be number three.
Tawker: Jesus was current events, wow! I don't know why... Jesus was like, ages ago, I didn't think Jesus would be a current event anymore.
Fuzheado: I think Jesus is protected but God is not, I think it's something weird like that.
Daveydweeb: I guess you could draw all kinds of metaphysical conclusions from that.
Tawker: Well number two-
1ne: Is that just something somebody did on Jesus' page, or when you protect pages do you put that lock up on the upper right?
Fuzheado: That's a new one, I just saw that too.
Kelly Martin: The lock is added manually, you're supposed to.. when an administrator protects a page, they are supposed to add a protection template, there are three or four of them.
Four was God, who's number three?
Tawker: Three was current events. Two was RuneScape, whatever the heck-
Kelly Martin: Runescape-
Danny Wool: Runescape is an RPG game.
1ne: Runescape? I love Runescape.
Danny Wool: Yeah, I mean, even as a recent changes patroller it always got so much traffic, Runescape...
Fuzheado: That's strange.
Kelly Martin: Runescape is one of the largest MMORPGs.
Fuzheado: And everyone should know what number one is.
Danny Wool: Let me guess, can I guess?
Fuzheado: Yes.
Danny Wool: Who's going to guess? Is it-
Kelly Martin: Wikipedia!
Danny Wool: -Wikipedia?
Fuzheado: Yes!
Tawker: Ding ding ding ding ding ding ding ding ding ding ding ding ding ding!
Kelly Martin: Ow! How totally self-referential!
Fuzheado: Okay, well we're going to whip through our two very brief feedback stories here. We had someone who emailed us who liked our talk about the universal wikimarkup, and says:
So it's a kind of universal wikilanguage, and he invites people to visit www.wikicreole.org.
Kelly Martin: Yeah, this came in from Chuck Smith, who is one of the panel.. one of the members of the Wikicreole project, which is also being done by Christoph Sauer, Janne Jalkanen - I apologise if I've bastardised your name -
Fuzheado: -and Ward Cunningham-
Kelly Martin: -a name which we should all be familiar with, who came up with the whole wiki concept. The whole pattern repository that Ward Cunningham has in the original wiki is full of neat stuff that people really need to read because, I mean, there's a lot of good stuff in there and in Meatball, which is all dedicated... Meatball is totally dedicated to what we'd call meta-stuff, stuff about the whole community editing process, and there's a lot of really good stuff. And I find myself a lot of times at wits' end with the Wikipedia community, and then I go read something at Meatball that says.. "oh! That explains why so-and-so did that! Now I understand, and I know how to approach them."
Fuzheado: Yeah, for all those folks out there, Meatball is like... I think it was the second ever wiki that was created. Maybe one of these shows we should get Sunir Shah, who is the creator of Meatball, on, and he'd be a fascinating guy to talk to on the podcast.
1ne: What I was going to say was, I kind of regard Meatball as, like, the Old Testament of the wiki. I know that's just me, but that's how I feel.
Fuzheado: Hey, that's quote-worthy! That's quite good, actually. "The Old Testament of the wiki..."
1ne: Exactly!
Tawker: Super-super-super quick plug based on feedback from Brion. A few podcasts ago you were talking about Wikipedia and MediaWiki tutorials: he's actually done a thirteen-part video series on editing MediaWiki. Unlike some of the videos earlier, he does not just the basics, but - I should say, actually, the videos cover just the mechanics of using MediaWiki, not the social conventions of Wikipedia. The series runs about 80 minutes long because of some of the non-obvious stuff, like rules or - I should say - page-names and namespaces...
1ne: Does he explain WikiDrama and all that?
Tawker: No, he's more dealing with the technical side. But it's at learnprogramming.tv/index.php, somewhere titled "Learn Programming: How to Edit A Wiki". It's really detailed, there's actually a link on the comments on page.. yeah, just on the episode 6 comments page, it's linked there. And it's a really good and nice tutorial for newbies, and stuff like that.
Fuzheado: Great.
Our last piece of feedback comes from Jason Calacanis, a former guest on the Wikipedia Weekly podcast, and on his blog he actually posted even more suggestions on how Wikipedia could, you know, derive income with a very controversial proposal that he had. But he's added more to it, and he actually related to something we talked about, put on a little proposal saying that people should produce a GreaseMonkey script that allows users to add Google AdSense to Wikipedia.
Now it doesn't sound.. it's not as scary as it sounds: basically a GreaseMonkey script is just a little piece of code you add to your Firefox web browser and it can do something based on the webpage that loads. And the idea-
1ne: Isn't it like an extension?
Fuzheado: Yeah, it's kind of like an extension, but it's probably a little bit more lightweight, it's basically just a chunk of.. I think JavaScript is the only thing it can be, and when you load a page, if the page matches a pattern you could execute some kind of JavaScript code and it's completely on the client side. So you don't need support from the server or from the website that you're visiting, and it's usually used for the little gadgets and little things that you can enhance your page with.
But someone actually posted on Calacanis' comments section, saying they actually came up with an example of a GreaseMonkey script that does load ads when you hit Wikipedia pages, but the interesting thing is that they discovered that no useful AdSense or Google AdSense ads come up, because Wikipedia has a robots.txt file that prevents advertising-related bots from indexing Wikipedia's pages. So, if you look at it, if you look at the comments on Calacanis' blog, and we'll put a link in the show notes, it says that the user agents, Mediapartners-Google, they're all disallowed. And because of that, they don't get good ads coming up when you hit the pages.
Kelly Martin: Right.
Fuzheado: So that's an interesting phenomenon.
Kelly Martin: The Google ad engine, in order to target advertisements to a page, has to load the page to examine it - to look for whatever the engine looks for - and since Wikimedia blocks all of Google's ad engines from fetching pages and Google is a nice, is a good Internet citizen and it respects robots.txt files, they can't actually index the page and so they can't target the ads. There's no good reason why we would allow - from the Wikimedia Foundation's point of view - why we would allow the ad agent to come in there, because Wikimedia's not putting the ads on the page. And by not allowing them to look at the pages, it reduces the load on the server, so it's just.. there's no reason why the Foundation would gain anything from doing this.
So it's.. Calacanis is going to have to make an argument to - basically, Brion - why Brion should allow this, and I don't see Brion as having any vested interest in doing this, it's not like Wikimedia needs more traffic coming to their site. And...
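To make the robots.txt mechanism discussed here concrete, the following is a minimal Python sketch using the standard library's `urllib.robotparser`. The rules in `ROBOTS_TXT` are an assumption invented for illustration, not a copy of Wikipedia's actual file, and `may_fetch` is a hypothetical helper name. It shows how a well-behaved ad crawler identifying itself as Mediapartners-Google would be turned away while an ordinary search crawler is still let in.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only (an assumption for this sketch, not Wikipedia's real file):
# block the AdSense crawler everywhere, block everyone from the /w/ script path.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow: /

User-agent: *
Disallow: /w/
"""


def may_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a crawler that honours robots.txt may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://en.wikipedia.org/wiki/God"
    for agent in ("Mediapartners-Google", "Googlebot"):
        print(agent, "may fetch" if may_fetch(agent, page) else "is blocked from", page)
```

Because the ad crawler never gets to read the page, it has nothing to target ads against, which matches what the commenter on Calacanis' blog reported.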
Fuzheado: But Kelly, is it.. would it be accurate to say, then, that the kind of Google AdSense side of the business for Google is, seems to be quite separated from the Google search side? Because Google does get a feed of Wikipedia's articles and changes, right? But maybe the AdSense side is not getting that same information.
Kelly Martin: Yes, I think that's very accurate. We have... the Wikimedia Foundation has negotiated with Google for some kind of favoured status, because Google's not stupid, they recognise that Wikimedia, that Wikimedia's websites have a very high PageRank and that people are very interested in those pages. And so Google, because they're trying to provide a high-quality product to their customers, has negotiated this arrangement where we feed them with, you know.. we send them a notice saying, "this page has been updated, you might want to fetch it," and they do that, and they do it very promptly, so changes on Wikipedia show up very quickly at Google. That's because Google's trying to deliver quality to their customers, which is the people who do searches.
On the other hand, their ad program.. Wikipedia has no connection with their advertising program, we're not interested in their advertising program, so we've never had any real reason to negotiate with them on that and they've never asked for that to be made available. There's really no point to it, because why would their ad engine be looking at our pages? We're not advertising on their pages.
Fuzheado: Right, and to be clear, that feed service Wikimedia does offer to, I think, any company that approaches them for a fee.
Kelly Martin: I think, I don't... I can't speak for Brion on this, but I think that there's... like, we... there's some ground rules that Brion has, like, you'll agree not to do certain things and such-not. It's more of a technical issue, it's like, "don't do things that abuse our servers, don't update the page too often, because we don't want you banging in there every ten seconds." So the whole point of this is that Google doesn't keep indexing the page every minute, we send them a notice saying that it's been updated so they can say "oh, okay, we'll come look at it." And that makes it a "push", rather than a "pull" technology, and "push" technologies are always more efficient. But the..
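The "push" versus "pull" distinction can be sketched in a few lines of Python. Everything here is a toy model under stated assumptions: the `Page` and `Crawler` classes and the `simulate` function are hypothetical names, and the sketch does not describe how the real Wikimedia update feed is implemented. It only counts how many fetches each approach needs over the same period.

```python
class Page:
    """A wiki page that can notify subscribers when it is edited."""

    def __init__(self):
        self.version = 0
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def edit(self):
        self.version += 1
        # Push: tell every subscriber that the page has changed.
        for notify in self.subscribers:
            notify(self)


class Crawler:
    """A search crawler that counts how often it fetches a page."""

    def __init__(self):
        self.fetches = 0
        self.indexed_version = 0

    def fetch(self, page):
        self.fetches += 1
        self.indexed_version = page.version


def simulate(ticks, edit_ticks):
    """Return (pull_fetches, push_fetches) over the given number of ticks."""
    pulled_page, pushed_page = Page(), Page()
    puller, pusher = Crawler(), Crawler()
    pushed_page.subscribe(pusher.fetch)  # push: fetch only when notified
    for tick in range(ticks):
        if tick in edit_ticks:
            pulled_page.edit()
            pushed_page.edit()
        puller.fetch(pulled_page)  # pull: poll on every tick, changed or not
    return puller.fetches, pusher.fetches


if __name__ == "__main__":
    pull, push = simulate(ticks=60, edit_ticks={5, 42})
    print(f"pull fetched {pull} times, push fetched {push} times")
```

With two edits in sixty ticks, the polling crawler fetches sixty times while the notified crawler fetches twice, and both end up with the same version indexed.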
Fuzheado: And they do pay a subscription fee per month for these type of services. It's not huge, but it's something to at least pay for the, you know, there is load and CPU and programmer effort to do this. What I just find so interesting about this whole exchange is that something we talked about on the podcast gets blogged, and then a developer sees it and tries it out, and we find out more information.. I think it's fascinating, seeing this kind of back and forth, and people are trying it and they're putting their fingers on it. And we see immediately that, you know, it's an interesting idea but there's still some problems with that concept.
Tawker: Well the other one is with advertisers sort of fine-tuning Wikipedia content to bring up specific ads, and that was an issue somebody raised up in Jason's comments about people tweaking Wikipedia for AdSense, which sort of goes against the whole purpose of Wikipedia.
Fuzheado: Well, we definitely look forward to people trying out some other options. Obviously this GreaseMonkey script doesn't provide much useful stuff, but we're always looking for other ways of doing this.
Okay, so for our final note of feedback. As we said last week, we were happy to have some folks pitch in to do some transcription, but this week we had some folks who answered our call for trying to feed our wonderful podcast into speech recognition. I think Daveydweeb has something to say on this..
Daveydweeb: Yes, indeed. We had an interesting time with voice recognition, trying to get it transcribed. Initially we started out looking for an open source alternative, and we came across some software called Sphinx... or, as I like to call it, "that annoying software." It seems to still be a development version, it doesn't appear to have released a public version for people to install and use. I spent about two hours trying to get it to install and run and it turned out to be completely impossible. We decided Sphinx probably wasn't the best solution, considering we couldn't get it to work.
Later, User:Jacoplane, who's actually an administrator at the English Wikipedia, suggested that we use Dragon NaturallySpeaking 8, and provided a sample of the episode 6 transcribed by Dragon. The results were... interesting, you might call them. We spent a fair amount of time laughing at those results, in fact. I'm sure all of us have our own interesting little stories about them, don't we?
Fuzheado: Well, it doesn't.. it doesn't help that the speech recognition engine doesn't recognise what the word "wiki" or "Wikipedia" is, so obviously it started from a very bad position just off the bat.
Daveydweeb: Yes. It, you know, transcribes wiki as tahiti, and all these horrible little mistakes. And what we ended up with was ten thousand words of crap. We had to manually go through and put it into English, and we got about thirteen minutes in. User:Chacor and I, and probably a few others, spent a lot of time going through it - and we only got thirteen minutes in before we just gave up. And so that transcript is still incomplete at the moment.
The interesting thing is, it actually got better over time.
Fuzheado: Is that right?
Daveydweeb: Initially it was crap, it was completely garbled, but it actually gets slightly better over time, even in the first fifteen minutes. In the future, we might even be able to seriously use Dragon as a real solution for solving that problem.
Fuzheado: Yeah, I have some sympathy for Dragon, because I don't think it was necessarily set up for, like, seven speakers chiming in and with different volumes and different quality. Half the time it transcribed "Wiki" as "Mickey", as in Mickey Mouse, half the time it was "wicka", as in, you know, witchcraft. It was really tough, I can't imagine a much tougher thing.
Daveydweeb: Another interesting point is that we have a huge range of voices now. You know, the first two or three episodes we had, you know, a couple of people who all sounded very similar. Episode 6, a huge number of different voices, so when we had User:Chacor and Tdxiang all talking, it had a huge amount of trouble with them and their accents.
Kelly Martin: It'll eventually figure all of this out.
Fuzheado: One can only hope, but it recognises every third word. So, if you look at the transcription for episode 6, it's about one-third done.. read it for some laughs, and if you can, help us transcribe the rest.
Tawker: So, I think we're going to have to say that we're going to save this, we're going to edit it, and we're going to put it up.
Fuzheado: List our feedback information, by the way, Tawker.
Tawker: Oh, yeah. Let's see now.. Wikipediaweekly.com, WP:WEEKLY on the English Wikipedia, let's see... [email protected]... okay, yay.
Daveydweeb: That's about right.
Tawker: So that about sums it up, we'll see you next week.
Fuzheado: Bye bye.
Daveydweeb: And it only took us three hours of recording and an hour of technical problems to get here.