We have a copy and paste detection bot here that keeps an eye on medical topics. It is working fairly well with a 50/50 rate of true positives to false positives. Doc James (talk · contribs · email) (if I write on your page reply on mine) 20:03, 4 October 2014 (UTC)
In almost all of the articles where some or all of the text is taken from the 1911 Britannica or similar old sources, or public domain US government material, there is only a general attribution, usually given by the template {{EB1911}} or the like. It is almost never indicated in the article just what portions of the text were taken. In some cases it can be figured out from the edit history; in all cases it could be determined from the original, but there seems to be no specific policy requiring this to be done, whether for the many such contributions in earlier years or the continuing use nowadays, mostly from PD-USgov (this is currently most prevalent in military biographies). I have on occasion posted to people adding such material asking that they do so, and I have never gotten a response that indicates that they acknowledge the need for it. I suggest we add a specific statement, and enforce it. As for dealing with the present material, it's time for a cleanup. Presumably someone could do a bot, but it could even be done manually, though I'm not prepared to do that single-handed--and not until the inflow of new such material has been stopped. DGG (talk) 21:19, 4 October 2014 (UTC)
I agree that it is time to cease use of these general attribution templates. There likely are hundreds of articles which are verbatim transcriptions of DANFS ship histories. These histories are generally competent but often are written in a breezy style and not from a neutral viewpoint. EB1911 articles can be very well-written but opinionated. (Cf. John Byng with its 1911 EB source; some of the latter is still present in our article.) It is difficult to tell who wrote what. Wikipedia should have progressed beyond copying PD sources, which preserves the flaws of the source and often is then bowdlerized by later edits. This is not a criticism of those who created the articles, as that practice was commonly accepted. But it is a practice we no longer need, and from what I have seen, at least some of the editors who used that practice are very capable of writing well-researched and well-written articles, equal or superior to a PD source. Kablammo (talk) 17:23, 6 October 2014 (UTC)
What is the policy on using closely paraphrased excerpts from peer reviewed journal article abstracts? I see very experienced editors doing that all the time, and given that abstracts are published for indexing, I'm not sure if there is anything technically or morally wrong with it, especially when the citation has a link back to where someone can download -- or buy -- the source article. I know I'd be pretty pissed off if I saw anti-copyvio reverts of that sort of thing. (I'm logging out so my contribs won't be given any extra scrutiny by potential copyright paranoiacs, even though I almost always use inline phrase or single-sentence quotes for that sort of thing.) 173.197.107.11 (talk) 21:33, 4 October 2014 (UTC)
In my mind, it's plagiarism, even if you're linking back to the original source. Wikipedia's slightly more lenient, but the guidelines clearly note that summarizing material in your own words is preferred. Does it really take that much longer to rewrite the passage? Ed [talk] [majestic titan] 22:43, 4 October 2014 (UTC)
Wikipedians can usually technically improve the original anyway; that should be the start of the non-close paraphrasing effort. Tony (talk) 10:39, 8 October 2014 (UTC)
If the source is copyrighted (and abstracts likely are; see, for example - "NLM data licensees and others contemplating any type of transmission or reproduction of copyrighted material such as abstracts...."), you should put verbatim taking into quotation marks but otherwise put the material in your own words and structure in accordance with WP:C. --Moonriddengirl (talk) 17:00, 5 October 2014 (UTC)
Is that analysis informed by or oblivious to fair use considerations? I would hope that reasonable editors go after lengthy copyvio passages before they start making judgement calls about how close is too close a paraphrase for abstract text. 173.197.107.11 (talk) 17:58, 5 October 2014 (UTC)
I assume informed by, as what's the fair use rationale for text? Images like logos are clearly irreplaceable, but undistinctive text can nearly always be reworded ("undistinctive" because there are places for quotes from primary sources, etc.). Ed [talk] [majestic titan] 19:03, 5 October 2014 (UTC)
Pretty much. :) Your question was about policy - policy is at WP:C and WP:NFCC. Wikipedia is deliberately narrow in application of fair use for a number of reasons, including that content on Wikipedia is meant to be reusable by anyone, anywhere, in any context. What passes for fair use in one country in one situation may not be acceptable in another. By clearly marking quotations, we offer reusers the opportunity to assess whether what we believe is "fair use" works for them. --Moonriddengirl (talk) 20:58, 5 October 2014 (UTC)
I don't do much with images, Hawkeye7, so I've never put any thought into this at all, but I'm not sure how that applies? I mean, plagiarism doesn't require keeping intact all of the text you use - it just requires properly attributing what you do use. It's quite possible I still don't take your meaning, here. :) Can you give an example of when separating a caption from an image has been questioned as plagiarism? --Moonriddengirl (talk) 22:23, 7 October 2014 (UTC)
Speaking of intellectual sloppiness, the first footnote points to the first page of the article being cited. (A common way of citing a journal or book article in some academic circles.) The actual material being summarised is on pp. 226–227 (I think), although the term "outright theft" appears on p. 211 in another essay. In neither case is plagiarism equated with theft in the legal sense. This neatly illustrates the danger of paraphrase: unless you really understand what is being said, you run the risk of getting it wrong. That is of course the whole reason for asking the undergraduates to paraphrase; but on Wikipedia it means that secondary sources cannot be paraphrased without double-checking. Hawkeye7 (talk) 22:05, 4 October 2014 (UTC)
How does this apply to copying/merging/splitting Wikipedia articles? Whilst okay with regard to copyright due to CC-BY-SA licencing, there are no quotation marks around the copied material, and no in-text acknowledgement of the source copied from. It seems like most cases of merging, splitting, or other copying are actually plagiarism. - Evad37 [talk] 03:06, 5 October 2014 (UTC)
Interesting question. I think in the case of Wikipedia, we treat it more as a collective work, Evad, than as a work of individual authorship. However, even if we did treat it as individual authorship, I think that the plagiarism concerns are somewhat addressed by the "history" tab, which allows people to trace word for word who wrote what. While in merged content, you do have to look at the history of another page, it's very much on the record. --Moonriddengirl (talk) 16:50, 5 October 2014 (UTC)
@Moonriddengirl: Re collective work/individual authorship: It seems strange to treat Wikipedia as a collective work for plagiarism purposes, but as individual works for copyright purposes, where each article (or other page) has distinct authors who retain their copyright. It also doesn't address transwiki-ing, e.g. what if we were to decide that voy:Low-cost airlines (or any article on another WMF wiki) would also make a good encyclopedia article, or a new section of an existing article, and copy it over here? Assuming we do it properly, we would have satisfied the terms of the licencing for reuse, but not the plagiarism rules (copying verbatim without quotation marks and in-text attribution).
Re history tab: If this method of attribution isn't acceptable for public domain or freely/compatibly licensed external works, why is it acceptable for copying within Wikipedia or between WMF projects? Or to put it the other way, if this method is acceptable for copying within Wikipedia or between WMF projects, why isn't it acceptable for public domain or freely/compatibly licensed external works? - Evad37 [talk] 02:57, 6 October 2014 (UTC)
I don't think our handling is different there. Authors do retain copyright, and the history of the page is sufficient for that attribution. We explicitly waived our right to be attributed by name when we agreed that a hyperlink was sufficient, every time we hit "save page". Why would we expect to be treated differently for plagiarism purposes? --Moonriddengirl (talk) 10:10, 6 October 2014 (UTC)
No, I'm saying I believe those of us who contribute to these projects do so under a different understanding. We do so with the full knowledge that our names are not attached in a highly visible way to our work. It's not to do with the license, but with our general expectations on contribution. --Moonriddengirl (talk) 00:19, 7 October 2014 (UTC)
I think that there is a fundamental contradiction in Wikipedia's policies, between: (1) no original research and reliable sourcing; and (2) no plagiarism and no copyright violations. You keep trying to straddle the line so that no one can say that you are clearly violating one policy rather than the other. JRSpriggs (talk) 17:58, 5 October 2014 (UTC)
I'd agree that it is a challenge, particularly for those who have trouble paraphrasing, but it is possible, including by the judicious use of quotations. I don't think it's a contradiction as it is attainable. :) --Moonriddengirl (talk) 20:58, 5 October 2014 (UTC)
I agree with JRSpriggs. For contentious articles (e.g. political BLPs), editors are so sensitive to possible "spin" that it can be difficult to gain consensus for anything but a close paraphrase of the source. --Surturz (talk) 03:14, 13 October 2014 (UTC)