Part of a series on |
Citation metrics |
---|
Co-citation is the frequency with which two documents are cited together by other documents.[1] If at least one other document cites two documents in common, these documents are said to be co-cited. The more co-citations two documents receive, the higher their co-citation strength, and the more likely they are semantically related.[1] Like bibliographic coupling, co-citation is a semantic similarity measure for documents that makes use of citation analyses.
The figure to the right illustrates the concept of co-citation and a more recent variation of co-citation which accounts for the placement of citations in the full text of documents. The figure's left image shows the Documents A and B, which are both cited by Documents C, D and E; thus Documents A and B have a co-citation strength, or co-citation index[2] of three. This score is usually established using citation indexes. Documents featuring high numbers of co-citations are regarded as more similar.[1]
The figure's right image shows a citing document which cites the Documents 1, 2 and 3. Both the Documents 1 and 2 and the Documents 2 and 3 have a co-citation strength of one, given that they are cited together by exactly one other document. However, Documents 2 and 3 are cited in much closer proximity to each other in the citing document compared to Document 1. To make co-citation a more meaningful measure in this case, a Co-Citation Proximity Index (CPI) can be introduced to account for the placement of citations relative to each other. Documents co-cited at greater relative distances in the full text receive lower CPI values.[3] Gipp and Beel were the first to propose using modified co-citation weights based on proximity.[4]
Henry Small[1] and Irina Marshakova[5] are credited for introducing co-citation analysis in 1973.[2] Both researchers came up with the measure independently, although Marshakova gained less credit, likely because her work was published in Russian.[6]
Co-citation analysis provides a forward-looking assessment on document similarity in contrast to Bibliographic Coupling, which is retrospective.[7] The citations a paper receives in the future depend on the evolution of an academic field, thus co-citation frequencies can still change. In the adjacent diagram, for example, Doc A and Doc B may still be co-cited by future documents, say Doc F and Doc G. This characteristic of co-citation allows for a dynamic document classification system when compared to Bibliographic Coupling.
Over the decades, researchers proposed variants or enhancements to the original co-citation concept. Howard White introduced author co-citation analysis in 1981.[8] Gipp and Beel proposed Co-citation Proximity Analysis (CPA) and introduced the CPI as an enhancement to the original co-citation concept in 2009.[3] Co-citation Proximity Analysis considers the proximity of citations within the full-texts for similarity computation and therefore allows for a more fine-grained assessment of semantic document similarity than pure co-citation.[9]