Spaced seed

In bioinformatics, a spaced seed is a pattern of relevant and irrelevant positions in a biosequence, and the term also refers to the method of approximate string matching based on such patterns, which allows for substitutions. Spaced seeds are a straightforward modification of the earliest heuristic-based alignment approaches, one that allows for minor differences between the sequences of interest. Spaced seeds have been used in homology search,^[1] alignment,^[2] assembly,^[3] and metagenomics.^[4] They are usually represented as a sequence of zeroes and ones, where a one indicates relevance and a zero indicates irrelevance at the given position. Some visual representations use pound signs for relevant positions and dashes or asterisks for irrelevant positions.
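The zero-and-one representation can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the method of any tool cited here: the function names `seed_key` and `seed_hits` are hypothetical, sequences are plain strings, and a "hit" simply means that two windows agree at every position marked with a one.

```python
def seed_key(seed, sequence, start):
    """Return the characters of `sequence` at the relevant ('1') positions
    of `seed`, with the seed placed at offset `start`."""
    return "".join(sequence[start + k] for k, c in enumerate(seed) if c == "1")


def seed_hits(seed, query, target):
    """Return all (i, j) pairs such that query[i:] and target[j:] agree at
    every relevant position of `seed`. Positions marked '0' are ignored,
    so substitutions there do not prevent a hit."""
    span = len(seed)

    # Index every window of the target by its masked key.
    index = {}
    for j in range(len(target) - span + 1):
        index.setdefault(seed_key(seed, target, j), []).append(j)

    # Look up every window of the query in that index.
    hits = []
    for i in range(len(query) - span + 1):
        for j in index.get(seed_key(seed, query, i), []):
            hits.append((i, j))
    return hits


# The two sequences differ only at the third position.
print(seed_hits("1101", "ACGT", "ACTT"))  # [(0, 0)]: the mismatch falls on a '0'
print(seed_hits("1111", "ACGT", "ACTT"))  # []: a contiguous seed misses it
```

In this toy case the seed `1101` tolerates the substitution because it falls on an irrelevant position, whereas the contiguous seed `1111` of the same span reports no hit.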

[1] Ma, Bin; Tromp, John; Li, Ming (March 2002). "PatternHunter: faster and more sensitive homology search". Bioinformatics. 18 (3): 440–445. doi:10.1093/bioinformatics/18.3.440. PMID 11934743.
[2] David, Matei; Dzamba, Misko; Lister, Dan; Ilie, Lucian; Brudno, Michael (April 2011). "SHRiMP2: Sensitive yet Practical Short Read Mapping". Bioinformatics. 27 (7): 1011–1012. doi:10.1093/bioinformatics/btr046. PMID 21278192.
[3] Birol, I; Chu, J; Mohamadi, H; Jackman, S. D.; Raghavan, K; Vandervalk, B. P.; Raymond, A; Warren, René L. (2015). "Spaced Seed Data Structures for De Novo Assembly". International Journal of Genomics. 2015: 196591. doi:10.1155/2015/196591. PMC 4619942. PMID 26539459.
[4] Břinda, Karel; Sykulski, Maciej; Kucherov, Gregory (November 2015). "Spaced seeds improve k-mer-based metagenomic classification". Bioinformatics. 31 (22): 3584–3592. arXiv:1502.06256. Bibcode:2015arXiv150206256B. doi:10.1093/bioinformatics/btv419. PMID 26209798.
