dc.contributor.author
Wittwer, Lucas D.
dc.contributor.author
Piližota, Ivana
dc.contributor.author
Altenhoff, Adrian Michael
dc.contributor.author
Dessimoz, Christophe
dc.date.accessioned
2019-10-18T09:08:13Z
dc.date.available
2017-06-11T16:23:44Z
dc.date.available
2019-10-18T09:08:13Z
dc.date.issued
2014-10-07
dc.identifier.issn
2167-8359
dc.identifier.other
10.7717/peerj.607
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/98636
dc.identifier.doi
10.3929/ethz-b-000098636
dc.description.abstract
Orthology inference and other sequence analyses across multiple genomes typically start by performing exhaustive pairwise sequence comparisons, a process referred to as “all-against-all”. As this process scales quadratically in terms of the number of sequences analysed, this step can become a bottleneck, thus limiting the number of genomes that can be simultaneously analysed. Here, we explored ways of speeding up the all-against-all step while maintaining its sensitivity. By exploiting the transitivity of homology and, crucially, ensuring that homology is defined in terms of consistent protein subsequences, our proof-of-concept resulted in a 4× speedup while recovering >99.6% of all homologs identified by the full all-against-all procedure on empirical sequence sets. In comparison, state-of-the-art k-mer approaches are orders of magnitude faster but only recover 3–14% of all homologous pairs. We also outline ideas to further improve the speed and recall of the new approach. An open source implementation is provided as part of the OMA standalone software at http://omabrowser.org/standalone.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
PeerJ
en_US
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
dc.subject
All-against-all
en_US
dc.subject
Sequence alignment
en_US
dc.subject
Homology
en_US
dc.subject
Orthology
en_US
dc.subject
Smith-Waterman
en_US
dc.title
Speeding up all-against-all protein comparisons while maintaining sensitivity by considering subsequence-level homology
en_US
dc.type
Journal Article
dc.rights.license
Creative Commons Attribution 4.0 International
ethz.journal.title
PeerJ
ethz.journal.volume
2
en_US
ethz.journal.abbreviated
PeerJ
ethz.pages.start
e607
en_US
ethz.size
16 p.
en_US
ethz.version.deposit
publishedVersion
en_US
ethz.identifier.wos
ethz.identifier.nebis
009756818
ethz.publication.place
London
en_US
ethz.publication.status
published
en_US
ethz.date.deposited
2017-06-11T16:24:33Z
ethz.source
ECIT
ethz.identifier.importid
imp593652f9395f858059
ethz.ecitpid
pub:154259
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2017-07-13T18:05:47Z
ethz.rosetta.lastUpdated
2019-10-18T09:08:31Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Speeding%20up%20all-against-all%20protein%20comparisons%20while%20maintaining%20sensitivity%20by%20considering%20subsequence-level%20homology&rft.jtitle=PeerJ&rft.date=2014-10-07&rft.volume=2&rft.spage=e607&rft.issn=2167-8359&rft.au=Wittwer,%20Lucas%20D.&Pili%C5%BEota,%20Ivana&Altenhoff,%20Adrian%20Michael&Dessimoz,%20Christophe&rft.genre=article&rft_id=info:doi/10.7717/peerj.607&