Identifying gene clusters by discovering common intervals in indeterminate strings
dc.contributor.author
Doerr, Daniel
dc.contributor.author
Stoye, Jens
dc.contributor.author
Böcker, Sebastian
dc.contributor.author
Jahn, Katharina
dc.date.accessioned
2018-10-01T15:00:04Z
dc.date.available
2017-06-11T14:51:57Z
dc.date.available
2018-10-01T13:27:38Z
dc.date.available
2018-10-01T15:00:04Z
dc.date.issued
2014-10
dc.identifier.issn
1471-2164
dc.identifier.other
10.1186/1471-2164-15-S6-S2
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/94923
dc.identifier.doi
10.3929/ethz-b-000094923
dc.description.abstract
Background
Comparative analyses of chromosomal gene orders are successfully used to predict gene clusters in bacterial and fungal genomes. Present models for detecting sets of co-localized genes in chromosomal sequences require prior knowledge of gene family assignments of genes in the dataset of interest. These families are often computationally predicted on the basis of sequence similarity or higher order features of gene products. Errors introduced in this process amplify in subsequent gene order analyses and thus may deteriorate gene cluster prediction.
Results
In this work, we present a new dynamic model and efficient computational approaches for gene cluster prediction. They are suitable for scenarios ranging from traditional gene family-based gene cluster prediction, via multiple conflicting gene family annotations, to gene family-free analysis, in which gene clusters are predicted solely on the basis of a pairwise similarity measure between the genes of different genomes. We evaluate our gene family-free model against a gene family-based model on a dataset of 93 bacterial genomes.
Conclusions
Our model is able to detect gene clusters that would also be detected with well-established gene family-based approaches. Moreover, we show that it is able to detect conserved regions that are missed by gene family-based methods due to wrong or deficient gene family assignments.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
BioMed Central
en_US
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
dc.subject
common intervals
en_US
dc.subject
indeterminate strings
en_US
dc.subject
gene cluster detection
en_US
dc.title
Identifying gene clusters by discovering common intervals in indeterminate strings
en_US
dc.type
Conference Paper
dc.rights.license
Creative Commons Attribution 4.0 International
ethz.journal.title
BMC Genomics
ethz.journal.volume
15
en_US
ethz.journal.issue
Supplement 6
en_US
ethz.journal.abbreviated
BMC Genomics
ethz.pages.start
S2
en_US
ethz.size
12 p.
en_US
ethz.version.deposit
publishedVersion
en_US
ethz.event
12th Annual Research in Computational Molecular Biology (RECOMB) Satellite Workshop on Comparative Genomics
en_US
ethz.event.location
Cold Spring Harbor, NY, USA
en_US
ethz.event.date
October 19-22, 2014
en_US
ethz.identifier.wos
ethz.publication.place
London
en_US
ethz.publication.status
published
en_US
ethz.date.deposited
2017-06-11T14:52:07Z
ethz.source
ECIT
ethz.identifier.importid
imp593652b4e9f0e21440
ethz.ecitpid
pub:149051
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2017-07-14T18:33:01Z
ethz.rosetta.lastUpdated
2024-02-02T06:13:49Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Identifying%20gene%20clusters%20by%20discovering%20common%20intervals%20in%20indeterminate%20strings&rft.jtitle=BMC%20Genomics&rft.date=2014-10&rft.volume=15&rft.issue=Supplement%206&rft.spage=S2&rft.issn=1471-2164&rft.au=Doerr,%20Daniel&Stoye,%20Jens&B%C3%B6cker,%20Sebastian&Jahn,%20Katharina&rft.genre=proceeding&rft_id=info:doi/10.1186/1471-2164-15-S6-S2&