Tell: An Elastic Database System for Mixed Workloads

Pilman, Markus

doi:10.3929/ethz-b-000187431

dc.contributor.author

Pilman, Markus

dc.contributor.supervisor

Kossmann, Donald

dc.contributor.supervisor

Bernstein, Philip

dc.contributor.supervisor

Boncz, Peter

dc.contributor.supervisor

Roscoe, Timothy

dc.date.accessioned

2017-09-27T06:46:14Z

dc.date.available

2017-09-26T23:40:47Z

dc.date.available

2017-09-27T06:46:14Z

dc.date.issued

2017

dc.identifier.uri

http://hdl.handle.net/20.500.11850/187431

dc.identifier.doi

10.3929/ethz-b-000187431

dc.description.abstract

It is an exciting time to do database research. Two movements have dominated the field for the last few years: Big Data and NoSQL. Both movements arose out of necessity, as cloud computing imposes new requirements on database systems. Cloud computing makes scalability and elasticity more important than ever. A user does not want to pay for computing and storage resources she does not use, but she expects to be able to get these resources as soon as they are needed. Traditional database management systems, however, are not able to meet these requirements. Early NoSQL systems provided elasticity and scalability by massively simplifying the provided consistency guarantees and the underlying data model. Most notably, key-value stores can scale to thousands of machines and allow resizing their cluster at runtime. However, their simplicity is also their greatest weakness: the lack of transactions makes it difficult to reason about concurrency, and the simple data model makes them difficult to use. Key-value stores push most of their complexity into the application. As a result, more recent solutions not only add transactions but also implement complex operations in a layer above the underlying NoSQL storage. This layering is often referred to as SQL over NoSQL. Big Data, on the other hand, is about the analytical processing of massive amounts of data in the cloud. The Hadoop ecosystem and, more recently, Spark are the most prominent systems in this field. These systems allow for massive parallelization of complex analytical queries and are elastic and scalable. They achieve this by implementing a shared-data architecture, which decouples computing resources from storage resources. However, these Big Data platforms still have a problem: bringing the data from the online NoSQL (or SQL) database into Hadoop is a complex issue.
Traditionally, this is solved as in data warehousing, which is a heavyweight solution. A system like Spark also cannot simply use a key-value store for its underlying storage, because current key-value stores perform poorly when they have to deliver high volumes of data. This thesis introduces Tell, a distributed shared-data database management system that fills the gap between NoSQL and Big Data. Tell implements the SQL over NoSQL design principle: it performs transaction processing on top of a high-performance key-value store. At the same time, its key-value store is heavily optimized for scan queries, allowing data processing engines to fetch their data directly from the online database.

en_US

dc.format

application/pdf

en_US

dc.language.iso

en

en_US

dc.publisher

ETH Zurich

en_US

dc.rights.uri

http://rightsstatements.org/page/InC-NC/1.0/

dc.subject

DATABASES + DATABASE MANAGEMENT SYSTEMS (SOFTWARE PRODUCTS)

en_US

dc.subject

Cloud computing

en_US

dc.subject

Key-Value Store

en_US

dc.subject

Transactions

en_US

dc.title

Tell: An Elastic Database System for Mixed Workloads

en_US

dc.type

Doctoral Thesis

dc.rights.license

In Copyright - Non-Commercial Use Permitted

ethz.size

208 p.

en_US

ethz.code.ddc

DDC - DDC::0 - Computer science, information & general works::004 - Data processing, computer science

ethz.identifier.diss

24147

en_US

ethz.publication.place

Zurich

en_US

ethz.publication.status

published

en_US

ethz.leitzahl

ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02663 - Institut für Computing Platforms / Institute for Computing Platforms::03689 - Kossmann, Donald (ehemalig)

en_US

ethz.date.deposited

2017-09-26T23:40:48Z

ethz.source

FORM

ethz.eth

yes

en_US

ethz.availability

Open access

en_US

ethz.rosetta.installDate

2017-09-27T06:46:18Z

ethz.rosetta.lastUpdated

2022-03-28T17:36:18Z

ethz.rosetta.versionExported

true

ethz.COinS

ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Tell:%20An%20Elastic%20Database%20System%20for%20Mixed%20Workloads&rft.date=2017&rft.au=Pilman,%20Markus&rft.genre=unknown&rft.btitle=Tell:%20An%20Elastic%20Database%20System%20for%20Mixed%20Workloads

Files in this item

Name: thesis_final_mpilman.pdf
Size: 2.110 MB
Format: Adobe PDF
Label: Full text

Publication type

Doctoral Thesis [29805]
