AWS Athena is a service for running on-demand queries against data stored in Amazon S3. It lets users separate compute from storage: data sits in S3 at low cost and is queried on demand with Presto at a fixed cost. This separation reduces costs and allows each layer to be optimized independently. In this talk I will walk through how to build a completely open source alternative to AWS Athena using PrestoDB, Apache Spark, and OpenStack Swift. We'll cover the operational setup and the advantages of this architecture over more traditional big data architectures. [147]
