druid-on-ocp4

Setup for running Apache Druid on OpenShift 4 against data residing on Amazon S3. Docker Compose instructions are also provided for alternate development environments.

Assumptions

This project uses Red Hat software and assumes you have access to registry.redhat.io for both the OpenShift 4 and Docker Compose setups.

Deploying Apache Druid on OpenShift 4 assumes you have a working OpenShift 4 cluster.

Both the OpenShift 4 deployment and the Docker Compose setup assume you have access to the S3 object store (AWS).

Clone the repos

git clone https://github.com/chambridge/druid-on-ocp4.git
git clone https://github.com/apache/druid.git

Docker Compose Deployment

Copy the example.env file to .env and update your AWS settings

cp example.env .env
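Before starting the containers, it can help to confirm that no setting in .env was left blank. The following is a minimal sketch, assuming the file uses plain KEY=VALUE lines; the helper name find_unset_keys is hypothetical and is not part of this repository.

```python
def find_unset_keys(env_text):
    """Return the keys in KEY=VALUE text whose value is empty."""
    unset = []
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat empty quotes ('' or "") the same as no value at all.
        if not value.strip().strip("'\""):
            unset.append(key.strip())
    return unset
```

To check the real file, read it and pass the text in, for example find_unset_keys(open(".env").read()).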

Set up the Python environment

pipenv install

Start Docker Compose

make docker-up

View the Docker logs

make docker-logs

Ingest data from an S3 bucket where the data is stored as s3://{bucket}/data/csv/{account}/{source_uuid}/{year}/{month}/{report_name}

python scripts/ingest.py -a <account_id> -s <source_uuid> -y <year> -m <month> -r <report>
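The command-line arguments correspond to the segments of the S3 path layout given above. As an illustration only, the sketch below composes that URI from its parts; build_report_uri is a hypothetical helper, not the code in scripts/ingest.py, and it passes the year and month through exactly as given because the expected padding is not stated here.

```python
def build_report_uri(bucket, account, source_uuid, year, month, report_name):
    """Compose s3://{bucket}/data/csv/{account}/{source_uuid}/{year}/{month}/{report_name}."""
    # Values are used as-is; no zero-padding or validation is applied.
    parts = ["data", "csv", account, source_uuid, year, month, report_name]
    key = "/".join(str(part) for part in parts)
    return f"s3://{bucket}/{key}"
```

The bucket itself is not a command-line argument in the invocation above, so it presumably comes from the AWS settings in .env.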

Query data from the Druid datasource

python scripts/query.py -a <account_id> -s <source_uuid>
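For reference, Druid exposes a SQL endpoint at /druid/v2/sql that accepts a JSON body with a "query" field. The sketch below builds such a request without sending it. It is not the implementation of scripts/query.py: the helper name build_sql_request, the datasource name, and the base URL are all assumptions, and how the account and source UUID map to a datasource is not described here.

```python
import json
import urllib.request


def build_sql_request(base_url, datasource, limit=10):
    """Build (but do not send) a POST request for Druid's SQL endpoint."""
    # The datasource name is a placeholder supplied by the caller.
    sql = f'SELECT * FROM "{datasource}" LIMIT {int(limit)}'
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would submit the query, provided the base URL points at a reachable Druid router or broker.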

OpenShift 4 Deployment

Set up the pull secret in your project

oc project <project>
make setup-pull-secret

Deploy Druid on OpenShift

make deploy-druid
