Commit
1 parent 3668303, commit 6eb5633
Showing 1 changed file with 11 additions and 26 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,31 +1,16 @@ | ||
# Project "Pretty Panda" | ||
|
||
## Goal | ||
This repository is a collection of independent, loosely coupled components that enable developers to efficiently prototype | ||
geospatial processing locally. Once the processing steps to create a new dataset work
in the local environment, it should be straightforward to move them into a
production environment for periodic batch execution. | ||
This repository is a playground for developing a collection of loosely coupled components | ||
that: | ||
1. enable developers to efficiently perform geospatial processing locally in a containerized environment. | ||
2. demonstrate how to interact with large amounts of static geospatial data in low-cost blob storage. | ||
3. provide a framework for periodic batch processing of geospatial workflows. | ||
|
||
## Design requirements | ||
- Allow flexible and rapid local prototyping involving spatial processing. | ||
- Store third party datasets and processing results in an organized manner. | ||
- Allow for periodic batch re-processing of the data. | ||
- Leverage Google Cloud infrastructure if possible. | ||
- Low cost as a priority requirement. | ||
## Case study | ||
Open data from Switzerland will be used for demonstration purposes. | ||
|
||
## Components | ||
### Data store | ||
We use Cloud Storage as cost-efficient storage for geospatial data and prefer cloud-native data formats where possible. | ||
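One way to keep third-party datasets and processing results organized in a bucket is a fixed prefix convention. The following is a minimal sketch under stated assumptions: the bucket name, the `provider/dataset/version/filename` layout, the helper names, and the list of extensions are all hypothetical and not taken from this repository.

```python
from pathlib import PurePosixPath

# Hypothetical bucket name, for illustration only.
BUCKET = "example-geodata-bucket"

# Extensions often associated with cloud-native geospatial formats.
# This is an illustrative, non-exhaustive assumption; a ".tif" file is only
# cloud-optimized if it was actually written as a Cloud Optimized GeoTIFF.
CLOUD_NATIVE_EXTENSIONS = {".tif", ".parquet", ".fgb", ".zarr", ".pmtiles"}


def object_uri(provider, dataset, version, filename, bucket=BUCKET):
    """Build a gs:// URI using an assumed provider/dataset/version layout."""
    key = PurePosixPath(provider) / dataset / version / filename
    return f"gs://{bucket}/{key}"


def looks_cloud_native(filename):
    """Return True if the file extension is in the assumed cloud-native set."""
    return PurePosixPath(filename).suffix.lower() in CLOUD_NATIVE_EXTENSIONS


# Example with made-up dataset names.
print(object_uri("swisstopo", "boundaries", "2024-01", "cantons.parquet"))
print(looks_cloud_native("roads.shp"))
```

A versioned prefix like this keeps each batch run's output separate from earlier results, which suits static, read-only datasets.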
|
||
### Processing environment | ||
Various processing capabilities are bundled using container technology. This enables consistency across development and production workflows. | ||
Priority is given to the Python Geo-Ecosystem, followed by pyQGIS and R capabilities. | ||
|
||
### Local development | ||
We leverage Jupyterlab as a UI for development due to its integrated visualization capabilities and extensive plugin system. | ||
|
||
### Batch processing | ||
We leverage Google Batch to provide the infrastructure for batch runs. | ||
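To make the batch step more concrete, here is a hedged sketch of a job body for a single containerized task. The field names follow my reading of the Batch v1 REST `Job` resource and should be checked against the current API reference. The image URI, command, resource sizes, and the choice of Spot VMs (picked here only because low cost is a stated priority) are assumptions, and `build_batch_job` is a hypothetical helper.

```python
from __future__ import annotations

import json


def build_batch_job(
    image_uri: str,
    commands: list[str],
    cpu_milli: int = 2000,
    memory_mib: int = 4096,
) -> dict:
    """Return a dict shaped like a Batch v1 Job body (assumed field names)."""
    return {
        "taskGroups": [
            {
                "taskCount": 1,
                "taskSpec": {
                    # One container runnable that executes the workflow.
                    "runnables": [
                        {"container": {"imageUri": image_uri, "commands": commands}}
                    ],
                    "computeResource": {
                        "cpuMilli": cpu_milli,
                        "memoryMib": memory_mib,
                    },
                    "maxRetryCount": 1,
                },
            }
        ],
        # Assumption: Spot VMs to reduce cost; they can be preempted.
        "allocationPolicy": {
            "instances": [{"policy": {"provisioningModel": "SPOT"}}]
        },
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


# Hypothetical image and entry command, for illustration only.
job = build_batch_job("example-registry/geo-env:latest", ["python", "run_workflow.py"])
print(json.dumps(job, indent=2))
```

Because the same container image would be used locally and in the batch job, the job body only needs to name the image and the command to run.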
|
||
### Orchestration | ||
Use Cloud Scheduler to trigger the batch jobs. | ||
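A possible shape for such a trigger, sketched under assumptions: a Cloud Scheduler job with an HTTP target that posts a job body to the Batch `jobs` endpoint. The field names follow my reading of the Cloud Scheduler v1 REST `Job` resource (where `body` is base64-encoded); the project, location, schedule, service account, time zone default, and the `build_scheduler_job` helper are hypothetical.

```python
import base64
import json


def build_scheduler_job(project, location, schedule, batch_job_body,
                        service_account_email, time_zone="Europe/Zurich"):
    """Return a dict shaped like a Cloud Scheduler Job that creates a Batch job."""
    uri = (
        "https://batch.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/jobs"
    )
    # The REST representation carries the HTTP body as base64-encoded bytes.
    encoded_body = base64.b64encode(
        json.dumps(batch_job_body).encode("utf-8")
    ).decode("ascii")
    return {
        "schedule": schedule,  # unix-cron format
        "timeZone": time_zone,
        "httpTarget": {
            "uri": uri,
            "httpMethod": "POST",
            "headers": {"Content-Type": "application/json"},
            "body": encoded_body,
            # The service account needs permission to create Batch jobs.
            "oauthToken": {"serviceAccountEmail": service_account_email},
        },
    }


# Hypothetical values, for illustration only.
scheduler_job = build_scheduler_job(
    project="example-project",
    location="europe-west6",
    schedule="0 3 1 * *",  # 03:00 on the first day of each month
    batch_job_body={"taskGroups": []},  # placeholder Batch job body
    service_account_email="scheduler@example-project.iam.gserviceaccount.com",
)
print(json.dumps(scheduler_job, indent=2))
```

A low-frequency cron schedule like the one above matches the goal of periodic re-processing without keeping any infrastructure running between runs.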
## Design guidelines | ||
- Optimize for cost efficiency in use cases where small to medium-sized, static, read-only datasets are produced by geospatial workflows in a low-frequency batch process. | ||
- Leverage Google Cloud infrastructure where possible. | ||
- Allow multiple developers to efficiently perform exploratory analysis on stored datasets. |