Commit

docs: Reorganize registry docs (#4407)
* reorganize registry docs

Signed-off-by: tokoko <[email protected]>

* remove commented out text

Signed-off-by: tokoko <[email protected]>

* changes in registry.md

Signed-off-by: tokoko <[email protected]>

---------

Signed-off-by: tokoko <[email protected]>
Co-authored-by: tokoko <[email protected]>
tokoko and tokoko committed Aug 23, 2024
1 parent 0a48f7b commit 42d659f
Showing 10 changed files with 136 additions and 136 deletions.
8 changes: 6 additions & 2 deletions docs/SUMMARY.md
@@ -23,7 +23,6 @@
* [Feature view](getting-started/concepts/feature-view.md)
* [Feature retrieval](getting-started/concepts/feature-retrieval.md)
* [Point-in-time joins](getting-started/concepts/point-in-time-joins.md)
* [Registry](getting-started/concepts/registry.md)
* [Permission](getting-started/concepts/permission.md)
* [\[Alpha\] Saved dataset](getting-started/concepts/dataset.md)
* [Components](getting-started/components/README.md)
@@ -45,7 +44,6 @@
* [Real-time credit scoring on AWS](tutorials/tutorials-overview/real-time-credit-scoring-on-aws.md)
* [Driver stats on Snowflake](tutorials/tutorials-overview/driver-stats-on-snowflake.md)
* [Validating historical features with Great Expectations](tutorials/validating-historical-features.md)
* [Using Scalable Registry](tutorials/using-scalable-registry.md)
* [Building streaming features](tutorials/building-streaming-features.md)

## How-to Guides
@@ -114,6 +112,12 @@
* [Hazelcast (contrib)](reference/online-stores/hazelcast.md)
* [ScyllaDB (contrib)](reference/online-stores/scylladb.md)
* [SingleStore (contrib)](reference/online-stores/singlestore.md)
* [Registries](reference/registries/README.md)
* [Local](reference/registries/local.md)
* [S3](reference/registries/s3.md)
* [GCS](reference/registries/gcs.md)
* [SQL](reference/registries/sql.md)
* [Snowflake](reference/registries/snowflake.md)
* [Providers](reference/providers/README.md)
* [Local](reference/providers/local.md)
* [Google Cloud Platform](reference/providers/google-cloud-platform.md)
52 changes: 36 additions & 16 deletions docs/getting-started/components/registry.md
@@ -1,31 +1,51 @@
# Registry

The Feast feature registry is a central catalog of all the feature definitions and their related metadata. It allows data scientists to search, discover, and collaborate on new features.
The Feast feature registry is a central catalog of all feature definitions and their related metadata. Feast uses the registry to store all applied Feast objects (e.g. feature views, entities). It allows data scientists to search, discover, and collaborate on new features. The registry exposes methods to apply, list, retrieve, and delete these objects, and is an abstraction with multiple implementations.

Each Feast deployment has a single feature registry. Feast only supports file-based registries today, but supports four different backends.
Feast comes with built-in file-based and SQL-based registry implementations. By default, Feast uses a file-based registry, which stores the protobuf representation of the registry as a serialized file in the local file system. For more details on which registries are supported, please see [Registries](../../reference/registries/).

* `Local`: Used as a local backend for storing the registry during development
* `S3`: Used as a centralized backend for storing the registry on AWS
* `GCS`: Used as a centralized backend for storing the registry on GCP
* `[Alpha] Azure`: Used as centralized backend for storing the registry on Azure Blob storage.
## Updating the registry

The feature registry is updated during different operations when using Feast. More specifically, objects within the registry \(entities, feature views, feature services\) are updated when running `apply` from the Feast CLI, but metadata about objects can also be updated during operations like materialization.
We recommend storing Feast feature definitions in a version-controlled repository and using CI/CD to keep the
registry in sync with it automatically. Users will often also want separate registries for different
environments (e.g. dev vs. staging vs. prod), with write access to the staging and production registries locked
down, since they can impact real user traffic. See [Running Feast in Production](../../how-to-guides/running-feast-in-production.md#1.-automatically-deploying-changes-to-your-feature-definitions) for details on how to set this up.
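
One way to picture the per-environment setup is a small lookup that a CI/CD job consults before applying definitions. This is only a sketch: the bucket names, the `FEAST_ENV` variable, and the `registry_path_for` helper are all hypothetical and are not part of Feast.

```python
import os

# Hypothetical registry locations, one per environment (not taken from the Feast docs).
REGISTRY_PATHS = {
    "dev": "data/registry.pb",
    "staging": "s3://example-feast-staging/registry.pb",
    "prod": "s3://example-feast-prod/registry.pb",
}


def registry_path_for(env: str) -> str:
    """Return the registry path for an environment, failing loudly on unknown names."""
    try:
        return REGISTRY_PATHS[env]
    except KeyError:
        known = ", ".join(sorted(REGISTRY_PATHS))
        raise ValueError(f"unknown environment {env!r}; expected one of: {known}") from None


# FEAST_ENV is an assumed variable name that a CI/CD job might set.
env = os.environ.get("FEAST_ENV", "dev")
print(registry_path_for(env))
```

Keeping the mapping in one place makes it harder for a development job to write to the production registry by accident; access control on the buckets themselves still has to be enforced separately.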

Users interact with a feature registry through the Feast SDK. Listing all feature views:
## Accessing the registry from clients

Users can specify the registry through a `feature_store.yaml` config file, or programmatically. We often see teams
preferring the programmatic approach because it makes notebook-driven development very easy:

### Option 1: programmatically specifying the registry

```python
fs = FeatureStore("my_feature_repo/")
print(fs.list_feature_views())
repo_config = RepoConfig(
    registry=RegistryConfig(path="gs://feast-test-gcs-bucket/registry.pb"),
    project="feast_demo_gcp",
    provider="gcp",
    offline_store="file",  # Could also be the OfflineStoreConfig e.g. FileOfflineStoreConfig
    online_store="null",  # Could also be the OnlineStoreConfig e.g. RedisOnlineStoreConfig
)
store = FeatureStore(config=repo_config)
```

### Option 2: specifying the registry in the project's `feature_store.yaml` file

```yaml
project: feast_demo_aws
provider: aws
registry: s3://feast-test-s3-bucket/registry.pb
online_store: null
offline_store:
  type: file
```
Or retrieving a specific feature view:
A `FeatureStore` object can then be instantiated against this configuration:

```python
fs = FeatureStore("my_feature_repo/")
fv = fs.get_feature_view("my_fv1")
store = FeatureStore(repo_path=".")
```
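
In this commit the `registry` value appears in two shapes: a bare path string (as in Option 2) and a mapping with `path` and `cache_ttl_seconds` (as in the registry reference pages). The sketch below shows how the two shapes could be treated uniformly. `normalize_registry` is a hypothetical helper written for illustration, not a Feast function, and it only covers the file-based forms shown here.

```python
from typing import Any, Dict, Union


def normalize_registry(setting: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Turn either shape of the `registry` value into a mapping with a `path` key."""
    if isinstance(setting, str):
        return {"path": setting}
    if isinstance(setting, dict) and "path" in setting:
        return dict(setting)  # copy so the caller's config is not mutated
    raise ValueError("registry must be a path string or a mapping with a 'path' key")


short_form = normalize_registry("s3://feast-test-s3-bucket/registry.pb")
long_form = normalize_registry({"path": "registry.pb", "cache_ttl_seconds": 60})
print(short_form["path"], long_form["cache_ttl_seconds"])
```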

{% hint style="info" %}
The feature registry is a [Protobuf representation](https://github.com/feast-dev/feast/blob/master/protos/feast/core/Registry.proto) of Feast metadata. This Protobuf file can be read programmatically from other programming languages, but no compatibility guarantees are made on the internal structure of the registry.
{% endhint %}

The file-based feature registry is a [Protobuf representation](https://github.com/feast-dev/feast/blob/master/protos/feast/core/Registry.proto) of Feast metadata. This Protobuf file can be read programmatically from other programming languages, but no compatibility guarantees are made on the internal structure of the registry.
{% endhint %}
4 changes: 0 additions & 4 deletions docs/getting-started/concepts/README.md
@@ -24,10 +24,6 @@
[point-in-time-joins.md](point-in-time-joins.md)
{% endcontent-ref %}

{% content-ref url="registry.md" %}
[registry.md](registry.md)
{% endcontent-ref %}

{% content-ref url="dataset.md" %}
[dataset.md](dataset.md)
{% endcontent-ref %}
107 changes: 0 additions & 107 deletions docs/getting-started/concepts/registry.md

This file was deleted.

23 changes: 23 additions & 0 deletions docs/reference/registries/README.md
@@ -0,0 +1,23 @@
# Registries

Please see [Registry](../../getting-started/architecture-and-components/registry.md) for a conceptual explanation of registries.

{% content-ref url="local.md" %}
[local.md](local.md)
{% endcontent-ref %}

{% content-ref url="s3.md" %}
[s3.md](s3.md)
{% endcontent-ref %}

{% content-ref url="gcs.md" %}
[gcs.md](gcs.md)
{% endcontent-ref %}

{% content-ref url="sql.md" %}
[sql.md](sql.md)
{% endcontent-ref %}

{% content-ref url="snowflake.md" %}
[snowflake.md](snowflake.md)
{% endcontent-ref %}
23 changes: 23 additions & 0 deletions docs/reference/registries/gcs.md
@@ -0,0 +1,23 @@
# GCS Registry

## Description

The GCS registry provides support for storing the protobuf representation of your feature store objects (data sources, feature views, feature services, etc.) using Google Cloud Storage.

While it can be used in production, file-based registries have inherent limitations: changing a single field in the registry requires rewriting the whole registry file. With multiple concurrent writers, this creates a risk of data loss, or a bottleneck on registry writes, since all changes have to be serialized (e.g. when running materialization for multiple feature views or time ranges concurrently).

An example of how to configure this would be:

## Example

{% code title="feature_store.yaml" %}
```yaml
project: feast_gcp
registry:
  path: gs://[YOUR BUCKET YOU CREATED]/registry.pb
  cache_ttl_seconds: 60
online_store: null
offline_store:
  type: dask
```
{% endcode %}
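
The data-loss risk described for file-based registries comes from every writer replacing the whole file. The following is a minimal simulation of that lost update, not Feast code: a JSON file stands in for the protobuf registry, and `read_registry` / `write_registry` are hypothetical helpers.

```python
import json
import tempfile
from pathlib import Path


def read_registry(path: Path) -> dict:
    return json.loads(path.read_text())


def write_registry(path: Path, registry: dict) -> None:
    # The whole file is rewritten, even if only one entry changed.
    path.write_text(json.dumps(registry))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "registry.json"
    write_registry(path, {"fv_a": "v1", "fv_b": "v1"})

    # Two writers load the same snapshot before either one saves.
    writer_1 = read_registry(path)
    writer_2 = read_registry(path)

    writer_1["fv_a"] = "v2"
    write_registry(path, writer_1)

    writer_2["fv_b"] = "v2"
    write_registry(path, writer_2)  # overwrites writer 1's change

    final = read_registry(path)

print(final)  # {'fv_a': 'v1', 'fv_b': 'v2'}
```

Writer 1's update to `fv_a` is silently lost. Avoiding this requires serializing all writers, which is the bottleneck the description mentions.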
23 changes: 23 additions & 0 deletions docs/reference/registries/local.md
@@ -0,0 +1,23 @@
# Local Registry

## Description

The local registry provides support for storing the protobuf representation of your feature store objects (data sources, feature views, feature services, etc.) in the local file system. It is only intended to be used for experimentation with Feast and should not be used in production.

File-based registries have inherent limitations: changing a single field in the registry requires rewriting the whole registry file. With multiple concurrent writers, this creates a risk of data loss, or a bottleneck on registry writes, since all changes have to be serialized (e.g. when running materialization for multiple feature views or time ranges concurrently).

An example of how to configure this would be:

## Example

{% code title="feature_store.yaml" %}
```yaml
project: feast_local
registry:
  path: registry.pb
  cache_ttl_seconds: 60
online_store: null
offline_store:
  type: dask
```
{% endcode %}
23 changes: 23 additions & 0 deletions docs/reference/registries/s3.md
@@ -0,0 +1,23 @@
# S3 Registry

## Description

The S3 registry provides support for storing the protobuf representation of your feature store objects (data sources, feature views, feature services, etc.) in an S3 bucket.

While it can be used in production, file-based registries have inherent limitations: changing a single field in the registry requires rewriting the whole registry file. With multiple concurrent writers, this creates a risk of data loss, or a bottleneck on registry writes, since all changes have to be serialized (e.g. when running materialization for multiple feature views or time ranges concurrently).

An example of how to configure this would be:

## Example

{% code title="feature_store.yaml" %}
```yaml
project: feast_aws_s3
registry:
  path: s3://[YOUR BUCKET YOU CREATED]/registry.pb
  cache_ttl_seconds: 60
online_store: null
offline_store:
  type: dask
```
{% endcode %}
@@ -1,4 +1,4 @@
# Snowflake registry
# Snowflake Registry

## Description

@@ -1,9 +1,4 @@
---
description: >-
Tutorial on how to use the SQL registry for scalable registry updates
---

# Using Scalable Registry
# SQL Registry

## Overview
