Update 0xxx-tekton-results-db-index.md
khrm authored Sep 24, 2024
1 parent cc23514 commit ad86a0e
Showing 1 changed file with 11 additions and 11 deletions.
22 changes: 11 additions & 11 deletions teps/0xxx-tekton-results-db-index.md
@@ -38,16 +38,16 @@ collaborators: []
<!-- /toc -->

## Summary
-TektonResults stores PipelineRuns, TaskRuns, Events as Records. Moreover, these are linked to parent object - PipelineRun or TaskRun via Results Table row.
-At present, there are no indexes created for querying these Records/Results. This propose a standard way in which Results can create certain indexes.
+TektonResults stores PipelineRuns, TaskRuns, Events as Records. Moreover, these are linked to the parent object - PipelineRun or TaskRun via the `results` Table row.
+At present, there are no indexes created for querying these Records/Results. This proposes a standard way in which Tekton Results can create certain indexes.


## Motivation
-Different platform have different annotations/labels which they use to filter out records.
-Results can't create predefined Indexes. They should be configurable via certain config.
-As a single row in Result Table has relation to many rows in Record Table, we should utilize Results Table to generate faster queries.
+Different platforms have different annotations/labels which they use to filter out records.
+Results can't create predefined Indexes. They should be configurable via certain configs.
+As a single row in the `results` Table has a relation to many rows in the `records` Table, we should utilize the `results` Table to generate faster queries.

-Results Table row contains annotations and record summary annotations to integrate with different platforms. These platforms can communicate on what labels/annotations to store by these json values to annotation `results.tekton.dev/resultAnnotations` or `results.tekton.dev/recordSummaryAnnotations`. We store these json values as jsonb.
+The `results` Table row contains annotations and record summary annotations to integrate with different platforms. These platforms can communicate on what labels/annotations to store by these JSON values to annotation `results.tekton.dev/resultAnnotations` or `results.tekton.dev/recordSummaryAnnotations`. We store these JSON values as JSONB in the annotation or recordSummaryAnnotation column.

Now integrators let's say `workflow` has `components`, `application`

@@ -70,7 +70,7 @@ Listing non-goals helps to focus discussion and make progress.

### Requirements

--JSONB values in Result table to be indexed based on Tekton Results Admin specified configurationn.
+-JSONB values in the `results` table to be indexed based on Tekton Results Admin specified configurations.

## Proposal

@@ -86,7 +86,7 @@ Let's say a platform `workflow` creates the following labels on Runs:
`workflow.foo-service/component`
`workflow.bar-service/scenario`

-Now these labels and values should also be passed as results annotations so that platform can communicate what value to store in annotations/summary annotations row. Ref: https://github.com/tektoncd/results/pull/426/files
+Now these labels and values should also be passed as results annotations so that the platform can communicate what value to store in the annotations/summary annotations row. Ref: https://github.com/tektoncd/results/pull/426/files
```
apiVersion: tekton.dev/v1
kind: PipelineRun
@@ -100,9 +100,9 @@ metadata:
Now, Tekton Results can index all these four values.

One more advantage of having these fields is we don't need to filter out based on `PipelineRun` if platform generates `PipelineRun` and needs to display values from `PipelineRun`.
-We can store more relevant but limited amoutn of fields from Run Status or Spec in annotations column of Results Table.
+We can store some more relevant but limited number of fields from Run Status or Spec in the annotations column of `results` Table.

-Also, making query on Results Table outperforms the query on Records Table because of one to many relations.
+Also, even without indexes, making a query on `results` Table outperforms the query on the `records` Table because of one-to-many relations and the `records` table having much more number of rows and data per column.

We have observed, in certain production environments, records rows reaching more than half a million for just fifty thousand records, with one PipelineRun having 9 TaskRuns.
A sample of this:
@@ -112,7 +112,7 @@ SELECT count(*) FROM "records" WHERE parent = 'scanner-build'
SELECT count(*) FROM "results" WHERE parent = 'scanner-build'
59101
```
-Unless UI want to show TaskRuns, it should only query Results Table.
+Unless UI want to show TaskRuns, it should only query the `results` Table.
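
As a rough illustration of the requirement that JSONB values in the `results` table be indexed from admin-specified configuration, the sketch below turns a list of annotation keys into PostgreSQL expression-index statements. It is not part of the TEP or of Tekton Results: the function name `index_statements`, the index naming scheme, and the idea that the configuration is a plain list of keys are all assumptions made for the example. The table name `results` and the `annotations` column come from the text above.

```python
import re


def index_statements(keys, table="results", column="annotations"):
    """Build one PostgreSQL expression-index statement per JSONB key.

    Hypothetical helper: the naming scheme is an assumption, and index
    names longer than PostgreSQL's 63-byte identifier limit are not handled.
    """
    statements = []
    for key in keys:
        # Index names must be plain identifiers, so collapse other characters.
        suffix = re.sub(r"[^a-z0-9]+", "_", key.lower()).strip("_")
        # Double any single quote so the key is a valid SQL string literal.
        literal = key.replace("'", "''")
        statements.append(
            f"CREATE INDEX IF NOT EXISTS {table}_{column}_{suffix}_idx "
            f"ON {table} (({column} ->> '{literal}'));"
        )
    return statements


# Hypothetical admin configuration: label keys from the `workflow` example.
configured_keys = [
    "workflow.foo-service/component",
    "workflow.bar-service/scenario",
]

for statement in index_statements(configured_keys):
    print(statement)
```

An expression index on `column ->> 'key'` only helps queries that filter on that exact expression; a GIN index over the whole JSONB column is the usual alternative when the set of keys is not known in advance, at the cost of a larger index.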


## Design Evaluation