[Bug]: Elasticsearch Query Parameter Configuration Error: window_size Less Than size #16293

Open
playarikama opened this issue Sep 30, 2024 · 1 comment
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@playarikama

Bug Description

While attempting to execute an Elasticsearch query, a BadRequestError was encountered. The error message indicates that the window_size parameter value is less than the size parameter value, which violates Elasticsearch's validation rules.

#12695 addressed a similar issue, but the same problem still occurs when using index.as_retriever.

Version

llama-index 0.11.14

Steps to Reproduce

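import os

from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.elasticsearch import (
    AsyncDenseVectorStrategy,
    ElasticsearchStore,
)

# Excerpts from a class; `self` refers to its instance.
def _get_vector_store(self):
    return ElasticsearchStore(
        index_name=self.index_name,
        es_cloud_id=os.environ["ELASTICSEARCH_CLOUD_ID"],
        es_api_key=os.environ["ELASTICSEARCH_API_KEY"],
        retrieval_strategy=AsyncDenseVectorStrategy(hybrid=True),
    )

index = VectorStoreIndex.from_vector_store(
    vector_store=self.vector_store,
    storage_context=self.storage_context,
)

# similarity_top_k=100 becomes the request size of 100 seen in the error below.
retriever = index.as_retriever(
    similarity_top_k=100,
    embed_model=self.embedder,
)

nodes = retriever.retrieve(query)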

Relevant Logs/Tracebacks

elasticsearch.BadRequestError: BadRequestError(400, 'action_request_validation_exception', 'Validation Failed: 1: [rank] requires [window_size: 10] be greater than or equal to [size: 100];')
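
The validation message refers to the window_size of Elasticsearch's rank clause, reported as 10, while the request size was 100 (the similarity_top_k above). As a minimal sketch of that constraint, and not a confirmed fix, the hypothetical helper below builds rank options whose window_size is never smaller than the requested top-k. The names rrf_options and violates_rank_rule are invented for illustration, and passing the resulting dict to the retrieval strategy through an rrf argument is an assumption that should be checked against the installed Elasticsearch client version.

```python
from typing import Any, Dict, Optional


def rrf_options(top_k: int, window_size: Optional[int] = None) -> Dict[str, Any]:
    """Return rank options with window_size >= top_k (hypothetical helper)."""
    if top_k < 1:
        raise ValueError("top_k must be a positive integer")
    # Elasticsearch rejects the request when window_size < size,
    # so raise window_size to at least the requested number of hits.
    effective = top_k if window_size is None else max(window_size, top_k)
    return {"window_size": effective}


def violates_rank_rule(size: int, window_size: int) -> bool:
    """Mirror the server-side check quoted in the traceback."""
    return window_size < size


options = rrf_options(top_k=100)
print(options)  # {'window_size': 100}

# Assumption: the strategy accepts these options via an `rrf` argument, e.g.
# AsyncDenseVectorStrategy(hybrid=True, rrf=options)
```

With the values from the traceback, violates_rank_rule(size=100, window_size=10) is True, which matches the 400 response.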

dosubot bot commented Sep 30, 2024

Hey @playarikama! I'm here to help you with any bugs, questions, or contributions you have. Let's tackle this Elasticsearch query parameter issue together!

To resolve the BadRequestError in Elasticsearch where the window_size parameter is less than the size parameter when using index.as_retriever in llama-index 0.11.14, you need to ensure that the window_size parameter is set to a value greater than or equal to the size parameter.

In the SentenceWindowNodeParser class, the window_size parameter is defined with a default value of 3. You can adjust this value when creating an instance of SentenceWindowNodeParser to ensure it meets the required condition.

Here is an example of how to set the window_size parameter correctly:

from llama_index.core.node_parser.text.sentence_window import SentenceWindowNodeParser

# Ensure window_size is greater than or equal to size
window_size = 100  # Example value, should be >= size parameter used in index.as_retriever

parser = SentenceWindowNodeParser(window_size=window_size)

# Use the parser as needed

Make sure to replace 100 with a value that is appropriate for your use case and is greater than or equal to the size parameter you are using with index.as_retriever [1].

Additionally, ensure that the size parameter in your ElasticsearchStore class is set correctly. The size parameter is set or modified through the batch_size attribute, which defaults to 200 but can be adjusted during initialization:

class ElasticsearchStore(BasePydanticVectorStore):
    ...
    batch_size: int = 200
    ...

    def __init__(
        self,
        ...
        batch_size: int = 200,
        ...
    ) -> None:
        ...
        self.batch_size = batch_size
        ...

You can modify the batch_size during the instantiation of the ElasticsearchStore class to ensure it aligns with your window_size parameter [2].
