add aliases to splitembed (#13)
* Added ability to alias embeddings
gotsysdba authored Sep 20, 2024
1 parent 403f682 commit 6f57920
Showing 26 changed files with 455 additions and 360 deletions.
21 changes: 18 additions & 3 deletions README.md
@@ -6,20 +6,31 @@

The **Oracle AI Microservices Sandbox** provides a streamlined environment where developers and data scientists can explore the potential of Generative Artificial Intelligence (GenAI) combined with Retrieval-Augmented Generation (RAG) capabilities. By integrating **Oracle Database 23ai** AI Vector Search, the Sandbox enables users to enhance existing Large Language Models (LLMs) through RAG.

## Sandbox Features

- [Configuring Embedding and Chat Models](configuration/model_config)
- [Splitting and Embedding Documentation](tools/split_embed)
- [Storing Embedded Documents into the Oracle Database](tools/split_embed)
- [Modifying System Prompts (Prompt Engineering)](tools/prompt_eng)
- [Experimenting with **LLM** Parameters](chatbot)
- [Testing Framework on auto-generated or existing Q&A datasets](test_framework)
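
The split-and-embed workflow behind the two `split_embed` links above boils down to chunking documents with overlap before each chunk is embedded and stored. A stdlib-only sketch of the chunking step (the function name and sizes are illustrative, not the Sandbox's actual API):

```python
def split_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of chunk_size characters, overlapping by overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    # Each chunk starts `step` characters after the previous one,
    # so consecutive chunks share `overlap` characters of context.
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either chunk, at the cost of some duplicated storage.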

## Getting Started

The **Oracle AI Microservices Sandbox** is available to install in your own environment, which may be a developer's desktop, an on-premises data center, or a cloud provider. It can run on bare-metal, within a container, or in a Kubernetes cluster.

- For more information, including additional information on **Setup and Configuration** please visit the [documentation](https://oracle-samples.github.io/oaim-sandbox)
+ For more information, including more details on **Setup and Configuration** please visit the [documentation](https://oracle-samples.github.io/oaim-sandbox).

### Prerequisites

- Oracle Database 23ai incl. Oracle Database 23ai Free
- Python 3.11 (for running Bare-Metal)
- Container Runtime e.g. docker/podman (for running in a Container)
- Access to an Embedding and Chat Model:
-   - API Keys for Third-Party Chat Model
-   - On-Premises Chat Model
+   - API Keys for Third-Party Models
+   - On-Premises Models<sub>\*</sub>

<sub>\*Oracle recommends running On-Premises Models on hardware with GPUs. For more information, please review the [Infrastructure](infrastructure/) documentation.</sub>

#### Bare-Metal Installation

@@ -48,6 +59,8 @@ To run the application on bare-metal; download the [source](https://github.com/o

1. Navigate to `http://localhost:8501`.

1. [Configure](configuration) the Sandbox.

#### Container Installation

To run the application in a container; download the [source](https://github.com/oracle-samples/oaim-sandbox) and from the top-level directory:
@@ -68,6 +81,8 @@ To run the application in a container; download the [source](https://github.com/

1. Navigate to `http://localhost:8501`.

1. [Configure](configuration) the Sandbox.

## Contributing

This project welcomes contributions from the community. Before submitting a pull request, please [review our contribution guide](./CONTRIBUTING.md).
5 changes: 3 additions & 2 deletions app/.dockerignore
@@ -5,8 +5,9 @@
**/__init__.py
**/tests
**/etc
- **/.streamlit/config.toml
- **/tns_admin
+ # **/.streamlit/config.toml
+ # **/tns_admin
**/.pytest_cache
# No shell scripts included except the entrypoint
**/*.sh
!entrypoint.sh
28 changes: 11 additions & 17 deletions app/requirements.txt
@@ -11,26 +11,20 @@
### pip3 install -r app/requirements.txt

## Top-Level brings in the required dependencies, if adding modules, try to find the minimum required
- beautifulsoup4==4.12.3
- griffe==0.48.0
- giskard[llm]==2.15.0
- langchain==0.2.16
- langchain_community==0.2.16
- langchain_huggingface==0.0.3
- langchain_ollama==0.1.3
- langchain_openai==0.1.23
- llama_index==0.11.5
+ giskard[llm]==2.15.1
+ IPython==8.27.0
+ langchain_community==0.3.0
+ langchain_huggingface==0.1.0
+ langchain_ollama==0.2.0
+ langchain_openai==0.2.0
+ llama_index==0.11.10
lxml==5.3.0
- oci>=2.131.0
- oracledb>=2.3.0
- plotly==5.24.0
- pytest>=8.3.2
- setuptools>=74.1.0
+ oci>=2.0.0
+ oracledb>=2.0.0
+ plotly==5.24.1
streamlit==1.38.0
- IPython==8.27.0
- pypdf==4.3.1

## For Licensing Purposes... this ensures no GPU modules
-f https://download.pytorch.org/whl/cpu/torch
torch==2.4.1+cpu ; sys_platform == "linux"
- torch==2.2.2 ; sys_platform == "darwin"
+ torch==2.2.2 ; sys_platform == "darwin"
2 changes: 1 addition & 1 deletion app/src/content/chatbot.py
@@ -91,7 +91,7 @@ def main():
chat_history.clear()
lm_model = st_common.lm_sidebar()
else:
-        st.error("No chat models are configured and/or enabled.", icon="🚨",)
+        st.error("No chat models are configured and/or enabled.", icon="🚨")
st.stop()

# RAG
8 changes: 4 additions & 4 deletions app/src/content/import_settings.py
@@ -33,10 +33,9 @@ def compare_dicts_recursive(current, uploaded):
            nested_diff = compare_dicts_recursive(current[key], uploaded[key])
            if nested_diff:
                diff[key] = nested_diff
-        elif current.get(key) != uploaded.get(key):
-            if uploaded.get(key) != "":
-                # Report differences for non-dict values
-                diff[key] = {"current": current.get(key), "uploaded": uploaded.get(key)}
+        elif current.get(key) != uploaded.get(key) and uploaded.get(key) != "":
+            # Report differences for non-dict values
+            diff[key] = {"current": current.get(key), "uploaded": uploaded.get(key)}

return diff
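
Reassembled from the hunk above, the refactored helper runs stand-alone. The loop header sits outside the visible diff, so iterating over the union of both dicts' keys is an assumption:

```python
def compare_dicts_recursive(current: dict, uploaded: dict) -> dict:
    """Return a nested {key: {"current": ..., "uploaded": ...}} of differing values."""
    diff = {}
    for key in current.keys() | uploaded.keys():  # assumed loop header
        if isinstance(current.get(key), dict) and isinstance(uploaded.get(key), dict):
            # Recurse into nested settings and keep only non-empty sub-diffs
            nested_diff = compare_dicts_recursive(current[key], uploaded[key])
            if nested_diff:
                diff[key] = nested_diff
        elif current.get(key) != uploaded.get(key) and uploaded.get(key) != "":
            # Report differences for non-dict values; empty uploads are ignored
            diff[key] = {"current": current.get(key), "uploaded": uploaded.get(key)}
    return diff
```

The commit folds the nested `if uploaded.get(key) != ""` into the `elif` condition, which is behavior-preserving: both forms skip keys whose uploaded value is the empty string.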

@@ -59,6 +58,7 @@ def compare_with_uploaded_json(current_state, uploaded_json):


def update_session_state_recursive(session_state, updates):
"""Apply settings to the Session State"""
for key, value in updates.items():
if value == "" or value is None:
# Skip empty string values
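
The body of `update_session_state_recursive` is cut off above. A plausible completion under the rule the visible lines state (skip empty values, recurse into nested dicts); everything below the `continue` is an assumption, not the Sandbox's actual code:

```python
def update_session_state_recursive(session_state: dict, updates: dict) -> None:
    """Apply settings to the Session State"""
    for key, value in updates.items():
        if value == "" or value is None:
            # Skip empty string values
            continue
        if isinstance(value, dict):
            # Recurse into nested settings, creating the level if missing (assumed)
            update_session_state_recursive(session_state.setdefault(key, {}), value)
        else:
            session_state[key] = value
```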
1 change: 0 additions & 1 deletion app/src/content/model_config.py
@@ -38,7 +38,6 @@ def initialise_streamlit():
"COSINE": [DistanceStrategy.COSINE],
"EUCLIDEAN_DISTANCE": [DistanceStrategy.DOT_PRODUCT],
"DOT_PRODUCT": [DistanceStrategy.EUCLIDEAN_DISTANCE],
- "JACCARD": [DistanceStrategy.JACCARD],
"MAX_INNER_PRODUCT": [DistanceStrategy.MAX_INNER_PRODUCT],
}
logger.info("Initialised Distance Metric Config")
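
The strategies in the dict above behave quite differently at query time; a stdlib illustration of cosine versus Euclidean distance. (Incidentally, the hunk maps the `EUCLIDEAN_DISTANCE` key to `DistanceStrategy.DOT_PRODUCT` and vice versa, which looks transposed and may be worth double-checking upstream.)

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity: 0.0 for parallel vectors, 1.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1 - dot / (math.hypot(*a) * math.hypot(*b))

def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between two vectors."""
    return math.dist(a, b)
```

Cosine distance ignores vector magnitude (only direction matters), which is why it is the usual default for text embeddings.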
4 changes: 2 additions & 2 deletions app/src/content/oci_config.py
@@ -28,7 +28,7 @@ def initialise_streamlit():
if "oci_configured" in state:
return

- logger.info("Initialising OCI Configuration")
+ logger.info("Initializing OCI Configuration")
if "oci_config" not in state:
state.oci_config = oci_utils.initialise()
try:
@@ -108,7 +108,7 @@ def main():
st.success("OCI API Authentication Tested Successfully", icon="✅")
state.oci_config = test_config
st.success("OCI Configuration Saved", icon="✅")
- state.oci_configred = True
+ state.oci_configured = True
except oci_utils.OciException as ex:
logger.exception(ex, exc_info=False)
st.error(ex, icon="🚨")
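
The hunk above tests credentials before persisting them, and fixes the `oci_configred` typo so the configured flag is actually set. The validate-then-save pattern, with a hypothetical `validate` standing in for the real OCI SDK call:

```python
class OciException(Exception):
    """Raised when an OCI configuration fails validation (stand-in for oci_utils.OciException)."""

# Illustrative key set; the real OCI config has its own required fields
REQUIRED_KEYS = {"user", "fingerprint", "key_file", "tenancy", "region"}

def validate(config: dict) -> None:
    """Hypothetical check; the Sandbox calls the OCI API here instead."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise OciException(f"Missing OCI settings: {sorted(missing)}")

def save_oci_config(state: dict, test_config: dict) -> bool:
    """Only persist the config and mark the state configured after validation succeeds."""
    try:
        validate(test_config)
    except OciException:
        return False
    state["oci_config"] = test_config
    state["oci_configured"] = True  # the commit fixes a typo here: oci_configred
    return True
```

Keeping the flag assignment inside the success path means a failed test leaves the previous configuration untouched.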
(diffs for the remaining changed files not shown)
