validatedpatterns · dminnear-rh · Jun 25, 2026
diff --git a/content/patterns/lemonade-stand-quickstart/_index.adoc b/content/patterns/lemonade-stand-quickstart/_index.adoc
@@ -0,0 +1,37 @@
+---
+title: Lemonade Stand AI Quickstart
+date: 2026-06-25
+tier: sandbox
+summary: This pattern deploys an AI guardrails demonstration with a multi-layered safety pipeline, interactive chatbot, and real-time monitoring on OpenShift.
+rh_products:
+  - Red Hat OpenShift Container Platform
+  - Red Hat OpenShift AI
+industries:
+  - General
+focus_areas:
+  - AI
+  - Safety
+  - AI Quickstart
+aliases: /lemonade-stand-quickstart/
+links:
+  github: https://github.com/validatedpatterns-sandbox/ai-quickstart-lemonade-stand
+  install: getting-started
+  bugs: https://github.com/validatedpatterns-sandbox/ai-quickstart-lemonade-stand/issues
+  feedback: https://docs.google.com/forms/d/e/1FAIpQLScI76b6tD1WyPu2-d_9CCVDr3Fu5jYERthqLKJDUGwqBg7Vcg/viewform
+---
+:toc:
+:imagesdir: /images
+:_content-type: ASSEMBLY
+include::modules/comm-attributes.adoc[]
+
+include::modules/lemonade-stand-quickstart-about.adoc[leveloffset=+1]
+
+include::modules/lemonade-stand-quickstart-architecture.adoc[leveloffset=+1]
+
+[id="next-steps-lemonade-stand-quickstart"]
+== Next steps
+
+* link:getting-started[Install this pattern]
+* link:cluster-sizing[Cluster sizing]
+* link:customizing-this-pattern[Customizing this pattern]
+* link:troubleshooting[Troubleshooting]
diff --git a/content/patterns/lemonade-stand-quickstart/cluster-sizing.adoc b/content/patterns/lemonade-stand-quickstart/cluster-sizing.adoc
@@ -0,0 +1,29 @@
+---
+title: Cluster sizing
+weight: 30
+aliases: /lemonade-stand-quickstart/cluster-sizing/
+---
+
+:toc:
+:imagesdir: /images
+:_content-type: ASSEMBLY
+include::modules/comm-attributes.adoc[]
+include::modules/ai-quickstart-lemonade-stand/metadata-ai-quickstart-lemonade-stand.adoc[]
+
+include::modules/cluster-sizing-template.adoc[]
+
+[id="lemonade-stand-quickstart-gpu-node-requirements"]
+== GPU node requirements
+
+In addition to the worker nodes listed above, this pattern requires at least 1 GPU-equipped node for LLM inference. On AWS, the pattern automatically provisions a `g5.2xlarge` instance with an NVIDIA A10G GPU. On other providers and bare metal, a GPU node must already be part of the cluster before deploying the pattern.
+
+.GPU node minimum requirements
+[cols="<,^,<,<"]
+|===
+| Cloud provider | Node type | Number of nodes | Instance type
+
+| Amazon Web Services
+| GPU Worker
+| 1
+| g5.2xlarge
+|===
diff --git a/content/patterns/lemonade-stand-quickstart/customizing-this-pattern.adoc b/content/patterns/lemonade-stand-quickstart/customizing-this-pattern.adoc
@@ -0,0 +1,138 @@
+---
+title: Customizing this pattern
+weight: 20
+aliases: /lemonade-stand-quickstart/customizing/
+---
+
+:toc:
+:imagesdir: /images
+:_content-type: ASSEMBLY
+include::modules/comm-attributes.adoc[]
+
+[id="customizing-lemonade-stand-quickstart"]
+== Customizing the Lemonade Stand AI Quickstart pattern
+
+This pattern deploys an AI chatbot with a multi-layered guardrails pipeline that includes model-based detectors, a rule-based language detector, and regex-based competitor filtering. You can customize the LLM, the detector configuration, and the monitoring settings.
+
+[id="changing-model-lemonade-stand"]
+=== Changing the LLM model
+
+The pattern serves Llama 3.2 3B Instruct (FP8-quantized) by default through vLLM on KServe. The model is defined in the lemonade-stand-assistant Helm chart's `values.yaml`.
+
+To change the locally served model, update the model configuration in the Helm chart values. The model must be compatible with vLLM and fit within the available GPU VRAM on the provisioned node (NVIDIA A10G with 24 GB VRAM on `g5.2xlarge`).
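+
+As a rough first check before swapping models, the memory taken by the weights alone can be estimated as parameter count times bytes per parameter. The following Python sketch is illustrative only: the function names and the 50% headroom fraction are assumptions, not settings of the Helm chart or of vLLM, and the estimate ignores the KV cache and activation memory, which also consume VRAM.
+
+[source,python]
+----
+def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
+    """Approximate memory needed for the model weights alone, in GB."""
+    return num_params * bytes_per_param / 1e9
+
+
+def fits_in_vram(num_params: float, bytes_per_param: float,
+                 vram_gb: float = 24.0, headroom: float = 0.5) -> bool:
+    """Return True if the weights use at most `headroom` of the VRAM.
+
+    The headroom fraction is an illustrative assumption that leaves room
+    for the KV cache and activations; it is not a vLLM parameter.
+    """
+    return weight_memory_gb(num_params, bytes_per_param) <= vram_gb * headroom
+
+
+# Nominal 3B-parameter model at 1 byte per parameter (FP8).
+print(weight_memory_gb(3e9, 1))   # 3.0
+print(fits_in_vram(3e9, 1))       # True
+# Nominal 70B-parameter model at 2 bytes per parameter (FP16).
+print(fits_in_vram(70e9, 2))      # False
+----
+
+Under these assumptions, a nominal 3B-parameter FP8 model needs about 3 GB for its weights, which leaves most of the 24 GB for runtime memory, whereas a 70B-parameter FP16 model does not fit on a single A10G.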
+
+[id="using-external-model-lemonade-stand"]
+=== Using an external model endpoint (BYOM)
+
+Instead of serving a model locally on GPU, you can configure the pattern to use an external Model-as-a-Service endpoint. This eliminates the GPU node requirement for inference.
+
+. Make a local copy of the secrets template outside of your repository:
++
+[WARNING]
+====
+Do not add, commit, or push this file to your repository. Doing so might expose personal credentials to GitHub.
+====
++
+[source,terminal]
+----
+$ cp values-secret.yaml.template ~/values-secret-ai-quickstart-lemonade-stand.yaml
+----
+
+. Edit the secrets file and set the API key for your external model endpoint:
++
+[source,terminal]
+----
+$ vim ~/values-secret-ai-quickstart-lemonade-stand.yaml
+----
++
+[source,yaml]
+----
+  - name: lemonade-stand
+    vaultPrefixes:
+    - global
+    fields:
+    - name: vllm-api-key
+      value: <your-external-api-key>
+----
+
+. Set the `model` section in the Helm chart values to point to your external endpoint:
++
+[source,yaml]
+----
+model:
+  name: my-model
+  endpoint: my-maas-instance
+  port: 443
+----
+
+When using an external model endpoint, the vLLM InferenceService is not deployed and the GPU node is not required for LLM inference. The guardrails pipeline continues to function normally with the external model.
+
+[id="enabling-gpu-detectors-lemonade-stand"]
+=== Enabling GPU for detector models
+
+By default, the HAP (hate speech, abuse, and profanity) and prompt injection detector models run on CPU. You can enable GPU acceleration for these models to reduce inference latency, but this requires additional GPU resources.
+
+To enable GPU for the detector models, set the `useGpu` flag in the Helm chart values:
+
+[source,yaml]
+----
+detectors:
+  hap:
+    useGpu: true
+  promptInjection:
+    useGpu: true
+----
+
+[NOTE]
+====
+Enabling GPU for both detectors requires 2 additional GPUs beyond the 1 GPU used for the LLM, for a total of 3 GPUs. You must provision additional GPU nodes before enabling this option.
+====
+
+[id="configuring-detector-thresholds-lemonade-stand"]
+=== Configuring detector thresholds
+
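+The guardrails pipeline uses three detectors, each with a configurable threshold. For the HAP and prompt injection detectors, a lower threshold increases sensitivity (more content is blocked), while a higher threshold reduces false positives. For the language detector, the threshold is a confidence level that the text is English.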
+
+The default thresholds are:
+
+[cols="1,1,2",options="header"]
+|===
+| Detector | Default threshold | Description
+
+| IBM Granite Guardian HAP
+| 0.5
+| Hate speech, abuse, and profanity detection
+
+| DeBERTa v3 Prompt Injection
+| 0.5
+| Prompt injection and jailbreak detection
+
+| Lingua Language
+| 0.88
+| English language confidence threshold
+|===
+
+To adjust detector thresholds, modify the Guardrails Orchestrator configuration in the `fms-orchestr8-config-nlp` ConfigMap within the lemonade-stand-assistant Helm chart.
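+
+To make the effect of a threshold change concrete, the following Python sketch shows one plausible way a score is compared against a threshold. It is not the Guardrails Orchestrator implementation: the dictionary keys, the function name, and the comparison directions are assumptions made for illustration. Only the default values come from the table above.
+
+[source,python]
+----
+# Default values from the table; the key names are hypothetical.
+DEFAULT_THRESHOLDS = {"hap": 0.5, "prompt_injection": 0.5, "language": 0.88}
+
+
+def is_flagged(detector: str, score: float, thresholds=None) -> bool:
+    """Decide whether a detector score would block a message.
+
+    Assumes `score` is the likelihood of harmful content for "hap" and
+    "prompt_injection", and the confidence that the text is English for
+    "language".
+    """
+    limits = DEFAULT_THRESHOLDS if thresholds is None else thresholds
+    threshold = limits[detector]
+    if detector == "language":
+        # Assumption: block when English confidence falls below the threshold.
+        return score < threshold
+    # Assumption: block when the harmful-content score reaches the threshold.
+    return score >= threshold
+
+
+print(is_flagged("hap", 0.7))         # True
+print(is_flagged("hap", 0.3))         # False
+print(is_flagged("language", 0.95))   # False
+# Lowering the HAP threshold to 0.2 makes the same 0.3 score a block.
+print(is_flagged("hap", 0.3, {"hap": 0.2}))   # True
+----
+
+In this sketch, lowering the HAP threshold from 0.5 to 0.2 turns a previously allowed score of 0.3 into a blocked one, which is the sensitivity trade-off to weigh when editing the ConfigMap.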
+
+[id="configuring-regex-detector-lemonade-stand"]
+=== Configuring the regex detector
+
+The FastAPI application includes a regex-based detector that blocks mentions of competitor fruit names (oranges, apples, bananas, and others) across 13+ languages. This detector runs locally in the application before the request reaches the Guardrails Orchestrator.
+
+To modify the blocked terms or supported languages, edit the regex patterns in the `app_fastapi.py` file in the lemonade-stand-assistant repository.
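+
+The general technique can be sketched in a few lines of Python. This is a minimal illustration, not the code in `app_fastapi.py`: the term list, the variable names, and the `mentions_competitor` function are hypothetical, and the actual patterns in the repository might be structured differently.
+
+[source,python]
+----
+import re
+
+# Hypothetical sample of competitor fruit names in a few languages.
+COMPETITOR_TERMS = ["orange", "apple", "banana", "naranja", "manzana", "pomme"]
+
+# One case-insensitive pattern; word boundaries avoid matching inside
+# longer words, and the optional "s" covers simple plurals.
+COMPETITOR_PATTERN = re.compile(
+    r"\b(?:" + "|".join(re.escape(term) for term in COMPETITOR_TERMS) + r")s?\b",
+    re.IGNORECASE,
+)
+
+
+def mentions_competitor(text: str) -> bool:
+    """Return True if the text mentions any listed competitor fruit."""
+    return COMPETITOR_PATTERN.search(text) is not None
+
+
+print(mentions_competitor("Do you sell apples?"))     # True
+print(mentions_competitor("Quiero una MANZANA"))      # True
+print(mentions_competitor("I love lemonade"))         # False
+print(mentions_competitor("pineapple juice please"))  # False
+----
+
+Because of the word boundaries, "pineapple" does not trigger the "apple" term in this sketch. Whether such compound words should be blocked is a design choice to revisit when editing the real patterns.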
+
+[id="configuring-shiny-dashboard-lemonade-stand"]
+=== Adjusting the monitoring dashboard
+
+The R Shiny dashboard polls the FastAPI application's `/metrics` endpoint to display guardrail activation statistics in real time. The default polling interval is 1 second.
+
+To adjust the refresh interval, modify the `shinyDashboard.metrics.refreshInterval` value in the Helm chart values:
+
+[source,yaml]
+----
+shinyDashboard:
+  metrics:
+    refreshInterval: 5
+----
+
+Push your changes to your forked repository so the GitOps framework applies the updated configuration.
diff --git a/content/patterns/lemonade-stand-quickstart/getting-started.adoc b/content/patterns/lemonade-stand-quickstart/getting-started.adoc
@@ -0,0 +1,158 @@
+---
+title: Getting started
+weight: 10
+aliases: /lemonade-stand-quickstart/getting-started/
+---
+
+:toc:
+:imagesdir: /images
+:_content-type: ASSEMBLY
+include::modules/comm-attributes.adoc[]
+
+[id="deploying-lemonade-stand-quickstart-pattern"]
+== Deploying the Lemonade Stand AI Quickstart pattern
+
+.Prerequisites
+
+* An OpenShift cluster (version 4.18 or later). This pattern requires at least 1 NVIDIA GPU node for LLM inference.
+ ** *AWS*: The pattern automatically provisions 1 `g5.2xlarge` GPU worker node (NVIDIA A10G) during installation. No GPU nodes need to be present before you deploy.
+ ** *Other providers and bare metal*: A GPU node must already be part of the OpenShift cluster before you deploy this pattern. The pattern installs all required operators automatically.
+ ** To create an OpenShift cluster, go to the https://console.redhat.com/[Red Hat Hybrid Cloud console].
+ ** Select *OpenShift \-> Red Hat OpenShift Container Platform \-> Create cluster*.
+* The Helm binary. For instructions, see link:https://helm.sh/docs/intro/install/[Installing Helm].
+* The `oc` CLI tool. For instructions, see link:https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/getting-started-cli.html[Getting started with the OpenShift CLI].
+* Additional installation tool dependencies. For details, see link:https://validatedpatterns.io/learn/quickstart/[Patterns quick start].
+
+[id="preparing-for-deployment-lemonade-stand"]
+== Preparing for deployment
+.Procedure
+
+. Fork the link:https://github.com/validatedpatterns-sandbox/ai-quickstart-lemonade-stand[ai-quickstart-lemonade-stand] repository on GitHub. You must fork the repository to customize this pattern.
+
+. Clone the forked copy of this repository:
++
+[source,terminal]
+----
+$ git clone git@github.com:your-username/ai-quickstart-lemonade-stand.git
+----
+
+. Go to the root directory of your Git repository:
++
+[source,terminal]
+----
+$ cd ai-quickstart-lemonade-stand
+----
+
+. Run the following command to set the upstream repository:
++
+[source,terminal]
+----
+$ git remote add -f upstream git@github.com:validatedpatterns-sandbox/ai-quickstart-lemonade-stand.git
+----
+
+. Verify the setup of your remote repositories by running the following command:
++
+[source,terminal]
+----
+$ git remote -v
+----
++
+.Example output
++
+[source,terminal]
+----
+origin	git@github.com:your-username/ai-quickstart-lemonade-stand.git (fetch)
+origin	git@github.com:your-username/ai-quickstart-lemonade-stand.git (push)
+upstream	git@github.com:validatedpatterns-sandbox/ai-quickstart-lemonade-stand.git (fetch)
+upstream	git@github.com:validatedpatterns-sandbox/ai-quickstart-lemonade-stand.git (push)
+----
+
+. Optional: To customize the deployment, create and switch to a new branch by running the following command:
++
+[source,terminal]
+----
+$ git checkout -b my-branch
+----
++
+Make your changes, then stage and commit them:
++
+[source,terminal]
+----
+$ git add <changed-files>
+$ git commit -m "Customize deployment"
+----
++
+Push the changes to your forked repository:
++
+[source,terminal]
+----
+$ git push origin my-branch
+----
+
+[id="deploying-cluster-using-patternsh-file-lemonade-stand"]
+== Deploying the pattern by using the pattern.sh file
+
+To deploy the pattern by using the `pattern.sh` file, complete the following steps:
+
+. Log in to your cluster by following this procedure:
+
+.. Obtain an API token by visiting link:https://oauth-openshift.apps.<your_cluster>.<domain>/oauth/token/request[https://oauth-openshift.apps.<your_cluster>.<domain>/oauth/token/request].
+
+.. Log in to the cluster by running the following command:
++
+[source,terminal]
+----
+$ oc login --token=<retrieved-token> --server=https://api.<your_cluster>.<domain>:6443
+----
++
+Alternatively, if you already have a kubeconfig file for the cluster, point `oc` at it by setting the `KUBECONFIG` environment variable:
++
+[source,terminal]
+----
+$ export KUBECONFIG=~/<path_to_kubeconfig>
+----
+
+. Deploy the pattern to your cluster by running the following command:
++
+[source,terminal]
+----
+$ ./pattern.sh make install
+----
+
+.Verification
+
+To verify a successful installation, check the health of the Argo CD applications:
+
+. Run the following command:
++
+[source,terminal]
+----
+$ ./pattern.sh make argo-healthcheck
+----
++
+It might take several minutes for all applications to synchronize and reach a healthy state. This includes downloading detector models, initializing the GPU operator, and starting the vLLM inference service.
+
+. Verify that the Operators are installed by navigating to *Operators -> Installed Operators* in the {ocp} web console. Confirm the following Operators are present:
++
+* NVIDIA GPU Operator
+* {rhoai}
+* Node Feature Discovery Operator
+* External Secrets Operator
+
+. After all applications are healthy, verify that the inference service is ready by running the following command:
++
+[source,terminal]
+----
+$ oc get inferenceservice -A
+----
+
+. Access the Lemonade Stand chatbot UI. Navigate to *Networking -> Routes* in the `lemonade-stand` namespace and open the route URL for the `lemonade-stand` service.
+
+. Access the R Shiny monitoring dashboard. Navigate to *Networking -> Routes* in the `lemonade-stand` namespace and open the route URL for the `shiny-dashboard` service.
+
+[id="next-steps-getting-started-lemonade-stand"]
+== Next steps
+
+* link:customizing-this-pattern[Customizing this pattern]
+* link:cluster-sizing[Cluster sizing]
+* link:troubleshooting[Troubleshooting]