Local Development (k3d)

Run the full NVCF self-hosted control plane on your laptop using k3d for development, testing, or demos.

This setup is for local development only. It uses fake GPUs, a single Cassandra replica, and ephemeral storage. Do not use this for production workloads.

Assumptions

This guide assumes:

  • Helm charts are pulled from the NGC registry (nvcr.io/0833294136851237/nvcf-ncp-staging)
  • Container images are pulled from the same NGC registry
  • Image pull secrets are configured in the environment YAML using imagePullSecrets to authenticate with NGC

If you are using a different registry (e.g., Amazon ECR, a private Harbor instance, or a local mirror), update the helm.sources and image sections in the environment file and adjust the pull secret configuration accordingly. See self-hosted-image-mirroring for details on mirroring artifacts to other registries.

A ready-to-use k3d configuration and setup script is available in the nv-cloud-function-helpers repository. Clone it and run ./setup.sh to create the cluster with all prerequisites. The script is the source of truth for local cluster bootstrap. The manual commands below are for debugging and recovery. After the script completes, skip ahead to Deploy the NVCF Stack.
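
For reference, the scripted flow looks roughly like the following (the repository URL and script location are assumptions; check the repository README for the exact path):

$git clone https://github.com/NVIDIA/nv-cloud-function-helpers.git
$cd nv-cloud-function-helpers/examples/self-hosted-local-development
$./setup.sh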

Prerequisites

Install the following tools:

  • Docker (running)
  • k3d v5.x or later
  • kubectl
  • helm >= 3.12
  • helmfile >= 1.1.0, < 1.2.0
  • helm-diff plugin (helm plugin install https://github.com/databus23/helm-diff)
  • NGC API Key from ngc.nvidia.com with access to the NVCF chart/image registry
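
A quick way to confirm the toolchain before starting (output formats vary by tool; this is only a sanity check):

$docker info >/dev/null && echo "Docker is running"
$k3d version
$kubectl version --client
$helm version --short
$helmfile version
$helm plugin list | grep diff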

Step 1: Create the k3d Cluster

Save the following configuration as k3d-config.yaml:

apiVersion: k3d.io/v1alpha5
kind: Simple
metadata:
  name: ncp-local

image: rancher/k3s:v1.30.2-k3s2
servers: 1
agents: 5

ports:
  - port: 8080:80
    nodeFilters:
      - loadbalancer
  - port: 8443:443
    nodeFilters:
      - loadbalancer

options:
  k3d:
    wait: true
  k3s:
    extraArgs:
      - arg: "--disable=traefik"
        nodeFilters:
          - server:*
    nodeLabels:
      - label: run.ai/simulated-gpu-node-pool=default
        nodeFilters:
          - agent:3
          - agent:4
      - label: nvidia.com/gpu.family=hopper
        nodeFilters:
          - agent:3
          - agent:4
      - label: nvidia.com/gpu.machine=NVIDIA-DGX-H100
        nodeFilters:
          - agent:3
          - agent:4
      - label: nvidia.com/cuda.driver.major=535
        nodeFilters:
          - agent:3
          - agent:4

This creates a 6-node cluster: 1 server (control plane) and 5 agents. Agents 3 and 4 are pre-labeled for the fake GPU operator. Traefik is disabled because NVCF uses Envoy Gateway.

Create the cluster:

$k3d cluster create --config k3d-config.yaml

Verify:

$kubectl get nodes
$# Expected: 6 nodes (1 server + 5 agents), all Ready
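
If the fake GPU operator fails to discover nodes later, the usual cause is missing node labels. You can confirm they were applied (the label key comes from the k3d config above; node names follow the k3d naming convention):

$kubectl get nodes -l run.ai/simulated-gpu-node-pool=default
$# Expected: k3d-ncp-local-agent-3 and k3d-ncp-local-agent-4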

Step 2: Install the Fake GPU Operator

The fake GPU operator simulates GPU resources on the pre-labeled nodes so the NVCA agent can discover them. See fake-gpu-operator for full details.

$# Install KWOK (required by the fake GPU operator)
$kubectl apply -f https://github.com/kubernetes-sigs/kwok/releases/download/v0.7.0/kwok.yaml
$kubectl wait --for=condition=Available deployment/kwok-controller -n kube-system --timeout=60s
$
$# Install the fake GPU operator
$helm repo add fake-gpu-operator \
> https://runai.jfrog.io/artifactory/api/helm/fake-gpu-operator-charts-prod --force-update
$
$helm upgrade -i gpu-operator fake-gpu-operator/fake-gpu-operator \
> -n gpu-operator --create-namespace \
> --set 'topology.nodePools.default.gpuCount=8' \
> --set 'topology.nodePools.default.gpuProduct=NVIDIA-H100-80GB-HBM3' \
> --set 'topology.nodePools.default.gpuMemory=81559'

If Helm fails with the error RuntimeClass "nvidia" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata, rerun ./setup.sh from the helper repository. The script removes known stale fake GPU operator resources without deleting the k3d cluster. For the canonical recovery workflow, see examples/self-hosted-local-development/README.md.

Verify fake GPUs appear on the labeled nodes:

$kubectl get nodes -o custom-columns="NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
$# Agents 3 and 4 should show GPU: 8

Step 3: Install CSI SMB Driver

The CSI SMB driver is required for NVCA shared model cache storage:

$helm repo add csi-driver-smb \
> https://raw.githubusercontent.com/kubernetes-csi/csi-driver-smb/master/charts
$
$helm install csi-driver-smb csi-driver-smb/csi-driver-smb \
> -n kube-system --version v1.17.0
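
To confirm the driver registered before moving on (the CSIDriver object name smb.csi.k8s.io is the upstream default and is an assumption here):

$kubectl get csidriver smb.csi.k8s.io
$kubectl get pods -n kube-system | grep csi-smb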

Deploy the NVCF Stack

With the cluster ready, use the Quickstart for the one-click CLI flow. The local k3d quickstart installs the stack, registers the local cluster, and installs NVCA.

The steps below document the manual Helmfile flow and call out the local-specific differences for each step.

Step 1 (Ingress)

Follow as documented, but skip the cloud-provider annotations on the Gateway resource. k3d handles LoadBalancer services automatically via its built-in klipper-lb.
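
Before moving on, it is worth checking that the Envoy gateways referenced later in the environment file exist and are programmed (the gateway names and namespace below come from the template in Step 2):

$kubectl get gateway -n envoy-gateway-system
$# Expected: shared-gw and grpc-gw listed with PROGRAMMED=True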

Step 2 (Environment file)

Create a local development environment file from the template below (local-dev-env.yaml). Save it as environments/<name>.yaml (e.g., environments/my-local.yaml) in your nvcf-self-managed-stack directory.

environments/my-local.yaml
# NVCF Self-Hosted Local Development Environment
# For use with k3d clusters. See the Local Development guide for setup instructions.
#
# Save this file as environments/<name>.yaml in your nvcf-self-managed-stack directory.
# Create a matching secrets/<name>-secrets.yaml file with your registry credentials.
# Deploy with: HELMFILE_ENV=<name> helmfile sync

global:
  # Domain for local access (routes use .localhost TLD)
  domain: "localhost"

  # Helm chart registry (where helmfile pulls OCI charts from)
  helm:
    sources:
      registry: nvcr.io
      repository: 0833294136851237/nvcf-ncp-staging

  # Container image registry (where Kubernetes pulls images from)
  image:
    registry: nvcr.io
    repository: 0833294136851237/nvcf-ncp-staging

  # Pull secret created by create-nvcr-pull-secrets.sh (run once before deploying)
  imagePullSecrets:
    - name: nvcr-pull-secret

  # Disable node selectors for local development (pods schedule on any node)
  nodeSelectors:
    enabled: false

  # k3d uses the local-path StorageClass by default
  storageClass: local-path
  storageSize: 2Gi

  observability:
    tracing:
      enabled: false
      collectorEndpoint: ""
      collectorPort: 4317
      collectorProtocol: http

# Single Cassandra replica for local development
cassandra:
  enabled: true
  replicaCount: 1
  jvm:
    # Fast startup options -- only safe with a single replica.
    # Do NOT use these settings with multiple replicas.
    extraOpts: "-Dcassandra.superuser_setup_delay_ms=100 -Dcassandra.gossip_settle_min_wait_ms=1000"

nats:
  enabled: true

openbao:
  enabled: true
  migrations:
    issuerDiscovery:
      enabled: true

# Gateway configuration matching the local k3d setup script
ingress:
  gatewayApi:
    enabled: true
    controllerNamespace: envoy-gateway-system
    routes:
      nvcfApi:
        routeAnnotations: {}
      apiKeys:
        routeAnnotations: {}
      invocation:
        routeAnnotations: {}
      grpc:
        routeAnnotations: {}
    gateways:
      shared:
        name: shared-gw
        namespace: envoy-gateway-system
        listenerName: http
      grpc:
        name: grpc-gw
        namespace: envoy-gateway-system
        listenerName: tcp

This template is pre-configured for local development:

  • Storage: local-path (2Gi volumes, the default k3d StorageClass)
  • Cassandra: Single replica with fast startup JVM options
  • Node selectors: Disabled (pods schedule on any available node)
  • Registry: nvcr.io/0833294136851237/nvcf-ncp-staging
  • Gateway: shared-gw and grpc-gw in envoy-gateway-system namespace
  • Domain: localhost
  • imagePullSecrets: Pre-configured to reference nvcr-pull-secret (created in Step 4)

Step 3 (Secrets)

Create secrets/<name>-secrets.yaml (e.g., secrets/my-local-secrets.yaml) from the template in the control plane guide. The file name must match your environment name. Fill in your base64-encoded NGC credentials for the NGC org you’ll be deploying function images from:

$echo -n '$oauthtoken:YOUR_NGC_API_KEY' | base64

Step 4 (Pull secrets)

Run the helper script to create the nvcr-pull-secret Kubernetes secret in all NVCF namespaces:

$export NGC_API_KEY="<your-ngc-api-key>"
$bash samples/scripts/create-nvcr-pull-secrets.sh

The environment file template from Step 2 already references this secret via imagePullSecrets.
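
To verify the secret exists in each namespace after the script runs:

$kubectl get secrets -A --field-selector metadata.name=nvcr-pull-secret
$# Expected: one nvcr-pull-secret entry per NVCF namespace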

Step 5 (Deploy)

Authenticate helm and deploy using your environment name:

$helm registry login nvcr.io -u '$oauthtoken' -p "$NGC_API_KEY"
$HELMFILE_ENV=<name> helmfile sync

Replace <name> with the name you chose for your environment file (e.g., my-local).
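
Because the helm-diff plugin is installed, you can preview what helmfile will change before applying it:

$HELMFILE_ENV=<name> helmfile diff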

Step 6 (Verify)

Check that all pods are running:

$kubectl get pods -A -o wide
$# All pods should be Running or Completed
$
$helm list -A
$# All releases should show STATUS: deployed

Verify the NVCA agent discovered the fake GPUs:

$kubectl get nvcfbackends -n nvca-operator
$# Expected: nvcf-default healthy
$
$kubectl get nvcfbackends -n nvca-operator -o jsonpath='{.items[0].status.gpuUsage}' | python3 -m json.tool
$# Expected: {"H100": {"available": 16, "capacity": 16}}

Verify API connectivity using the .localhost routing (not the Gateway address, which is cluster-internal on k3d):

$# Generate an admin token
$export NVCF_TOKEN=$(curl -s -X POST "http://api-keys.localhost:8080/v1/admin/keys" \
> | python3 -c "import sys,json; print(json.load(sys.stdin)['value'])")
$
$echo "Token: ${NVCF_TOKEN:0:20}..."
$
$# List functions (should return empty)
$curl -s "http://api.localhost:8080/v2/nvcf/functions" \
> -H "Authorization: Bearer ${NVCF_TOKEN}" | python3 -m json.tool
$# Expected: {"functions": []}

The standard control plane verification commands use the Gateway address from kubectl get gateway. On k3d this returns a cluster-internal IP that is not reachable from the host. Use localhost:8080 with .localhost hostnames instead, as shown above.

Accessing Routes Locally

NVCF routes use the .localhost top-level domain, which resolves to 127.0.0.1 automatically on most systems. Access services via the k3d load balancer on port 8080:

  • http://api.localhost:8080 — NVCF API
  • http://api-keys.localhost:8080 — API Keys service
  • http://sis.localhost:8080 — SIS service used during cluster registration
  • http://invocation.localhost:8080 — Function invocation

If .localhost does not resolve automatically, add entries to /etc/hosts:

127.0.0.1 api.localhost
127.0.0.1 api-keys.localhost
127.0.0.1 sis.localhost
127.0.0.1 invocation.localhost

Wildcard subdomains (e.g., <function-id>.invocation.localhost) cannot be added to /etc/hosts. For local testing with dynamic function IDs, add specific entries or use a local DNS resolver such as dnsmasq.
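
A minimal dnsmasq sketch for wildcard resolution (the drop-in path /etc/dnsmasq.d/ and the systemctl restart command are distribution-dependent assumptions):

$# Resolve every *.invocation.localhost name to the loopback address
$echo 'address=/invocation.localhost/127.0.0.1' | sudo tee /etc/dnsmasq.d/nvcf-local.conf
$sudo systemctl restart dnsmasq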

Teardown

$# Remove the NVCF stack (use your environment name)
$HELMFILE_ENV=<name> helmfile destroy
$
$# Delete the k3d cluster
$k3d cluster delete ncp-local

Limitations

  • Fake GPUs - Function containers will be scheduled and deployed but cannot execute actual GPU workloads.
  • Single Cassandra replica - No high availability. Data may be lost on pod restart.
  • Ephemeral storage - local-path volumes are deleted when the cluster is destroyed.
  • Not suitable for performance testing - Resource constraints of a laptop do not represent production environments.