Chapter 4: Troubleshooting
Common commands for monitoring and troubleshooting the BinderHub deployment.
Kubernetes Dashboard
The Kubernetes dashboard provides a web UI to inspect pods, logs, and cluster resources.
1. Get the dashboard address:
microk8s kubectl get svc -n kube-system kubernetes-dashboard
The dashboard runs as a ClusterIP service. To access it from your browser, use kubectl port-forward:
microk8s kubectl port-forward -n kube-system svc/kubernetes-dashboard 10443:443
Then open https://localhost:10443 in your browser. If the dashboard is exposed via a reverse proxy or Cloudflare Tunnel, use that URL instead.
2. Generate a login token (valid for 2000 hours):
kubectl create token default --duration=2000h
Paste the token into the dashboard login page to authenticate.
Command lines
Watch pod status:
# All namespaces
watch "microk8s.kubectl get pods -A"
# Binder namespace only
watch "microk8s.kubectl get pods -n binder"
Pod logs and details:
kubectl logs -n binder <pod-name> -f
kubectl describe pod -n binder <pod-name>
Common Operations
Manage resource quota — by default there are no resource limits on the binder namespace. resource-quota.yaml can be applied to cap the total number of concurrent pods, CPU, or memory across all user sessions. If the current load exceeds the configured limit, new user pods will not be scheduled until resources are freed.
Edit resource-quota.yaml to set the limits you need, then apply:
kubectl apply -f ./resource-quota.yaml -n binder
To check current usage against the quota:
kubectl get resourcequota -n binder
To remove the quota entirely:
kubectl delete resourcequota binderhub -n binder
Delete all user pods (prefix jupyter-):
kubectl get pods -n binder --no-headers=true | awk '/jupyter-/{print $1}' | xargs kubectl delete pod -n binder
Manage container images (MicroK8s uses its own containerd, separate from Docker):
# List all images
microk8s ctr images ls
# List images matching a pattern
microk8s.ctr images ls | grep intel4coro
# Delete images matching a pattern
microk8s.ctr images rm $(microk8s.ctr images list | grep <pattern> | awk '{print $1}')
Update BinderHub configuration — after editing binder.yaml:
microk8s.helm upgrade binder --cleanup-on-fail \
jupyterhub/binderhub --version=1.0.0-0.dev.git.3506.hba24eb2a \
--namespace=binder \
-f ./secret.yaml \
-f ./binder.yaml
Upgrade BinderHub version:
- Check the latest version at https://hub.jupyter.org/helm-chart/#development-releases-binderhub
- Run
helm repo update - Replace the
--versionvalue in the deploy command and re-run it
Troubleshooting
Firewall Issues After Reboot
Symptoms: pods cannot communicate, services unreachable.
sudo ufw disable
sudo ufw allow in on cni0
sudo ufw allow out on cni0
sudo ufw default allow routed
sudo ufw enable
Pods Stuck in Pending State
kubectl describe pod -n binder <pod-name>
kubectl get resourcequota -n binder
Common causes:
- Insufficient CPU/memory/GPU resources
- Resource quota exceeded
- Build node selector mismatch
GPU Not Available in Pods
# Verify GPU operator pods are running
kubectl get pods -n gpu-operator-resources
# Check node GPU resources
kubectl describe node <node-name> | grep nvidia
# Restart GPU device plugin if needed
kubectl -n gpu-operator-resources rollout restart daemonset nvidia-device-plugin-daemonset
Binder Pods Fail to Restart After Reboot
Ensure MicroK8s is ready first:
microk8s status --wait-ready
If pods are stuck, a Helm upgrade can recover them:
microk8s.helm upgrade binder --cleanup-on-fail \
jupyterhub/binderhub --version=1.0.0-0.dev.git.3506.hba24eb2a \
--namespace=binder \
-f ./secret.yaml \
-f ./binder.yaml