Version:
All
Issue:
When attempting to diagnose DMM issues, having a clear picture of the status of the Spark master and workers can be useful. The Spark UI provides this visibility, and it can be set up and viewed using the instructions below.
How to connect to Spark UI:
Spark is part of the DMM Compute ecosystem and runs on a master-worker architecture. The Spark master always runs as a pod in the Domino platform namespace, while a Spark worker is active only while a job (training, prediction, or ground truth) is running.
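To get a quick picture of which Spark pods are currently up, you can first list them in both namespaces (a minimal sketch; it assumes the spark3-* pod naming used throughout this article):
kubectl get pods -n <platform-namespace> | grep spark3
kubectl get pods -n <compute-namespace> | grep spark3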
How to connect with Spark master:
- Use the following command:
  kubectl port-forward -n <platform-namespace> spark3-master-0 8080:8080
  Alternatively, run k9s from the cluster and port-forward the spark-master pod to a local port.
- Open the port-forwarded local port to view the Spark master UI, i.e. localhost:8080.
Note that Workers, Running Applications, and Completed Applications are visible in this UI. If no workers or applications are present, stop the port-forward and port-forward the other Spark master pod, spark3-master-1 (an example follows below).
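For example, to reach the second master pod (the pod name spark3-master-1 is an assumption based on the spark3-master-0 naming above):
kubectl port-forward -n <platform-namespace> spark3-master-1 8080:8080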
How to connect with Spark worker:
- Use the kubectl command:
  kubectl port-forward -n <compute-namespace> spark3-worker-0 8081:8081
  Alternatively, port-forward the Spark worker to a specified local port and browse the same URL.
- Open localhost:8081 to view the worker UI.
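Because worker pods exist only while a job is running, you may first want to watch for one to appear as a job starts (a sketch using stock kubectl and grep flags; it assumes the spark3-worker-* naming above, and --line-buffered keeps grep from buffering the streamed output):
kubectl get pods -n <compute-namespace> -w | grep --line-buffered spark3-worker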
How to connect with Spark driver:
DMM Compute uses the Spark connector to connect with Spark, so Spark driver logs are extremely important when troubleshooting. To see the Spark driver page:
- Find the compute pod with the command:
  kubectl get pod -n <platform-namespace> -l app.kubernetes.io/name=compute
  Alternatively, in K9s, select the pod named compute-xxx-xxx (compute-66c4d46887-xj2rh in this case).
- Port-forward port 4040 of this pod to a local port with the kubectl command:
  kubectl port-forward -n <platform-namespace> <pod-name-from-previous-command> 4040:4040
  In K9s, press enter on this pod and then press shift+f to port-forward it.
- Open the local port to view the Spark driver page, which contains the jobs timeline and job details. The job description points to the actual code pointer in DMM Compute. (A combined example follows this list.)
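The two steps above can be combined into a short shell snippet (a sketch; it assumes exactly one matching compute pod and uses only standard kubectl flags):
# Capture the compute pod name, then forward the driver UI to localhost:4040
COMPUTE_POD=$(kubectl get pod -n <platform-namespace> -l app.kubernetes.io/name=compute -o jsonpath='{.items[0].metadata.name}')
kubectl port-forward -n <platform-namespace> "$COMPUTE_POD" 4040:4040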
How to see Spark Logs:
Use the following kubectl commands to fetch the logs from the different Spark pods:
kubectl logs -n <platform-namespace> spark3-master-0
kubectl logs -n <compute-namespace> spark3-worker-0
kubectl logs -n <platform-namespace> -l app.kubernetes.io/name=compute
Alternatively, in k9s, press l on a specific pod to see that pod's logs. While Spark configuration-related information can be found in the spark-master and spark-worker logs, DMM Spark usage troubleshooting information resides in the compute-xx-xx pod (xx depends on each compute pod), so refer to the compute logs for that purpose.
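To follow the compute logs live while reproducing an issue, standard kubectl options can be added (a sketch; -f streams new log lines and --tail limits the initial backlog):
kubectl logs -f --tail=100 -n <platform-namespace> -l app.kubernetes.io/name=compute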