Version/Environment (if relevant):
This applies to all versions of Domino.
Issue:
In a Domino Spark cluster, you're using Kerberos to authenticate to SQL Server from PySpark. When you add Kerberos authentication to the Domino Spark cluster, you get this Kerberos authentication error:
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
WARN TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1) (12.34.567.89 executor 0): com.microsoft.sqlserver.jdbc.SQLServerException: Integrated authentication failed. ClientConnectionId:1234d56a-78c9-1234-beda-eedc1234b567
You've set the Spark configuration both in code and on the project's Spark integration settings page. You can successfully run kinit in the Spark driver (workspace) and see the ticket via klist, so you're able to validate the ticket cache in the workspace logs:
I have no name!@spark-123456cf7c89123b456fa789-spark-worker-2:/tmp$ klist
Ticket cache: FILE:/tmp/krb5cc_12345
Default principal: sys-data@AD.COMPANY.COM
Valid starting Expires Service principal
03/29/23 15:28:44 03/29/23 23:28:44 krbtgt/AD.COMPANY.COM@AD.COMPANY.COM
renew until 04/03/23 15:28:44
However, there is no ticket cache visible on the executor (Spark worker) pods. When you run klist in an executor (and/or the workspace), it always looks in the default location, e.g. /tmp/krb5cc_12345, and you run into the klist: No credentials cache found error:
I have no name!@spark-1234567f1c23456b789faed1-spark-worker-0:/opt/bitnami/spark$ klist
klist: No credentials cache found (filename: /tmp/krb5cc_12345)
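For context, the kind of PySpark read that triggers this error looks roughly like the following sketch. The JDBC URL, database, and table names are illustrative placeholders rather than values from the original setup; the integratedSecurity and authenticationScheme options are the parts relevant to Kerberos:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kerberos-sqlserver-example").getOrCreate()

# Placeholder connection string; integratedSecurity/authenticationScheme tell the
# Microsoft JDBC driver to authenticate with Kerberos (GSSAPI) instead of SQL logins.
jdbc_url = (
    "jdbc:sqlserver://sqlserver.ad.company.com:1433;"
    "databaseName=mydb;"
    "integratedSecurity=true;"
    "authenticationScheme=JavaKerberos"
)

# The executors open their own JDBC connections for this read, so they need a
# valid Kerberos ticket cache too, not just the driver.
df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.some_table")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)
df.show()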
Root Cause:
- Setting environment variables via spark.executorEnv.ENVNAME will not work; you need to set them with ENV in the cluster environment's Dockerfile.
- When setting multiple extraJavaOptions parameters, using one spark_conf.set() call per option only keeps the last parameter, because each call overwrites the previous one.
Resolution:
Dockerfile Instructions
Set environment variables within the executor cluster environment's Dockerfile using:
ENV KRB5CCNAME=FILE:/mnt/data/workspace_name/krb5cc_12345
ENV KRB5_CONFIG=/mnt/data/workspace_name/krb5.conf
- This changes the default ccache location to a shared directory (accessible by both the driver and the executors) so that the executors can read the ccache previously generated by the driver during its Pre Run script.
- Trying to change it via the spark.executorEnv.KRB5CCNAME option, or anywhere other than the Dockerfile, will not work. If you then validate by logging into an executor (or the workspace) and running klist, it still looks in the default location /tmp/krb5cc_12345.
- Use ENV KRB5_CONFIG=/mnt/data/workspace_name/krb5.conf rather than ARG KRB5_CONFIG=/mnt/data/workspace_name/krb5.conf; ARG values only exist while the image is being built, whereas ENV values persist in the running containers.
Pre Run Script
- You need to run kinit in the Spark driver environment's Pre Run script to obtain the ticket and write the cached ticket to the shared dataset directory, so that the executors can rely on it for authentication.
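A minimal sketch of what such a Pre Run script might contain is below. The keytab path is an illustrative placeholder (the principal matches the klist output above); if KRB5CCNAME and KRB5_CONFIG are not already set via ENV in the driver's environment, export them here so kinit writes the cache to the shared dataset path:

# Pre Run script sketch: runs in the driver/workspace before user code starts.
export KRB5CCNAME=FILE:/mnt/data/workspace_name/krb5cc_12345
export KRB5_CONFIG=/mnt/data/workspace_name/krb5.conf
# The keytab path below is a placeholder for wherever you store the uploaded keytab.
kinit -kt /mnt/data/workspace_name/sys-data.keytab sys-data@AD.COMPANY.COM
klist   # confirm the ticket was written to the shared ccache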
extraJavaOptions parameters
Under Project Settings > Integrations > Apache Spark mode > Spark Configuration Options, when setting multiple extraJavaOptions, having one spark_conf.set() call per parameter will not work:
spark_conf.set("spark.executor.extraJavaOptions","-Djavax.security.auth.useSubjectCredsOnly=false")
spark_conf.set("spark.executor.extraJavaOptions","-Djava.security.auth.login.config=/mnt/data/workspace_name/SQLJDBCDriver2.conf")
spark_conf.set("spark.executor.extraJavaOptions","-Djava.security.krb5.conf=/mnt/data/workspace_name/krb5.conf")
spark_conf.set("spark.executor.extraJavaOptions","-Dsun.security.krb5.debug=true")
This is because at runtime only the last option (-Dsun.security.krb5.debug=true in the example above) appears in the Spark Web UI under Environment > Spark Properties, which shows that each call overwrote the previous one: the first three options are lost.
You need to change the runtime code to concatenate the options into a single spark.executor.extraJavaOptions (and, where needed, spark.driver.extraJavaOptions) value, with each additional option separated by a space:
spark_conf.set("spark.executor.extraJavaOptions","-Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/mnt/data/workspace_name/SQLJDBCDriver2.conf -Djava.security.krb5.conf=/mnt/data/workspace_name/krb5.conf -Dsun.security.krb5.debug=true")
Notes/Information:
Additional resources:
- Apache Spark 3.0: By setting spark.kerberos.renewal.credentials to ccache in Spark's configuration, the local Kerberos ticket cache will be used for authentication. Spark will keep the ticket renewed during its renewable life, but after it expires a new ticket needs to be acquired (e.g. by running kinit). It's up to the user to maintain an updated ticket cache that Spark can use. The location of the ticket cache can be customized by setting the KRB5CCNAME environment variable. (See the configuration sketch after this list.)
- Spark on Domino > Configure Prerequisites: You must configure the PySpark compute environments for workspaces and/or jobs that will connect to your cluster.
- Domino supports Kerberos authentication, including keytab file based authentication, allowing users to authenticate as themselves when connecting to Kerberos-secured systems. Users can enable Kerberos authentication at the project level or user level by uploading a Kerberos keytab and principal into Domino. Once set up, runs started by Kerberos-enabled users or in Kerberos-enabled projects will automatically run kinit and retrieve the ticket needed to authenticate.
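If you rely on the ccache renewal behavior described in the first bullet above, the setting is a single configuration entry; a minimal sketch, added to the same spark_conf object used in the resolution example:

# Use the local ticket cache (located via KRB5CCNAME) rather than a keytab.
spark_conf.set("spark.kerberos.renewal.credentials", "ccache")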