To use Spark 2 with sparklyr, set the environment variable SPARK_HOME before connecting, like this:
> Sys.setenv(SPARK_HOME='/usr/hdp/current/spark2-client')
> library(sparklyr)
> sc <- spark_connect(master='yarn-client')
> spark_version(sc)
[1] ‘2.1.1.2.6.1.0’
Note! If this does not work and the spark_connect command hangs for more than 10 seconds, your Kerberos ticket has probably expired and needs to be renewed with the kinit command:
- Interrupt the running process in RStudio by clicking the red STOP icon.
- Open an SSH shell, or choose Tools > Shell in RStudio, and run:
kinit $USER
- Now you should be able to connect to Spark in RStudio.
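
The ticket check in the steps above can also be scripted. The following is a minimal shell sketch, assuming the MIT Kerberos client tools, where klist -s prints nothing and exits with status 0 only when the credentials cache holds an unexpired ticket. The function name ensure_kerberos_ticket is hypothetical and only used for illustration.

```shell
# Hypothetical helper: renew the Kerberos ticket only when needed.
# Assumes MIT Kerberos, where "klist -s" is silent and returns 0
# only if the credentials cache is readable and not expired.
ensure_kerberos_ticket() {
    if klist -s; then
        echo "Kerberos ticket is valid"
    else
        echo "No valid Kerberos ticket, running kinit for $USER"
        # kinit prompts for the password interactively
        kinit "$USER"
    fi
}
```

Calling ensure_kerberos_ticket in the shell before connecting from RStudio avoids waiting for the connection to hang; kinit still asks for your password when a renewal is needed.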