How to get GraphFrames working in Jupyter

terry.stebbens (Moderator, Domino)

To get GraphFrames working in a Jupyter notebook with Spark / PySpark, download the GraphFrames JAR and copy it to your $SPARK_HOME/jars folder. Then make a second copy and change its file extension to .zip. Finally, make sure your PYTHONPATH environment variable includes the full path to the .zip file.

Here is an example of setting this up in the Dockerfile instructions for an environment build, using GraphFrames 0.5.0 and Spark 2.1:

RUN wget --quiet https://dl.bintray.com/spark-packages/maven/graphframes/graphframes/0.5.0-spark2.1-s_2.11/graphframes-0.5.0-spark2.1-s_2.11.jar && \
    cp graphframes-0.5.0-spark2.1-s_2.11.jar /opt/spark-2.1.0-bin-hadoop2.6/jars && \
    cp graphframes-0.5.0-spark2.1-s_2.11.jar /opt/spark-2.1.0-bin-hadoop2.6/jars/graphframes-0.5.0-spark2.1-s_2.11.zip && \
    echo 'export PYTHONPATH=${PYTHONPATH:-}:${SPARK_HOME:-}/jars/graphframes-0.5.0-spark2.1-s_2.11.zip' >> /home/ubuntu/.domino-defaults
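
If the import still fails in the notebook, a quick check of PYTHONPATH can help narrow things down. The following is only a rough sketch, not part of the setup above: the helper names find_graphframes_zips and zip_has_package are made up for illustration, and it assumes the archive carries the Python package as graphframes/__init__.py at its top level.

```python
import os
import zipfile


def find_graphframes_zips(pythonpath):
    """Return PYTHONPATH entries that look like a GraphFrames .zip archive."""
    # Skip empty entries, e.g. the leading one left when PYTHONPATH was unset.
    entries = [e for e in pythonpath.split(os.pathsep) if e]
    return [
        e for e in entries
        if os.path.basename(e).startswith("graphframes") and e.endswith(".zip")
    ]


def zip_has_package(zip_path, package="graphframes"):
    """Return True if the archive holds <package>/__init__.py at its top level.

    Assumption: the Python package sits at the root of the archive.
    """
    if not zipfile.is_zipfile(zip_path):
        return False  # missing file or not a zip archive
    with zipfile.ZipFile(zip_path) as archive:
        return package + "/__init__.py" in archive.namelist()


if __name__ == "__main__":
    for path in find_graphframes_zips(os.environ.get("PYTHONPATH", "")):
        status = "ok" if zip_has_package(path) else "package not found"
        print(path, "->", status)
```

Running this in a notebook cell or terminal inside the environment should list each matching .zip entry. No output at all suggests the PYTHONPATH export was not picked up.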

