Read this guide if you run into the error “Exception: Java gateway process exited before sending the driver its port number” in PySpark. Despite its intimidating look, this problem is usually easy to solve.
Solutions For “Exception: Java gateway process exited before sending the driver its port number” Error
This error occurs because your Spark application couldn’t find a valid Java installation on your system.
Although PySpark is a Python module, it is only a high-level application programming interface (API) for interacting with Spark under the hood. PySpark still relies on a full Spark runtime, which runs on the Java Virtual Machine and therefore needs Java to work.
Installing the Java Development Kit (JDK) on your system can solve the problem most of the time.
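Before installing anything, it can help to check whether a Java launcher is visible from Python at all. The snippet below is a minimal sketch, not PySpark's own logic: `find_java` is a hypothetical helper name, and it assumes the lookup order used by Spark's launch scripts (`JAVA_HOME` first, then `java` on `PATH`).

```python
import os
import shutil
from typing import Mapping, Optional


def find_java(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the path of the java launcher Spark would likely use, or None.

    Hypothetical helper for illustration; it is not part of PySpark.
    """
    env = os.environ if env is None else env
    exe = "java.exe" if os.name == "nt" else "java"

    # Assumption: when JAVA_HOME is set, Spark uses JAVA_HOME/bin/java
    # and does not fall back to PATH if that file is missing.
    java_home = env.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", exe)
        return candidate if os.path.isfile(candidate) else None

    # Otherwise, look for "java" on PATH.
    return shutil.which("java", path=env.get("PATH", ""))


if __name__ == "__main__":
    location = find_java()
    print(location or "No Java launcher found; install a JDK first.")
```

If the helper returns nothing even though Java is installed, a `JAVA_HOME` variable that points to a wrong or outdated directory is a likely culprit.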
The JDK has several implementations, including popular open-source builds such as OpenJDK. You can also get an official installer from Oracle, the developer of Java. A JDK is needed to run or develop applications written in this programming language.
Go to Oracle’s Java download page and grab an installer for your system. Both Java 17 and 18 are currently supported.
Version 17 is Java’s long-term support (LTS) release at the moment, meaning it receives updates from Oracle for longer than regular versions (until September 2024). Java 18 is the newer version, but it will be superseded by its successor, Java 19, in September 2022.
Choose Java 17 if you want a more stable system, while Java 18 keeps you up to date with the latest features of this programming language. Before picking one, check which Java versions the documentation of your Spark release lists as supported.
Windows
Pick one of the three installer formats: ZIP, EXE, or MSI. The EXE or MSI file is recommended if you want an easy installation process.
Wait for the download to finish. You can verify the integrity of this file by comparing its size and SHA256 hash to information on the download website.
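The hash comparison can be done with Python's standard `hashlib` module. This is a minimal sketch: the function names are hypothetical, and the file name and published hash in the usage comment are placeholders that you must replace with your own values.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so that large installers do not fill up memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published: str) -> bool:
    """Compare a file's digest with the hash shown on the download page."""
    return sha256_of(path) == published.strip().lower()


# Usage (placeholders, not real values):
# matches_published_hash("<installer_file>", "<sha256 from the download page>")
```

A result of `False` means the download is incomplete or corrupted, and the file should be downloaded again.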
Make sure you have a 64-bit Windows system, and your account has administrative privileges.
Double-click the installer in your download folder. Follow the on-screen instructions; there is no need to change any default settings.
By default, the installer places the JDK in a version-specific folder under “C:\Program Files\Java” and copies launchers such as java.exe to “C:\Program Files\Common Files\Oracle\Java\javapath”. Open File Explorer to verify this.
Windows users also have the option of installing Oracle JDK silently by using the command:
Replace <installer_file> with the actual name of the installer you have just downloaded.
macOS
As with Windows, you need administrative privileges to install Oracle JDK on your Mac.
Oracle JDK is currently compatible with both Intel and Arm versions of macOS. However, there is no single-user installation option on this operating system; only a system-wide installation is available.
Choose the correct installer for your system (the recommended file format is DMG). If you have an Intel-based system, pick x64. For newer Arm versions, choose Arm.
Wait for the download to complete and double-click the DMG file to begin the installation. Follow the on-screen instructions and enter your password when required.
The installation should complete shortly, after which you can delete the DMG installer file to save space.
Linux
Most Linux distributions ship OpenJDK as the official implementation of Java in their software repositories. Installing from the repositories is the recommended method, because the JDK is then updated together with the rest of your Linux system.
Use these commands to install OpenJDK, depending on your distribution.
Ubuntu, Debian, etc.
sudo apt-get install openjdk-8-jre
Fedora
List available OpenJDK versions:
dnf search openjdk
Install OpenJDK (use the package name of the version you want to install):
sudo dnf install <openjdk-package-name>
Arch Linux
sudo pacman -S jdk-openjdk
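After installing a JDK with any of the methods above, you can confirm from Python that the `java` launcher actually runs before starting PySpark again. This is a minimal sketch: `java_version_banner` is a hypothetical helper, and it only checks the `java` found on `PATH`.

```python
import shutil
import subprocess
from typing import Optional


def java_version_banner() -> Optional[str]:
    """Return the first line printed by `java -version`, or None on failure."""
    java = shutil.which("java")
    if java is None:
        return None  # no java launcher on PATH

    result = subprocess.run(
        [java, "-version"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None  # a launcher exists but could not start a Java runtime

    # The java launcher writes its version banner to stderr, not stdout.
    banner = (result.stderr or result.stdout).strip()
    return banner.splitlines()[0] if banner else None


if __name__ == "__main__":
    print(java_version_banner() or "Java is not available on PATH.")
```

If a version line is printed, open a new terminal or restart your Python session so that PySpark picks up the updated environment.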
The error “Exception: Java gateway process exited before sending the driver its port number” in PySpark usually occurs when your system lacks a JDK installation. A valid installation of a recent version will get rid of this message.