Setting up WSL2 for data science projects in Windows


Ubuntu is one of the most popular operating systems among programmers, data engineers, and data scientists. Windows users can run Ubuntu and other Linux distributions through WSL2, the second version of the Windows Subsystem for Linux. Installing Ubuntu in WSL2 takes only a few minutes, after which we can start working on data science projects. This post walks through the steps required to set up WSL2 for data science projects in Windows, from installing conda environments and libraries to creating a project folder.


Steps to set up WSL2 for data science projects

Update Windows

Before installing WSL, update Windows to the latest version. On Windows 10, go to Settings > Update & Security > Windows Update; on Windows 11, go to Settings > Windows Update.

Install WSL2 on Windows

We install WSL2 and Ubuntu by following the official installation guides provided by Microsoft.

TLDR: If you don’t want to follow those guides, just run this command in PowerShell or Windows Command Prompt (opened as Administrator). On recent versions of Windows 10 and 11, it enables WSL2 and installs Ubuntu as the default distribution; a restart may be required.

wsl --install

After installing WSL and a Linux distribution, set up a username and password for the Linux distribution you installed.
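To confirm that the distribution is running under WSL2 rather than WSL1, one option is to inspect the kernel release string from inside the Linux terminal. The sketch below is a rough heuristic, not an official check: the function name wsl_kind is hypothetical, and it assumes that WSL2 kernels report a release containing both "microsoft" and "WSL2" (for example 5.15.90.1-microsoft-standard-WSL2), which is how they have commonly appeared.

```shell
# Classify a kernel release string such as the one printed by `uname -r`.
# Assumption: WSL2 kernels contain "microsoft" followed by "WSL2";
# WSL1 kernels contain "Microsoft" without the "WSL2" suffix.
wsl_kind() {
  case "$1" in
    *[Mm]icrosoft*WSL2*) echo "WSL2" ;;
    *[Mm]icrosoft*)      echo "WSL1 (or unrecognized WSL kernel)" ;;
    *)                   echo "not WSL" ;;
  esac
}

# Report on the kernel of the current shell.
wsl_kind "$(uname -r)"
```

On the Windows side, running wsl --list --verbose in PowerShell shows the WSL version of each installed distribution.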

Install or update GPU drivers

If your computer system doesn’t have a GPU, this step is not required.

Next up, install the latest GPU drivers on Windows. We don’t need to install the GPU drivers on Ubuntu because it will use the GPU drivers installed on Windows.

  • For NVIDIA GPUs, use GeForce Experience to update the GPU drivers or download them from the NVIDIA website
  • For AMD GPUs, use AMD Radeon Software to update the drivers or download them from the AMD website
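Once the Windows driver is installed, it can be useful to check from the Linux terminal whether the GPU is visible. The following is a minimal sketch for NVIDIA GPUs only; it assumes the Windows driver exposes the nvidia-smi tool inside WSL2, and the function name check_gpu is hypothetical. It does not cover AMD GPUs.

```shell
# Print GPU details if nvidia-smi is available, otherwise a short explanation.
check_gpu() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    # nvidia-smi exists; it can still fail if the driver is not reachable.
    nvidia-smi || echo "nvidia-smi is installed but could not reach the GPU"
  else
    echo "nvidia-smi not found; no NVIDIA GPU is visible from this shell"
  fi
}

check_gpu
```

If the tool is missing, updating the Windows driver and restarting WSL with wsl --shutdown from PowerShell is a reasonable first thing to try.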

Install and set up Data Science tools and libraries

Open the Linux terminal by searching for Ubuntu (or whichever Linux distribution you installed) in the Windows search bar.

Install Conda

Conda provides an easy way to install Python packages and modules in isolated environments. We install Miniconda, a minimal distribution of Conda.

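Download the Miniconda installer by running the following commands. Refreshing the package index first avoids errors on a freshly installed distribution.

sudo apt update
sudo apt install -y wget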
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
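Before running a downloaded installer, it is good practice to compare its SHA-256 checksum with the value listed on the Miniconda download page. The helper below is a sketch of that comparison; the function name verify_sha256 is hypothetical, and the expected hash must be copied from the download page because it changes with every release.

```shell
# Compare a file's SHA-256 digest with an expected value.
# $1: path to the file, $2: expected digest as a hex string
verify_sha256() {
  actual="$(sha256sum "$1" | cut -d ' ' -f 1)"
  if [ "$actual" = "$2" ]; then
    echo "OK: checksum matches"
  else
    echo "MISMATCH: got $actual" >&2
    return 1
  fi
}

# Usage (replace <expected-hash> with the value from the download page):
#   verify_sha256 Miniconda3-latest-Linux-x86_64.sh <expected-hash>
```

A mismatch usually means the download was interrupted or the hash belongs to a different installer version, so downloading the file again is the first step.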

Install Miniconda

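bash Miniconda3-latest-Linux-x86_64.sh

Follow the prompts and allow the installer to initialize conda when asked. After it finishes, close and reopen the terminal (or run source ~/.bashrc) so that the conda command becomes available.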

Create conda environment

The following command creates a new conda environment with the given name. You can name your environment anything you like.

conda create --name data_science

Activate the conda environment

We need to activate the conda environment before using it.

conda activate data_science

Installing data science packages

Run the following command to install the Python packages you want in the active conda environment. Here we install NumPy and pandas as an example.

conda install numpy pandas
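The create, activate, and install steps above can also be captured in a single environment file, which makes the setup easier to reproduce on another machine. The sketch below writes such a file; the function name write_env_file, the pinned Python version, and the choice of the conda-forge channel are assumptions for illustration, not part of the original setup.

```shell
# Write a conda environment file.
# $1: output path, $2: environment name
write_env_file() {
  cat > "$1" <<EOF
name: $2
channels:
  - conda-forge
dependencies:
  - python=3.11   # assumed version; adjust as needed
  - numpy
  - pandas
  - jupyterlab
EOF
}

# Usage (the second step needs conda on the PATH):
#   write_env_file environment.yml data_science
#   conda env create -f environment.yml
```

Keeping this file in the project folder records which packages the project depends on.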

Using JupyterLab (Optional)

If you want to work in an interactive Python environment, JupyterLab and Jupyter Notebook let you write and test code on the fly. JupyterLab is especially popular among data scientists.

Install JupyterLab

Run the following command to install JupyterLab from the conda-forge channel.

conda install -c conda-forge jupyterlab

Create a project folder

mkdir -p projects/data_science
cd projects/data_science
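If you prefer to keep data, notebooks, and source code apart, the project folder can be created with a few subfolders in one step. This is only a sketch: the function name make_project and the subfolder names data, notebooks, and src are hypothetical conventions, so rename them to suit your project.

```shell
# Create a project folder with a simple layout.
# $1: project root; missing parent folders are created as well
make_project() {
  mkdir -p "$1/data" "$1/notebooks" "$1/src"
  echo "Created project layout under $1"
}

# Usage:
#   make_project "$HOME/projects/data_science"
```

Because mkdir -p ignores folders that already exist, running the function again on the same path is harmless.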

Running JupyterLab

Start JupyterLab with this command.

jupyter lab

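This should open a browser window with the Jupyter environment. If no window opens, which can happen under WSL2, copy the URL printed in the terminal (it includes an access token) into your Windows browser, or go to http://localhost:8888/ and enter the token when prompted.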

From there, we can create a new Jupyter Notebook under “Notebook > Python 3” and start working on data science projects.


Additional conda commands

Deactivating conda environment

conda deactivate

Removing conda environment

conda remove --name data_science --all

This will remove every package installed in the data_science environment and the environment itself.

Cleaning up conda

We can clean up conda and free up disk space by removing unused packages and caches.

conda clean --all -y

Conclusion

In this article, we installed WSL2 and Ubuntu and then set up a new environment for data science projects. For this, we installed Miniconda, created an environment, and installed JupyterLab. We also saw several commands for managing conda environments.

If you encounter any issues or if you have any questions or suggestions, feel free to post them in the comment section below. You can also read more articles on data science and machine learning.


admin

Tech and programming enthusiast
