See related repository
First, you need to install and launch Rancher Desktop, an open-source container manager, on your local machine.
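You have two options when setting up Jupyter via the data-science project. Clone the repository, then run exactly one of the two setup commands below (make setup-jupyter-local or make setup-jupyter-local-no-mamba), followed by make jupyter-local to launch JupyterLab: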
git clone git@gitlab.com:gitlab-data/data-science.git
cd data-science
make setup-jupyter-local
make setup-jupyter-local-no-mamba
make jupyter-local
Included in the environment setup are all of the libraries needed to lint Jupyter notebooks in the repository. When you launch JupyterLab and open a notebook you should see a new "Format Notebook" icon in the task bar of your notebook. Clicking that button will lint your entire notebook.
Alternatively, after completing the above setup instructions run:
make lint
When run from the root of the data-science repo, this finds and corrects any formatting issues according to the Black format.
Connecting to Snowflake from your local machine requires a /Users/{your_user_name}/.dbt/profiles.yml file, which does not include your password. profiles.yml must be placed in the .dbt directory inside your "home" directory; otherwise, you will not be able to connect to Snowflake from your local machine. You can use the example provided here as a reference.
By default, the local install will use the data-science folder as the root directory for Jupyter. This is not very useful when your code, data, and notebooks are in other repositories on your computer. To change this, you will need to create and modify a Jupyter notebook config file:
Open a terminal and change to the data-science directory:
cd repos/data-science
The config file must be created with the pipenv we set up in the above steps:
pipenv run jupyter-lab --generate-config
This creates the file /Users/{your_user_name}/.jupyter/jupyter_lab_config.py.
Open that file and find the following line:
#c.ServerApp.root_dir = ''
Replace it with c.ServerApp.root_dir = '/the/path/to/other/folder/'. If unsure, set the value to your repo directory (e.g. c.ServerApp.root_dir = '/Users/{your_user_name}/repos'). Make sure you remove the # at the beginning of the line.
\{your_user_name}\Any Folder\More Folders\
Finally, run:
make jupyter-local
from the data-science directory. Your root directory should now be changed to what you specified above.
The data science team has created modeling templates that allow you to easily start building predictive models without writing Python code from scratch. To enable these templates:
In the jupyter_lab_config.py file that you created as part of the Mounting a local directory steps, add the following lines, replacing /Users/{your_user_name}/repos/data-science/templates with the path to the data-science/templates folder on your local machine:
c.JupyterLabTemplates.template_dirs = ['/Users/{your_user_name}/repos/data-science/templates']
c.JupyterLabTemplates.include_default = False
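
The config edits described above (uncommenting and setting c.ServerApp.root_dir, then adding the two c.JupyterLabTemplates lines) can also be scripted. The following is a minimal sketch, not something the data-science repo provides: set_option is a hypothetical helper, and the sample text stands in for the contents of your generated jupyter_lab_config.py. It operates on a string only, so nothing on disk is changed until you choose to write the result back yourself.

```python
import re


def set_option(config_text: str, name: str, value) -> str:
    """Return config_text with `c.<name>` set to value (hypothetical helper).

    An existing line for the option, commented out or not, is replaced in
    place; if no such line exists, the assignment is appended at the end.
    """
    new_line = f"c.{name} = {value!r}"
    pattern = re.compile(
        rf"^#?[ \t]*c\.{re.escape(name)}[ \t]*=.*$", flags=re.MULTILINE
    )
    if pattern.search(config_text):
        # A function replacement keeps any backslashes in new_line literal.
        return pattern.sub(lambda _match: new_line, config_text, count=1)
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + new_line + "\n"


if __name__ == "__main__":
    # Stand-in for the generated jupyter_lab_config.py contents (assumption).
    sample = "# Configuration file for lab.\n#c.ServerApp.root_dir = ''\n"
    templates = "/Users/{your_user_name}/repos/data-science/templates"

    updated = set_option(sample, "ServerApp.root_dir", "/Users/{your_user_name}/repos")
    updated = set_option(updated, "JupyterLabTemplates.template_dirs", [templates])
    updated = set_option(updated, "JupyterLabTemplates.include_default", False)
    print(updated)
```

Replacing the commented line in place, rather than always appending, keeps the file close to what jupyter-lab --generate-config produced, which makes later manual edits easier to follow.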
By default, Rancher Desktop will allocate a small percentage of your machine's memory to run containers. This is likely not enough RAM to work with Jupyter and Python, as data is held in memory. It is recommended that you increase the memory allocation to avoid out-of-memory errors.
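
To see how much memory the Jupyter environment actually has available, you can check from a notebook cell. This is only a rough sketch under stated assumptions: it relies on os.sysconf, which is available on Linux and macOS but not on Windows, it reports the total physical memory visible to the operating system rather than any per-container limit, and the 8 GiB threshold is an arbitrary illustrative value, not a figure from the data-science project.

```python
import os

# Hypothetical threshold for illustration only; pick a value that suits your data.
MIN_RECOMMENDED_GIB = 8.0


def total_memory_gib() -> float:
    """Total physical memory visible to this environment, in GiB."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 1024**3


def memory_report(min_gib: float = MIN_RECOMMENDED_GIB) -> str:
    """Describe the visible memory and whether it falls below min_gib."""
    total = total_memory_gib()
    status = "below" if total < min_gib else "at or above"
    return f"{total:.1f} GiB visible, {status} the {min_gib:.1f} GiB threshold"


if __name__ == "__main__":
    print(memory_report())
```

If the reported value is much smaller than your machine's RAM, that suggests the container environment is still running with the default allocation.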
/Users/{your_user_name}/.jupyter/lab/user-settings/@jupyterlab/notebook-extension/tracker.jupyterlab-settings
{
    "codeCellConfig": {
        "codeFolding": true,
        "lineNumbers": true
    },
    "recordTiming": true
}
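
If you would rather create this settings file from a script than by hand, the sketch below writes the same three options as strict JSON. It is an illustration rather than part of the data-science setup: write_notebook_settings is a hypothetical helper, it overwrites any existing tracker.jupyterlab-settings file under the directory you pass in, and the demo writes into a temporary directory instead of your real home directory.

```python
import json
import tempfile
from pathlib import Path

# Location of the notebook settings file relative to the home directory.
SETTINGS_RELATIVE_PATH = Path(
    ".jupyter/lab/user-settings/@jupyterlab/notebook-extension/tracker.jupyterlab-settings"
)

# The same options shown in the settings above.
NOTEBOOK_SETTINGS = {
    "codeCellConfig": {"codeFolding": True, "lineNumbers": True},
    "recordTiming": True,
}


def write_notebook_settings(home: Path) -> Path:
    """Write NOTEBOOK_SETTINGS under home, replacing any existing file."""
    target = home / SETTINGS_RELATIVE_PATH
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(NOTEBOOK_SETTINGS, indent=4) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    # Demo against a throwaway directory; pass Path.home() to target the real location.
    with tempfile.TemporaryDirectory() as tmp:
        written = write_notebook_settings(Path(tmp))
        print(written.read_text(encoding="utf-8"))
```

Restart JupyterLab after changing the file so that the new settings are picked up.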