March 17, 2020




An introduction to Python and programming for wanna-be data scientists

repo name webartifex/intro-to-python
language Jupyter Notebook
size (curr.) 3967 kB
stars (curr.) 420
created 2018-09-29
license MIT License

Important: The content is being updated and amended throughout the spring semester of 2020!

An Introduction to Python and Programming

The purpose of this repository is to serve as an interactive “book” for a thorough introductory course on programming in the Python language.

The course’s main goal is to prepare the student for further studies in the “field” of data science.

The “chapters” are written in Jupyter notebooks, which are a de facto standard for exchanging code and results among data science professionals and researchers. They can be viewed in a plain web browser with the help of nbviewer.

However, it is recommended that students install Python and Jupyter locally and run the code in the notebooks on their own. This way, the student can play with the code and learn more efficiently. Precise installation instructions are either in the 00th notebook or further below.

Feedback is encouraged and will be incorporated. Open an issue in the issues tracker or initiate a pull request if you are familiar with the concept.


The course is meant to be suitable for total beginners, so there are no formal prerequisites. It is only expected that the student has:

  • a solid understanding of the English language,
  • knowledge of basic mathematics from high school,
  • the ability to think conceptually and reason logically, and
  • the willingness to invest 2-4 hours a day for a month.


To follow this course, a working installation of Python 3.7 or higher is expected.
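If a Python interpreter is already installed, a short script can confirm that it satisfies this requirement. The following is only a minimal sketch: the function name meets_minimum is made up for illustration, and str.format is used instead of f-strings so that the check itself also runs on older interpreters.

```python
import sys

# The minimum version stated above.
MINIMUM_VERSION = (3, 7)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_VERSION):
    """Return True if the major/minor version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


found = ".".join(str(part) for part in sys.version_info[:3])
if meets_minimum():
    print("Python {} is recent enough for this course.".format(found))
else:
    print("Python {} is too old; 3.7 or higher is expected.".format(found))
```

Saving this in a file and running it with the python command prints one of the two messages, depending on the installed version.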

A popular and beginner-friendly way is to install the Anaconda Distribution, which not only ships Python but also comes pre-packaged with many third-party libraries from the so-called “scientific stack”. Just go to the download section and install the latest version (i.e., 2019-10 with Python 3.7 at the time of this writing) for your operating system.

Then, you will find an entry “Anaconda Navigator”, among others, in your start menu. Click on it.

A window opens showing you several applications that come with the Anaconda Distribution. Now, click on “JupyterLab.”

A new tab in your web browser opens with the address being “localhost” followed by a port number (e.g., 8888). This is the JupyterLab application that is used to display and run the Jupyter notebooks mentioned above. On the left, you see the files and folders in your local user folder. This file browser works like any other. In the center, you have several options to launch (i.e., “create”) new files.

Next, to download the course’s materials as a ZIP file, click on the green “Clone or download” button on the top right of the repository’s page on GitHub. Then, unpack the ZIP file into a folder of your choosing, ideally somewhere within your personal user folder so that the files show up right away in JupyterLab.
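Unpacking can be done with the operating system’s file manager, but it may also be scripted with the zipfile module from Python’s standard library. The sketch below is illustrative only: the helper unpack and the archive name materials.zip are hypothetical, and the demo builds a throw-away archive instead of using the real download.

```python
import tempfile
import zipfile
from pathlib import Path


def unpack(zip_path, target_dir):
    """Extract all files from the ZIP archive into target_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Demo with a throw-away archive in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    zip_path = Path(tmp) / "materials.zip"  # hypothetical file name
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("materials/00_intro.txt", "placeholder")

    names = unpack(zip_path, Path(tmp) / "unpacked")
    extracted = (Path(tmp) / "unpacked" / "materials" / "00_intro.txt").read_text()
    print(names)
```

For the actual course materials, zip_path would point to the downloaded file and target_dir to a folder inside the personal user folder.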

Alternative Installation (for Instructors)

Python can also be installed in a “pure” way as obtained from its core development team (i.e., without any third-party packages installed). However, this may be too “advanced” for a beginner as it involves working with a terminal emulator, which is used without a mouse by typing commands into it.

Assuming that you already have a working version of Python 3.7 or higher installed (cf., the official download page), the following summarizes the commands to be typed into a terminal emulator to get the course materials up and running on a local machine without the Anaconda Distribution. You are then responsible for understanding the concepts behind them.

First, the git command line tool is a more professional way of “cloning” the course materials as compared to downloading them in a ZIP file.

  • git clone

This creates a new folder intro-to-python with all the materials of this repository in it.

Inside this folder, it is recommended to create a so-called virtual environment with Python’s venv module. This only needs to be done once. A virtual environment is a way of isolating the third-party packages installed by different projects, which is considered a best practice.

  • python -m venv venv

The second venv is the environment’s name and by convention often chosen to be venv. However, it could be another name as well.
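The same venv module can also be called from within Python, which makes the role of the two names more explicit. The following is a minimal sketch, not a replacement for the command above: create_environment is a hypothetical helper, and with_pip=False is an assumption chosen to keep the demo fast, whereas the command line version installs pip into the new environment by default.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(project_dir, name="venv"):
    """Create a virtual environment called `name` inside `project_dir`."""
    env_dir = Path(project_dir) / name
    # The first "venv" in `python -m venv venv` is this module,
    # the second one is the folder name passed in here.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Demo in a throw-away project folder.
with tempfile.TemporaryDirectory() as project_dir:
    env_dir = create_environment(project_dir)
    # Every environment contains a pyvenv.cfg file at its top level.
    has_config = (env_dir / "pyvenv.cfg").is_file()
    print(env_dir.name, has_config)
```

Passing a different name (e.g., create_environment(project_dir, name="env")) shows that the environment’s name is indeed a free choice.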

From then on, each time you want to resume work, go back into the intro-to-python folder inside your terminal and “activate” the virtual environment (venv is the name chosen before).

  • source venv/bin/activate

This may change how the terminal’s command prompt looks.
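Besides the changed prompt, Python itself can tell whether it runs inside a virtual environment. The sketch below relies on the fact that, for environments created with the venv module, sys.prefix and sys.base_prefix differ; the function name in_virtual_environment is made up for illustration.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside an environment created by venv, sys.prefix points to the
    # environment's folder while sys.base_prefix points to the base installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
print("interpreter in use:", sys.executable)
```

Running this once before and once after activation should show the reported interpreter path switching to the one inside the venv folder.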

poetry and virtualenvwrapper are popular tools to automate the described management of virtual environments.

After activation for the first time, you must install the project’s dependencies (= the third-party packages needed to run the code), most notably JupyterLab in this project (the “python -m” is often left out but should not be; if you have poetry installed, you may just type poetry install instead).

  • python -m pip install -r requirements.txt
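The reason for keeping the “python -m” is that it runs the pip belonging to exactly the interpreter being called, whereas a bare pip command may belong to a different Python installation on the same machine. The following sketch illustrates the idea from within Python; pip_install_command is a hypothetical helper, and the actual installation is left commented out on purpose.

```python
import subprocess
import sys


def pip_install_command(requirements="requirements.txt"):
    """Build the pip call for the interpreter that runs this script."""
    # sys.executable is the path of the running Python, so pip installs
    # into exactly that interpreter (e.g., the activated environment).
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


command = pip_install_command()
print(" ".join(command))

# Uncomment to actually run the installation:
# subprocess.run(command, check=True)
```

With the virtual environment activated, the printed command starts with the path of the Python inside the venv folder.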

With everything installed, you can now do the equivalent of clicking the “JupyterLab” entry in the Anaconda Navigator.

  • jupyter lab

This opens a new tab in your web browser just as above.

Interactive Presentation Mode & Live Coding

The requirements.txt file also installs the nbextensions for Jupyter notebooks, the black code formatting tool (incl. the blackcellmagic Jupyter extension) and the RISE Jupyter extension. With them, the instructor can easily re-format code in a class session and execute code in presentation mode.

Note: Currently, the RISE extension only works if the notebooks are started with the older notebook command (so, jupyter lab may not be used).

  • jupyter notebook

After installing the dependencies, the instructor must copy the extensions' JavaScript and CSS files into Jupyter’s search directory.

  • jupyter contrib nbextension install --user

Now, the instructor can enable/disable the various Jupyter notebook extensions.

Note: The extension “Collapsible Headings” may interfere with the RISE presentation if hotkeys are enabled.

About the Author

Alexander Hess is a PhD student at the Chair of Logistics Management at the WHU - Otto Beisheim School of Management where he conducts research on urban delivery platforms and teaches an introductory course on Python (cf., Fall Term 2019, Spring Term 2020).

Connect with him on LinkedIn.
