Python is an extremely popular programming language that has quickly become the language of choice for programmers, data scientists, and computational social scientists alike due to its rich ecosystem of statistical and data management libraries, a vibrant open source community, and its clean and easy to read syntax.
While Python can be installed by itself from the official website, for most social science researchers it makes much more sense to install Python as part of the Anaconda distribution. Anaconda is a data science platform that runs on Windows, macOS, and Linux and makes installing and managing Python and its scientific computing libraries a breeze.
What You’ll Learn
- How to install Python as part of the Anaconda distribution.
- How to install packages with conda and pip.
- How to create and manage programming environments with conda.
What You’ll Need
- A computer running a recent version of Windows, macOS, or Linux.
Let’s get started!
Note: All tutorial examples will be illustrated using a Windows PC.
Download the Installer
- Click here to go to the Windows Anaconda download page (Mac users click here).
- On the site, you will see the following download landing page:
- Click the green Download button.
- Save the exe file to a folder of your choice.
- When the download completes, continue to the next section of this tutorial.
Installation
- Navigate to where you downloaded Anaconda.
Note: On PC, this will be a .exe file. On macOS, this will be a .dmg file.
- Double click the file to launch the installer.
- Click Next.
- Read and accept the license agreement.
- Choose if you want to install Anaconda for only you (“Just Me”) or every account on your computer.
- As a rule of thumb, it is probably best to choose “Just Me.”
- Choose where you’d like to install Anaconda and click Next.
Note: It is generally safest and recommended to install Anaconda in the default directory.
- Important: Make sure both Add Anaconda3 to my PATH environment variable and Register Anaconda3 as my default Python 3.x are selected!
- Click the Install button (this can take several minutes).
- Anaconda will offer you the option to install optional software packages. Choose whether you’d like to try them. Note that they are completely optional: Anaconda will function perfectly fine without them.
- When the installer completes, click Finish to exit the installer.
Note: We will be illustrating how to use Anaconda from your computer’s terminal (e.g., Command Prompt on Windows, Terminal on macOS and Linux). This is the recommended usage of Anaconda. If you would like to use the Anaconda GUI, consult the documentation on the Anaconda Navigator application.
The following steps should work on all operating systems.
Check for Anaconda:
- Launch your computer’s terminal app.
- Windows: click Start and type cmd; right-click Command Prompt and click Run as Administrator.
- macOS: open your Terminal app; if Terminal is not in your dock, you can find it in the Launchpad.
- In your computer’s terminal window, type conda just as below and hit enter.
$ > conda
- Your terminal should output a message similar to the one below:
usage: conda-script.py [-h] [-V] command ...
conda is a tool for managing and deploying applications, environments and packages.
Options:
positional arguments:
command
clean Remove unused packages and caches.
compare Compare packages between conda environments.
config Modify configuration values in .condarc. This is modeled after the git config command. Writes to the user
.condarc file (C:\Users\[USERNAME]\.condarc) by default.
create Create a new conda environment from a list of specified packages.
help Displays a list of available conda commands and their help strings.
info Display information about current conda install.
init Initialize conda for shell interaction. [Experimental]
install Installs a list of packages into a specified conda environment.
list List linked packages in a conda environment.
package Low-level conda package utility. (EXPERIMENTAL)
remove Remove a list of packages from a specified conda environment.
uninstall Alias for conda remove.
run Run an executable in a conda environment. [Experimental]
search Search for packages and display associated information. The input is a MatchSpec, a query language for conda
packages. See examples below.
update Updates conda packages to the latest compatible version.
upgrade Alias for conda update.
optional arguments:
-h, --help Show this help message and exit.
-V, --version Show the conda version number and exit.
conda commands available from other packages:
env
- If you see this output, Anaconda has been successfully installed.
- You should also have two new applications installed on your computer: Anaconda Navigator and Anaconda Prompt.
conda is the package and environment manager that comes with Anaconda, and it’s the command we will use from your computer’s terminal to install and manage all of your Python packages.
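If typing conda in the terminal does not produce the help message, it can be useful to check whether the command is visible on your PATH at all. The following is a minimal sketch using only Python's standard library; the helper name find_conda is made up for this illustration and is not part of Anaconda.

```python
import shutil


def find_conda():
    """Return the full path of the conda executable, or None if it is not on PATH.

    find_conda is a hypothetical helper name used only in this example.
    """
    return shutil.which("conda")


conda_path = find_conda()
if conda_path is None:
    print("conda was not found on PATH; try opening the Anaconda Prompt instead.")
else:
    print(f"conda found at: {conda_path}")
```

If the result is None, the PATH option from the installation step was probably not selected, and the Anaconda Prompt application is the easier route.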
Check for Python:
- In your terminal window, type python as below:
$ > python
- Your terminal should now launch a Python prompt, as indicated by the following output:
Python 3.8.10 (default, May 19 2021, 13:12:57) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
- If you see this output, Python has been successfully installed.
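Once the Python prompt is open, you can also confirm which interpreter you are running from inside Python itself. This is a small sketch using the standard sys module; the exact version and path you see will differ from machine to machine.

```python
import sys

# Version of the interpreter running this code, as three integers.
major, minor, micro = sys.version_info[:3]
print(f"Python {major}.{minor}.{micro}")

# Full path of the interpreter. For a default Anaconda install this usually
# points inside the Anaconda directory (an assumption about a default setup).
print(sys.executable)

# This tutorial assumes Python 3.
is_python3 = major == 3
print("Running Python 3:", is_python3)
```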
Now, let’s learn about creating and managing environments with conda.
What are environments?
In Python, a programming environment (also known as a “virtual environment” or just an “environment”) is a self-contained ecosystem containing:
- Your Python interpreter.
- The Python libraries you have installed with conda or pip.
- The collection of relevant scripts related to your libraries and Python interpreter.
Python environments are isolated from one another, meaning the packages and scripts installed in one environment do not interfere or interact with the packages and scripts in another environment.
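One way to see this isolation is to ask Python where the current environment lives and where it installs libraries. The sketch below uses only the standard library; run it in two different environments and the printed paths should differ.

```python
import sys
import sysconfig

# Root folder of the environment that the running interpreter belongs to.
env_root = sys.prefix

# Folder where pure-Python libraries are installed for this environment.
site_packages = sysconfig.get_paths()["purelib"]

print("Environment root:", env_root)
print("Libraries are installed in:", site_packages)
```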
Why use environments?
You should always use a fresh and unique environment for every project you work on. There are four main reasons for this:
- Only installing the libraries necessary for a project to avoid dependency conflicts.
- Avoiding library version conflicts (e.g., some packages require different versions of the same library).
- Installing and using different versions of Python.
- Allowing for easy reproducibility of your research and programming environment by providing a list of packages that other scholars can quickly install.
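To illustrate the reproducibility point, here is a rough sketch that lists every package installed in the current environment together with its version, using the standard importlib.metadata module (Python 3.8 or later). The function name installed_packages and the name==version output format are choices made for this example, not something conda requires.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of 'name==version' strings for the current environment.

    installed_packages is a hypothetical helper written for this example.
    """
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata has no name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


for pin in installed_packages():
    print(pin)
```

A list like this, saved alongside your project, gives other scholars a starting point for rebuilding the same set of libraries.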
Creating Environments with conda
We can create, activate, and manage environments using conda in your computer’s terminal.
- Launch your computer’s terminal.
- We’re now going to create an environment. I am just going to call it test for now.
- The syntax for creating a conda environment follows this order:
  - call conda
  - call the create command
  - provide a name for your environment
- Here’s the code:
$ > conda create --name test
- After running the command, you should have an output that looks like this:
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: **YOUR_PATH_HERE**\test
Proceed ([y]/n)?
- Type y and hit enter to continue.
- You should see the following output:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate test
#
# To deactivate an active environment, use
#
# $ conda deactivate
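If you ever want to script this step, the same order (conda, then create, then the name) can be written as a list of arguments in Python. This is only a sketch: conda_create_command is a made-up helper name, and the rule that names should not contain spaces is an assumption added here to keep the example safe.

```python
def conda_create_command(env_name):
    """Build the argument list for `conda create --name <env_name>`.

    conda_create_command is a hypothetical helper used only in this example.
    """
    # Assumption for this sketch: require a non-empty name without spaces.
    if not env_name or " " in env_name:
        raise ValueError("environment names should be non-empty and contain no spaces")
    return ["conda", "create", "--name", env_name]


command = conda_create_command("test")
print(" ".join(command))  # prints: conda create --name test
```

The list could then be handed to subprocess.run(command, check=True) from the standard library, which would run the same command you typed above.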
Using the Environment
In order to use your conda environment, you need to call the activate command in your terminal.
Since we named our environment test, we will tell conda to activate the test environment:
$ > conda activate test
Your terminal should now look something like this (note: this will vary slightly by operating system):
$ (test) >
The name in parentheses, test, tells you which conda environment is currently active.
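You can make the same check from inside Python. When an environment is activated, conda sets an environment variable named CONDA_DEFAULT_ENV to the environment's name; the sketch below simply reads it and reports what it finds.

```python
import os

# CONDA_DEFAULT_ENV is set when a conda environment is activated.
# If it is missing, no conda environment appears to be active in this session.
active_env = os.environ.get("CONDA_DEFAULT_ENV")

if active_env is None:
    print("No conda environment appears to be active.")
else:
    print(f"Active conda environment: {active_env}")
```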
Closing the Environment
When you are done using your environment, make sure you deactivate it. This is accomplished easily with the following command:
$ (test) > conda deactivate
Installing Packages
There are two primary ways to install packages in Python: (1) using conda and (2) using pip.
It is recommended to use conda when you can, but some libraries are not available in conda’s repository. In that case, use pip. pip is also the easiest way to install packages directly from GitHub.
Installing Packages with Conda
Note: Before installing packages, make sure you activate your conda environment.
Usually, you will want to use conda to install Python packages. This can be accomplished easily from the terminal using the install command:
$ > conda install PACKAGE_NAME_HERE
An important caveat is that conda has different channels for installing packages. If you do not specify the channel when you install the package, the installation could take a very long time.
Anaconda provides code snippets for all packages in its package repository. Click here to search the repository.
Here’s an example of choosing a channel when installing the numpy package:
$ > conda install -c conda-forge numpy
By passing -c conda-forge, you tell conda to install numpy from the conda-forge channel.
Often, specifying -c conda-forge will work, but consult the Anaconda documentation for your particular package.
Installing Packages with Pip
pip is Python’s default package manager. While it comes installed with Python, it is a good idea to install pip with conda within each of your programming environments. This prevents pip from installing packages globally and creating conflicts:
$ > conda install pip
Sometimes, conda will not have a package that you want, and you will need to use pip. In this case, pip is very similar to conda. Let’s install the Tabulate package with pip as an example:
$ > pip install tabulate
Note: Unlike with conda, you do not need to choose a channel with pip to optimize downloads and installs.
pip is also great when you want to install a package from GitHub, like so:
$ > pip install git+https://github.com/tabulate/tabulate.git
Uninstalling Packages
Uninstalling packages follows the same basic syntax:
$ > conda uninstall PACKAGE_NAME_HERE
$ > pip uninstall PACKAGE_NAME_HERE
Due to the potential for dependency conflicts, however, you are unlikely to uninstall packages much, if at all.
Note: Regardless of what editor or IDE you are using, you must run your Python scripts from within the environment where you installed your libraries in order to use them.
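A quick way to confirm that a library is available in the environment you are currently running is to ask Python for its installed version. This sketch uses the standard importlib.metadata module (Python 3.8 or later); installed_version is a helper name invented for the example, and the two package names are just the ones used earlier in this tutorial.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed.

    installed_version is a hypothetical helper written for this example.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("tabulate", "numpy"):
    version = installed_version(name)
    if version is None:
        print(f"{name} is not installed in this environment.")
    else:
        print(f"{name} {version} is installed.")
```

If a package you just installed shows up as missing, the most likely cause is that the script is running in a different environment from the one you installed into.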
Using Packages
To use the packages you install, you have to import them into your Python program, whether that be a standard Python script, an IDE, or a Jupyter Notebook (this assumes you have activated the appropriate programming environment).
# Import the default datetime library
import datetime
# Now you can use the classes and functions from the datetime library:
date = datetime.datetime(2021, 1, 1)
# Print year:
print(date.year)
>>> 2021
Alternatively, you can import a specific name from a library (here, the datetime class from the datetime module) to call it directly:
from datetime import datetime
date = datetime(2021, 1, 1)
print(date.year)
>>> 2021
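Both import styles give you the very same object; they only differ in how much you have to type to reach it. The short sketch below shows this with the date class from the standard datetime library (date is used instead of datetime here only to avoid having a module and a class with the same name in one example).

```python
import datetime
from datetime import date

# The class reached through the module and the directly imported class
# are one and the same object.
same_object = datetime.date is date
print(same_object)

# Either spelling can be used to build a value.
new_year = date(2021, 1, 1)
also_new_year = datetime.date(2021, 1, 1)
print(new_year.year)
print(new_year == also_new_year)
```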
Aliases
Many Python packages are imported under aliases, which are usually shorthand for the full library name. For example, two extremely popular libraries that you will use on a regular basis, NumPy and Pandas, are aliased as np and pd, respectively:
import numpy as np
import pandas as pd
# Now you can call methods from these libraries with less text:
array_one = np.array([1,2,3,4])
array_one.reshape(2,2)
>>> array([[1, 2],
       [3, 4]])
# Load a file:
df = pd.read_csv("your_data.csv")
df.head()
>>> "HEAD_OF_YOUR_DATA_HERE"
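An alias is nothing more than a second name for the same library. The sketch below demonstrates this with the standard math module so that it runs without installing anything; the alias m is an arbitrary choice for the example and not a community convention like np or pd.

```python
import math
import math as m  # "m" is an arbitrary alias chosen for this example

# The alias and the full name refer to the same module object.
print(m is math)  # prints: True

# So functions can be called through either name.
print(m.sqrt(16))  # prints: 4.0
print(math.sqrt(16) == m.sqrt(16))  # prints: True
```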
Summary
Figuring out the submodules and aliases of the Python packages you use is something that comes from experience. Consult the API documentation of the packages you use to see specific examples of usage, imports, submodules, and aliases.
Conclusion
Congratulations! You have completed this tutorial!
You learned:
- How to install Python with the Anaconda distribution.
- How to create programming environments for your projects.
- How to use conda and pip to install Python libraries.
- How to import libraries into your Python programs.
The ins and outs of the libraries you install (which methods to use, the specific syntax of libraries and submodules, and the common aliases used by different Python libraries) only come with experience.
So get out there, code, and have some fun!
Important Resources
Here are some useful resources related to this tutorial: