# How to Create 2D and 3D Scatter Plots with Python and Matplotlib

## Tested On

• Linux Ubuntu 20.04
• Windows 10
• macOS Catalina

## Introduction

Scatter plots are used to explore the relationship between two variables by plotting them as points in Cartesian coordinates (positions along the x- and y-axes). Unlike bar graphs, which simply plot values for comparison, scatter plots are most useful when there's a real or implied continuity (a trend) in the x variable data.

For example, to better understand the correlation between people's heights and weights, each data point should contain a numeric value for the person's height, which we'll plot along the x-axis, and a numeric value for the person's weight, which we'll plot along the y-axis.
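To make the idea of correlation concrete, here is a minimal sketch that computes the Pearson correlation coefficient for a handful of (height, weight) pairs taken from the dataset plotted later in this tutorial. The helper name `pearson_r` is hypothetical and only uses the standard library; it is one common way to quantify a linear relationship, not something the plots themselves require.

```python
import math

# A handful of (height in inches, weight in lbs) pairs taken from the
# dataset plotted later in this tutorial.
heights = [62, 63, 64, 65, 66, 68]
weights = [120, 140, 115, 128, 141, 170]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of the products of deviations (numerator of the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (numerators of the variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(heights, weights)
# Values near +1 suggest a strong positive trend, values near 0 suggest none
print(f"Pearson r = {r:.2f}")
```

A value close to +1 would suggest that weight tends to rise with height, which is the kind of pattern a scatter plot makes visible at a glance.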

## Prerequisites

It helps to be familiar with Python fundamentals, like data types, loops, functions, conditionals, modules, packages, virtual environments, etc. If you need a crash course, navigate to the Python Developer section of our tutorials.

If you're not familiar with Matplotlib, we recommend that you complete the prerequisites above. They introduce you to important data science and data visualization concepts, various graph types, and scenarios where certain graphs are more appropriate than others. For example, a line graph connects every point in a dataset with a line, which indicates a slope. A scatter plot is more concerned with expressing a trend, often with the help of a regression line. Rather than connecting all of the points, the regression line passes through the high-concentration clusters in the direction they lean.
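As a rough illustration of what a regression line is, the sketch below fits a straight line to a few points with ordinary least squares. The function name `fit_line` is hypothetical, and ordinary least squares is an assumption here; it is only one of several ways to draw a trend line.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the x values around their mean
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    # How x and y vary together around their means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# A handful of (height, weight) pairs taken from this tutorial's dataset
heights = [62, 63, 64, 65, 66, 68]
weights = [120, 140, 115, 128, 141, 170]

slope, intercept = fit_line(heights, weights)

# Two endpoints are enough to draw a straight line across the data range
x_ends = [min(heights), max(heights)]
y_ends = [slope * x + intercept for x in x_ends]
print(f"weight = {slope:.2f} * height + {intercept:.2f}")
```

With a Matplotlib axes object named `ax`, the two endpoints could then be drawn over the markers with `ax.plot(x_ends, y_ends)`.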

## How to Set Up a Project Skeleton

### How to Create Python Project Files with Windows 10 PowerShell 2.0+

``````cd ~
New-Item -ItemType "directory" -Path ".\matplotlib-bar-project"
cd matplotlib-bar-project
virtualenv venv
.\venv\Scripts\activate``````

To verify that the virtual environment is active, make sure (venv) is in the PowerShell command prompt. For example, (venv) PS C:\Users\username\matplotlib-bar-project>

### How to Create Python Project Files with Linux Ubuntu 14.04+ or macOS

``````cd ~
mkdir matplotlib-bar-project
cd matplotlib-bar-project
virtualenv -p python3 venv
source venv/bin/activate``````

To verify that the virtual environment is active, make sure (venv) is in the terminal command prompt.

This will create the following files and folders, and activate the virtual environment.

``````▾ matplotlib-bar-project/
▸ venv/``````
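Besides looking for (venv) in the prompt, you can ask the Python interpreter itself whether it is running inside a virtual environment. The following is a small sketch; the helper name `in_virtualenv` is hypothetical, and the check relies on `sys.prefix` and `sys.base_prefix`, with a fallback for older virtualenv releases that set `sys.real_prefix`.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.prefix point at the
    # environment while sys.base_prefix keeps pointing at the base install.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Virtual environment active:", in_virtualenv())
print("Interpreter prefix:", sys.prefix)
```

If the environment is active, the printed prefix should point inside the project's venv folder.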

## Installing Matplotlib with Pip

This tutorial requires you to install a specific version of Matplotlib with pip3 install matplotlib==3.3.3. To get the plot to display in a window, you can install PyQt5 with pip3 install PyQt5==5.15.2.
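To confirm that the pinned versions were actually installed into the active environment, a short check like the one below may help. It is a sketch that assumes Python 3.8 or newer, where `importlib.metadata` is part of the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Versions pinned by this tutorial
for name, expected in [("matplotlib", "3.3.3"), ("PyQt5", "5.15.2")]:
    found = installed_version(name)
    status = "OK" if found == expected else "differs from the tutorial"
    print(f"{name}: installed={found}, expected={expected} ({status})")
```

A different version does not necessarily break the examples, but it is worth knowing about when the output looks different from what the tutorial describes.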

## Creating a Matplotlib Scatter Plot to Find the Correlation Between Height vs. Weight

Here's an example where we use Python and Matplotlib to generate a scatter plot to understand people's weight in relation to their height. Although we have some outliers, the trend is that taller people generally weigh more than shorter people. This example uses lbs and inches. To convert lbs to kg, divide by 2.205; to convert inches to cm, multiply by 2.54.

``````import matplotlib.pyplot as plt

heights = [62, 62.5, 62.5, 62.5, 63, 63, 63.5, 63.5, 63.5, 63.5, 64, 64, 64, 64, 64.5, 64.5, 64.5, 65, 65, 65, 65, 65, 65, 65.6, 66, 66, 66, 66, 66, 66, 66, 66, 66.5, 67, 67, 67.5, 67.5, 67.5, 67.5, 67.5, 68, 68, 68, 68]
weights = [120, 120, 122, 123, 130, 140, 145, 140, 142, 143, 115, 120, 124, 135, 136, 135, 137, 130, 132, 135, 128, 139, 134, 140, 142, 130, 180, 145, 142, 143, 141, 149, 150, 145, 142, 145, 159, 155, 158, 166, 170, 165, 160, 163]

# Create the figure and a single set of axes
fig, ax = plt.subplots(figsize=(10, 5))

# Plot each (height, weight) pair once as a marker
ax.scatter(heights, weights)

# Add a title and axis labels
ax.set_title('Height vs. Weight')
ax.set_xlabel('Height')
ax.set_ylabel('Weight')

plt.savefig("plot.png")
plt.show()``````
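If you would rather plot metric units, the data can be converted before it is passed to Matplotlib. The sketch below uses hypothetical helper names (`lbs_to_kg`, `inches_to_cm`): pounds are divided by 2.205 to get kilograms, and inches are multiplied by 2.54 to get centimeters.

```python
LBS_PER_KG = 2.205   # approximate number of pounds in one kilogram
CM_PER_INCH = 2.54   # centimeters in one inch


def lbs_to_kg(lbs):
    """Convert a weight in pounds to kilograms."""
    return lbs / LBS_PER_KG


def inches_to_cm(inches):
    """Convert a length in inches to centimeters."""
    return inches * CM_PER_INCH


# A few values taken from the tutorial's dataset
heights_in = [62, 65, 68]
weights_lbs = [120, 130, 170]

heights_cm = [round(inches_to_cm(h), 1) for h in heights_in]
weights_kg = [round(lbs_to_kg(w), 1) for w in weights_lbs]

print(heights_cm)
print(weights_kg)
```

The converted lists can be passed to the scatter call in place of the original ones; remember to update the axis labels so they state the units.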

## Explanation of Some Common plt.scatter() Parameters

| Parameter Name | Description | Data Type | Default Value |
|---|---|---|---|
| x | The data positions along the x-axis | float or array-like, shape (n, ) | (required) |
| y | The data positions along the y-axis | float or array-like, shape (n, ) | (required) |
| s | The marker size in points ** 2 | float or array-like, shape (n, ) | rcParams['lines.markersize'] ** 2 |
| c | The color of the markers | array-like or list of colors or color | None |
| marker | The marker style. See Matplotlib v3.3.3 Markers documentation. | str | rcParams["scatter.marker"] (default: 'o') |
| cmap | A Colormap instance or colormap name; only used if c is an array of floats | str or Colormap | rcParams["image.cmap"] (default: 'viridis') |
| norm | If c is an array of floats, norm scales the color data, c, to the range 0 to 1. If None, defaults to colors.Normalize. | Normalize | None |
| alpha | Opacity of the marker, between 0 (transparent) and 1 (opaque) | float | None |
| linewidths | The line widths of the marker edges | float or array-like | rcParams["lines.linewidth"] (default: 1.5) |
| edgecolors | The edge color of the marker. Possible values: 'face', 'none', a Matplotlib color, or a sequence of colors | {'face', 'none', None} or color or sequence of color | rcParams["scatter.edgecolors"] (default: 'face') |
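The following sketch exercises several of these parameters in a single call so you can see how they combine. The data values and the output file name are made up for illustration, and the `Agg` backend is an assumption that lets the script run without a window; remove that line if you want to display the plot with `plt.show()`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI window needed
import matplotlib.pyplot as plt

# Illustrative values only
x = [1, 2, 3, 4]
y = [2, 4, 1, 3]

fig, ax = plt.subplots()
points = ax.scatter(
    x, y,
    s=[40, 80, 120, 160],     # marker areas in points ** 2
    c=[0.1, 0.4, 0.7, 1.0],   # floats that cmap turns into colors
    cmap="viridis",
    marker="s",               # square markers
    alpha=0.8,                # slightly transparent
    linewidths=1.5,           # width of the marker edges
    edgecolors="black",       # color of the marker edges
)

fig.savefig("scatter_parameters.png")
```

Because `c` is a list of floats here, `cmap` decides the colors; passing a single color name to `c` instead would make `cmap` irrelevant.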

## How to Customize a Scatter Plot with Python and Matplotlib

In the following example, we remove the top and right borders, add major gridlines, and add some transparency to the markers to make it easier to see when they overlap.

``````import matplotlib.pyplot as plt

heights = [62, 62.5, 62.5, 62.5, 63, 63, 63.5, 63.5, 63.5, 63.5, 64, 64, 64, 64, 64.5, 64.5, 64.5, 65, 65, 65, 65, 65, 65, 65.6, 66, 66, 66, 66, 66, 66, 66, 66, 66.5, 67, 67, 67.5, 67.5, 67.5, 67.5, 67.5, 68, 68, 68, 68]
weights = [120, 120, 122, 123, 130, 140, 145, 140, 142, 143, 115, 120, 124, 135, 136, 135, 137, 130, 132, 135, 128, 139, 134, 140, 142, 130, 180, 145, 142, 143, 141, 149, 150, 145, 142, 145, 159, 155, 158, 166, 170, 165, 160, 163]

fig, ax = plt.subplots(figsize=(10, 5))

ax.set_title('Height vs. Weight')
ax.set_xlabel('Height')
ax.set_ylabel('Weight')

# Plot the markers once, slightly transparent so overlaps stay visible
ax.scatter(heights, weights, alpha=0.75)

# Remove top and right borders
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Add light major gridlines
ax.grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.4)

plt.savefig("plot.png")
plt.show()``````
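The later examples in this tutorial repeat the same border and gridline settings, so it can be convenient to collect them in one place. Below is a minimal sketch of such a helper; the name `style_axes` is hypothetical, the `set_axisbelow` call is an optional extra that keeps the gridlines behind the markers, and the `Agg` backend is an assumption so the script runs without a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI window needed
import matplotlib.pyplot as plt


def style_axes(ax):
    """Apply the tutorial's look: no top/right borders, light major gridlines."""
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.4)
    ax.set_axisbelow(True)  # draw gridlines behind the markers
    return ax


fig, ax = plt.subplots(figsize=(10, 5))
style_axes(ax)

# A few points taken from the tutorial's dataset
ax.scatter([62, 65, 68], [120, 130, 170], alpha=0.75)

fig.savefig("styled_plot.png")
```

Calling `style_axes(ax)` once per figure keeps the styling consistent if you later change, say, the grid color.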

## How to Plot 3 Variables on a Scatter Plot with Python and Matplotlib

We can plot a third variable by mapping it to the sizes of the markers. In the following example, we add age data and multiply each value by a constant to make the variations easier to see.

``````import matplotlib.pyplot as plt

# Data
heights = [62, 62.5, 62.5, 62.5, 63, 63, 63.5, 63.5, 63.5, 63.5, 64, 64, 64, 64, 64.5, 64.5, 64.5, 65, 65, 65, 65, 65, 65, 65.6, 66, 66, 66, 66, 66, 66, 66, 66, 66.5, 67, 67, 67.5, 67.5, 67.5, 67.5, 67.5, 68, 68, 68, 68]
weights = [120, 120, 122, 123, 130, 140, 145, 140, 142, 143, 115, 120, 124, 135, 136, 135, 137, 130, 132, 135, 128, 139, 134, 140, 142, 130, 180, 145, 142, 143, 141, 149, 150, 145, 142, 145, 159, 155, 158, 166, 170, 165, 160, 163]
ages = [20, 34, 24, 26, 32, 23, 27, 28, 40, 32, 33, 30, 31, 29, 28, 26, 25, 39, 37, 28, 38, 40, 25, 35, 25, 26, 28, 29, 30, 31, 25, 34, 38, 20, 21, 23, 29, 27, 27, 35, 30, 25, 28, 29]

fig, ax = plt.subplots(figsize=(6, 6))

# Titles
ax.set_title('Height vs. Weight')
ax.set_xlabel('Height')
ax.set_ylabel('Weight')

# Remove top and right borders
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

ax.grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.4)

# Marker size encodes age; 7.5 is a multiplier that exaggerates the differences
ax.scatter(heights, weights,
           linewidths=1, alpha=0.75,
           edgecolor='k',
           s=[age * 7.5 for age in ages],
           c='palegreen')

plt.savefig("plot.png")
plt.show()``````
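A fixed multiplier works when the values already sit in a convenient range. As an alternative, the sketch below rescales any list of values linearly into a chosen range of marker areas. The helper name `scale_sizes` and its default limits are hypothetical choices, not part of Matplotlib.

```python
def scale_sizes(values, min_size=20, max_size=300):
    """Linearly rescale values to marker areas between min_size and max_size."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: use the midpoint size for every marker
        return [(min_size + max_size) / 2 for _ in values]
    span = hi - lo
    return [min_size + (v - lo) / span * (max_size - min_size) for v in values]


# The first six ages from the tutorial's dataset
ages = [20, 34, 24, 26, 32, 23]
sizes = scale_sizes(ages)
print(sizes)
```

The resulting list can be passed as `s=sizes` to `ax.scatter`. The smallest value always maps to `min_size` and the largest to `max_size`, so the visual contrast stays the same regardless of the units of the third variable.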

## How to Plot 4 Variables and Add Legends

We can also map a fourth variable to color. When mapping a variable to the colors of the markers, it's best to use a categorical variable to limit the number of colors, because markers become harder to tell apart when too many colors are rendered. In the following example, we limit the colors to two, one for each gender.

``````import matplotlib.pyplot as plt
import numpy as np

# Data
heights = [62, 62.5, 62.5, 62.5, 63, 63, 63.5, 63.5, 63.5, 63.5, 64, 64, 64, 64, 64.5, 64.5, 64.5, 65, 65, 65, 65, 65, 65, 65.6, 66, 66, 66, 66, 66, 66, 66, 66, 66.5, 67, 67, 67.5, 67.5, 67.5, 67.5, 67.5, 68, 68, 68, 68]
weights = [120, 120, 122, 123, 130, 140, 145, 140, 142, 143, 115, 120, 124, 135, 136, 135, 137, 130, 132, 135, 128, 139, 134, 140, 142, 130, 180, 145, 142, 143, 141, 149, 150, 145, 142, 145, 159, 155, 158, 166, 170, 165, 160, 163]
ages = [20, 34, 24, 26, 32, 23, 27, 28, 40, 32, 33, 30, 31, 29, 28, 26, 25, 39, 37, 28, 38, 40, 25, 35, 25, 26, 28, 29, 30, 31, 25, 34, 38, 20, 21, 23, 29, 27, 27, 35, 30, 25, 28, 29]
genders = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

fig, ax = plt.subplots(figsize=(10, 7))

# Titles
ax.set_title('Height vs. Weight')
ax.set_xlabel('Height')
ax.set_ylabel('Weight')

# Remove top and right borders
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

ax.grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.4)

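# Plot the markers: size encodes age (squared), color encodes gender
scatter = ax.scatter(heights, weights,
                     linewidths=1, alpha=0.75,
                     edgecolor='k',
                     s=[age * age for age in ages],
                     c=genders)

# Size legend: take the square root to turn marker areas back into ages
kw = dict(prop="sizes",
          func=lambda s: np.sqrt(s),
          alpha=0.6)
legend1 = ax.legend(*scatter.legend_elements(**kw),
                    loc="upper left", title="Ages",
                    labelspacing=2)
# Keep the first legend; a second ax.legend() call would otherwise replace it
ax.add_artist(legend1)

# Color legend: one entry per gender code
handles, labels = scatter.legend_elements(prop="colors", alpha=0.6)
ax.legend(handles, labels, loc="upper right", title="Genders")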

plt.tight_layout()
plt.savefig("plot.png")
plt.show()``````
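When the color variable is stored as numeric codes, the legend entries are the codes themselves. One possible alternative, sketched below, is to draw one scatter call per category and give each a readable label. The sketch assumes that 0 means 'Female' and 1 means 'Male', matching the labels list used in the 3D example later in this tutorial; the `names` dictionary, the output file name, and the `Agg` backend are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI window needed
import matplotlib.pyplot as plt

# A few rows taken from the tutorial's dataset
heights = [62, 62.5, 63, 64, 65, 66]
weights = [120, 120, 130, 115, 130, 142]
genders = [0, 0, 1, 1, 0, 1]

# Assumed meaning of the numeric codes
names = {0: 'Female', 1: 'Male'}

fig, ax = plt.subplots()
for code, name in names.items():
    # Select only the points that belong to this category
    xs = [h for h, g in zip(heights, genders) if g == code]
    ys = [w for w, g in zip(weights, genders) if g == code]
    # Each call gets the next color of the default cycle and its own label
    ax.scatter(xs, ys, alpha=0.75, edgecolors='k', label=name)

legend = ax.legend(title="Genders", loc="upper left")
fig.savefig("gender_legend.png")
```

The trade-off is that each category becomes a separate collection, so a single size legend built with `legend_elements` would need to be created per category or by hand.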

## How to Plot Multiple Variables On a 3D Scatter Plot

### Plotting a Third Variable Along the Z-Axis

Here's the same Height vs. Weight chart, but with a third (z) dimension along which we plot the age data.

``````import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Data
heights = [62, 62.5, 62.5, 62.5, 63, 63, 63.5, 63.5, 63.5, 63.5, 64, 64, 64, 64, 64.5, 64.5, 64.5, 65, 65, 65, 65, 65, 65, 65.6, 66, 66, 66, 66, 66, 66, 66, 66, 66.5, 67, 67, 67.5, 67.5, 67.5, 67.5, 67.5, 68, 68, 68, 68]
weights = [120, 120, 122, 123, 130, 140, 145, 140, 142, 143, 115, 120, 124, 135, 136, 135, 137, 130, 132, 135, 128, 139, 134, 140, 142, 130, 180, 145, 142, 143, 141, 149, 150, 145, 142, 145, 159, 155, 158, 166, 170, 165, 160, 163]
ages = [20, 34, 24, 26, 32, 23, 27, 28, 40, 32, 33, 30, 31, 29, 28, 26, 25, 39, 37, 28, 38, 40, 25, 35, 25, 26, 28, 29, 30, 31, 25, 34, 38, 20, 21, 23, 29, 27, 27, 35, 30, 25, 28, 29]
genders = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

fig = plt.figure(figsize=(6, 6))
ax = plt.axes(projection="3d")

# Titles
ax.set_title('Height vs. Weight vs. Age')
ax.set_xlabel('Height')
ax.set_ylabel('Weight')
ax.set_zlabel('Age')

# Remove top and right borders
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

ax.grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.4)

# The third positional argument supplies the z (age) coordinates
ax.scatter(heights, weights, ages,
           linewidths=1, alpha=0.75,
           edgecolor='k',
           s=200,
           c='palegreen')

plt.savefig("plot.png")
plt.show()``````
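In a saved image of a 3D plot, some markers can hide behind others, so the viewing angle matters. The sketch below sets the camera with `view_init` before saving. The elevation and azimuth values, the output file name, and the `Agg` backend are arbitrary assumptions; adjust the angles until the points you care about are visible.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI window needed
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older Matplotlib

# A few rows taken from the tutorial's dataset
heights = [62, 63, 64, 65, 66, 68]
weights = [120, 140, 115, 128, 141, 170]
ages = [20, 23, 33, 38, 25, 30]

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111, projection="3d")

ax.scatter(heights, weights, ages, s=200, c='palegreen', edgecolor='k')
ax.set_xlabel('Height')
ax.set_ylabel('Weight')
ax.set_zlabel('Age')

# elev is the angle above the x-y plane, azim rotates around the z-axis (degrees)
ax.view_init(elev=20, azim=-35)

fig.savefig("plot_3d_view.png")
```

When the plot is shown in an interactive window instead, you can usually drag it with the mouse to find a good angle and then hard-code those values.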

### Plotting a Fourth Variable with Color Variations

In this example, we add a separate color for each gender and a legend.

``````import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.colors import ListedColormap

# Data
heights = [62, 62.5, 62.5, 62.5, 63, 63, 63.5, 63.5, 63.5, 63.5, 64, 64, 64, 64, 64.5, 64.5, 64.5, 65, 65, 65, 65, 65, 65, 65.6, 66, 66, 66, 66, 66, 66, 66, 66, 66.5, 67, 67, 67.5, 67.5, 67.5, 67.5, 67.5, 68, 68, 68, 68]
weights = [120, 120, 122, 123, 130, 140, 145, 140, 142, 143, 115, 120, 124, 135, 136, 135, 137, 130, 132, 135, 128, 139, 134, 140, 142, 130, 180, 145, 142, 143, 141, 149, 150, 145, 142, 145, 159, 155, 158, 166, 170, 165, 160, 163]
ages = [20, 34, 24, 26, 32, 23, 27, 28, 40, 32, 33, 30, 31, 29, 28, 26, 25, 39, 37, 28, 38, 40, 25, 35, 25, 26, 28, 29, 30, 31, 25, 34, 38, 20, 21, 23, 29, 27, 27, 35, 30, 25, 28, 29]
genders = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
labels = ['Female', 'Male']
colors = ListedColormap(['lightcoral', 'b'])

fig = plt.figure(figsize=(6, 6))
ax = plt.axes(projection="3d")

# Titles
ax.set_title('Height vs. Weight vs. Age')
ax.set_xlabel('Height')
ax.set_ylabel('Weight')
ax.set_zlabel('Age')

# Remove top and right borders
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

ax.grid(color='grey', linestyle='-', linewidth=0.25, alpha=0.4)

scatter = ax.scatter(heights, weights, ages, c=genders, cmap=colors)
ax.legend(handles=scatter.legend_elements()[0], labels=labels)

plt.savefig("plot.png")
plt.show()``````


## Conclusion

In this tutorial, we covered how to use Matplotlib and Python to generate 2D and 3D scatter plots, and how to plot additional variables using marker size and color.
