10. Plotting Graphs with Matplotlib#

Estimated time to complete: one hour.

10.1. Introduction: Matplotlib and Pyplot#

Numerical data is often presented with graphs, and the tools we use for this come from the module matplotlib.pyplot which is part of the Python package matplotlib. (A Python package is essentially a module that also contains other modules.)

10.2. Sources on Matplotlib#

Matplotlib is a huge collection of graphics tools, of which we see just a few here. For more information, the home site for Matplotlib is http://matplotlib.org and the section on pyplot is at http://matplotlib.org/1.3.1/api/pyplot_api.html

However, another site that I find easier as an introduction is https://scipy-lectures.org/intro/matplotlib/

In fact, that whole site https://scipy-lectures.org/ is quite useful as a reference on Python, Numpy, and so on.

Note: for now, the descriptions here are about working in notebooks; see the aside below on differences when using Spyder and the IPython command line.

10.3. Choosing where the graphs appear#

In a notebook, we can choose between having the figures produced by Matplotlib appear “inline” (that is, within the notebook window) or in separate windows. For now we will use the inline option, which is the default, but can also be specified explicitly with the command

%matplotlib inline

To activate that, uncomment the line below; that is, remove the leading hash character “#”

#%matplotlib inline

This is an IPython magic command, indicated by starting with the percent character “%” — you can read more about them at https://ipython.org/ipython-doc/dev/interactive/magics.html

Alternatively, one can have figures appear in separate windows, which might be useful when you want to save them to files, or zoom and pan around the image. That can be chosen with the magic command

%matplotlib tk
#%matplotlib tk

As far as I know, this magic works for Windows and Linux as well as Mac OS; let me know if it does not!

We need some NumPy stuff, for example to create arrays of numbers to plot.

Note that NumPy is needed for this: functions like sin from module math accept only a single number, not an array, and Python lists and tuples do not support the element-by-element arithmetic used below!

# Import a few favorites, and let them be known by their first names:
from numpy import linspace, sin, cos, pi
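To see why the NumPy versions of these functions are the ones to import, here is a small sketch comparing them with module math. The variable names values, sines and math_sin_worked are just for this illustration.

```python
import math
from numpy import linspace, sin

values = linspace(0.0, 1.0, 3)   # three equally spaced values from 0 to 1

# numpy's sin works on the whole array at once, element by element
sines = sin(values)
print(sines)

# math.sin expects a single number, so passing an array raises a TypeError
try:
    math.sin(values)
    math_sin_worked = True
except TypeError as err:
    math_sin_worked = False
    print(f"math.sin failed: {err}")
```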

And for now, just the one main matplotlib graphics function, plot

from matplotlib.pyplot import plot

To access all of pyplot, add its common nickname plt:

import matplotlib.pyplot as plt

Producing arrays of “x” values with the numpy function linspace#

To plot the graph of a function, we first need a collection of values for the abscissa (horizontal axis). The function linspace (from numpy) gives an array containing a specified number of equally spaced values over a specified interval, so that

tenvalues = linspace(1., 6., 10)

gives ten equally spaced values ranging from 1 to 6:

print(f"Array 'tenvalues' is:\n{tenvalues}")
Array 'tenvalues' is:
[1.         1.55555556 2.11111111 2.66666667 3.22222222 3.77777778
 4.33333333 4.88888889 5.44444444 6.        ]

Not quite what you expected? To get values with ten intervals in between them, you need 11 values:

tenintervals = linspace(1., 6., 11)
print(f"Array 'tenintervals' is: \n {tenintervals}")
Array 'tenintervals' is: 
 [1.  1.5 2.  2.5 3.  3.5 4.  4.5 5.  5.5 6. ]
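The rule at work is that n points have n-1 gaps between them, so the spacing is (6 - 1)/(n - 1). As a quick check, linspace can report the spacing it used through its optional argument retstep; the variable names here are just for illustration.

```python
from numpy import linspace

# retstep=True asks linspace to also return the spacing between values
ten_points, step10 = linspace(1., 6., 10, retstep=True)
eleven_points, step11 = linspace(1., 6., 11, retstep=True)

print(len(ten_points), step10)      # 10 points, spacing 5/9
print(len(eleven_points), step11)   # 11 points, spacing 5/10
```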

Basic graphs with plot#

We could use these 11 values to graph a function, but the result is a bit rough, because the given points are joined with straight line segments:

plot(tenintervals, sin(tenintervals))
[<matplotlib.lines.Line2D at 0x7f8c01c68400>]
_images/graphing-with-matplotlib_23_1.png

Here we see the default behavior of joining the given points with straight lines.

Aside: That text output above the graph is a message returned as the output value of function plot; that is what happens when you execute a function in the last line of a cell but do not “use” its return value by either saving its result into a variable or making it input to another function.

You might want to suppress that, and that can be done by ending the command with a semi-colon.

More generally, if the last line of a cell gives a value, that value is displayed when the cell is run, and appending a semi-colon suppresses that.

Semi-colons at the end of any other line do nothing, but are harmless.

plot(tenintervals, sin(tenintervals));
_images/graphing-with-matplotlib_26_0.png
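The other way to keep that message from appearing is to "use" the return value, for example by saving it in a variable. A minimal sketch, where the names x_values and lines are my own choices:

```python
from numpy import linspace, sin
from matplotlib.pyplot import plot

x_values = linspace(1., 6., 11)

# Saving the return value in a variable also keeps it from being displayed
lines = plot(x_values, sin(x_values))

print(type(lines))   # the return value is a Python list ...
print(len(lines))    # ... with one entry for the one curve plotted
```

The list holds the Line2D object mentioned in the message, which can be useful later for inspecting or adjusting the curve.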

For discrete data it might be better to mark each point, unconnected. This is done by adding a third argument, a text string specifying a marker, such as a star:

plot(tenvalues, sin(tenvalues), '*');
_images/graphing-with-matplotlib_28_0.png

Or maybe both lines and markers:

plot(tenvalues, sin(tenvalues), '-*');
_images/graphing-with-matplotlib_30_0.png

10.4. Smoother graphs#

It turns out that 50 points is often a good choice for a smooth-looking curve, so the function linspace has this as a default input parameter: you can omit that third input value, and get 50 points.

Let’s use this to plot some trig. functions.

x = linspace(-pi, pi)
print(x)
[-3.14159265 -3.01336438 -2.88513611 -2.75690784 -2.62867957 -2.5004513
 -2.37222302 -2.24399475 -2.11576648 -1.98753821 -1.85930994 -1.73108167
 -1.60285339 -1.47462512 -1.34639685 -1.21816858 -1.08994031 -0.96171204
 -0.83348377 -0.70525549 -0.57702722 -0.44879895 -0.32057068 -0.19234241
 -0.06411414  0.06411414  0.19234241  0.32057068  0.44879895  0.57702722
  0.70525549  0.83348377  0.96171204  1.08994031  1.21816858  1.34639685
  1.47462512  1.60285339  1.73108167  1.85930994  1.98753821  2.11576648
  2.24399475  2.37222302  2.5004513   2.62867957  2.75690784  2.88513611
  3.01336438  3.14159265]
# With a line through the points
plot(x, sin(x), '-');
_images/graphing-with-matplotlib_34_0.png
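As a quick check of that default, one can count the values and look at the spacing, which should be the interval length 2π divided by the 49 gaps between 50 points:

```python
from numpy import linspace, pi

x = linspace(-pi, pi)   # third argument omitted, so the default is used
print(len(x))           # number of points
print(x[1] - x[0])      # spacing between neighboring points
print(2*pi/49)          # the expected spacing, for comparison
```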

10.5. Multiple curves on a single figure#

As we have seen when using plot to produce inline figures in a Jupyter notebook, plot commands in different cells produce separate figures.

To combine curves on a single graph, one way is to use successive plot commands within the same cell:

plot(x, cos(x), '*')
plot(x, sin(x));
_images/graphing-with-matplotlib_36_0.png

On the other hand, when plotting externally, or from a Python script file or the IPython command line, successive plot commands keep adding to the same figure until you explicitly specify otherwise, with the function figure introduced below.

Note: The semi-colon was only needed on the final plot command, because a Jupyter cell only displays the output of the last command in the cell (along with anything explicitly output with a print function of course).

10.6. Two curves with a single plot command#

Several curves can be specified in a single plot command (which also works with external figure windows of course.)

plot(x, cos(x), '*', x, sin(x));
_images/graphing-with-matplotlib_40_0.png

Note that even with multiple curves in a single plot command, markers can be specified on some, none or all: Matplotlib uses the difference between an array and a text string to recognize which arguments specify markers instead of data.
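One way to confirm how the arguments were grouped is to look at what plot returns: a list with one Line2D object per curve. The sketch below uses the methods get_marker and get_linestyle of those objects; the name curves is just for illustration.

```python
from numpy import linspace, sin, cos, pi
from matplotlib.pyplot import plot

x = linspace(-pi, pi)
curves = plot(x, cos(x), '*', x, sin(x))

print(len(curves))                 # one entry per curve
print(curves[0].get_marker())      # the first curve got the star marker
print(curves[1].get_linestyle())   # the second curve kept the default solid line
```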

Here are some other marker and line style options, particularly useful if you need to print in black-and-white.

plot(x, cos(x), '.', x, sin(x), ':');
_images/graphing-with-matplotlib_42_0.png

10.7. Multiple curves in one figure#

There can be any number of curves in a single plot command:

x = linspace(-1,1)
plot(x, x+1, x, x+2, x, x+3, x, x+4, x, x+5, x, x+6,
     x, x+7, x, x+8, x, x+9, x, x+10, x, x+11, x, x+12);
_images/graphing-with-matplotlib_44_0.png

Note the color sequence: blue, orange, green, red, … blue, orange …

With more curves than there are colors in the default sequence (ten colors here, but it can vary between versions of matplotlib) the color sequence eventually repeats, but you probably don’t want that many curves on one graph.
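If you are curious how many colors your version has, the sequence can be inspected through matplotlib's settings object rcParams, under the key 'axes.prop_cycle' that is also mentioned in the help for plot. This is only a sketch, and the variable names are my own:

```python
from numpy import linspace
from matplotlib import rcParams
from matplotlib.pyplot import plot

# The colors in the current default sequence, as a list of color strings
cycle_colors = rcParams['axes.prop_cycle'].by_key()['color']
n_colors = len(cycle_colors)
print(n_colors, cycle_colors)

# Plot one more curve than there are colors: the last one reuses
# the color of the first
x = linspace(-1, 1)
lines = [plot(x, x + k)[0] for k in range(n_colors + 1)]
print(lines[0].get_color(), lines[-1].get_color())
```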

Aside on long lines of code: The plot command with twelve curves above illustrates a little Python coding hack: one way to have a long command continue over several lines is simply to have parentheses wrapped around the part that spans multiple lines. When a line ends with an opening parenthesis not yet matched, Python knows that something is still to come.
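Here is a smaller sketch of the same idea, with the arguments of one plot command spread over three lines:

```python
from numpy import linspace
from matplotlib.pyplot import plot

x = linspace(-1, 1)

# The "(" after plot is not closed on the first line, so Python keeps
# reading the following lines until it reaches the matching ")"
lines = plot(x, x + 1,
             x, x + 2,
             x, x + 3)
print(len(lines))   # one entry for each of the three curves
```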

Aside: using IPython magic commands in Spyder and with the IPython command line

If using Spyder and the IPython command line, there is a similar choice of where graphs appear, but with a few differences to note:

  • With the “inline” option (which is again the default) figures then appear in a pane within the Spyder window.

  • The “tk” option works exactly as with notebooks, with each figure appearing in its own window.

  • Note: Any such IPython magic commands must be entered at the IPython interactive command line, not in a Python code file.

10.8. Plotting sequences#

A curve can also be specified by a single array of numbers: these are taken as the values of a sequence, indexed Pythonically from zero, and plotted as the ordinates (vertical values):

plot(tenvalues**2, '.');
_images/graphing-with-matplotlib_49_0.png
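To check which horizontal values were used, one can ask the plotted curve for its data with the Line2D methods get_xdata and get_ydata. A small sketch, with the name points chosen just for this example:

```python
from numpy import linspace, arange, array_equal
from matplotlib.pyplot import plot

tenvalues = linspace(1., 6., 10)

# With a single array, the indices 0, 1, ..., 9 are used as the "x" values
points = plot(tenvalues**2, '.')[0]
print(points.get_xdata())

# So this call with the indices given explicitly plots the same points
same_points = plot(arange(len(tenvalues)), tenvalues**2, '.')[0]
print(array_equal(points.get_xdata(), same_points.get_xdata()))
```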

10.9. Plotting curves in separate figures (from a single cell)#

From within a single Jupyter cell, or when working with Python files or in the IPython command window (as used within Spyder), successive plot commands keep adding to the previous figure. To instead start the next plot in a separate figure, first create a new “empty” figure, with the function matplotlib.pyplot.figure.

With a full name as long as that, it is worth importing so that it can be used on a first name basis:

from matplotlib.pyplot import figure
x = linspace(0, 2*pi)
plot(x, sin(x))
figure()
plot(x, cos(x), 'o');
_images/graphing-with-matplotlib_52_0.png _images/graphing-with-matplotlib_52_1.png

The figure command can also do other things, like attach a name or number to a figure when it is displayed externally, and change from the default size.

So even though this is not always needed in a notebook, from now on each new figure will get an explicit figure command. Revisiting the last example:

x = linspace(0, 2*pi)
figure(99)
# What does 99 do?
# See with external "tk" display of figures,
# as with `%matplotlib tk`
plot(x, sin(x))
figure(figsize=(12,8))
plot(x, cos(x), 'o');
_images/graphing-with-matplotlib_54_0.png _images/graphing-with-matplotlib_54_1.png
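The function figure also returns an object describing the new figure, which gives one way to check the effect of these options without an external window. A sketch, using the attribute number and the method get_size_inches of that object; the names fig_a and fig_b are my own:

```python
from matplotlib.pyplot import figure

fig_a = figure(99)                # ask for figure number 99
fig_b = figure(figsize=(12, 8))   # width 12 and height 8, in inches

print(fig_a.number)               # the number attached to the first figure
print(fig_b.get_size_inches())    # width and height of the second figure
```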

10.10. Decorating the Curves#

Curves can be decorated in different ways. We have already seen some options, and there are many more. One can specify the color, line styles like dashed or dash-dot instead of solid, and many different markers, and one can have both markers and lines. As seen above, this can be controlled by an optional text string argument after the arrays of data for a curve:

figure()
plot(x, sin(x), '*-')
plot(x, cos(x), 'r--');
_images/graphing-with-matplotlib_56_0.png

These three-part curve specifications can be combined: in the following, plot knows that there are two curves each specified by three arguments, not three curves each specified by just an “x-y” pair:

figure()
plot(x, sin(x), 'g-.', x, cos(x), 'm+-.');
_images/graphing-with-matplotlib_58_0.png
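These text strings are a shorthand: as the help for plot explains, the same choices can be spelled out with keyword arguments such as color and linestyle. The following sketch draws the same dashed red curve both ways and compares the results; the names short and spelled are just for illustration.

```python
from numpy import linspace, cos, pi
from matplotlib.pyplot import figure, plot

x = linspace(0, 2*pi)
figure()
short = plot(x, cos(x), 'r--')[0]                         # format string
spelled = plot(x, cos(x), color='r', linestyle='--')[0]   # keyword arguments

print(short.get_color(), short.get_linestyle())
print(spelled.get_color(), spelled.get_linestyle())
```

The keyword form is longer, but it can be easier to read, and it also allows options such as linewidth that have no format string abbreviation.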

Exercise A: Explore ways to refine your figures#

There are many commands for refining the appearance of a figure after its initial creation with plot. Experiment yourself with the commands title, xlabel, ylabel, grid, and legend.

Using the functions mentioned above, produce a refined version of the above sine and cosine graph, with:

  • a title at the top

  • labels on both axes

  • a legend identifying each curve

  • a grid or “graph paper” background, to make it easier to judge details like where a function has zeros.

Exercise B: Saving externally displayed figures to files#

Then work out how to save this figure to a file (probably in format PNG), and turn that in, along with the file used to create it.

This is most readily done with externally displayed figures; that is, with %matplotlib tk. Making that change to tk in a notebook then requires restarting the kernel for it to take effect; use the “fast forward” button, or open the menu Kernel above and select “Restart Kernel and Run All Cells …”

For your own edification, explore other features of externally displayed figures, like zooming and panning: this cannot be done with inline figures.

Getting help from the documentation#

For some of these, you will probably need to read up. For simple things, there is a function help, which is best used in the IPython interactive input window (within Spyder for example), but I will illustrate it here.

The entry for plot is unusually long! It provides details about all the options mentioned above, like marker styles. So this might be a good time to learn how to clear the output in a cell, to unclutter the view: either use the above menu “Edit” or open the menu with Control-click or right-click on the code cell; then use “Clear Outputs” to remove the output of just the current cell.

help(plot)
Help on function plot in module matplotlib.pyplot:

plot(*args, scalex=True, scaley=True, data=None, **kwargs)
    Plot y versus x as lines and/or markers.
    
    Call signatures::
    
        plot([x], y, [fmt], *, data=None, **kwargs)
        plot([x], y, [fmt], [x2], y2, [fmt2], ..., **kwargs)
    
    The coordinates of the points or line nodes are given by *x*, *y*.
    
    The optional parameter *fmt* is a convenient way for defining basic
    formatting like color, marker and linestyle. It's a shortcut string
    notation described in the *Notes* section below.
    
    >>> plot(x, y)        # plot x and y using default line style and color
    >>> plot(x, y, 'bo')  # plot x and y using blue circle markers
    >>> plot(y)           # plot y using x as index array 0..N-1
    >>> plot(y, 'r+')     # ditto, but with red plusses
    
    You can use `.Line2D` properties as keyword arguments for more
    control on the appearance. Line properties and *fmt* can be mixed.
    The following two calls yield identical results:
    
    >>> plot(x, y, 'go--', linewidth=2, markersize=12)
    >>> plot(x, y, color='green', marker='o', linestyle='dashed',
    ...      linewidth=2, markersize=12)
    
    When conflicting with *fmt*, keyword arguments take precedence.
    
    
    **Plotting labelled data**
    
    There's a convenient way for plotting objects with labelled data (i.e.
    data that can be accessed by index ``obj['y']``). Instead of giving
    the data in *x* and *y*, you can provide the object in the *data*
    parameter and just give the labels for *x* and *y*::
    
    >>> plot('xlabel', 'ylabel', data=obj)
    
    All indexable objects are supported. This could e.g. be a `dict`, a
    `pandas.DataFrame` or a structured numpy array.
    
    
    **Plotting multiple sets of data**
    
    There are various ways to plot multiple sets of data.
    
    - The most straight forward way is just to call `plot` multiple times.
      Example:
    
      >>> plot(x1, y1, 'bo')
      >>> plot(x2, y2, 'go')
    
    - If *x* and/or *y* are 2D arrays a separate data set will be drawn
      for every column. If both *x* and *y* are 2D, they must have the
      same shape. If only one of them is 2D with shape (N, m) the other
      must have length N and will be used for every data set m.
    
      Example:
    
      >>> x = [1, 2, 3]
      >>> y = np.array([[1, 2], [3, 4], [5, 6]])
      >>> plot(x, y)
    
      is equivalent to:
    
      >>> for col in range(y.shape[1]):
      ...     plot(x, y[:, col])
    
    - The third way is to specify multiple sets of *[x]*, *y*, *[fmt]*
      groups::
    
      >>> plot(x1, y1, 'g^', x2, y2, 'g-')
    
      In this case, any additional keyword argument applies to all
      datasets. Also this syntax cannot be combined with the *data*
      parameter.
    
    By default, each line is assigned a different style specified by a
    'style cycle'. The *fmt* and line property parameters are only
    necessary if you want explicit deviations from these defaults.
    Alternatively, you can also change the style cycle using
    :rc:`axes.prop_cycle`.
    
    
    Parameters
    ----------
    x, y : array-like or scalar
        The horizontal / vertical coordinates of the data points.
        *x* values are optional and default to ``range(len(y))``.
    
        Commonly, these parameters are 1D arrays.
    
        They can also be scalars, or two-dimensional (in that case, the
        columns represent separate data sets).
    
        These arguments cannot be passed as keywords.
    
    fmt : str, optional
        A format string, e.g. 'ro' for red circles. See the *Notes*
        section for a full description of the format strings.
    
        Format strings are just an abbreviation for quickly setting
        basic line properties. All of these and more can also be
        controlled by keyword arguments.
    
        This argument cannot be passed as keyword.
    
    data : indexable object, optional
        An object with labelled data. If given, provide the label names to
        plot in *x* and *y*.
    
        .. note::
            Technically there's a slight ambiguity in calls where the
            second label is a valid *fmt*. ``plot('n', 'o', data=obj)``
            could be ``plt(x, y)`` or ``plt(y, fmt)``. In such cases,
            the former interpretation is chosen, but a warning is issued.
            You may suppress the warning by adding an empty format string
            ``plot('n', 'o', '', data=obj)``.
    
    Returns
    -------
    list of `.Line2D`
        A list of lines representing the plotted data.
    
    Other Parameters
    ----------------
    scalex, scaley : bool, default: True
        These parameters determine if the view limits are adapted to the
        data limits. The values are passed on to `autoscale_view`.
    
    **kwargs : `.Line2D` properties, optional
        *kwargs* are used to specify properties like a line label (for
        auto legends), linewidth, antialiasing, marker face color.
        Example::
    
        >>> plot([1, 2, 3], [1, 2, 3], 'go-', label='line 1', linewidth=2)
        >>> plot([1, 2, 3], [1, 4, 9], 'rs', label='line 2')
    
        If you specify multiple lines with one plot call, the kwargs apply
        to all those lines. In case the label object is iterable, each
        element is used as labels for each set of data.
    
        Here is a list of available `.Line2D` properties:
    
        Properties:
        agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array
        alpha: scalar or None
        animated: bool
        antialiased or aa: bool
        clip_box: `.Bbox`
        clip_on: bool
        clip_path: Patch or (Path, Transform) or None
        color or c: color
        dash_capstyle: `.CapStyle` or {'butt', 'projecting', 'round'}
        dash_joinstyle: `.JoinStyle` or {'miter', 'round', 'bevel'}
        dashes: sequence of floats (on/off ink in points) or (None, None)
        data: (2, N) array or two 1D arrays
        drawstyle or ds: {'default', 'steps', 'steps-pre', 'steps-mid', 'steps-post'}, default: 'default'
        figure: `.Figure`
        fillstyle: {'full', 'left', 'right', 'bottom', 'top', 'none'}
        gid: str
        in_layout: bool
        label: object
        linestyle or ls: {'-', '--', '-.', ':', '', (offset, on-off-seq), ...}
        linewidth or lw: float
        marker: marker style string, `~.path.Path` or `~.markers.MarkerStyle`
        markeredgecolor or mec: color
        markeredgewidth or mew: float
        markerfacecolor or mfc: color
        markerfacecoloralt or mfcalt: color
        markersize or ms: float
        markevery: None or int or (int, int) or slice or list[int] or float or (float, float) or list[bool]
        path_effects: `.AbstractPathEffect`
        picker: float or callable[[Artist, Event], tuple[bool, dict]]
        pickradius: float
        rasterized: bool
        sketch_params: (scale: float, length: float, randomness: float)
        snap: bool or None
        solid_capstyle: `.CapStyle` or {'butt', 'projecting', 'round'}
        solid_joinstyle: `.JoinStyle` or {'miter', 'round', 'bevel'}
        transform: unknown
        url: str
        visible: bool
        xdata: 1D array
        ydata: 1D array
        zorder: float
    
    See Also
    --------
    scatter : XY scatter plot with markers of varying size and/or color (
        sometimes also called bubble chart).
    
    Notes
    -----
    **Format Strings**
    
    A format string consists of a part for color, marker and line::
    
        fmt = '[marker][line][color]'
    
    Each of them is optional. If not provided, the value from the style
    cycle is used. Exception: If ``line`` is given, but no ``marker``,
    the data will be a line without markers.
    
    Other combinations such as ``[color][marker][line]`` are also
    supported, but note that their parsing may be ambiguous.
    
    **Markers**
    
    =============   ===============================
    character       description
    =============   ===============================
    ``'.'``         point marker
    ``','``         pixel marker
    ``'o'``         circle marker
    ``'v'``         triangle_down marker
    ``'^'``         triangle_up marker
    ``'<'``         triangle_left marker
    ``'>'``         triangle_right marker
    ``'1'``         tri_down marker
    ``'2'``         tri_up marker
    ``'3'``         tri_left marker
    ``'4'``         tri_right marker
    ``'8'``         octagon marker
    ``'s'``         square marker
    ``'p'``         pentagon marker
    ``'P'``         plus (filled) marker
    ``'*'``         star marker
    ``'h'``         hexagon1 marker
    ``'H'``         hexagon2 marker
    ``'+'``         plus marker
    ``'x'``         x marker
    ``'X'``         x (filled) marker
    ``'D'``         diamond marker
    ``'d'``         thin_diamond marker
    ``'|'``         vline marker
    ``'_'``         hline marker
    =============   ===============================
    
    **Line Styles**
    
    =============    ===============================
    character        description
    =============    ===============================
    ``'-'``          solid line style
    ``'--'``         dashed line style
    ``'-.'``         dash-dot line style
    ``':'``          dotted line style
    =============    ===============================
    
    Example format strings::
    
        'b'    # blue markers with default shape
        'or'   # red circles
        '-g'   # green solid line
        '--'   # dashed line with default color
        '^k:'  # black triangle_up markers connected by a dotted line
    
    **Colors**
    
    The supported color abbreviations are the single letter codes
    
    =============    ===============================
    character        color
    =============    ===============================
    ``'b'``          blue
    ``'g'``          green
    ``'r'``          red
    ``'c'``          cyan
    ``'m'``          magenta
    ``'y'``          yellow
    ``'k'``          black
    ``'w'``          white
    =============    ===============================
    
    and the ``'CN'`` colors that index into the default property cycle.
    
    If the color is the only part of the format string, you can
    additionally use any  `matplotlib.colors` spec, e.g. full names
    (``'green'``) or hex strings (``'#008000'``).
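With an entry this long, it can be handy to look at just the start of it. One possible approach, sketched here, is to fetch the documentation text as an ordinary string with the function getdoc from the standard module inspect, and then print only the first few lines; the choice of five lines is arbitrary.

```python
import inspect
from matplotlib.pyplot import plot

# getdoc returns the documentation string of plot, which is the text
# that help builds on, with its indentation cleaned up
doc_lines = inspect.getdoc(plot).splitlines()
print(f"The entry has {len(doc_lines)} lines; here are the first five:")

# Show only the first few lines instead of the whole entry
for line in doc_lines[:5]:
    print(line)
```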

The jargon used in help can be confusing at first; fortunately there are other online sources that are more readable and better illustrated, like http://scipy-lectures.github.io/intro/matplotlib/matplotlib.html mentioned above.

However, that does not cover everything; the official pyplot documentation at http://matplotlib.org/1.3.1/api/pyplot_api.html is more complete: explore its search feature.