Plotting a histogram in Python is straightforward for beginners once they know the right approach. This article focuses on two libraries used for plotting: Matplotlib and Seaborn. Keep reading to explore them.
What Is A Histogram?
A histogram is often used to make a quick assessment of a probability distribution. A single histogram lets a reader see at a glance how the values of a variable are spread out.
Python provides you with a handful of options to build and plot a histogram. Most people think of a histogram in its graphical form, which looks like a bar chart.
The first step in building a histogram is to divide the range of values into bins, assign each value to a bin, and count how many values fall into each one. Bins are consecutive, non-overlapping intervals of the variable.
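To make the binning step concrete, here is a minimal sketch in plain Python. The helper name histogram_counts and the sample values are made up for illustration, and the sketch assumes equal-width bins and at least two distinct values.

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins.

    Hypothetical helper for illustration; assumes max(values) > min(values).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    # num_bins + 1 edges define num_bins consecutive, non-overlapping intervals
    edges = [lo + i * width for i in range(num_bins + 1)]
    counts = [0] * num_bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    return counts, edges


values = [1, 2, 2, 3, 5, 8, 9, 10]  # made-up sample data
counts, edges = histogram_counts(values, 3)
print(counts)  # number of values per bin
print(edges)   # bin boundaries
```

Plotting libraries do this counting for you, so in practice you rarely need to write it by hand.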
How To Plot A Histogram
By The matplotlib.pyplot.hist() Function
The matplotlib.pyplot.hist() function computes and draws a histogram of x. Its parameters include x, bins, density, range, histtype, align, weights, bottom, rwidth, color, label, and log.
These parameters let you customize the histogram. In addition, hist() returns the bin counts, the bin edges, and the drawn bar objects, so you can adjust the bars' properties afterwards.
First, let’s import the module with the following command:
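As a rough illustration of a few of these parameters and of the returned objects, consider the sketch below. The sample data is made up, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt

# Made-up sample data, not from the article's dataset
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

# hist() returns the counts per bin, the bin edges, and the bar objects
counts, edges, patches = plt.hist(
    data,
    bins=4,          # number of equal-width bins
    range=(1, 9),    # lower and upper limit of the bins
    color="steelblue",
    label="sample",
)

# Each patch is one bar; its properties can be changed after plotting
patches[0].set_facecolor("orange")
plt.legend()
```

Here counts holds one number per bin, while edges holds one more entry than counts because it lists every bin boundary.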
import matplotlib.pyplot as plt
Now pick one column and plot it. Here, df is assumed to be a pandas DataFrame that contains an Apps column. There are two ways to plot the column: call plot() on the DataFrame column directly, or pass the column to a pyplot function.
df['Apps'].plot(kind='hist')
The number of bins is a crucial plotting parameter, and it defaults to 10. A higher number of bins produces more, narrower bars, so the data can be seen at a finer granularity.
df['Apps'].plot(kind='hist',bins=15)
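The article never shows how df is loaded, so the sketch below builds a small made-up DataFrame with an Apps column to compare the default bin count with bins=15. The values and the side-by-side layout are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the article's DataFrame
df = pd.DataFrame(
    {"Apps": [81, 120, 450, 900, 1500, 2300, 4800, 9600, 20000, 48094]}
)

# Two axes side by side so the plots do not overwrite each other
fig, (ax_default, ax_fine) = plt.subplots(1, 2, figsize=(10, 4))

df["Apps"].plot(kind="hist", ax=ax_default)        # default: 10 bins
df["Apps"].plot(kind="hist", bins=15, ax=ax_fine)  # finer: 15 bins

ax_default.set_title("bins=10 (default)")
ax_fine.set_title("bins=15")
```

Each bar drawn on an axes is stored in its patches list, so len(ax_fine.patches) reports how many bars the plot contains.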
The pyplot interface also offers more options and flexibility for controlling the figure. Let’s try plotting the same column with plt.plot():
plt.plot(df['Apps'])
This call draws a line plot rather than a histogram. To get a histogram, use the pyplot hist() function instead:
Input:
plt.hist(df['Apps'])
Output:
(array([638., 92., 31., 11., 4., 0., 0., 0., 0., 1.]),
array([ 81. , 4882.3, 9683.6, 14484.9, 19286.2, 24087.5, 28888.8,
33690.1, 38491.4, 43292.7, 48094. ]),
<a list of 10 Patch objects>)
Overall, Matplotlib gives you fine control over a plot’s figure and axes. In this case, the histogram has an x-axis and a y-axis drawn inside a bounding figure. You can access both the figure and the axes objects, which makes it easy to control the plot’s size and labels.
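One common way to get hold of the figure and axes objects is plt.subplots(). The sketch below uses made-up values in place of the Apps column, and the size and label text are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so no display is required
import matplotlib.pyplot as plt

# Made-up values standing in for the Apps column
apps = [81, 120, 450, 900, 1500, 2300, 4800, 9600, 20000, 48094]

# figsize is given in inches as (width, height)
fig, ax = plt.subplots(figsize=(8, 4))
ax.hist(apps, bins=10)

# The axes object controls labels and the title
ax.set_xlabel("Apps")
ax.set_ylabel("Frequency")
ax.set_title("Distribution of Apps")
```

Working with fig and ax directly, rather than with the implicit pyplot state, tends to be easier to follow once a script contains more than one plot.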
By Seaborn
Seaborn is built on top of Matplotlib and lets you draw eye-catching plots quickly with very little code. Let’s import the library:
Input:
import seaborn as sns
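By default, the histogram created by distplot() includes a density line, and you can remove it with the kde=False option. Note that distplot() has been deprecated since Seaborn 0.11 in favor of histplot() and displot().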
Input:
sns.distplot(df['Apps'],kde=False)
Output:
<matplotlib.axes._subplots.AxesSubplot at 0x7f3b2acb24d0>
With kde=True, the y-axis no longer shows the frequency of the variable’s values. Instead, it shows density. Because Seaborn draws on Matplotlib axes, you can also use pyplot to control the plot.
Conclusion
A histogram is useful for representing the frequencies of a variable and gives a clear view of the data distribution. This article covered two main ways to plot a histogram in Python: Matplotlib and Seaborn.