This is part of the course “Probability Theory and Statistics for Programmers”.

Probability Theory and Statistics For Programmers

In the previous article, we saw one form of data representation: the empirical distribution function. This time we will look at a representation that is more convenient for large amounts of data: the histogram.

Suppose that we have the results of observing a random variable X. In our example, we assume that the variable can take any value in the interval from 0 to 10. The sorted result of our observations:

Then we divide the whole range of observed values into intervals, count the number of values in each interval, and divide that count by the total number of observations. The result is the relative frequency of each interval. The length of the interval depends on how you want to represent the data. Let’s take an interval length of 2, which splits the range from 0 to 10 into five intervals.
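The counting step described above can be sketched in plain Python. This is a minimal illustration, not the article's original code: the function name `relative_frequencies` and the ten sample observations are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def relative_frequencies(observations, low, high, width):
    """Return the share of observations that fall into each interval."""
    n_bins = round((high - low) / width)
    counts = [0] * n_bins
    for x in observations:
        # Clamp the index so that x == high lands in the last interval.
        index = min(int((x - low) / width), n_bins - 1)
        counts[index] += 1
    total = len(observations)
    return [count / total for count in counts]


# Hypothetical sorted observations in the interval from 0 to 10.
observations = [0.4, 1.1, 1.7, 2.3, 3.9, 4.2, 4.8, 5.5, 6.1, 9.6]

frequencies = relative_frequencies(observations, low=0, high=10, width=2)
for i, frequency in enumerate(frequencies):
    print(f"from {2 * i} to {2 * i + 2}: {frequency:.1f}")
```

With these sample values, the five intervals get the relative frequencies 0.3, 0.2, 0.3, 0.1 and 0.1, which sum to 1, as relative frequencies always do.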


Now let’s draw histograms with different numbers of intervals using matplotlib:
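A possible way to do this is sketched below. The original observations are not available here, so the data is an assumption: 1000 values drawn uniformly from the interval from 0 to 10 with a fixed seed. The helper name `draw_histograms` and the choice of 5, 10 and 20 intervals are also illustrative. Passing equal `weights` of 1/n to `hist` makes the bar heights relative frequencies rather than raw counts.

```python
import random

import matplotlib.pyplot as plt


def draw_histograms(observations, bin_counts, value_range=(0, 10)):
    """Draw one histogram per entry in bin_counts, side by side."""
    fig, axes = plt.subplots(
        1, len(bin_counts), figsize=(4 * len(bin_counts), 3), squeeze=False
    )
    # Equal weights of 1/n turn raw counts into relative frequencies.
    weights = [1 / len(observations)] * len(observations)
    for ax, bins in zip(axes[0], bin_counts):
        ax.hist(observations, bins=bins, range=value_range,
                weights=weights, edgecolor="black")
        ax.set_title(f"{bins} intervals")
        ax.set_xlabel("x")
        ax.set_ylabel("relative frequency")
    fig.tight_layout()
    return fig, list(axes[0])


# Hypothetical data: 1000 values drawn uniformly from the interval 0 to 10.
random.seed(0)
observations = [random.uniform(0, 10) for _ in range(1000)]

fig, axes = draw_histograms(observations, bin_counts=[5, 10, 20])

if __name__ == "__main__":
    plt.show()
```

Fewer intervals give a smoother but coarser picture, while more intervals show finer detail at the cost of noisier bars, so it is worth trying several values on the same data.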



Written by

Software engineer, creator of increaser.org. More at geekrodion.com
