Statistics In Python

In the era of big data and artificial intelligence, data science and machine learning have become essential in many fields of science and technology. A necessary part of working with data is describing, summarising, and visually representing it. Python is a popular and widely used language for this work, and its statistics libraries will assist you at each of these steps.

There are many Python statistics libraries to choose from, but in this book, you’ll be learning about some of the most popular and widely used ones:

  • statistics is a module in Python’s standard library for descriptive statistics. You can use it if your datasets are not too large or if you can’t rely on importing other libraries.
  • NumPy is a third-party library for numerical computing, optimized for working with single- and multi-dimensional arrays. Its primary type is the array type called ndarray. This library contains many routines for statistical analysis.
  • SciPy is a third-party library for scientific computing based on NumPy. It offers additional functionality compared to NumPy, including scipy.stats for statistical analysis.
  • Pandas is a third-party library for data analysis and manipulation based on NumPy. It excels in handling labelled one-dimensional (1D) data with Series objects and two-dimensional (2D) data with DataFrame objects.
  • Matplotlib is a third-party library for data visualization. It works well in combination with NumPy, SciPy, and Pandas.
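
To give a feel for how these libraries relate, here is a minimal sketch that computes the same mean and sample variance with statistics, NumPy, SciPy, and Pandas. The sample values, the index labels, and the variable names are made up for illustration, and the sketch assumes the three third-party libraries are installed. Matplotlib is left out because it draws figures rather than computing values.

```python
import statistics

import numpy as np
import pandas as pd
from scipy import stats

# Made-up sample values, used only for illustration.
data = [2.0, 4.0, 4.0, 5.0, 10.0]

# Standard library: works directly on a plain Python list.
mean_builtin = statistics.mean(data)
var_builtin = statistics.variance(data)  # sample variance (divides by n - 1)

# NumPy: convert the list to an ndarray, then call array methods.
arr = np.array(data)
mean_np = arr.mean()
var_np = arr.var(ddof=1)  # ddof=1 requests the sample variance

# SciPy: describe() returns several summary statistics in one call.
summary = stats.describe(arr)  # its variance uses ddof=1 by default

# Pandas: a Series attaches a label to every value.
series = pd.Series(data, index=["a", "b", "c", "d", "e"])
mean_pd = series.mean()
var_pd = series.var()  # sample variance by default

print(mean_builtin, mean_np, summary.mean, mean_pd)
print(var_builtin, var_np, summary.variance, var_pd)
```

One detail to watch is that NumPy’s var() divides by n unless you pass ddof=1, whereas statistics.variance(), scipy.stats.describe(), and the Pandas var() method divide by n - 1 by default.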