📊 What is Data Distribution?

The distribution of a dataset is the way the values are spread across the different ranges. When we draw a histogram, the bars form a shape, and that shape is what we call the distribution.
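
To make the idea of "bars forming a shape" concrete, here is a minimal Python sketch that groups values into equal-width ranges and prints one text bar per range. The function name `text_histogram`, the bin width of 10, and the exam scores are all made up for illustration; they are not part of the lesson's data.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Group whole-number values into equal-width ranges; return one text bar per range."""
    # Map each value to the lower edge of its range, e.g. 17 -> 10 when bin_width is 10.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in range(min(counts), max(counts) + bin_width, bin_width):
        bar = "#" * counts.get(start, 0)  # one '#' per value in this range
        lines.append(f"{start:>3}-{start + bin_width - 1:<3} {bar}")
    return lines

# Made-up exam scores, for illustration only.
scores = [42, 55, 58, 61, 63, 65, 66, 68, 71, 72, 74, 77, 83, 95]
for line in text_histogram(scores, bin_width=10):
    print(line)
```

Reading the printed bars from top to bottom gives the same impression as reading a histogram from left to right: here the longest bars sit in the middle ranges.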

Different datasets have different shapes. Some are balanced and smooth, others lean to one side, and others have more than one peak. Each shape tells us something useful about the data.

Why Do We Look at the Shape?

The shape helps us understand the data quickly. For example:

  • Are most values close to each other or are they spread out?
  • Are the high values more common, or the low values?
  • Is there a single common range, or are there a few common ranges?

💡 Key Idea

Before doing any calculations, look at the shape of the histogram. The shape gives you a quick first impression of the data.

🔔 The Symmetric (Normal) Shape

One of the most common shapes in real life is the symmetric, bell-shaped distribution. Its best-known form is the Normal Distribution, also called the Bell Curve.

What Does It Look Like?

In a symmetric distribution, the bars are tallest in the middle and get shorter as we move to the left or right. Both sides look the same, like a mirror image.

What Does It Mean?

  • Most values are close to the middle.
  • Very high and very low values are uncommon.
  • The mean, median, and mode are very close to each other.

Real-Life Examples

  • The heights of adult men or women in a country.
  • The weights of bags of rice produced in a factory.
  • Test scores in a fair exam where most students score around the average.
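
As a quick check of the claim that the mean, median, and mode sit close together in a symmetric shape, the sketch below uses Python's standard `statistics` module on a small list of heights. The heights are invented and deliberately arranged to be symmetric around 170 cm, so this is an idealized case rather than real measurements.

```python
import statistics

# Made-up heights in cm, deliberately symmetric around 170.
heights = [160, 165, 165, 170, 170, 170, 175, 175, 180]

mean = statistics.mean(heights)      # sum of values divided by how many there are
median = statistics.median(heights)  # middle value after sorting
mode = statistics.mode(heights)      # most frequent value

print(f"mean={mean}, median={median}, mode={mode}")
```

With perfectly symmetric data like this, all three measures land on the same value. Real data is rarely this tidy, so expect them to be close rather than identical.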

↗️ Skewed Distributions

When a histogram is not symmetric, we say it is skewed. A skewed distribution has a longer "tail" on one side. There are two types, right-skewed and left-skewed, shown here next to the symmetric shape for comparison:

🔔 Symmetric

Both sides look the same. Mean ≈ Median.

Example: heights of students.

↗️ Right-Skewed

The tail points to the right (higher values). Most values are small, but a few are very large.

Example: family income.

↖️ Left-Skewed

The tail points to the left (lower values). Most values are large, but a few are very small.

Example: scores on an easy quiz.

Right-Skewed (Positive Skew)

Most of the values are on the left side of the chart, with a tail stretching to the right. This means most people or items have small values, while a small number have large values.

Left-Skewed (Negative Skew)

Most of the values are on the right side of the chart, with a tail stretching to the left. This means most people or items have large values, while a small number have small values.

Easy Way to Remember

The skew is named after the side where the tail points, not where the bars are tallest.
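
One way to let a program name the tail side is to compute a skewness number and look at its sign: positive means the tail is on the right, negative means it is on the left. The sketch below uses the common moment-based skewness formula, which this lesson does not introduce. The function name `skew_direction`, the `tolerance` cutoff of 0.1, and the sample data are assumptions made for illustration.

```python
import statistics

def skew_direction(values, tolerance=0.1):
    """Return 'right', 'left', or 'symmetric' from the sign of the moment-based skewness."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # average squared distance from the mean
    m3 = sum((v - mean) ** 3 for v in values) / n  # average cubed distance (keeps the sign)
    if m2 == 0:
        return "symmetric"  # all values are identical, so there is no tail
    skewness = m3 / m2 ** 1.5
    # 'tolerance' is an arbitrary illustrative cutoff, not a standard value.
    if skewness > tolerance:
        return "right"
    if skewness < -tolerance:
        return "left"
    return "symmetric"

# Made-up data: incomes in thousands, and scores out of 10 on an easy quiz.
print(skew_direction([20, 22, 25, 27, 30, 35, 120]))  # a few very large values
print(skew_direction([2, 7, 8, 9, 9, 10, 10]))        # a few very small values
print(skew_direction([1, 2, 3, 4, 5]))                # balanced on both sides
```

Cubing the distances is what makes this work: values far out in the tail dominate the sum, and the sign of the sum shows which side they are on.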

🎯 Mean vs. Median in Skewed Data

The shape of the data tells us which measure of center to trust more: the mean (average) or the median (middle value).

In a Symmetric Distribution

The mean and the median are very close. Either one is a good way to describe the center.

In a Skewed Distribution

The mean is affected by very large or very small values, so it gets pulled toward the tail. The median is not affected by extreme values as much, because it only depends on the middle position.

  • Right-skewed: The mean is usually higher than the median.
  • Left-skewed: The mean is usually lower than the median.
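
The two rules above can be checked with a few lines of Python. The two lists are invented so that one has a long tail toward high values and the other a long tail toward low values; they only illustrate the usual pattern and do not prove it for every dataset.

```python
import statistics

# Made-up data for illustration only.
right_skewed = [1, 2, 2, 3, 3, 4, 20]       # long tail toward high values
left_skewed = [1, 17, 18, 18, 19, 19, 20]   # long tail toward low values

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    side = "higher" if mean > median else "lower"
    print(f"{name}: mean={mean}, median={median} (mean is {side} than the median)")
```

In both lists a single extreme value drags the mean toward the tail, while the median stays with the bulk of the values.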

Simple Example

Imagine a small group of 5 friends whose monthly allowances are 100, 100, 200, 200, and 5,000 EGP.

  • Mean = (100 + 100 + 200 + 200 + 5000) ÷ 5 = 1,120 EGP
  • Median = middle value = 200 EGP

Here the mean of 1,120 EGP is misleading because four out of five friends actually have 100 or 200 EGP. The median of 200 EGP is a much better description of a "typical" friend in this group.
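
The same arithmetic can be reproduced with Python's standard `statistics` module. The sketch also recomputes both measures without the 5,000 EGP allowance, an extra step that is not in the example above, to show how much a single value moves the mean.

```python
import statistics

allowances = [100, 100, 200, 200, 5000]  # monthly allowances in EGP, from the example

mean = statistics.mean(allowances)      # (100 + 100 + 200 + 200 + 5000) / 5
median = statistics.median(allowances)  # middle value of the sorted list
print(f"With the 5,000 EGP friend:    mean={mean}, median={median}")

# Extra step for illustration: leave out the one extreme value.
without_extreme = [a for a in allowances if a != 5000]
mean_small = statistics.mean(without_extreme)
median_small = statistics.median(without_extreme)
print(f"Without the 5,000 EGP friend: mean={mean_small}, median={median_small}")
```

Removing one friend changes the mean from 1,120 to 150 EGP, while the median only moves from 200 to 150 EGP.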

💡 Useful Tip

When a distribution is strongly skewed, the median is usually a more useful measure of the center than the mean.

💼 Why the Shape Matters

Looking at the shape of a distribution helps us make better decisions. With one quick look at a histogram you can:

  • Decide whether to use the mean or the median to describe the data.
  • Spot if the data leans more to one side.
  • Notice values that look different from the rest (we will learn about these in the next topics).
  • Compare two groups easily by drawing two histograms side by side.
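
For the last point, comparing two groups side by side, the sketch below prints the bars of two groups on the same rows so their shapes can be compared range by range. The function names `bin_counts` and `side_by_side`, the bin width, and both sets of class scores are hypothetical and used only for illustration.

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Count how many whole-number values fall in each range, keyed by the range's lower edge."""
    return Counter((v // bin_width) * bin_width for v in values)

def side_by_side(group_a, group_b, bin_width):
    """Return text rows showing both groups' bars for the same ranges."""
    a = bin_counts(group_a, bin_width)
    b = bin_counts(group_b, bin_width)
    lowest = min(min(a), min(b))
    highest = max(max(a), max(b))
    rows = []
    for start in range(lowest, highest + bin_width, bin_width):
        left = "#" * a.get(start, 0)   # group A bar
        right = "*" * b.get(start, 0)  # group B bar
        rows.append(f"{start:>3} | {left:<8} | {right}")
    return rows

# Made-up scores for two classes, for illustration only.
class_a = [55, 62, 64, 68, 71, 73, 75, 84]
class_b = [35, 72, 78, 81, 83, 85, 88, 91]
for row in side_by_side(class_a, class_b, bin_width=10):
    print(row)
```

In this made-up data, class A's bars are balanced around the 60s and 70s, while class B's bars pile up at the high end with one low score forming a tail on the low side.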

Quick Comparison

| Shape | What it looks like | Best center to use |
|---|---|---|
| Symmetric | Both sides equal, peak in the middle | Mean or Median (both are close) |
| Right-skewed | Long tail to the right | Median (mean is pulled higher) |
| Left-skewed | Long tail to the left | Median (mean is pulled lower) |

In a right-skewed distribution, where is the tail?

Key Points

  • The distribution is the shape made by the bars of a histogram.
  • A symmetric distribution looks the same on both sides; the mean and median are close.
  • A right-skewed distribution has a long tail on the right (a few large values).
  • A left-skewed distribution has a long tail on the left (a few small values).
  • In skewed data, the median usually describes the center better than the mean.