What is the Mean?
The mean, often called the average, is calculated by summing all the values in a dataset and then dividing by the number of values. It is a common measure of central tendency.
- Formula: The mean ($ \bar{x} $) is calculated as: $ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} $, where $ x_i $ represents each value in the dataset and $ n $ is the number of values.
- Example: For the dataset [2, 4, 6, 8, 10], the mean is (2+4+6+8+10)/5 = 6.
- Sensitivity: The mean is highly sensitive to outliers, because a single extreme value can pull it far from the bulk of the data. Replacing 10 with 100 in the dataset above raises the mean from 6 to (2+4+6+8+100)/5 = 24.
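The formula above can be sketched in a few lines of Python. The function name `arithmetic_mean` and the outlier dataset are illustrative choices, not part of the original text.

```python
def arithmetic_mean(values):
    """Sum the values and divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


data = [2, 4, 6, 8, 10]
with_outlier = [2, 4, 6, 8, 100]  # same data, with 10 replaced by 100

print(arithmetic_mean(data))          # 6.0
print(arithmetic_mean(with_outlier))  # 24.0
```

Changing a single value moves the result from 6 to 24, which illustrates how strongly one outlier can shift the mean.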
What is the Median?
The median is the middle value in a dataset when the values are arranged in ascending order. If there's an even number of values, the median is the average of the two middle values.
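- Finding the Median: Sort the $ n $ values in ascending order. If $ n $ is odd, the median is the value at position $ (n+1)/2 $; if $ n $ is even, it is the average of the values at positions $ n/2 $ and $ n/2 + 1 $.
- Example: For the dataset [2, 4, 6, 8, 10], the median is 6. For [2, 4, 6, 8], the median is (4+6)/2 = 5.
- Robustness: The median is resistant to outliers; extreme values don't affect it much. In [2, 4, 6, 8, 100], the median is still 6.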
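The procedure above could be written as follows. This is a minimal sketch; the function name `median_of` is a hypothetical choice for illustration.

```python
def median_of(values):
    """Return the middle value of the sorted data.

    For an even number of values, return the average of the two middle values.
    """
    if not values:
        raise ValueError("the median of an empty dataset is undefined")
    ordered = sorted(values)  # sorted() copies, so the input is not modified
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([2, 4, 6, 8, 10]))   # 6
print(median_of([2, 4, 6, 8]))       # 5.0
print(median_of([2, 4, 6, 8, 100]))  # 6
```

The last call shows the robustness property: replacing 10 with 100 leaves the median unchanged.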
Mean vs. Median: A Detailed Comparison
| Feature | Mean | Median |
| --- | --- | --- |
| Definition | The sum of all values divided by the number of values. | The middle value when the data is ordered. |
| Calculation | $ \frac{\sum x}{n} $ | Middle value or average of the two middle values. |
| Sensitivity to Outliers | Highly sensitive; outliers can significantly skew the result. | Resistant to outliers; not significantly affected by extreme values. |
| Use Cases | Useful when the data is normally distributed and there are no significant outliers. | Useful when the data is skewed or contains outliers. |
| Example | Average income. | Typical house price in an area. |
Key Takeaways
- Outlier Impact: Outliers have a substantial impact on the mean but minimal impact on the median.
- Data Distribution: If your data is symmetrical and without outliers, the mean and median will be similar. If the data is skewed, the median usually represents the typical value better.
- Choosing the Right Measure: Select the mean when you want every data point to contribute to the result and the data is roughly symmetric. Opt for the median when you want a measure that's not easily influenced by extreme values.
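One way to apply these takeaways is to compute both measures and look at the gap between them. The sketch below uses Python's standard `statistics` module; the helper `describe` is a hypothetical name, and it reuses the small datasets from the examples above. It does not prescribe a threshold for what counts as a "large" gap, since that depends on the data.

```python
from statistics import mean, median


def describe(values):
    """Return (mean, median, mean - median) for a dataset."""
    m = mean(values)
    med = median(values)
    return m, med, m - med


datasets = {
    "symmetric": [2, 4, 6, 8, 10],
    "with outlier": [2, 4, 6, 8, 100],
}

for label, values in datasets.items():
    m, med, gap = describe(values)
    print(f"{label}: mean={m}, median={med}, gap={gap}")
```

For the symmetric dataset the two measures agree, so either works. With the outlier the mean is 18 above the median, a hint that the median is closer to a typical value.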