Identifying common data observation patterns is a core data literacy skill. Anyone using data to inform their work must be able to detect and describe the structure of the information in front of them. Here are some of the most common data observation patterns and what they mean.
If you look at your data and see your data points in grouped batches instead of following a more uniform distribution, you’re observing data clustering. Clustered data points have more in common with their neighbors than they do with data points outside of their neighborhood.
Clustering data observation patterns may be obvious (like in the example above) or they might be harder to spot. In many cases, you may need to perform a bona fide cluster analysis before you can expose the patterns. Those methods are beyond the scope of this post, but future posts will cover clustering in detail. For now, know that you should always be on the lookout for overt clustering patterns in all the data you encounter.
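If your data is one-dimensional, one quick way to check for overt clusters is to sort the values and look for unusually large gaps. Here’s a minimal Python sketch of that idea. The function name `split_on_gaps`, the `min_gap` threshold, and the sample ages are all made up for illustration, and this is a rough eyeball aid, not a substitute for a real cluster analysis.

```python
def split_on_gaps(values, min_gap):
    """Group 1-D values, starting a new group at every gap larger than min_gap."""
    ordered = sorted(values)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > min_gap:
            groups.append([cur])      # big gap: start a new group
        else:
            groups[-1].append(cur)    # small gap: same neighborhood
    return groups


# Hypothetical sample: customer ages
ages = [21, 23, 22, 45, 47, 44, 46, 70, 72]
groups = split_on_gaps(ages, min_gap=10)
print(groups)  # [[21, 22, 23], [44, 45, 46, 47], [70, 72]]
```

The right `min_gap` depends entirely on your data, so treat the result as a prompt to take a closer look rather than a final answer.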
Skewed data appears distorted in one direction or another. In other words, it is asymmetrically distributed along an axis. This type of observation pattern is easy to spot. You see it all the time in business or any situation where the average measurement doesn’t fall around the middle of the data. Skewed data is usually described as either positively skewed (a long tail stretching toward higher values) or negatively skewed (a long tail stretching toward lower values).
Skewing is one of the most important data observation patterns to spot because it has important implications for summary statistics like the average (arithmetic mean). A handful of extreme values in the long tail pulls the mean toward them, so in skewed datasets the median is usually a better choice than the mean when you’re trying to find the “typical value”.
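To see the effect for yourself, here’s a small sketch using only Python’s standard library. The salary figures are invented, and `skewness` is a helper I’m defining here (a simple moment-based skewness coefficient), not a built-in function.

```python
import statistics


def skewness(values):
    """Moment-based skewness: positive means a longer right tail, negative a longer left tail."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment (variance)
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5  # raises ZeroDivisionError if every value is identical


# Hypothetical salaries: six typical values plus one very large one
salaries = [38_000, 41_000, 42_000, 45_000, 47_000, 52_000, 250_000]

print("mean:    ", round(statistics.mean(salaries)))
print("median:  ", statistics.median(salaries))
print("skewness:", round(skewness(salaries), 2))
```

The median stays at 45,000 while the single large salary drags the mean above 73,000, and the positive skewness value confirms the long right tail.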
Do the data points appear directional? If so, you have observed a data trend. Trending is probably the most ubiquitous data observation pattern and is familiar to even the most novice data observer. You hear about trends all the time, and they fall into a few key subgroups:
Linear trends follow a “straight line” upward or downward direction. These are very easy to spot in charts and are probably the most common trend observation you’ll encounter. You see them all the time in business scenarios like meetings where teams illustrate a steady increase or decrease in sales figures over time.
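If you want a number to go with the picture, an ordinary least-squares fit gives you the slope of the “straight line.” Here’s a minimal sketch; `linear_fit` is a helper written for this example and the monthly sales figures are made up.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly sales figures
months = [1, 2, 3, 4, 5, 6]
sales = [100, 112, 119, 131, 140, 152]

slope, intercept = linear_fit(months, sales)
print(f"sales change by about {slope:.1f} per month")
```

A positive slope means an upward trend and a negative slope means a downward one. The size of the slope tells you how fast the figure is moving per unit of time.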
Data exhibiting an exponential trend follows a non-linear curve that changes by a roughly constant percentage, rather than a constant amount, over each interval. You see these trends if you’re looking at something like the meteoric rise of a stock price or the explosive growth of a new product. You’ll also see them all the time in disciplines like the life sciences where researchers are illustrating growth patterns. Certain physical phenomena like radioactive decay also follow exponential trends.
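One common way to put numbers on an exponential trend is to take the logarithm of the values and fit a straight line to the result. The sketch below does that; `fit_exponential` and the weekly user counts are invented for illustration. Keep in mind that fitting on the log scale is a convenient approximation, not the same thing as a full non-linear fit, and it only works when every value is positive.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on log(y). Requires every y > 0."""
    log_ys = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
    b = sxy / sxx                          # growth rate per unit of x
    a = math.exp(mean_ly - b * mean_x)     # fitted value at x = 0
    return a, b


# Hypothetical weekly user counts that double every week
weeks = [0, 1, 2, 3, 4, 5]
users = [5, 10, 20, 40, 80, 160]

a, b = fit_exponential(weeks, users)
doubling_time = math.log(2) / b
print(f"starting value ~{a:.1f}, doubling every ~{doubling_time:.1f} weeks")
```

A negative `b` would indicate exponential decay instead of growth, in which case `math.log(2) / -b` gives the half-life.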
A logarithmic trend looks similar to an exponential trend, and there’s an excellent reason for that. A logarithmic trend is the inverse of an exponential trend!
For example, the growth curve of an exponential growth trend gets steeper over time. That’s the opposite of a logarithmic growth trend. Logarithmic growth curves get shallower over time. You’ll often see this trend in business scenarios where you have progressive market saturation.
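If you can’t tell by eye which shape you’re looking at, one rough heuristic is to check which transformation makes the data look most like a straight line. The sketch below compares three candidates using the Pearson correlation. The names `pearson_r` and `guess_trend_shape` are my own for this example, the sample curves are synthetic, and the approach assumes all x and y values are positive.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def guess_trend_shape(xs, ys):
    """Return the shape whose straightening transform gives the strongest correlation."""
    scores = {
        "linear": abs(pearson_r(xs, ys)),
        # Exponential data becomes a straight line when you take log(y)
        "exponential": abs(pearson_r(xs, [math.log(y) for y in ys])),
        # Logarithmic data becomes a straight line when you take log(x)
        "logarithmic": abs(pearson_r([math.log(x) for x in xs], ys)),
    }
    return max(scores, key=scores.get)


# Synthetic curves for illustration
xs = [1, 2, 3, 4, 5, 6, 7, 8]
exp_ys = [3 * 1.5 ** x for x in xs]          # gets steeper over time
log_ys = [10 + 4 * math.log(x) for x in xs]  # gets shallower over time

print(guess_trend_shape(xs, exp_ys))  # exponential
print(guess_trend_shape(xs, log_ys))  # logarithmic
```

With noisy real-world data the three scores can be close, so use this as a hint about which shape to investigate, not as proof.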
If you’re particularly data-savvy, you might even spot logistic trends. Logistic growth, for instance, starts with a gradual increase, moves into a sudden sharp rise, and then levels off as it approaches a ceiling. You’ve probably seen this pattern before if you work in marketing. It often occurs in business scenarios where increased advertising spend stimulates rapid sales growth followed by saturation.
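Here’s a short sketch of the standard logistic function so you can see the S-shape in numbers. The parameter names (`capacity`, `rate`, `midpoint`) and their values are assumptions chosen for illustration.

```python
import math


def logistic(t, capacity, rate, midpoint):
    """Standard logistic curve: slow start, rapid rise, then leveling off at capacity."""
    return capacity / (1 + math.exp(-rate * (t - midpoint)))


# Hypothetical sales curve that saturates at 1,000 units
values = [logistic(t, capacity=1000, rate=1.0, midpoint=6) for t in range(13)]

# Growth per step: small at first, largest near the midpoint, small again at the end
increments = [later - earlier for earlier, later in zip(values, values[1:])]

for t, (value, step) in enumerate(zip(values, increments)):
    print(f"t={t:2d}  value={value:7.1f}  growth={step:6.1f}")
```

The growth column is the giveaway. In a logistic trend it rises, peaks around the midpoint, and then shrinks back toward zero as the curve nears its ceiling.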
Non-uniform trends are also common. Data points (particularly data charted over long timescales) don’t always exhibit a constant directionality. They may have an upward trend for one timeframe and then show a downward trend for another. Be sure to look at the data as a whole and observe all the patterns in the “big picture.”
Periodic data patterns follow discernible cycles. There are ups and downs over time, and this behavior often follows a predictable pattern. One of the most commonly encountered periodic trends is a seasonal trend. This trend, like all periodicity, is beneficial to spot because it suggests that you can apply seasonal forecasting techniques during subsequent predictive analytics processes. I’ll cover predictive analytics in more detail in future posts!
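One simple way to check for a cycle is autocorrelation, which measures how closely a series matches a shifted copy of itself. Below is a minimal sketch. The helper names `autocorrelation` and `estimate_period` are mine, the monthly figures are made up, and the period-picking rule (skip the early lags until the correlation first goes negative, then take the strongest remaining lag) is a simple heuristic, not a standard library routine.

```python
def autocorrelation(series, lag):
    """Correlation between the series and a copy of itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((v - mean) ** 2 for v in series)
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def estimate_period(series, max_lag):
    """Heuristic: ignore lags before the first negative value, then pick the strongest lag."""
    acf = [autocorrelation(series, lag) for lag in range(1, max_lag + 1)]
    first_negative = next((i for i, r in enumerate(acf) if r < 0), None)
    if first_negative is None:
        return None  # no sign of a cycle within max_lag
    best = max(range(first_negative, len(acf)), key=lambda i: acf[i])
    return best + 1  # list index 0 corresponds to lag 1


# Hypothetical monthly figures: the same 12-month pattern repeated for three years
one_year = [10, 12, 15, 20, 26, 30, 32, 31, 25, 18, 13, 11]
monthly = one_year * 3

period = estimate_period(monthly, max_lag=18)
print(period)  # 12
```

Neighboring points in a smooth series are always similar, which is why the heuristic skips the first few lags before looking for a peak.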
Outliers are data points that differ significantly from the other data points in their group. Let’s say that, during your data collection on donation amounts, you notice most donations are around $50. Then, all of a sudden, you get a $100,000 gift. When plotted together, most of the data would cluster around the $50 mark, but there would be one dot way out in right-field for that $100K donation. That’s an outlier.
Outliers aren’t always so extreme, and they can be tough to detect. In many cases, you need statistical tests to catch them all. You should, however, always be on the lookout for them. Outliers are never entirely irrelevant, and a large number of outliers suggests that you may need to evaluate your measurement processes or even your metrics in general. So, always be aware of them, acknowledge them, and try to discern if they’re real anomalies vs. harbingers of deeper issues.
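A widely used rule of thumb flags any value that sits more than 1.5 interquartile ranges below the first quartile or above the third quartile. Here’s a minimal sketch of that rule using Python’s standard library. The function name `iqr_outliers` and the donation amounts are invented for this example, and the 1.5 multiplier is a convention, not a law.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical donation amounts: mostly around $50, plus one very large gift
donations = [40, 45, 50, 50, 55, 60, 48, 52, 100_000]

outliers = iqr_outliers(donations)
print(outliers)  # [100000]
```

Flagging a value is only the first step. You still have to decide whether it’s a genuine anomaly or a sign of a measurement problem.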
Master These Basics and You’re in Great Shape
If you can master identifying these basic data observation patterns, you’re in great shape. However, understanding these patterns and continuously looking for them in the wild is just one component of data literacy. If your organization struggles with data literacy in general, take a look at this post about Improving Data Literacy at Your Data-Driven Workplace.