
Common Data Observation Patterns and What They Mean

Identifying common data observation patterns is a core data literacy skill. Anyone using data to inform their work must be able to detect and describe the structure of the information in front of them. Here are some of the most common data observation patterns and what they mean.

Clustering

If you look at your data and see your data points in grouped batches instead of following a more uniform distribution, you’re observing data clustering. Clustered data points have more in common with their neighbors than they do with data points outside of their neighborhood.

Clustering patterns may be obvious (like in the example shown here) or they might be harder to spot. In many cases, you may need to perform a bona fide cluster analysis before you can expose the patterns. Those methods are beyond the scope of this post, but future posts will cover clustering in detail. For now, know that you should always be on the lookout for overt clustering patterns in all the data you encounter.

Clustering data observation pattern
The data points are clustered into two main groups in this scatterplot of Old Faithful Geyser eruption observations.

Skewness

Skewed data appears distorted in one direction or another. In other words, it is asymmetrically distributed along an axis. This type of observation pattern is easy to spot. You see it all the time in business or any situation where the average measurement doesn’t fall around the middle of the data. Skewed data is usually described as either positively skewed (a long tail stretching toward higher values) or negatively skewed (a long tail stretching toward lower values).

Skewing is one of the most important data observation patterns to spot because it has important implications for summary statistics like averages (arithmetic mean). In skewed datasets, summary statistics like the median are more appropriate to use than the mean when trying to find the “typical value”.

Skewed distribution data observation pattern
Contributions to ActBlue are heavily concentrated at the left of the x-axis with a long tail to the right (a positively skewed distribution), suggesting that most donation transactions involve small amounts of money.
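
To see why the median holds up better than the mean on skewed data, here is a minimal Python sketch. The donation amounts are made up for illustration and are not taken from the ActBlue data.

```python
from statistics import mean, median

# Hypothetical donation amounts in dollars (invented for illustration):
# many small gifts plus one large one, i.e. a long right tail.
donations = [5, 10, 10, 15, 20, 25, 25, 30, 50, 2500]

typical_mean = mean(donations)      # pulled upward by the single large gift
typical_median = median(donations)  # middle value, barely affected by the tail

print(f"mean:   {typical_mean:.2f}")
print(f"median: {typical_median:.2f}")

# A mean well above the median is a quick hint of positive (right) skew.
is_right_skewed = typical_mean > typical_median
```

With these made-up numbers the mean is 269 while the median is 22.5, so the median is much closer to what a typical donor actually gave.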

Trends

Do the data points appear directional? If so, you have observed a data trend. Trending is probably the most ubiquitous data observation pattern and is familiar to even the most novice data observer. You hear about trends all the time, and they fall into a few key subgroups:

Linear Trends 

Linear trends follow a “straight line” upward or downward direction. These are very easy to spot in charts and are probably the most common trend observation you’ll encounter. You see them all the time in business scenarios like meetings where teams illustrate a steady increase or decrease in sales figures over time.

Linear trend data observation pattern
The opening stock price for Google (GOOG) over the last several years exhibits a positive linear trend.
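
One way to put a number on a linear trend is an ordinary least-squares line. The sketch below is a plain-Python illustration; the `fit_line` helper and the sales figures are hypothetical and not part of the original analysis.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical monthly sales figures (invented for illustration).
months = [1, 2, 3, 4, 5, 6]
sales = [102, 109, 121, 130, 138, 151]

slope, intercept = fit_line(months, sales)
print(f"sales change by about {slope:.1f} units per month")
```

A positive slope indicates an upward trend and a negative slope a downward one. How well a straight line actually describes the data is a separate question that this sketch does not answer.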

Exponential Trends

Data exhibiting an exponential trend follows a non-linear curve whose rate of change is proportional to its current value, so a growth curve keeps getting steeper and a decay curve keeps getting shallower. You see these trends if you’re looking at something like the meteoric rise of a stock price or the explosive growth of a new product. You’ll also see them all the time in disciplines like the life sciences where researchers are illustrating growth patterns. Certain physical phenomena like radioactive decay also follow exponential trends.

Exponential trend data observation pattern
The total number of reported Ebola cases in the early months of the 2014 epidemic exhibited an exponential growth trend.
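
A quick check for exponential growth is whether values at equal time steps multiply by a roughly constant factor. The sketch below assumes evenly spaced observations; the counts are made up for illustration and are not the Ebola case data.

```python
import math

# Hypothetical cumulative counts at equal time steps (invented for illustration).
counts = [10, 20, 41, 79, 162, 318]

# Exponential growth multiplies by a roughly constant factor each step,
# so the ratios between consecutive values should be similar.
ratios = [b / a for a, b in zip(counts, counts[1:])]

# Equivalently, the logarithm of the values rises by a roughly constant amount.
log_steps = [math.log(b) - math.log(a) for a, b in zip(counts, counts[1:])]

avg_ratio = sum(ratios) / len(ratios)
doubling_time = math.log(2) / (sum(log_steps) / len(log_steps))
print(f"average growth factor per step: {avg_ratio:.2f}")
print(f"approximate doubling time: {doubling_time:.2f} steps")
```

If the ratios drift steadily up or down instead of hovering around one value, the data probably follows some other kind of curve.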

Logarithmic Trends

A logarithmic trend looks similar to an exponential trend, and there’s an excellent reason for that. A logarithmic trend is the inverse of an exponential trend!

For example, the growth curve of an exponential growth trend gets steeper over time. That’s the opposite of a logarithmic growth trend. Logarithmic growth curves get shallower over time. You’ll often see this trend in business scenarios where you have progressive market saturation.

Logarithmic trend data observation pattern example
With each visit, the cumulative total species observed begins to taper off. Eventually, the slope would flatten completely when the observer has observed all possible species at the site.
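
The flattening described here can be sketched in Python. The values below are generated from an assumed formula, y = 5 + 10 * ln(visit), purely for illustration; they are not real survey data.

```python
import math

# Hypothetical cumulative species counts for visits 1..8, generated from
# y = 5 + 10 * ln(visit) purely for illustration (not real survey data).
visits = list(range(1, 9))
species = [5 + 10 * math.log(v) for v in visits]

# New species added on each visit: the gains shrink as visits accumulate.
gains = [b - a for a, b in zip(species, species[1:])]

# Regressing y on ln(x) recovers a straight line for a logarithmic trend.
log_x = [math.log(v) for v in visits]
x_bar = sum(log_x) / len(log_x)
y_bar = sum(species) / len(species)
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(log_x, species)) / sum(
    (x - x_bar) ** 2 for x in log_x
)
a = y_bar - b * x_bar
print(f"fitted curve: y = {a:.1f} + {b:.1f} * ln(x)")
```

Shrinking gains from one step to the next are the signature to look for. Plotting the values against ln(x) and seeing a roughly straight line is another hint that the trend is logarithmic.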

Logistic Trends

If you’re particularly data-savvy, you might even spot logistic trends. Logistic growth, for instance, follows a gradual increase followed by a sudden sharp rise, followed by a gradual leveling off. You’ve probably seen this pattern before if you work in marketing. It often occurs in business scenarios where increased advertising spend stimulates rapid sales growth followed by saturation.

Logistic trend data observation pattern
This smartphone market penetration curve mirrors the Diffusion of Innovations theory work of Everett Rogers. There was an initially slow adoption pace followed by a sudden increase in adoption rate.
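
The S-shape of logistic growth comes from the standard logistic function. The sketch below uses hypothetical parameter values (a ceiling of 100%, a midpoint at year 5) chosen only to show the three phases; they are not fitted to the smartphone data.

```python
import math

def logistic(t, capacity=100.0, rate=1.0, midpoint=0.0):
    """Logistic curve: capacity / (1 + exp(-rate * (t - midpoint)))."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))

# Hypothetical parameters: adoption saturates at 100%, midpoint at year 5.
years = range(0, 11)
adoption = [logistic(t, capacity=100.0, rate=1.0, midpoint=5.0) for t in years]

# Year-over-year gains are small at first, largest near the midpoint,
# and small again as the curve approaches its ceiling.
gains = [b - a for a, b in zip(adoption, adoption[1:])]

for year, value in zip(years, adoption):
    print(f"year {year:2d}: {value:5.1f}%")
```

The `capacity` parameter is the saturation level the curve approaches but never exceeds, which is what separates a logistic trend from an exponential one.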

Non-uniform Trends

Non-uniform trends are also common. Data points (particularly data charted over long timescales) don’t always exhibit a constant directionality. They may have an upward trend for one timeframe and then show a downward trend for another. Be sure to look at the data as a whole and observe all the patterns in the “big picture.”

Periodicity

Periodic data patterns follow discernible cycles. There are ups and downs over time, and this behavior often follows a predictable pattern. One of the most commonly encountered periodic patterns is a seasonal trend. This trend, like all periodicity, is beneficial to spot because it suggests that you can apply seasonal forecasting techniques during subsequent predictive analytics processes. I’ll cover predictive analytics in more detail in future posts!

Periodic, seasonal data observation pattern example
The monthly airline passenger data in this chart follows a periodic observation pattern. Diving deeper, it appears that there is predictable seasonality. This suggests that seasonal forecasting techniques could be applied in later predictive analysis.
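
One simple way to test for a repeating cycle is to compare the series with a shifted copy of itself, known as autocorrelation. The sketch below is a plain-Python illustration on a made-up series with a 12-month cycle. The `autocorrelation` helper is hypothetical, and the data is not the airline passenger series.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

# Hypothetical monthly values with a 12-month cycle (invented for illustration).
monthly = [100 + 20 * math.sin(2 * math.pi * m / 12) for m in range(48)]

at_12 = autocorrelation(monthly, 12)  # one full cycle apart
at_6 = autocorrelation(monthly, 6)    # half a cycle apart
print(f"lag 12: {at_12:.2f}, lag 6: {at_6:.2f}")
```

A strongly positive value at lag 12 together with a strongly negative value at lag 6 is what you would expect from a yearly cycle in monthly data. Real series often carry a trend as well, which usually needs to be removed before this check is reliable.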

Outliers

Outliers are data points that differ significantly from the other data points in their group. Let’s say that, during your data collection on donation amounts, you notice most donations are around $50. Then, all of a sudden, you get a $100,000 gift. When plotted together, most of the data would cluster around the $50 mark, but there would be one dot way out in right field for that $100K donation. That’s an outlier.

Outliers aren’t always so extreme, and they can be tough to detect. In many cases, you need statistical tests to catch them all. You should, however, always be on the lookout for them. Outliers are never entirely irrelevant, and a large number of outliers suggests that you may need to evaluate your measurement processes or even your metrics in general. So, always be aware of them, acknowledge them, and try to discern if they’re real anomalies vs. harbingers of deeper issues.

Multiple outlier data observation pattern example
Outliers are highlighted in red in the array of boxplots above. These outliers might be true anomalies or they might signal that better measurement methods or other interventions are warranted.
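
Boxplots commonly flag points that fall more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that convention in Python. The `iqr_outliers` helper and the donation amounts are hypothetical, and the boxplots in this post may have been drawn with a different rule.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], a common boxplot rule."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical donation amounts in dollars (invented for illustration).
donations = [40, 45, 50, 50, 50, 55, 55, 60, 65, 100_000]
print(iqr_outliers(donations))
```

Only the $100,000 gift is flagged here. The multiplier `k` is a judgment call: a larger value flags fewer points, and a smaller one flags more.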

Master These Basics and You’re in Great Shape

If you can reliably identify these basic data observation patterns, you’re in great shape. However, understanding these patterns and continuously looking for them in the wild is just one component of data literacy. If your organization struggles with data literacy in general, take a look at this post about Improving Data Literacy at Your Data-Driven Workplace.

