Correspondence Analysis

Correspondence Analysis is a statistical technique that plays a significant role in marketing, insights and media research. It is primarily used to analyze and visualize relationships between categorical variables, making it especially useful in consumer behavior analysis, market segmentation and brand positioning. It condenses information efficiently, enabling professionals to explore, visualize and understand complex relationships between variables quickly, which makes it a valuable asset for data-driven decision making and strategy formulation.

What are the benefits of creating a Correspondence Analysis?

Correspondence Analysis offers several benefits for use in the advertising and media industry:

Segmentation and Targeting:
Correspondence can help identify distinct audience segments and their preferences. By analyzing various media consumption habits, Correspondence enables marketers to fine-tune their targeting strategies, ensuring that the right message reaches the right audience.

Media Channel Optimization:
Correspondence can assist in identifying the most effective media channels for advertising campaigns. By analyzing how different media platforms relate to consumer responses or brand metrics, researchers can allocate resources more efficiently and make informed decisions about where to invest their budget.

Competitor Analysis:
By clustering brands and products, competitive sets within the market can be identified. Brands in the same cluster compete more with each other than with brands in other clusters. A business can compare its current offerings with those of its competitors to identify potential new opportunities.

Identifying Psychographic Drivers:
Correspondence assists in identifying which psychographic variables have the strongest associations with specific audience segments and products, prior to running a Cluster analysis. This information is valuable for understanding what drives certain audience behaviors, as well as brand loyalty and adoption.

How to read a Correspondence Analysis Chart

Explaining the results table

Percent:
Refers to the share of the total variance that each factor explains. Across all factors, these numbers should sum to 100. As a guide, aim for a ‘variance explained’ of 70% or above from Factors 1 and 2 combined. This means there is enough differentiation among the data variables to produce a robust analysis.

Per. Cum (Percent cumulative):
This is the running total of Percent as you go down the factors. For Factor 1, Percent and Per. Cum are always the same; for Factor 2, Per. Cum is the sum of the Factor 1 and Factor 2 percents; and so on until Factor 6, which should be 100%.
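As a rough illustration of where Percent and Per. Cum come from, the sketch below follows the textbook correspondence analysis calculation with NumPy. The crosstab counts are made up, the variable names (`counts`, `percent`, `per_cum`) are illustrative, and the application's internal method is assumed rather than confirmed. This small 4 x 4 table yields three factors rather than six.

```python
import numpy as np

# Hypothetical crosstab counts (rows: brands, columns: age groups); not real data.
counts = np.array([
    [120, 90, 40, 20],
    [100, 95, 50, 30],
    [30, 45, 80, 95],
    [25, 40, 70, 110],
], dtype=float)

P = counts / counts.sum()            # correspondence matrix (cell proportions)
r = P.sum(axis=1)                    # row masses
c = P.sum(axis=0)                    # column masses
expected = np.outer(r, c)            # proportions expected if rows and columns were unrelated
S = (P - expected) / np.sqrt(expected)   # standardized residuals

# Each squared singular value is the variance (inertia) carried by one factor.
singular_values = np.linalg.svd(S, compute_uv=False)
n_factors = min(counts.shape) - 1    # a 4 x 4 table gives at most 3 factors
inertia = singular_values[:n_factors] ** 2

percent = 100 * inertia / inertia.sum()   # "Percent" column
per_cum = np.cumsum(percent)              # "Per. Cum" column

for k, (p, pc) in enumerate(zip(percent, per_cum), start=1):
    print(f"Factor {k}: Percent = {p:.1f}, Per. Cum = {pc:.1f}")

# The 70% guideline compares this value against 70.
explained_by_first_two = per_cum[1]
```

The first Per. Cum value always equals the first Percent value, and the last one always reaches 100.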

CHI2 (Chi-squared):
This metric measures how far the crosstab departs from what would be expected if its rows and columns were unrelated. The larger the value, the stronger the statistical relationship between the rows and columns.

Deg. Free (Degrees of freedom):
Shows the number of independent pieces of information used to calculate the correspondence analysis.

The significance of degrees of freedom lies in interpreting Chi-squared results and establishing whether they are statistically meaningful. A greater number of degrees of freedom indicates more flexibility in the data, suggesting less constraint. These values matter mainly when assessing the statistical significance of the results.
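The sketch below shows how a Chi-squared value and its degrees of freedom are conventionally computed for a crosstab. It assumes the standard Pearson formula and the usual (rows - 1) x (columns - 1) rule. Whether the application computes CHI2 and Deg. Free exactly this way is an assumption, and the counts are made up.

```python
import numpy as np

# Hypothetical crosstab counts (rows: brands, columns: age groups); not real data.
counts = np.array([
    [120, 90, 40, 20],
    [100, 95, 50, 30],
    [30, 45, 80, 95],
    [25, 40, 70, 110],
], dtype=float)

n = counts.sum()
row_totals = counts.sum(axis=1)
col_totals = counts.sum(axis=0)

# Counts expected in each cell if rows and columns were unrelated.
expected = np.outer(row_totals, col_totals) / n

# Pearson Chi-squared: squared gaps between observed and expected, scaled by expected.
chi2 = ((counts - expected) ** 2 / expected).sum()

# Conventional degrees of freedom for a crosstab.
deg_free = (counts.shape[0] - 1) * (counts.shape[1] - 1)

# In correspondence analysis, the total variance (inertia) equals Chi-squared / n.
total_inertia = chi2 / n

print(f"CHI2 = {chi2:.1f}, Deg. Free = {deg_free}, total inertia = {total_inertia:.3f}")
```

A larger table has more degrees of freedom, so the same Chi-squared value is less remarkable there than it would be in a smaller table.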

Explaining the data items on the map

The proximity of Active row labels on the map indicates the degree of similarity among the data items. In the graph below, you'll notice that the Active rows "Cherry Coke", "Pepsi Cola" and "Coca Cola" are grouped closely together, signifying their high degree of similarity to each other. Likewise, when examining the Active columns, the age groups "15-24" and "25-34" are positioned near each other, suggesting that these age groups are very similar.

To interpret the relationship between the Active rows and Active columns, lines are drawn from each of the points to the intersection of the quadrants, or the origin (zero on the graph).

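Obtuse angles tell us that there is a negative correlation between two data points, while acute angles indicate a positive correlation and angles close to 90 degrees indicate little or no relationship. Look at the lines for “Ages 35-44” and “Coca-Cola Zero Sugar”. The angle between them is obtuse, indicating a negative correlation. Line length matters too: the longer the lines, the stronger the relationship, so the rather short lines here point to a weak negative correlation rather than a strong one.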
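The angle reading can be sketched numerically. The example below computes map coordinates for a made-up crosstab and measures the angle at the origin between one row point and two column points. It assumes both rows and columns are plotted in principal coordinates on Factors 1 and 2, which may differ from the scaling the application uses. The helper `angle_deg` is hypothetical.

```python
import numpy as np

# Hypothetical crosstab counts; not real data.
# Rows: four brands. Columns: four age groups, youngest to oldest.
counts = np.array([
    [120, 90, 40, 20],
    [100, 95, 50, 30],
    [30, 45, 80, 95],
    [25, 40, 70, 110],
], dtype=float)

P = counts / counts.sum()
r = P.sum(axis=1)                          # row masses
c = P.sum(axis=0)                          # column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

U, sv, Vt = np.linalg.svd(S, full_matrices=False)
row_coords = (U * sv) / np.sqrt(r)[:, None]      # row points on the map
col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]   # column points on the map

def angle_deg(a, b):
    """Angle in degrees, measured at the origin, between two map points."""
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Use Factors 1 and 2 only, as on the two-dimensional map.
brand = row_coords[0, :2]                        # first brand, skews young in this data
angle_same = angle_deg(brand, col_coords[0, :2])   # versus the youngest age group
angle_opp = angle_deg(brand, col_coords[3, :2])    # versus the oldest age group

print(f"vs youngest group: {angle_same:.0f} degrees")
print(f"vs oldest group:   {angle_opp:.0f} degrees")
```

In this made-up data the first brand over-indexes among the youngest group and under-indexes among the oldest, so the first angle comes out acute and the second obtuse.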

Position of the data items relative to the center of the map

Since Correspondence is all about relativity, the closer a data item is to the center of the map, the less the map reveals about it. The further a data item is from the middle of the map, the more strongly it is related to something else in the map.
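To see why a point near the center carries little information, the sketch below adds a fifth, made-up brand whose age profile matches the overall average exactly. Under the textbook calculation assumed here (not necessarily the application's exact method), that brand lands on the origin, while the brands with distinctive profiles sit further out.

```python
import numpy as np

# Hypothetical crosstab counts; not real data.
counts = np.array([
    [120, 90, 40, 20],
    [100, 95, 50, 30],
    [30, 45, 80, 95],
    [25, 40, 70, 110],
    [55, 54, 48, 51],    # "average" brand: same age mix as all brands combined
], dtype=float)

P = counts / counts.sum()
r = P.sum(axis=1)
c = P.sum(axis=0)
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

U, sv, Vt = np.linalg.svd(S, full_matrices=False)
row_coords = (U * sv) / np.sqrt(r)[:, None]

# Distance of each brand from the center of the two-dimensional map.
dist_from_origin = np.linalg.norm(row_coords[:, :2], axis=1)

for i, d in enumerate(dist_from_origin):
    print(f"Brand {i + 1}: distance from center = {d:.3f}")
```

The fifth brand differs from no age group in particular, so the map has nothing to say about it and its distance is zero up to rounding error.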

Understanding a Correspondence Analysis Table

% INF (influence):

Percent influence. The contribution each variable makes to the overall analysis (the higher it is, the more relevant it is to the differences in the market). If a psychographic statement has a low % INF, there is a weak relationship between the brands (columns) and the psychographics (rows), and the statement does not adequately explain the differences between brands.

ABS (absolute):

The ‘absolute contribution’ or the “value” of each statement/brand within the factor. Interpret it like % col in Crosstab. The higher the ABS, the better the factor is at explaining that particular variable.

REL (relative):

The ‘relative contribution’ of each variable across the factors. Interpret it like % row in Crosstab. It shows how well the users of a brand are explained by each factor.
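The three measures can be sketched with the standard correspondence analysis definitions: a point's share of total inertia for % INF, its contribution to a factor for ABS, and its squared cosine for REL. Treating these as the application's formulas is an assumption. In particular, this sketch produces unsigned REL values, and the counts are made up.

```python
import numpy as np

# Hypothetical crosstab counts (rows: brands, columns: age groups); not real data.
counts = np.array([
    [120, 90, 40, 20],
    [100, 95, 50, 30],
    [30, 45, 80, 95],
    [25, 40, 70, 110],
], dtype=float)

P = counts / counts.sum()
r = P.sum(axis=1)
c = P.sum(axis=0)
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

U, sv, Vt = np.linalg.svd(S, full_matrices=False)
k = min(counts.shape) - 1                        # number of factors
F = (U[:, :k] * sv[:k]) / np.sqrt(r)[:, None]    # row coordinates on each factor
eig = sv[:k] ** 2                                # variance (inertia) of each factor

# Inertia carried by each row on each factor: mass times squared coordinate.
point_inertia = r[:, None] * F ** 2

# % INF: each row's share of the total inertia; sums to 100 over the rows.
pct_inf = 100 * point_inertia.sum(axis=1) / eig.sum()

# ABS: each row's share of a factor's inertia; every column sums to 100 (like % col).
abs_contrib = 100 * point_inertia / eig

# REL: each factor's share of a row's inertia; every row sums to 100 (like % row).
rel_contrib = 100 * point_inertia / point_inertia.sum(axis=1, keepdims=True)

print(np.round(pct_inf, 1))
print(np.round(abs_contrib, 1))
print(np.round(rel_contrib, 1))
```

Reading down a column of `abs_contrib` shows which brands build a factor, while reading across a row of `rel_contrib` shows which factor best describes a brand.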

Interpreting a Correspondence Analysis Table

The Correspondence application plots the brands and demographics on a graph. It shows Factor 1 as the ‘X-axis’ (left to right) and Factor 2 as the ‘Y-axis’ (top to bottom). To understand the factors and the data, examine which demographics are important to each factor by switching to the Table view.

Factors

%INF (Influence):

This shows how much influence a brand or demographic has on the analysis. The table is generally sorted by %INF prior to a Cluster analysis to determine the most influential variables in the analysis.

ABS (Absolute):

This shows the influence of a brand or, in this example, the demographics on this factor. The ABS score can be interpreted as the %Col in Explore.

REL (Relative):

The sign of this score determines which side of the graph the brand or lifestyle statement will appear on. For Factor 1, “-” appears on the left and “+” appears on the right of the Correspondence map. The REL score also shows which factor best explains the brand/demographic and can be read like the %Row in Explore.