Principal Component Analysis (PCA) in Jamovi

Cheatsheet

Published

September 24, 2024

This work was developed using resources that are available under a Creative Commons Attribution 4.0 International License, made available on the SOLES Open Educational Resources repository by the School of Life and Environmental Sciences, The University of Sydney.

Assumed knowledge
  • You have Jamovi installed, ideally version 2.5.7.0 or later.
  • You can follow instructions to select, click and drag elements in Jamovi.
  • You already understand what a PCA is and when to use it.

The data should be in a long format (also known as tidy data), where each row is an observation and each column is a variable (Figure 1). If your data is not already structured this way, reshape it manually in a spreadsheet program or in R using the pivot_longer() function from the tidyr package.

Long format:

Sex   BW
F     2.15
M     2.55
F     2.95
F     2.70
M     2.20
F     1.85
M     2.55
M     2.60

Wide format:

F      M
2.15   2.55
2.95   2.20
2.70   2.55
1.85   2.60
Figure 1: Data should be in long format (left) where each row is an observation and each column is a variable. This is the preferred format for most statistical software. Wide format (right) is also common, but may require additional steps to analyse or visualise in some instances.
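
For readers who prefer a script, the reshaping shown in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration rather than a replacement for pivot_longer(): the function wide_to_long() and its argument names are hypothetical, chosen only for this example, and the values are those shown in Figure 1.

```python
# Wide-format table from Figure 1: one column of BW values per sex.
wide = {
    "F": [2.15, 2.95, 2.70, 1.85],
    "M": [2.55, 2.20, 2.55, 2.60],
}


def wide_to_long(table, names_to, values_to):
    """Return one dict per observation (hypothetical helper for this sketch)."""
    rows = []
    for group, values in table.items():
        for value in values:
            # The column name becomes a value in the `names_to` column.
            rows.append({names_to: group, values_to: value})
    return rows


long_rows = wide_to_long(wide, names_to="Sex", values_to="BW")

for row in long_rows:
    print(row["Sex"], row["BW"])
```

The rows come out grouped by sex rather than interleaved as in Figure 1, but each row is still one observation and each column one variable.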
Data

Download data

We will use a crab dataset (crab.csv), available for download with this cheatsheet.

About

Principal component analysis (PCA) reduces the dimensionality of a dataset by summarising many variables as a smaller number of components. Patterns in the data are often easier to identify and interpret in this lower-dimensional space. PCA answers questions such as:

  • What are the most important variables in the dataset?
  • Are there any patterns in the data?
  • Can we reduce the number of variables in the dataset and still retain most of the information?

Interpretation of the results is not covered here.
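
To make the idea of "retaining most of the information" concrete, here is a minimal sketch of how the first principal component and its share of the variance can be computed. It assumes the PCA is run on the correlation matrix (that is, on standardised variables) with no rotation, and it uses power iteration to find only the leading component. The measurements are hypothetical values made up for illustration, not the crab data, and the function names are our own.

```python
import math


def standardise(column):
    """Centre a column on 0 and scale it to unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]


def correlation_matrix(columns):
    """Correlation matrix of a list of equally long numeric columns."""
    z = [standardise(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def first_component(matrix, iterations=200):
    """Leading eigenvalue and unit eigenvector of a symmetric matrix."""
    p = len(matrix)
    vector = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        # Multiply by the matrix, then rescale to unit length (power iteration).
        product = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Rayleigh quotient v' M v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v * sum(m * w for m, w in zip(row, vector)) for v, row in zip(vector, matrix)
    )
    return eigenvalue, vector


# Hypothetical measurements: three variables recorded on six individuals.
columns = [
    [8.1, 8.8, 9.2, 9.6, 10.8, 11.6],
    [6.7, 7.7, 7.8, 7.9, 9.0, 9.1],
    [16.1, 18.1, 19.0, 20.1, 23.0, 24.8],
]

corr = correlation_matrix(columns)
eigenvalue, direction = first_component(corr)

# With standardised variables the total variance equals the number of variables.
share = eigenvalue / len(columns)
print(f"PC1 explains {share:.1%} of the total variance")
```

When the variables are strongly correlated, as in this made-up example, a single component carries most of the variance, which is why fewer components than variables can be enough.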

Before we begin: install the snowCluster module

Important

The snowCluster module makes plotting the PCA results easier. To install it:

  1. Select the Analyses tab.
  2. Click on the Modules button, and select jamovi library.
  3. Scroll down to find snowCluster, or search for it.
  4. Click INSTALL, and wait for the installation to complete.
  5. Exit by clicking on the upper-right arrow button.

Import the crab data

  1. Click the menu icon.
  2. Select Open > Browse and select the crab.csv file.
  3. Click Open.
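
Because PCA only accepts numeric variables, it can help to check which columns of a CSV file are numeric before opening it in Jamovi. The sketch below does this with Python's standard csv module. The function numeric_columns() is a hypothetical helper, and the small in-memory table is invented for the example: only the column names sp and sex come from this cheatsheet, while width and depth are placeholder names.

```python
import csv
import io


def _is_number(text):
    """True if the text can be parsed as a floating-point number."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def numeric_columns(csv_file):
    """Names of columns whose non-empty values all parse as numbers."""
    reader = csv.DictReader(csv_file)
    rows = list(reader)
    names = reader.fieldnames or []
    numeric = []
    for name in names:
        values = [row[name] for row in rows if row[name] not in ("", None)]
        if values and all(_is_number(v) for v in values):
            numeric.append(name)
    return numeric


# Invented stand-in for a data file; width and depth are placeholder names.
sample = io.StringIO(
    "sp,sex,width,depth\n"
    "A,M,8.1,7.0\n"
    "A,F,9.2,7.8\n"
    "B,M,10.8,9.0\n"
)

candidates = numeric_columns(sample)
print(candidates)  # ['width', 'depth']

# With a real file, assuming it sits in the working directory:
# with open("crab.csv", newline="") as handle:
#     print(numeric_columns(handle))
```

Columns that use text codes for missing values (for example NA) would be reported as non-numeric by this sketch, so treat the result as a first check only.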

PCA walkthrough

  1. In the Analyses tab, click on Factor and select Principal Components Analysis.
  2. Hold the Ctrl key (Cmd on macOS) and click to select multiple variables, or hold the Shift key and click to select a range of variables. All selected variables must be numeric.
  3. Drag the selected variables to the Variables box or click on the arrow button to use them in the analysis.
  4. Adjust the settings as needed:
    1. Method: The rotation applied to the extracted components. The default is Varimax.
    2. Number of components: The number of components to retain. By default, Jamovi decides using parallel analysis, but you can instead use the Kaiser criterion (eigenvalues > 1) or specify a fixed number of components.
    3. Additional output: useful summary tables, including a scree plot if you wish to visualise how the variance is explained by each component.
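
The Kaiser criterion and the quantities behind a scree plot are simple to compute once the eigenvalues are known. The sketch below is an illustration only: the eigenvalues are hypothetical numbers for five standardised variables (so they sum to 5), not output from Jamovi, and the function names are our own. Parallel analysis, the Jamovi default, is not shown.

```python
def kaiser_retain(eigenvalues, threshold=1.0):
    """Number of components whose eigenvalue exceeds the threshold."""
    return sum(1 for value in eigenvalues if value > threshold)


def variance_explained(eigenvalues):
    """Proportion of the total variance carried by each component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


# Hypothetical eigenvalues, largest first, for five standardised variables.
eigenvalues = [3.1, 1.2, 0.4, 0.2, 0.1]

n_keep = kaiser_retain(eigenvalues)
proportions = variance_explained(eigenvalues)
cumulative = [sum(proportions[: i + 1]) for i in range(len(proportions))]

print("Components retained:", n_keep)
for number, (prop, cum) in enumerate(zip(proportions, cumulative), start=1):
    print(f"PC{number}: {prop:.0%} of variance, {cum:.0%} cumulative")
```

A scree plot displays these eigenvalues against the component number; the cumulative column shows how much of the total variance the retained components account for.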

Figure: PCA in Jamovi.

To plot the PCA results with the snowCluster module:
  1. In the Analyses tab, click on snowCluster and select PCA & Group Plot.
  2. If you have a grouping variable that you want to use, select the Group Plot tab. Otherwise, use the PCA Plot tab. Here we will use the Group Plot tab.
  3. Drag the same variables you used in the PCA analysis to the Variables box.
  4. Drag the grouping variable to the Grouping box. Try sex and sp to see how the groups are distributed.
  5. Check the Individuals by groups box to display the individual plot.
  6. Check the PCA-Biplot box to display the biplot of individuals and variables.
  7. Adjust the size of the plot as needed. The size is given as width x height, in pixels.

Figure: PCA plotting in Jamovi.