Sex | BW |
---|---|
F | 2.15 |
M | 2.55 |
F | 2.95 |
F | 2.70 |
M | 2.20 |
F | 1.85 |
M | 2.55 |
M | 2.60 |
Principal Component Analysis (PCA) in Jamovi
Cheatsheet
This work was developed using resources that are available under a Creative Commons Attribution 4.0 International License, made available on the SOLES Open Educational Resources repository by the School of Life and Environmental Sciences, The University of Sydney.
- You have Jamovi installed, ideally 2.5.7.0 or later.
- You can follow instructions to select, click and drag elements in Jamovi.
- You already understand what a PCA is and when to use it.
The data should be in a long format (also known as tidy data), where each row is an observation and each column is a variable (Figure 1). If your data is not already structured this way, reshape it manually in a spreadsheet program or in R using the pivot_longer() function from the tidyr package.
F | M |
---|---|
2.15 | 2.55 |
2.95 | 2.20 |
2.70 | 2.55 |
1.85 | 2.60 |
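If you would rather script the reshape than do it by hand, the sketch below shows the same wide-to-long idea in plain Python, using the values from the wide table above. It is only an illustration: wide_to_long is a hypothetical helper written for this example, and the output is ordered column by column rather than in the original observation order.

```python
# Wide table from the example above: one column per sex, BW values in the cells.
wide = {
    "F": [2.15, 2.95, 2.70, 1.85],
    "M": [2.55, 2.20, 2.55, 2.60],
}


def wide_to_long(table, names_to, values_to):
    """Turn {column: [values]} into a list of one-row-per-observation dicts."""
    rows = []
    for name, values in table.items():
        for value in values:
            rows.append({names_to: name, values_to: value})
    return rows


long_rows = wide_to_long(wide, names_to="Sex", values_to="BW")

# Print in the same pipe-separated layout as the tables in this document.
print("Sex | BW |")
for row in long_rows:
    print(f"{row['Sex']} | {row['BW']:.2f} |")
```

Each of the eight measurements ends up on its own row, with the former column header stored in the Sex column, which is the layout Jamovi expects.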
About
Principal component analysis (PCA) reduces the dimensionality of a dataset by projecting it onto a lower-dimensional space, which makes patterns in the data easier to identify and interpret. PCA answers questions such as:
- What are the most important variables in the dataset?
- Are there any patterns in the data?
- Can we reduce the number of variables in the dataset and still retain most of the information?
Interpretation of the results is not covered here.
Before we begin: install the snowCluster module
The snowCluster module makes plotting the PCA results easier. To install it:
- Select the Analyses tab.
- Click on the Modules button, and select jamovi library.
- Scroll down to find snowCluster, or search for it.
- Click INSTALL, and wait for the installation to complete.
- Exit by clicking on the upper-right arrow button.
Import the crab data
- Click the menu icon.
- Select Open > Browse and select the crab.csv file.
- Click Open.
PCA walkthrough
- In the Analyses tab, click on Factor and select Principal Components Analysis.
- Use the Ctrl key to select multiple variables or the Shift key to select a range of variables. All of these variables must be numeric.
- Drag the selected variables to the Variables box or click on the arrow button to use them in the analysis.
- Adjust the settings as needed:
- Method: The rotation applied to the extracted components. The default is Varimax.
- Number of components: The number of components to retain. By default, Jamovi chooses this using parallel analysis, but you can instead use the Kaiser criterion (eigenvalues > 1) or specify a fixed number of components.
- Additional output: Optional summary tables, plus a scree plot if you wish to visualise how much variance each component explains.
- In the Analyses tab, click on snowCluster and select PCA & Group Plot.
- If you have a grouping variable that you want to use, select the Group Plot tab. Otherwise, use the PCA Plot tab. Here we will use the Group Plot tab.
- Drag the same variables you used in the PCA analysis to the Variables box.
- Drag the grouping variable to the Grouping box. Try sex and sp to see how the groups are distributed.
- Check the Individuals by groups box to display the individual plot.
- Check the PCA-Biplot box to display the biplot of individuals and variables.
- Adjust the size of the plot as needed. The dimensions are given as width x height in pixels.
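As a rough illustration of the Kaiser criterion mentioned under Number of components, the Python sketch below keeps the components whose eigenvalue exceeds 1 and reports the share of variance each one explains. The eigenvalues are hypothetical numbers chosen for the example, not output from the crab data, and kaiser_components and variance_explained are illustrative helpers, not part of Jamovi. The sketch assumes a PCA on standardised variables, where the eigenvalues sum to the number of variables.

```python
# Hypothetical eigenvalues for a PCA on five standardised variables.
# These are made-up numbers for illustration, not results from crab.csv.
eigenvalues = [3.2, 1.1, 0.4, 0.2, 0.1]


def kaiser_components(values, threshold=1.0):
    """Return the 1-based indices of components with eigenvalue > threshold."""
    return [i for i, value in enumerate(values, start=1) if value > threshold]


def variance_explained(values):
    """Return each eigenvalue as a proportion of the total variance."""
    total = sum(values)
    return [value / total for value in values]


kept = kaiser_components(eigenvalues)
proportions = variance_explained(eigenvalues)

for i, (value, share) in enumerate(zip(eigenvalues, proportions), start=1):
    status = "keep" if i in kept else "drop"
    print(f"PC{i}: eigenvalue = {value:.2f}, variance = {share:.1%} ({status})")
```

With these example values only the first two components pass the threshold. Parallel analysis, Jamovi's default, can select a different number of components because it compares the eigenvalues against those obtained from random data rather than against a fixed cut-off.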