Population Structure (PCA & ADMIXTURE)

Decompose your germplasm into ancestral subpopulations before GWAS or selection.

How it works

Principal Component Analysis (PCA) on the genotype matrix gives a fast, model-free view of population stratification. The top PCs are essential covariates in GWAS: without them, allele-frequency differences between subpopulations inflate false positives. We complement PCA with an ADMIXTURE-style ancestry analysis that estimates each individual's fractional ancestry from K subpopulations and uses cross-validation error to choose K (the value with the lowest error).
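The PCA step can be sketched in a few lines of NumPy. This is a minimal illustration, not the module's actual implementation: it assumes genotypes are coded as 0/1/2 allele counts with no missing values, uses one common standardization (center by twice the allele frequency, scale by the binomial standard deviation), and runs on simulated data. The function name genotype_pca and all variable names are hypothetical.

```python
import numpy as np

def genotype_pca(G, n_components=4):
    """PCA on an (n_samples, n_markers) genotype matrix coded 0/1/2.

    Assumes no missing genotypes; impute or filter beforehand.
    """
    G = np.asarray(G, dtype=float)
    # Sample allele frequency at each marker
    p = G.mean(axis=0) / 2.0
    # Drop monomorphic markers, which have zero variance
    keep = (p > 0) & (p < 1)
    G, p = G[:, keep], p[keep]
    # Center by the expected count 2p and scale by sqrt(2p(1-p))
    Z = (G - 2 * p) / np.sqrt(2 * p * (1 - p))
    # SVD of Z is equivalent to eigendecomposition of its covariance
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    var_explained = S**2 / np.sum(S**2)
    return scores, var_explained[:n_components]

# Simulated example: two subpopulations of 20 lines, 200 markers,
# with allele frequencies that have drifted apart (illustrative only).
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.05, 0.95, size=200)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, size=200), 0.05, 0.95)
G = np.vstack([
    rng.binomial(2, freq_a, size=(20, 200)),
    rng.binomial(2, freq_b, size=(20, 200)),
])

scores, var_explained = genotype_pca(G)
```

The columns of scores are the per-sample PC coordinates that would go into a scatter plot or be passed to a GWAS model as covariates; var_explained holds the fraction of variance attributed to each returned PC.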

Formula

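PCA: eigendecomposition of the sample covariance matrix computed from the centered, scaled genotype matrix (equivalently, a singular value decomposition of that matrix). ADMIXTURE: maximum-likelihood estimation of the Q (ancestry-proportion) and P (allele-frequency) matrices under a model of K ancestral populations.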
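Written out, the two models look roughly as follows. The notation is an assumption introduced here for illustration: g_{ij} is the allele count (0, 1 or 2) of individual i at marker j, \bar{p}_j is the sample allele frequency at marker j, q_{ik} is an entry of Q, and p_{kj} is an entry of P. The scaling shown for PCA is one common choice; the page does not state which scaling the module applies.

```latex
% PCA: standardize each genotype, then decompose the sample covariance
\[
  z_{ij} = \frac{g_{ij} - 2\bar{p}_j}{\sqrt{2\bar{p}_j(1-\bar{p}_j)}},
  \qquad
  \frac{1}{M} Z Z^{\top} = V \Lambda V^{\top}
\]
% Z is the N x M matrix of z_{ij}; the columns of V give the PC axes
% and the diagonal of \Lambda gives the variance along each axis.

% ADMIXTURE: log-likelihood maximized over Q and P
\[
  \ell(Q, P) = \sum_{i=1}^{N} \sum_{j=1}^{M}
  \Bigl[
    g_{ij} \ln\Bigl(\sum_{k=1}^{K} q_{ik}\, p_{kj}\Bigr)
    + (2 - g_{ij}) \ln\Bigl(1 - \sum_{k=1}^{K} q_{ik}\, p_{kj}\Bigr)
  \Bigr]
\]
\[
  \text{subject to } q_{ik} \ge 0,\quad \sum_{k=1}^{K} q_{ik} = 1,\quad
  0 \le p_{kj} \le 1
\]
```

Each row of Q sums to one, which is why the ancestry estimates can be drawn as stacked bars per individual.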

What you get

  • PC1–PC4 scatter plots with percent variance explained
  • Ancestry-proportion stacked bars for K=2..8
  • Cross-validation error curve to select K

When to use it

  • Before any GWAS run on a diverse panel
  • When sampling parents for a breeding program from multiple gene pools
  • To verify dataset composition before genomic selection

Run Population Structure on your data

Open the module and upload a CSV.
