PLINK
PLINK is freely available to users at HPC2N.
PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype or CNV calls from raw data).
PLINK (one syllable) is being developed by Shaun Purcell at the Center for Human Genetic Research (CHGR), Massachusetts General Hospital (MGH), and the Broad Institute of Harvard & MIT, with the support of others.
At HPC2N, PLINK is available as a module on Kebnekaise. The binaries are compiled for parallel usage with BLAS/LAPACK, but without support for webcheck and R.
To use the plink module, add it to your environment. Use:
module spider plink
to see how to load the module and the needed prerequisites.
Example: loading PLINK version 1.07
ml GCC/6.3.0-2.27
ml OpenMPI/2.0.2
ml PLINK/1.07
You can read more about loading modules on our Accessing software with Lmod page and our Using modules (Lmod) page.
Loading the module should set any needed environment variables as well as the path.
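A quick way to confirm that the module has put PLINK on your path is a check like the one below. This is only a sketch: the function name check_plink is made up for illustration, and it assumes the executable is called plink, as in the job script on this page.

```shell
# check_plink is a hypothetical helper, not part of PLINK or Lmod.
# It reports where the shell finds the plink executable, if anywhere.
check_plink() {
    if command -v plink >/dev/null 2>&1; then
        echo "plink found at: $(command -v plink)"
    else
        echo "plink not found; load the PLINK module first" >&2
        return 1
    fi
}
```

Run check_plink after the ml commands. A non-zero return status suggests that the module or one of its prerequisites is not loaded.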
See http://zzz.bwh.harvard.edu/plink/ for more information about using PLINK.
Make a job-script similar to this example:
#!/bin/bash
# Project to run under
#SBATCH -A SNICXXXX-YY-ZZ
# Name of the error file
#SBATCH --error=my_plink_job_%J.err
# Name of the output file
#SBATCH --output=my_plink_job_%J.out
# When to send email
#SBATCH --mail-type=ALL
# Ask for 1 node, 8 processors
#SBATCH -N 1
#SBATCH -n 8
# The job may use up to 30 minutes to run
#SBATCH -t 00:30:00

# Purge any loaded modules and then load the PLINK module and its prerequisites
ml purge > /dev/null 2>&1
ml GCC/6.3.0-2.27
ml OpenMPI/2.0.2
ml PLINK/1.07

# Run the job
plink <options>
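The plink <options> line in the job script is a placeholder. As an illustration only, the sketch below fills it in with a basic association test on a PED/MAP fileset. The wrapper name run_assoc and the file prefixes mydata and my_assoc are hypothetical. The options --file, --assoc, and --out are standard PLINK 1.07 options, but check the PLINK documentation for the analysis you actually need. When plink is not on the path, the function only prints the command it would have run.

```shell
# run_assoc is a hypothetical wrapper for illustration, not part of PLINK.
# Usage: run_assoc <input-prefix> <output-prefix>
run_assoc() {
    input=$1    # expects <input-prefix>.ped and <input-prefix>.map
    output=$2   # prefix for the result and log files
    if command -v plink >/dev/null 2>&1; then
        # Basic association analysis on the PED/MAP fileset
        plink --file "$input" --assoc --out "$output"
    else
        # Dry run: show the command when plink is not on the PATH
        echo "would run: plink --file $input --assoc --out $output"
    fi
}
```

In a job script, a call such as run_assoc mydata my_assoc would take the place of the plink <options> line. PLINK writes its results to files that start with the output prefix.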
Submit it with
sbatch <jobscript>
where <jobscript> is the name you give your batch-script.
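If you want to keep track of the job after submitting it, the job ID can be captured at submission time. The following is a minimal sketch, assuming a standard Slurm installation. The helper name submit_job is hypothetical, while sbatch --parsable is a regular Slurm option that prints only the job ID, optionally followed by a semicolon and the cluster name.

```shell
# submit_job is a hypothetical helper; sbatch --parsable is standard Slurm.
# Usage: submit_job <jobscript>   (prints the job ID on success)
submit_job() {
    jobscript=$1
    if [ ! -f "$jobscript" ]; then
        echo "job script not found: $jobscript" >&2
        return 1
    fi
    # --parsable makes sbatch print only "jobid" or "jobid;cluster"
    jobid=$(sbatch --parsable "$jobscript") || return 1
    # Drop the optional ";cluster" suffix
    echo "${jobid%%;*}"
}
```

For example, jobid=$(submit_job my_plink_job.sh) stores the job ID, and squeue -j "$jobid" then shows the job while it is pending or running. The file name my_plink_job.sh is only a placeholder.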
More information about PLINK can be found on the PLINK homepage.