Whole Genome Resequencing Pipeline v2.0

Automated mapping and indel realignment of resequencing data
393 Downloads
Updated 23 Apr 2014

NEW: SAMtools and Picard dependencies have been replaced by built-in MATLAB capabilities. The pipeline now requires only BWA and GATK.
Automates processing of single-end whole genome resequencing (WGRS) data: pre-installed dependencies are used to map reads from FASTQ files to a reference and to realign reads around indels. BWA must be installed and available on the system path, and GenomeAnalysisTK.jar must be available on the MATLAB path. If no arguments are provided, the user is prompted for one or more FASTQ files of reads and a reference FASTA file. Developers are encouraged to adapt this template to their needs. Pipeline steps are:
(0a) FM-index reference (BWA index)
(0b) Create FASTA index (Internal fai)
(0c) Create sequence dictionary (Internal dict)
(1) Map reads (BWA mem)
(2) Convert SAM to BAM (MATLAB sam2bam)
(3) Sort BAM (MATLAB bamsort)
(4) Index BAM (MATLAB BioMap)
(5) Discover indels (GATK RealignerTargetCreator)
(6) Realign indels (GATK IndelRealigner)
(7) Cleanup
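
For readers who want to see what the external-tool steps look like outside MATLAB, the following is a minimal sketch of the BWA and GATK command lines behind steps 0a, 1, 5 and 6. It is not the submission's own code. The function name `external_commands` and the intermediate file names are hypothetical, and the GATK arguments assume a GATK 3.x GenomeAnalysisTK.jar, since RealignerTargetCreator and IndelRealigner are not part of GATK 4. Steps 0b, 0c, 2, 3 and 4 are handled inside MATLAB and are not shown.

```python
from pathlib import Path


def external_commands(fastq, reference, gatk_jar="GenomeAnalysisTK.jar"):
    """Build argument lists for the BWA and GATK steps (0a, 1, 5, 6).

    All derived file names are illustrative assumptions, not the names
    the submission uses.
    """
    stem = Path(fastq).with_suffix("")        # reads.fastq -> reads
    sorted_bam = f"{stem}.sorted.bam"         # assumed output of steps 2-4
    intervals = f"{stem}.intervals"           # realignment targets (step 5)
    realigned = f"{stem}.realigned.bam"       # final output (step 6)
    gatk = ["java", "-jar", gatk_jar]         # assumes GATK 3.x

    return {
        # (0a) FM-index the reference
        "0a_index": ["bwa", "index", reference],
        # (1) map reads; bwa mem writes SAM to standard output
        "1_map": ["bwa", "mem", reference, fastq],
        # (5) find intervals that may need realignment
        "5_targets": gatk + [
            "-T", "RealignerTargetCreator",
            "-R", reference,
            "-I", sorted_bam,
            "-o", intervals,
        ],
        # (6) realign reads over those intervals
        "6_realign": gatk + [
            "-T", "IndelRealigner",
            "-R", reference,
            "-I", sorted_bam,
            "-targetIntervals", intervals,
            "-o", realigned,
        ],
    }


if __name__ == "__main__":
    for step, argv in external_commands("sample.fastq", "ref.fasta").items():
        print(step, " ".join(argv))
```

Each argument list could be passed to `subprocess.run`, with the standard output of the `bwa mem` step redirected to a SAM file. GATK generally also expects read-group information in the BAM header, which this sketch does not add.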

Cite As

Turner Conrad (2024). Whole Genome Resequencing Pipeline v2.0 (https://www.mathworks.com/matlabcentral/fileexchange/46078-whole-genome-resequencing-pipeline-v2-0), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2014a
Compatible with any release
Platform Compatibility
Windows macOS Linux

Version  Release Notes
1.1.0.0  See CHANGELOG section in code
1.0.0.0