This post walks through installing tools that are commonly used in bioinformatics (BI) analysis.
The tools to install are listed below.
- bedtools2
- samtools-1.16.1
- bcftools-1.16
- HTSlib-1.16
- sickle-1.33
- bwa-mem2
- gatk-4.3.0.0
We will download the files with git clone and wget and then build them.
(The operating system used here is CentOS.)
Move to the directory where you want to download the tools and enter the commands below.
[skkwon1048@server Tools]$ pwd
/data/Tools
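Before downloading anything, it can help to confirm that the commands used in this post are available. The following is a minimal sketch, not part of the original walkthrough: the function name `missing_commands` and the exact list of commands are assumptions, and the check does not cover development headers (such as zlib) that some of the builds may also need.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Commands this post relies on; the list is an assumption, adjust as needed.
missing=$(missing_commands wget git tar unzip make gcc)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing commands:" $missing
else
    echo "All required commands found."
fi
```

Anything reported as missing can usually be installed through the system package manager before continuing.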
1. BEDTOOLS
$ wget https://github.com/arq5x/bedtools2/releases/download/v2.30.0/bedtools-2.30.0.tar.gz
$ tar -zxvf bedtools-2.30.0.tar.gz
$ cd bedtools2
$ make
$ cd bin
$ ./bedtools
# Check that it works
bedtools is a powerful toolset for genome arithmetic.
Version: v2.30.0
About: developed in the quinlanlab.org and by many contributors worldwide.
Docs: http://bedtools.readthedocs.io/
Code: https://github.com/arq5x/bedtools2
Mail: https://groups.google.com/forum/#!forum/bedtools-discuss
Usage: bedtools <subcommand> [options]
The bedtools sub-commands include:
https://bedtools.readthedocs.io/en/latest/content/installation.html
2. SAMTOOLS
$ wget https://github.com/samtools/samtools/releases/download/1.16.1/samtools-1.16.1.tar.bz2
$ tar -vxjf samtools-1.16.1.tar.bz2
$ cd samtools-1.16.1
$ make
$ ./samtools
# Check that it works
Program: samtools (Tools for alignments in the SAM format)
Version: 1.16.1 (using htslib 1.16)
Usage: samtools <command> [options]
Commands:
-- Indexing
dict create a sequence dictionary file
faidx index/extract FASTA
fqidx index/extract FASTQ
index index alignment
http://www.htslib.org/download/
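To confirm that the binary you just built is the release you meant to install, you can pull the version string out of the banner. This is a rough sketch: `reported_version` is a hypothetical helper name, and it assumes the banner contains a line whose first field is `Version:`, as in the bedtools, samtools, and bcftools output shown in this post.

```shell
# Read a tool's banner on stdin and print the value of its first "Version:" line.
reported_version() {
    awk '$1 == "Version:" { print $2; exit }'
}

# Sample banner copied from the samtools output above.
banner='Program: samtools (Tools for alignments in the SAM format)
Version: 1.16.1 (using htslib 1.16)'

printf '%s\n' "$banner" | reported_version
```

Against a real binary this would look like `./samtools 2>&1 | reported_version`; the `2>&1` is there because some tools print their usage banner to standard error.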
3. BCFTOOLS
$ wget https://github.com/samtools/bcftools/releases/download/1.16/bcftools-1.16.tar.bz2
$ tar -vxjf bcftools-1.16.tar.bz2
$ cd bcftools-1.16
$ make
$ ./bcftools
# Check that it works
Program: bcftools (Tools for variant calling and manipulating VCFs and BCFs)
Version: 1.16 (using htslib 1.16)
Usage: bcftools [--version|--version-only] [--help] <command> <argument>
Commands:
-- Indexing
index index VCF/BCF files
-- VCF/BCF manipulation
annotate annotate and edit VCF/BCF files
concat concatenate VCF/BCF files from the same set of samples
convert convert VCF/BCF files to different formats and back
head view VCF/BCF file headers
http://www.htslib.org/download/
4. HTSlib
$ wget https://github.com/samtools/htslib/releases/download/1.16/htslib-1.16.tar.bz2 -O htslib.tar.bz2
$ tar -xjvf htslib.tar.bz2
$ cd htslib-1.16
$ make
$ ./htsfile --version
# Check that it works (HTSlib is a library and has no htslib executable; it builds htsfile, bgzip, and tabix)
http://www.htslib.org/download/
Export To Path And Refresh
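Adding the tool directories to the PATH environment variable saves you the trouble of typing an absolute path or changing into the tool's directory every time you run a tool.
Add the lines below to ~/.profile, replacing each directory with the location where you actually built the tool, and then reload the file.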
export PATH="$PATH:/usr/bin/bcftools-1.16"
export PATH="$PATH:/usr/bin/samtools-1.16.1"
export PATH="$PATH:/usr/bin/htslib-1.16"
source ~/.profile
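Reloading the profile several times appends the same directories to PATH again and again. One way around that is a small guard function; the sketch below is an illustration only. `path_append` is a hypothetical helper, and the `/data/Tools/...` directories are assumed locations based on the working directory shown earlier, not paths confirmed by the original post.

```shell
# Append a directory to PATH unless it is already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

# Hypothetical install locations; replace with your own build directories.
path_append /data/Tools/samtools-1.16.1
path_append /data/Tools/bcftools-1.16
path_append /data/Tools/samtools-1.16.1   # second call is a no-op
export PATH
```

Wrapping each directory in colons before matching avoids false hits when one directory name is a prefix of another.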
5. SICKLE
$ wget https://github.com/najoshi/sickle/archive/refs/tags/v1.33.tar.gz -O sickle-1.33.tar.gz
$ tar -zxvf sickle-1.33.tar.gz
$ cd sickle-1.33
$ make
$ ./sickle
# Check that it works
Usage: sickle <command> [options]
Command:
pe paired-end sequence trimming
se single-end sequence trimming
https://github.com/najoshi/sickle/releases
6. BWA MEM2
$ git clone https://github.com/bwa-mem2/bwa-mem2
$ cd bwa-mem2
$ git submodule init
$ git submodule update
# Compile and run
$ make
$ ./bwa-mem2
# Check that it works
Looking to launch executable "/data/Tools/align/bwa-mem2/./bwa-mem2.avx2", simd = .avx2
Launching executable "/data/Tools/align/bwa-mem2/./bwa-mem2.avx2"
Usage: bwa-mem2 <command> <arguments>
Commands:
index create index
mem alignment
version print version number
https://github.com/bwa-mem2/bwa-mem2
Difference between mem and mem2
There is a post comparing the performance of mem and mem2, which may be a useful reference.
https://oyat.nl/bwa2/
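The linked post reports that `bwa mem` and `bwa-mem2 mem` take the same command-line syntax. The dry-run sketch below illustrates that by printing the command for either aligner without running it. `alignment_command` is a hypothetical helper, and `ref.fa`, `reads_1.fq`, and `reads_2.fq` are placeholder file names, not files from this post.

```shell
# Build (but do not run) an alignment command for the given aligner binary.
# All file names passed below are hypothetical placeholders.
alignment_command() {
    aligner=$1   # "bwa" or "bwa-mem2"
    threads=$2   # value for the -t option
    ref=$3       # reference (index prefix)
    shift 3      # remaining arguments are the FASTQ files
    printf '%s mem -t %s %s %s\n' "$aligner" "$threads" "$ref" "$*"
}

alignment_command bwa      4 ref.fa reads_1.fq reads_2.fq
alignment_command bwa-mem2 4 ref.fa reads_1.fq reads_2.fq
```

Only the binary name differs between the two printed lines. Note that the reference still has to be indexed with the matching tool (`bwa index` or `bwa-mem2 index`) before aligning, since the two tools write different index files.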
7. GATK4
$ wget https://github.com/broadinstitute/gatk/releases/download/4.3.0.0/gatk-4.3.0.0.zip
$ unzip gatk-4.3.0.0.zip
$ cd gatk-4.3.0.0
$ ./gatk
# Check that it works
Usage template for all tools (uses --spark-runner LOCAL when used with a Spark tool)
gatk AnyTool toolArgs
Usage template for Spark tools (will NOT work on non-Spark tools)
gatk SparkTool toolArgs [ -- --spark-runner <LOCAL | SPARK | GCS> sparkArgs ]
Getting help
gatk --list Print the list of available tools
gatk Tool --help Print help on a particular tool
Configuration File Specification
--gatk-config-file PATH/TO/GATK/PROPERTIES/FILE
gatk forwards commands to GATK and adds some sugar for submitting spark jobs
--spark-runner <target> controls how spark tools are run
valid targets are:
LOCAL: run using the in-memory spark runner
SPARK: run using spark-submit on an existing cluster
--spark-master must be specified
--spark-submit-command may be specified to control the Spark submit command
arguments to spark-submit may optionally be specified after --
GCS: run using Google cloud dataproc
commands after the -- will be passed to dataproc
--cluster <your-cluster> must be specified after the --
spark properties and some common spark-submit parameters will be translated
to dataproc equivalents
--dry-run may be specified to output the generated command line without running it
--java-options 'OPTION1[ OPTION2=Y ... ]' optional - pass the given string of options to the
java JVM at runtime.
Java options MUST be passed inside a single string with space-separated values.
--debug-port <number> sets up a Java VM debug agent to listen to debugger connections on a
particular port number. This in turn will add the necessary java VM arguments
so that you don't need to explicitly indicate these using --java-options.
--debug-suspend sets the Java VM debug agent up so that the run get immediatelly suspended
waiting for a debugger to connect. By default the port number is 5005 but
can be customized using --debug-port
https://gatk.broadinstitute.org/hc/en-us/articles/360036194592-Getting-started-with-GATK4
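The sections above unpack three archive formats: `.tar.gz`, `.tar.bz2`, and `.zip`. If you script the installs, a small dispatcher keyed on the file extension keeps the flags straight. This is a sketch under that assumption; `extract_command` is a hypothetical helper that only prints the command rather than running it.

```shell
# Print the extraction command used in this post for a given archive name.
extract_command() {
    case "$1" in
        *.tar.gz|*.tgz) printf 'tar -zxvf %s\n' "$1" ;;
        *.tar.bz2)      printf 'tar -xjvf %s\n' "$1" ;;
        *.zip)          printf 'unzip %s\n' "$1" ;;
        *)              printf 'unknown archive type: %s\n' "$1" >&2; return 1 ;;
    esac
}

extract_command samtools-1.16.1.tar.bz2
extract_command gatk-4.3.0.0.zip
```

To actually unpack a file you could pass the printed line to the shell, or replace the `printf` calls with the commands themselves.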
That covers how to install the tools most commonly used in bioinformatics analysis.
It is worth exploring the various options and features of each tool.
See you next time :)