sratools
is a useful tool for downloading sequencing data from NCBI.
Installation
You can download sratools
from here on website or you can do it with commandLine
1
curl --output sratoolkit.tar.gz http://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current/sratoolkit.current-mac64.tar.gz
Then extract the contents of the tar file
1
tar -vxzf sratoolkit.tar.gz
Use pwd
to see where your path is, then add it to environment variable
1
export PATH=$PATH:~/bio/software/sratoolkit.3.0.0-mac64/bin
Don’t forget to source
your ~/.bashrc
or ~/.zshrc
, depending on what kind of shell
you’re using.
Then you should enable your Security & Privacy
allowing the permission of software.
Alternatively, you can install sratools
by using brew
1
brew install sratools
Usage
First of all, you should learn how to search in SRA Enterz.
Downloading public data
Prefetch is a part of the SRA toolkit. This program downloads Runs (sequence files in the compressed SRA format) and all additional data necessary to convert the Run from the SRA format to a more commonly used format. Prefetch can be used to correct and finish an incomplete Run download.
Download one Run:
1
prefetch SRR000001
Download a list of Runs:
1
prefetch --option-file SraAccList.txt
faster-dump
and sam-dump
are part of SRA toolkit that can be use to convert protected Runs from compressed SRA to fastq or sam format.
1
faster-dump SRR11180057.sra
You can also do above jobs in one step, avoiding the prefetch
1
faster-dump SRR11180057
Downloading metadata associated with SRA data
SRA Run files do not contain any information about the metadata, such as sample information.
To download metadata for each Run, you can click Send to on the right top of the page.
Alternatively, you can use Run Selector, which provide useful and various filters to refine your search results.