Let's suppose you have already downloaded CRAC (if not, it is not too late!). We now explain how to install it and how to launch it, without going into detail.
You just need to install the package using a dedicated program on your distribution or by typing dpkg -i package-name, where package-name must be replaced by the name of your package. This will install crac as well as crac-index (to create indexes working with CRAC) on your system.
From the source code
- Unpack the archive
- Enter the directory
- If everything went fine, run
- You may want to check everything is ok by running
- Finally, you can install the software on your system by running
If the configure step failed, that may be due to a missing library. In particular, zlib is needed. On a Debian or Debian-like system, you'll need to install
On a Galaxy instance
CRAC relies on a pre-computed index of the genome, as Bowtie or BWA do. The first step is therefore to build such an index, if it is not done yet. Another solution is to download one of the precomputed indexes.
If you still need to index your genome, the procedure is as follows. The crac-index program creates an index for a genome. To create such an index, run a command like this one:
crac-index index myIndex sequence1.fa sequence2.fa sequence3.fa
The first parameter (index) specifies that we want to create an index. The second parameter (myIndex) is the name of the index to be created. The following parameters are FASTA or multi-FASTA files containing the sequences to be indexed.
The creation of the index always generates two files: a .ssa file and a .conf file. Please note that the extensions must not be provided either to crac-index or to
If needed, the original sequences can be recovered using the CRAC index. Therefore you can delete the original FASTA files to save space. This recovery can be done using the following command:
crac-index get sequences.fa myIndex
This will output all the sequences indexed in myIndex.ssa in the file sequences.fa.
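Since an index consists of two files and is always referred to without an extension, it can be handy to check both points before launching a long run. The following is a minimal sketch, not part of CRAC: the function name crac_index_name is hypothetical, and it only assumes the .ssa and .conf files described above.

```shell
# Hypothetical helper (not shipped with CRAC): print the extension-free name
# of an index, and fail if either of its two files is missing.
crac_index_name() {
    name=$1
    # Drop a trailing .ssa or .conf in case a file name was passed.
    name=${name%.ssa}
    name=${name%.conf}
    for ext in ssa conf; do
        if [ ! -f "$name.$ext" ]; then
            echo "missing index file: $name.$ext" >&2
            return 1
        fi
    done
    printf '%s\n' "$name"
}
```

The printed name is the value to pass wherever the index name is expected, whether the caller supplied myIndex or myIndex.ssa.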
Once an index has been built, CRAC can be used with that index. CRAC must always be launched with at least three parameters:
-i: the name of the index (e.g. myIndex; recall that the extension must not be provided!)
-r: the name of the FASTA or FASTQ file containing the reads (the input file may also be compressed using gzip)
-k: the length of the k-mer to be used; we recommend setting k to 22 for the human genome for better accuracy.
The value of k is very important for the algorithm. You must not underestimate it, otherwise the results will be useless. It must be chosen to ensure (as much as possible) that a k-mer has a very high probability of occurring only once in the genome.
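To get a rough feeling for how k relates to genome size, one can estimate how many times a given k-mer would occur by chance. The sketch below assumes a simplified model that is not taken from the CRAC documentation: bases are uniform and independent, so the expected number of chance occurrences in a genome of G bases is about G / 4^k. The function name expected_kmer_hits and the genome size of roughly 3.1 billion bases for human are illustrative assumptions.

```shell
# Expected number of chance occurrences of one k-mer in a genome of the given
# size, under the simplifying assumption of uniform random bases: G / 4^k.
expected_kmer_hits() {
    awk -v g="$1" -v k="$2" 'BEGIN { printf "%.2e\n", g / (4 ^ k) }'
}

# Approximate human genome size (about 3.1 billion bases), for a few values of k.
for k in 14 16 18 20 22; do
    printf 'k=%s  %s\n' "$k" "$(expected_kmer_hits 3100000000 "$k")"
done
```

Under this model the expectation drops below one around k = 16 and is on the order of 10^-4 for k = 22. Real genomes contain repeats, so the estimate is optimistic and is only a lower bound on the k worth considering.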
If the read length is fixed, we strongly recommend specifying it with the -m parameter. CRAC will then be much faster.
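Whether the read length is actually fixed can be checked before choosing a value for -m. The sketch below is a hypothetical helper, not part of CRAC, and it assumes an uncompressed FASTQ file with exactly four lines per record; a gzip-compressed or FASTA input would need to be handled differently.

```shell
# Hypothetical helper: list the distinct read lengths found in an uncompressed
# FASTQ file with four lines per record (the sequence is line 2 of each record).
fastq_read_lengths() {
    awk 'NR % 4 == 2 { print length($0) }' "$1" | sort -n | uniq
}
```

If the function prints a single value, all reads have that length and it can be passed to -m. If it prints several values, the read length is not fixed.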
You may also want some output to be created to know what was mapped, what was not, and where. CRAC can produce a SAM file by specifying the name of the SAM output file using the
As an example, CRAC can be launched with these parameters:
crac -i myIndex -k 22 -r reads.fastq -m 200 --sam output.sam --nb-threads 10
In this example, CRAC is launched on the genome indexed in myIndex, with 22-mers, on the reads stored in reads.fastq. All reads are 200 bp long (or are truncated if longer, or ignored if shorter). The output is written to the output.sam file and the program runs in parallel on 10 threads.