Let’s suppose you have already downloaded CRAC (if not, it is not too late!). We now explain how to install it and how to launch it, without going into detail.
You just need to install the package using a dedicated program on your
distribution or by typing
dpkg -i package-name, where package-name
must be replaced by the name of your package. This will install
crac as well as
crac-index (to create indexes working with CRAC) on your system.
From the source code
- Unpack the archive.
- Enter the directory and run the configure script.
- If everything went fine, build the software.
- You may want to check that everything is OK by running the test suite.
- Finally, you can install the software on your system.
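The steps above can be sketched as a small shell function. This is only a sketch under assumptions: it supposes a gzip-compressed tar archive and a conventional configure/make build, and the `make`, `make check` and `make install` targets as well as the `run` and `build_from_source` helpers are illustrative, not taken from the CRAC documentation. Only the configure step is mentioned in this guide.

```shell
# Print each command, and execute it unless DRY_RUN is set to 1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Hypothetical helper: $1 is the downloaded archive, $2 the directory
# it unpacks to. The make targets below are assumed, not documented here.
build_from_source() {
    archive=$1
    srcdir=$2
    run tar -xzf "$archive" &&
    run cd "$srcdir" &&
    run ./configure &&
    run make &&
    run make check &&
    run make install    # may require root privileges
}
```

Setting DRY_RUN=1 before calling build_from_source prints the commands without executing them, which is a cheap way to review them first.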
If the configure step failed, that may be due to a missing library. In
particular, zlib is needed. On a Debian or Debian-like system,
you’ll need to install the zlib development package.
CRAC relies on a pre-computed index of the genome, as Bowtie or BWA do. The first step is therefore to build such an index, unless you already have one.
If you still need to index your genome, here is how to do it:
The crac-index program creates an index for a genome.
To create such an index, run a command like this one:
crac-index index myIndex sequence1.fa sequence2.fa sequence3.fa
The first parameter (index) specifies that we want to create an index.
The second parameter (myIndex) is the name of the index to be created.
The following parameters are FASTA or multi-FASTA files containing the
sequences to be indexed.
The creation of the index always generates two files: a .ssa file and a
.conf file. Please note that the extensions must not be provided,
either to crac-index or to crac.
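Because an index always consists of a .ssa and a .conf file, a script can test for both before deciding whether indexing is still needed. The following is a minimal sketch: `ensure_index` and the `CRAC_INDEX_BIN` variable are hypothetical names introduced here for illustration, and only the `crac-index index` invocation shown above is used.

```shell
# Build a CRAC index only if its two files are not already present.
# CRAC_INDEX_BIN is a hypothetical override (default: crac-index),
# convenient for a dry run with echo.
ensure_index() {
    index=$1    # index name, without extension
    shift       # remaining arguments: FASTA or multi-FASTA files
    if [ -f "$index.ssa" ] && [ -f "$index.conf" ]; then
        echo "index $index already exists, nothing to do"
        return 0
    fi
    "${CRAC_INDEX_BIN:-crac-index}" index "$index" "$@"
}
```

For example, `ensure_index myIndex sequence1.fa sequence2.fa` would only launch crac-index when myIndex.ssa or myIndex.conf is missing.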
If needed, the original sequences can be recovered using the CRAC index. Therefore you can delete the original FASTA files to save space. This recovery can be done using the following command:
crac-index get sequences.fa myIndex
This will output all the sequences indexed in myIndex.ssa in the file sequences.fa.
Once an index has been built, CRAC can be used with that index. CRAC must always be launched with at least three parameters:
-i: the name of the index (e.g. myIndex; recall that the extension must not be provided!)
-r: the name of the FASTA or FASTQ file containing the reads (the input file may also be compressed using gzip)
-k: the length of the k-mer to be used; we recommend setting k to 22 for the human genome for better accuracy.
The value of
k is very important for the algorithm. You must not
set it too low, otherwise the results will be of no use. It must
be chosen to ensure (as much as possible) that a k-mer has a very high
probability of occurring only once on the genome.
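To get a feeling for why small values of k are unsuitable, one can estimate how often a fixed k-mer is expected to appear by chance. The sketch below assumes a deliberately crude model (uniform, independent bases, a single strand), under which the expected number of occurrences in a sequence of length G is about G / 4^k. It is a back-of-the-envelope aid, not the way CRAC itself works; the function name and the genome length of 3.1 billion bases are assumptions made for illustration.

```shell
# Expected number of chance occurrences of one fixed k-mer in a random
# sequence of the given length (uniform, independent bases assumed).
expected_kmer_hits() {
    genome_length=$1
    kmer_length=$2
    awk -v g="$genome_length" -v k="$kmer_length" \
        'BEGIN { printf "%.3g\n", g / (4 ^ k) }'
}

# Assumed human-sized genome of about 3.1 billion bases.
for k in 12 16 22; do
    printf 'k=%s expected hits: %s\n' "$k" "$(expected_kmer_hits 3100000000 "$k")"
done
```

Under this model a 12-mer is expected well over a hundred times, whereas a 22-mer is expected far less than once. Real genomes contain repeats, so the true number of occurrences can only be higher than this estimate.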
You may also want some output to be created to know what was mapped,
what was not, and where. CRAC can produce a SAM/BAM file by specifying the
name of the SAM output file using the -o option.
As an example, CRAC can be launched with the following parameters:
crac -i myIndex -k 22 -r reads_1.fastq reads_2.fastq --bam -o output.bam --nb-threads 10
In that example, CRAC is launched on the genome indexed in myIndex, with 22-mers,
on the paired-end reads stored in reads_1.fastq and reads_2.fastq.
The output is written to the output.bam file and the program runs in
parallel on 10 threads.
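When the same launch is repeated for several samples, it can help to wrap it in a function that checks the read files first. This is a sketch only: `run_crac_paired` and the `CRAC_BIN` variable are hypothetical names, and the function uses no option beyond those shown in the example above.

```shell
# Launch CRAC on one pair of read files after checking they are readable.
# CRAC_BIN is a hypothetical override (default: crac); setting it to
# echo prints the arguments instead of running the program.
run_crac_paired() {
    index=$1; kmer=$2; threads=$3; out=$4; reads1=$5; reads2=$6
    for f in "$reads1" "$reads2"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f" >&2
            return 1
        fi
    done
    "${CRAC_BIN:-crac}" -i "$index" -k "$kmer" -r "$reads1" "$reads2" \
        --bam -o "$out" --nb-threads "$threads"
}
```

A call such as `run_crac_paired myIndex 22 10 output.bam reads_1.fastq reads_2.fastq` then corresponds to the command line shown above.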