Genomes and Reads
To help you get started with CRAC quickly, we provide pre-computed indexes. You do not need to build them yourself; you only have to download them.
The available indexes are:
- Human genome (GRCh37): GRCh37.ssa (1.9 GiB) – GRCh37.conf (295 B)
- Mouse genome (mm9): mm9.ssa (1.6 GiB) – mm9.conf (267 B)
- Drosophila genome (dmel5): dmel5.ssa (79 MiB) – dmel5.conf (81 B)
- Chimpanzee genome (panTro2): panTro2.ssa (2.0 GiB) – panTro2.conf (317 B)
- Orangutan genome (ponAbe2): ponAbe2.ssa (1.9 GiB) – ponAbe2.conf (305 B)
- Rhesus macaque genome (rheMac2): rheMac2.ssa (1.8 GiB) – rheMac2.conf (259 B)
A CRAC index consists of two files whose names differ only by their extension. The .ssa file is the index itself, while the .conf file records the name and length of each chromosome.
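Because both files are needed, it can be worth checking that a downloaded index is complete before using it. The following Python sketch is only an illustration and is not part of CRAC: the helper names `index_files` and `missing_index_files` are hypothetical, and the only assumption taken from this page is that the two files share a base name and end in .ssa and .conf.

```python
from pathlib import Path


def index_files(prefix):
    """Return the (.ssa, .conf) paths for an index prefix such as 'GRCh37'.

    Hypothetical helper: it only encodes the naming rule described above.
    """
    base = Path(prefix)
    # Append the extension to the full name so that a prefix containing
    # a dot (e.g. 'genome.v2') is not truncated.
    ssa = base.with_name(base.name + ".ssa")
    conf = base.with_name(base.name + ".conf")
    return ssa, conf


def missing_index_files(prefix):
    """List the index files that are not present on disk."""
    return [path for path in index_files(prefix) if not path.is_file()]


if __name__ == "__main__":
    missing = missing_index_files("GRCh37")
    if missing:
        print("Incomplete index, missing:", ", ".join(str(p) for p in missing))
    else:
        print("Both index files are present.")
```

Run from the directory that holds the downloaded files, the script reports which of the two files, if any, still has to be fetched.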
In silico reads
In our article, we evaluated CRAC on several simulated datasets. These datasets were generated by mutating a reference genome and then simulating RNA sequencing with FluxSimulator.
Each simulated dataset consists of several files: a FASTA file containing the read sequences, a BED file giving the genomic position (on the reference genome) of each read, an INFO file listing, for each mutation, the reads that contain it, and an ERR file describing the sequencing errors in each read.
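As an illustration of how the BED file can be used, the Python sketch below reads read positions from BED lines. It assumes the standard tab-separated BED layout (chromosome, 0-based start, exclusive end, optional name) and that the name column identifies the read; the files in these datasets may carry additional columns, which this sketch ignores. `BedRecord`, `parse_bed_line` and `read_bed` are hypothetical names, and the INFO and ERR files are not handled because their layout is not described here.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (standard BED convention)
    end: int    # exclusive
    name: str   # assumed here to hold the read identifier


def parse_bed_line(line):
    """Parse one BED line, keeping only the first four columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("BED line needs at least 3 columns: %r" % line)
    name = fields[3] if len(fields) > 3 else ""
    return BedRecord(fields[0], int(fields[1]), int(fields[2]), name)


def read_bed(lines):
    """Yield a BedRecord per data line, skipping blanks and header lines."""
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        yield parse_bed_line(line)


if __name__ == "__main__":
    example = ["chr1\t100\t150\tread_1\n", "chr2\t2000\t2075\tread_2\n"]
    for record in read_bed(example):
        print(record.name, record.chrom, record.start, record.end)
```

The same `read_bed` function accepts an open file handle in place of the example list, since it only iterates over lines.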