A tool to circularize genome assemblies. The algorithm and benchmarks are described in the Genome Biology manuscript. Citation: "Circlator: automated circularization of genome assemblies using long sequencing reads", Hunt et al, Genome Biology 2015 Dec 29;16(1):294. doi: 10.1186/s13059-015-0849-0. PMID: 26714481.
For how to use Circlator, please see the Circlator wiki page.
Using pip or from source
Circlator has the following dependencies, which need to be installed first (of the two assemblers, you need SPAdes or Canu, or both):
- BWA version >= 0.7.12
- prodigal version >= 2.6
- SAMtools (versions 0.1.9 to 1.3)
- MUMmer version >= 3.23
- Canu and/or SPAdes. SPAdes version 3.6.2 or higher is required, but 3.7.1 is recommended: of all SPAdes versions tested (3.6.2 to 3.9.0), it gave marginally the best results on the NCTC data from the Circlator publication.
Note that you can use the environment variable $CIRCLATOR_SPADES to specify the name of the SPAdes executable. If this environment variable is set, then Circlator uses it. Otherwise, Circlator will look for spades.py in your $PATH.
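The lookup order described above can be illustrated with a short Python sketch. This is not Circlator's own code: the function name `resolve_spades` is hypothetical, and treating an empty `$CIRCLATOR_SPADES` as unset is an assumption made here for simplicity.

```python
import os
import shutil


def resolve_spades(env=None):
    """Return (name, full_path) for the SPAdes executable.

    Hypothetical helper, not part of Circlator. It mirrors the rule in
    the text: use $CIRCLATOR_SPADES if set, otherwise fall back to
    spades.py, and search for that name in $PATH.
    """
    env = os.environ if env is None else env
    # Assumption: an empty value is treated the same as an unset variable.
    name = env.get("CIRCLATOR_SPADES") or "spades.py"
    # shutil.which returns None when the name is not found in the search path.
    full_path = shutil.which(name, path=env.get("PATH"))
    return name, full_path


if __name__ == "__main__":
    name, full_path = resolve_spades()
    if full_path is None:
        print(f"{name} was not found in $PATH")
    else:
        print(f"{name} resolved to {full_path}")
```

Running the script before installing Circlator gives a quick indication of which SPAdes executable would be picked up in the current shell.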
Once the dependencies are installed, install Circlator using pip3:
pip3 install circlator
Alternatively, you can download the latest release from the GitHub repository, or clone the repository. Then run the tests:
python3 setup.py test
If the tests all pass, install:
python3 setup.py install
Circlator can be run in a Docker container. First install Docker, then pull the Circlator image:
docker pull sangerpathogens/circlator
To use it, run a command such as the following (substituting your own directories), where your files are assumed to be stored in /home/ubuntu/data:
docker run --rm -it -v /home/ubuntu/data:/data sangerpathogens/circlator circlator all /data/assembly.fasta /data/reads /data/output_directory
Verify the dependencies are installed
Check that Circlator can find the dependencies, and that the versions are high enough, by running:
circlator progcheck
See the help for progcheck for more details.
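To make "high enough" concrete, here is a minimal Python sketch of a numeric version comparison against the requirements listed in this README. It is not how progcheck is implemented; the names `REQUIRED`, `version_tuple` and `version_ok` are hypothetical, and it assumes plain dotted version strings such as `0.7.12`. It also assumes that the SAMtools upper bound of 1.3 excludes later patch releases, which the README does not state.

```python
def version_tuple(version):
    """Turn '0.7.12' into (0, 7, 12) so versions compare numerically.

    Comparing the raw strings would wrongly rank '0.7.9' above '0.7.12'.
    """
    return tuple(int(part) for part in version.split("."))


# (minimum, maximum) per tool, taken from the dependency list in this README.
# None means no upper bound is stated.
REQUIRED = {
    "bwa": ("0.7.12", None),
    "prodigal": ("2.6", None),
    "samtools": ("0.1.9", "1.3"),
    "mummer": ("3.23", None),
    "spades": ("3.6.2", None),
}


def version_ok(tool, found):
    """Return True if the found version lies within the required range."""
    minimum, maximum = REQUIRED[tool]
    found_version = version_tuple(found)
    if found_version < version_tuple(minimum):
        return False
    # Assumption: anything above the stated maximum is rejected,
    # including patch releases such as 1.3.1.
    if maximum is not None and found_version > version_tuple(maximum):
        return False
    return True


if __name__ == "__main__":
    for tool, found in [("bwa", "0.7.9"), ("bwa", "0.7.12"), ("samtools", "1.3")]:
        status = "OK" if version_ok(tool, found) else "not supported"
        print(f"{tool} {found}: {status}")
```

For the authoritative check, rely on the output of progcheck itself rather than this sketch.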
Please read the Circlator wiki page for usage instructions.
BWA: Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997.
MUMmer: Kurtz, S. et al. Versatile and open software for comparing large genomes. Genome Biol. 5, R12 (2004).
Prodigal: Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).
SAMtools: Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–9 (2009).
SPAdes: Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–77 (2012).