Circlator

A tool to circularize genome assemblies

The algorithm and benchmarks are described in the Genome Biology manuscript. Citation: "Circlator: automated circularization of genome assemblies using long sequencing reads", Hunt et al., Genome Biology 2015 Dec 29;16(1):294. doi: 10.1186/s13059-015-0849-0. PMID: 26714481.

For how to use Circlator, please see the Circlator wiki page.

Installation

Using pip or from source

Circlator has the following dependencies, which need to be installed first (you will need SPAdes or Canu):

Note that you can use the environment variable $CIRCLATOR_SPADES to specify the name of the SPAdes executable. If this environment variable is set, Circlator uses its value. Otherwise, Circlator looks for spades.py in your $PATH.
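
The lookup order described above can be sketched in shell. This is only an illustration of the documented behaviour, not Circlator's own code: the function name resolve_spades and the path /opt/spades/bin/spades.py are hypothetical, and the sketch treats an empty variable the same as an unset one, which is an assumption.

```shell
# Hypothetical helper mirroring the documented lookup order:
# use $CIRCLATOR_SPADES if it is set (and non-empty), otherwise spades.py.
resolve_spades() {
    printf '%s\n' "${CIRCLATOR_SPADES:-spades.py}"
}

# Show which executable name would be used in the current environment,
# and whether the shell can actually find it.
spades_exe="$(resolve_spades)"
if command -v "$spades_exe" >/dev/null 2>&1; then
    echo "SPAdes executable found: $spades_exe"
else
    echo "SPAdes executable not found: $spades_exe"
fi
```

To point Circlator at a specific SPAdes install, you would export the variable before running it, for example `export CIRCLATOR_SPADES=/opt/spades/bin/spades.py` (path shown for illustration only).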

Once the dependencies are installed, install Circlator using pip3:

pip3 install circlator

Alternatively, you can download the latest release from the GitHub repository, or clone the repository. Then run the tests:

python3 setup.py test

If the tests all pass, install:

python3 setup.py install

Using Docker

Circlator can be run in a Docker container. First install Docker, then pull the Circlator image:

docker pull sangerpathogens/circlator

To run it, use a command like the following, substituting your own directories. Here your files are assumed to be stored in /home/ubuntu/data:

docker run --rm -it -v /home/ubuntu/data:/data sangerpathogens/circlator circlator all /data/assembly.fasta /data/reads /data/output_directory 
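
In the command above, -v /home/ubuntu/data:/data makes the host directory visible inside the container as /data, which is why the assembly, reads and output paths all start with /data. As a rough sketch, the host directory can be turned into a parameter. The helper name circlator_docker_cmd is hypothetical, and it only prints the command so you can inspect it before running it yourself. It assumes the directory path contains no spaces.

```shell
# Hypothetical helper: print the docker command for a given host data directory.
# It does not run Docker; copy the printed command or pass it to your shell.
circlator_docker_cmd() {
    host_dir="$1"   # absolute path on the host holding the assembly and reads
    printf 'docker run --rm -it -v %s:/data sangerpathogens/circlator circlator all /data/assembly.fasta /data/reads /data/output_directory\n' "$host_dir"
}

# Example: print the command for a data directory under your home directory.
circlator_docker_cmd "$HOME/data"
```

Use an absolute host path here, since Docker interprets a bare name before the colon as a named volume rather than a directory.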

Verify the dependencies are installed

Check that Circlator can find the dependencies, and that the versions are high enough, by running:

circlator progcheck

See the help for progcheck for more details.
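
If Circlator itself is not installed yet, a quick shell check can show which tools are missing from your $PATH. This is a minimal sketch and not a substitute for circlator progcheck, since it does not check versions. The function name missing_tools is hypothetical, and the executable names passed to it are assumptions based on the usual command names of the tools listed in the references.

```shell
# Hypothetical helper: print each named executable that is not found in $PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed command names for BWA, MUMmer, Prodigal and SAMtools.
# Add spades.py or canu depending on which assembler you use.
missing_tools bwa nucmer prodigal samtools
```

An empty output means every listed executable was found.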

Usage

Please read the Circlator wiki page for usage instructions.

References

BWA: Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 (2013).

MUMmer: Kurtz, S. et al. Versatile and open software for comparing large genomes. Genome Biol. 5, R12 (2004).

Prodigal: Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).

SAMtools: Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–9 (2009).

SPAdes: Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–77 (2012).