Converts GFF3 files from Prokka into a format suitable for submission to EMBL.




Submitting annotated genomes to EMBL is a difficult and time-consuming process. This software converts GFF3 files from Prokka, the most commonly used prokaryote annotation tool, into a format suitable for submission to EMBL. It has been used to prepare more than 30% of all annotated genomes in EMBL/GenBank.

N.B. This implements some EMBL-specific conventions and is not a generic conversion tool. It is also not a validator, so you must pass in parameters that are acceptable to EMBL.
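Because the tool does not validate its inputs, a small pre-flight check can catch obvious slips before a conversion is run. The following is a minimal sketch only: `check_arguments` is a hypothetical helper, not part of GFF3toEMBL, and its checks are assumptions drawn from the tool's help text (genome type is linear or circular, taxon ID and translation table are whole numbers). It does not implement EMBL's own acceptance rules.

```python
# Hypothetical pre-flight checks; these are NOT EMBL's validation rules.
# The allowed genome types are the two values named in the tool's help text.
ALLOWED_GENOME_TYPES = {"linear", "circular"}


def _is_whole_number(value):
    """Return True if value is written using ASCII digits only."""
    text = str(value)
    return text.isascii() and text.isdigit()


def check_arguments(taxonid, genome_type, translation_table):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if not _is_whole_number(taxonid):
        problems.append(f"taxonid should be a whole number, got {taxonid!r}")
    if genome_type not in ALLOWED_GENOME_TYPES:
        problems.append(f"genome_type should be linear or circular, got {genome_type!r}")
    if not _is_whole_number(translation_table):
        problems.append(f"translation_table should be a whole number, got {translation_table!r}")
    return problems


# Example: print any problems found for one set of arguments.
for problem in check_arguments("1234", "circular", 11):
    print(problem)
```

An empty result only means these basic checks passed; the values still need to be acceptable to EMBL.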


GFF3toEMBL has the following dependencies:

Required dependencies

There are a number of ways to install GFF3toEMBL and details are provided below. If you encounter an issue when installing GFF3toEMBL please contact your local system administrator. If you encounter a bug please log it here or email us at


A Docker container is provided with all of the dependencies set up and installed. To install the container:

docker pull sangerpathogens/gff3toembl

To run the script from within the container on test data (substituting /home/ubuntu/data for your own directory):

docker run --rm -it -v /home/ubuntu/data:/data sangerpathogens/gff3toembl gff3_to_embl --output_filename /data/output_file.embl ABC 123 PRJ1234 ABC /opt/gff3toembl-1.1.0/gff3toembl/tests/data/single_feature.gff
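In the command above, `-v /home/ubuntu/data:/data` mounts the host directory at `/data` inside the container, so the output file must be written under `/data` to appear on the host. The sketch below illustrates that path mapping. `to_container_path` is a hypothetical helper written for this example, with the directory names from the command above as its defaults.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, host_dir="/home/ubuntu/data", mount_point="/data"):
    """Translate a path under host_dir into the matching path under mount_point.

    Raises ValueError if host_path is not inside host_dir, because such a file
    would not be visible through the mounted volume.
    """
    relative = PurePosixPath(host_path).relative_to(PurePosixPath(host_dir))
    return str(PurePosixPath(mount_point) / relative)


# The host file /home/ubuntu/data/output_file.embl is seen by the container here:
print(to_container_path("/home/ubuntu/data/output_file.embl"))
```

The same reasoning applies to input files: a GFF file of your own has to sit inside the mounted directory and be referenced by its `/data/...` path.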

From source

This is for advanced users. The Homebrew recipe, the Dockerfile and the TravisCI dependency installation script all contain steps to set up the dependencies and install the software, so they may be worth consulting for hints.

Running the tests

Run python test


usage: gff3_to_embl [-h] [--authors AUTHORS] [--title TITLE]
                    [--publication PUBLICATION] [--genome_type GENOME_TYPE]
                    [--classification CLASSIFICATION]
                    [--output_filename OUTPUT_FILENAME]
                    [--locus_tag LOCUS_TAG]
                    [--translation_table TRANSLATION_TABLE]
                    [--chromosome_list CHROMOSOME_LIST] [--version]
                    organism taxonid project_accession description file

Converts prokaryote GFF3 annotations to EMBL for ENA submission. Cite

positional arguments:
  organism              Organism
  taxonid               Taxon id
  project_accession     Accession number for the project
  description           Genus species subspecies strain of organism
  file                  GFF3 filename

optional arguments:
  -h, --help            show this help message and exit
  --authors AUTHORS, -i AUTHORS
                        Authors (in the EMBL RA line style)
  --title TITLE, -m TITLE
                        Title of paper (in the EMBL RT line style)
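  --publication PUBLICATION
                        Publication or journal name (in the EMBL RL line
                        style)
  --genome_type GENOME_TYPE, -g GENOME_TYPE
                        Genome type (linear/circular)
  --classification CLASSIFICATION
                        Classification (PROK/UNC/..)
  --output_filename OUTPUT_FILENAME
                        Output filename
  --locus_tag LOCUS_TAG, -l LOCUS_TAG
                        Overwrite the locus tag in the annotation file
  --translation_table TRANSLATION_TABLE
                        Translation table
  --chromosome_list CHROMOSOME_LIST
                        Create a chromosome list file, and use the supplied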
  --version             show program's version number and exit

An example:

gff3_to_embl --authors 'John' --title 'Some title' --publication 'Some journal' \
             --genome_type 'circular' --classification 'PROK' \
             --output_filename /tmp/single_feature.embl --translation_table 11 \
             Organism 1234 'My project' 'My description' gff3toembl/tests/data/single_feature.gff
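When converting many genomes, it can help to drive the command from a script. The following is a minimal sketch that assumes `gff3_to_embl` is installed and on the `PATH`. `build_command` and `run_conversion` are hypothetical helper names, not part of GFF3toEMBL; keyword arguments are passed through as the long options shown in the usage text.

```python
import subprocess


def build_command(organism, taxonid, project_accession, description, gff_file, **options):
    """Build the gff3_to_embl argument list.

    Each keyword argument becomes a long option, e.g. genome_type="circular"
    becomes "--genome_type circular". Option names are not checked here.
    """
    command = ["gff3_to_embl"]
    for name, value in options.items():
        command += [f"--{name}", str(value)]
    # Positional arguments follow in the order given in the usage text.
    command += [organism, str(taxonid), project_accession, description, gff_file]
    return command


def run_conversion(*args, **options):
    """Run gff3_to_embl; subprocess raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(*args, **options), check=True)


# Show the command for the example above without running it.
print(build_command(
    "Organism", 1234, "My project", "My description",
    "gff3toembl/tests/data/single_feature.gff",
    genome_type="circular", classification="PROK", translation_table=11,
))
```

Passing the arguments as a list avoids shell quoting problems with values that contain spaces, such as the description.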

Example data

The directory ‘example_data’ contains an input GFF file and the output file along with the command.


GFF3toEMBL is free software, licensed under GPLv3.


Please report any issues to the issues page or email


If you use this software please cite:

GFF3toEMBL: Preparing annotated assemblies for submission to EMBL
Andrew J. Page, Sascha Steinbiss, Ben Taylor, Torsten Seemann, Jacqueline A. Keane
The Journal of Open Source Software, 1 (6) 2016. doi: 10.21105/joss.00080

Known Issues

This doesn’t work with some versions of GenomeTools on Mac OS X; it appears to work with GenomeTools 1.5.4.