GBS Barcode Generator
A major update (v2.0) of the barcode-generating algorithm was implemented on 12-6-2012. It brings 100x faster performance and less variation in nucleotide distribution, but also a different order and composition of barcodes compared to the previous version (1.0a). If for whatever reason you require output from the previous version, please select the radio button at the bottom of the form.
Genotyping by sequencing (GBS) is a simple, highly multiplexed system for constructing reduced-representation libraries for the Illumina next-generation sequencing platform, developed in the Buckler Lab at Cornell. Key features of the system are reduced sample handling, fewer PCR and purification steps, no size fractionation and inexpensive barcoding. GBS uses restriction enzymes to reduce genome complexity and avoid the repetitive fraction of the genome.
One of the difficulties in developing this technology was generating a large set of barcodes that yield high-quality libraries. Because all sequenced fragments contain an enzyme recognition site, using barcodes of a single length leads to identical nucleotides at the same positions directly after the barcode in every read.
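The effect can be illustrated with a minimal Python sketch. The cut-site remnant `CAGC`, the helper `base_counts` and the example barcodes are all made up for illustration; they are not output of the script. The sketch counts which bases appear at one read position when each read starts with a barcode followed by the same remnant.

```python
from collections import Counter

REMNANT = "CAGC"  # assumed cut-site remnant, for illustration only


def base_counts(barcodes, position, remnant=REMNANT):
    """Count the bases seen at a 0-based read position across all barcodes."""
    reads = [bc + remnant for bc in barcodes]
    return Counter(read[position] for read in reads if position < len(read))


# Hypothetical barcode sets: one fixed length versus mixed lengths.
fixed = ["ACGT", "TGCA", "GATC", "CTAG"]
mixed = ["ACGT", "TGCAT", "GATCGT", "CTAGACG"]

print(base_counts(fixed, 4))  # every read shows the same base at this position
print(base_counts(mixed, 4))  # the remnant is staggered, so the bases differ
```

With fixed-length barcodes the first base of the remnant always lands at the same read position, whereas mixed lengths spread it over several positions.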
In the context of my PhD project at NIOO-KNAW I followed a course at the Buckler Lab. I decided to contribute by developing a script for generating barcoded adapters that satisfy the following criteria:
- Barcodes (ligated to restriction fragments) do not recreate the enzyme restriction site.
- Every barcode differs from all other barcodes by at least 3 mutational steps.
- Barcode length can vary between 4 and 10 (possibly more) bases.
- The balance of nucleotides at every position in the barcodes is maintained as much as possible.
- Mononucleotide runs are not allowed by default; the maximum run length can be set by a parameter.
- Shorter barcodes do not nest in longer barcodes, thus avoiding confusion in downstream bioinformatics.
- The script produces identical output for identical queries, thus allowing for standardization across labs.
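Several of these criteria can be checked on an existing barcode set with a short Python sketch. This is not the script's own algorithm: the function names, the remnant `CAGC` and the site `GCAGC` are hypothetical, "mutational steps" is interpreted here as Levenshtein edit distance, and "nesting" as a shorter barcode being a prefix of a longer one. Nucleotide balance per position is not covered.

```python
from itertools import combinations


def longest_run(seq):
    """Length of the longest mononucleotide run in seq (0 for an empty string)."""
    if not seq:
        return 0
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def edit_distance(a, b):
    """Levenshtein distance: substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def check_barcodes(barcodes, remnant, site, min_distance=3, max_run=1):
    """Return a list of violations found in the set (empty if none)."""
    problems = []
    for bc in barcodes:
        # Assumption: the site is recreated if it occurs in barcode + remnant.
        if site in bc + remnant:
            problems.append(f"{bc}: recreates site {site}")
        if longest_run(bc) > max_run:
            problems.append(f"{bc}: mononucleotide run longer than {max_run}")
    for a, b in combinations(barcodes, 2):
        short, long_ = sorted((a, b), key=len)
        # Assumption: "nested" means the shorter barcode is a prefix of the longer.
        if len(short) < len(long_) and long_.startswith(short):
            problems.append(f"{short}: nested in {long_}")
        if edit_distance(a, b) < min_distance:
            problems.append(f"{a}/{b}: fewer than {min_distance} steps apart")
    return problems


# Hypothetical input with one violation of each kind.
for problem in check_barcodes(["ACGT", "ACGTCA", "ACGA", "TTGCA", "CTAG"],
                              remnant="CAGC", site="GCAGC"):
    print(problem)
```

The pairwise loop compares every barcode with every other one, which is fine for checking a finished set but would be slow as the core of a generator.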
Generating barcodes usually takes less than 5 minutes. Run time depends mainly on the number of possible barcode combinations, which is determined by the maximum barcode length. The script has been tested for barcodes up to length 10.
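To give a sense of scale, the following Python sketch counts the raw candidate sequences, assuming the search space is every sequence over A, C, G and T for each length from 4 up to the maximum. The function name `candidate_count` is hypothetical, and how the script actually enumerates or prunes candidates is not described here.

```python
def candidate_count(min_len=4, max_len=10):
    """Number of sequences over a 4-letter alphabet with length min_len..max_len."""
    return sum(4 ** n for n in range(min_len, max_len + 1))


print(candidate_count(4, 8))   # 87296
print(candidate_count(4, 10))  # 1398016
```

Each extra base multiplies the number of candidates of the longest length by four, which is why the maximum barcode length dominates the run time.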
Please file tickets for all possible bugs and feature requests here.
For questions and comments:
This software is provided “as is,” without warranty of any kind, express or implied. In no event shall Deena Bioinformatics or Thomas van Gurp be held liable for any direct, indirect, incidental, special or consequential damages arising out of the use of or inability to use this software.