Welcome to the Reference Sequence Browser
Read this page first if you want to make the best use of our app!
This app was developed by the BlueWaltzBio team.
The Reference Sequence Browser (RSB) is an R Shiny application that enables the eDNA community to screen and extract organism-specific data from genetic barcode databases such as NCBI Nucleotide, NCBI Genome, BOLD (Barcode of Life Data System), and publicly available CRUX databases. The app is best used before conducting metabarcoding to assess reference sequence availability, or to download non-duplicative FASTA files from BOLD and NCBI to build local sequence databases for qPCR development. Each database can be searched in its own tab in RSB and has its own quick user guide.
The tool is built to search for many organisms and genetic barcodes simultaneously, and can therefore be used in multiple ways to speed up environmental DNA (eDNA) workflows. Users only need to assemble a list of organisms by scientific name for the tool to search for reference sequences at known barcoding loci (e.g., COI, 16S, 18S, trnL). Only the NCBI Nucleotide search additionally requires barcode-gene names. To make inputting long lists into the app easier, download this CSV template and fill it out with your organism and barcode names. This template can be uploaded within any database tab.
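As an illustration of how such a template can be turned into the two input lists, here is a minimal Python sketch. The column headers `OrganismNames` and `Barcodes` are hypothetical placeholders (check the downloaded template for the real headers), and this is not RSB's own upload code.

```python
import csv
import io

# Hypothetical column headers; the real template may use different ones.
ORGANISM_COLUMN = "OrganismNames"
BARCODE_COLUMN = "Barcodes"


def read_template(text):
    """Return (organisms, barcodes) from CSV text, skipping blanks and repeats."""
    organisms, barcodes = [], []
    for row in csv.DictReader(io.StringIO(text)):
        organism = (row.get(ORGANISM_COLUMN) or "").strip()
        barcode = (row.get(BARCODE_COLUMN) or "").strip()
        if organism and organism not in organisms:
            organisms.append(organism)
        if barcode and barcode not in barcodes:
            barcodes.append(barcode)
    return organisms, barcodes


# The two columns may have different lengths, so blank cells are ignored.
example = (
    "OrganismNames,Barcodes\n"
    "Canis lupus,COI\n"
    "Vulpes vulpes,16S\n"
    "Ursus arctos,\n"
)
organisms, barcodes = read_template(example)
```

Keeping organisms and barcodes as separate lists reflects the fact that every organism is searched against every barcode, rather than row by row.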
The app uses the R packages “rentrez” and “bold” to access the live, up-to-date NCBI and BOLD databases. RSB retrieves and displays the number of reference sequences in one of the aforementioned databases for each combination of an organism AND a barcode. For example, RSB searches for sequences of the COI barcode gene of Canis lupus. Every database tab therefore produces a Coverage Matrix (CM) with organism names as rows and barcoding loci as columns, in which each cell displays the number of reference sequences retrieved. Each tab also includes summary statistics and tab-specific visualizations and tables to help make sense of searches covering numerous species and barcodes.
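The structure of a Coverage Matrix can be sketched in a few lines of Python. This is only an illustration of the organism-by-barcode layout described above, not RSB's implementation: the function names are hypothetical, and the counts are made-up numbers standing in for a live database query.

```python
def build_coverage_matrix(organisms, barcodes, count_sequences):
    """Return {organism: {barcode: count}} for every organism/barcode pair.

    count_sequences is any callable taking (organism, barcode) and returning
    the number of reference sequences found for that combination.
    """
    return {
        organism: {barcode: count_sequences(organism, barcode) for barcode in barcodes}
        for organism in organisms
    }


def organisms_without_coverage(matrix):
    """Return {barcode: [organisms with zero sequences]} as a simple summary."""
    summary = {}
    for organism, row in matrix.items():
        for barcode, count in row.items():
            summary.setdefault(barcode, [])
            if count == 0:
                summary[barcode].append(organism)
    return summary


# Made-up counts used in place of a real database lookup.
FAKE_COUNTS = {
    ("Canis lupus", "COI"): 12,
    ("Canis lupus", "16S"): 3,
    ("Vulpes vulpes", "COI"): 7,
}


def fake_count(organism, barcode):
    return FAKE_COUNTS.get((organism, barcode), 0)


matrix = build_coverage_matrix(["Canis lupus", "Vulpes vulpes"], ["COI", "16S"], fake_count)
missing = organisms_without_coverage(matrix)
```

Passing the lookup in as a function keeps the matrix logic independent of which database is queried, which mirrors how each tab produces the same kind of table from a different source.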
In addition to previewing which sequences are available in the CM tables, users can download the sequences in FASTA format (except from CRUX), which can then easily be imported into various genomics software tools (e.g., Geneious). The RSB BOLD database search allows users to exclude entries that are also present in NCBI Nucleotide from the BOLD CM results, visualizations, and FASTA downloads. This allows users to avoid downloading duplicate sequences from BOLD and NCBI Nucleotide.
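One way to picture the duplicate filter is the Python sketch below. It assumes each BOLD record carries a GenBank accession when the same sequence has also been deposited in NCBI; the record keys (`processid`, `species`, `marker`, `genbank_accession`, `sequence`) and the example records are hypothetical, and RSB's actual filtering may work differently.

```python
def drop_ncbi_duplicates(records):
    """Keep only records with no GenBank accession (assumed absent from NCBI)."""
    return [r for r in records if not (r.get("genbank_accession") or "").strip()]


def to_fasta(records, width=70):
    """Format records as FASTA text, wrapping sequence lines at `width` characters."""
    lines = []
    for r in records:
        lines.append(">{}|{}|{}".format(r["processid"], r["species"], r["marker"]))
        sequence = r["sequence"]
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "".join(line + "\n" for line in lines)


# Made-up example records; identifiers and sequences are illustrative only.
bold_records = [
    {"processid": "EX001", "species": "Canis lupus", "marker": "COI-5P",
     "genbank_accession": "XX000001", "sequence": "ACGTACGTAC"},
    {"processid": "EX002", "species": "Canis lupus", "marker": "COI-5P",
     "genbank_accession": "", "sequence": "TTGACCGTAA"},
]

unique_records = drop_ncbi_duplicates(bold_records)
fasta_text = to_fasta(unique_records)
```

The filtered FASTA text can then be combined with an NCBI download without the same sequence appearing twice.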
Use cases:
RSB can be used to improve the workflow of species-specific qPCR development for eDNA applications (Klymus 1). If you are interested in developing a species-specific qPCR primer, RSB can be used to rapidly create a non-duplicative local sequence database by downloading FASTA files from the NCBI and BOLD tabs.
RSB can also be used to determine which organisms can be detected by metabarcoding, and to what extent, and which metabarcodes fit your study needs. This can be determined by searching the seven metabarcoding databases (16S, Vertebrate 12S, 18S, Plant ITS1, CO1, Fungal ITS2, and trnL) in the CRUX tab, or by searching in the BOLD tab.
Additionally, if you are interested in finding full mitochondrial or chloroplast genomes in NCBI Nucleotide, or entries in NCBI Genome, go to the full genome tab. Guides to the aforementioned processes can be found lower down on this page.
Lastly, we hope this tool may be used to point to taxonomic groups lacking publicly available reference sequences and thus aid in creating more deliberate and specific sequencing efforts.
This R Shiny app was built in part to bridge the gap between eDNA scientists and large genomics databases by providing efficient, high-throughput access to the NCBI, BOLD, and CRUX databases without the user having to write a single line of code. Click on one of the tabs to get started.