International Sheep Genomics Consortium
Announcing the RefSeq annotation of sheep ARS-UI_Ramb_v2.0 14th Aug 2021
The new reference assembly for sheep is now annotated! Assembly ARS-UI_Ramb_v2.0 is made of 142 scaffolds, a drop from 2,640 in the 2017 assembly Oar_rambouillet_v1.0. With a contig N50 of 43 Mb, ARS-UI_Ramb_v2.0 is 15 times more contiguous than the first assembly of the Rambouillet breed.
Details of this annotation, including statistics on the annotation products, the input data used in the pipeline and intermediate alignment results, can be found in Annotation Release 104 of ARS-UI_Ramb_v2.0.
Virtual conference for the ISGC and the IGGC, 9-11 June 2021: Abstracts and Presentations
ISGC_IGGC 2021 meeting: abstracts and presentations - July 2021
Background to ISGC
The International Sheep Genomics Consortium (ISGC) is a partnership of scientists and funding agencies from Australia, Austria, Brazil, China, Finland, France, Germany, Greece, India, Iran, Israel, Italy, Kenya, New Zealand, Norway, Saudi Arabia, Spain, Switzerland, Turkey, United Kingdom and United States to develop public genomic resources that will help researchers find genes associated with production, quality and disease traits in sheep.
The project commenced informally in 2002 with the creation of a high quality ovine BAC library, and was built on an existing collaboration for the International Mapping Flock that was created nearly a decade earlier.
This work has continued and is best known for the initial sequencing of the sheep genome and the creation of several SNP chip arrays, specifically the publicly available Illumina 50K and Illumina 15K SNP chips. The ISGC was also involved in the creation of the Illumina HD 600K chip, which is available upon request (see contacts). However, its major ongoing function has been the sequencing and annotation of the sheep genome. This includes projects such as FAANG and the SheepGenomesDB, commonly called the 1000 genomes sheep project.
Sheep Genome Assemblies
Please be aware that the various sheep genome assemblies are labelled differently in the different repositories. This has significant implications when identifying SNPs and other features in published papers. The initial assembly Oar_v1.0 was used to build the 50K chip and is still available at UCSC labelled as ISGC Ovis_aries_1.0. However, the three assemblies listed below are those that most published work has utilised.
Oar_v4.0 In 2015 the ISGC released Oar_v4.0, in which long-read technology (PacBio RSII) was used to improve the Oar_v3.1 assembly.
Oar_rambouillet_v1.0 In 2017 the Baylor College of Medicine Human Genome Sequencing Center released a genome assembly from the Rambouillet breed. The assembly used a combination of Illumina short reads and PacBio RSII long reads.
Please note: this is not the final version. An update that uses Oxford Nanopore long reads to complement the assembly is expected in early 2020.
In addition, annotation of the Rambouillet (OAR_USU_Benz2616) genome is underway via the Ovine FAANG project, which is led by Brenda Murdoch (University of Idaho) and supported by the National Institute of Food and Agriculture, U.S. Department of Agriculture, award number USDA-NIFA-2017-67016-26301.
Global statistics (NCBI) of the three Ovis aries (sheep) genome assemblies
The Sheep Genomes Database is funded by the USDA AFRI to provide the sheep genomics research community with a genomes hub. It is an initiative of the International Sheep Genomics Consortium and extends the consortium's achievement in building and releasing the sheep reference genome assembly v3.1.
The Sheep Genomes Database has the following three key objectives:
Results from Run2:
Summary of animals in Run 2
ISGC SNP chip array genome positions
The SNPs on the consortium arrays (Illumina 15K, 50K and HD chips) have been mapped to Oar_rambouillet_v1.0. Probe sequences were taken from the Illumina manifests and mapped onto the Rambouillet genome (GCA_002742125.1) using bwa mem v0.7.17-r1188 with default settings (indels were ignored). For each SNP, a probe pair was constructed by taking AlleleA_ProbeSeq and appending either the reference or the alternative allele. Only probe pairs that passed the following filters were accepted.
The arrays were also mapped to Oar_v3.1 and Oar_v4 to enable comparison of this mapping approach with the positions reported by NCBI and Ensembl. SNP names, positions and alleles from the consortium arrays are available on Figshare.
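To illustrate the probe-pair construction described above, here is a minimal Python sketch. It is not the consortium's actual script. The function names, the FASTA naming scheme (`_ref`/`_alt` suffixes) and the example SNP are hypothetical; only the AlleleA_ProbeSeq field and the idea of appending the reference or alternative allele come from the text. Strand handling and the acceptance filters are not covered.

```python
from typing import Iterable, Tuple


def build_probe_pair(allele_a_probe_seq: str, ref_allele: str, alt_allele: str) -> Tuple[str, str]:
    """Return (reference probe, alternative probe) for one SNP.

    Each probe is the AlleleA_ProbeSeq from the manifest with one
    allele appended at its 3' end (assumed layout for this sketch).
    """
    probe = allele_a_probe_seq.strip().upper()
    return probe + ref_allele.upper(), probe + alt_allele.upper()


def probe_pairs_to_fasta(snps: Iterable[Tuple[str, str, str, str]]) -> str:
    """Format (snp_name, AlleleA_ProbeSeq, ref, alt) tuples as FASTA text.

    The resulting text could be saved to a file and passed to an aligner.
    The '_ref' / '_alt' record suffixes are a hypothetical convention.
    """
    lines = []
    for name, probe_seq, ref, alt in snps:
        ref_probe, alt_probe = build_probe_pair(probe_seq, ref, alt)
        lines.append(f">{name}_ref")
        lines.append(ref_probe)
        lines.append(f">{name}_alt")
        lines.append(alt_probe)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up SNP name and sequence, for illustration only.
    example = [("example_snp_1", "ACGTTGCA", "A", "G")]
    print(probe_pairs_to_fasta(example), end="")
```

Writing both probes of a pair to the same FASTA file keeps them adjacent in the alignment output, which makes it straightforward to compare how the reference and alternative versions of each probe map.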