BEIJING, March 4 (Xinhua) -- The National Genomics Data Center (NGDC) has updated its novel coronavirus disease (COVID-19) database, according to the Beijing Institute of Genomics of the Chinese Academy of Sciences (CAS).
In the updated database, quality assessment information for each genome sequence and each variation site is provided in the Released Genome Sequences, Data Statistics and Variation Annotation columns.
The database can also carry out sequence assembly evaluations on relevant reference genomes based on the original sequencing data, thus ensuring the reliability of virus genome reference sequences.
A new column, Variation Frequency, has been added to show dynamically, through a heat map, how the frequency of variation sites changes across time and regions.
Following the update, the database integrates information on 249 virus genome sequences, including the novel coronavirus and viruses suspected to be related to it; 38,047 coronavirus sequences; 302 variations in the human novel coronavirus genome; and 588 novel coronavirus-related papers.
Using the database, researchers carried out a variation analysis of the novel coronavirus genome and obtained detailed information on the degree, region and bases of variation among novel coronavirus strains, between those strains and SARS-CoV, and between those strains and SARS-like coronavirus bat strains, said Bao Yiming, director of the NGDC.
The database has recorded more than 300,000 sequence file downloads and served more than 30,000 visitors from 106 countries and regions, according to CAS.