Analysis of codon usage bias in the VP1 gene of the human norovirus GII.2 genotype

Haoning Wang, Guanglin Cui, Xiaolong Wang, Minghai Zhang, Jimin Tan, Xingguang Li


This study aimed to investigate the codon usage patterns of all available VP1 gene sequences of the human norovirus (HuNoV) GII.2 genotype, to determine the factors that affect these patterns, and to provide comprehensive details of the characteristics and evolution of the gene. A total of 519 complete VP1 gene sequences of the HuNoV GII.2 genotype with known sampling dates (1971 to 2017) and geographic locations were retrieved from the GenBank nucleotide database of the National Center for Biotechnology Information (NCBI) and analyzed. The percentage compositions of the T, C, A, and G nucleotides were 24.80 ± 0.30, 26.61 ± 0.31, 25.84 ± 0.13, and 22.75 ± 0.17 %, respectively, with C and A relatively more abundant than T and G, and C the most abundant (p < 0.0001). The values of T3s (34.10 ± 0.90 %) and C3s (33.54 ± 0.90 %) were significantly higher than those of A3s (29.98 ± 0.43 %) and G3s (24.13 ± 0.51 %) (p < 0.0001); T3s was the highest of the four and G3s the lowest. Among the 18 most frequently used synonymous codons, six ended with T, five with C, five with A, and two with G, so codons ending with T were the most frequently used. The effective number of codons (ENC) ranged from 51.90 to 54.25 (mean = 52.38 ± 0.43) among the 519 VP1 gene sequences. ENC was significantly correlated with C % and G % (p < 0.01). Codons containing CpG (at codon positions 1 and 2 or 2 and 3) showed the lowest frequencies, while 30, 29, and 2 codons were above, below, and on the mean line, respectively. The first four principal components accounted for 69.11 % of the total variation, with the first, second, third, and fourth principal axes contributing 37.90, 14.83, 9.61, and 6.77 %, respectively. The strains did not cluster by country of isolation or year of sampling. GRAVY (grand average of hydropathicity) values were significantly correlated with T3s, C3s, G3s, GC3s, and ENC (p < 0.01). Both mutation pressure and natural selection contributed to the codon usage bias of the VP1 gene of the HuNoV GII.2 genotype.
GC12s and GC3s were correlated (R² = 0.032; p < 0.0001). The relative neutrality was 3.20 %, whereas the contribution of natural selection was 96.80 %. The VP1 gene exhibits low codon usage bias, which is affected primarily by natural selection, followed by mutation pressure and translational selection.
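
The GC12s versus GC3s comparison reported above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `gc_by_position` and `neutrality_slope` are hypothetical, the position-wise GC content is a simplified version that counts every codon (it does not exclude Met, Trp, or stop codons, as synonymous-site measures usually do), and it assumes the common neutrality-plot convention in which the slope of GC12 regressed on GC3 is read as relative neutrality and one minus the slope as the constraint attributed to natural selection.

```python
def gc_by_position(cds):
    """Return (GC12, GC3) as fractions for an in-frame coding sequence.

    Simplification: all codons are counted, including Met, Trp, and stops.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3      # ignore a trailing partial codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence contains no complete codon")
    gc = [0, 0, 0]                        # G/C counts at codon positions 1, 2, 3
    for i in range(usable):
        if cds[i] in "GC":
            gc[i % 3] += 1
    gc12 = (gc[0] + gc[1]) / (2 * n_codons)
    gc3 = gc[2] / n_codons
    return gc12, gc3


def neutrality_slope(pairs):
    """Least-squares slope and R^2 of GC12 (y) regressed on GC3 (x).

    `pairs` is a list of (gc12, gc3) tuples, one per sequence.
    """
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    sxx = sum((x - mean_x) ** 2 for _, x in pairs)
    syy = sum((y - mean_y) ** 2 for y, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for y, x in pairs)
    if sxx == 0:
        raise ValueError("GC3 does not vary across sequences")
    slope = sxy / sxx
    r2 = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, r2


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not VP1 data.
    toy = ["ATGGCCAAA", "ATGGCGAAG", "ATGGCTAAT", "ATGGCCAAG"]
    slope, r2 = neutrality_slope([gc_by_position(s) for s in toy])
    print(f"slope = {slope:.3f}, R^2 = {r2:.3f}")
    print(f"relative neutrality ~ {100 * slope:.1f} %, "
          f"selection ~ {100 * (1 - slope):.1f} %")
```

Applied to real data, each of the 519 VP1 sequences would contribute one (GC12, GC3) point. A slope near zero, as the reported 3.20 % relative neutrality suggests, would indicate that third-position GC content explains little of the variation at the first two positions.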


Human norovirus; GII.2 genotype; Codon usage bias; Natural selection; Mutation pressure.

