Abstract
We report the first isolation and identification of a mouse genomic fragment encoding amino acid sequences for the proα1(I) chain of type I procollagen. The DNA sequence of eight coding sequences is presented; five of these are 54 bp and three are 108 bp in length. Together these specify 198 amino acids which are 94% homologous to the corresponding bovine proα1(I) chain protein sequences. Each of the eight coding sequences is flanked by appropriate splice-junction sequences that exhibit considerable sequence complementarity to the rat small nuclear U1a RNA. In the 198 codons examined in this mouse genomic clone, the preferred codons for glycine and alanine are GGU (46/67) and GCU (23/30), respectively. This is in contrast to the codon usage reported for the chicken proα1(I) cDNA clone (Fuller and Boedtker, 1981). The examined coding sequences exhibit considerable nucleotide homology in both end-to-end and staggered alignments. Based on an analysis of these homology data, a model is presented for the generation of 108-bp coding sequences from 54-bp units by two successive homologous recombinational events within coding sequences. Alternatively, the 108-bp units may have arisen by precise deletions of an intervening sequence between 54-bp coding sequences. Evidence supporting this alternative is provided by a comparison of the proα1(I) and proα2(I) genes. In the mouse proα1(I) gene, amino acids 856-891 are encoded in a single 108-bp unit; in the chicken proα2(I) gene, these residues are encoded in two 54-bp coding sequences. In addition, the coding sequences for nearly 50% of the α domain are condensed in the proα1(I) gene into a region approximately one half the size occupied by the comparable sequences in the proα2(I) gene.
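The figures quoted above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: it uses just the counts stated in the abstract, and the helper name `count_codons` and the short RNA string passed to it are hypothetical, not sequence data from the clone.

```python
from collections import Counter

# Coding-sequence lengths as stated in the abstract:
# five units of 54 bp and three units of 108 bp.
exon_lengths_bp = [54] * 5 + [108] * 3

total_bp = sum(exon_lengths_bp)   # total coding length in base pairs
total_codons = total_bp // 3      # three nucleotides per codon

# Residues 856-891 inclusive, which the abstract places in one 108-bp unit.
residues_856_891 = 891 - 856 + 1
bp_for_residues = residues_856_891 * 3

# Preferred-codon fractions from the counts given in the abstract.
gly_ggu_fraction = 46 / 67        # GGU among glycine codons
ala_gcu_fraction = 23 / 30        # GCU among alanine codons


def count_codons(rna):
    """Count in-frame codons in an RNA string (hypothetical helper).

    Trailing nucleotides that do not complete a codon are ignored.
    """
    usable = len(rna) - len(rna) % 3
    return Counter(rna[i:i + 3] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(total_bp, total_codons)
    print(residues_856_891, bp_for_residues)
    print(round(gly_ggu_fraction, 3), round(ala_gcu_fraction, 3))
    # Made-up example sequence, for illustration only.
    print(count_codons("GGUGCUGGUCCU"))
```

Under these assumptions, the eight units sum to 594 bp, which corresponds to the 198 codons reported, and the 36 residues from 856 to 891 account for exactly one 108-bp unit or two 54-bp units.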
