Tag Content
SG ID
SG00007729 
UniProt Accession
Theoretical PI
9.37  
Molecular Weight
138943 Da  
Genbank Nucleotide ID
Genbank Protein ID
Gene Name
Col3a1 
Gene Synonyms/Alias
 
Protein Name
Collagen alpha-1(III) chain 
Protein Synonyms/Alias
Flags: Precursor 
Organism
Mus musculus (Mouse) 
NCBI Taxonomy ID
10090 
Chromosome Location
chr:1;45368383-45406551;1
Function in Stage
Uncertain 
Function in Cell Type
Uncertain 
Probability (GAS) of Function in Spermatogenesis
0.035491387 
The probability was calculated by the GAS algorithm and ranges from 0 to 1. The closer it is to 1, the more likely the gene is to function in spermatogenesis.
Description
Temporarily unavailable 
Abstracts of related literature
1. Overlapping cosmid clones were isolated that covered the entire mouse type-III collagen-encoding gene (mCol3) locus including flanking sequences approximately 40 kb upstream and 20 kb downstream from the gene. This gene was characterized initially by restriction mapping and then followed by sequencing of 43.6 kb, including 5 kb upstream from the transcription start point (tsp) and all exons and introns of the entire gene. The optimal parameters for sequencing a gene of this size were determined by sequencing 5-10-kb fragments at different ratios of random and directed sequencing, and comparing their efficiency. Based on our experience for sequencing mCol3, we have estimated that the most cost-efficient method was to achieve a twofold redundancy in sequencing by using random DNA subclones as templates for sequencing prior to initiating directed DNA sequencing to close the gaps between contiguous regions. mCol3 spans 37.6 kb from the tsp to the single polyadenylation site and contains 51 exons. The overall structure of mCol3 is similar to that of other members of the fibrillar collagen-encoding gene family. Several repetitive elements were located within the gene boundaries. Based on the nucleotide (nt) sequence, the predicted sizes of the mouse type-III collagen (mCOL3) mRNA and polypeptide are 4767 nt and 1464 amino acids (aa), respectively. A comparison of mCOL3 versus the human type-III collagen (hCOL3) showed 91% identity at the aa level. PMID: [7926795] 

2. The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID: [15489334] 

3. We present the complete nucleotide (nt) sequence and derived amino acid (aa) sequence of the N-terminal portion of the murine alpha-1 type-III collagen chain. The detailed structure of this region is important for the understanding of type-III collagen biosynthesis in normal tissue and during fibrosis. The cDNA clones, pCIII-1-C119, pCIII-1-C534 and pCIII-1-C572, covering a total of 1485 nt, code for 19 nt of the 5' untranslated region, the 24 aa of the signal peptide, the 130 aa of the N-terminal propeptide, the 9 aa of the telopeptide and 334 aa of the helical domain. PMID: [3443309] 

4. We have identified the promoter-proximal exon of the mouse alpha 1 (III) collagen gene using a synthetic oligonucleotide as a hybridization probe and have determined the DNA sequence of this exon and 380 base pairs 5' to it. The exact start site of transcription was localized with a primer extension experiment. The region upstream of the start of transcription shows only scattered homologies with the analogous sequences in the alpha 1(I) and alpha 2(I) mouse collagen genes although these genes are often co-expressed and co-regulated. The most striking homology with the type I gene is seen around the start of translation. This region contains an inverted repeat which could form a stem-loop structure with a calculated delta G of -30 kcal in the type III collagen mRNA. When compared to the alpha 1(I) and alpha 2(I) signal peptides, the signal peptide of mouse alpha 1(III) collagen presents less homology than when these segments are compared to each other. PMID: [3972847] 

5. This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development. PMID: [16141072] 

6. We have constructed DNA probes for the specific detection of mouse pro alpha 1(I), pro alpha 1(II), pro alpha 1(III) and alpha 1(IX) collagen transcripts. To avoid cross-hybridization the probes for fibrillar collagens cover mainly sequences in the 3' untranslated region of the gene. Sequencing and Northern analysis confirmed that the clones share minimal sequence similarity and detect only the specific mRNAs under normal hybridization and washing conditions. The clone for mouse alpha 1(IX) collagen covers coding sequences but is sufficiently divergent from other collagen transcripts to allow specific detection of the corresponding mRNA. PMID: [2054384] 

Function
Collagen type III occurs in most soft connective tissues along with type I collagen. 
Subcellular Location
Secreted, extracellular space, extracellular matrix (By similarity). 
Tissue Specificity
 
Gene Ontology
GO ID         GO term                                          Evidence
GO:0005586    C:collagen type III                              IDA:MGI.
GO:0005201    F:extracellular matrix structural constituent    IEA:InterPro.
GO:0001568    P:blood vessel development                       IMP:MGI.
GO:0071230    P:cellular response to amino acid stimulus       IDA:MGI.
GO:0030199    P:collagen fibril organization                   IMP:MGI.
GO:0048565    P:digestive tract development                    IMP:MGI.
GO:0001501    P:skeletal system development                    IEA:Compara.
Interpro
IPR008160;    Collagen.
IPR000885;    Fib_collagen_C.
IPR001007;    VWF_C.
Pfam
PF01410;    COLFI;    1.
PF01391;    Collagen;    7.
PF00093;    VWC;    1.
SMART
SM00038;    COLFI;    1.
SM00214;    VWC;    1.
PROSITE
PS51461;    NC1_FIB;    1.
PS01208;    VWFC_1;    1.
PS50184;    VWFC_2;    1.
PRINTS
Created Date
18-Oct-2012 
Record Type
GAS predicted 
Sequence Annotation
SIGNAL        1     23       By similarity.
PROPEP       24    154       N-terminal propeptide.
                             /FTId=PRO_0000005743.
CHAIN       155   1219       Collagen alpha-1(III) chain.
                             /FTId=PRO_0000005744.
PROPEP     1220   1464       C-terminal propeptide.
                             /FTId=PRO_0000005745.
DOMAIN       31     90       VWFC.
DOMAIN     1230   1464       Fibrillar collagen NC1.
REGION      155    169       Nonhelical region (N-terminal).
REGION      170   1195       Triple-helical region.
MOD_RES     262    262       5-hydroxylysine (By similarity).
MOD_RES     283    283       5-hydroxylysine (By similarity).
MOD_RES     859    859       5-hydroxylysine (By similarity).
MOD_RES     976    976       5-hydroxylysine (By similarity).
MOD_RES    1093   1093       5-hydroxylysine (By similarity).
MOD_RES    1105   1105       5-hydroxylysine (By similarity).
MOD_RES    1376   1376       Phosphotyrosine (By similarity).
MOD_RES    1381   1381       Phosphoserine (By similarity).
CARBOHYD    262    262       O-linked (Gal...) (By similarity).
DISULFID   1195   1195       Interchain (By similarity).
DISULFID   1196   1196       Interchain (By similarity).
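
The processing features above (SIGNAL, the two PROPEP entries, and CHAIN) can be checked for consistency with the stated protein length of 1464 residues. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive, as is conventional for this kind of feature table; the names FEATURES, feature_length, and tiles_sequence are illustrative and not part of the record.

```python
# Coordinates copied from the Sequence Annotation table.
# Assumption: positions are 1-based and inclusive on both ends.
FEATURES = [
    ("SIGNAL", 1, 23),        # signal peptide
    ("PROPEP", 24, 154),      # N-terminal propeptide
    ("CHAIN", 155, 1219),     # Collagen alpha-1(III) chain
    ("PROPEP", 1220, 1464),   # C-terminal propeptide
]
PROTEIN_LENGTH = 1464  # residues, as stated in the record


def feature_length(start, end):
    """Length in residues of an inclusive, 1-based interval."""
    return end - start + 1


def tiles_sequence(features, total_length):
    """Return True if the features cover 1..total_length with no gaps or overlaps."""
    expected_start = 1
    for _, start, end in sorted(features, key=lambda f: f[1]):
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


lengths = [feature_length(start, end) for _, start, end in FEATURES]

for (name, start, end), length in zip(FEATURES, lengths):
    print(f"{name:<8}{start:>6}{end:>6}{length:>8} aa")
print("sum of feature lengths:", sum(lengths))
print("tiles full sequence:", tiles_sequence(FEATURES, PROTEIN_LENGTH))
```

Under that assumption the four features measure 23, 131, 1065, and 245 residues, which sum to the full 1464-residue precursor.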
Nucleotide Sequence
Length: 43601 bp   Go to nucleotide: FASTA
Protein Sequence
Length: 1464 aa   Go to amino acid: FASTA
The verified Protein-Protein interaction information
UniProt
Gene Symbol Ref Databases
Other Protein-Protein interaction resources
String database  