Tag Content
SG ID
SG00018972 
UniProt Accession
Theoretical PI
5.05  
Molecular Weight
46880 Da  
Genbank Nucleotide ID
Genbank Protein ID
Gene Name
Serpina3k 
Gene Synonyms/Alias
Mcm2, Spi2 
Protein Name
Serine protease inhibitor A3K 
Protein Synonyms/Alias
Serpin A3K; Contrapsin; SPI-2; Flags: Precursor 
Organism
Mus musculus (Mouse) 
NCBI Taxonomy ID
10090 
Chromosome Location
chr:12;105576696-105583951;1
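The location string above can be split into its parts programmatically. The following is a minimal sketch, assuming the layout is `chr:<chromosome>;<start>-<end>;<strand>` with inclusive 1-based coordinates and the last field denoting the strand; the record itself does not document this layout, and the names `GenomicInterval`, `parse_location` and `span_length` are hypothetical.

```python
from typing import NamedTuple


class GenomicInterval(NamedTuple):
    chromosome: str
    start: int
    end: int
    strand: int  # assumed meaning of the last field (1 = forward strand)


def parse_location(text: str) -> GenomicInterval:
    """Parse a 'chr:<name>;<start>-<end>;<strand>' string (layout assumed)."""
    chrom_part, range_part, strand_part = text.strip().split(";")
    chromosome = chrom_part.split(":", 1)[1]
    start, end = (int(value) for value in range_part.split("-"))
    return GenomicInterval(chromosome, start, end, int(strand_part))


def span_length(interval: GenomicInterval) -> int:
    """Number of bases covered, assuming inclusive coordinates."""
    return interval.end - interval.start + 1


loc = parse_location("chr:12;105576696-105583951;1")
print(loc.chromosome, loc.start, loc.end, loc.strand)
print(span_length(loc))  # 7256 under the inclusive-coordinate assumption
```

Under these assumptions the locus spans 7256 bases on chromosome 12.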
Function in Stage
Uncertain 
Function in Cell Type
Uncertain 
Probability (GAS) of Function in Spermatogenesis
0.112211165 
The probability was calculated by the GAS algorithm and ranges from 0 to 1. The closer the value is to 1, the more likely the gene is to function in spermatogenesis.
Description
Temporarily unavailable 
Abstract of related literatures
1. Four overlapping cDNA clones encoding contrapsin were isolated from a mouse liver cDNA library constructed in the expression vector, lambda gt11. M13 vector sequence analysis revealed that contrapsin cDNA contained an open reading frame of 1,254 bases encoding 418 amino acids. The N-terminal amino acid sequence of the isolated contrapsin matched residues 30 to 48 of the sequence deduced on nucleotide analysis. One clone, which had the longest 3' untranslated region, contained two sets of tandem polyadenylation signals, AATACA and AATAAA, which were located 497 bases apart, while the remaining three clones terminated at the first signal. The entire reading frame sequence of contrapsin cDNA showed 64% homology with that of human alpha-1-antichymotrypsin. PMID: [2277027] 

2. A cDNA clone (lambda MC-2) for contrapsin, a serine-proteinase inhibitor, was isolated from a lambda ZAP mouse liver cDNA library. The 1.6 kb cDNA insert of lambda MC-2 contained an open reading frame that encodes a 418-residue polypeptide (46,970 Da), in which a signal peptide of 21 residues was identified by comparison with the N-terminal sequence of the purified protein. The predicted structure (MC-2) also contained other peptide sequences determined by Edman degradation. Four potential sites for N-linked glycosylation were found in the molecule, accounting for the difference in molecular mass between the predicted form and the purified protein (63 kDa). Further screening of the cDNA library with an EcoRI-EcoRI fragment (510 bp) of lambda MC-2 as a probe yielded another cDNA clone (lambda MC-7), which encodes a 418-residue polypeptide (MC-7) with a calculated mass of 47,010 Da. MC-2 showed 83% similarity at the amino acid level to MC-7, in contrast with 44% similarity to alpha 1-proteinase inhibitor. The possible reactive site (P1-P'1) for serine proteinase is suggested to be Lys-Ala for MC-2 and Ser-Arg for MC-7. Northern-blot analysis revealed that both MC-2 and MC-7 mRNAs have the same size of 1.8 kb and are markedly induced in response to acute inflammation. Construction of the expression plasmids pSVMC-2 and pSVMC-7 and their transfection into COS-1 cells demonstrated that pSVMC-2 directs the synthesis of a 63 kDa form whereas pSVMC-7 expresses a 56 kDa form. The difference in molecular mass between the two may be explained by the fact that the MC-7 sequence contains three potential sites for N-glycosylation, one site less than that of MC-2. PMID: [2049065] 

3. Contrapsin is a member of the serpin superfamily and inhibits trypsin much more strongly than alpha1-antiproteinase. Mouse and rat contrapsins, however, have similarity in sequence to human alpha1-antichymotrypsin. In order to test the hypothesis that reactive site regions of contrapsin family evolved under strong selective pressure, cDNA sequence of C57BL/6 mouse contrapsin was determined and compared with that of ICR mouse. The cDNA sequence of C57BL/6 mouse contrapsin was found to contain an open reading frame encoding polypeptide consisting of 418 amino acid residues. The work reported in this paper shows that the reactive site is not hypervariable as compared with the rest of molecule. PMID: [11916263] 

4. This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development. PMID: [16141072] 

5. The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID: [15489334] 

6. The plasma protease inhibitors control a wide variety of physiological functions including blood coagulation, complement activation and aspects of the inflammatory response. The inhibitors function by forming a 1:1 complex with a specific protease within the reactive centre region of the inhibitor. Little is known about the evolutionary relationships of these inhibitors. We report here the sequences of cDNAs which represent the C-terminal halves of the two major murine plasma protease inhibitors. One of these, murine alpha 1-antitrypsin, more appropriately called alpha 1-proteinase inhibitor (alpha 1-PI), has diverged from its human counterpart at a vital position in the reactive centre but this has not led to a physiologically significant change in function. Also, we have determined the partial sequence of a recently characterized protein termed contrapsin, which inhibits trypsin-like proteases. We show, surprisingly, that contrapsin is highly homologous to human alpha 1-antichymotrypsin, an inhibitor of chymotrypsin-like proteases. The reactive centre regions of these two inhibitors have diverged considerably, which may account for the differences in specificity. We propose that the genes for contrapsin and human alpha 1-antichymotrypsin are the descendents of a single gene that have evolved since rodent and primate divergence to encode proteins with different functions. PMID: [6547997] 

7. We have studied the effects of murine alpha-1-antitrypsin and contrapsin, a new trypsin inhibitor (Takahara, H. and Sinohara, H. (1982) J. Biol. Chem. 257, in press), on several serine proteases participating in blood clotting, fibrinolysis, kinin generation, and complement activation. Bovine plasmin and human plasma kallikrein were inactivated by contrapsin but not by alpha-1-antitrypsin, whereas bovine alpha-thrombin and porcine pancreas kallikrein were inhibited by alpha-1-antitrypsin but not by contrapsin. Heparin protected thrombin from inactivation by alpha-1-antitrypsin. Both inhibitors had virtually no effects on canine C1 esterase. PMID: [6214866] 

8. Contrapsin and alpha-1-antitrypsin have been recently characterized as major protease inhibitors in mouse plasma (Takahara, H. & Sinohara, H. (1982) J. Biol. Chem. 257, 2438-2446). We have studied the effects of the two inhibitors upon various serine proteases prepared from mouse tissues. Trypsin, plasmin and trypsin-like proteases of the submaxillary gland were inhibited by contrapsin but not by alpha-1-antitrypsin. On the other hand, chymotrypsin, elastase, and thrombin were inactivated by alpha-1-antitrypsin but not by contrapsin. Thus, their inhibitory spectra did not overlap each other in spite of their broad specificities. The inhibition of trypsin, chymotrypsin, and elastase was rapid and stoichiometric, whereas the inhibition of the other proteases was relatively slow. Contrapsin accounted for almost the total capacities of mouse plasma to inhibit both trypsin and submaxillary gland trypsin-like proteases, whereas alpha-1-antitrypsin was responsible for nearly all the capacities of plasma to inhibit both chymotrypsin and elastase. PMID: [6224776] 

9. The major human plasma protease inhibitors, alpha(1)-antitrypsin and alpha(1)-antichymotrypsin, are each encoded by a single gene, whereas in the mouse they are represented by clusters of 5 and 14 genes, respectively. Although there is a high degree of overall sequence similarity within these groupings, the reactive-center loop (RCL) domain, which determines target protease specificity, is markedly divergent. The literature dealing with members of these mouse serine protease inhibitor (serpin) clusters has been complicated by inconsistent nomenclature. Furthermore, some investigators, unaware of the complexity of the family, have failed to distinguish between closely related genes when measuring expression levels or functional activity. We have reviewed the literature dealing with the mouse equivalents of human alpha(1)-antitrypsin and alpha(1)-antichymotrypsin and made use of the recently completed mouse genome sequence to propose a systematic nomenclature. We have also examined the extended mouse clade "a" serpin cluster at chromosome 12F1 and compared it with the syntenic region at human chromosome 14q32. In summarizing the literature and suggesting a standardized nomenclature, we aim to provide a logical structure on which future research may be based. PMID: [12659817] 

10. Members of the serpin (serine protease inhibitor) superfamily of genes are well represented in both human and murine genomes. In many cases it is possible to identify a definite ortholog on the basis of sequence similarity and by examining the surrounding genes at syntenic loci. We have recently examined the murine serpin locus at 12F1 and observed that the single human alpha1-antichymotrypsin gene is represented by 14 paralogs. It is also known that the single human alpha1-antitrypsin gene has five paralogs in the mouse. The forces driving this gene multiplication are unknown and there are no data describing the function of the various serpin gene products at the alpha1-antichymotrypsin multigene locus. Examination of the predicted amino acid sequences shows that the serpins are likely to be functional protease inhibitors but with differing target protease specificities. In order to begin to address the question of the problem presented by the murine alpha1-antichymotrypsins, we have used RT-PCR to examine the expression pattern of these serpin genes. Our data show that the divergent reactive center loop sequence, and predictably variable target protease specificity, is reflected in tissue-specific expression for many of the family members. These observations add weight to the hypothesis that the antichymotrypsin-like serpins have an evolutionary importance which has led to their expansion and diversification in multiple species. PMID: [15638460] 

11. A procedure to map N-glycosylation sites is presented here. It can be applied to purified proteins as well as to highly complex mixtures. The method exploits deglycosylation by PNGase F in a diagonal, reverse-phase chromatographic setup. When applied to 10 microL of mouse serum, affinity-depleted for its three most abundant components, 117 known or predicted sites were mapped in addition to 10 novel sites. Several sites were detected on soluble membrane or receptor components. Our method furthermore senses the nature of glycan structures and can detect differential glycosylation on a given site. These properties--high sensitivity and dependence on glycan imprinting--can be exploited for glycan-biomarker analysis. PMID: [16944957] 

12. A comprehensive understanding of the mouse plasma proteome is important for studies using mouse models to identify protein markers of human disease. To enhance our analysis of the mouse plasma proteome, we have developed a method for isolating low-abundance proteins using a cysteine-containing glycopeptide strategy. This method involves two orthogonal affinity capture steps. First, glycoproteins are coupled to an azlactone copolymer gel using hydrazide chemistry and cysteine residues are then biotinylated. After trypsinization and extensive washing, tethered N-glycosylated tryptic peptides are released from the gel using PNGase F. Biotinylated cysteinyl-containing glycopeptides are then affinity selected using a monomeric avidin gel and analyzed by LC-MS/MS. We have applied the method to a proteome analysis of mouse plasma. In two independent analyses using 200 muL each of C57BL mouse plasma, 51 proteins were detected. Only 42 proteins were seen when the same plasma sample was analyzed by glycopeptides only. A total of 104 N-glycosylation sites were identified. Of these, 17 sites have hitherto not been annotated in the Swiss-Prot database whereas 48 were considered probable, potential, or by similarity - i.e., based on little or no experimental evidence. We show that analysis by cysteine-containing glycopeptides allows detection of low-abundance proteins such as the epidermal growth factor receptor, the Vitamin K-dependent protein Z, the hepatocyte growth factor activator, and the lymphatic endothelium-specific hyaluronan receptor as these proteins were not detected in the glycopeptide control analysis. PMID: [17330941] 

Function
Contrapsin inhibits trypsin-like proteases. 
Subcellular Location
Secreted. 
Tissue Specificity
Expressed in liver and secreted in plasma. 
Gene Ontology
GO ID    GO term    Evidence
GO:0005576 C:extracellular region IBA:RefGenome.
GO:0004867 F:serine-type endopeptidase inhibitor activity IBA:RefGenome.
GO:0030162 P:regulation of proteolysis IBA:RefGenome.
GO:0034097 P:response to cytokine stimulus IDA:MGI.
GO:0043434 P:response to peptide hormone stimulus IDA:MGI.
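Each Gene Ontology row above packs an identifier, an aspect letter, a term name and an evidence code with its source into one line. The sketch below shows one way to split such rows, assuming the aspect letters C, F and P stand for cellular component, molecular function and biological process, and that the evidence field is always `<code>:<source>` followed by a full stop. The names `GoAnnotation` and `parse_go_line` are hypothetical and not part of any database API.

```python
import re
from typing import NamedTuple


class GoAnnotation(NamedTuple):
    go_id: str
    aspect: str
    term: str
    evidence: str
    source: str


# Assumed expansion of the one-letter aspect prefixes used in the rows.
ASPECTS = {
    "C": "cellular component",
    "F": "molecular function",
    "P": "biological process",
}

# GO id, aspect letter, term name, then "<evidence code>:<source>." at the end.
LINE_RE = re.compile(r"^(GO:\d{7})\s+([CFP]):(.+?)\s+([A-Z]{2,3}):(\S+?)\.?$")


def parse_go_line(line: str) -> GoAnnotation:
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised GO row: {line!r}")
    go_id, aspect, term, evidence, source = match.groups()
    return GoAnnotation(go_id, ASPECTS[aspect], term, evidence, source)


rows = [
    "GO:0005576 C:extracellular region IBA:RefGenome.",
    "GO:0004867 F:serine-type endopeptidase inhibitor activity IBA:RefGenome.",
    "GO:0030162 P:regulation of proteolysis IBA:RefGenome.",
    "GO:0034097 P:response to cytokine stimulus IDA:MGI.",
    "GO:0043434 P:response to peptide hormone stimulus IDA:MGI.",
]
annotations = [parse_go_line(row) for row in rows]
for annotation in annotations:
    print(annotation)
```

Splitting the rows this way makes it easy to separate, for example, the two annotations with direct experimental evidence (IDA) from the three inferred ones (IBA).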
Interpro
IPR023795;    Protease_inhib_I4_serpin_CS.
IPR023796;    Serpin_dom.
IPR000215;    Serpin_fam.
Pfam
PF00079;    Serpin;    1.
SMART
SM00093;    SERPIN;    1.
PROSITE
PS00284;    SERPIN;    1.
PRINTS
Created Date
18-Oct-2012 
Record Type
GAS predicted 
Sequence Annotation
SIGNAL        1     21
CHAIN        22    418       Serine protease inhibitor A3K.
                             /FTId=PRO_0000032417.
REGION      369    394       RCL.
SITE        384    385       Reactive bond.
CARBOHYD     39     39       N-linked (GlcNAc...) (Potential).
CARBOHYD    105    105       N-linked (GlcNAc...) (Potential).
CARBOHYD    185    185       N-linked (GlcNAc...).
CARBOHYD    270    270       N-linked (GlcNAc...).
CONFLICT     15     15       I -> V (in Ref. 5; AAH19802/AAH11217/
                             AAH16407).
CONFLICT     68     70       PDT -> QDK (in Ref. 3; CAA40106 and 4;
                             BAE28874).
CONFLICT     72     72       I -> F (in Ref. 5; AAH16407).
CONFLICT     84     84       A -> R (in Ref. 2; CAA38948).
CONFLICT    193    193       E -> K (in Ref. 3; CAA40106 and 4;
                             BAE28874).
CONFLICT    200    201       ER -> DG (in Ref. 3; CAA40106 and 4;
                             BAE28874).
CONFLICT    204    204       M -> V (in Ref. 6; CAA25458).
CONFLICT    250    250       T -> A (in Ref. 3; CAA40106 and 4;
                             BAE28874).
CONFLICT    304    304       P -> S (in Ref. 3; CAA40106 and 4;
                             BAE28874).
CONFLICT    320    320       N -> D (in Ref. 3; CAA40106 and 4;
                             BAE28874).
CONFLICT    347    347       T -> I (in Ref. 6; CAA25458).
CONFLICT    349    349       T -> A (in Ref. 3; CAA40106 and 4;
                             BAE28874).
CONFLICT    386    386       I -> V (in Ref. 3; CAA40106 and 4;
                             BAE28874).
CONFLICT    389    389       A -> G (in Ref. 5; AAH19802).
CONFLICT    391    391       H -> C (in Ref. 3; CAA40106 and 4;
                             BAE28874).
CONFLICT    398    398       F -> I (in Ref. 3; CAA40106 and 4;
                             BAE28874).
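The feature table above uses a fixed layout of key, start, end and an optional note, with indented lines continuing the previous note. The following is a minimal sketch of reading such a table and deriving segment lengths, assuming inclusive 1-based residue positions. Only a subset of the rows is embedded for brevity, and the names `Feature` and `parse_features` are hypothetical.

```python
from typing import List, NamedTuple

# Subset of the feature rows from the record, copied verbatim.
FEATURE_TABLE = """\
SIGNAL        1     21
CHAIN        22    418       Serine protease inhibitor A3K.
REGION      369    394       RCL.
SITE        384    385       Reactive bond.
CARBOHYD     39     39       N-linked (GlcNAc...) (Potential).
CARBOHYD    105    105       N-linked (GlcNAc...) (Potential).
CARBOHYD    185    185       N-linked (GlcNAc...).
CARBOHYD    270    270       N-linked (GlcNAc...).
CONFLICT     15     15       I -> V (in Ref. 5; AAH19802/AAH11217/
                             AAH16407).
"""


class Feature(NamedTuple):
    key: str
    start: int
    end: int
    note: str

    @property
    def length(self) -> int:
        # Inclusive residue positions are assumed.
        return self.end - self.start + 1


def parse_features(text: str) -> List[Feature]:
    features: List[Feature] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented line: continuation of the previous feature's note.
            if features:
                last = features[-1]
                joined = (last.note + " " + line.strip()).strip()
                features[-1] = last._replace(note=joined)
            continue
        parts = line.split(None, 3)
        note = parts[3].strip() if len(parts) > 3 else ""
        features.append(Feature(parts[0], int(parts[1]), int(parts[2]), note))
    return features


features = parse_features(FEATURE_TABLE)
by_key = {f.key: f for f in features if f.key in ("SIGNAL", "CHAIN", "REGION")}
glyco_sites = [f.start for f in features if f.key == "CARBOHYD"]

print(by_key["SIGNAL"].length)   # 21-residue signal peptide
print(by_key["CHAIN"].length)    # 397-residue mature chain
print(by_key["REGION"].length)   # 26-residue reactive-centre loop
print(glyco_sites)               # [39, 105, 185, 270]
```

The four CARBOHYD rows agree with the four potential N-linked glycosylation sites reported for MC-2 in abstract 2, and the 21-residue signal peptide matches the one identified there.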
Nucleotide Sequence
Length: 2020 bp
Protein Sequence
Length: 418 aa
The verified Protein-Protein interaction information
UniProt
Gene Symbol    Ref Databases
BMPR2    BioGRID 
Kcnma1    IntAct 
Other Protein-Protein interaction resources
String database  