Tag Content
SG ID
SG00014433 
UniProt Accession
Theoretical PI
5.92  
Molecular Weight
87918 Da  
Genbank Nucleotide ID
Genbank Protein ID
Gene Name
Hnrnpu 
Gene Synonyms/Alias
Hnrpu 
Protein Name
Heterogeneous nuclear ribonucleoprotein U 
Protein Synonyms/Alias
hnRNP U; Scaffold attachment factor A; SAF-A
Organism
Mus musculus (Mouse) 
NCBI Taxonomy ID
10090 
Chromosome Location
chr:1;180253205-180267928;-1
Function in Stage
Uncertain 
Function in Cell Type
Uncertain 
Probability (GAS) of Function in Spermatogenesis
0.658919752 
The probability was calculated by the GAS algorithm and ranges from 0 to 1. The closer the value is to 1, the more likely the gene is to function in spermatogenesis.
Description
Temporarily unavailable 
Abstract of related literatures
1. This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development. PMID: [16141072] 

2. The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID: [15489334] 

3. In the mammalian central nervous system, the structure known as the postsynaptic density (PSD) is a dense complex of proteins whose function is to detect and respond to neurotransmitter released from presynaptic axon terminals. Regulation of protein phosphorylation in this molecular machinery is critical to the activity of its components, which include neurotransmitter receptors, kinases/phosphatases, scaffolding molecules, and proteins regulating cytoskeletal structure. To characterize the phosphorylation state of proteins in PSD samples, we combined strong cation exchange (SCX) chromatography with IMAC. Initially, tryptic peptides were separated by cation exchange and analyzed by reverse phase chromatography coupled to tandem mass spectrometry, which led to the identification of phosphopeptides in most SCX fractions. Because each of these individual fractions was too complex to characterize completely in single LC-MS/MS runs, we enriched for phosphopeptides by performing IMAC on each SCX fraction, yielding at least a 3-fold increase in identified phosphopeptides relative to either approach alone (SCX or IMAC). This enabled us to identify at least one site of phosphorylation on 23% (287 of 1,264) of all proteins found to be present in the postsynaptic density preparation. In total, we identified 998 unique phosphorylated peptides, mapping to 723 unique sites of phosphorylation. At least one exact site of phosphorylation was determined on 62% (621 of 998) of all phosphopeptides, and approximately 80% of identified phosphorylation sites are novel. PMID: [16452087] 

4. Kinases play a prominent role in tumor development, pointing to the presence of specific phosphorylation patterns in tumor tissues. Here, we investigate whether recently developed high resolution mass spectrometric (MS) methods for proteome and phosphoproteome analysis can also be applied to solid tumors. As a tumor model, we used TG3 mutant mice carrying skin melanomas. A total of 100 microg of solid tumor lysate yielded a melanoma proteome of 4443 identified proteins, including at least 88 putative melanoma markers previously found by cDNA microarray technology. Analysis of 2 mg of lysate from dissected melanoma with titansphere chromatography and 8 mg with strong cation exchange together resulted in the identification of more than 5600 phosphorylation sites on 2250 proteins. The phosphoproteome included many hits from pathways important in melanoma. One-month storage at -80 degrees C did not significantly decrease the number of identified phosphorylation sites. Thus, solid tumor can be analyzed by MS-based proteomics with similar efficiency as cell culture models and in amounts compatible with biopsies. PMID: [19367708]

Function
Component of the CRD-mediated complex that promotes MYC mRNA stabilization. Binds to pre-mRNA. Has high affinity for scaffold-attached region (SAR) DNA. Binds to double- and single-stranded DNA and RNA (By similarity).
Subcellular Location
Nucleus (By similarity). Cytoplasm (By similarity). Cell surface (By similarity). Note=Localized in cytoplasmic mRNP granules containing untranslated mRNAs. Component of ribonucleosomes. Also found associated with the cell surface (By similarity).
Tissue Specificity
 
Gene Ontology
GO ID    GO term    Evidence
GO:0071013 C:catalytic step 2 spliceosome IEA:Compara.
GO:0009986 C:cell surface IEA:UniProtKB-SubCell.
GO:0070937 C:CRD-mediated mRNA stability complex IEA:Compara.
GO:0005524 F:ATP binding IEA:UniProtKB-KW.
GO:0003677 F:DNA binding IEA:UniProtKB-KW.
GO:0003723 F:RNA binding IEA:UniProtKB-KW.
GO:0070934 P:CRD-mediated mRNA stabilization IEA:Compara.
GO:0006397 P:mRNA processing IEA:UniProtKB-KW.
GO:0008380 P:RNA splicing IEA:UniProtKB-KW.
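The GO lines above follow a regular pattern: an accession, an aspect prefix (C, F or P for cellular component, molecular function and biological process), the term, and an evidence code with its source. A minimal Python sketch for grouping such lines by aspect is given here; the names `GO_LINE`, `ASPECTS`, `parse_go` and `GO_SAMPLE` are hypothetical and not part of the database, and the regular expression assumes the one-line layout shown in this record.

```python
import re
from collections import defaultdict

# Assumed layout: "GO:<7 digits> <aspect>:<term> <evidence>:<source>."
GO_LINE = re.compile(r"^(GO:\d{7}) ([CFP]):(.+) (\S+:\S+)\.$")

# Aspect prefixes as used in UniProt-style GO cross-references.
ASPECTS = {
    "C": "cellular component",
    "F": "molecular function",
    "P": "biological process",
}


def parse_go(text):
    """Group GO annotation lines by aspect; lines that do not match are skipped."""
    grouped = defaultdict(list)
    for raw in text.splitlines():
        match = GO_LINE.match(raw.strip())
        if match is None:
            continue
        go_id, aspect, term, evidence = match.groups()
        grouped[ASPECTS[aspect]].append((go_id, term, evidence))
    return dict(grouped)


# Three lines copied from the record above, one per aspect.
GO_SAMPLE = """\
GO:0009986 C:cell surface IEA:UniProtKB-SubCell.
GO:0005524 F:ATP binding IEA:UniProtKB-KW.
GO:0008380 P:RNA splicing IEA:UniProtKB-KW.
"""

go_by_aspect = parse_go(GO_SAMPLE)
for aspect_name, entries in go_by_aspect.items():
    print(aspect_name, entries)
```

All annotations in this record carry the IEA evidence code, meaning they were inferred electronically rather than curated from experiments.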
Interpro
IPR001870;    B30.2/SPRY.
IPR008985;    ConA-like_lec_gl.
IPR003034;    SAP_DNA-bd.
IPR018355;    SPla/RYanodine_receptor_subgr.
IPR003877;    SPRY_rcpt.
Pfam
PF02037;    SAP;    1.
PF00622;    SPRY;    1.
SMART
SM00513;    SAP;    1.
SM00449;    SPRY;    1.
PROSITE
PS50188;    B302_SPRY;    1.
PS50800;    SAP;    1.
PRINTS
Created Date
18-Oct-2012 
Record Type
GAS predicted 
Sequence Annotation
INIT_MET      1      1       Removed (By similarity).
CHAIN         2    800       Heterogeneous nuclear ribonucleoprotein
                             U.
                             /FTId=PRO_0000387947.
DOMAIN        8     42       SAP.
DOMAIN      244    440       B30.2/SPRY.
NP_BIND     480    487       ATP (Potential).
REGION      690    715       RNA-binding RGG-box (By similarity).
COILED      626    653       Potential.
COMPBIAS      2    154       Asp/Glu-rich (acidic).
COMPBIAS    679    769       Gly-rich.
MOD_RES       2      2       N-acetylserine.
MOD_RES       4      4       Phosphoserine (By similarity).
MOD_RES      26     26       Phosphoserine (By similarity).
MOD_RES      49     49       Omega-N-methylarginine (By similarity).
MOD_RES      58     58       Phosphoserine.
MOD_RES     183    183       Phosphoserine (By similarity).
MOD_RES     210    210       N6-acetyllysine (By similarity).
MOD_RES     241    241       N6-acetyllysine (By similarity).
MOD_RES     247    247       Phosphoserine.
MOD_RES     328    328       N6-acetyllysine (By similarity).
MOD_RES     440    440       N6-acetyllysine (By similarity).
MOD_RES     492    492       N6-acetyllysine (By similarity).
MOD_RES     500    500       N6-acetyllysine (By similarity).
MOD_RES     527    527       N6-acetyllysine (By similarity).
MOD_RES     541    541       N6-acetyllysine (By similarity).
MOD_RES     602    602       N6-acetyllysine (By similarity).
MOD_RES     611    611       N6-acetyllysine (By similarity).
MOD_RES     646    646       N6-acetyllysine (By similarity).
MOD_RES     709    709       Omega-N-methylated arginine (By
                             similarity).
MOD_RES     715    715       Dimethylated arginine (By similarity).
MOD_RES     715    715       Omega-N-methylated arginine (By
                             similarity).
MOD_RES     789    789       N6-acetyllysine (By similarity).
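The annotation above uses a whitespace-aligned feature table: a feature key, start and end positions, and a description that may wrap onto indented continuation lines. The following Python sketch shows one way such lines could be read and the domain lengths derived. `Feature`, `parse_features` and `FT_SAMPLE` are hypothetical names introduced for illustration, and the parser assumes every feature line carries a description, as all lines in this record do.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    key: str
    start: int
    end: int
    description: str


def parse_features(text: str) -> List[Feature]:
    """Parse feature lines; indented lines continue the previous description."""
    features: List[Feature] = []
    for raw in text.splitlines():
        if not raw.strip():
            continue
        if raw[0].isspace():
            # Continuation line: append its text to the preceding feature.
            if features:
                prev = features[-1]
                joined = f"{prev.description} {raw.strip()}"
                features[-1] = prev._replace(description=joined)
            continue
        key, start, end, description = raw.split(None, 3)
        features.append(Feature(key, int(start), int(end), description.strip()))
    return features


# A few lines copied from the record above.
FT_SAMPLE = """\
CHAIN         2    800       Heterogeneous nuclear ribonucleoprotein
                             U.
                             /FTId=PRO_0000387947.
DOMAIN        8     42       SAP.
DOMAIN      244    440       B30.2/SPRY.
MOD_RES      58     58       Phosphoserine.
MOD_RES     247    247       Phosphoserine.
"""

features = parse_features(FT_SAMPLE)

# Positions are 1-based and inclusive, so length = end - start + 1.
domain_lengths = {
    f.description.rstrip("."): f.end - f.start + 1
    for f in features
    if f.key == "DOMAIN"
}
print(domain_lengths)
```

With the coordinates listed in this record, the SAP domain spans 35 residues and the B30.2/SPRY domain spans 197 residues.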
Nucleotide Sequence
Length: bp
Protein Sequence
Length: 800 aa
The verified Protein-Protein interaction information
UniProt
Gene Symbol    Ref Databases
Prmt1    BioGRID
Irs1    IntAct
Pou5f1    BioGRID
Other Protein-Protein interaction resources
String database  