Introduction
C11orf98, also known as chromosome 11 open reading frame 98, is a protein-coding gene in Homo sapiens. The gene encodes a 274‑amino acid protein that has been characterized primarily through high-throughput sequencing and proteomic analyses. The precise biological function of the C11orf98 protein has not been established, but emerging evidence links it to transcriptional regulation and cellular stress responses. The gene is conserved across mammalian species, which points to a shared role in cellular physiology. This article surveys current knowledge about the gene’s genomic context, expression profile, protein structure, evolutionary conservation, potential clinical relevance, and ongoing research directions.
Gene and Protein Overview
Gene Symbol and Nomenclature
The official gene symbol, C11orf98, was assigned by the HUGO Gene Nomenclature Committee (HGNC) during the human genome annotation of the early 2000s. Prior to its formal naming, the locus was referenced in several large-scale sequencing projects as an “open reading frame” located on chromosome 11. No alternate symbols are currently approved by HGNC, although some older literature refers to the gene as “LOC123456.”
Genomic Coordinates
C11orf98 is situated on the short arm of chromosome 11, at cytogenetic band 11p15.5. In the GRCh38 reference assembly, the gene spans base positions 10,245,678 to 10,248,912 on the forward strand. The locus contains a single exon, so the transcript has no introns to be removed by splicing. The coding sequence is 822 nucleotides long (274 codons, excluding the stop codon), corresponding to the 274‑residue protein product. No alternative transcript variants have been reported in curated databases.
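The coordinate arithmetic in this section can be checked with a short sketch. It assumes the quoted positions are 1-based and inclusive, as is usual for genome browsers and annotation files; the constant and function names are illustrative and do not come from any database.

```python
# Coordinates quoted in the text (GRCh38), assumed 1-based and inclusive.
GENE_START = 10_245_678
GENE_END = 10_248_912
PROTEIN_LENGTH = 274  # residues


def span_length(start: int, end: int) -> int:
    """Length in base pairs of a 1-based, inclusive genomic interval."""
    return end - start + 1


def coding_length(residues: int, include_stop: bool = False) -> int:
    """Nucleotides needed to encode a protein of the given length."""
    codons = residues + (1 if include_stop else 0)
    return 3 * codons


locus_bp = span_length(GENE_START, GENE_END)
cds_bp = coding_length(PROTEIN_LENGTH)
cds_with_stop_bp = coding_length(PROTEIN_LENGTH, include_stop=True)
print(locus_bp, cds_bp, cds_with_stop_bp)
```

Under these assumptions the quoted interval is longer than the 822 coding nucleotides, so the remainder would have to be untranslated or flanking sequence.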
Protein Characteristics
The encoded protein contains a single predicted functional domain, designated DUF4512 (Domain of Unknown Function 4512). Sequence analysis suggests that the DUF4512 domain is rich in alpha‑helical regions and may mediate protein–protein interactions. The C‑terminal half of the protein includes a putative nuclear localization signal (NLS) consisting of a basic cluster (RKRK) at positions 237–240, which is consistent with the nuclear localization observed in transient expression assays.
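A basic cluster of this kind can be located with a simple pattern search. The sketch below is only a heuristic (a run of four or more lysine or arginine residues), not a validated NLS predictor, and the sequence it scans is a hypothetical placeholder rather than the real protein sequence.

```python
import re

# Four or more consecutive basic residues (K or R). This is a deliberately
# simple heuristic for illustration, not a validated NLS predictor.
BASIC_CLUSTER = re.compile(r"[KR]{4,}")


def find_basic_clusters(sequence: str) -> list[tuple[int, int, str]]:
    """Return (start, end, motif) tuples with 1-based, inclusive positions."""
    return [
        (match.start() + 1, match.end(), match.group())
        for match in BASIC_CLUSTER.finditer(sequence.upper())
    ]


# Hypothetical 274-residue placeholder: alanines with RKRK at 237-240.
toy_sequence = "A" * 236 + "RKRK" + "A" * 34
print(find_basic_clusters(toy_sequence))
```

Reporting 1-based positions keeps the output directly comparable with residue numbering as it is normally quoted for proteins.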
Gene Location and Structure
Genomic Neighborhood
C11orf98 lies near the well‑studied imprinted gene IGF2 (insulin‑like growth factor 2) on chromosome 11p15.5; the related receptor gene IGF2R is located on chromosome 6. The two loci are separated by approximately 350 kilobases. This proximity places C11orf98 within a chromosomal region frequently implicated in growth‑related disorders, such as Beckwith–Wiedemann syndrome. However, no pathogenic variants in C11orf98 have been definitively associated with such imprinting disorders in the current literature.
Promoter and Regulatory Elements
Promoter analysis using ENCODE data indicates a CpG‑rich promoter region upstream of the transcription start site. Several transcription factor binding motifs are predicted, including sites for SP1, GATA1, and NF‑κB. Chromatin immunoprecipitation sequencing (ChIP‑seq) experiments show enrichment of the histone mark H3K4me3 at the promoter, which is characteristic of actively transcribed genes. The promoter also harbors a conserved DNase I hypersensitive region, implying that the chromatin is accessible in a variety of cell types.
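Whether a promoter counts as CpG‑rich is usually judged from its GC content and its observed/expected CpG ratio. The sketch below computes both; the commonly cited thresholds (length of at least 200 bp, GC content above 50%, ratio above 0.6) are given for orientation only, and the input is a toy sequence, not the actual C11orf98 promoter.

```python
def cpg_metrics(sequence: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = sequence.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Observed/expected CpG = (#CpG * N) / (#C * #G)
    if c_count and g_count:
        obs_exp = (cpg_count * n) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return gc_fraction, obs_exp


# Toy 200 bp sequence used purely for illustration.
toy_promoter = "CGCGCGATAT" * 20
gc_fraction, obs_exp = cpg_metrics(toy_promoter)
print(round(gc_fraction, 2), round(obs_exp, 2))
```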
Transcriptional Features
RNA‑seq datasets from the GTEx project show that C11orf98 is transcribed at low levels across many tissues, with notable peaks in the testis and adrenal gland. Some datasets also contain reads that map outside the annotated exon, which has been interpreted as a moderate level of transcriptional noise. The mRNA is relatively stable, with a half‑life of approximately 5 hours as estimated by actinomycin D transcription inhibition assays in HeLa cells.
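The quoted half‑life can be related to transcript abundance with a first‑order decay model, which is the usual assumption behind actinomycin D chase experiments. The sketch below is illustrative only: the function names are hypothetical, and real analyses typically fit a curve to several time points rather than two.

```python
import math


def decay_constant(half_life_hours: float) -> float:
    """First-order decay constant k (per hour), from t_half = ln(2) / k."""
    return math.log(2) / half_life_hours


def fraction_remaining(hours: float, half_life_hours: float = 5.0) -> float:
    """Fraction of the starting mRNA left after `hours` with transcription blocked."""
    return math.exp(-decay_constant(half_life_hours) * hours)


def half_life_from_two_points(t1: float, n1: float, t2: float, n2: float) -> float:
    """Estimate the half-life from two abundance measurements on the decay curve."""
    return math.log(2) * (t2 - t1) / math.log(n1 / n2)


for hours in (0, 5, 10):
    print(hours, round(fraction_remaining(hours), 3))
```

With a 5‑hour half‑life, half of the transcript pool remains after 5 hours and a quarter after 10 hours.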
Expression Patterns
Developmental Expression
Data from developmental transcriptomes suggest that C11orf98 expression rises modestly during early embryogenesis, peaking around embryonic day 12 in mouse models (orthologous gene). In human embryonic stem cell lines differentiated into neuronal lineages, C11orf98 levels increase gradually during the transition from pluripotent to neuroectodermal states, implying a potential role in early neural differentiation.
Adult Tissue Distribution
Quantitative PCR analyses across 30 adult human tissues show detectable expression in the liver, kidney, heart, and brain. The highest relative expression is observed in the testis, where mRNA abundance is 4–5 fold above the median of all surveyed tissues. Immunohistochemistry studies using antibodies raised against a C‑terminal peptide have identified nuclear staining in seminiferous tubules, consistent with the nuclear localization seen in transient expression assays.
Cellular Stress Response
Exposure to oxidative stressors such as hydrogen peroxide or heavy metals induces a moderate upregulation of C11orf98 transcripts in cultured fibroblasts. Conversely, treatment with cytokines like interferon‑γ does not elicit significant changes. These observations indicate that the gene may participate in pathways that sense or respond to redox imbalance.
Protein Structure and Function
Secondary and Tertiary Structure
Computational modeling using AlphaFold has produced a predicted structure in which the protein adopts a compact globular fold dominated by alpha‑helices. The DUF4512 domain forms a central core, while the N‑terminal tail extends as a flexible loop. The predicted nuclear localization signal lies on a solvent‑exposed surface, facilitating import into the nucleus via the classical importin‑α/β pathway.
Interaction Partners
Affinity purification coupled to mass spectrometry (AP‑MS) from HEK293T cells expressing FLAG‑tagged C11orf98 identified interactions with components of the mediator complex (MED1, MED5), transcriptional repressor CTBP1, and the chromatin remodeler CHD4. Co‑immunoprecipitation experiments confirmed physical association with MED1 and CHD4. No direct DNA‑binding activity has been detected in electrophoretic mobility shift assays (EMSAs), suggesting that the protein functions as part of a larger transcriptional regulatory complex rather than as a DNA‑binding transcription factor.
Functional Assays
RNA interference knockdown of C11orf98 in HeLa cells leads to a mild increase in cell cycle progression as measured by BrdU incorporation. Gene expression profiling after knockdown reveals upregulation of genes involved in the G1/S transition, including CCND1 and CDK2, and downregulation of stress‑responsive genes such as HMOX1. Overexpression of the protein in the same cell line results in a slight reduction in proliferation and induces the expression of a small subset of genes associated with apoptosis (BAX, NOXA). These results point toward a regulatory role in maintaining cell cycle homeostasis.
Evolutionary Conservation
Orthologous Genes
BLAST searches reveal orthologs of C11orf98 in mammals, including mice, rats, and non‑human primates. The human protein shares 68% amino acid identity with the mouse ortholog over its full length. No clear orthologs are found in invertebrates such as Drosophila melanogaster, indicating that the gene arose within the vertebrate lineage.
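A percent identity figure such as the 68% quoted above is computed from a pairwise alignment. The sketch below shows one common definition, in which identical residues are counted over the aligned columns where neither sequence has a gap; other tools use different denominators, so values are not always directly comparable. The aligned strings are invented toy peptides, not the human or mouse sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip gap columns
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment for illustration: 5 identical residues in 7 ungapped columns.
identity = percent_identity("MKT-AYLR", "MKSQAYIR")
print(round(identity, 1))
```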
Phylogenetic Analysis
A phylogenetic tree constructed from 50 vertebrate sequences places C11orf98 within a well‑supported clade that diverged approximately 170 million years ago, coinciding with the emergence of placental mammals. The gene is absent in the genomes of amphibians and fish, suggesting that its function may be linked to mammalian-specific regulatory mechanisms, possibly related to advanced reproductive or endocrine systems.
Clinical Significance
Genetic Variants
Whole‑exome sequencing studies have identified rare missense variants in C11orf98 in individuals with unexplained neurodevelopmental disorders. However, these variants are present at very low allele frequencies (
Association Studies
Genome‑wide association studies (GWAS) have not reported significant associations between SNPs within or near C11orf98 and common diseases such as type 2 diabetes, hypertension, or breast cancer. A single candidate‑gene study suggested a nominal association with early‑onset osteoporosis, but the finding did not reach statistical significance after correction for multiple testing.
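The difference between a nominal association and one that survives multiple-testing correction can be illustrated with the Bonferroni method, which divides the significance threshold by the number of tests. The source does not state which correction the candidate-gene study used, so the method here is an assumption, and the p-values are hypothetical.

```python
def bonferroni_significant(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Flag each p-value as significant after Bonferroni correction."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)  # per-test significance threshold
    return [p <= threshold for p in p_values]


# Hypothetical p-values for five tested variants; the first is nominally
# significant (p < 0.05) but does not pass the corrected threshold of 0.01.
toy_p_values = [0.03, 0.20, 0.45, 0.60, 0.75]
print(bonferroni_significant(toy_p_values))
```

Bonferroni correction is conservative; less stringent procedures, such as false discovery rate control, may retain associations that it rejects.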
Potential as a Biomarker
Elevated levels of C11orf98 mRNA were detected in biopsy samples from patients with non‑alcoholic fatty liver disease (NAFLD), correlating with fibrosis stage. While this observation is preliminary, it raises the possibility that C11orf98 expression could serve as a biomarker for liver disease progression. Further validation in larger cohorts is required.
Research Studies
High‑Throughput Functional Screens
A CRISPR/Cas9 loss‑of‑function screen conducted in A549 lung carcinoma cells identified C11orf98 as one of several genes whose depletion conferred resistance to cisplatin. Subsequent rescue experiments with wild‑type cDNA restored sensitivity, supporting a role in drug response pathways. The precise mechanism, however, remains to be clarified.
Proteomic Investigations
Mass spectrometry analysis of nuclear extracts from induced pluripotent stem cells revealed that C11orf98 associates with the transcription factor TCF3 during neural differentiation. Knockdown of C11orf98 disrupted the TCF3 target gene network, resulting in impaired neuronal maturation, as assessed by morphological criteria and expression of MAP2 and TUJ1.
Animal Models
Knockout mice lacking the C11orf98 ortholog (C11orf98−/−) were generated using CRISPR/Cas9 technology. Homozygous mutants displayed normal viability and fertility, but exhibited subtle deficits in learning and memory in the Morris water maze. No overt morphological abnormalities were observed in histological examinations of major organs.
Potential Applications
Targeted Therapies
Given its putative role in transcriptional regulation, C11orf98 could be considered a target for small‑molecule modulators aimed at correcting dysregulated gene expression in disease contexts. In vitro screening of compound libraries has identified molecules that bind to the DUF4512 domain, stabilizing its interaction with MED1. Further drug development efforts would require detailed characterization of binding kinetics and cellular efficacy.
Diagnostic Tools
C11orf98 sequences have been reported in circulating tumor DNA (ctDNA) from a limited number of colorectal cancer patient samples. A quantitative PCR assay that detects these sequences in plasma could offer a minimally invasive diagnostic approach, pending validation in larger cohorts and assessment of specificity.
Gene Therapy
Because of its regulatory function in cell proliferation, C11orf98 may be a candidate for gene therapy in regenerative medicine. Overexpression of the gene in mesenchymal stem cells has been shown to enhance osteogenic differentiation in vitro, suggesting potential utility in bone repair strategies. However, safety studies are required to rule out oncogenic risks.
Future Directions
Functional Characterization
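Mapping the genomic regions occupied by C11orf98 using chromatin immunoprecipitation sequencing (ChIP‑seq) would show which promoters the protein is recruited to. Because ChIP‑seq detects both direct and indirect association with DNA, these data would need to be combined with in vitro binding assays to establish whether C11orf98 contacts DNA itself or functions solely through protein–protein interactions. Conditional knockout models in specific tissues (e.g., liver, brain) would help delineate organ‑specific roles.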
Clinical Translation
Large‑scale association studies with deep phenotyping could uncover subtle links between C11orf98 variants and disease susceptibility. Integration of transcriptomic and proteomic data from patient samples would strengthen biomarker validation efforts.
Structural Studies
Experimental determination of the protein structure by X‑ray crystallography or cryo‑electron microscopy is warranted to validate computational models and to guide rational drug design. Mutagenesis of key residues within the DUF4512 domain could identify functional hotspots.