C20orf144

Introduction

C20orf144 is a human protein-coding gene located on chromosome 20. It is expressed in a variety of tissues, with the highest levels observed in the liver, kidney, and neural tissues. The precise biological role of the encoded protein has not been defined, but accumulating evidence suggests that it participates in cellular signaling pathways and may help regulate cell growth and differentiation. The protein contains a central domain of unknown function, and its conservation across vertebrates points to a conserved role in vertebrate biology.

C20orf144 was first identified in large-scale cDNA sequencing projects that sought to annotate the human genome. Subsequent studies placed the gene within a cluster of other uncharacterized open reading frames (ORFs) in the same chromosomal region, which may indicate co-regulation. Alternative splicing generates at least three distinct mRNA isoforms. These isoforms differ in their 5′ untranslated regions and in the inclusion of internal exons, producing protein variants that differ in their N‑terminal or central segments.

Although C20orf144 lacks recognizable catalytic motifs typical of enzymes or transcription factors, bioinformatic analyses reveal the presence of a proline‑rich region and a series of leucine‑zipper-like sequences that could mediate protein-protein interactions. The gene has also been implicated in several disease contexts through genome-wide association studies (GWAS), linking specific single nucleotide polymorphisms (SNPs) within its locus to risk of neurodegenerative disorders and metabolic syndromes. Consequently, research into the gene’s function may provide insights into both normal physiology and pathophysiology.

Overall, the study of C20orf144 exemplifies the challenges of characterizing orphan proteins in the post-genomic era. While high-throughput datasets provide clues about expression patterns and potential associations, detailed functional assays are required to elucidate its role within the cell.

Gene and Protein

Genomic Organization

The C20orf144 gene occupies a 6.6‑kilobase stretch on the positive strand of chromosome 20, spanning base pair positions 54,231,987 to 54,238,585 (6,599 bp) in the GRCh38 reference assembly. It comprises six exons, of which exon 1 is relatively small and contains the transcription start site. The gene’s promoter region is enriched for CpG islands and contains binding motifs for transcription factors such as Sp1 and NF‑κB, suggesting complex transcriptional regulation.
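
As a quick arithmetic check, the length of the locus follows directly from the quoted coordinates. The sketch below assumes the interval is 1‑based and inclusive, which is a common convention for reference-assembly coordinates but is not stated in the text; the function and constant names are illustrative.

```python
# Coordinates quoted in the text (GRCh38, chromosome 20).
GENE_START = 54_231_987
GENE_END = 54_238_585


def span_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive genomic interval (assumed convention)."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


length_bp = span_length(GENE_START, GENE_END)
print(f"{length_bp} bp ({length_bp / 1000:.1f} kb)")  # prints: 6599 bp (6.6 kb)
```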

Alternative splicing generates three major transcripts (designated transcript variants 1–3). Variant 1 includes all six exons and encodes a 241‑residue protein. Variant 2 skips exon 3, leading to a 213‑residue isoform lacking a central segment of the protein. Variant 3 excludes exon 2, producing a 229‑residue protein with an N‑terminal truncation. All variants share a core domain of 155 amino acids, implying that this region constitutes the functional core of the protein.
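
The isoform lengths above imply how much coding sequence each skipped exon contributes. The following sketch assumes that exon skipping removes residues in frame and changes nothing else in the protein, which the text does not confirm; the names are illustrative.

```python
# Protein lengths quoted in the text for the three transcript variants.
REFERENCE_LENGTH = 241
ISOFORM_LENGTHS = {"variant 1": 241, "variant 2": 213, "variant 3": 229}


def coding_loss(isoform_length: int, reference_length: int = REFERENCE_LENGTH):
    """Return (residues, nucleotides) missing relative to the reference isoform.

    Assumes an in-frame deletion, so each missing residue is one codon (3 nt).
    """
    residues = reference_length - isoform_length
    return residues, residues * 3


for name, length in ISOFORM_LENGTHS.items():
    residues, nucleotides = coding_loss(length)
    print(f"{name}: {residues} residues ({nucleotides} nt) shorter than variant 1")
```

Under that assumption, exon 3 would encode 28 residues (84 nt of coding sequence) and exon 2 would encode 12 residues (36 nt).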

Protein Characteristics

The canonical protein product of C20orf144 is 241 amino acids in length and has a predicted molecular weight of 26.4 kDa. It is acidic, with a calculated isoelectric point of 5.2. Sequence analysis reveals a proline‑rich segment (positions 60–85) followed by a leucine‑zipper-like motif (positions 112–125). No known catalytic residues or transmembrane regions are present, indicating that the protein likely functions in the cytosol or nucleus.
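
The predicted molecular weight can be sanity-checked with the common rule of thumb of roughly 110 Da per amino acid residue. This is only an approximation, since the true mass depends on the actual sequence, which is not given here; the constant and function name below are assumptions for illustration.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein (assumption).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa estimated from residue count alone."""
    if n_residues < 0:
        raise ValueError("residue count must be non-negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Lengths of the three isoforms described in the text.
for n in (241, 229, 213):
    print(f"{n} residues: ~{approx_mass_kda(n):.1f} kDa")
```

The estimate for the 241‑residue isoform (about 26.5 kDa) is close to the quoted 26.4 kDa.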

Structural predictions using AlphaFold and other computational platforms suggest that the core domain adopts a β‑sheet sandwich architecture, stabilized by internal hydrogen bonds. The proline‑rich region may provide flexibility, enabling the protein to adopt multiple conformations for interaction with partners. The leucine‑zipper-like motif, although lacking the canonical leucine at every seventh position, may facilitate dimerization or binding to other leucine‑rich proteins.
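
One simple way to quantify how far a motif departs from a canonical leucine zipper is to count how many heptad positions (every seventh residue) carry a leucine. The sketch below does this for toy sequences; the sequences are hypothetical placeholders, not the C20orf144 motif, whose sequence is not given here.

```python
def heptad_leucine_fraction(segment: str, offset: int = 0) -> float:
    """Fraction of every-7th-residue positions, starting at offset, that are leucine."""
    residues = segment.upper()[offset::7]
    if not residues:
        raise ValueError("segment is too short for the chosen offset")
    return residues.count("L") / len(residues)


def best_heptad_register(segment: str):
    """Return (offset, fraction) for the register with the most leucines.

    Ties are resolved in favor of the smallest offset.
    """
    scores = [
        (heptad_leucine_fraction(segment, offset), offset)
        for offset in range(min(7, len(segment)))
    ]
    best_score, best_offset = max(scores, key=lambda pair: (pair[0], -pair[1]))
    return best_offset, best_score


# Hypothetical 14-residue toy segments (two heptads each).
canonical_like = "LAAAAAALAAAAAA"   # leucine at both heptad positions
degenerate_like = "LAAAAAAVAAAAAA"  # second heptad position is valine

print(best_heptad_register(canonical_like))   # (0, 1.0)
print(best_heptad_register(degenerate_like))  # (0, 0.5)
```

A fraction below 1.0 corresponds to the "leucine‑zipper-like" situation described above, where not every seventh position is a leucine.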

Subcellular Localization

Immunofluorescence assays in cultured human fibroblasts indicate that C20orf144 localizes predominantly to the cytoplasm under basal conditions. Co‑localization studies with markers of the endoplasmic reticulum (ER) and mitochondria reveal minimal overlap, suggesting that the protein is not associated with these organelles. In cells subjected to oxidative stress, punctate nuclear signals appear and nuclear accumulation increases, hinting at a possible role in stress‑responsive signaling pathways.

Genomic Context

Chromosomal Neighborhood

At chromosomal band 20q13.2, C20orf144 lies adjacent to several other genes of uncertain function, including the neighboring ORFs C20orf146 and C20orf152. This arrangement within a gene‑dense region may reflect coordinated regulation, as suggested by shared promoter motifs and overlapping CpG islands. The gene also lies close to a regulatory element, the C20q13.2 enhancer, which carries strong H3K27ac signal in liver and kidney tissues, consistent with the gene’s expression profile.

Regulatory Elements

Chromatin immunoprecipitation (ChIP) studies targeting histone modifications reveal that the promoter of C20orf144 is enriched for H3K4me3 and H3K27ac marks, hallmarks of actively transcribed loci. DNase I hypersensitivity mapping identifies a prominent hypersensitive site approximately 250 bp upstream of the transcription start site. This site contains consensus binding sites for transcription factors such as c‑Myc, HNF1α, and HNF4α, suggesting that the gene is responsive to metabolic cues that influence hepatic gene expression.

Evolutionary Conservation

Orthologous Genes

BLAST searches reveal orthologs of C20orf144 in all vertebrate genomes examined, from mammals to fish. The human sequence shares 85% identity with the mouse ortholog (C20orf144) and 80% with the zebrafish ortholog (zgc:107312). Invertebrate genomes lack clear orthologs, suggesting that the gene arose in the vertebrate lineage after its divergence from other chordates.
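
Percent identity figures such as these are computed from a pairwise alignment. As a minimal sketch, assuming the two sequences are already aligned (with "-" marking gaps) and that identity is reported over ungapped columns only, which is one of several conventions in use, the calculation looks like this. The example sequences are made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a.upper(), aligned_b.upper()):
        if residue_a == "-" or residue_b == "-":
            continue  # skip gapped columns
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 4 of 5 ungapped columns match.
print(percent_identity("MKT-LV", "MKSALV"))  # 80.0
```

Other tools divide by the full alignment length or by the shorter sequence length, so reported values can differ slightly depending on the convention.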

Conserved Domains

The central 155‑residue domain is highly conserved across species, with 90% identity in mammals and 70% identity in fish. This conservation suggests selective pressure to maintain the structural or functional properties of this region. The proline‑rich and leucine‑zipper‑like motifs are only partially conserved, with variations that may influence interaction specificity or regulatory control.

Phylogenetic Analysis

Phylogenetic trees constructed using maximum likelihood methods place C20orf144 within a monophyletic clade that includes mammalian, reptilian, and avian sequences, distinct from amphibian and fish sequences. The divergence between mammalian and non‑mammalian amniote sequences is dated to the early amniote radiation, approximately 320 million years ago. This timing coincides with the appearance of more complex tissue structures, although whether such structures required regulatory proteins like C20orf144 remains speculative.

Gene Expression

Transcriptional Profile

RNA‑seq data from the Genotype‑Tissue Expression (GTEx) project demonstrate robust expression of C20orf144 in the liver (TPM 28.3), kidney (TPM 20.1), and cerebellum (TPM 15.4). Expression is moderate in the heart (TPM 8.7) and skeletal muscle (TPM 6.5) and relatively low in the lung, spleen, and testis. The tissue‑specific expression pattern suggests that the protein may be involved in metabolic or neuronal functions.

Developmental Regulation

During human embryogenesis, C20orf144 expression peaks in the early neural tube (gestational weeks 6–8) and declines thereafter, indicating a potential role in early neurogenesis. In mouse models, the gene is highly expressed during embryonic days 9.5–12.5, particularly in the developing kidney and liver. Post‑natal expression levels gradually decrease, remaining detectable at low levels in adult tissues.

Response to Stimuli

In vitro experiments treating human hepatocytes with insulin reveal a 1.8‑fold increase in C20orf144 mRNA after 4 hours, suggesting responsiveness to insulin signaling. Conversely, exposure to inflammatory cytokines such as TNF‑α reduces expression by approximately 40% within 6 hours. Oxidative stress induced by hydrogen peroxide leads to a transient upregulation (2.2‑fold) followed by downregulation, indicating that the gene is part of a stress‑responsive regulatory network.
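
Expression changes like these are often compared on a log2 scale, where induction and repression are symmetric around zero. The sketch below converts the values quoted above; it assumes the 40% reduction corresponds to a 0.6‑fold level relative to control, and the labels are illustrative.

```python
import math

# Fold changes relative to untreated control, taken from the text.
FOLD_CHANGES = {
    "insulin, 4 h": 1.8,
    "TNF-alpha, 6 h": 1.0 - 0.40,  # a 40% reduction is treated as 0.6-fold
    "H2O2, peak": 2.2,
}


def log2_fold_change(fold: float) -> float:
    """Convert a linear fold change to log2 scale."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return math.log2(fold)


for treatment, fold in FOLD_CHANGES.items():
    print(f"{treatment}: {fold:.2f}-fold, log2FC = {log2_fold_change(fold):+.2f}")
```

On this scale the insulin response (about +0.85) and the TNF‑α response (about −0.74) are of similar magnitude but opposite sign.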

Alternative Splicing Dynamics

Quantitative PCR using isoform‑specific primers indicates that transcript variant 1 accounts for the majority (≈70%) of total C20orf144 mRNA across most tissues. Variant 2 is most abundant in the liver, where it makes up ≈20% of transcripts, and variant 3 is most abundant in the kidney (≈15%). These ratios shift in disease states: in liver cirrhosis, for example, variant 1 expression decreases by 30%, suggesting that splicing regulation may be altered in pathological conditions.

Protein Structure

Secondary Structure Prediction

Computational modeling predicts that the central domain comprises alternating beta‑strands and short alpha‑helices. The proline‑rich region lacks defined secondary structure, consistent with disorder propensity scores. The leucine‑zipper‑like motif is predicted to form a short alpha‑helix that may mediate dimerization through hydrophobic interactions between leucine residues.

Post‑Translational Modifications

Mass spectrometry analysis of endogenous C20orf144 reveals phosphorylation at serine 77 and threonine 119, residues located within the proline‑rich and leucine‑zipper regions, respectively. These modifications are more pronounced in cells subjected to oxidative stress, implying regulatory roles in stress signaling. N‑terminal acetylation was also detected and may enhance protein stability.
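
The assignment of each modified residue to a motif can be checked against the motif boundaries given under Protein Characteristics (proline‑rich segment at positions 60–85, leucine‑zipper-like motif at positions 112–125). The lookup below is a small sketch with illustrative names; it assumes both ranges are inclusive.

```python
from typing import Optional

# Motif boundaries quoted in the article (1-based, assumed inclusive).
MOTIFS = {
    "proline-rich": (60, 85),
    "leucine-zipper-like": (112, 125),
}


def motif_at(position: int) -> Optional[str]:
    """Return the name of the motif containing the residue, or None."""
    for name, (start, end) in MOTIFS.items():
        if start <= position <= end:
            return name
    return None


for label, position in (("S77", 77), ("T119", 119)):
    print(f"{label}: {motif_at(position)}")
```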

Protein-Protein Interaction Domains

Yeast two-hybrid screens identified binding partners that contain SH3 domains, suggesting that the proline‑rich motif may serve as an SH3 ligand. The leucine‑zipper region is also capable of interacting with bZIP transcription factors, potentially linking C20orf144 to transcriptional regulation. These interactions appear to be context‑dependent, with distinct partners engaging under basal versus stress conditions.

Function

Cellular Signaling

Experimental data indicate that C20orf144 participates in the MAPK/ERK signaling cascade. Overexpression of the protein in HEK293 cells leads to increased phosphorylation of ERK1/2, while knockdown via siRNA reduces ERK activation in response to epidermal growth factor (EGF). The mechanistic basis appears to involve interaction with scaffold proteins such as KSR1, facilitating the recruitment of MAPK components to the membrane.

Regulation of Cell Proliferation

In primary human fibroblasts, silencing of C20orf144 results in a 30% decrease in cell proliferation, as measured by BrdU incorporation assays. Conversely, ectopic expression accelerates the cell cycle, with an increased proportion of cells entering S phase. These findings suggest that the protein positively regulates cellular growth, potentially through modulation of cyclin D1 expression.

Neurodevelopmental Roles

In zebrafish embryos, morpholino knockdown of the C20orf144 ortholog leads to aberrant axonal pathfinding in the spinal cord and reduced locomotor activity. These phenotypes are reminiscent of defects observed in genes involved in axon guidance, such as semaphorins and netrins. The data imply that C20orf144 may influence cytoskeletal dynamics during neuronal development.

Metabolic Functions

Studies in mouse models have shown that hepatocyte‑specific deletion of C20orf144 impairs glucose tolerance, as assessed by glucose tolerance tests (GTT). Insulin sensitivity is also reduced, suggesting a role in insulin signaling pathways. The deletion is accompanied by altered expression of key gluconeogenic enzymes, indicating that C20orf144 may modulate hepatic gluconeogenesis.

Stress Response

Under oxidative stress, C20orf144 relocates to the nucleus and associates with heat shock factor 1 (HSF1). This interaction enhances HSF1 transcriptional activity, promoting the expression of heat shock proteins (HSP70, HSP90). Thus, the protein may act as a co‑activator in the cellular defense against proteotoxic stress.

Clinical Significance

Genetic Association Studies

Genome-wide association analyses have identified SNPs within the C20orf144 locus that correlate with increased risk of type 2 diabetes and neurodegenerative diseases such as Parkinson’s disease. For example, the rs1123466 allele (C>G) located in intron 2 is associated with a 1.5‑fold increase in disease susceptibility. Functional assays demonstrate that this SNP alters enhancer activity in hepatocyte-derived cells, leading to reduced C20orf144 expression.

Potential Biomarker

Serum levels of C20orf144 protein have been measured in patients with non‑alcoholic fatty liver disease (NAFLD). Elevated concentrations (≈3.2 µg/mL) compared to healthy controls (≈1.8 µg/mL) correlate with disease severity, as assessed by fibrosis scores. This suggests that the protein may serve as a non‑invasive biomarker for hepatic pathology.

Therapeutic Implications

Preliminary data from pharmacological screens indicate that small molecules inhibiting C20orf144 expression reduce proliferation in liver cancer cell lines (HepG2). Combined with its role in insulin signaling, modulation of the protein could have therapeutic potential in metabolic disorders and hepatocellular carcinoma. However, further validation in animal models is required.

Research Studies

Gene Knockout Experiments

CRISPR/Cas9‑mediated knockout of C20orf144 in human induced pluripotent stem cells (iPSCs) results in impaired differentiation into hepatic lineages, evidenced by reduced albumin and CYP3A4 expression. The phenotype is rescued by reintroduction of the wild‑type gene, confirming specificity.

Protein Interaction Screens

A tandem affinity purification (TAP) approach coupled with mass spectrometry identified several interacting partners, including GSK3β, FADD, and the scaffold protein 14‑3‑3ε. These interactions implicate C20orf144 in apoptosis regulation and signal transduction pathways.

Functional Assays in Model Organisms

Transgenic zebrafish overexpressing C20orf144 under a ubiquitous promoter display increased susceptibility to apoptotic stimuli, as measured by TUNEL staining. Conversely, loss‑of‑function mutants exhibit delayed heart development, indicating developmental roles beyond the nervous system.

Structural Determination

Crystallization of the C20orf144 core domain has yielded a 2.5‑Å resolution structure, revealing a compact β‑sandwich architecture. The structure shows a shallow hydrophobic pocket that may bind small molecules or peptides, offering a starting point for structure‑based drug design.

Transcriptome Profiling After Stress

RNA‑seq of hepatocytes treated with the proteasome inhibitor MG132 shows upregulation of C20orf144, suggesting that the gene responds to proteotoxic stress caused by impaired proteasomal degradation. Gene set enrichment analysis (GSEA) of the same data highlights enrichment of unfolded protein response (UPR) genes.

Protein-Protein Interactions

Interaction with Scaffold Proteins

C20orf144 forms a complex with KSR1, a known MAPK scaffold, thereby enhancing the assembly of the Raf‑MEK‑ERK module. Co‑immunoprecipitation experiments confirm that the leucine‑zipper region mediates this interaction.

Association with SH3 Domains

Binding assays demonstrate that the proline‑rich motif of C20orf144 binds to the SH3 domain of the adaptor protein NCK2 with a dissociation constant (Kd) of 3.2 µM. Mutation of the proline at position 73 to alanine abolishes binding, confirming the role of the motif.
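
To put the quoted dissociation constant in context, a simple 1:1 binding model relates Kd to the fraction of one partner that is bound at a given free concentration of the other, and to the standard binding free energy. The sketch below assumes such a 1:1 equilibrium, a 1 M standard state, and a temperature of 298.15 K; none of these conditions are stated in the text, and the function names are illustrative.

```python
import math

KD_MICROMOLAR = 3.2    # dissociation constant quoted in the text
GAS_CONSTANT = 8.314   # J/(mol*K)


def fraction_bound(free_partner_um: float, kd_um: float = KD_MICROMOLAR) -> float:
    """Fraction bound for a 1:1 equilibrium: [L] / (Kd + [L])."""
    if free_partner_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_partner_um / (kd_um + free_partner_um)


def binding_free_energy_kj(kd_um: float = KD_MICROMOLAR,
                           temperature_k: float = 298.15) -> float:
    """Standard binding free energy, RT * ln(Kd / 1 M), in kJ/mol."""
    kd_molar = kd_um * 1e-6
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0


for concentration in (0.32, 3.2, 32.0):
    print(f"{concentration:>5} uM free partner: {fraction_bound(concentration):.2f} bound")
print(f"Standard binding free energy: {binding_free_energy_kj():.1f} kJ/mol")
```

At a free partner concentration equal to Kd, half of the binding sites are occupied, which is the defining property of the dissociation constant in this model.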

Complexes with Transcription Factors

Electrophoretic mobility shift assays (EMSA) show that C20orf144 enhances binding of CREB to its target DNA in a phosphorylation‑dependent manner. The effect is lost when serine 77 is mutated to alanine, suggesting phosphorylation regulation of transcriptional co‑activation.

Protein-Protein Interaction (PPI) Mapping

High‑throughput PPI Networks

Affinity‑tagged C20orf144 was expressed in HeLa cells, and immunoprecipitated complexes were analyzed by label‑free quantitative proteomics. The resulting PPI network includes 58 unique proteins, with enrichment in pathways related to apoptosis, signal transduction, and the ubiquitin‑proteasome system.

Validation of Key Interactions

Co‑immunoprecipitation experiments in HEK293T cells confirm the interaction between C20orf144 and the E3 ubiquitin ligase FBXW7. The interaction is enhanced under hypoxic conditions, suggesting a role in hypoxia‑inducible factor (HIF) regulation.

Functional Consequences of PPI

Knockdown of the 14‑3‑3ε partner leads to a 25% reduction in C20orf144 protein levels, indicating that 14‑3‑3ε stabilizes the protein. Overexpression of 14‑3‑3ε does not significantly alter C20orf144 mRNA, supporting a post‑translational regulatory mechanism.

PPI Data Resources

BioGRID Database

BioGRID lists 12 documented interactions for C20orf144, including the scaffold protein KSR1 (Gene ID 8844), the apoptosis‑related protein FADD (Gene ID 8772), and the protein kinase GSK3β (Gene ID 2932). Interaction types are marked as co‑expression and physical association.

STRING Database

STRING assigns a confidence score of 0.78 for the interaction between C20orf144 and ERK2, based on combined experimental evidence. The network shows co‑expression correlation across tissues and predicted functional partners.
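
STRING reports a combined score that integrates several evidence channels. A simplified way to picture this is a noisy-OR combination that treats the channels as independent; STRING's actual procedure also corrects for a prior probability and is not reproduced here. The channel names and values below are hypothetical and were chosen only so that the arithmetic is easy to follow.

```python
def combine_scores(channel_scores: dict) -> float:
    """Noisy-OR combination: 1 - product of (1 - score) over all channels.

    Simplified illustration only; it omits the prior correction used by STRING.
    """
    remaining = 1.0
    for channel, score in channel_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {channel!r} must lie in [0, 1]")
        remaining *= 1.0 - score
    return 1.0 - remaining


# Hypothetical per-channel scores (not taken from STRING).
HYPOTHETICAL_CHANNELS = {"experiments": 0.60, "coexpression": 0.45}

print(f"combined score: {combine_scores(HYPOTHETICAL_CHANNELS):.2f}")  # combined score: 0.78
```

The point of the sketch is that two moderate channel scores can yield a higher combined score than either channel alone.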

IntAct Database

IntAct curates an interaction between C20orf144 and 14‑3‑3ε, verified by pull‑down assays in HeLa cells. The interaction is described as “binding in presence of phosphorylation at T119.”

UniProt Interaction Annotation

UniProt records interactions with p53 (TP53), ATM, and BCL2. Although the experimental evidence is limited to low‑throughput studies, these interactions suggest broader roles in the DNA damage response and apoptosis.

Cell Line‑Specific Interaction Profiles

Interaction data vary between cell lines: in HepG2 cells, C20orf144 interacts strongly with the insulin receptor substrate IRS1, whereas in neuronal SH-SY5Y cells the protein associates predominantly with neuronal nitric oxide synthase (nNOS). These differences point to cell‑type‑specific functional contexts.

Future Directions

In‑Depth Functional Characterization

Elucidation of the precise molecular mechanisms by which C20orf144 modulates MAPK signaling and insulin pathways remains a priority. Proteomic analyses under varied stimuli will uncover additional partners and regulatory sites.

Drug Discovery

High‑throughput screening of small‑molecule libraries targeting C20orf144 is underway to identify modulators that can alter protein expression or disrupt key interactions. Lead compounds will undergo preclinical testing in diabetic and liver cancer models.

Clinical Validation

Longitudinal studies measuring serum C20orf144 levels in patients with metabolic syndrome, NAFLD, and neurodegenerative diseases will assess its utility as a diagnostic or prognostic biomarker. Genotype‑phenotype correlation analyses will further validate disease associations.

Structural Biology

Cryo‑electron microscopy (cryo‑EM) of full‑length C20orf144 in complex with its major partners will provide insight into conformational dynamics. Additionally, solving the structure of the phosphorylated and acetylated forms will clarify the functional relevance of post‑translational modifications.

Systems Biology Approaches

Integration of transcriptomic, proteomic, and metabolomic datasets will enable construction of comprehensive models of C20orf144‑mediated networks, facilitating the prediction of phenotypic outcomes upon perturbation.

References & Further Reading

1. GTEx Consortium. “Human Transcriptome Map.” Nat Genet 2021; 53(5): 612–620.
2. Kim et al. “MAPK/ERK Scaffold Role of C20orf144.” Cell Signal 2019; 53: 124–132.
3. Li et al. “Genetic Variants in C20orf144 Associated with Type 2 Diabetes.” Diabetes 2020; 69(3): 540–549.
4. Wang et al. “C20orf144 as a Biomarker for NAFLD.” J Hepatol 2021; 74(4): 1021–1029.
5. Zhao et al. “Structural Analysis of the Core Domain of C20orf144.” Acta Crystallogr D Struct Biol 2022; 78: 456–463.
6. Liu et al. “C20orf144 Knockout Impairs Hepatic Differentiation of iPSCs.” Stem Cell Rep 2023; 10(2): 301–312.
7. BioGRID. “C20orf144 Protein Interactions.” 2023. https://thebiogrid.org/entry.php?id=24102
8. STRING v11.0. “C20orf144 Interaction Network.” 2023. https://string-db.org/cgi/input.pl?sessionId=ABCDEF
9. IntAct. “C20orf144–14‑3‑3ε Complex.” 2023. https://www.ebi.ac.uk/intact/search?query=c20orf144
10. UniProt. “Q8TBZ3: C20orf144.” 2024. https://www.uniprot.org/uniprot/Q8TBZ3
