R-loops, formed through DNA-RNA hybridization during transcription, are intrinsically linked to both transcription and chromatin. On one hand, they are generated by transcription within chromatin; on the other hand, once formed, R-loops can influence both transcription and chromatin structure. To fully understand their functions, regulation, and impact, it is essential to study R-loops within the broader contexts of transcription and chromatin. In this review, we will systematically examine how different types of R-loops are regulated by transcription and chromatin, how they affect transcriptional activity and chromatin organization, and how these dynamics are altered in human diseases. Our goal is to emphasize the distinct features and roles of various R-loop types, laying the groundwork for future studies aimed at developing an integrated understanding of R-loops, transcription, and chromatin.
While the focus of this review is on the interplay among R-loops, transcription, and chromatin, it is important to note that R-loops represent a major source of genomic instability in cells [1], [2]. For example, collisions between R-loops and DNA replication forks contribute to transcription-replication conflicts (TRCs), giving rise to DNA breaks [3]. R-loops can be recognized by structure-specific endonucleases, leading to DNA cleavage [4]. Furthermore, the single-stranded DNA (ssDNA) exposed in R-loops can be attacked by DNA-modifying and repair enzymes, generating DNA lesions and toxic repair intermediates [5]. Given the known impact of DNA damage on transcription and chromatin, we need to carefully consider whether the effects of R-loops are direct or indirect in various biological contexts.
Recent advances in genome-wide R-loop mapping techniques, such as DNA-RNA immunoprecipitation-sequencing (DRIP-seq) and tagged RNaseH1 chromatin immunoprecipitation (R-ChIP), have provided a detailed and comprehensive view of the R-loop landscape across the genome. Although results vary somewhat between methods, R-loops are consistently detected in actively transcribed genes. It is estimated that around 5 % of the genome and up to 12 % of active genes harbor R-loops [6]. R-loops are particularly prevalent in GC-rich and GC-skewed sequences in promoter-proximal regions and gene bodies, as well as at transcription termination sites. R-loops associated with active genes can be classified into two primary types: "promoter-paused" R-loops and "elongation-associated" R-loops [7]. These distinct classes of R-loops differ in their mechanisms of formation, properties, functions, and potential side effects. While the basic mechanisms behind co-transcriptional R-loop formation have been extensively reviewed [2], [8], [9], this section will focus on recent insights into how different R-loops arise at active genes and how their dysregulation may contribute to various pathological conditions.
Promoter R-loops. Hotspots of R-loops are detected downstream of GC-skewed CpG island promoters. In addition, R-loop levels are generally high in a short region immediately downstream of the transcription start site (TSS). The formation of these promoter-proximal R-loops is likely driven by the pausing of RNA polymerase II (RNAPII) and the GC-skew of sequences near promoters and TSSs. The pausing and release of RNAPII near TSSs are natural steps of transcription. These steps are controlled by negative elongation factor (NELF) and DRB (5,6-dichloro-1-β-D-ribofuranosylbenzimidazole) sensitivity inducing factor (DSIF), as well as positive transcription elongation factor b (P-TEFb), allowing precise regulation of the level of transcription elongation. The pausing of RNAPII at GC-skewed sequences near promoters and TTSs favors hybridization of nascent RNA with the DNA template, presenting a major source of R-loops (Fig. 1A). The greater thermodynamic stability of GC base pairing between RNA and DNA and the high propensity of G-rich DNA to form G-quadruplexes (G4s) likely drive the formation of R-loops in GC-skewed regions.
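As a minimal illustration of the GC-skew concept discussed above, the following Python sketch computes the commonly used skew measure, (G − C)/(G + C), in sliding windows along one DNA strand. The function names, the window and step sizes, and the toy sequence are assumptions made for illustration only; they are not taken from the studies cited here, and which strand is analyzed (for example, the non-template strand) is left to the user.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for one sequence; 0.0 if it contains no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq: str, window: int = 100, step: int = 50):
    """Yield (start, skew) for each full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_skew(seq[start:start + window])


# Toy sequence (hypothetical, not real genomic data):
# a G-rich stretch followed by a C-rich stretch on the same strand.
toy = "GGGAGGGTGG" * 10 + "CCCTCCCACC" * 10
profile = list(windowed_gc_skew(toy, window=100, step=50))
for start, skew in profile:
    print(start, round(skew, 2))
```

In this toy profile, a positive skew marks the window in which G outnumbers C on the analyzed strand, and a negative skew marks the opposite; real analyses would apply such a calculation to annotated promoter or terminator sequences.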
Promoter-proximal R-loops can have either positive or negative impacts on transcription. The CpG islands at promoters are targets for DNA methylation, which represses transcription. Because R-loops are poor substrates of DNA methyltransferase 1 (DNMT1), the R-loops at CpG promoters counter DNA methylation and sustain active transcription (Fig. 1A). For example, the DNA-RNA helicase senataxin (SETX) is mutated in patients with amyotrophic lateral sclerosis 4 (ALS4), a motor neuron disease. An ALS4-associated gain-of-function mutant of SETX, L389S, decreases promoter R-loops, leading to increased methylation of the BAMBI promoter, reduced BAMBI expression, and heightened TGF-β signaling [9]. Paradoxically, promoter R-loops can also promote transcription termination. Sensor of ssDNA (SOSS), by recognizing the ssDNA in promoter R-loops, recruits the Integrator-PP2A (INTAC) complex to these R-loops to stimulate promoter-proximal transcription termination and attenuate R-loops, thereby ensuring genome stability [10] (Fig. 1A). Furthermore, a recent study reported that RNA splicing factors such as U2AF1 are enriched at promoter R-loops, which may influence co-transcriptional pre-mRNA splicing [11]. Thus, promoter R-loops play multiple regulatory roles during transcription, allowing transcription and associated processes to be programmed precisely in various contexts. The overall effects of promoter R-loops may be influenced by the DNA sequences at promoters, the exposure of ssDNA, and the competition among R-loop-, ssDNA-, and RNA-binding proteins.
Promoter R-loops are dysregulated in certain pathological conditions. For example, in cells lacking the tumor suppressor protein BRCA2, the recruitment of RNAPII associated factor 1 (PAF1) to the promoter-proximal pause site is diminished, resulting in reduced RNAPII elongation and an increase in promoter R-loops [12]. This increase in promoter R-loops may contribute to the genome instability in BRCA2-deficient cancer cells. Furthermore, expression of the splicing factor mutants associated with myelodysplastic syndrome (MDS), SRSF2 P95H and U2AF2 Q157P, increased promoter R-loops [13]. In cells expressing the SRSF2 mutant, P-TEFb is not properly extracted from a transitional RNAPII complex, preventing RNAPII from entering an actively elongating state. Notably, the SRSF2 mutant induces DNA damage in an R-loop-dependent manner, strengthening the link between promoter R-loops and genomic instability. Interestingly, R-loops around the TSSs of protein-coding genes can act as promoters of 5’ antisense lncRNAs [14], generating stable R-loops with DNA-RNA hybrids on both DNA strands (Fig. 1A). These results suggest that the levels of promoter R-loops are fine-tuned by the RNAPII dynamics around promoters, and that aberrant accumulation of promoter R-loops is a source of genomic instability.
Gene-body R-loops. When RNAPII elongates across gene bodies, the negative supercoiling behind RNAPII provides an opportunity for nascent RNA to hybridize with DNA template. Topoisomerase I (TOP1), which resolves negative supercoiling, is important for suppressing R-loops especially in long, highly transcribed genes in gene-poor regions [15]. Furthermore, elongating RNAPII oscillates between productive and backtracked states at various positions on DNA [16]. Expression of a mutant of TFIIS that traps RNAPII in its backtracked state pauses RNAPII in gene bodies and increases R-loop formation, suggesting that pausing and backtracking of elongating RNAPII give rise to R-loops [17]. Elongating RNAPII and nascent RNA are associated with RNA splicing factors, which may modulate the formation of R-loops during transcription elongation. In yeast, introns prevent R-loop formation through recruitment of the spliceosome onto the mRNA [18]. In human cells, cancer-associated recurrent mutations in RNA splicing factors U2AF1 and SRSF2 also increase R-loops, although the underlying mechanisms are not fully understood. Other proteins associated with nascent RNA, such as THO/TREX and TREX-2 complexes involved in mRNA export, also prevent R-loop formation during transcription elongation. BRCA2 was shown to suppress R-loops through its interaction with TREX-2 [19].
In addition to the mechanisms preventing R-loop formation, specific DNA-RNA helicases are implicated in the removal of co-transcriptional R-loops. Notably, loss of UAP56/DDX39B results in a global increase of R-loops in gene bodies and genomic instability [20]. In contrast, loss of DDX5 increases R-loops primarily at TSSs and transcription termination sites (TTSs), but not in gene bodies [21]. These findings suggest that different DNA-RNA helicases are used in different steps of transcription and at different genes to resolve R-loops. How different helicases are selectively used in specific contexts to suppress R-loops remains a question for future investigations.
While R-loop formation is induced by RNAPII pausing and backtracking, R-loops themselves can slow down transcription (Fig. 1B). In yeast, R-loops delay transcription elongation, and the transcription termination factor Rat1 promotes premature transcription termination at R-loops [22]. Two models for R-loop-mediated transcriptional blockage have been proposed: 1) RNAPII is held back by the R-loop it generates, and 2) RNAPII is blocked by downstream R-loops [23]. Like promoter R-loops, R-loops in gene bodies can promote the generation of antisense lncRNAs (Fig. 1B). VIM (vimentin) expression is regulated by its antisense partner, VIM-AS1 [24], which is transcribed in the opposite direction from 709 bp downstream of the TSS. The VIM-AS1 RNA forms R-loops at the VIM promoter, leading to decreased nucleosome occupancy and increased binding of transcription factors in the NF-κB pathway, activating VIM expression. Thus, RNAPII and R-loops affect each other during transcription elongation, providing a mechanism to regulate transcription after RNAPII pausing and release.
R-loops at the 3’ end. R-loops play critical roles in transcription termination by facilitating RNAPII pausing at the 3’ end of genes. For example, R-loops formed over the G-rich pause region of the human β-actin gene are essential for RNAPII to pause downstream of the poly(A) site [25] (Fig. 1C). Subsequent resolution of these R-loops by SETX enables the recruitment of the exonuclease XRN2 to TTSs, where XRN2 degrades nascent RNA, triggering the release of paused RNAPII and promoting transcriptional termination [25]. Other studies suggest that RNAPII pausing is not solely dependent on R-loops themselves but is also reinforced by R-loop-induced heterochromatin. Specifically, R-loops at G-rich TTSs increase pausing of RNAPII and formation of double-stranded RNA, which recruits Dicer, AGO1/2, and G9a to promote the formation of H3K9me2. H3K9me2 in turn recruits HP1γ (heterochromatin protein 1γ), further reinforcing RNAPII pausing and facilitating efficient transcriptional termination [25] (Fig. 1). In addition to SETX, the m6A modification of nascent RNA also contributes to XRN2 enrichment at TTSs, promoting the release of RNAPII. Intriguingly, the m6A modification of nascent RNA at TTSs is established in an R-loop-dependent manner. The R-loops formed at TTSs serve as anchors for the recruitment of DDX21, which in turn recruits METTL3 (methyltransferase-like 3) to methylate nascent RNA. The occupancy of XRN2 at TTSs is significantly compromised by the loss of methyltransferase activity of METTL3 [26].
Terminator R-loops are also dysregulated in certain pathological conditions. For example, cells deficient for the tumor suppressor protein BRCA1 fail to recruit SETX to TTSs efficiently, resulting in increased levels of terminator R-loops [27]. Notably, head-on collisions between transcription and DNA replication forks occur at the 3’ ends of active genes, promoting a buildup of DNA supercoiling. In cells deficient for TOP1, high levels of R-loops accumulate at TTSs, likely as a result of head-on conflicts between transcription and replication [28]. The terminator R-loops in both BRCA1-deficient and TOP1-depleted cells are associated with increased genomic instability, illustrating the side effects of this type of R-loop.
At the 3’ end of transcribed regions, readthrough transcription can give rise to R-loops downstream of genes. Interestingly, inhibition of the splicing factor SF3B with Pladienolide B (Plad-B) caused not only widespread intron retention but also loss of transcription termination at a subset of stress-response genes, leading to accumulation of aberrant “downstream of genes” (DoG) R-loops [29]. Alterations of DoG RNA expression and length are prevalent in cancers, which may contribute to transcriptomic imbalances in cancer cells [30]. DoG R-loops may be a new class of R-loops that contributes to the transcriptional changes and genomic instability in cancer cells.
In addition to protein coding genes, R-loops can be formed by non-coding RNAs either in cis or in trans, especially at repetitive DNA elements. Many types of R-loops at repetitive sequences have important physiological functions, but their dysregulation could become a threat to genomic integrity. In this section, we will focus on several types of R-loops formed at well-defined repetitive sequence elements.
Telomeric R-loops. Telomeres, the ends of chromosomes, are characterized by unique telomeric DNA repeats and are essential for maintaining chromosome integrity [31]. Despite their heterochromatic characteristics, telomeres are transcribed in many eukaryotes including humans, plants, and yeast [32]. TERRA is a population of non-coding RNAs transcribed from subtelomeric promoters into telomeric repeats [33] (Fig. 2A). The majority of TERRA is non-polyadenylated and retained at telomeres, where it plays critical roles in regulating chromatin structure and maintaining telomeres [33]. TERRA associates with telomeres through interactions with telomere-binding proteins, or by hybridizing with telomeric DNA to form R-loops [33]. An increase of TERRA at telomeres was observed following depletion of RNaseH1, an enzyme that degrades the RNA in DNA-RNA hybrids [34]. While telomeric R-loops help tether TERRA to telomeres, they can cause telomere fragility [34]. Therefore, the levels of telomeric R-loops are intricately controlled by several factors, such as ATRX and FANCM. It was proposed that R-loops and G4s, which are formed by the G-rich strand of telomeric DNA, recruit ATRX to telomeric repeats, where ATRX resolves R-loops directly or recruits other proteins to remove R-loops [35]. ATRX may also suppress R-loop formation by binding TERRA, preventing it from hybridizing with telomeric DNA [36]. Furthermore, the ATRX-DAXX complex deposits the histone variant H3.3 in chromatin regions including telomeres, coinciding with the heterochromatin histone mark H3K9me3, which may influence the formation of R-loops [37]. The role of ATRX in R-loop suppression is supported by evidence from cancer cells using the alternative lengthening of telomeres (ALT) pathway, where ATRX is frequently mutated. ATRX loss leads to an increase in telomeric R-loops, promoting ALT by inducing replication stress [38].
In contrast to ATRX, FANCM protects ALT telomeres by limiting R-loops [39]. FANCM depletion in ALT-positive cancer cells results in a substantial increase in telomeric R-loops, elevating telomere DNA damage to an intolerable level. FANCM directly resolves R-loops in vitro using its translocase activity [40]. The function of FANCM to suppress ALT-associated DNA damage requires its interaction with BLM, which inhibits BLM’s ability to promote ALT. BRCA1 was also shown to directly interact with TERRA and promote R-loop resolution, presumably in collaboration with SETX and XRN2, which suppresses replication stress and telomere fragility [41].
Centromeric R-loops. Centromeres are chromosomal regions in which kinetochore complexes are assembled and attached to spindles, driving chromosome segregation in mitosis and meiosis. The core regions of human centromeres consist of clusters of α-satellite repeats. Pericentromeres, the regions flanking centromere cores, also contain satellite repeats such as SAT2 and SAT3. At centromere cores, the histone H3 variant CENP-A associates with α-satellite repeats, forming a specialized chromatin domain. Similar to telomeres, pericentromeres are bound by histone H3.3 deposited by ATRX-DAXX (Fig. 2B). While centromeres and pericentromeres display heterochromatic features, satellite repeats are transcribed, giving rise to centromeric RNAs (cenRNAs) and R-loops. cenRNAs are important for the formation of CENP-A-containing chromatin and pericentromeric heterochromatin [42], [43]. Centromeric R-loops also play important roles in centromere function (Fig. 2B). For example, centromeric R-loops direct Aurora B kinase to maintain centromeric cohesion [44]. Centromeric R-loops also recruit and activate ATR kinase during mitosis, which promotes Chk1 and Aurora B activation at centromeres and ensures the fidelity of spindle-kinetochore attachment [45]. While R-loops are important for centromere functions, they also present a threat to genome stability. It was shown that BRCA1 suppresses R-loop-associated centromeric instability by countering R-loop formation at centromeres [46]. The DNA methyltransferase DNMT3b also suppresses the formation of centromeric R-loops to limit centromere instability, possibly by restricting cenRNA expression [47]. Notably, a recent study showed that centromeres are universal hotspots for DNA breakage and RAD51-mediated recombination even in quiescent cells [48], raising the possibility that centromeric R-loops are processed into DNA breaks independently of DNA replication. 
Thus, centromeric R-loops are an integral part of functional centromeres, are associated with both transcription and heterochromatin, and are a source of genomic instability.
R-loops at rDNA and tRNA genes. In human cells, the 47S ribosomal DNA (rDNA) repeats are transcribed by RNA polymerase I (RNAPI) in the nucleolus, generating 47S pre-rRNA, which is subsequently processed into 18S, 5.8S, and 28S rRNAs. The rDNA locus in the human genome contains up to 400 copies of 47S rDNA repeats and is one of the most highly transcribed chromosomal loci. The high GC-content, repetitive nature, and robust transcription of the rDNA locus make it highly prone to R-loop formation. Indeed, R-loops were detected in the entire RNAPI-transcribed region in 47S rDNA repeats, including the 5’ and 3’ external transcribed spacers (ETSs) [49] (Fig. 2C). Interestingly, rDNA R-loops recruit the single-stranded DNA (ssDNA)-binding protein RPA to prevent R-loop-induced DNA double-strand breaks (DSBs), and loss of RPA leads to decreased 47S pre-rRNA expression and disorganization of nucleolar structure [50], showing the functional importance of rDNA R-loops. In addition, RNAPII, assisted by SETX, generates an antisense R-loop shield at intergenic spacers (IGSs) flanking the rRNA genes. This R-loop shield prevents RNAPI from producing sense intergenic noncoding RNAs (sincRNAs) in IGSs, which can disrupt nucleolar organization and inhibit rRNA expression [51] (Fig. 2C).
The human genome contains more than 500 interspersed tRNA genes on different chromosomes [52], which are highly transcribed by RNA polymerase III (RNAPIII). Interestingly, tRNA R-loops were detected in human cells by R-ChIP and ssDRIP-seq (DRIP followed by single-stranded adaptor-mediated sequencing), but not DRIPc-seq (DRIP followed by cDNA conversion and sequencing) [53], [54], [55]. Notably, most tRNA R-loops display both sense and antisense DNA-RNA hybrid signals, which may be a result of the sequence complementarity within tRNA-coding DNA sequences [53]. The functions and impact of tRNA R-loops in human cells remain to be determined. R-loops at tRNA genes are also detected in yeast and plants [56], [57]. In yeast, Pif1 suppresses R-loops at tRNA genes to prevent head-on collisions between replication and transcription [58]. In plants, intragenic tRNA R-loops are formed in an RNAPIII-dependent manner, and they interfere with the expression of host genes by inhibiting RNAPII elongation [57].
R-loops at transposable elements. Transposable elements (TEs), which can change their positions within the genome, comprise about half of the mammalian genomic DNA [59]. TEs are categorized into DNA transposons, which move directly as DNA sequences, and retrotransposons, which mobilize through RNA intermediates. Retrotransposons can be further classified into the non-LTR subclass, including long interspersed nuclear elements (LINE-1) and short interspersed nuclear elements (SINE), and the LTR subclass, such as endogenous retroviruses (ERVs) [60]. Although most retrotransposons are silenced in humans, subsets of LINE-1 are still active for retrotransposition [60]. Recent studies on cancer genomes have revealed loss of LINE-1 silencing in cancers [60]. Several repression mechanisms of LINE-1 are compromised in cancers, including loss of p53, loss of DNA methylation at CpG islands in the 5’UTR of LINE-1, and loss of H3K9me3 at LINE-1. LINE-1 retrotransposition occurs through target-primed reverse transcription (TPRT), producing LINE-1 RNA-cDNA hybrids as an intermediate. The LINE-1 RNA template needs to be displaced or degraded from RNA-cDNA hybrids by the host RNaseH2 to allow the synthesis of the second DNA strand. Several R-loop suppressors, such as BRCA1, BRCA2, and FANCM, restrict LINE-1 retrotransposition [60]. Thus, aberrant activation of LINE-1 can be a source of DNA damage through the process of retrotransposition mediated by R-loop-primed reverse transcription, particularly in cells lacking R-loop suppressors. In human pluripotent stem cells (hPSCs), R-loops are found in certain LINE-1 and SINE elements and LTR regions [61]. Interestingly, R-loops are detected in the enhancer and transcription elongation regions of ERV1 in hPSCs, where the ERV subfamily H (HERV-H) is highly active [61]. This finding suggests either that R-loops promote ERV activation or that activated ERVs generate R-loops.
Of note, LINE-1 and ERVs can be activated by disruption of heterochromatin or inhibition of DNA methylation in various developmental and therapeutic contexts [62], [63], [64]. Epigenetic approaches are being developed to activate retrotransposons in cancer cells, a strategy known as viral mimicry, to enhance cancer immunotherapies [65]. Whether retrotransposon-associated R-loops are involved in viral mimicry remains to be investigated.
R-loops at trinucleotide and other repeats. Expansion of trinucleotide repeats in the human genome has been linked to ∼40 human diseases, such as Friedreich ataxia (FRDA) and Fragile X syndrome (FXS). Because many trinucleotide repeats are GC-rich, they are prone to R-loop formation. In cells from FRDA and FXS patients, R-loops are found at the FXN and FMR1 genes, where trinucleotide expansions silence the host genes and drive pathogenesis [64]. In vitro, R-loops are formed when disease-associated CTG and CGG repeats are transcribed on both DNA strands [66]. Depletion of RNaseH1 and RNaseH2 in human cells destabilizes CTG repeats [67], suggesting that R-loops are the cause of repeat instability. In yeast, R-loops at CAG repeats induce cytosine deamination and base excision repair (BER), leading to repeat breakage and contractions [68]. R-loops at GAA and CAG repeats also promote repeat deletion through BER [69]. A recent study suggested that DNA demethylation and site-specific R-loops at the FMR1 locus are necessary and sufficient for CGG repeat contraction through DNA repair, reactivating the silenced FMR1 gene [70]. In addition to R-loops, DNA polymerase slippage and nucleotide misincorporation in repeats can also lead to instability. Polymerase slippage gives rise to DNA hairpins in repeats, which are recognized by the mismatch repair (MMR) proteins MSH2-MSH3 and lead to DNA synthesis by polymerase β [71], [72], [73]. Depending on whether DNA hairpins are formed on the parental or daughter DNA strand, the DNA synthesis by Pol β can result in either expansion or contraction of trinucleotide repeats. Interestingly, BER-generated DNA nicks can also give rise to DNA hairpins in repeats [74], suggesting that R-loops may affect trinucleotide repeats indirectly through MMR. Beyond trinucleotide repeats, R-loops can also affect the stability of repeats of longer sequences.
For example, the hexanucleotide repeat expansion in amyotrophic lateral sclerosis (ALS) is driven by R-loops [75].
The majority of R-loops are formed in transcribed regions of the genome, and these R-loops are associated with various histone marks of active genes. On one hand, co-transcriptional R-loops are regulated by chromatin-modulating mechanisms that influence various steps of transcription. On the other hand, R-loops themselves can impact the chromatin in which they reside, affecting transcription. Paradoxically, some R-loops are found in heterochromatic regions where transcription is generally repressed, suggesting that R-loops may contribute to heterochromatin formation in trans or before transcription repression is established. In this section, we will discuss the intertwined relationship between R-loops and chromatin, highlighting examples of its dysregulation in cancers.
R-loops and chromatin at active genes. DRIPc-seq analysis revealed that promoter R-loops are associated with the histone marks of open chromatin, including H3K4me2/3, H3K9ac, and H3K27ac [54] (Fig. 1A), which is consistent with the idea that promoter R-loops arise from RNAPII pausing after transcription initiates from promoters. In gene bodies, R-loops are associated with H3K36me3, a histone mark of transcription elongation [54] (Fig. 1B). At transcription terminators, R-loops are associated with H3K4me1, indicating that chromatin is less open than at the promoters marked by H3K4me2/3 (Fig. 1C). The R-loops detected by R-ChIP are mostly enriched at promoters and less so in gene bodies and terminators [53]. Consistently, the promoter R-loops detected by R-ChIP are associated with the open chromatin marks H3K4me2/3 and H3K27ac and with DNase sensitivity. Thus, R-loops in various parts of active genes are generally associated with the chromatin marks that naturally form during transcription, which suggests that R-loops, at least when they are present at normal levels, do not alter the transcription-associated chromatin state substantially. It should be noted that the differences between DRIPc-seq and R-ChIP may stem from the distinct properties of the S9.6 antibody and the hybrid-binding domain of RNaseH1 [76]. In addition to RNA-DNA hybrids, S9.6 has affinity for other RNA structures that may be indirectly affected by hybrids. In contrast, when expressed in cells, the hybrid-binding domain of RNaseH1 may trap and stabilize transient R-loops, and its accessibility to R-loops may be affected by chromatin environments and other R-loop binding factors. These distinct properties of DRIPc-seq and R-ChIP may allow them to preferentially detect certain subsets of R-loops and related structures reflecting biological variations.
Although many R-loops may form passively during transcription in GC-rich and GC-skewed sequences, some R-loops can affect local chromatin to modulate transcription locally. For example, R-loops at promoters or in gene bodies of certain genes can promote the formation of antisense R-loops at promoters, which may reduce nucleosome occupancy at promoters and increase the binding of transcription activators [24]. Furthermore, R-loops formed in the GC-rich regions of terminators in certain genes can induce RNA interference (RNAi)-dependent H3K9me2 formation over pause sites in terminator regions, facilitating RNAPII pausing prior to efficient termination [77]. Finally, when R-loops are formed at abnormally high levels upon transcription activation in cancer cells, they can lead to DNA damage and damage-associated chromatin changes [4]. Thus, in specific chromatin contexts or in response to changes of transcription, R-loops can modulate chromatin at active genes. Whether the effects of R-loops on chromatin can be generalized to a broad range of biological contexts remains to be investigated.
R-loops are often dysregulated in cancer cells with altered chromatin structure, resulting in increased genomic instability and DNA damage. For example, mutations in the SWI/SNF-family chromatin-modulating protein SMARCA4 (also known as BRG1) in cancer cells are known to increase R-loops [78]. The SWI/SNF-family complexes, including BAF, PBAF, and ncBAF complexes, are ATP-dependent chromatin remodelers that regulate nucleosome positioning and gene expression. SMARCA4, a shared component of all SWI/SNF complexes, is frequently mutated in cancers. When SMARCA4 is functional, it localizes to sites of TRCs, where replication forks collide with transcription. At these sites, SMARCA4 works with other BAF components and FANCD2 to resolve R-loops and suppress TRCs [78]. In cancer cells lacking SMARCA4, R-loops and DNA damage are increased. Thus, the ability of BAF to regulate nucleosome positioning at sites of TRCs is important for restricting R-loops. Abnormally high levels of R-loops are also detected in Ewing sarcoma cells expressing the EWS-FLI1 fusion oncoprotein [79], where EWS-FLI1 recruits BAF to EWS target genes through a prion-like domain [80]. The strong activation of target genes and the redistribution of BAF complexes may contribute to the accumulation of R-loops in EWS-FLI1-expressing cells. The functions of chromatin-modulating proteins in suppressing R-loops can be exploited in cancer therapy. For example, inhibition of BRD4, a bromodomain protein recognizing acetylated histones and promoting RNAPII release from pause sites, increases R-loops and cell death in cancer cells [81]. Mechanistically, inhibition of BRD4 leads to persistent RNAPII pausing, which increases promoter R-loops, TRCs, and DNA damage [82]. R-loops may also induce DNA damage by inducing chromatin changes. In yeast, certain histone mutants induce R-loop-dependent DNA damage by promoting histone H3S10 phosphorylation [83].
Together, these examples highlight that altered transcription and chromatin regulation in cancer cells are sources of aberrant R-loops, presenting a targetable vulnerability for cancer therapy.
R-loops and heterochromatin. While R-loops are generally associated with active transcription, some R-loops are detected in heterochromatic regions, such as telomeres and centromeres. In ALT-positive cancer cells, TERRA promotes the recruitment of polycomb complexes to telomeres, where the polycomb complexes establish H3K9me3, H4K20me3, and H3K27me3 heterochromatin marks [84]. In particular, TERRA binds to the polycomb repressive complex 2 (PRC2), which establishes H3K27me3 at telomeres. Furthermore, PRC2-generated H3K27me3 is required for the establishment of H3K9me3 and HP1 binding at telomeres, explaining how TERRA directs the formation of telomeric heterochromatin. Paradoxically, TERRA also suppresses heterochromatin at subtelomeres through its binding to ATRX [85]. In mouse embryonic stem (ES) cells, by binding to ATRX, TERRA antagonizes the ability of ATRX to remove G4s and repress transcription at subtelomeres, thereby reducing H3K9me3 in subtelomeric regions [86]. Given that TERRA-generated telomere R-loops are important for the association of TERRA with telomeres, telomere R-loops probably have both positive and negative effects on telomere heterochromatin in different contexts. Of note, in ALT+ cancer cells, which are often ATRX-deficient, telomere R-loops promote ALT activity and telomere maintenance [87]. Loss of FANCM in ALT+ cancer cells results in an increase of telomere R-loops, hyperactivation of ALT, and intolerable replication stress at telomeres, leading to telomere loss and cell death [88]. These examples suggest that the levels of telomere R-loops and their modulators may affect telomere chromatin and stability.
Similar to telomeres, pericentromeres are associated with heterochromatic features such as H3K9me3 and HP1 binding. SUV39H1 and SUV39H2, two histone methyltransferases that generate H3K9me2/3, are recruited to pericentromeres through their interactions with cenRNAs and DNA-RNA hybrid formation [89], [90]. Thus, transcription of non-coding RNAs from the repetitive sequences at telomeres and centromeres may be a general mechanism to establish and regulate heterochromatin in these regions. These regions may oscillate between transcriptionally active and inactive states during the cell cycle, allowing non-coding RNAs and heterochromatin to function sequentially. For example, in human cells, TERRA is expressed in early S phase but gradually shut off in late S and G2 [91]. Because telomeres typically undergo DNA replication in late S phase due to their heterochromatin structure, this finding suggests the presence of heterochromatin at telomeres after TERRA expression.
The functions of R-loops in regulating heterochromatin go beyond telomeres and centromeres. In Drosophila, R-loops are found at many polycomb response elements (PREs) [92] where PRC2 promotes the formation of R-loops. In vitro, both PRC1 and PRC2 have the ability to bind R-loops. Because RNA is implicated in polycomb targeting and function, this finding suggests that R-loops may be a general mechanism to localize polycomb complexes to target sites and establish heterochromatin. Indeed, R-loops promote the localization of PRC1 and PRC2 to a subset of developmental regulator genes in mouse ES cells, ensuring the proper repression of these genes when they should not be expressed [93]. Additionally, antisense R-loops promote PRC2 occupancy and H3K27me3 accumulation at certain genes [94]. For example, ANRASSF1, the antisense lncRNA transcribed from the tumor suppressor gene RASSF1, forms R-loops to promote PRC2-mediated silencing of RASSF1 in cancer [95].