Scientists from Skoltech (part of the VEB.RF group) and the University of Potsdam have developed a physical theory that sheds light on how molecular motors organize the three-dimensional structure of the genome. Using theoretical polymer physics and computer simulations, the researchers calculated, for the first time, a universal parameter of this organization: the density of loops that cohesin motors form through active extrusion in every living cell.
The findings, published in the Proceedings of the National Academy of Sciences, show that 60 to 70 percent of all DNA in a cell is located within loops, each of which is formed by exactly one molecular motor. This means that loop extrusion is a key mechanism for compacting the roughly two meters of chromosomal DNA inside a living cell. The research was supported by grants from the Russian Science Foundation and the Alexander von Humboldt Foundation.
The primary method for studying the three-dimensional structure of the genome — Hi-C technology — captures random contacts between different DNA regions within the cell nucleus. However, the experimental procedure itself distorts the original picture: chemical crosslinking, DNA fragmentation, and ligation all influence which contacts get registered. The authors built a physical model that explicitly accounts for all these biochemical steps alongside random loop formation, allowing them to separate the actual chromosome structure from the “optics” of the complex experimental protocol.
The theory predicted a universal feature found in all mammalian Hi-C maps. When you look at how the probability of contact between two DNA regions changes with the genomic distance between them along the chromosome, a characteristic dip appears at short scales. This dip arises from the competition between two effects: at very short distances, the signal is blurred due to the finite size of DNA fragments, while at longer distances, loops begin to dominate as cohesin motors actively pull distant sites together. The position and shape of the dip depend directly on loop density and on specific features of the experimental protocol.
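To make the quantity in question concrete, here is a minimal Python sketch of how a contact-versus-distance curve P(s) and its local log-log slope can be computed from a contact map. This is a generic illustration, not the authors' released code: the function names, the toy matrix, and the choice of the log-log slope as the way to look at "how the probability changes with distance" are all assumptions made for the example.

```python
import math

def contact_probability(matrix):
    """Average contact frequency P(s) for each separation s, in bins."""
    n = len(matrix)
    p = []
    for s in range(1, n):
        # All pairs of bins separated by exactly s lie on one diagonal.
        diagonal = [matrix[i][i + s] for i in range(n - s)]
        p.append(sum(diagonal) / len(diagonal))
    return p  # p[k] corresponds to separation s = k + 1

def log_slope(p):
    """Local slope d(log P)/d(log s) between consecutive separations."""
    slopes = []
    for k in range(len(p) - 1):
        s1, s2 = k + 1, k + 2
        if p[k] > 0 and p[k + 1] > 0:
            rise = math.log(p[k + 1]) - math.log(p[k])
            run = math.log(s2) - math.log(s1)
            slopes.append(rise / run)
        else:
            slopes.append(float("nan"))  # slope undefined without contacts
    return slopes

# Toy symmetric map in which contacts decay as 1/s (made-up data).
n = 6
toy = [[1.0 / max(abs(i - j), 1) for j in range(n)] for i in range(n)]
p = contact_probability(toy)
print([round(x, 3) for x in log_slope(p)])
```

For this toy map the slope stays close to -1 at every separation, as expected for a pure power law. On real Hi-C data, departures from such a featureless curve are where a signature like the dip described above would be sought.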
To test the theory, the researchers analyzed more than 30 datasets from human and mouse cells. The loop density turned out to be surprisingly high: on average, six loops per million base pairs of DNA, with each loop spanning about a hundred thousand base pairs. Independent measurements of cohesin bound to chromosomes, obtained using mass spectrometry and fluorescence microscopy, gave almost exactly the same number: about five to seven cohesin complexes per million base pairs.
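As a rough consistency check on these figures, multiplying the loop density by the typical loop length gives the fraction of DNA that sits inside loops. The short Python sketch below does this arithmetic. It assumes non-overlapping loops of equal length, and the function name is illustrative rather than taken from the authors' code.

```python
def looped_fraction(loops_per_mb: float, loop_length_bp: float) -> float:
    """Fraction of DNA inside loops, assuming loops do not overlap."""
    bp_in_loops_per_mb = loops_per_mb * loop_length_bp
    # Cap at 1.0: loops cannot cover more than all of the DNA.
    return min(bp_in_loops_per_mb / 1_000_000, 1.0)

# Figures quoted in the article: about 6 loops per Mb, about 100 kb per loop.
fraction = looped_fraction(6, 100_000)
print(f"{fraction:.0%} of DNA in loops")
```

With these rounded inputs the estimate comes out at about 60 percent, the lower end of the 60 to 70 percent range reported in the paper.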
“Nearly one-to-one agreement between the number of loops and the amount of chromatin-bound cohesin is strong evidence that, in living cells, it is single cohesin complexes that possess motor activity and are capable of extruding DNA loops on their own. For a long time, cohesin’s main structural role was thought to be limited to holding sister chromatids together after replication. Our collaborative work with PhD student Dmitry Starkov [from the Computational and Data Science and Engineering program] shows that this is not the whole story: the same single cohesin complexes actively organize the three-dimensional structure of the genome throughout most of the cell’s lifetime by extruding long stretches of DNA into loops. In addition, we were able, for the first time, to quantitatively determine the density of such loops in the genome. This is a clear example of how the methods of theoretical and statistical polymer physics make it possible to extract fundamental properties of living systems from biological data. It is no coincidence that our country has historically developed one of the world’s strongest schools in this field, dating back to the time of I. M. Lifshitz,” commented Kirill Polovnikov, the first and corresponding author of the study, an assistant professor and the head of the research group at the Skoltech AI Center.
Additional experiments confirmed the theory. In mouse cells in which cohesin had been artificially degraded, loop density decreased exactly in proportion to the amount of remaining protein. When an enhanced protocol with an additional crosslinking agent was used, only the parameters related to the experimental procedure changed, while the estimated loop density remained the same.
This work is important not only for fundamental biophysics. Cohesin dysfunction has already been linked to developmental disorders and to defects in immune cell differentiation, and it likely affects many other biological processes in which its role has yet to be uncovered. Cancer is increasingly seen not just as a disease of point mutations but as a disease of disrupted genome spatial organization. If chromatin architecture changes globally, it can alter gene expression and genome stability during cell division. Polymer physics offers a way to quantitatively monitor chromatin structure during such rearrangements.
The authors have released open-source code for extracting loop density from any Hi-C dataset. The method allows researchers to quantitatively assess changes in cohesin loop density based on characteristic features of the Hi-C signal.