Mining Large Datasets of Genomic Architecture

The analysis of large data sets reveals surprises within forgotten strands of DNA in a research project headed by Biology Professor Cornelis Murre.

Intricate human features such as the immune system require exquisite formation and timing to develop properly. Genetic elements must be activated at just the right moment, across vast distances of genomic space. “Promoter” areas, locations where genes begin to be expressed, must be paired precisely with “enhancer” clusters that allow cells to mature to a targeted function. The two must be brought into close proximity; if they are not properly synced, diseases such as leukemia and lymphoma, as well as immune deficiencies, can result.

Seeking to identify DNA elements that influence these sophisticated processes, the Division of Biological Sciences’ Cornelis Murre and his colleagues mined large data sets of 3D genome architecture in search of elements that move across genomic neighborhoods to bring promoters and enhancers together.

The search yielded a spectrum of results, including a surprising discovery within previously overlooked stretches of DNA between genes. Calling it the “Big Bang of immune cell development,” the researchers found that a non-coding transcript named “ThymoD” orchestrates the development of immune system building blocks known as T cells. Murre describes the mechanism as somewhat like a stiff wire, with enhancers and promoters on either end, that is bent into a loop and anchored in place. Once brought together in the loop, enhancers and promoters orchestrate immune cell fate and act to suppress the development of disease.

“Nature is so clever. We think of the genome as an unstructured strand but in fact what we are seeing is a highly structured and meaningful design,” said Murre. “The process of architecture remodeling we’ve described allows the enhancer and promoter to find each other in 3D space at precisely the right time. The beauty is that it’s all very carefully orchestrated. Ultimately we may be able to fix mutations associated with disease at these forgotten strands of DNA.”