Pursuing the genetic roots of a disease gets to the very heart of it. By revealing the biological mechanisms of inheritable diseases, scientists can develop more effective prevention strategies and more personalized therapies with fewer side effects. With that goal in mind, scientists around the globe have carried out thousands of genome-wide association studies (GWAS) over the past decade. These studies aim to reveal which common DNA differences influence traits such as hair color, blood sugar level, or height, or raise the risk of developing diseases such as cancer or psychiatric illnesses.

GWAS can uncover risk factors from across the genome in an unbiased way, and their findings can reveal new and unsuspected biological mechanisms that may one day be targeted with drugs. Several thousand such studies have been carried out since the first large GWAS were published in 2007, linking thousands of genetic regions to various diseases and traits. Although GWAS results reveal genetic variants that correlate with a particular disease, that does not mean the variant identified is what causes the illness. GWAS are best viewed as a “hypothesis-generating” effort: they point you in the right direction, but the hypothesis still needs to be tested.
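As a rough illustration of what a single test inside a GWAS can look like, the sketch below runs an allelic chi-square test for one SNP in cases versus controls and compares the p-value with the conventional genome-wide significance threshold of 5e-8. The allele counts and the function name are invented for illustration; real studies rely on dedicated software and adjust for ancestry and other confounders.

```python
import math

# Conventional genome-wide threshold: roughly 0.05 divided by one million independent tests.
GENOME_WIDE_ALPHA = 5e-8


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (chi2, p_value) for a 2x2 allele-count table (1 degree of freedom)."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 2,000 cases and 3,000 controls, two alleles per person.
chi2, p = allelic_chi_square(case_alt=1400, case_ref=2600, ctrl_alt=1800, ctrl_ref=4200)
print(f"chi2={chi2:.2f}, p={p:.3g}, genome-wide significant: {p < GENOME_WIDE_ALPHA}")
```

A hit that clears the threshold in a test like this is still only an association, which is why the follow-up work described here matters.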

The next phase, beyond GWAS, involves uncovering the biological consequences of mutations associated with disease and developing a new form of therapeutics from that insight. “Now, it’s relatively straightforward to discover genetic associations, but then to go from association to function is a much taller order,” said Jose Florez, chief of the diabetes unit at the Massachusetts General Hospital (MGH), an Institute member at the Broad where he co-directs the Metabolism Program, and an associate professor at Harvard Medical School (HMS).

We have the Human Genome Project to thank for developing the notion of using genetics to uncover the roots of common diseases. Genetically, all humans are 99.9% alike; it is the remaining 0.1% that interests researchers and holds the key to understanding inheritable diseases and traits. The human genome consists of around three billion nucleotides, and around ten million of these are single-nucleotide polymorphisms (SNPs), meaning they occur in two or more forms. No single common SNP is enough to cause a common disease on its own, but each can contribute to the risk.

In the early 2000s the International SNP Consortium surveyed SNPs across the genome, and by 2005 the International HapMap Consortium was in place. The HapMap charted haplotypes, blocks of SNPs that are often inherited together. This allowed a smaller number of SNPs to stand in for their neighboring variants, reducing the cost and resources needed to test DNA for its common variation. Soon, companies started to make “SNP chips” that could test a single sample for thousands of SNPs simultaneously. In 2004, recognizing how important this technology was to become, the Broad Institute built a genetic analysis facility.
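The idea that a few SNPs can stand in for their neighbors can be sketched in code. The example below measures how strongly two SNPs travel together (the r² statistic for linkage disequilibrium) and then greedily keeps one "tag" SNP per correlated group. The 0.8 cutoff, the toy haplotypes, and the function names are assumptions made for illustration, not the HapMap's actual procedure.

```python
def r_squared(haplotypes, i, j):
    """Squared correlation (LD r^2) between SNP i and SNP j across phased haplotypes.

    Each haplotype is a sequence of 0/1 alleles, one entry per SNP.
    """
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n
    p_j = sum(h[j] for h in haplotypes) / n
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return 0.0  # a SNP with no variation carries no LD information
    d = p_ij - p_i * p_j
    return d * d / denom


def pick_tag_snps(haplotypes, threshold=0.8):
    """Greedy pass: keep a SNP as a tag unless an earlier tag already captures it."""
    n_snps = len(haplotypes[0])
    tags = []
    for snp in range(n_snps):
        if not any(r_squared(haplotypes, tag, snp) >= threshold for tag in tags):
            tags.append(snp)
    return tags


# Toy data (hypothetical): SNPs 0-2 travel together as one block, SNP 3 varies independently.
haplotypes = [
    (0, 0, 0, 0),
    (0, 0, 0, 1),
    (1, 1, 1, 0),
    (1, 1, 1, 1),
    (0, 0, 0, 0),
    (1, 1, 1, 1),
]
print(pick_tag_snps(haplotypes))
```

On this toy data the pass keeps SNPs 0 and 3, so two genotyped positions cover all four.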

Inexpensive genotyping technology, enabled by the HapMap, was now available, along with sophisticated analytical methods and an emphasis on open sharing of data. In 2007, the GWAS approach began to take hold when the first large genome-wide association study was published. The study included 2,000 cases each of coronary heart disease, Crohn’s disease, bipolar disorder, type 1 diabetes, type 2 diabetes, rheumatoid arthritis, and high blood pressure, along with 3,000 healthy controls. It revealed 24 significant disease-associated DNA variants.

Since then, GWAS have linked many thousands of genetic variants to complex traits. The GWAS Catalog now includes dozens of huge studies that each involve more than 100,000 subjects. This massive amount of data allows scientists to test different sets of SNPs and to reveal more genetic markers. “For pretty much every polygenic trait and every disease that we looked at with GWAS, more samples gave us more and more discovered loci, in a surprisingly linear fashion,” said Joel Hirschhorn, GIANT Consortium leader and Institute member and co-director of the Metabolism Program at the Broad.

The diversity and number of DNA changes uncovered by GWAS surprised Hirschhorn and colleagues. Scientists have since learned that some common diseases are more complex than others, with some associated with many different genetic regions. Another discovery from GWAS is that disease-associated variants are often very different from those initially thought to be important. “Surprisingly, the hits we got from GWAS didn’t overlap well with the list of candidate disease genes that we had drawn up,” said Hirschhorn. “Our next step is even more challenging: going from those loci to uncovering the actual genes that are involved and discovering what that means for the underlying biology.”

To interpret GWAS hits, scientists carry out what is known as “fine-mapping”: a deeper analysis in which a denser set of SNPs is genotyped in the region of DNA surrounding the hit. If the causal mutation turns out to lie within DNA that encodes a protein, scientists can study it in a dish or in cells to learn about its functional consequences and the biological pathways involved. However, the majority of variants uncovered by GWAS are found within non-protein-coding regions. To learn about these, scientists often rely on data from previous or ongoing projects such as ENCODE, Genotype-Tissue Expression (GTEx), or Roadmap Epigenomics. They may also use experimental tools such as reporter assays or CRISPR-Cas9 genome editing.
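One common statistical companion to denser genotyping is to rank the SNPs in a region by how likely each is to be the causal one. The sketch below is a simplified illustration under strong assumptions: exactly one causal SNP in the region, equal prior probability for every SNP, and per-SNP log Bayes factors already computed elsewhere. The SNP labels and numbers are hypothetical.

```python
import math


def posterior_probabilities(log_bayes_factors):
    """Posterior that each SNP is the causal one, assuming exactly one causal SNP
    in the region and an equal prior for every SNP."""
    peak = max(log_bayes_factors.values())
    # Subtract the peak before exponentiating to avoid overflow.
    weights = {snp: math.exp(lbf - peak) for snp, lbf in log_bayes_factors.items()}
    total = sum(weights.values())
    return {snp: w / total for snp, w in weights.items()}


def credible_set(posteriors, coverage=0.95):
    """Smallest set of top-ranked SNPs whose posteriors sum to at least `coverage`."""
    chosen, running = [], 0.0
    for snp, prob in sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(snp)
        running += prob
        if running >= coverage:
            break
    return chosen


# Hypothetical log Bayes factors for four SNPs in one associated region.
log_bfs = {"snp_a": 12.0, "snp_b": 10.5, "snp_c": 6.0, "snp_d": 2.0}
posteriors = posterior_probabilities(log_bfs)
print(credible_set(posteriors))
```

A short credible set narrows the list of variants worth testing in the lab; it does not by itself prove which variant is causal.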

Areas where GWAS have had dramatic effects include Crohn’s disease, type 2 diabetes, and schizophrenia. One of the first conditions to be pursued aggressively using GWAS was inflammatory bowel disease (IBD). Very little was known about its genetic risk factors before GWAS, which revealed dozens of underlying genetic links to IBD. Following that initial success, scientists worldwide combined their data and efforts to create the International IBD Genetics Consortium (IIBDGC). Using fine-mapping methods, researchers have since been able to pinpoint the causal mutations underlying 18 IBD-associated DNA regions.

More recently, in 2016, Mark Daly led a study that revealed even more insights into IBD by studying previous GWAS hits. His study uncovered a rare mutation that disrupts the function of a gene and protects against ulcerative colitis. GWAS also offer scientists a non-invasive approach to studying psychiatric illnesses. Researchers worldwide have worked hard to collect data from thousands of patients with schizophrenia. In 2014, a GWAS identified 108 genomic loci associated with schizophrenia, as opposed to the handful known a few years prior.

A 2016 study carried out by scientists at the Broad’s Stanley Center for Psychiatric Research, Harvard Medical School, and Boston Children’s Hospital revealed evidence that schizophrenia can be caused in part by excessive synaptic pruning in the brain during late adolescence. Thanks to studies like these, new therapeutic avenues can be explored and the stigma attached to mental illness can hopefully begin to disappear.

Back in 2007, GWAS on type 2 diabetes (T2D) were published, led in part by the Broad and including the work of the Diabetes Genetics Initiative (DGI). Collectively, these studies uncovered 10 genetic risk factors and suggested that any undiscovered T2D-associated loci would have only a small effect on risk, which was enough to warrant larger studies. The number of subjects studied then increased to more than 10,000, which uncovered a further six genetic regions associated with T2D. A second study used 45,000 samples and identified another 12 T2D loci. Through GWAS, more than 100 genetic regions associated with type 2 diabetes have now been discovered.

Nearly as many regions have been found that are associated with coronary artery disease (CAD). Some lie near genes relating to metabolism, while others relate to blood pressure, and these studies have highlighted the biological pathways involved. New GWAS data from the UK have identified 15 new loci relating to CAD, bringing the total to 95. Sekar Kathiresan, a Broad scientist, also recently led a study of a genetic region first noticed in a 2009 GWAS of heart attack risk. The team discovered a SNP within the PHACTR1 gene that seemed to be the causal variant. Using genome editing tools and gene regulatory data, they showed that the SNP in fact controls EDN-1, not PHACTR1.

One day scientists will be able to search massive biobanks that hold genome-wide data alongside information on lifestyle, measured traits, environmental exposures, and diet. The larger samples will shed light on interactions between genes and environmental factors. As the cost of both exome-wide genotyping and next-generation DNA sequencing comes down, future GWAS will also be able to pick up more rare variants. Researchers will also put more emphasis on including more diverse populations, such as people of African and Latin American ancestry.
