42 posts · 41,090 views
by Daniel MacArthur in Genetic Future (Wired)
Regardless of your stance on the hyping of genomic medicine over the last decade, there is one small but important group of patients – those suffering from rare, severe diseases – for whom the unravelling of the human genome and subsequent advances in genetic technology have unambiguously changed lives. The genetic map provided by the [...]
Bale, S., Devisscher, M., Criekinge, W., Rehm, H., Decouttere, F., Nussbaum, R., Dunnen, J., & Willems, P. (2011) MutaDATABASE: a centralized and standardized DNA variation database. Nature Biotechnology, 29(2), 117-118. DOI: 10.1038/nbt.1772
by Daniel MacArthur in Genetic Future (Wired)
Genetic Future will be back to more regular posting next week, when I get back from holiday. In the meantime I asked the one and only Misha Angrist – Assistant Professor, blogger, tweeter, fellow genomic exhibitionist, and author of the excellent new book Here is a Human Being – if he’d be willing to contribute [...]
Bredenoord AL, Kroes HY, Cuppen E, Parker M, & van Delden JJ. (2011) Disclosure of individual genetic data to research participants: the debate reconsidered. Trends in Genetics, 27(2), 41-7. PMID: 21190750
by dgmacarthur in Genetic Future
Kai Wang is a postdoctoral fellow at the Center for Applied Genomics, Children's Hospital of Philadelphia and an author on numerous genome-wide association studies. He left this lengthy comment as a response to my recent post on this comment by McClellan and King in Cell, and I felt it warranted promotion to a full post (with Kai's permission). For more discussion of the M&K review see also two recent posts by Steve Turner at Getting Genetics Done, and an excellent post from p-ter at Gene Expression. A similar version of this comment is also published at Getting Genetics Done. I've done some mild editing here for clarity, added some sub-headings and links, and deleted two statements that could be regarded as ad hominem arguments. None of these changes affect the substance of Kai's argument.
Citation: McClellan, J., & King, M. (2010). Genetic Heterogeneity in Human Disease. Cell, 141 (2), 210-217. DOI: 10.1016/j.cell.2010.03.032
Quite a few people have mentioned the McClellan et al. paper to me, along with the related Internet posts about it (including those on Genetic Future). The paper's discussion of at least three diseases (hearing loss, SCA and autism) cited some of my published papers, so I decided to post my comments on the Internet to set the record straight. Although I whole-heartedly agree that rare variants play a substantial role in human diseases, I also think that the section on GWAS reflects misunderstandings of the concept of GWAS, ignorance of standard practices in GWAS, and misinterpretation of published primary research data, and as a result is misinforming the general readership of Cell. These issues need to be rectified for the good of the scientific community, and for the healthy development of methodology and practice of human genetic research.
For impatient readers, these are the major points: (1) GWAS interrogate disease loci through linkage disequilibrium, so the lack of known biological function for GWAS SNPs does not justify the attack against GWAS by McClellan et al.; (2) methods for adjusting for population stratification are well established in the GWAS community, and it is not a valid argument to explain most GWAS signals (with odds ratios less than 2) by stratification, especially if a family-based study design is used (including the autism GWAS); (3) McClellan et al. used rs4307059 (from the autism GWAS) as a "particularly dramatic" example of stratification because its frequency varies across Europe and it is monoallelic in Africa, which is not scientifically or statistically justified. In fact, it is the nature of SNPs to have differing allele frequencies across populations, and almost half of the SNPs on the Illumina array have higher Fst population divergence values than rs4307059 (that is, almost half the SNPs are more variable than rs4307059 across human populations). Below I elaborate on these points more specifically for interested readers.
McClellan, J., & King, M. (2010) Genetic Heterogeneity in Human Disease. Cell, 141(2), 210-217. DOI: 10.1016/j.cell.2010.03.032
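Kai's Fst comparison in the post above can be made concrete with a small sketch. The Python below computes Fst for a single biallelic SNP from per-population allele frequencies as (H_T - H_S) / H_T, weighting populations equally and ignoring sample-size corrections. The function name and the example frequencies are invented for illustration; they are not the real values for rs4307059, and this is not necessarily the estimator Kai used.

```python
def fst_biallelic(freqs):
    """Fst for one biallelic SNP from per-population allele frequencies.

    Simplified illustration: populations are weighted equally and no
    sample-size correction is applied.
    """
    if len(freqs) < 2:
        raise ValueError("need allele frequencies from at least two populations")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie between 0 and 1")

    # Expected heterozygosity if all populations were pooled into one.
    p_bar = sum(freqs) / len(freqs)
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # The SNP is fixed for the same allele everywhere: no divergence.
        return 0.0

    # Average expected heterozygosity within each population.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


# Hypothetical frequencies: variable in two populations, monoallelic in a third.
example = [0.5, 0.5, 0.0]
print(f"Fst = {fst_biallelic(example):.3f}")
```

A SNP that is monoallelic in one population can still have an unremarkable Fst; what matters for Kai's argument is where a SNP falls in the genome-wide distribution of such values, which this sketch does not attempt to reproduce.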
by dgmacarthur in Genetic Future
Craddock, N., Hurles, M., Cardin, N., Pearson, R., Plagnol, V., Robson, S., Vukcevic, D., Barnes, C., Conrad, D., Giannoulatou, E.... (2010) Genome-wide association study of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls. Nature, 464(7289), 713-720. DOI: 10.1038/nature08979
by dgmacarthur in Genetic Future
Lupski, J.R., et al. (2010). Whole-genome sequencing in a patient with Charcot-Marie-Tooth neuropathy. New England Journal of Medicine, advance online. DOI: 10.1056/nejmoa0908094
Roach, J.C., et al. (2010). Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science. DOI: 10.1126/science.1186802
[Note that links to the papers may not yet be active.]
Two new papers out today - the first ever studies to employ whole-genome sequencing for disease gene discovery - neatly illustrate both the promise and the challenges lying ahead for clinical and personal genomics.
The first paper presents the final - and successful - outcome of geneticist James Lupski's attempt to track down the genetic basis of his own disease. Lupski suffers from a syndrome called Charcot-Marie-Tooth (CMT) disease, a neurological condition which results in muscle weakness and wasting. The paper describes the process of sifting through the thousands of potentially functional variants to eventually pin down the mutations responsible, which turn out to be in a gene that has been previously associated with CMT.
This study is a clear illustration of the power of whole-genome sequencing to cast light on a long-standing personal mystery (Lupski has been searching for his disease mutation for decades). However, Lupski was fortunate that his mutation fell within a gene that had already been demonstrated to be linked to CMT; as the second study shows, researchers hunting for entirely novel disease-causing genes face a more serious challenge.
The second paper describes a similar attempt to nail down the gene responsible for a severe disease, this time using whole genome sequencing performed by Complete Genomics on four members of a family: two siblings affected by a disease called postaxial acrofacial dysostosis (Miller syndrome), and their two unaffected parents.
Here the outcome is less unambiguously cheerful: this paper illustrates that even with complete genomes it can still be hard to pick apart the genetic origins of disease.
Lupski, J.R., et al. (2010) Whole-genome sequencing in a patient with Charcot-Marie-Tooth neuropathy. New England Journal of Medicine. DOI: 10.1056/nejmoa0908094
Roach, J.C., et al. (2010) Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science. DOI: 10.1126/science.1186802
by dgmacarthur in Genetic Future
Medland et al. (2009). Common Variants in the Trichohyalin Gene Are Associated with Straight Hair in Europeans. The American Journal of Human Genetics. DOI: 10.1016/j.ajhg.2009.10.009
A couple of weeks ago I reported on a presentation by 23andMe's Nick Eriksson at the American Society of Human Genetics meeting in Honolulu, in which Eriksson presented data on a series of genome-wide association studies performed by the company using genetic and trait data from its customers.
Along with genetic analysis of a variety of other traits (such as asparagus anosmia and photic sneeze) Eriksson presented data on two novel regions significantly associated with hair curl, one close to the TCHH gene and a second near WNT10A (see the abstract for details). I noted at the time that 23andMe appears to be doing a pretty good job of running genome-wide association studies, although of course the real test of this is independent replication.
Well, now we have replication (of a sort) for at least two of 23andMe's novel findings - but unfortunately for the 23andMe crew the "replication" study has beaten them into print.
Medland, S., Nyholt, D., Painter, J., McEvoy, B., McRae, A., Zhu, G., Gordon, S., Ferreira, M., Wright, M., & Henders, A. (2009) Common Variants in the Trichohyalin Gene Are Associated with Straight Hair in Europeans. The American Journal of Human Genetics. DOI: 10.1016/j.ajhg.2009.10.009
by dgmacarthur in Genetic Future
Mihaescu, R., van Hoek, M., Sijbrands, E., Uitterlinden, A., Witteman, J., Hofman, A., van Duijn, C., & Janssens, A. (2009). Evaluation of risk prediction updates from commercial genome-wide scans. Genetics in Medicine, 11 (8), 588-594. DOI: 10.1097/GIM.0b013e3181b13a4f
Caroline Wright from the Public Health Genomics Foundation has a concise post describing the results from a recent paper in Genetics in Medicine. The paper evaluates the probability that personal genomics customers will find that their predicted risk of a common disease changes significantly over time as their genetic data are updated, using data on known type 2 diabetes risk variants as a case study.
Mihaescu, R., van Hoek, M., Sijbrands, E., Uitterlinden, A., Witteman, J., Hofman, A., van Duijn, C., & Janssens, A. (2009) Evaluation of risk prediction updates from commercial genome-wide scans. Genetics in Medicine, 11(8), 588-594. DOI: 10.1097/GIM.0b013e3181b13a4f
by dgmacarthur in Genetic Future
Pushkarev, D., Neff, N., & Quake, S. (2009). Single-molecule sequencing of an individual human genome. Nature Biotechnology. DOI: 10.1038/nbt.1561
Yes, it's yet another "complete" individual genome sequence, following on the heels of Craig Venter, James Watson, an anonymous African male (twice, and not without controversy), a female cancer patient, a Chinese man, and two Koreans. There is a new twist, though: this is the first genome to be sequenced using single molecule sequencing technology - also known as "third-generation" sequencing, to distinguish it from first-generation Sanger sequencing, and from the newer second-generation platforms 454, Illumina and SOLiD that have been responsible for seven of the eight individual genomes published so far*. The technology in question is the Heliscope, brought to you by Helicos BioSciences; and the genome in question belongs to Helicos co-founder Stephen Quake.
Single molecule sequencing is clearly the future of genome analysis, so this should be an exciting announcement - but while this paper is a promising taste of things to come, the genome sequence itself is in many ways a disappointment. Let's take a look at what Helicos have achieved, and at just how far the company has to go before it can hope to compete with established second-gen platforms.
Pushkarev, D., Neff, N., & Quake, S. (2009) Single-molecule sequencing of an individual human genome. Nature Biotechnology. DOI: 10.1038/nbt.1561
by dgmacarthur in Genetic Future
While I continue my work-induced blog coma, here's a guest post from Luke Jostins, a genetic epidemiology PhD student and the author of the blog Genetic Inference, delivering a fairly scathing critique of a recent whole-genome sequencing paper based on Life Technologies' SOLiD platform.
McKernan et al. 2009. Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding. Genome Research. DOI: 10.1101/gr.091868.109
In prepublication at the moment is a paper from the labs of ABI, makers of the SOLiD sequencing system. It is the first published human genome to be sequenced entirely by SOLiD, the lowest-coverage non-454 second generation genome, and, interestingly, the second whole genome publication to not come out in either Science or Nature. There is a lot of interesting stuff going on with this paper, and with the discussions going on around it, and there are a lot of stories to tell about it.
Genome Blogging
This is a case of the genomics blogging community breaking a very interesting story hot off the press. Dan Koboldt, in a post at the blog MassGenomics, reported the pre-publication of the SOLiD genome very rapidly after its appearance. He had a very interesting insight: the individual sequenced was the SAME INDIVIDUAL as in Bentley et al., the Illumina genome! The SOLiD paper makes no comparison, and does not even mention that the two are the same individual, despite reviewing previous work done on this individual:
The NA18507 genome has not been Sanger sequenced to more than 0.5× sequence coverage (Kidd et al. 2008). However, it has been extensively genotyped as part of the HapMap project (Frazer et al. 2007) and some regions shotgun sequenced to higher depth as part of the ENCODE project (Birney et al. 2007).
Score one for blog-based reporting. Score another one for the fact that Kevin McKernan, lead author of the paper, spoke up in the comments of Dan's post, saying that they had not done a comparison due to the lack of genotype information published by Illumina. Why they don't mention this justification in the paper, or even comment on the existence of a first higher-coverage second gen sequencing project on the same individual, is far from clear (and it is a bit distressing that the reviewers failed to pick up on this).
Anyway, a quick-and-dirty comparison is not too hard, so let's do one now.
McKernan, K., Peckham, H., Costa, G., McLaughlin, S., Fu, Y., Tsung, E., Clouser, C., Duncan, C., Ichikawa, J., Lee, C.... (2009) Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding. Genome Research. DOI: 10.1101/gr.091868.109
by dgmacarthur in Genetic Future
Purcell et al. (2009). Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. DOI: 10.1038/nature08185
Neil Walker has been doing a spectacular job of serving up useful information in the comments recently, so I asked him to write the first ever guest post on Genetic Future - something that (as I will be announcing shortly) I intend to do fairly regularly over the next couple of months.
The topic is a paper that has created a rather perplexed buzz recently in the complex disease genetics community: the genome-wide association study (GWAS) for schizophrenia published in Nature last week. This paper takes a novel and (at first glance) rather alarming approach to exploring the genetic basis of this complex disease, so I asked Neil to provide some insight into what he thought about the approach used in this paper and what it means for complex disease genetics.
Without further comment, I present Neil's post:
Purcell, S., Wray, N., Stone, J., Visscher, P., O'Donovan, M., Sullivan, P., Sklar, P.... (2009) Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. DOI: 10.1038/nature08185
by dgmacarthur in Genetic Future
Cho, Y., Go, M., Kim, Y., Heo, J., Oh, J., Ban, H., Yoon, D., Lee, M., Kim, D., Park, M., Cha, S., Kim, J., Han, B., Min, H., Ahn, Y., Park, M., Han, H., Jang, H., Cho, E., Lee, J., Cho, N., Shin, C., Park, T., Park, J., Lee, J., Cardon, L., Clarke, G., McCarthy, M., Lee, J., Lee, J., Oh, B., & Kim, H. (2009). A large-scale genome-wide association study of Asian populations uncovers genetic factors influencing eight quantitative traits. Nature Genetics, 41 (5), 527-534. DOI: 10.1038/ng.357
A paper just released in Nature Genetics takes the most comprehensive look yet at the genetic factors underlying complex traits in an East Asian population, using a large sample of Korean individuals.
The researchers used a genome-wide association study to look at eight medically relevant traits: body mass index (BMI), waist-to-hip ratio, height, systolic and diastolic blood pressure, pulse rate, and measures of bone density in the arm and leg. The results reveal both similarities and intriguing differences in the genes contributing to trait variation in East Asian and European populations. For instance, while many of the genomic regions associated with BMI and height were the same in this study as those previously identified in European cohorts, there were a number of replicated variants associated with the other traits that have not previously been found in large European studies.
Cho, Y., Go, M., Kim, Y., Heo, J., Oh, J., Ban, H., Yoon, D., Lee, M., Kim, D., Park, M.... (2009) A large-scale genome-wide association study of Asian populations uncovers genetic factors influencing eight quantitative traits. Nature Genetics, 41(5), 527-534. DOI: 10.1038/ng.357
by dgmacarthur in Genetic Future
Pickrell, J., Coop, G., Novembre, J., Kudaravalli, S., Li, J., Absher, D., Srinivasan, B., Barsh, G., Myers, R., Feldman, M., & Pritchard, J. (2009). Signals of recent positive selection in a worldwide sample of human populations. Genome Research. DOI: 10.1101/gr.087577.108
I pointed yesterday to a new paper in Genome Research taking a genome-wide look at the signatures of recent natural selection in a worldwide sample of humans.
I promised a more thorough analysis of this paper today, but I see Razib at Gene Expression has already done a fine job of that. Razib's post covers the bulk of the most important findings of this paper in detail, so you should go read it now; I'll really just be expanding on what I see as some of the most interesting nuggets of data.
I also mentioned the paper's rather indirect critique of John Hawks' "recent acceleration" hypothesis, which proposes that humans have experienced very rapid evolutionary change over the last 40,000 years. John Hawks responded to that critique last night, pointing out that the paper does not explicitly test the acceleration hypothesis and that its major findings are in fact consistent with his theory. The paper's lead author, Joe Pickrell, has a quick comment on my post yesterday clarifying his position.
Now, onto what I see as some of the most interesting results from the paper.
Pickrell, J., Coop, G., Novembre, J., Kudaravalli, S., Li, J., Absher, D., Srinivasan, B., Barsh, G., Myers, R., Feldman, M.... (2009) Signals of recent positive selection in a worldwide sample of human populations. Genome Research. DOI: 10.1101/gr.087577.108
by dgmacarthur in Genetic Future
Nalls, M., Simon-Sanchez, J., Gibbs, J., Paisan-Ruiz, C., Bras, J., Tanaka, T., Matarin, M., Scholz, S., Weitz, C., Harris, T., Ferrucci, L., Hardy, J., & Singleton, A. (2009). Measures of Autozygosity in Decline: Globalization, Urbanization, and Its Implications for Medical Genetics. PLoS Genetics, 5 (3). DOI: 10.1371/journal.pgen.1000415
A new study indicates that increases in mobility, urbanisation, and cross-population mating over the last century have substantially reduced inbreeding, and left a distinctive trace in the genome of modern Americans.
The study, published in the latest issue of PLoS Genetics, used genome-wide patterns of genetic variation in 809 unrelated American males to explore changes in the degree of inbreeding in the North American population over the last century. Inbreeding leaves a characteristic signature in the genome: long tracts of autozygosity, meaning regions in which an individual has inherited the same sets of genetic variants from both parents. The more closely related an individual's parents are, the more of these regions they will carry and the longer such regions will be.
The authors of this study took advantage of a set of healthy individuals collected by the Coriell Institute as a generic control group, sampled so as to be broadly representative of the North American population of varied European descent. The cohort contained individuals ranging from 19 to 99 years of age, essentially providing a sliding window of patterns of genetic change over the last century. Because the cohort was used as a control in genome-wide association studies the researchers had access to information from over half a million genetic variants scattered throughout the genome of each participant.
That dataset allowed the authors to directly measure the levels of autozygosity and estimate changes in the level of inbreeding over the last hundred years. Here's the money shot:
Nalls, M., Simon-Sanchez, J., Gibbs, J., Paisan-Ruiz, C., Bras, J., Tanaka, T., Matarin, M., Scholz, S., Weitz, C., Harris, T.... (2009) Measures of Autozygosity in Decline: Globalization, Urbanization, and Its Implications for Medical Genetics. PLoS Genetics, 5(3). DOI: 10.1371/journal.pgen.1000415
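The idea of measuring autozygosity from genome-wide genotypes, described in the post above, can be illustrated with a deliberately simple sketch. The Python below scans genotypes along one chromosome and reports stretches of consecutive homozygous markers that pass a minimum marker count and a minimum physical length. The function name, the 0/1/2 genotype coding, the thresholds and the toy data are all assumptions made for illustration; the excerpt does not describe the study's actual method, and real analyses usually tolerate occasional heterozygous or missing calls.

```python
def homozygous_runs(positions, genotypes, min_snps=3, min_length_bp=1_000_000):
    """Return (start_bp, end_bp) for long runs of homozygous genotypes.

    positions: marker coordinates in base pairs, sorted ascending.
    genotypes: 0 or 2 for the two homozygous states, 1 for heterozygous
               (an assumed coding, chosen for this sketch).
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")

    runs = []
    start = None  # index where the current homozygous run began
    for i, genotype in enumerate(genotypes):
        if genotype != 1:
            if start is None:
                start = i
        elif start is not None:
            # A heterozygous call ends the current run.
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(genotypes) - 1))

    # Keep only runs with enough markers spanning enough sequence.
    return [
        (positions[s], positions[e])
        for s, e in runs
        if e - s + 1 >= min_snps and positions[e] - positions[s] >= min_length_bp
    ]


# Toy data: one short run, a heterozygous marker, then one long run.
toy_positions = [0, 500_000, 1_000_000, 1_500_000, 2_000_000, 2_500_000, 3_000_000]
toy_genotypes = [0, 2, 1, 0, 0, 2, 2]
print(homozygous_runs(toy_positions, toy_genotypes))
```

Summing the lengths of the reported runs for each person, and comparing those totals across birth years, is the kind of comparison the post describes, although the thresholds used here are arbitrary.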
by dgmacarthur in Genetic Future
Fan Liu, Kate van Duijn, Johannes R. Vingerling, Albert Hofman, André G. Uitterlinden, A. Cecile J.W. Janssens, Manfred Kayser (2009). Eye color and the prediction of complex phenotypes from genotypes. Current Biology, 19 (5). DOI: 10.1016/j.cub.2009.01.027
In a recent post I noted that genetic tests to predict adult height are still a long way off being accurate; currently, known genetic variants can predict just over 5% of the variance in height, as opposed to 40% predicted using a simple algorithm based on the heights of both parents. The genetic complexity of height means that trying to screen embryos for this trait using pre-implantation genetic diagnosis is likely to be little more than an exercise in frustration.
However, that's not true for all traits. In several recent posts I've mentioned eye colour as one relatively genetically simple trait that would prove amenable to embryo screening, and indeed there has already been at least one US fertility clinic offering such screening to couples undergoing IVF (although that offer has now been withdrawn).
A paper in the most recent issue of Current Biology puts some numbers on the predictive value of genetic testing for eye colour. The authors examined all of the 37 variants (in 8 separate genes) previously reported to have an association with eye colour in a sample of 6,168 Dutch individuals of European ancestry, and assessed the predictive value of these markers alone and in combination.
The results were clear: with just six genetic markers, individuals with blue or brown eye colour could be predicted with over 90% accuracy*; the accuracy for predicting intermediate colours (e.g. green) was somewhat lower at around 72%.
Adding further markers had rapidly diminishing returns, with the last 15 markers adding essentially nothing further in terms of predictive value (these markers were typically captured by other variants within the same gene).
The authors note that "[t]he genetic prediction values obtained here for blue and brown eyes in Europeans represent the highest accuracies revealed so far in genetic prediction of human complex phenotypes." This makes for a pleasant change from the results for most disease-related traits, where common genetic variants have generally provided frustratingly little power for risk prediction.
In addition to being a tempting target for "cosmetic" genetic screening by parents undergoing IVF, eye colour genes will no doubt prove useful in forensic applications - being able to predict any physical traits from trace DNA left at the scene of a crime will at least occasionally be useful for investigators. However, the researchers note that their results should only be regarded as applying to individuals of European ancestry; different populations are known to have different genetic determinants of pigmentation.
* Strictly speaking, the researchers showed that the area under the ROC curve was greater than 0.9 for blue and brown eyes and 0.72 for green eyes; this can be interpreted as the probability that a randomly chosen positive instance would be ranked higher than a randomly chosen negative one.
Fan Liu, Kate van Duijn, Johannes R. Vingerling, Albert Hofman, André G. Uitterlinden, A. Cecile J.W. Janssens, & Manfred Kayser. (2009) Eye color and the prediction of complex phenotypes from genotypes. Current Biology, 19(5). DOI: 10.1016/j.cub.2009.01.027
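The footnote's reading of the area under the ROC curve, as the probability that a randomly chosen positive is ranked above a randomly chosen negative, can be checked directly by counting pairs. The Python below is a minimal sketch of that pairwise definition, with ties counted as half; the function name and the scores are invented for illustration and have nothing to do with the paper's data.

```python
from itertools import product


def auc_by_pairs(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair is correct when the positive instance scores higher than the
    negative one; tied pairs count as half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")

    correct = 0.0
    for pos, neg in product(pos_scores, neg_scores):
        if pos > neg:
            correct += 1.0
        elif pos == neg:
            correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted probabilities of blue eyes from a genetic model.
blue_eyed = [0.9, 0.8, 0.7, 0.4]
not_blue_eyed = [0.6, 0.3, 0.2, 0.1]
print(auc_by_pairs(blue_eyed, not_blue_eyed))  # 15 of 16 pairs correct -> 0.9375
```

This makes clear why an AUC above 0.9 is not the same thing as "over 90% of individuals classified correctly": it describes how well the scores rank people, not the error rate at any particular cut-off.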
by dgmacarthur in Genetic Future
Nejentsev et al. (2009). Rare Variants of IFIH1, a Gene Implicated in Antiviral Responses, Protect Against Type 1 Diabetes. Science. DOI: 10.1126/science.1167728
The first item on my long list of predictions for 2009 was that this will be the year of rare variants for common disease - the year that we really start tracking down the low-frequency genetic variants (between 0.1 and 5% in frequency) that likely contribute substantially to the risk of common diseases like arthritis and diabetes. It's far too early for me to claim vindication for this prediction, but a paper published online today in Science is at least a step in the right direction.
Nejentsev et al. (2009) Rare Variants of IFIH1, a Gene Implicated in Antiviral Responses, Protect Against Type 1 Diabetes. Science. DOI: 10.1126/science.1167728
by dgmacarthur in Genetic Future
Jones et al. (2009). Exomic Sequencing Identifies PALB2 as a Pancreatic Cancer Susceptibility Gene. Science. DOI: 10.1126/science.1171202
A paper published online today in Science illustrates both the potential and the challenges of using large-scale DNA sequencing to identify rare genetic variants underlying disease risk.
Traditionally, geneticists have pinned down such variants using large family studies. By using these families to track which parts of the genome tend to be co-inherited with the disease, it's possible to zoom in on the region of DNA that harbours the disease-causing mutation. This step is then followed by laboriously sequencing each gene within the disease-linked region (guided by functional information, when available) in the hope of eventually finding an obvious disruptive change. Although this approach has been successful in identifying thousands of genes associated with severe disease, it breaks down when the disease is sporadic (i.e. is not associated with a family history), is found in a small family, or when other family members are not available for testing. Without a large family to link the gene with disease risk there's no way to narrow down the list of genes responsible, making it impossible to find the underlying mutation.
Until now. Within the last year, the combination of DNA capture approaches with cheap, large-scale sequencing technology has made it technically feasible to simply sequence every known protein-coding gene in the genome (in combination known as the exome) to hunt for possible mutations. Although it's not a complete genome sequence - it leaves out the ~98% of the genome that doesn't code for protein - this approach offers the possibility of finding novel protein-altering mutations even in isolated disease cases.
In the Science paper, the authors made use of exome sequence from a female pancreatic cancer patient to look for possible susceptibility mutations that may have predisposed her to the disease. In this patient's case the existence of a susceptibility gene (rather than an environmentally-induced cancer) was suggested by the fact that her sister had also developed the same type of cancer.
In the first sentence of the paper the authors frame this study as an explicit test of the utility of personal genome sequencing - and indeed it provides some useful insight into both the value of large-scale genetic data, and just how difficult it will be to find disease-causing variants even with the complete sequence of every coding gene.
Jones et al. (2009) Exomic Sequencing Identifies PALB2 as a Pancreatic Cancer Susceptibility Gene. Science. DOI: 10.1126/science.1171202
by dgmacarthur in Genetic Future
Yurii S Aulchenko, Maksim V Struchalin, Nadezhda M Belonogova, Tatiana I Axenovich, Michael N Weedon, Albert Hofman, Andre G Uitterlinden, Manfred Kayser, Ben A Oostra, Cornelia M van Duijn, A Cecile J W Janssens, Pavel M Borodin (2009). Predicting human height by Victorian and genomic methods European Journal of Human Genetics DOI: 10.1038/ejhg.2009.5
Human height is a strongly genetic trait: in well-nourished Westerners somewhere in the vicinity of 80-90% of the variation in height is due to genetic factors; if your parents are tall, there's a very good chance you will be too. That means that if we understood the genetic factors that influenced height we could predict the future height of a child (or even an embryo) with a reasonable degree of accuracy.
However, developing that genetic understanding has proved extremely difficult. It turns out that height is also the classic model of a genetically complex trait: a spate of very large recent genome-wide association studies has nailed down over 50 different regions of the genome that affect height, which in total explain less than five percent of the overall variation - suggesting that hundreds (if not thousands) of individual genetic variants contribute, most of them nudging us upwards or downwards by just a few millimetres.
Height is not the only human trait to demonstrate such a convoluted molecular basis; aside from a few unusual traits such as skin pigmentation, the majority of traits that vary between humans are genetically complex. This is one of the reasons why embryo screening for traits like height or IQ is unlikely to be effective in the near future.
A recent paper in the European Journal of Human Genetics (also covered by Dienekes) illustrates this point by comparing the predictive power of modern genetics with a method for height prediction developed back in 1886 by Sir Francis Galton. The results are a humbling reminder of just how much we have to learn about the genetic architecture of variable human traits.
Yurii S Aulchenko, Maksim V Struchalin, Nadezhda M Belonogova, Tatiana I Axenovich, Michael N Weedon, Albert Hofman, Andre G Uitterlinden, Manfred Kayser, Ben A Oostra, Cornelia M van Duijn.... (2009) Predicting human height by Victorian and genomic methods. European Journal of Human Genetics. DOI: 10.1038/ejhg.2009.5
by dgmacarthur in Genetic Future
James Clarke, Hai-Chen Wu, Lakmal Jayasinghe, Alpesh Patel, Stuart Reid, Hagan Bayley (2009). Continuous base identification for single-molecule nanopore DNA sequencing. Nature Nanotechnology. DOI: 10.1038/nnano.2009.12
The clever boys and girls at Oxford Nanopore Technologies - one of the most quietly impressive contenders in the hotly-contested next-generation DNA sequencing race - have a new paper out in Nature Nanotechnology today. The paper demonstrates proof of principle for a crucial step in their approach to DNA sequencing, the accurate recognition of DNA bases as they pass through a tiny protein nanopore.
Oxford's approach is outlined in cartoon format in this video: (See the Oxford Nanopore website for a better-quality Flash version of the video.)
Put simply, the system works by sequentially chewing DNA bases off the end of a long strand and then detecting each cleaved base as it falls through a protein nanopore. In today's paper the company demonstrates the use of engineered nanopores to achieve accurate recognition of five different DNA bases (the standard A, C, G and T, as well as methylated C).
Here's the crucial figure:
The picture above shows the traces left by the four different DNA bases due to the different effects these molecules have on electrical current flowing across the nanopore. Although there's a fair bit of noise at the molecular level, these signals provide surprisingly accurate base detection: the histogram on the left shows the fraction of bases correctly assigned (base-calling errors fall in the troughs between the four peaks). The average accuracy is 99.8%, and importantly the errors aren't random: where there is ambiguity, there are only two possible states (rather than four). That type of error constraint will make it much easier for downstream analyses to minimise the impact of base-calling errors.
There's still plenty of work to be done before this technology can be converted into a commercial platform, but I understand that the company has made substantial progress since the work described in the paper was completed over six months ago. Clearly Oxford is doing something right - the recent $18 million cash injection from sequencing giant Illumina, with a promise of more funding to come if Oxford met specific (unnamed) technical benchmarks, was an impressive vote of confidence.
Oxford's technology strikes me as the most exciting in the emerging third-gen arena: the nanopore approach offers some hefty potential advantages over competing technologies.
James Clarke, Hai-Chen Wu, Lakmal Jayasinghe, Alpesh Patel, Stuart Reid, & Hagan Bayley. (2009) Continuous base identification for single-molecule nanopore DNA sequencing. Nature Nanotechnology. DOI: 10.1038/nnano.2009.12
by dgmacarthur in Genetic Future
Razib points to an article suggesting that Australian couples are "flocking" to a US fertility clinic that allows them to screen their potential IVF embryos for sex and even cosmetic traits like skin and eye colour, in addition to variants that predispose to severe disease risk. ("Flocking", in this context, means about 14 couples a month.)
This follows on the heels of a fairly widely-publicised study published last week that surveyed around 1,000 genetic counselling patients about their attitudes towards embryo testing. The study was sold to the media as indicating that "consumers desire more genetic testing, but not designer babies" - but the numbers actually suggest a substantial market for genetic screening of embryos, despite the hefty social taboo against this decision.
The survey found that the majority of respondents would elect to screen their embryos for diseases like mental retardation, blindness, cancer and heart disease, and a hefty minority (20%) would screen for a disease that would result in death by the age of 50. More surprisingly, over 10% of respondents would screen their embryos for tall stature, athletic performance, or increased intelligence. Although this population is not a perfectly representative sample of the broader population, I'm still surprised to see that demand for such a socially unacceptable process is as high as this.
No doubt stories like this will result in increased hand-wringing and predictions of moral anarchy from social conservatives over the next few years. However, there are several good reasons to expect that embryo screening for late-onset diseases and non-disease traits (such as gender and eye colour) will not become widespread, at least in the near future: ... Read more »
Feighanne Hathaway, Esther Burns, & Harry Ostrer. (2009) Consumers’ Desire towards Current and Prospective Reproductive Genetic Testing. Journal of Genetic Counseling. DOI: 10.1007/s10897-008-9199-3
by dgmacarthur in Genetic Future
A new paper in Bioinformatics describes an efficient compression algorithm that allows an individual's complete genome sequence to be compressed down to a vanishingly small amount of data - just 4 megabytes (MB).
The paper takes a similar approach to the process I described in a post back in June last year (sheesh, if only I'd thought to write that up as a paper instead!). I estimated using that approach that the genome could be shrunk down to just 20 MB - compared to about 1.5 GB if you stored the entire sequence as a flat text file - with even further compression if you took advantage of databases of genetic variation like dbSNP. The basis of this compression is the use of a universal reference sequence. Each individual will differ at only a minority of sites (about 0.1%) from this reference, so you can save huge amounts of space simply by not storing the vast majority of the bases where their sequence is the same, and instead just creating a compressed list of the differences.
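To make the idea concrete, here is a toy sketch of reference-based storage in Python. It is not the algorithm from the paper: it handles single-base substitutions only (no insertions or deletions), uses made-up 20-base sequences, and the function names are my own invention for illustration.

```python
def encode_against_reference(reference, individual):
    """Return a list of (position, base) pairs where the individual differs.

    Toy assumption: both sequences have the same length, so only
    single-base substitutions are represented.
    """
    if len(reference) != len(individual):
        raise ValueError("this sketch handles substitutions only")
    return [
        (position, base)
        for position, (ref_base, base) in enumerate(zip(reference, individual))
        if ref_base != base
    ]


def decode_against_reference(reference, differences):
    """Rebuild the individual's sequence from the reference plus the differences."""
    bases = list(reference)
    for position, base in differences:
        bases[position] = base
    return "".join(bases)


# Made-up example sequences, far shorter than any real genome.
reference = "ACGTACGTACGTACGTACGT"
individual = "ACGTACGAACGTACGTTCGT"

differences = encode_against_reference(reference, individual)
restored = decode_against_reference(reference, differences)

print(differences)             # the only data that needs storing
print(restored == individual)  # the round trip is lossless
```

Only the two entries in the difference list need to be kept, provided everyone agrees on the same reference. At genome scale the list would then be compressed further, which this sketch does not attempt.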
The authors of this paper add one further refinement to this approach that I hadn't considered, and it is quite elegant: taking advantage of the repetitive nature of the human genome to further compress the sequence of insertions (i.e. areas of the genome that are present in the individual but not in the reference sequence). It's worth noting, however, that the benefit of this tactic will erode over time as the reference sequence becomes steadily more complete, and eventually becomes a montage containing all of the unique sequence found in common insertion variants in the population as a whole. Then most individuals will contain deletions relative to the reference rather than insertions, and deletions take up a lot less data.
While all this is very impressive, making such a heroic effort to compress the genome is probably a little excessive given how rapidly digital storage space is growing. From a personal genome point of view, most of us already carry gigabytes of digital storage on our person most of the time, so shrinking sequences down to 4 MB (which comes at the cost of adding to the time required to access the data in that sequence) is probably going too far - less stringent compression would probably be fine in most cases. However, I suppose that extreme compression may be useful for organisations that intend to archive extremely large numbers of complete genome sequences (assuming that sequencing costs continue to drop faster than digital storage costs).
And of course there's the whole issue of the need to store sequence quality data. The system in the article works fine for a complete, perfectly accurate genome sequence, but right now no sequencing platform is capable of generating such a sequence - far from it, in fact. It's likely that for the foreseeable future personal genome sequences will contain a mixture of both high- and low-quality sequence, and it will thus be useful to keep them attached to information on the confidence of each called base. That will add at least somewhat to the size of the storage space required.
Still, I imagine this paper was more of an intellectual than a practical challenge. I look forward to the inevitable Netflix Prize-style arms race as competing genome enthusiasts struggle to squeeze out even more extraneous kilobytes over the next few years.
S. Christley, Y. Lu, C. Li, X. Xie (2008). Human genomes as email attachments. Bioinformatics, 25 (2), 274-275. DOI: 10.1093/bioinformatics/btn582 ... Read more »
S. Christley, Y. Lu, C. Li, & X. Xie. (2008) Human genomes as email attachments. Bioinformatics, 25(2), 274-275. DOI: 10.1093/bioinformatics/btn582