Link: http://phenomena.nationalgeographic.com ... iscovered/
The most common viruses in your body don’t make you ill. Instead, they infect the legions of microbes that live in your gut. These bacteriophages, or phages for short, number in their trillions. And the most common of them might be a newly discovered virus called crAssphage.
No one has seen crAssphage under the microscope, but we know what its genome looks like—Bas Dutilh from Radboud University Medical Centre pieced it together using fragments of DNA from the stools of 12 individuals. He found crAssphage in all of them. Then, he found it in hundreds more.
To study the microbes that live in a person’s guts, scientists will typically collect a stool sample, break all the DNA within into small fragments, and sequence these pieces. The result is a metagenome: a mish-mashed collection of DNA from all the local bacteria, viruses and other microbes.
Dutilh’s team, led by Rob Edwards at San Diego State University, analysed 466 metagenomes that have been added to public databases and found crAssphage in three-quarters of them. It’s there in stool samples from people in the USA, Europe and South Korea. It actually accounted for 1.7 percent of all the sequences that the team analysed—six times more than all the other known phages put together. You probably have it inside you right now.
The work highlights just how much we don’t know about the viruses in our guts and “what exciting times these are for viral discovery”, says Lesley Ogilvie from the Max Planck Institute for Molecular Genetics.
But how could such a common virus go undiscovered for so long, especially considering how popular the study of gut microbes has become? It’s as if zookeepers suddenly realised that most of their zoos contain a giant grey animal with tusks and a trunk, which no one had noticed before.
For one thing, the viruses in our guts are hard to study. “To study a virus, normally you have to make heaps of it, which isn’t possible if you can’t grow the host,” says Martha Clokie from the University of Leicester. And since most gut bacteria won’t grow easily in a lab, the viruses that infect them are similarly hard to rear.
The alternative is to use metagenomics to analyse a microbe’s genes without having to grow it. But first, you have to assemble your mish-mash of sequences, which come from different organisms, into a complete genome. It’s a bit like putting all the pieces of a thousand jigsaw puzzles into one bag, and trying to solve just one.
The usual strategy is to work off what you know by aligning these new sequences to those in databases. But this approach doesn’t work very well for our inner viruses because most of them are unknown. The sequences in the databases represent the tip of the iceberg. According to Dutilh, around 75 percent of the DNA from any new stool sample—and as much as 99 percent—won’t match any of these known sequences.
So what’s in that other 75 percent?
Well, crAssphage for starters.
Dutilh’s team found it by using a different approach based on a simple idea: that fragments which repeatedly turn up in the same samples are more likely to be parts of the same genome. They used a technique called cross-assembly to identify one such group of co-occurring sequences, in stool samples from 12 people. They then assembled these sequences into a single genome.
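To make the co-occurrence idea concrete, here is a minimal Python sketch. It is not the team's actual pipeline: the fragment names, the read depths, the use of Pearson correlation and the 0.9 threshold are all assumptions chosen for illustration. It simply groups fragments whose abundance across samples rises and falls together with a chosen seed fragment.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-occurrence signal
    return cov / sqrt(vx * vy)

def co_occurring(profiles, seed, threshold=0.9):
    """Names of fragments whose profile tracks the seed fragment's profile."""
    ref = profiles[seed]
    return sorted(name for name, prof in profiles.items()
                  if pearson(ref, prof) >= threshold)

# Hypothetical read depths of four fragments across five stool samples.
profiles = {
    "frag_a": [10, 0, 25, 3, 40],
    "frag_b": [12, 1, 24, 2, 43],
    "frag_c": [0, 30, 2, 28, 1],
    "frag_d": [20, 1, 50, 5, 82],
}

group = co_occurring(profiles, "frag_a")
print(group)
```

In this toy data, frag_a, frag_b and frag_d go up and down together, so they would be candidates for assembly into one genome, while frag_c follows the opposite pattern and is left out.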
The genome had several distinctive features which told the researchers that it belonged to a phage, albeit one that’s very different to any we currently know of. They called it crAssphage after the cross-assembly method that revealed its existence.
They used the same technique to work out what the virus infects: if there’s lots of crAssphage DNA in a sample, there should also be lots of DNA from its host. Based on this logic, the most likely hosts are a group of bacteria called Bacteroides.
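The host-prediction logic can be sketched the same way, this time comparing the phage's abundance with that of each bacterial group. The numbers below are invented, and ranking by Pearson correlation is an assumption made for illustration rather than the method reported in the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def rank_hosts(phage_depth, taxa_abundance):
    """Candidate hosts sorted from most to least correlated with the phage."""
    scores = {taxon: pearson(phage_depth, abundance)
              for taxon, abundance in taxa_abundance.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical abundances across five samples (not real data).
phage_depth = [5, 40, 12, 0, 33]
taxa_abundance = {
    "Bacteroides": [8, 35, 15, 2, 30],
    "Escherichia": [20, 3, 18, 25, 4],
    "Lactobacillus": [10, 11, 9, 10, 12],
}

ranking = rank_hosts(phage_depth, taxa_abundance)
for taxon, score in ranking:
    print(f"{taxon}: {score:.2f}")
```

A high correlation is only a hint, not proof of infection, which is why a second, independent line of evidence is useful.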
The team checked this result with a second technique. They looked at CRISPR sequences—a kind of bacterial immune system that recognises DNA from infecting phages. The team scanned all known bacterial genomes for CRISPR sequences that matched crAssphage and found that the closest matches came from two groups of gut bacteria, one of which was Bacteroides.
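A rough sketch of that CRISPR check: bacteria store short "spacer" sequences copied from phages that attacked them, so a spacer that matches the phage genome points to a likely host. The sequences below are made up, and the brute-force scan with a one-mismatch allowance is an assumption for illustration; real searches rely on dedicated alignment tools.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def spacer_hits(spacer, phage, max_mismatches=1):
    """List (position, strand, mismatches) for near-matches of a spacer."""
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for i in range(len(phage) - len(query) + 1):
            d = hamming(query, phage[i:i + len(query)])
            if d <= max_mismatches:
                hits.append((i, strand, d))
    return hits

# Toy phage genome and toy spacers from two hypothetical bacterial genomes.
phage = "ATGCGTACCTGAAGCTTGGACCATGCA"
spacers = {
    "bacterium_1": "CCTGAAGCTT",
    "bacterium_2": "GGGGGGGGGG",
}

results = {name: spacer_hits(seq, phage) for name, seq in spacers.items()}
for name, hits in results.items():
    print(name, hits)
```

Both strands are scanned because a spacer can correspond to either strand of the phage DNA.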
Bacteroides are major players in our guts. They help us break down our food, control the development of our immune system, and protect us from disease-causing bacteria. Their numbers change depending on the food we eat, and they correlate with our risk of different diseases. If crAssphage infects these microbes, it could also be an important player in our daily dramas.
It’s too early to speculate what its role might be, says Dutilh. Still, we know that phages are generally important. By killing off the most abundant bacteria in the gut, they ensure that no single species can monopolise the space. And last year, Jeremy Barr, who was involved in this new study, showed that phages could even act as part of our own immune system.
Many scientists had assumed that viruses in the gut are caught up in fast-paced evolutionary battles with local bacteria. This leaves people with very different collections, and explains why most of the viral sequences that we find don’t match anything in the databases. But the existence of crAssphage challenges this concept: it was part of the pool of unknowns but it’s also incredibly common. “It definitely changes the idea we had about viruses being very individual-specific,” says Dutilh. The study of human gut bacteria followed a similar path: early studies highlighted the differences between us but important similarities started emerging as our techniques became more sophisticated.
There are probably many more common viruses waiting to be discovered. “The biggest contribution of this work is the method they used,” says David Pride from the University of California, San Diego. “It provides a blueprint for further viral discovery.”
“What are we missing when we are unable to classify a sequence? What do we do with all of the sequence reads that we can’t classify? These are tough questions that we’ve been thinking about for years,” says Kristine Wylie from Washington University in St Louis. “This paper demonstrates that the community is developing clever approaches that can be used to mine those data.”
Unknown virus present in intestine of most humans discovered
Re: Unknown virus present in intestine of most humans discov
The first thing that stood out for me was the fact a virus that is excreted in your poop was named ASSphage.
You will be assimilated...bunghole!
- Ziggy Stardust
- Sith Devotee
- Posts: 3114
- Joined: 2006-09-10 10:16pm
- Location: Research Triangle, NC
Re: Unknown virus present in intestine of most humans discov
As the end of the article mentions, this is notable more for developing a new method than for making any surprising new findings. We've known for a while that our digestive tract is an ecosystem in and of itself, with incredible frequencies and varieties of bacteria and viruses that we know essentially nothing about (hell, the 75% estimate listed in that paper is one of the lowest I've seen). In the past, our only real knowledge of this was based on studying antibody responses in the immune system. It is only relatively recently that the knowledge and technology available for "next-generation" sequencing have become commonplace enough for anyone to even attempt to systematically categorize these viruses and bacteria.
Those interested can read more about the hilariously named crAss here. Hell, if you have any DNA sequences just sitting around, there is a web interface for analyzing them. It's a new and rather clever algorithm for sequence alignment; it essentially just performs a crap-load of comparisons between every sequence in the data set and creates a matrix of distance scores. It's mathematically simple (you could easily calculate it by hand), but the problem (as with genetic data in general) is that the datasets are incredibly large even before you start taking into account pairwise and higher-order comparisons.
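To give a feel for what a "matrix of distance scores" looks like, here is a small Python sketch. It is not crAss's actual formula: I've swapped in Bray-Curtis dissimilarity as a stand-in, and the sample names and read counts are hypothetical. Each sample is described by how many of its reads landed in each cross-assembled contig, and every pair of samples gets a score between 0 (identical) and 1 (nothing shared).

```python
from itertools import combinations

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative count vectors."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / total

def distance_matrix(counts):
    """Symmetric nested dict of pairwise distances between samples."""
    names = sorted(counts)
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = bray_curtis(counts[a], counts[b])
        dist[a][b] = d
        dist[b][a] = d
    return dist

# Hypothetical reads per cross-assembled contig for three samples.
counts = {
    "sample_1": [10, 0, 5],
    "sample_2": [10, 0, 5],
    "sample_3": [0, 15, 0],
}

dist = distance_matrix(counts)
for name, row in dist.items():
    print(name, row)
```

With three samples this is trivial to do by hand; the number of pairs grows quadratically with the number of samples, which is where the dataset size starts to hurt.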
One thing to note in methods like this is that they typically are not "whole genome". That is, the assembly tool used to build a dataset to feed into crAss will often make use of "sequence tags": individual genes that are known to have some specific function (e.g. produce a specific amino acid or are known associates in some regulatory role) are the basis for the comparison, and the rest is more or less thrown out. It's a very well established method, but it is worth noting that for the purposes of examining unknown (and often volatile) viral DNA it may not be optimal.