
It's quite fascinating. It's as if, in order to figure out the shape of a teacup, we generate thousands of identical copies, smash them all into rather small bits, and then try to count the different types of shards as a first step toward piecing together one full copy. Impressive that it works.


> It's as if, in order to figure out the shape of a teacup, we generate thousands of identical copies, smash them all into rather small bits, and then try to count the different types of shards as a first step toward piecing together one full copy. Impressive that it works.

Yes, but you've got the order wrong.

The teacup is smashed before all of the identical copies are created.

(I wrote DNA analysis software for 6.5 years)


It's not fascinating; it's an endless source of trouble. We only do it because we don't have sequencers that produce extremely long (chromosome-length), high-quality reads, especially for sequences that contain a lot of repetition. This has been a source of errors and ambiguity for as long as we've used shotgun sequencing.


This is a great analogy. One small addition is that there are two ways to reassemble it. One is to blindly fit the pieces together until they form a teacup (read assembly); the other is to use a picture of the teacup to figure out where the pieces go (read alignment / mapping).
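To make the two approaches concrete, here is a toy sketch in Python. Everything in it is an illustrative assumption rather than how real tools work: the function names (`shred`, `greedy_assemble`, `map_reads`) are made up, the reads are error-free, and matching is exact. Real assemblers typically use overlap or de Bruijn graphs, and real aligners use indexed, error-tolerant matching.

```python
def shred(seq, read_len, step):
    """Cut seq into overlapping fixed-length reads (the 'shards')."""
    return [seq[i:i + read_len] for i in range(0, len(seq) - read_len + 1, step)]


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Assembly without a reference: repeatedly merge the best-overlapping pair."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads fully contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to merge; return the remaining contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


def map_reads(reads, reference):
    """Alignment with a reference: look up where each read sits (-1 if absent)."""
    return {r: reference.find(r) for r in reads}


if __name__ == "__main__":
    reference = "ATGGCGTACGTTAGC"  # made-up sequence, the 'picture of the teacup'
    reads = shred(reference, read_len=6, step=3)
    print(greedy_assemble(reads))        # contigs rebuilt from overlaps alone
    print(map_reads(reads, reference))   # each read's position in the reference
```

Note that the greedy merge is exactly where repeats cause trouble: if the same stretch appears twice in the sequence, more than one overlap looks equally good and the assembler can join the wrong pieces.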



