I am a bit surprised because I did assume that the number of chromosomes would be the major indicator, which I also assumed wouldn’t be hard to determine, but you didn’t address that part of the question. (“What are the defining factors: the number of chromosomes only?” )
If I can jump in to address this - you're right that it's very easy to determine the number of chromosomes that a cell has; unlike the latest and greatest molecular biology techniques, counting chromosomes is something you do with a special dye and a microscope. They're large enough (when we collect them in their "chromatid" state, found in cells that are captured in the process of division, when the cell very conveniently "packages" its chromosomes so that they can be moved around.)
But the wrinkle is that different species may have the same number of chromosomes, especially if they're closely related; different individuals in the same species may have different numbers of chromosomes, different cells in the same organism may have different numbers of chromosomes, and prokaryotic organisms like bacteria have only a single circular chromosome, regardless of their species. So what can be gleaned strictly from chromosome count is often subject to what species you're talking about. There's no way to look at just a chromosome number and say "well, that's from Bos taurus."
Do you ever actually look at the DNA itself or are you looking at some sort of indicator, model, or whatever you call it that you somehow derive from the cell? I know a DNA portrait as it were is often represented by some sort of bars that to me are indecipherable. Do I have to learn what those mean in order to get answers to the sort of questions I'm asking?
Last question first: No. The "black bar" pictures you're familiar with are an older way of characterizing DNA by chemically cutting it at specific sequence sites, and then measuring the length of the fragment - the closer to the bottom of the picture the black bar is, the shorter the fragment is. Different rows of black bars are different samples, and a "ladder" - a mixture of fragments of known sizes - runs along the sides for comparison. The pattern of fragment sizes can be compared with other people's pattern of fragment sizes, assuming you broke up their DNA in the same way.
So that's a case where you're actually looking at DNA - it's been dyed, or sometimes made radioactive, and used to expose a piece of photographic film. That's why it's kind of fuzzy. We have new methods, though, where we can use chemistry and special machines to sequence DNA directly, and have a digital readout of all of it's bases. When we do that, we're starting with DNA, usually making a lot of copies of it so that there's more of it to work with (we have a chemical DNA copier called "PCR"), and then producing a digital file that contains all the bases in our sample. Then we just look at it on computers. We're using the third or fourth generation of technologies developed during the Human Genome project. We're experiencing the same kind of technological growth that we had in computers in the 80's and 90's.
But you can look at DNA right now, without much of anything special:
In this form, it doesn't tell anybody much at all. It's just a slimy fiber. Learning to read this - for that matter, learning what it did - was the central task in biology and biochemistry for about 50 years. As a field, we're largely moving away from "black bars"-type techniques in molecular biology and more into direct sequencing. Actually I'd say that we already have. So, no, you won't have to learn how to read one of those.
They're numbered in order of length. 1 is the longest, etc, except that the sex-determining chromosomes will be labeled X and Y. (Or sometimes Z and W, to indicate the different sexing scheme that some birds, lizards, and insects have.) Since every cell has at least one of each chromosome, you can lay them out in order of size and just count. That's called a "karyotype." Also, when you dye ("stain") a chromosome, they produce a distinct pattern of banding that can also be used to identify them. For the most part that's species-specific.
To address Percy's comment, the reason that I mentioned that some species have intraspecific variations in chromosome count is 1) it's true, and 2) you should get used to the idea that the only absolute rule in biology is that there are no absolute rules in biology. Life exists in infinite and continuous diversity. That's not just some spiritual-sounding mumbo-jumbo. Life really is a lot more varied than we usually imagine.
Also, I'm assuming the DNA strand doesn't break for this exchange to take place, break apart and come together again, so it isn't exactly right to say that a whole GENE is being moved, is it?
It actually is being broken when these kinds of re-writes are happening. It's being snipped, moved, and re-combined. Doing this safely with controlled results is pretty important to the continued health of the cell, as you might imagine, so it happens to be the case that quite a bit of the enzymes "at work" in the cell's nucleus are involved in this process. But the fact that this happens in all species - and that the enzymes for doing it are relatively similar across widely different species, and therefore compatible - gave us our first chemical tools for studying DNA sequences and manipulating them. The same enzymes that snip DNA in cells are how we produce those "black bar" pictures you asked about before.
How that actually happens is a matter of the chemistry of the DNA molecule itself, which I think we've all decided is a little out of scope right now, but I can elaborate at your request.
Yeah, the white stringy stuff. Like I say in its raw form it's not of much use.
But what IS direct sequencing? You look at the DNA strand under a microscope and count off the chemicals?
The bases of DNA are far too small to see like that. We use chemistry, electronics, and computers to sequence DNA bases. There are a couple of ways to do it but they're all based on PCR - the polymerase chain reaction, a combination of chemicals, an enzyme called Taq polymerase, which is where our friend Taq derives his username, and a machine called a "thermocycler" that basically gets hot and cold on command. (The PCR reaction is controlled by temperature.)
What does PCR do? It's a way to chemically "amplify" DNA. When you extract the DNA from a single human cell, for instance, you get two copies of each of 23 human chromosomes - and that's it. 46 molecules in total. That's not nearly enough to do much chemistry on, so PCR was developed as a way to take a sample of DNA and duplicate it over and over again, making a perfect copy each time. 1 DNA molecule becomes 2, 2 get copied to 4, 4 get copied to 8; if you do it about 30 times you turn a single DNA molecule into one billion identical copies.
PCR really revolutionized biology and biotechnology, it's the basis of the "biotechnology revolution" and fundamentally made genetic engineering and other applications possible. The sequencing technologies we use are adaptations of this technology. If you're interested in the chemistry I can elaborate.
My questions come more from wanting to be able to visualize the DNA strand and what's going on there more than anything else.
Sure. The most important thing to visualize is the "Watson-Crick" pairing - that, if you know the bases along one strand of the DNA, you can reconstruct the other strand by matching each base with its Watson-Crick pair: A to T, C to G. If you have two strands of DNA, you can copy it by separating the strands and then, using each strand as a template, reconstructing the complimentary strand. Now you have two copies, instead of just one. Every time you do that, you double your DNA.
Hopefully it's a little more clear - after direct sequencing, you're not looking at a DNA strand, you're usually looking at a computer. For instance, all we do in my lab is sequencing, and the output we get are text files on hard drives. Each text file - sometimes these are gigabyte files - are just lists of letters. ATGC, corresponding to the bases. It's much more useful to have it that way, because now we can use computer programs to look for sequences, put them into databases, and match or compare them to sequences that are already there. That's all incredibly useful and it's part of the field called "bioinformatics."
For starters let me guess, you don't sequence an entire gene, which is too huge, but there are parts of the sequence that give you useful information about it?
Genes are small enough to be readily sequenced - about 3000 base pairs. If you have information about what sequence that gene already has - usually because you've already sequenced the orthologous gene in another related organism - then it's really easy to pull out that gene and read it directly. If you're trying to sequence a gene for which you have no pre-existing sequence information - de novo sequencing, that's called - then it can be harder to get it's sequence. But it's not impossible, just hard.
It's a lot more time-consuming to sequence the entire genome of an organism, that's every base pair on every chromosome. In human beings, for instance, that's 3.6 billion base pairs. But as the technology marches along we've dramatically improved sequencing times. The project to sequence the entire human genome (appropriately, the "Human Genome Project") took 11 years and millions of dollars. With current technology we could do it in a few days for about $5,000.
Of course, just the string of bases doesn't immediately tell you a whole lot. It all looks like random noise. To get something useful out of it the genome has to be annotated, that is, we have to associate parts of the DNA sequence with their function, with traits and alternative alleles - with whatever information we already have about what part of the DNA does what. That's called the "annotated genome" and producing it for humans and other organisms is what we're currently doing in molecular biology.
This is the Annotated Genome Browser, which is a way to browse the information we've associated with the DNA sequence of an organism (human, in this case.) You don't need to put anything into the fields, just click "submit". For whatever reason it takes you to human chromosome 21 by default. Since it's primarily a research tool, it'll be like drinking from the firehose for you, but don't be discouraged, most of the scientists who look at this are so specialized in one particular area of the genome that most of the information presented here isn't very meaningful to them, either. It's just to give you a sense of what the information we're generating looks like, or can be displayed as.
Looking at the genome browser will probably give you like a million questions, I just present it so that you can see what kind of sequence and annotation data is being generated by scientists in molecular biology.
I really would like to know about "Junk DNA" and was going to skip to that question anyway when I saw Dr. A's post with the diagram of the Human Genome here: http://www.evcforum.net/dm.php?control=msg&m=680560 It doesn't even mention Junk DNA. Can you explain?
I think you've largely got the long and the short of it. "Junk" DNA is a name some people gave to DNA sequences that don't encode proteins. For my own part, I think it's a bad name for a couple of reasons:
1) Even if DNA doesn't encode a protein, it might be involved in regulation, it might be a docking site for structural or regulatory proteins, etc. It may be a "telomere" that allows a cell lineage to know when it's reproduced too many times.
2) Even if it doesn't have any function at all to the organism, it may have once had function, or it might be exogenous to the organism - having been inserted by a virus, it may be a "transposeable element" (transposon), or entered the organism's genome by some other means (like "RNA retrotransposition"). If it did, we should find out if it did, not just write it off as junk.
3) "Bulking out" one's genome with genetic "noise" serves to minimize the effect of frameshift mutations by spreading out the genes.
Like most people working in genetics, I generally don't say "junk DNA."
I don't see an indication of what percentage of the Genome is pseudogenes.
The human genome? We don't yet know. It's hard to determine whether a gene is actually never expressed, or whether it's merely seldom expressed. Also, our tests have a hard time determining whether a gene is transcribed and expressed, or transcribed but not expressed (this is the fate of some pseudogenes, they appear in the genome and the transcriptome, but not the proteome.) In an organism like a human (or a plant, for that matter) there's an enormous amount of editing in between genetic transcription and building a finished protein, which can make it difficult to look at a gene in the genome and trace it exactly back to a known protein.
One paper I just found suggests that the number of pseudogenes in the human genome exceeds the amount of genes. One of the biggest surprised of the human genome project (and the subsequent annotation - actually assigning function to sequence) was how few human genes there are - less than 20,000.
Look, you guys can be as skeptical as you like, but this is straight out of Leninger Principles of Biochemistry, Inferring Phylogenies, and for that matter Wikipedia:
quote:Noncoding DNA separate genes from each other with long gaps, so mutation in one gene or part of a chromosome, for example deletion or insertion, does not have the "frameshift mutation" on the whole chromosome. When genome complexity is relatively high, like in the case of human genome, not only different genes, but also inside one gene there are gaps of introns to protect the entire coding segment to minimise the changes caused by mutation.
Probably not the clearest way to put it - this doesn't seem like it was written by a native English speaker, actually - but I'm hardly running off the edge of the science, here.
Crash, that quote doesn't say anything about the non-coding regions being an advantage because they might get extra mutations putting things right. That idea is still very implausible just from statistical principles.
Hrm, then maybe I misunderstood. I'll see if I can dig up a better reference. I know I've gone back and forth on this over the years, so in the meantime I'm happy to have people consider the claim withdrawn. I'll see if I can find some support for it. This is a good thread, I think the potential for informing Faith is pretty high, and I like to not drag it off-topic towards confusion.