I assume the program you were watching was The Gene Code, but unless you are more specific about where the claim you are discussing is made, it is hard to address what claim it is you are talking about. Since it is on iPlayer you could have told us the program, and even the right time point, had you cared about productive discussion.
A brief skim suggests that perhaps what you are misunderstanding is the concept of conservation, where some regions of DNA appear to have undergone substantial random mutation while others seem to have been very precisely maintained across vastly distinct species, e.g. chickens and humans.
Your mistake is in believing that this is somehow controlled by the DNA. What it is is the result of natural selection removing mutations which are deleterious. If a specific nucleotide sequence is vital to the development, survival or reproductive success of the organism then it will tend to be highly conserved.
This doesn't mean that there is anything preventing mutations happening in these regions, it just means that when they do the organisms in which they arise do not tend to contribute their genes to future generations.
You could argue I suppose that since the organism's genome is part of its whole environment then through the complex interconnected nature of the gene regulatory networks the rest of the genome is one of the selective factors which constrains which mutations are viable. But if that was what you were trying to say you said it very badly, and it is totally irrelevant to defining species.
So if you want to discuss specifics then give us some specific to discuss, either in your own words or by giving us a time code for when in the program the comments you are interested in are.
This is actually veering back round to the original topic as what we were discussing previously was the distinction between morphospecies and genetic distance based species definitions. Specifically many of us were trying to make Big_Al understand both the arbitrary nature of species definitions and the disconnect between morphology and DNA which meant that morphologically identical species could be significantly divergent in their genomes. This then took us on to discussing morphological variability which was not genetically determined, something Big_Al appeared not to believe in.
Both you and Huntard have already given examples of three legged individuals which I would describe as highly deleterious.
I would certainly agree with you, and I suspect Huntard would too, but you still seem to have failed to grasp the point we were making, which is that this is not a genetic mutation.
You presented your arguments as if there was no distinction between highly mutated areas and those that are more precisely maintained. Of course you're backpedalling now.
The whole reason we got onto the three legged issue was because you were denying the role of environmental factors in affecting development. That is why we weren't discussing conserved or non-conserved regions of the genome, because the whole point was that there are extra-genomic factors that can affect an embryo's development.
The sequence of DNA which specifies having two legs falls within those precisely defined and highly maintained segments of DNA that we are talking about.
I think the idea that there is one 'sequence of DNA which specifies having two legs' is probably quite mistaken. I am quite happy to accept that there is probably a high degree of conservation amongst the various genes and regulatory elements which produce bilaterally symmetric limbs amongst vertebrates.
I don't dispute that any area of the genome is vulnerable to being perturbed, but clearly we both agree that some areas are highly maintained whereas within other segments of DNA code, non-deleterious mutations occur often, frequently and in fact with every offspring.
I think you are making a false distinction here; the important thing is not the frequency and nature of the mutation but its subsequent spread through the population.
There is little reason to think that highly conserved regions are any less susceptible to mutation than non-conserved regions; conservation is principally a reflection of natural selection rather than mutational frequencies. The important distinction is not that the mutations occur in these non-conserved regions but that they are allowed to persist in them, while similar mutations in more developmentally/functionally important regions will tend to be selected against because of the perturbations they produce. In the most extreme cases this selection will be virtually immediate, in the form of embryonic lethal mutations where the organism will never become viable.
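To make the distinction between mutation and selection concrete, here is a minimal toy simulation in Python. It is only a sketch under stated assumptions: the genome length, population size, mutation rate and the rule that any mutation in the constrained half is lethal are all hypothetical choices made for illustration (the lethal rule corresponds to the most extreme case described above), and none of the numbers come from real data. Every site mutates at the same rate; the only difference between the two regions is whether mutants survive.

```python
import random


def simulate(generations=200, pop_size=200, genome_len=40,
             constrained_sites=range(0, 20), mu=0.002, seed=1):
    """Toy model: uniform mutation rate, lethal selection on constrained sites.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    constrained = set(constrained_sites)
    neutral = set(range(genome_len)) - constrained
    # 0 = ancestral state, 1 = mutated state at each site
    population = [[0] * genome_len for _ in range(pop_size)]
    arisen = [0] * genome_len  # mutations that occurred, whether or not they survived

    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            child = rng.choice(population)[:]
            for site in range(genome_len):
                if rng.random() < mu:  # same mutation rate at every site
                    child[site] ^= 1
                    arisen[site] += 1
            # Selection acts after the mutation has happened: a child with a
            # mutated constrained site never contributes to the next generation.
            if any(child[s] for s in constrained):
                continue
            offspring.append(child)
        population = offspring

    def divergence(sites):
        # Fraction of surviving site copies that differ from the ancestor
        return sum(ind[s] for ind in population for s in sites) / (len(population) * len(sites))

    return {
        "mutations_arisen": {
            "constrained": sum(arisen[s] for s in constrained),
            "neutral": sum(arisen[s] for s in neutral),
        },
        "divergence": {
            "constrained": divergence(constrained),
            "neutral": divergence(neutral),
        },
    }


if __name__ == "__main__":
    print(simulate())
```

In this sketch roughly the same number of mutations arise in both regions, but only the neutral region ends up diverged from the ancestor. The constrained region looks "protected" purely because its mutants were removed after the fact.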
If the example you gave was of the impact of an extra-genomic factor unrelated to genetic mutation then your point is irrelevant to the study of evolution.
That was my point, you were the one saying that there weren't organisms being born with extra legs because the environment couldn't affect DNA and claiming that the variations in fingerprints and retina must be encoded in DNA.
I was pointing out that extra-genomic environmental factors were a strong influence on the random/stochastic elements of development which produce a further level of variation beyond that encoded in the DNA.
But since this variation is not heritable it is not relevant to evolution, whereas in your scenario where all the variation is somehow coded into the DNA it would be.
Highly conserved genes have spread throughout the population and therefore would be the genes that identify species more accurately.
No. In fact genes that aren't at all conserved across species would be best to uniquely identify a species. What conserved genes are good for is to produce phylogenies across multiple species. There can be conservation of a gene within a species but not between species; such a species-specific gene would be the ideal molecular marker.
A highly conserved region is by definition less susceptible to mutation than non-conserved regions.
No it isn't, it is just that the organisms in which it is mutated are much less likely to survive and reproduce. There is no protective effect stopping mutations from occurring; it is post hoc natural selection that is eliminating the mutants.
More reasonably by definition a conserved region is one in which significant divergent mutations have not accumulated, that doesn't mean that those mutations can't occur.
If you think you can find a reputable source to support your definition feel free to provide it.
This is not my understanding of how DNA works. My parents had several children each one with different DNA however small. They did not have several miscarriages to balance out the successful outcomes which would be the logical conclusion of your argument.
Well I think we have fairly well established that you don't have a particularly solid grasp of how DNA works. Your argument fails to make sense though, what basis are you using to claim that logically your parents should have suffered several miscarriages? Can you show me appropriate mutation rates and the exact number of 100% obligatory nucleotide sequences that support your claim? If so then you know more about human genetics than anyone else on the planet.
As I pointed out embryonic lethals are the most extreme form of the operation of selection that I described. But it is worth noting that your parents may have had a miscarriage and been unaware of it, there is evidence that as many as 25% of natural conceptions end in miscarriage within the first 6 weeks of pregnancy (Wilcox et al., 1999). It is also worthwhile noting that anyone who is congenitally sterile would satisfy the most extreme criteria as well.
This doesn't begin to take into account non-lethal, non-sterility causing mutations which nevertheless significantly impact survival and reproductive success leading to a tendency not to propagate.
You didn't need much information to identify the program I was referring to and the same follows for the above.
No I didn't, but the program doesn't use your definition at all. They are talking about the conservation of sequences between chickens and humans. And since I know that your definition is nonsense, since I have studied and worked in the field of genetics for 15 years, I am hardly going to waste my time looking for a reputable source that uses it.
So since you are claiming that is the definition, why is it you can't tell me where you saw it defined that way?
Firstly you gave me an example of a three legged individual whose deformity was due to environmental impact and therefore irrelevant to the study of genetics or evolution and you claim that I don't have a solid grasp?
It would appear that you don't have a solid grasp of reality or the relevant.
Okay, so apparently you have completely forgotten the previous discussion on this thread. Why not look at Message 204 where you say that the random factors affecting fingerprint and retina patterns must be encoded in the genome and take up considerable space there, or look at Message 212, where you ask ...
How can the variable nature of the environment affect some parts of the DNA but not the important information storing parts which spell out that we should have arms, legs, torso, head etc...I am guessing you are going to tell me now that some people are born with three legs?
What I was doing is pointing out to you what is really the case with these sort of variations, that the randomisation comes from the environment and also in part from the statistical mechanical nature of biochemistry. These random factors aren't affecting the DNA but rather the functional activities that the genes encode for during development and in some cases the expression level and timing of particular genes.
So your whole swathe of genetic information acting as a random seed for variation is a complete fabrication you made up of whole cloth, and now we are supposed to accept that you know what you are talking about? Especially hard to do when you claim I am dragging things off topic when I answer specific questions that you have asked.
You will persist in this game of making utterly irrelevant points in your show of brinksmanship. Who cares about the conservation of sequences between chickens and humans?
The people in the TV program you were claiming supported your claim that ...
DNA is coded such that it specifies which sequences of DNA code can change randomly and which areas should not be changed randomly
You have yet to identify a point in the program which actually supports this claim. I identified what I thought might be the relevant segment where they talk about highly conserved regions across species, if this wasn't what you meant then why not, actually finally after half a dozen posts, give us a useful point of reference so we know exactly what you are talking about, or even give us the relevant quotes since all you need to do is transcribe them from i-player.
It's utterly irrelevant again. Humans, mice and chickens are all different species which didn't even evolve along the same paths if they evolved at all.
Well if one accepts evolution then they did evolve along exactly the same paths up until they diverged from their respective most recent common ancestors.
It's not about which sequences are the same but about which sequences are different.
Wasn't that kind of my point? That between species differences rather than commonalities were more appropriate for defining species?
I get the impression that the segment I identified wasn't actually the one you were thinking of, if this is the case it would have been a lot less confusing if you had just said so at once and identified the correct segment.
If you want to set an arbitrary limit on the genetic distance from a given genotype or an arbitrary set of molecular markers or haplotypes that defines a species (as in species barcodes) then I am quite happy with that, as long as you appreciate that it is arbitrary.
If you are claiming that your criteria aren't arbitrary then you need to provide some sort of rationale for why not. I understand the basis for your claim that there should be some clear demarcation, some form of created kinds argument or baraminology, but that isn't a rationale for choosing any particular set of criteria.
I'll be honest Big_Al, I'm having a hard time seeing this as you trying to engage in a productive debate.
This doesn't change the fact that you will persist on highlighting irrelevancies again and again. Fingerprints and retinas do not define species and environmental influences do not define species.
Irrelevancies that you brought into the discussion and accompanied with a gigantic false claim concerning genetics.
As I said before who cares about human and chicken similarities.
Lots of people who study comparative genetics do, including some from the documentary you directed us all to.
Why a chicken and not an elephant?
Because chickens are easily experimentally manipulable and readily produced in great numbers, none of which are true for elephants. Also chickens are considerably further removed from us in evolutionary distance, meaning that conservations across that distance are more likely to be functionally important.
You continually refer to conserved areas across species but never refer to conserved areas within one species.
We don't really have enough human genomes sequenced yet to be able to say too much about this. Things like the HapMap project and the 1000 genomes project are good starts though.
I think you are wrong to ignore cross species comparisons though. There is no point defining a suite of genetic characteristics common to all humans as a species definition if you then find out that the same characteristic suite is found in chimps and gorillas as well. You need both to come up with suitable criteria.
Since you brought up DNA barcoding earlier on a suitable paper for discussion might be this one ...
Proc Biol Sci. 2008 February 7; 275(1632): 237–247.
DNA barcoding has become a promising means for identifying organisms of all life stages. Currently, phenetic approaches and tree-building methods have been used to define species boundaries and discover ‘cryptic species’. However, a universal threshold of genetic distance values to distinguish taxonomic groups cannot be determined. As an alternative, DNA barcoding approaches can be ‘character based’, whereby species are identified through the presence or absence of discrete nucleotide substitutions (character states) within a DNA sequence. We demonstrate the potential of character-based DNA barcodes by analysing 833 odonate specimens from 103 localities belonging to 64 species. A total of 54 species and 22 genera could be discriminated reliably through unique combinations of character states within only one mitochondrial gene region (NADH dehydrogenase 1). Character-based DNA barcodes were further successfully established at a population level discriminating seven population-specific entities out of a total of 19 populations belonging to three species. Thus, for the first time, DNA barcodes have been found to identify entities below the species level that may constitute separate conservation units or even species units. Our findings suggest that character-based DNA barcoding can be a rapid and reliable means for (i) the assignment of unknown specimens to a taxonomic group, (ii) the exploration of diagnosability of conservation units, and (iii) complementing taxonomic identification systems.
What you propose seems to be similar to this characteristic attribute organization system (CAOS) approach. But this approach relies on the fact that members of a given taxonomic group share attributes (e.g. polymorphisms) that are absent from comparable groups. This approach explicitly relies on both within and between group comparisons to define a meaningful suite of diagnostic characteristics.
Can you outline what sort of approach you would take to usefully define a species genetically looking only within that species?
You can see how unhelpful the results are and no definition can be reached.
Incorrect, it is worth noting that between species differences are not simply studied on the basis of 1 individual organism. But putting that aside we clearly already have a panel of putative diagnostic characters that could define the 3 species.
Chicken: QRST
Mouse: LMNO
Human: GHIJK
Obviously the accuracy of these can be refined both by the study of other species and of further individuals within each species. This is much more useful for defining a species than the commonalities from the cross species analysis, as I said it would be.
If we were to find a third, fourth and fifth human with a matching genome of ABCDEFGHI we might start to think that ABCDEFGHI could be used as a good and reliable definition for a human.
And when we find a chimpanzee with ABCDEFGHIST? That is the limitation of your only within approach. In fact no matter how many humans you find with ABCDEFGHI it will never be a suitable definition if you haven't looked at other, ideally closely related, species because you simply won't know what their characteristics, and consequently any characteristics shared with humans, are.
This is why I have been saying all along that you need to look at both within and between species variation.
Incorrect, as my example already goes on to find a human with GHIVW.
Fine, my whole point was that from comparisons of the initial 3 individual organisms you had a preliminary basis to be further refined. So you add another couple of humans in and find that only GHI are species specific, and maybe you find that JK is population specific to Kalahari bushmen, whatever.
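The refinement described in this exchange can be sketched in a few lines of Python. This is a toy illustration, not the CAOS software: the function name is hypothetical, and the full genome strings below are my assumed reconstruction of the lettered example (a shared ABCDEF core plus the species-specific letters quoted above). A character counts as diagnostic only if it is present in every sampled member of a species and absent from every sampled member of all other species, so it needs both within and between species comparisons.

```python
def diagnostic_characters(samples):
    """Return, per species, the characters shared by all of its sampled
    members and absent from every sampled member of every other species.

    samples: dict mapping species name -> list of character strings.
    """
    result = {}
    for species, genomes in samples.items():
        # Within-species comparison: characters every member shares
        shared = set.intersection(*(set(g) for g in genomes))
        # Between-species comparison: characters seen anywhere else
        elsewhere = set()
        for other, other_genomes in samples.items():
            if other != species:
                for g in other_genomes:
                    elsewhere |= set(g)
        result[species] = "".join(sorted(shared - elsewhere))
    return result


# Assumed reconstruction of the toy genomes from the thread (hypothetical)
stage1 = {
    "chicken": ["ABCDEFQRST"],
    "mouse": ["ABCDEFLMNO"],
    "human": ["ABCDEFGHIJK"],
}
# A second human carrying GHIVW instead of GHIJK
stage2 = dict(stage1, human=["ABCDEFGHIJK", "ABCDEFGHIVW"])
# A chimpanzee carrying ABCDEFGHIST
stage3 = dict(stage2, chimp=["ABCDEFGHIST"])

if __name__ == "__main__":
    for label, data in (("stage 1", stage1), ("stage 2", stage2), ("stage 3", stage3)):
        print(label, diagnostic_characters(data))
```

With three individuals the human candidates are GHIJK; the second human narrows them to GHI; the chimp then leaves no human-specific characters at all and also trims the chicken set to QR. No amount of extra human sampling would have revealed that last step.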
Yes you have kept referring to conservation across the species but never conservation within one species see previous posts.
I have also referred to conservation within species, see previous posts. Again, since you seem to be incredibly resistant to reading what I actually write, what is needed to produce a useful definition is both within and between species comparisons. In fact the most important things to look at for defining a species are differences between species and similarities within species, which was why I said exactly that in Message 235.
In fact genes that aren't at all conserved across species would be best to uniquely identify a species. What conserved genes are good for is to produce phylogenies across multiple species. There can be conservation of a gene within a species but not between species; such a species-specific gene would be the ideal molecular marker.
This whole recurrence of the thread just seems another prime example of you bringing up something irrelevant, the as yet unidentified claim made in the BBC program which you say supports ...
that DNA is coded such that it specifies which sequences of DNA code can change randomly and which areas should not be changed randomly
And then when people address this you act as if you were talking about the original topic all along and claim that other people are talking about irrelevancies, i.e. the cross species conservation which was discussed in the BBC program.
If the cross species conservation wasn't what you were talking about then why not finally pull your finger out and identify what it was in the program that you were talking about, it has been several days now and you have still to do the most basic thing to actually provide any support for your claim, or even show how you relate that claim to the central topic.
I have tried to steer the topic back to a more focused discussion of species and cited a paper using a genetic barcoding approach which emphasises the importance of both within and between species/population comparisons. Do you agree yet that a within species only approach is insufficient?
I'd been considering these characteristics to be more akin to haplotypes or single nucleotide polymorphisms (SNPs) rather than actual discrete genes. Certainly SNPs within a gene or set of genes is the usual basis for DNA barcoding approaches.
I think that WK is now seeking to find whether you will agree that barcoding by itself is insufficient for species identification.
It isn't quite that. DNA barcodes are very effective for species identification as long as you have a reference; what they aren't necessarily good for is species discovery.
If we are naively presented with a panel of genetic sequences from an unknown set of organisms then we can use the approaches used for genetic barcoding to identify specific populations. We can produce a phylogeny of the organisms and we should be able to identify specific barcodes corresponding to the various clades.
What this doesn't tell us, in the absence of a criterion for defining it, is which of these organisms are of the same species. Clades may be identified at all levels from sub-populations to species and above that through genera and beyond.
To identify a specific barcode for any given population we want a within and between genetic comparison for all of the other populations and organisms available.
To identify a specific barcode for a species we need to have a criterion for what will constitute a species. This can either be based on a previous taxonomy or can be some degree of tightness in the clustering of the genetic markers in the barcodes or some similar metric. Either way that definition will still be arbitrary, which has been the point all along.
Species discovery does not just fall out of DNA barcoding naturally, there has to be a pre-existing species concept for it to hang on.
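The arbitrariness of a clustering-based criterion can be shown with a small Python sketch. Everything in it is an assumption made for illustration: the five barcode sequences are invented, Hamming distance stands in for whatever genetic distance one prefers, and single-linkage grouping is just one simple clustering rule among many. The same data yield a different number of "species" depending solely on the distance threshold chosen.

```python
from itertools import combinations


def hamming(a, b):
    """Number of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def count_clusters(seqs, threshold):
    """Single-linkage grouping: two samples end up in the same cluster if
    they are connected by a chain of pairs differing at <= threshold sites."""
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the cluster representative
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if hamming(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)
    return len({find(name) for name in seqs})


# Hypothetical aligned barcode fragments, invented for this example
barcodes = {
    "a1": "AAAAAAAAAA",
    "a2": "AAAAAAAAAT",
    "b1": "AAAAAAATTT",
    "b2": "AAAAAATTTT",
    "c1": "TTTTTATTTT",
}

if __name__ == "__main__":
    for t in (0, 1, 2, 5):
        print("threshold", t, "->", count_clusters(barcodes, t), "clusters")
```

Nothing in the sequences themselves says whether the right answer is five groups, three, two or one; that comes from the species concept you bring to the data.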
It would be interesting to discover how many genes are different between humans and chimps and what fraction of the entire genome this represents?
There is already quite a bit of information on this from the human and chimpanzee genome projects. Some highlights from an overview from NIH's Genome.gov site ...
At the protein level, 29 percent of genes code for the same amino sequences in chimps and humans.
In fact, the typical human protein has accumulated just one unique change since chimps and humans diverged from a common ancestor about 6 million years ago.
the number of genetic differences between a human and a chimp is about 10 times more than between any two humans.
More than 50 genes present in the human genome are missing or partially deleted from the chimp genome.
The researchers found six regions in the human genome that have strong signatures of selective sweeps over the past 250,000 years.
A slightly more recent paper from 2007 (Hahn et al.) looks at rates of gene loss and gene gain in the primate lineage and notes unusual expansions in neural related genes in the human lineage.
This ties in to studies of what are called Human Accelerated Regions, regions which are conserved amongst mammals generally but show significant differences in humans (Prabhakar et al., 2006; Katzman et al., 2010).
As a fraction of the genome these regions are all pretty tiny, ~50 missing genes, 50 human accelerated regions. Even as a proportion of the difference between chimps and humans, and allowing the same again for the chimp lineage, they are still likely to be a very small proportion, especially when there are some single insertion-deletion events which account for megabases of sequence at a time. Small but possibly very important in terms of the actual functional changes which account for the differences between humans and chimpanzees.
It is the gene sets that are unique. Different species can share many of the same genes, but the complete set of genes are unique for each species. The more closely related two species are the more genes they are likely to share, but if they are in reality two different species then one or both will have genes the other does not have. And if you discovered a new species that shared all the same genes as an existing species, guess what? It's not a new species!
Hey Percy, now it's you that is either using genes in a funny way or getting the biology mixed up. Alternatively you may be using yet another species concept.
If we go by the biological species concept of interbreeding potential as the criterion for distinct species then it is quite possible to produce reproductively isolated species without changing the gene sets in terms of the broader conception of a gene, i.e. a specific genetic locus which is associated with a heritable trait rather than any particular exact sequence of nucleotides at that locus.
At base as few as 2 SNPs might be sufficient to establish reproductive isolation between two populations. We could envision an initial isolation event followed by subsequent evolution leading to the origination and fixation of these SNPs, either in one population or split between them. Alternatively they may already be extant in one population, with the presence of other genetic variants allowing sufficient gene flow to keep them within one species; if the mediating genetic variations are subsequently lost from the population you have potential sympatric speciation.
Maybe WK can confirm, but I get the impression that while we might know a great deal about some genes, the majority are vague and amorphous and possibly only known to exist by some gross estimation process.
Again it all comes down to what we mean by a "gene". In terms of the traditional protein coding sequence conception of a gene we have things pretty tied up, open reading frame (ORF) detection is a pretty mature technology and error checking approaches have been set up to screen out false positives (a few years ago about 5,000 putative coding sequences were scrubbed (Clamp et al., 2007)) although thorough annotation is still lacking in most genomes outside of mouse, human and chimp. Also there is still quite a bit of work to be done tidying up the multiple isoforms of proteins that many genes produce. Where there is considerably more doubt is in the areas of regulatory sequences and non-coding RNAs, areas that while not matching the conception of a gene as a protein coding sequence match the more traditional Mendelian idea of a gene as a discrete heritable genetic locus with variants associated with a trait/phenotype.
Another study estimated that between them humans and chimps have 6% of their genes different through loss or gain (1,418 of 22,000 genes) (Demuth et al., 2007), but there is some reason to think that they overestimate the numbers; one comment on the article describes several possible artefactual problems with their approach.
Let me get this right, 97% of our genome (maybe 98%) consists of apparent junk (no function identified as of yet)
I think you are mixing up a whole lot of things here. About 2% of the human genome consists of protein coding genes. Junk DNA is not simply everything that isn't a protein coding gene, that is a very naive conception such as one might find in a newspaper science article. Even in the 70s when the term was coined there were already a number of known functions for non-coding stretches of DNA, there are structural elements like telomeres and centromeres which have very clear roles in mitosis and chromosome stability, there are regulatory sequences where transcription factors bind which mediate the timing and extent of expression of specific genes and there are genes for non-coding RNAs which form part of the machineries of protein synthesis. Since the 70s we have discovered more functional elements, such as non-coding RNAs which have regulatory roles.
There are still, however, large regions with no readily apparent function. These include large amounts of repetitive DNA, some simply the same 2 or 3 nucleotides repeated multiple times, others longer sequences between 10 and 60 bps. This sort of repetitive DNA makes up around 40% of the genome and overlaps to some extent with the other major class of sequences which make up most of the rest. This other class is retrotransposons. Most retrotransposons in the human genome are no longer active but they make up ~42% of the human genome; these are mostly variants on 2 types of sequence, LINEs (long interspersed elements) and SINEs (short interspersed elements). Another class of sequence usually considered junk are pseudogenes, which are the normally non-functional remnants of genes and gene duplicates. Some neo-functionalised pseudogenes have been identified but only a handful.
So really the vast bulk of DNA that is shared across the species is junk
Not necessarily, the vast majority of DNA is shared between humans and chimpanzees, including that in 'Junk' DNA. Some biologists do argue that the vast majority of DNA is junk, in which case you would be right, and that its conservation across species is strong evidence in favour of common ancestry.
in the case of chimpanzees our genes are actually 70% different?
I assume you get this figure from when I quoted ...
At the protein level, 29 percent of genes code for the same amino sequences in chimps and humans.
... if so then I apologise for not clarifying the statement from the article. When they say the sequences are "the same" they mean they are 100% identical at the amino acid level as opposed to the other genes which vary in a few amino acids, but not so many as to bring the typical amino acid divergence above 1 per protein.
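A small worked example in Python may help separate the two kinds of percentage being mixed up here. Apart from the 29% figure quoted above, every number in it is a hypothetical assumption chosen for illustration: the spread of proteins with 1, 2 or 3 differences and the uniform protein length of 400 amino acids are invented, not taken from the chimpanzee genome paper. The point is only that "29% of proteins are completely identical" is a statement about whole proteins, while similarity is normally measured per amino acid.

```python
def summarize(diff_counts, protein_length):
    """diff_counts maps 'number of amino acid differences' -> 'number of proteins'.

    Returns (fraction of proteins that are 100% identical,
             mean differences per protein,
             overall per-residue identity).
    protein_length is a single assumed length applied to every protein.
    """
    n_proteins = sum(diff_counts.values())
    identical_fraction = diff_counts.get(0, 0) / n_proteins
    total_diffs = sum(k * count for k, count in diff_counts.items())
    mean_diffs = total_diffs / n_proteins
    residue_identity = 1 - total_diffs / (n_proteins * protein_length)
    return identical_fraction, mean_diffs, residue_identity


# Hypothetical distribution for 100 proteins: 29 identical (the quoted figure),
# the rest differing at only 1 to 3 positions (invented numbers)
toy_counts = {0: 29, 1: 50, 2: 15, 3: 6}
ASSUMED_LENGTH = 400  # hypothetical uniform protein length in amino acids

if __name__ == "__main__":
    identical, mean_diffs, identity = summarize(toy_counts, ASSUMED_LENGTH)
    print(f"proteins 100% identical: {identical:.0%}")
    print(f"mean differences per protein: {mean_diffs:.2f}")
    print(f"per-residue identity: {identity:.2%}")
```

Under these assumptions only 29% of proteins are identical end to end, yet the proteins are still well over 99% identical at the amino acid level, which is why the statistic does not translate into genes being 70% different.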
I hope that clears things up, no one is trying to mislead you.
My approach of starting simply and gradually adding complexity often draws objections from you. The species divisions getting attention in this thread, such as between chimp and human, include differences in gene sets, even in chromosomes because of the fusion of 2-into-1 possessed by humans.
Sure I get that, I just thought that you were making some pretty bald statements there that simply aren't backed up by the biology.
It is possible to project a better understanding onto replies than is actually there, encouraging one to continue the discussion at what can become for many an inaccessible level, especially when long familiarity with complex details causes some simple but important things to go unsaid.
Yeah, I get that a lot. To some extent I blame the Creationist/IDists since so many of them come in insisting that they know exactly what they are talking about.
Equally helpful is the concept of genetic uniqueness of a species, a point made most simply by assuming that each species has a unique gene set. Once that point is made and understood one can move on and say that even just allele differences can divide species.
I agree, I just think that the fact that it is an assumption for the purposes of demonstration needs to be in there somewhere rather than presenting it as a fact of biology that each species has a unique set of genes.
I usually round up or down anyway and my figures are usually based on not one source but an average of several sources.
The only figure I had an issue with was the one for ...
in the case of chimpanzees our genes are actually 70% different?
Which isn't really consistent with anything, which was why I thought you might just have misunderstood.
No one is going to bother quibbling over a difference of 1 or 2% in protein sequence identity, but going from 100% identical to 30% is a pretty dramatic difference, and the 30% figure isn't really supported by anything at all as far as I can tell. I'm pretty sure Percy wasn't thinking of the 97-98% figures when he asked you for a source.