Topic: Gitt Information from Evolution FairyTale
Admin Director Posts: 13014 From: EvC Forum Joined: Member Rating: 1.9 |
A discussion about Gitt information in the "Free Scientists Lecture !, Dr. Werner Gitt Genetic Information Specialist" thread at the Evolution FairyTale Discussion Board was temporarily closed by a moderator for being off-topic, and I'm inviting the participants to pick up the discussion here if they're so inclined.
|
|||||||||||||||||||
Percy Member Posts: 22479 From: New Hampshire Joined: Member Rating: 4.7 |
Well, it doesn't look like any of the thread's participants are interested in resuming here, so I'll describe the last issue being discussed in case anyone here wishes to comment.
Deadlock posed this problem:
deadlock writes: Please, show me what is the probability of 3 mutations happen in 3 specific spots in a bacteria genome of 10^7 positions in 10^6 tries.

And when I gave the "wrong answer" he responded with this solution:

deadlock writes: It's a Bernoulli Distribution.
P(Y) = Cn,y * p^y * q^(n-y)
y = number of success
n = number of tries
p -> We have three possible bases to change in each position, so the probability of mutating a specific base in a specific point is: 1/3^(10^7)
q -> 1 - p
so P( Y >= 3 ) = 1 - ( P(Y=0) + P(Y=1) + P(Y=2) )

Where Deadlock writes "Cn,y" he means combination, i.e., "n things taken y at a time".

As part of his solution Deadlock claims that the probability of a mutation at a specific point in the genome is (1/3)^(10^7) (nice to be back in a land where HTML in messages is legal). When the thread was closed I was attempting to explain to Deadlock why this was incorrect: the value is millions of orders of magnitude smaller than a googol-th, which is ridiculous. But he stood by it, and then the thread was closed.

Does anyone have any idea where Deadlock might have gotten the idea that this was the correct probability? What's the correct value, and how would you persuade Deadlock that it is the correct value?

--Percy

Edited by Percy: Grammar.
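To give a sense of scale for the value Deadlock stood by, here is a minimal sketch that compares (1/3)^(10^7) with a googol-th (10^-100) by working in base-10 logarithms, since the number itself is far too small for a floating-point variable. The variable names are illustrative only.

```python
import math

# Deadlock's claimed per-site probability is (1/3)^(10^7).
# It underflows a float, so compare base-10 logarithms instead.
log10_claimed = -(10**7) * math.log10(3)

# A googol-th is 10^-100, so its base-10 logarithm is -100.
log10_googolth = -100.0

# How many orders of magnitude the claimed value sits below a googol-th.
orders_below = log10_googolth - log10_claimed

print(f"log10 of (1/3)^(10^7): {log10_claimed:,.1f}")
print(f"orders of magnitude below a googol-th: {orders_below:,.1f}")
```

The logarithm comes out a little below -4.7 million, so the claimed probability has millions of zeros after the decimal point before its first significant digit.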
|
|||||||||||||||||||
PaulK Member Posts: 17825 Joined: Member Rating: 2.2 |
To put the obvious first it's so clear that he meant 1/(3 * 10^7) that I'd have asked him straight out if it was a typo. If he insisted then I'd point out that 3 base pairs per codon for 10^7 codons is obviously 3*10^7 base pairs. If that didn't work I'd ask him to explain himself to find out just what absurdity he had in mind.
Looking at the thread:
quote: That works if the probabilities p and q are constant. Because he insists on mutations in 3 presumably different bases it isn't that simple. It can be used as an estimate, but there is a corrective factor that needs to be applied.
quote: With the correction noted above this would be a reasonable probability if a "try" was a mutation. (i.e. if a mutation occurs, assuming equiprobability, the chance of it occurring at a particular location is 1 divided by the number of locations). Unfortunately he states:
quote: A try is a reproduction event. The only moment when a mutation can happen and be passed on

In which case he should forget about the genome size and just use the per-base probability of mutation (a simpler calculation, since there is no need to invoke combinatorials).

Assuming that a "try" is a mutation:
quote: q -> 1 - p so P( Y >= 3 ) = 1 - ( P(Y=0) + P(Y=1) + P(Y=2) )

The probability p is the probability of getting ONE specific mutation. But we don't want one specific mutation three times. We want one of three mutations, then one of the remaining two, then the last one (the order doesn't matter). Since there are 6 ways that this could happen, the real probability is about 6 times higher. I estimate it as about 4*10^-5 (if I did the calculation right). A simpler estimate (multiply the probability of one occurring by 10^6 and cube the result) comes out at about the same. The real probability ought to be a bit lower, but not greatly so.
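The two estimates in this post can be reproduced with a short sketch. It assumes, as the post does at this point, that a "try" is one mutation landing somewhere in the genome and that p = 1/(3*10^7); the helper `binom_pmf` and the variable names are illustrative, not anything taken from the original thread.

```python
import math

n = 10**6            # tries, each assumed to be one mutation somewhere in the genome
p = 1 / (3 * 10**7)  # assumed chance that a mutation is one specific change at one specific site

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n tries."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Deadlock's formula: at least 3 occurrences of ONE specific mutation.
p_at_least_3 = 1 - sum(binom_pmf(k, n, p) for k in range(3))

# Correction described in the post: 6 orderings of three different mutations.
corrected = 6 * p_at_least_3

# Simpler estimate: expected hits on one site (n * p), cubed.
simple = (n * p) ** 3

print(f"6 * P(Y >= 3): {corrected:.2e}")
print(f"(n * p)^3    : {simple:.2e}")
```

Both figures land between 3*10^-5 and 4*10^-5, in line with the "about 4*10^-5" quoted above.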
|
|||||||||||||||||||
Percy Member Posts: 22479 From: New Hampshire Joined: Member Rating: 4.7 |
PaulK writes: To put the obvious first it's so clear that he meant 1/(3 * 10^7) that I'd have asked him straight out if it was a typo. If he insisted then I'd point out that 3 base pairs per codon for 10^7 codons is obviously 3*10^7 base pairs. If that didn't work I'd ask him to explain himself to find out just what absurdity he had in mind.

I think he may really have meant (1/3)^(10^7), because 10^7 is the number of base pairs in his genome. I would argue that he was thinking, "There are three wrong bases for any position, and the probability of getting one of those three wrong bases is 1/3. So the probability of the first base pair in the genome being wrong is 1/3, and the probability of the second base pair being wrong is 1/3, and so forth, so multiply 1/3 by itself 10^7 times." Which is, of course, completely bogus.

What he really wanted to do, if he was determined to use that style of approach, was properly figure out the probability of an error for a base pair, and 10^-8 seemed to be an acceptable figure to him. This means that the probability of all the base pairs being right would be:

(1 - 10^-8)^(10^7)

This happens to be around .9, which means there's about a .1 probability of one or more mutations. This is so similar in form to his (1/3)^(10^7) that it makes it seem likely that he was trying to use this particular approach, but what he did was both wrong (the 1/3) and too simplistic (this has to be calculated, it isn't something simple that you write down off the top of your head).

So the probability of three mutations at specific predesignated positions in a single reproductive event is:

(1 - 10^-8)^(10^7 - 3) * (10^-8)^3 = 9.08*10^-25

I'm on the edge of my competency here, so let me know if I've gone wrong. Anyway, it is indeed a very small number. As I told Deadlock, predesignating the positions isn't the way evolution operates. Natural selection doesn't sit around waiting for beneficial mutations, it just works on what it has. And if you apply 10^6 Bernoulli trials like this you still get a very small number:

C(10^6,1) * 9.08*10^-25 * (1 - 9.08*10^-25)^(10^6 - 1) = 9.08*10^-20

This is unsurprising because of the unlikelihood of three mutations in predesignated locations in a single reproductive event. As I explained to Deadlock several times, this isn't the way evolution works because of multiple offspring and generations, and because of the post facto fallacy (is that the proper name for this fallacy, calculating the odds of what really happened, like winning a lottery, when all the outcomes were equally unlikely but one of them has to happen?).

Anyway, that's my attempt at things, not the same answer you got, but I had a different interpretation of this problem. What do you think?

--Percy
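The three quantities in this post can be checked with a few lines of Python. This is only a sketch using the figures given above (10^-8 per base, 10^7 base pairs, 10^6 reproduction events); the variable names are made up for illustration, and `math.log1p` is used so that subtracting tiny probabilities from 1 does not lose precision.

```python
import math

p_site = 1e-8     # per-base mutation probability per reproduction (figure from the post)
genome = 10**7    # base pairs in the genome
tries = 10**6     # reproduction events

# Chance that no base at all mutates in one reproduction: (1 - p)^genome
p_no_mutation = math.exp(genome * math.log1p(-p_site))

# Three designated sites mutate while every other site stays unchanged.
p_three = math.exp((genome - 3) * math.log1p(-p_site)) * p_site**3

# Exactly one such event in `tries` reproductions (binomial with k = 1).
p_exactly_once = tries * p_three * math.exp((tries - 1) * math.log1p(-p_three))

print(f"no mutation in one reproduction : {p_no_mutation:.4f}")
print(f"three designated sites, one try : {p_three:.3e}")
print(f"exactly once in 10^6 tries      : {p_exactly_once:.3e}")
```

Run this way, the first value is about 0.905, the single-event probability comes out near 9.05*10^-25 (a shade below the 9.08*10^-25 quoted), and the final figure has an exponent of -19.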
|
|||||||||||||||||||
PaulK Member Posts: 17825 Joined: Member Rating: 2.2 |
If you use the probability of a mutation at a single base in a single reproductive event you can do the calculation relatively easily.
If the probability is p, the probability of getting mutations at three specific points in n reproductive events would be (1 - (1-p)^n)^3.

If you assume that a specific replacement is required (not necessarily true, because of the redundancy in the genetic code) you get p by multiplying the chance of getting any mutation by the chance of getting the right replacement (you can assume 1/3 to simplify, but some substitutions are more likely than others).

The probability of getting all three in a single reproductive event is obviously going to be p^3. So if the probability of a base pair mutating is 10^-8, the probability of all three mutating is 10^-24, which is close to your estimate but a much simpler calculation!

Your second calculation is also not the obvious approach. If you want to calculate the chance of getting all three in a single attempt in 10^6 reproduction events (itself an odd thing to do) you'd normally just calculate the probability of it not happening and subtract from 1, i.e. 1 - (1 - 10^-24)^(10^6) (which means you need 25 places of precision just to do the subtraction in the brackets!)
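The precision remark is easy to demonstrate. The sketch below evaluates both formulas from this post, taking p = 10^-8 and n = 10^6 as assumed inputs, and shows that the naive subtraction collapses to zero in ordinary double-precision arithmetic, while `math.log1p` and `math.expm1` keep the small terms. Variable names are illustrative.

```python
import math

p = 1e-8      # assumed per-base, per-reproduction mutation probability
n = 10**6     # reproduction events

# Each of three sites mutates at least once somewhere in n reproductions:
# (1 - (1 - p)^n)^3, with 1 - (1 - p)^n computed stably.
p_site_hit = -math.expm1(n * math.log1p(-p))
p_all_three_eventually = p_site_hit**3

# All three in a single reproduction, at least once in n reproductions:
# 1 - (1 - p^3)^n
p_single = p**3
naive = 1 - (1 - p_single) ** n                    # 1 - 1e-24 rounds to exactly 1.0
stable = -math.expm1(n * math.log1p(-p_single))    # keeps the tiny term

print(f"(1 - (1-p)^n)^3        : {p_all_three_eventually:.3e}")
print(f"naive 1 - (1-p^3)^n    : {naive}")
print(f"stable 1 - (1-p^3)^n   : {stable:.3e}")
```

The naive line prints 0.0 because a double carries only about 16 significant digits, which is exactly the problem with doing the subtraction in the brackets directly.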
|
|||||||||||||||||||
Percy Member Posts: 22479 From: New Hampshire Joined: Member Rating: 4.7 |
PaulK writes: Your second calculation is also not the obvious approach. If you want to calculate the chance of getting all three in a single attempt in 10^6 reproduction events (itself an odd thing to do) you'd normally just calculate the probability of it not happening and subtract from 1, i.e. 1 - (1 - 10^-24)^(10^6) (which means you need 25 places of precision just to do the subtraction in the brackets!)

Yep, that's the same answer I get, 9.08*10^-19 (I counted decimal places wrong; the exponent of 10 in my previous message should have been -19, not -20). I was trying to use the approach Deadlock insisted on using for Bernoulli trials, so it sounds like you're saying I did it properly? I'm amazed!

I used to be more intuitive with probabilities. I was able to follow the approach you used pretty easily, but I'm not sure I would have got there on my own. I use that skill very rarely these days and it is seriously rusty. Thanks for the help!

--Percy
|
|||||||||||||||||||
PaulK Member Posts: 17825 Joined: Member Rating: 2.2 |
quote:

Strictly speaking it's not exactly the same as the simpler calculation I used. It's the probability of it happening exactly once as opposed to the probability of it happening at least once. Not that it makes a lot of difference when the expected number of successes is << 1, and it's not clear that either one is more accurate anyway.

My own probability is pretty rusty, although I occasionally do the odd exercise (and it was one of my better areas, too). If you learn just one thing about probability, it's that you have to be really, really careful about what you're doing and try to understand exactly what your calculations mean. Too many people just put down something that makes intuitive sense and get it badly wrong. (If I can find the link I can point to a calculation that Lee Spetner, who really ought to have known better, badly bodged.)
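How little the "exactly once" versus "at least once" distinction matters here can be shown with Python's `decimal` module, which can carry enough digits to resolve the gap. This is a sketch only: q = 10^-24 and n = 10^6 are the figures used earlier in the thread, and the precision setting and variable names are my own choices.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60   # enough digits to resolve a difference near 10^-37

q = Decimal("1e-24")     # all three mutations in one reproduction (figure from the thread)
n = 10**6                # reproduction events

# Binomial probability of exactly one success in n tries.
exactly_once = n * q * (1 - q) ** (n - 1)

# Probability of one or more successes in n tries.
at_least_once = 1 - (1 - q) ** n

# Size of the gap relative to the answer itself.
relative_gap = (at_least_once - exactly_once) / at_least_once

print(f"exactly once  : {exactly_once:.6E}")
print(f"at least once : {at_least_once:.6E}")
print(f"relative gap  : {relative_gap:.2E}")
```

With these inputs the two answers agree to about 18 significant figures, which fits the point that the choice hardly matters when the expected number of successes is far below 1.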
|
Copyright 2001-2023 by EvC Forum, All Rights Reserved