Author Topic:   Recent paper with an ID spin? Abel and Trevors (2005).
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 33 of 85 (516284)
07-24-2009 12:54 PM
Reply to: Message 3 by Percy
09-24-2005 8:51 AM


quote:
My own opinion is that this is full-blown ID.
Is there something wrong with that?
quote:
The peer-review process of Theoretical Biology and Medical Modelling is seriously broken.
Why? Because they allowed something to be published that you do not approve of?
quote:
I've quoted the 3rd paragraph of Shannon's paper to Creationists many times ("Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem."). Abel and Trevors avoid the mistake of confusing Shannon information with meaning, but they proceed on to concoct replacement principles of information theory out of thin air. They provide no underlying mathematical foundation, or any other kind of foundation. They simply make assertions.
No, they do not "replace" information theory. They build on it. Shannon's description of information is the first and lowest-level description of information there is. It deals only with the statistical aspect of information. There are still syntax, semantics, pragmatics and apobetics to deal with.
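To see how purely statistical that lowest level is, here is a minimal sketch of my own (not anything from Abel and Trevors): Shannon's measure assigns the same value to a message and to any rearrangement of its symbols, because it only sees symbol frequencies, never meaning.

```python
import math
from collections import Counter

def shannon_entropy(message):
    """Shannon entropy in bits per symbol, computed from symbol frequencies alone."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

msg = "THE CAT SAT ON THE MAT"
print(shannon_entropy(msg))                   # entropy of the meaningful message
print(shannon_entropy("".join(sorted(msg))))  # identical value: same letters, meaning destroyed
```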
quote:
We have extended Shannon uncertainty by incorporating the data variable with a functionality variable. The resulting measured unit, which we call Functional bit (Fit), is calculated from the sequence data jointly with the defined functionality variable. To demonstrate the relevance to functional bioinformatics, a method to measure functional sequence complexity was developed and applied to 35 protein families.
As you can clearly see, they went on to publish another paper in which they actually do make mathematical calculations for functional biological information. So there is nothing wrong with their work extending Shannon's notion of information, which supports ID.
Measuring the functional sequence complexity of proteins | Theoretical Biology and Medical Modelling | Full Text
quote:
I think they say this because they don't want Intelligent Design confused with Creation Science, which they view as a distinctly different discipline.
Well isn't it different?
quote:
It's become common for those in the ID movement to distance themselves from the largely unsuccessful Creation Science movement.
I don't know how Creation Science is unsuccessful, but I see nothing wrong in defining terms properly. Do you?
quote:
My own view of this is that Creation Science and ID are in an uneasy and unacknowledged alliance based upon the "the enemy of my enemy is my friend" principle.
Going a bit off topic here aren't we?
quote:
Creation Science groups like ICR are quietly watching from the sidelines hoping that, even though skeptical of many ID positions, enough disruption is caused with science to provide them further openings.
Just like naturalists were waiting for their chance in the early 19th century? Right?

This message is a reply to:
 Message 3 by Percy, posted 09-24-2005 8:51 AM Percy has seen this message but not replied

Replies to this message:
 Message 34 by Wounded King, posted 07-27-2009 11:50 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 35 of 85 (518067)
08-03-2009 10:22 PM
Reply to: Message 34 by Wounded King
07-27-2009 11:50 AM


quote:
The problem as I see it with this approach is that the actual functionality component is so subjective. The functionality variable is pretty much the only thing to differentiate this approach from any other purely sequence comparison/conservation based approach.
It's not really subjective. Anyone who examines a specific part of the genome will find the same objective function there. Biological functions are objective.
quote:
For instance if we choose a set of sequences based on their functional ability to bind a specific DNA sequence how do we know that it isn't simply generic DNA binding ability or some entirely other commonality of function effect that is being identified?
We select them by which biological function they perform. Binding is not a function. Coding for an eye is. Coding for a flagellum is, since those structures perform biological functions.
quote:
This is both the most important and by far the least satisfactory part of their methodology, since in their own dataset analysis they don't actually tell us what the functionality variables for the different protein families were or what they were derived from. As far as I can tell from looking at the python program they include, the function variable is simply derived from the sequence analysis. This seems to make the whole thing nothing more than an exercise in saying 'I think all these sequences share some common function', and you have to question how well such an approach would be suited to analysing anything other than exactly the sort of protein sequence sets they look at in the paper, and even then the extent to which you could ascribe any similarities to the specific shared functional criteria.
I really don't see where the problem is. If you know the protein performs a function, you just need to find its sequence in the genome, and you have found the functional part of the DNA sequence for that biological function.
quote:
If 'Fits' and 'FSC' are an effective way of identifying common functionally significant sites between sequences then that is all to the good. The presupposing of exactly what function you are looking for seems an unnecessary additional step, especially when using pfam family membership is apparently the only selection criteria used for discriminating function.
But we are supposed to measure biological functions. If we don't find them in a given sequence, then what are we measuring?
quote:
They also leave open the problem they recognise in terms of only having a best guess at what the whole functional sequence space is based on the collated functional sequences. The possibility of unknown functional sequences means that an underestimation of the likelihood of any particular function arising is almost unavoidable. That is even before considering functionally equivalent but genetically/structurally distinct proteins.
This just means we are not measuring nothing and calling it information. The more functions there are, the better. With more work, we will be able to analyze more functions.
quote:
This definitely seems like a worthwhile endeavour if it could discriminate something more than traditional sequence analysis methods do, but it certainly doesn't offer much support for ID, at least beyond being the prelude to an argument from big numbers based on a calculation of something like the "FSC value for an entire prokaryotic cell where the genome has been sequenced and all translated proteins are known" which they mention in the discussion. Such a calculation would of course serve to compound any errors arising from the other more subjective elements of the process.
Any information-based research is indirect support for ID, because ID is based on the idea that design is only the product of a designing mind that produces information.

This message is a reply to:
 Message 34 by Wounded King, posted 07-27-2009 11:50 AM Wounded King has replied

Replies to this message:
 Message 36 by Wounded King, posted 08-04-2009 3:54 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 37 of 85 (518210)
08-04-2009 3:31 PM
Reply to: Message 36 by Wounded King
08-04-2009 3:54 AM


quote:
If our knowledge were perfect I might agree, but there is a high possibility of our failing to recognise a function. The function is not subjective, but our knowledge and understanding of it can be.
But if someone does find a function and tells someone else, he will also see it. It's not an invention; it's really there.
I mean, are you saying that the beating of the heart is subjective?
quote:
Have you even read the Durston et al. paper? If you had, surely you would understand that the very data they test their approach with were selected based on the way they were assigned to PFAM families, which is based on similarities of functional structural domains for activities such as protein-protein or protein-DNA interactions. They are certainly not classified as 'coding for an eye' or 'coding for a flagellum'.
But those sequences have biological functions by default. Those sequences code for proteins, and the flagellum is made from proteins. So where do you see the problem?
quote:
That isn't information science, that is simply basic biology. You identify that a functional change has occurred from a change in phenotype, and then use genomic sequence comparisons or classical genetics to identify where the change occurred, and therefore the functional sequence which was altered, leading to the change. That isn't what Durston et al. did.
They don't have to do it. They already know, as everybody does, that protein-coding sequences have biological functions.
quote:
Maybe we should see if we can get the program up and running and try it out in a whole set of sequences from something like a common developmental pathway? How does that sound?
I'm skeptical myself, since their method seems to rely so heavily on sequence alignment. I'm happy to accept that they can identify conserved functional sites within a protein family; so can many sequence analysis methods. I am doubtful their method will work for a heterogeneous set of sequences linked by a function such as 'build an eye' or 'build a flagellum'. Do you think it would work?
Why should it not work, since it is well known that protein-coding sequences code for biological functions?
quote:
It is only support if that is actually true. You keep missing the point that not everyone already accepts your assumptions; if they did, we wouldn't be having a discussion.
True, but the problem is, if someone does not accept it, then they simply have blind faith. I have never seen information arising from a natural process without an intelligence to guide it. Have you?

This message is a reply to:
 Message 36 by Wounded King, posted 08-04-2009 3:54 AM Wounded King has replied

Replies to this message:
 Message 38 by Wounded King, posted 08-04-2009 5:49 PM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 41 of 85 (518497)
08-06-2009 10:46 AM
Reply to: Message 38 by Wounded King
08-04-2009 5:49 PM


quote:
No, I'm saying that different levels of understanding of the functioning of the heart can give rise to subjectivity in discussion of it, and that the same is true of functions at the molecular level.
So you agree that there are objective biological functions?
quote:
In your saying that it isn't protein-binding function that is being looked at, when anyone who had read the paper would see that that is exactly the sort of function they were looking at, and 'coding for a flagellum' was not.
I still don't get what you are trying to say. Are you saying that sequences do not represent proteins that code for the flagellum, or any other biological machine?
quote:
But they don't know which sequences have what function in all cases, even when they know what process the protein is involved in they may still not understand its molecular function.
It doesn't matter. If it has a function, it has one. The point is to measure the functions, not to explain what they do in detail.
quote:
Because only an idiot would think that the point of the paper is to point at protein coding sequences and say 'these have some function'.
But I never said that.
quote:
The point is to identify regions in the proteins with functional information and from that to be able to derive a value for the functional information of the whole protein and possibly from there to higher levels of organisation up to whole genomes.
Exactly, that's what I've been saying the whole time.
quote:
The reason I don't think it will work is because their method is heavily reliant on protein sequence alignment, which will obviously work for PFAM families, but all the proteins involved in a specific function will be hugely structurally diverse, making such an alignment virtually impossible; how, therefore, would such an analysis proceed? Durston et al. unfortunately leave this question simply hanging.
And why exactly should their differences in sequence be an obstacle?

This message is a reply to:
 Message 38 by Wounded King, posted 08-04-2009 5:49 PM Wounded King has replied

Replies to this message:
 Message 43 by Wounded King, posted 08-06-2009 10:58 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 44 of 85 (518518)
08-06-2009 12:07 PM
Reply to: Message 43 by Wounded King
08-06-2009 10:58 AM


quote:
Because they only show how to measure FSC for a set of aligned sequences with a shared putative function.
And you thought that FSC was a perfect method that can measure anything on its first go? Didn't you think that the scientists needed time to improve their models to be able to do more?

This message is a reply to:
 Message 43 by Wounded King, posted 08-06-2009 10:58 AM Wounded King has replied

Replies to this message:
 Message 45 by Wounded King, posted 08-07-2009 3:48 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 46 of 85 (518827)
08-08-2009 4:51 PM
Reply to: Message 45 by Wounded King
08-07-2009 3:48 AM


quote:
They claim you can apply their method to similarly functional but molecularly dissimilar proteins. I think that if they make some claims they should be able to support them with something. I can see how you wouldn't think so, though, since you absolutely refuse to support any of your own claims beyond bare repetition.
I always support my claims, but when I get the same questions asked over and over again, what else should I do?
quote:
Well, have they? It's been two and a half years since the paper was published.
Are we talking about the same 2005 paper? Anyway, you have got to give people some time.

This message is a reply to:
 Message 45 by Wounded King, posted 08-07-2009 3:48 AM Wounded King has replied

Replies to this message:
 Message 47 by Wounded King, posted 08-09-2009 5:43 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 48 of 85 (520176)
08-19-2009 5:57 PM
Reply to: Message 47 by Wounded King
08-09-2009 5:43 AM


quote:
Well, not to put too fine a point on it, you should do exactly what you just claimed you do, but anyone looking at either this thread or the other one about information can see that simply isn't true.
So you are saying I never ever posted a link that I used as an argument?
quote:
No, we are talking about the 2007 Durston et al. paper that you introduced to the discussion. I understand that it takes time to do research but shouldn't they have done the research before claiming they could do something?
I thought we were talking about the 2005 paper. Listen, it's very simple. They know what they are doing. If you have certain problems with their work, then yes, I agree, nobody's work is perfect. But that doesn't mean it's flat-out wrong, and that they can't use the paper they wrote for anything.

This message is a reply to:
 Message 47 by Wounded King, posted 08-09-2009 5:43 AM Wounded King has replied

Replies to this message:
 Message 49 by Wounded King, posted 08-21-2009 6:07 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 50 of 85 (520515)
08-21-2009 8:10 PM
Reply to: Message 49 by Wounded King
08-21-2009 6:07 AM


quote:
Have Durston et al. expanded Abel and Trevors' work to get an algorithm for producing some measure of complexity from sequence alignments? Yes, but not any better than half a dozen already extant methods. Have they shown how to use this method to analyse a heterogeneous set of sequences? No. Have they given us usable criteria for 'function' that we can use for selection? No.
Let's see now...
Measuring the functional sequence complexity of proteins | Theoretical Biology and Medical Modelling | Full Text
This is their method of computing the FSC. They used protein sequences from the PFAM database, which were then compared and measured. What do you think is wrong with this method?
quote:
As far as evidence supportive of ID goes, all they have done is assume that they can factor all of these 'FSC' measurements together to get a meaningful overall value for a system or organism. But they give us no reason to join them in assuming that this is actually the case. Even if we allow this, it only helps ID if we accept it as a measure of probability and further accede to concepts like the 'universal probability bound' being relevant.
Why would they not be able to put all those values together? The only reason I can think of is that you think some of them could have evolved. But could they have done that?

This message is a reply to:
 Message 49 by Wounded King, posted 08-21-2009 6:07 AM Wounded King has replied

Replies to this message:
 Message 51 by AdminNosy, posted 08-21-2009 8:21 PM Smooth Operator has replied
 Message 53 by Wounded King, posted 08-22-2009 4:15 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 52 of 85 (520537)
08-22-2009 12:28 AM
Reply to: Message 51 by AdminNosy
08-21-2009 8:21 PM


Re: bare links again
quote:
Let's see you actually explain things in your own words SO. You'll start to get short suspensions if you don't.
I did explain it in short. Am I supposed to explain the whole paper in full detail?

This message is a reply to:
 Message 51 by AdminNosy, posted 08-21-2009 8:21 PM AdminNosy has replied

Replies to this message:
 Message 54 by AdminNosy, posted 08-22-2009 9:35 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 55 of 85 (520704)
08-23-2009 9:12 AM
Reply to: Message 54 by AdminNosy
08-22-2009 9:35 AM


Re: bare links again
quote:
Where are your words on the method to analyze a heterogeneous set? Where are the usable criteria for function? Either explain them or explain why they aren't needed.
I see, no problem.
I will explain it soon. I'll be back in a week or so when my exams are done. I should really be going to study now.
So, yeah, see you in about a week again...

This message is a reply to:
 Message 54 by AdminNosy, posted 08-22-2009 9:35 AM AdminNosy has not replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 64 of 85 (522913)
09-06-2009 9:39 AM
Reply to: Message 53 by Wounded King
08-22-2009 4:15 AM


quote:
So where are the usable criteria for function?
What do you mean by this?
quote:
A PFAM family is based upon structural similarity, not function per se. I think the problem with this method is that they claim it can be used for genes/proteins with similar functions but highly divergent molecular structures, yet given that their method is based on sequence similarity, they give no explanation of how one could do this. Their method seems redundant when compared to more sophisticated metrics like BLOSUM, which take into account the physicochemical properties of the amino acids.
But they already assume a function in the first place. That is why they said that the genome of a certain organism has to be sequenced and studied first, so that you can apply the method to other proteins.
What exactly does BLOSUM measure? The functionality of the protein, or just the complexity?
quote:
Well of course they can. There is nothing creationists and IDers love better than multiplying a whole lot of probabilities together to come up with one really tiny probability and then declaring that it shows evolution is impossible. The question is whether doing so actually has any sort of meaning or utility.
Why wouldn't it? Are you saying that nature can take apart a problem and deal with it bit by bit? No, it can't; it has to deal with the whole thing. When you multiply all the probabilities together, you come up with a really small probability, and yes, random chance has to resolve it all.
quote:
The FSC is really nothing more than a slightly modified measure of conservation. It in no way accounts for all the possible unknown sequences which can also fulfil a particular function.
That is true, but it gives us an estimate. The only thing we should really be interested in there is that it can tell FSC apart from OSC and RSC.
quote:
So, as usual, the probability measurement is only ever going to be for the exact set of genetic structures that have been fed into it. This means that rather than the likelihood of any molecule evolving to perform a particular function, you are only looking at the likelihood of the exact molecular set you were studying evolving. So you will no doubt have a very small number, but sets of die rolls with infinitesimally small probabilities are generated every day.
As I said, you will have an estimate, but again, you can't just say that ANY sequence could have evolved to be the right one, because we know for a fact that a lot of sequences have no biological meaning. They are useless. So we are basically estimating where an island of functionality lies in a sea of meaninglessness.

This message is a reply to:
 Message 53 by Wounded King, posted 08-22-2009 4:15 AM Wounded King has replied

Replies to this message:
 Message 66 by Wounded King, posted 09-07-2009 8:00 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 65 of 85 (522915)
09-06-2009 9:45 AM
Reply to: Message 54 by AdminNosy
08-22-2009 9:35 AM


Re: bare links again
quote:
This is your whole explanation?
Yes, it is. Look at how short the method is. What else should I say?
They were basically going amino acid by amino acid and trying to find how many times each one repeats at the same position in other protein sequences. The more often you find that specific amino acid in the right place, the more likely it is that it is contributing to the function. And of course, you have the cutoff that represents the minimum number of amino acids a column should have. If there are not enough of them, they could still contribute to the FSC count, but in reality they are random sequences.
quote:
The ∆H for each column in the array was computed by moving through each column in the set of aligned sequences and compiling the number of occurrences of each amino acid in each column. The estimated probability P for each amino acid represented in the column was equal to the number of occurrences of that amino acid in that column divided by the total number of aligned sequences in the set. There are numerous columns in a flat set of aligned sequences that contain insertions or deletions. These are usually in regions that have considerable flexibility and may, or may not contribute to the functional complexity of the protein. To avoid the effect of the columns representing indels, a cutoff value was input into the program. The cutoff value represented the minimum number of amino acids occurring in a column divided by the total number of aligned sequences. If the total number of amino acids in a given column was below the cut-off value, due to a large number of indel-produced vacancies in the column, then the value of ∆H was automatically set to zero. Since these regions indicate little or no sequence conservation, they are already close to the null state, so setting ∆H = 0 is a reasonable move and prevents such columns from a spurious contribution to the overall FSC of the protein. The cutoff value was adjusted from a minimum of .55 to a maximum of .89 such that the number of remaining sites to be evaluated was very close to the standard protein length suggested by Pfam. The total number of remaining sites was output as the size of the protein. The ∆H for each column was summed and then output as the estimation in Fits of the FSC of the protein. The FSC density of each protein was computed by dividing the estimated number of Fits by the size N of the protein.
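Reading the quoted procedure literally, a minimal Python sketch of it might look like the following. Two caveats: the null-state entropy log2(20) for twenty equiprobable amino acids comes from how the rest of the Durston et al. paper defines ∆H, not from this excerpt, and all names here are mine, not theirs.

```python
import math

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
H_NULL = math.log2(20)  # assumed null state: 20 equiprobable amino acids

def fsc_fits(alignment, cutoff=0.7):
    """Estimate FSC in Fits for a flat set of aligned, equal-length sequences.

    Any character outside the 20 standard amino acids is treated as an
    indel/gap. The paper adjusts the cutoff between .55 and .89; the 0.7
    default here is an arbitrary placeholder.
    """
    n_seqs = len(alignment)
    total_fits = 0.0
    n_sites = 0
    for column in zip(*alignment):  # walk the alignment column by column
        counts = {}
        for aa in column:
            if aa in AMINO_ACIDS:
                counts[aa] = counts.get(aa, 0) + 1
        # Indel-dominated columns are near the null state: their dH is set
        # to zero, i.e. the column is skipped and excluded from the size N.
        if sum(counts.values()) / n_seqs < cutoff:
            continue
        # P for each amino acid = occurrences / total aligned sequences,
        # exactly as the quoted text describes.
        h_column = -sum((c / n_seqs) * math.log2(c / n_seqs)
                        for c in counts.values())
        total_fits += H_NULL - h_column  # dH for this site
        n_sites += 1
    density = total_fits / n_sites if n_sites else 0.0
    return total_fits, n_sites, density
```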

This message is a reply to:
 Message 54 by AdminNosy, posted 08-22-2009 9:35 AM AdminNosy has not replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 67 of 85 (523333)
09-09-2009 3:20 PM
Reply to: Message 66 by Wounded King
09-07-2009 8:00 AM


quote:
I mean that, other than saying 'this PFAM family is all likely to share a common function', they give no usable criteria for how one can investigate any other aspect of biological function.
They give lots of examples of what they think it could be applied to, but none of them are doable the way they describe their process working. I can't derive an FSC value for the FGF signaling pathway by making an alignment of all the genes/proteins in the pathway, because they won't align. Am I supposed to make alignments for every element with whatever relatives I can find? Should they be within the same organism? From different species? They say you can compare functionally similar structurally distinct proteins, but they don't say how.
The main problem with this approach is that your measures could be entirely wrong, because you just don't know what sequences could perform a particular biological function; you only know which ones you have that putatively do perform that function. So you will always be overestimating the FSC.
Yes, well, you are supposed to measure the sequences whose functions you do know. What would be the point of measuring a sequence for which you do not know the function? Maybe it's totally useless.
quote:
Except their examples use proteins from multiple different species, not just one genome. And the proteins haven't just been sequenced; they have also been aligned and assigned to families based on structural similarity. It seems like all of the work has already been done here. What is Durston et al.'s program adding to the mix, except a slightly varied form of conservation metric, of which several already exist?
Exactly, it does just that, because they are trying to measure the conservation of specific amino acids through evolutionary history. I think that's not such a great approach, but what can I say... The more a specific amino acid is conserved, that is, the more species carry it at the same position in some sequence, the more probable it is that it is part of a biological function.
As for what they are adding: the measure of functionality. Shannon's model of information is not good enough for this job. It is enough if you want to measure how many bits you need in a communication channel, but not enough for biological functions.
quote:
It is a measure of the conservation of amino acids which doesn't just ask which sites are most conserved, as the Durston et al. program does, but also takes into account the physicochemical/functional properties of the amino acids by scoring substitutions based on a matrix of amino acid substitution scores derived from a number of highly conserved protein sequences. There are different Blosum matrices depending on the similarity of the sequences under investigation. Like Durston et al.'s FSC measure, it is calculated for each residue.
Hmm, well, that basically seems like the same thing as Durston's model. What's the measure of functionality that BLOSUM uses?
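For what it's worth, the BLOSUM matrices being discussed are easy to inspect directly. A small sketch, assuming Biopython is installed (the library choice is mine; nothing in the thread specifies it):

```python
# pip install biopython
from Bio.Align import substitution_matrices

blosum62 = substitution_matrices.load("BLOSUM62")

# BLOSUM entries are log-odds scores: positive means the substitution is
# observed more often than chance in conserved alignments (physicochemically
# similar residues), negative means it is observed less often (dissimilar).
print(blosum62["L", "I"])  # conservative hydrophobic swap: positive score
print(blosum62["L", "D"])  # hydrophobic to acidic: negative score
```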
quote:
No it didn't. Random chance at worst had to generate all of the genetic variability involved, something that mutational mechanisms do all the time.
That is basically all of it. Do you think one part of genetic variability would come from nowhere? And no, mutational mechanisms have never been seen to increase functional variability. They do cause differences in the genome and make different genomes, yes. But they have not been seen to create new functions.
quote:
Be that as it may it still doesn't obviate the grossly mistaken assumption that the way things are currently is the only possible functional conformation for things. So all you are calculating is the probability that exactly this form of biological complexity evolved. This sort of posterior probability calculation is completely meaningless, even if we were to accept that all the values you might wish to plug in were accurate.
But nobody said that. There are lots of possible sequences that can code for one particular function.
https://www.youtube.com/watch?v=vUeCgTN7pOo
Here, look at this presentation by Durston himself. He gives a formula in which M(Ex) is the number of different configurations that can perform a specific function. It's explained at about 01:30 into the video.
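For reference, the formula in question as Hazen et al. (2007) write it (and assuming the video uses the same form), with $N$ the total number of possible configurations and $M(E_x)$ the number of configurations achieving the degree of function $E_x$:

$$I(E_x) = -\log_2\!\left[\frac{M(E_x)}{N}\right]$$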
quote:
There were already plenty of methods available for distinguishing totally random sequences (RSC) and highly repetitive sequences (OSC), like the examples that were given, from functional coding sequences and functional non-coding sequences. As for the OSC example, they calculate that in a completely different way from all of the other ones, since if you actually use it with their program, i.e. give it a dozen sequences of poly-A from montmorillonite clay, it will give you a maximal FSC result.
Okay, I know. Now we have a new one, is that bad?
quote:
Indeed, with the most conservative estimate possible based on highly structurally similar proteins. And you are then adding up all of these conservative estimates to make a number with essentially no scientific meaning. One has to wonder why. It doesn't show that the system couldn't evolve. It doesn't even show that it is impossibly unlikely. It certainly isn't positive evidence for intelligent design. The only purpose of this seems to be rhetorical: to generate big numbers to impress people with.
If you look at the video I posted above, you will also notice that there is a limit to what natural processes can do, that is, how many Fits they can have produced since life began. If the number of Fits we find in living organisms exceeds the number of Fits nature can produce, then the best explanation is that an intelligence did it.

This message is a reply to:
 Message 66 by Wounded King, posted 09-07-2009 8:00 AM Wounded King has replied

Replies to this message:
 Message 68 by Wounded King, posted 09-10-2009 6:40 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 70 of 85 (524157)
09-14-2009 4:31 PM
Reply to: Message 68 by Wounded King
09-10-2009 6:40 AM


quote:
But while they claim you can derive an FSC value for a single biosequence, they only show how to do it in the context of a pre-existing alignment. I suspect this is because outwith the modified conservation metric they have no way of setting the function variable that isn't totally arbitrary. I could identify 6 non-aligning, structurally diverse proteins with similar functions; would this method let me compare their FSC? The paper seems to claim it would, but the method certainly doesn't, and the paper doesn't make it clear how it can be used in such a way.
From what I've seen, they are only talking about the PFAM families of proteins. You need proteins that are similar and can be aligned. It probably doesn't work any other way.
quote:
It's derived from the conservation of amino acids across multiple highly conserved proteins; the matrix is weighted so amino acids with similar functional physicochemical properties are scored higher than dissimilar ones. This is why I think it is superior to Durston et al.'s technique. They treat all substitutions as equal, with only the proportions at each individual site affecting the Fits for that site; the Blosum score, on the other hand, takes into account the likely functional effects of a particular amino acid substitution.
How exactly is this better than Durston's model? And how do they tell apart functional and non-functional sequences?
quote:
No, just redundant. Why re-invent the wheel more crudely?
Simply because there is time for it to be improved and to become better.
quote:
I have to ask, why do you think he used the equation from the Hazen et al. (2007) paper rather than his own? I suggest that it is precisely because Hazen et al. clearly state how they derive their measure of functionality.
Aside from that, this is exactly what I suggested: simply an argument from big numbers where Durston plugs in lots of assumed values which are highly questionable, i.e. he uses the calculations from his paper for RecA and SecY even though he has no idea what the actual possible number of functional sequences is. He seems to have done a little bait and switch between his equation and the, albeit similar, one in the Hazen et al. paper. Durston is eliding over what Hazen et al. identify as a crucial step ...
He is explaining how he got to his equation. His work is based on Hazen's.
I don't see any assumptions. We know the number of proteins the said structures have. We know what RecA does, so there is nothing left to assume.
Not only that, but he cited Doug Axe, who dealt with modifications to proteins. What he actually did was to modify proteins in such a way as to show how much change they can take while still performing the function they did. Now we know that a subset of between 10^-64 and 10^-77 of all possible sequences will still give you the same function in the modified protein. It's at around 06:05.
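As a side calculation of my own (not from the video): plugging those fractions into the functional-information formula above gives the equivalent values in bits,

$$-\log_2 10^{-64} \approx 212.6 \text{ Fits}, \qquad -\log_2 10^{-77} \approx 255.8 \text{ Fits}.$$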
quote:
Durston et al.'s method skips over this step and just takes the conservation of amino acid sites in PFAM alignments as a good enough estimate, which naturally leads them to overestimate the degree of specification. I can't say by how much, because I have no idea what all the sequences which fulfill a specific function are.
Didn't Durston actually mention that when they measure the amino acids, they specifically apply the cutoff to certain parts of the sequence, so as not to inflate the number of Fits?
quote:
In as much as it is a stochastic process, then yes. Genetic variability comes from errors in genetic replication and repair, from crossovers that swap domains around, and from multiple other sources, with no apparent source outside of the statistical nature of biochemistry and its interactions with the environment.
You do know that errors may cause variability, but not functional variability. All mutations either tune a function or degrade it. We haven't actually observed something like ATP synthase arise de novo. Have you got any examples?
quote:
You would have to define 'functions' quite clearly before we could even agree to begin to discuss this.
A biological function is a process that takes an input and gives an output within an organism. For example: the turning of the flagellar motor is a function. Energy production by ATP synthase is a function. Food degradation is a function.
quote:
There are numerous instances of populations gaining new functions in in vitro experiments. Antibiotic resistance and other similar examples spring immediately to mind, but the RNA polymer experiments the Hazen et al. paper refers to show that random mutation can generate and improve functionality.
We have only seen modifications to existing functions: a so-called fine-tuning of what is already present within the genetic code, or a degradation. Antibiotic resistance is gained by either degradation or fine-tuning of genes, not by producing new molecular machines and structures.
quote:
Thus, the organism develops a resistance to an antibiotic by eliminating certain systems, such as transport proteins, enzymatic activity, and binding affinity. For example, ampicillin resistance can result from an SOS response that halts cell division [66], loss of nitroreductase activity can provide resistance to metronidazole and amoxicillin [67], and kanamycin resistance can result from loss of a specific transport protein [68].
A Creationist Perspective of Beneficial Mutations in Bacteria | Answers in Genesis
As you can see, all the known cases of resistance to antibiotics have been either degradations or simple fine-tuning. Antibiotic resistance is not a biological function. It is an inability to perform something. There is no input or output. No new structures were produced that perform something the bacteria were not able to do before.
For example, ATP synthase does take inputs and produce outputs, the energy in this case. This is done by a specific molecular machine. On the other hand, resistance is gained when something is either turned off or destroyed. You will not get any genetic variability, in the sense of more genetic functions, this way. You will get genetic variability only in the sense that different bacteria now have different sequences. But the new sequences in resistant bacteria do not produce anything new.
quote:
I understand that that is Durston's argument; the question is whether we can accept his estimates of where those limits are, and I don't think we can, given what he presents. Of course, whether the correct response, if this were true, is to immediately leap to the conclusion of intelligent design is another matter which is still open for discussion.
If you assume a 4.6-billion-year-old Earth, and you assume how many organisms have lived since then, then you could have some idea of what nature could have produced in that time. It seems he does not say this in this video, so I understand why you said you do not agree with it.
Here, have a look at the full video:
http://www.seraphmedia.org.uk/ID.xml
quote:
This presents an interesting counterpoint to Doug Axe's estimates of the likelihood of the evolution of functional protein folds.
This would just mean that more of the known proteins could have been produced by chance. But the point remains that there is a limit to what nature could have produced, and that is below 10^42.

This message is a reply to:
 Message 68 by Wounded King, posted 09-10-2009 6:40 AM Wounded King has replied

Replies to this message:
 Message 71 by Wounded King, posted 09-14-2009 4:55 PM Smooth Operator has not replied
 Message 72 by Wounded King, posted 09-15-2009 6:09 AM Smooth Operator has replied

  
Smooth Operator
Member (Idle past 5133 days)
Posts: 630
Joined: 07-24-2009


Message 73 of 85 (524952)
09-20-2009 3:14 PM
Reply to: Message 72 by Wounded King
09-15-2009 6:09 AM


quote:
And as far as I can see from looking back there, your 'fine-tuning' just appears to be the name you give to beneficial mutations, which can also encompass maintenance and presumably even increases in information.
Do you increase the complexity of a light switch if you flip it on or off? No, you do not. You simply set it to the position you want. It didn't gain any new information or new functions. The same happens with some mutations. Other mutations, which are deleterious, reduce the information in the genome. None of them makes a gain.
To be sure, a genetic duplication does increase the Shannon information in the genome. But such a measure cannot be used for biological functions, simply because it does not take into consideration the function of the information it is measuring. A much better measure is Dembski's CSI. So no, natural causes do not increase CSI.
quote:
Perhaps what they mean is that you can compare the FSCs of two distinct alignments having a common function. It isn't clear from the paper.
They are talking about measuring different sequences with the same function. A sequence can't be mutated enough to either lose or change its function, because then you would be measuring different functions.
quote:
It is better because it actually looks at the frequencies of substitutions in amino acids and identifies common substitutions that presumably allow functional conservation, since they are maintained. In contrast, as I said, Durston et al. simply look at the distribution of amino acids at each particular site, treating all amino acids as equal.
Are you saying that some amino acids are more important than others? From what I've read, they are talking about measuring the conservation of amino acids over all sequences.
quote:
How do Durston et al.? I'm not sure if you are talking here about when the initial alignment is generated, in which case since I directed you to the PFAM database it is exactly the same functional criteria as Durston et al. use, i.e. whatever criteria PFAM used to define their structural families. If you mean how do they get the functionality criteria for individual amino acid substitutions then it is by looking at multiple highly conserved sets of sequences and looking at the distribution of tolerated amino acid substitutions and using that to infer functional physico-chemical similarities between amino acids.
Exactly. They look for conserved amino acids. This is the same thing Durston's model does. I see no difference here. This may be, as you said, reinventing the wheel. But hey, I see no problem with it if it gets improved later on.
quote:
But they don't. How is their method an improvement on BLOSUM or other methods which actually consider the biological properties of the amino acids? Both of these generate a metric you could use for calculations similar to the ones Durston et al. perform.
Well, it seems that the other models' metrics do not take into account the functional part of the sequence. They may all be based on Shannon information. That is what Durston was talking about in his paper.
quote:
His key points chime with my precise concerns with Durston's work, the failure to take into account structurally dissimilar but functionally equivalent sequences. The problem is that Durston et al. don't seem to have taken the extra step necessary to move beyond looking at functional complexity over aligned sequences.
But I showed you where he said that Doug Axe has done this with his protein experiments, showing how much you can change them before they lose their function. Durston simply builds on Axe's work and plugs Axe's numbers into his equation.
And what Szostak is actually saying here is that the current way in which we measure information is not good enough, because it does not take into account functional information.
quote:
Yes there is: we need to assume that we know a high enough proportion of the extant functional sequences of RecA for our estimates, based on those we do know, to be meaningful.
What exactly is missing?
quote:
I'm familiar with Axe's work. He is extrapolating from one particular functional fold in one enzyme to all of the possible functional folds of all proteins. Not only that, but he is doing so based on estimates derived from a highly proscriptive experimental setup using a protein variant already mutated to put it at the borderlines of functionality. As with Durston et al., one of the big flaws with Axe's approach is that it entirely ignores the existence of structurally dissimilar proteins which can perform the same function. The probability of evolving a particular functional fold is not so relevant if there are 10 other folds out there which can perform the same function.
1.) Yes, that was the point of his research: to see how much change a protein can take until it loses its function. You can extrapolate this to other proteins.
2.) No, the point of his work was to show just that. If he mutates the proteins enough, he will show exactly how many different combinations, i.e. different protein sequences, will perform that same function. So yes, in this way you can calculate this for one function even if there are 10 different proteins that can do the same function.
quote:
They have a cutoff value to eliminate stretches of indels which produce gaps in the alignment. But that in no way addresses what I am talking about. The sampling they analyse is only a small subset of the possible functional variants for the sequence but they effectively assume it represents the entire functional sequence space.
It says in the paper that they apply the cutoff to those parts of the sequence so that they are not counted as functional information. They don't assume that the whole sequence is functional.
quote:
This is simply not true, unless you are using the word 'functional' in a highly novel way. Of course they cause functional variability; even producing a loss of function is causing functional variability.
I understand what you mean. I worded my sentence the wrong way. What I meant to say is that errors, i.e. mutations, will not give you new biological functions. For example, they will not produce a flagellum from something vastly different.
quote:
Not of that specifically, but looking at SELEX experiments will show you that randomly generated pools of RNA oligonucleotides produce multiple functional motifs, including binding and catalytic activities. Subsequently, sequences encoding RNAs with similar structures have been found in many organisms.
Binding and catalysis are not biological functions; they are chemical processes, and as such, natural law. This is algorithmic information, as Szostak said where you quoted him. These kinds of processes do not produce biological information.
quote:
I'm not sure how you think one could force the de novo production of a catalytic activity like ATP synthase in the lab. Obviously the answer is you don't, but I also don't see why you think this is relevant.
It's very important if you want to extrapolate changes in biological organisms to account for all the diversity of life we observe today. If you have no observational evidence for such changes, then how can you claim that natural processes can produce them?
quote:
And to the extent that we can measure those functions, we can incorporate them into an equation like Hazen's, and maybe Durston's, but I'm still not clear how. So if we found a mutation that improved the motility of the flagellum, would that be sufficient? What exact criteria would you use to measure flagellar functionality?
1.) Nope. The function is already there. It would be a simple case of fine-tuning.
2.) I like to use Dembski's CSI.
quote:
Even accepting that as the upper bound, this would still allow the entire sequence space of a simplified amino acid repertoire to be explored for shorter sequence lengths; once functional sequences are extant, their modification and recombination with other functional sequences is able to occur. Even once we have reached an agreement on the upper bound, we still need some agreement on the actual size of the functional space that needs to be searched. IDists tend to maximise this and perhaps evolutionists to minimise it; certainly the Dryden paper uses some pretty radical minimisation for its lowest estimates.
Again, here I like to use what Dembski uses. His number is derived from Seth Lloyd's work, and that's 10^120, a much higher number. This is the number of bit operations the whole observable universe could have performed since its origin, about 15 billion years ago. So there is nothing more to explore here.
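(For scale, my own arithmetic: expressed in the same bit units as the Fits discussed above, that bound corresponds to $\log_2 10^{120} = 120\log_2 10 \approx 398.6$ bits.)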

This message is a reply to:
 Message 72 by Wounded King, posted 09-15-2009 6:09 AM Wounded King has replied

Replies to this message:
 Message 74 by Straggler, posted 09-20-2009 6:53 PM Smooth Operator has replied
 Message 75 by Wounded King, posted 09-22-2009 12:07 PM Smooth Operator has replied
 Message 76 by Coyote, posted 09-22-2009 12:54 PM Smooth Operator has replied

  