Author | Topic: Sequence comparisons (Bioinformatics?)
Entomologista Inactive Member |
Personally, I like the program Biology Workbench. http://workbench.sdsc.edu/
"Note also that prior to the molecular data, there were anatomists that speculated" You don't want to start the morphological vs molecular data debate. You'll never hear the end of it from the morphologists. Trust me.
|
|||||||||||||||||||||||
randman  Suspended Member (Idle past 4925 days) Posts: 6367 Joined: |
This is what I got for CytoC comparisons for turtle, kangaroo and rattlesnake.
Sequence 1: TurtleCytC>GDVEKGKKIFVQKCAQCHT 34 aa
Sequence 2: KangarooCytC>GDVEKGKKIFVQKCAQC 34 aa
Sequence 3: RattlesnaleCytC>GDVEKGKKIFSMKC 34 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 88.2353
Sequences (1:3) Aligned. Score: 76.4706
Sequences (2:2) Aligned. Score: 100
Sequences (2:3) Aligned. Score: 73.5294
Sequences (3:2) Aligned. Score: 73.5294
Sequences (3:3) Aligned. Score: 100
This suggests that the kangaroo and the turtle are more closely related than the turtle and rattlesnake, but that's not what current phylogenies would predict.
|
|||||||||||||||||||||||
randman  Suspended Member (Idle past 4925 days) Posts: 6367 Joined: |
Also, with humans added. People are more related to rattlesnakes than kangaroos.....hmmm.
Sequence 1: TurtleCytC>GDVEKGKKIFVQKCAQCHT 34 aa
Sequence 2: KangarooCytC>GDVEKGKKIFVQKCAQC 34 aa
Sequence 3: RattlesnaleCytC>GDVEKGKKIFSMKC 34 aa
Sequence 4: humancytoC>MPSTLPAPRRTHAARTASL 378 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 88.2353
Sequences (1:3) Aligned. Score: 76.4706
Sequences (1:4) Aligned. Score: 20.5882
Sequences (2:2) Aligned. Score: 100
Sequences (2:3) Aligned. Score: 73.5294
Sequences (2:4) Aligned. Score: 17.6471
Sequences (3:2) Aligned. Score: 73.5294
Sequences (3:3) Aligned. Score: 100
Sequences (3:4) Aligned. Score: 23.5294
Sequences (4:2) Aligned. Score: 17.6471
Sequences (4:3) Aligned. Score: 23.5294
Sequences (4:4) Aligned. Score: 100
This message has been edited by randman, 06-11-2005 02:41 AM
|
|||||||||||||||||||||||
randman  Suspended Member (Idle past 4925 days) Posts: 6367 Joined: |
It still doesn't match with current hypotheses.
It looks like the site will give you wrong data for common names. I looked up and ran the scientific names though, and got the following. Please note that "alligator" is alligator snapping turtle, aka Macroclemys temminckii.
Sequence 1: humanCytoB>MTPMRKINPLMKLINHSFI 308 aa
Sequence 2: rattlesnakeCytoB>MMQTMTGFFLAIH 155 aa
Sequence 3: redkangaroo>MTNLRKTHPLIKIVNHSF 311 aa
Sequence 4: alligator 198 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 58.7097
Sequences (1:3) Aligned. Score: 75.6494
Sequences (1:4) Aligned. Score: 77.7778
Sequences (2:2) Aligned. Score: 100
Sequences (2:3) Aligned. Score: 61.2903
Sequences (2:4) Aligned. Score: 62.5806
Sequences (3:2) Aligned. Score: 61.2903
Sequences (3:3) Aligned. Score: 100
Sequences (3:4) Aligned. Score: 77.2727
Sequences (4:2) Aligned. Score: 62.5806
Sequences (4:3) Aligned. Score: 77.2727
Sequences (4:4) Aligned. Score: 100

>humanCytoB>MTPMRKINPLMKLINHSFIDLPTPSNISAWWNF GSLLGACLILQITTGLFLAMHYSPDASTAFSSIAHIT RDVNYGWIIRYLHANGASMFFICLFLHIGRGLYYGSFL YSETWNIGIILLLATMATAFMGYVLPWGQMSF WGATVITNLLSAIPYIGTDLVQWIWGGYSVDSPTLTRFF TFHFILPFIIAALAALHLLFLHETGSNNPLG ITSHSDKITFHPYYTIKDALGLLLFLLSLMTLTLFSPDLLGDPDN YTLANPLNTPPHIKPEWYFLFAYTI LRSVPNKLGGVLALLLSILILAMIPILHMSKQQSMMF RPLSQSLYWLLAADLLILTWIGGQPVSYPFTII GQVASVLYFTTILILMPTISLIENKMLK

>rattlesnakeCytoB>MMQTMTGFFLAIHYTANINLAFSSVIHITRDVPYGXIMQNLHTISASLFFICIYIHIARGLYYGLYLNKE VWLSGTALLITLMATAFFGYVLPWGQMSFWAATVITNLLTAIPYLGTTLTTWL WGGFSINDPTLTRFFAL HFILPFIIISLSSIHIILLHNEGSNNPLGTNSDIDKIPFHPYHS YKDVLMITSMITLLLLILSFSPSLLN DPENFXKAXPXXTPQ

>redkangaroo>MTNLRKTHPLIKIVNHSFIDLPAPSNISAWWNFGSLLGACLIIQILTGLFLAMHYTADTLTAFSSVAHIC RDVNYGWLIRNLHANGASMFFMCLFLHVGRGIYYGSYLYKETW NIGVILLLTVMATAFVGYVLPWGQMSF WGATVITNLLSAIPYIGTTLVEWIWGGFSVDKATLTRFFAFHF ILPFIITALVLVHLLFLHETGSNNPSG INPDSDKIPFHPYYTIKDALGFMLMLLILLTLALFSPDML GDPDNFSPAKPTEHSSHIKPEWYFLFAYAI LRSIPNKLGGVLALLASILILLIIPLLHTSKQRSLMFRPISQTLF WILTANLITLTWIGGQPVEQPYIII GQVASISYFLLIIVLMPLAGLFENYMLEPKW

>alligator snapping turlte>MATNLRKTHPMMKIINNSFIDLPSPSNISAWWNFGSLLGTCLIMQTITGIFLAMHYSPDISMAFSSITHI TRDVQYGWLIRNMHANGASLFFICIYLHIGRGLYYGSYL YKETWNTGVILLLLTMATAFMGYVLPWGQMS FWGATVITNLLSAIPYIGSTLVQWIWGGFSVDNATLTRFFTLHFLLPFTIMG LAMVHLLFLHETGSNNPT GLNSNSDKIPFHPYFSYKDLLGLILMLSLLLTLALFSPNLLGDPDNFTPANPLVTPPH

Humans are more related to alligator snapping turtles than a red kangaroo, aka Macropus rufus?
This message has been edited by randman, 06-11-2005 03:28 AM
This message has been edited by AdminJar, 06-12-2005 10:56 AM
|
|||||||||||||||||||||||
randman  Suspended Member (Idle past 4925 days) Posts: 6367 Joined: |
Cyto B comparisons for a mouse, human, red kangaroo, and alligator snapping turtle.
Sequence 1: MusmusculusCytoB>MGDWAVNEGLSIF 500 aa
Sequence 2: humanCytoB>MTPMRKINPLMKLINHSFI 308 aa
Sequence 3: MacropusrufusCytB>MTNLRKTHPLIK 311 aa
Sequence 4: Macroclemys 198 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 11.039
Sequences (1:3) Aligned. Score: 11.8971
Sequences (1:4) Aligned. Score: 12.1212
Sequences (2:2) Aligned. Score: 100
Sequences (2:3) Aligned. Score: 75.6494
Sequences (2:4) Aligned. Score: 77.7778
Sequences (3:2) Aligned. Score: 75.6494
Sequences (3:3) Aligned. Score: 100
Sequences (3:4) Aligned. Score: 77.2727
Sequences (4:2) Aligned. Score: 77.7778
Sequences (4:3) Aligned. Score: 77.2727
Sequences (4:4) Aligned. Score: 100
|||||||||||||||||||||||
Modulous Member Posts: 7801 From: Manchester, UK Joined: |
OK, let's have a look:
Crotalus molossus nigrescens (Rattlesnake)
Macropus rufus (Red Kangaroo)
Homo sapiens
Alligator mississippiensis (Mississippi Alligator)
Macroclemys temminckii (Alligator Snapping Turtle)
Hylomyscus alleni (Woodmouse)
Vombatus ursinus (Common Wombat)
Chelydra serpentina (Snapping Turtle)
Crocodylus acutus (Crocodile)
That's one heck of a list there. Now when we run that we get:
quote:
Humans are more related to alligator snapping turtles than a red kangaroo, aka Macropus rufus?

Human/Alligator: Sequences (3:4) Aligned. Score: 64
Human/Kangaroo: Sequences (1:3) Aligned. Score: 75
Human/Snapping Turtle: Sequences (1:5) Aligned. Score: 76
Well, the numbers certainly support what you're saying. But wait, there's more: look at the N J Tree. It doesn't get it totally right, but we don't expect it to; it seems to have done rather well. Now, as has been said, comparing one protein can give some ideas, but not the full picture, and there is significant room for error. We should consider hundreds of genes to be able to come to a more solid conclusion.
This message has been edited by Modulous, Sat, 11-June-2005 10:13 AM
|
|||||||||||||||||||||||
Wounded King Member Posts: 4149 From: Cincinnati, Ohio, USA Joined: |
Please post the whole of the FASTA sequences, or at least provide sufficient information so that other people can get hold of the necessary sequence data. In fact, it might be a good idea if from now on people gave us the accession numbers of the sequences they run.
One major problem here is that Cytochrome C is not 34 amino acids in length. I tried to run the same alignment using the following sequences
>Grey Kangaroo CytC
GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLNGIFGRKTGQAPGFTYTDANKNKGIIWGEDTLMEYLEN
PKKYIPGTKMIFAGIKKKGERADLIAYLKKATNE
>Snapping Turtle CytC
GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLNGLIGRKTGQAEGFSYTEANKNKGITWGEETLMEYLEN
PKKYIPGTKMIFAGIKKKAERADLIAYLKDATSK
>Rattlesnake Cytochrome C
GDVEKGKKIFSMKCGTCHTVEEGGKHKTGPNLHGLFGRKTGQAVGYSYTAANKNKGIIWGDDTLMEYLEN
PKKYIPGTKMVFTGLKSKKERTDLIAYLKEATAK

As you can see, the CytC sequences are 104 aa in length. The sequence accessions are P68517 for the rattlesnake, P00022 for the turtle and P00014 for the kangaroo. I'm not sure how you ended up with only 34 amino acids for each species; I'm not sure that you are using the FASTA format correctly. It looks as if you are losing all of the sequence which is in the first line of amino acids in my input data and only getting the last 34 amino acids. That aside, the results are in line with yours.
Sequence type explicitly set to Protein
Sequence format is Pearson
Sequence 1: Grey Kangaroo 104 aa
Sequence 2: Snapping turtle 104 aa
Sequence 3: Rattlesnake 104 aa
Start of Pairwise alignments
Aligning...
Sequences (1:2) Aligned. Score: 89
Sequences (2:3) Aligned. Score: 79
Sequences (1:3) Aligned. Score: 79
TTFN, WK
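For anyone who wants to sanity-check pairwise scores like these by hand, here is a minimal sketch in Python. It assumes that, because the three cytochrome c sequences quoted in this thread are all 104 aa long, a plain position-by-position comparison with no gaps is a fair stand-in for the pairwise score. That is an assumption for illustration only; the alignment program performs a real alignment and may round or truncate differently. The function name `percent_identity` is made up for this example.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions with the same residue (ungapped, equal length only)."""
    if len(a) != len(b):
        raise ValueError("ungapped comparison needs sequences of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Cytochrome c sequences as posted above (P00014, P00022, P68517).
SEQUENCES = {
    "Grey Kangaroo": (
        "GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLNGIFGRKTGQAPGFTYTDANKNKGIIWGEDTLMEYLEN"
        "PKKYIPGTKMIFAGIKKKGERADLIAYLKKATNE"
    ),
    "Snapping Turtle": (
        "GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLNGLIGRKTGQAEGFSYTEANKNKGITWGEETLMEYLEN"
        "PKKYIPGTKMIFAGIKKKAERADLIAYLKDATSK"
    ),
    "Rattlesnake": (
        "GDVEKGKKIFSMKCGTCHTVEEGGKHKTGPNLHGLFGRKTGQAVGYSYTAANKNKGIIWGDDTLMEYLEN"
        "PKKYIPGTKMVFTGLKSKKERTDLIAYLKEATAK"
    ),
}

# Compare every pair once and print the ungapped identity.
for (name_a, seq_a), (name_b, seq_b) in combinations(SEQUENCES.items(), 2):
    print(f"{name_a} vs {name_b}: {percent_identity(seq_a, seq_b):.1f}")
```

The values this prints should land close to the scores reported above, which suggests the scores are essentially percent identity over the aligned positions.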
|
|||||||||||||||||||||||
Wounded King Member Posts: 4149 From: Cincinnati, Ohio, USA Joined: |
Here is the problem!
Your FASTA formatting is all wrong. The correct format is
>Name
Sequence
i.e.
>Grey Kangaroo Cytochrome C
GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLNGIFGRKTGQAPGFTYTDANKNKGIIWGEDTLMEYLEN
PKKYIPGTKMIFAGIKKKGERADLIAYLKKATNE
You need a line break after the name and only one '>' right at the start of the name for each sequence. At the moment you are losing big chunks of sequence data. TTFN, WK
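To make the failure mode concrete, here is a small sketch of a FASTA reader in Python. It is not the code of any real alignment tool; it only assumes the usual convention that everything on a line starting with '>' is the name and everything up to the next '>' line is sequence. The function name `parse_fasta` and the two sample inputs are made up for illustration, using the kangaroo sequence from this thread.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Return {name: sequence}; a '>' line starts a record, other lines are sequence."""
    records: dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The whole header line becomes the name, whatever it contains.
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


# Correct: name on its own line, sequence on the following lines.
GOOD = """>Grey Kangaroo Cytochrome C
GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLNGIFGRKTGQAPGFTYTDANKNKGIIWGEDTLMEYLEN
PKKYIPGTKMIFAGIKKKGERADLIAYLKKATNE
"""

# Malformed: a second '>' and no line break after the name, so the first
# 70 residues sit on the header line and get swallowed into the name.
BAD = """>KangarooCytC>GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLNGIFGRKTGQAPGFTYTDANKNKGIIWGEDTLMEYLEN
PKKYIPGTKMIFAGIKKKGERADLIAYLKKATNE
"""

for label, text in (("good", GOOD), ("bad", BAD)):
    for name, seq in parse_fasta(text).items():
        print(f"{label}: name={name[:30]!r} length={len(seq)} aa")
```

With the malformed input only the second line survives as sequence, which would explain both the 34 aa lengths and the names with sequence letters stuck on the end in the earlier output.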
|
|||||||||||||||||||||||
Wounded King Member Posts: 4149 From: Cincinnati, Ohio, USA Joined: |
It would be a very good idea to check out the source of the data in a case where there are severe discrepancies in the lengths of the proteins.
A lot of the sequences in GenBank are only partial coding sequences. For instance, your 85aa sequence for the snapping turtle is only a partial coding sequence. There is a fuller, though still not complete, CDS-based version here. This sort of partial data is bound to affect the quality of any analysis you perform. To check the source data for your amino acid sequence, click on the hyperlink at the 'DBSOURCE' entry in the GenPept view. TTFN, WK
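A quick way to spot candidates for this kind of check before running an alignment is to compare the reported lengths. The sketch below, in Python, uses the cytochrome b lengths from the output posted earlier in the thread. The function name `flag_short` and the 80% cutoff are arbitrary choices for illustration, not an established rule; a flagged sequence is only a prompt to go and look at the database record.

```python
def flag_short(lengths: dict[str, int], min_fraction: float = 0.8) -> list[str]:
    """Names whose length is below min_fraction of the longest sequence in the set."""
    longest = max(lengths.values())
    return [name for name, n in lengths.items() if n < min_fraction * longest]


# Lengths as reported in the cytochrome b run earlier in this thread.
cytb_lengths = {
    "humanCytoB": 308,
    "rattlesnakeCytoB": 155,
    "redkangaroo": 311,
    "alligator": 198,
}

for name in flag_short(cytb_lengths):
    print(f"{name}: {cytb_lengths[name]} aa, possibly a partial sequence")
```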
|
|||||||||||||||||||||||
Modulous Member Posts: 7801 From: Manchester, UK Joined: |
Yeah - thanks for that. I was trying to see if it said on the main form that the sequences were only partial or not, I couldn't conceive of cytB being so short, but I went with the data I had.
I'll look into redoing the test later, thanks again - it's useful to have someone around who is more than a keen amateur
|
|||||||||||||||||||||||
Modulous Member Posts: 7801 From: Manchester, UK Joined: |
I tried it again with DNA rather than amino acids:
Rattlesnake
Kangaroo
Human
Alligator Snapping Turtle
Alligator
Woodmouse
Wombat
Snapper Turtle
Here are the results I got:
So, harking back to your quote:
quote:
Humans are more related to alligator snapping turtles than a red kangaroo, aka Macropus rufus?

Humans to alligator snapping turtles: Sequences (3:4) Aligned. Score: 74
Humans to kangaroo: Sequences (2:3) Aligned. Score: 76
Once we look at the actual DNA sequences the effect seems to go away. Still - we are suffering from a lack of complete DNA sequences here; I discarded the crocodile due to small sample size. I predict that if we have the full DNA sequence for cytochrome b for alligator snapping turtles, the results will show us to be closer to the kangaroo. That said, it is possible for the opposite to be true, but I predict that when further proteins are examined, the result will be shown to be an anomalous outlier.
|
|||||||||||||||||||||||
randman  Suspended Member (Idle past 4925 days) Posts: 6367 Joined: |
Modulous, I don't doubt we are making mistakes, as WK pointed out, but it seems the data is getting fairly consistent in placing turtles way too close to us compared to kangaroos, another mammal.
74 and 76 are nearly identical in some respects. I have to be honest and state I chose the turtle, rattlesnake, humans, and kangaroo because I read somewhere the data did not fit. I am probably not even a "keen amateur" when it comes to DNA, but the creationist that made that comment was correct, it seems. Not sure what this means just yet though. I suspect we will have a better picture in a few years.
|
|||||||||||||||||||||||
Wounded King Member Posts: 4149 From: Cincinnati, Ohio, USA Joined: |
This is for Mark24 since the other thread in which we were doing sequence analysis has been closed.
There is a Windows version of ClustalX, which should be compatible with XP, here. It comes bundled with the NJ-plot program as well, which you could use instead of Treeview. TTFN, WK This message has been edited by Wounded King, 06-12-2005 09:07 AM
|
|||||||||||||||||||||||
mark24 Member (Idle past 5221 days) Posts: 3857 From: UK Joined: |
WK,
Yep, that seems to run, I'll take a look later. Thanks again, Mark
There are 10 kinds of people in this world; those that understand binary, & those that don't
|
|||||||||||||||||||||||
NosyNed Member Posts: 9003 From: Canada Joined: |
quote:
74 and 76 are nearly identical in some respects.

What are those respects? What does having similar percentages mean? What is "too close"? What is not close enough? Are these not all relative numbers to position animals relative to each other? There are some proteins that are identical across a wide range of animals, are there not? If we picked those, would a series of 100s result? Would that matter or be meaningful? In what way? (I'm not called NOSY for nuttin' you know. )
|
Copyright 2001-2023 by EvC Forum, All Rights Reserved