
Re: Purple Fluorescent Protein
TL;DR
An open reading frame (ORF) is basically a stretch of DNA sequence that does not have a stop codon in it. Long enough open reading frames starting with a start codon (ATG, which also codes for methionine) can be used to predict proteins in DNA sequences.
I don't know how familiar you are with molecular genetics, so I'll add a longer explanation here too.
When you have a DNA sequence, it is given in single-stranded form in the 5' -> 3' direction (see DNA structure). DNA codes for amino acids in codons of three bases per amino acid, so one strand can be read in 3 different frames (+1, +2, +3). As DNA is a double-helical molecule, the other strand can also code for proteins in 3 frames (-1, -2, -3), read in the opposite direction (see genetic code). Now an open reading frame is a part of a reading frame (e.g. +1) that ends in a stop codon (TAA, TAG, TGA). This way, for a given sequence you will have multiple open reading frames that are basically the parts between stop codons, and of course in all six frames separately. Proteins are (usually) coded in open reading frames beginning with ATG. This lets you predict protein sequences from ORFs that begin with ATG and are long enough.
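To make the six-frame idea concrete, here is a rough Python sketch that scans both strands in all three offsets and reports the stretches that run from an ATG to the next in-frame stop codon. The function names and the min_codons cutoff are my own choices for illustration ("long enough" depends on what you are looking for), and it assumes the sequence contains only A, C, G and T.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the other strand, written 5' -> 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """List ATG...stop ORFs in all six frames.

    Each hit is (frame, start, end, orf_sequence). Frames are +1..+3 for the
    given strand and -1..-3 for the reverse complement. start and end are
    0-based positions on the strand that was read (end is exclusive).
    min_codons is an illustrative cutoff and counts the stop codon too.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in ((+1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            start = None
            # Step through the strand one codon at a time in this frame.
            for i in range(offset, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG since the previous stop
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand * (offset + 1), start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


print(find_orfs("CCATGAAATAGGG"))
```

For real sequences you would raise min_codons a lot (something like 100 is a common rule of thumb, but that is a judgment call). An ATG that never reaches a stop codon before the end of the sequence is not reported here.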
When you want to clone something and express it in a recombinant construct, you need to build an ORF starting with ATG and ending with TAA, TAG or TGA. If you add a suitable Shine-Dalgarno (SD) sequence before the start codon, the ORF can be translated into protein from the mRNA. And for the mRNA to be transcribed, you need a promoter in the DNA sequence.
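If you design such an ORF yourself, a quick sanity check along these lines can catch the most common mistakes (a frame shift, a missing start or stop, a stop codon in the middle). This is only a sketch: check_orf is a made-up helper name, and it does not look at the SD sequence or the promoter, which are much harder to judge from sequence alone.

```python
STOPS = {"TAA", "TAG", "TGA"}


def check_orf(orf):
    """Return a list of problems in a designed ORF (empty list = looks fine)."""
    orf = orf.upper()
    problems = []
    if len(orf) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Split into complete codons, ignoring any leftover bases at the end.
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("does not start with ATG")
    if not codons or codons[-1] not in STOPS:
        problems.append("does not end with a stop codon")
    if any(codon in STOPS for codon in codons[:-1]):
        problems.append("internal stop codon")
    return problems


print(check_orf("ATGAAATAG"))     # a tiny ORF: ATG, AAA, TAG
print(check_orf("ATGTAAAAATAG"))  # has a TAA right after the ATG
```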
I suggest that you find a ready-made DNA construct and try to find these kinds of elements in the sequence. This way it is possible to see how these constructs are built and learn to design them yourself (biohacking, shall I say). For starters you could look, for example, at the pAcGFP1 plasmid, which can be found here in lablife.org's vector database. They have nice visualizations of the elements in the plasmids and you can also see the sequence from the link. In the pAcGFP1 example you can see two bigger ORFs, both in the same direction (the blue arrows). One of them codes for the protein needed for ampicillin resistance (selection marker) and the other includes GFP. If you look at the GFP ORF you see that the ORF does not start from GFP's start codon, but earlier. This is because a multiple cloning site has been added in the same open reading frame before the GFP sequence (that plasmid is actually for cloning C-terminally GFP-tagged fusion proteins).
I hope this helps you, and also teaches you something.
