problem_2a

Differences

This shows you the differences between two versions of the page.

problem_2a [2010/11/20 08:37]
ike
problem_2a [2014/01/18 07:44] (current)
Line 1: Line 1:
-====== Problem 2-0 & 2-1 – CDS search and translation (1) ======
+====== Practices 2-0 & 2-1 – ORF finding and translation (1) ======

-**[Problem 2-0]**
+**[Practice 2-0]**

-**Define function trans-codon in the Perl that exchanges three sets of nucleotides into an amino acid according to codon table.**
+**Define function trans-codon in Perl that exchanges tri-nucleotides into an amino acid according to the codon table.**



-**[Problem 2-1]**
+**[Practice 2-1]**

-**Write a script that loads a genome and switch codon into an amino acid.**
+**Write a script that loads a nucleotide sequence from a file and translate it to an amino acid sequence.**
 ===== Overview for two problems =====

+In this chapter, you are expected to achieve the following two points:

-We expect users to address on two aspects through this comprehension.
+  - Predict the coding regions in a given sequence
+  - Translate codons into amino acids in the predicted coding region

-  - Try to estimate coding regions from given sequence
-  - Translate codon into amino acids in the estimated coding region
+In Practice 2-1, you will first implement a program for translating a codon into an amino acid.

-In the Problem 2-1, users are to implement program on essential parts of translating a codon into an amino acid.
 ==== Codon and amino acid translation ====

-Basically, one gene represents one protein in central dogma. Now, how does life emerge a protein from a gene?
+Basically, one gene represents one protein in central dogma. Now, how does a gene code a protein?

-A protein is a long chain of amino acid and each of it is composed of 20 types of amino acid which is call polypeptide. Features of amino acid are determined by order of amino acids. Therefore users can attain sequence of protein from the orders of amino acids and amino acid is characterized by three sets of nucleotides in DNA. These three sets of nucleotide is call codon. For example length of 300 nucleotides express a protein of 100 amino acid lengths long
+A protein is a long chain of amino acids called polypeptide, and each of it is composed of 20 types of amino acid. Features of proteins are determined by the order of amino acids. Therefore, one can identify a protein by its amino acid sequence, and moreover, by the triplet nucleotides called the codons coding the amino acids. For example, nucleotide sequence of length 300 is translated into a protein with 100 amino acids.

-Taken all together, one unit of codon is equivalent to one unit of amino acid. Sequences of amino acid unit, or in other words codon unit compose a gene or protein.
+Taken altogether, one unit of codon is equivalent to one unit of amino acid. Sequences of amino acid units, or in other words codons, compose a gene or protein.

-===== Problem 2-0 =====
+===== Practice 2-0 =====

-Now, let’s translate given DNA sequence into protein sequence.
+Now, let’s translate given DNA sequence into protein sequence.

-First task is to define a function that uniquely translates codon into amino acid in the Perl.
+The first task is to define a function that uniquely translates codon into an amino acid in Perl.

-Look at the codon table in a biological textbook that best describe your target species. For example search for amino acid that correspond with DNA sequence “ctt”; in this case, protein called leucine (Leu).
+Look at the codon table in a biological textbook that best describe your target species. For example search for amino acid that corresponds with DNA sequence “ctt”; in this case, protein called leucine (Leu).

 Above example only describe one set of codon. Now let’s define a subroutine that works for all 64 patterns of codon.
Line 43: Line 43:
  
  
-To call the subroutine, put “&” in the head of subroutine name like in following way.
+To call the subroutine, put an “&” in front of subroutine name as follows.

-  $amino_acid=&trans_codon('ctt')
+  $amino_acid=&trans_codon('ctt');


-A codon “ctt” corresponds to leucine, so character “L” that represent leucine is expected to be assigned into variable $amino_acid in sample code. For next step, it would be nice to acquire amino acids from any DNA sequence as an argument such as an argument “atgcttctggtg” returning amino acids “MLLV”. Condon table is easily described by using hash like in following way.
+A codon “ctt” corresponds to leucine, so character “L” that represent leucine is expected to be assigned into variable $amino_acid in the above sample code. For next step, it would be nice to acquire amino acids from any DNA sequence as an argument such as an argument “atgcttctggtg” returning amino acids “MLLV”. Condon table is easily described by using hash as follows:
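
As a rough illustration of the hash idea (not the code from the page itself, which is not shown in this comparison), the sketch below builds a 64-entry table programmatically instead of typing every pair by hand. It assumes the standard genetic code and lowercase codons; the hash name %CodonTable follows the snippets elsewhere on this page, while @bases, $amino_acids and $index are illustrative names only.

```perl
use strict;
use warnings;

# Assumption: the standard genetic code; other species may need another table.
# The amino acid letters are listed in t, c, a, g order for each codon position,
# with "*" standing for a stop codon.
my @bases       = qw(t c a g);
my $amino_acids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';

my %CodonTable;
my $index = 0;
for my $first (@bases) {
    for my $second (@bases) {
        for my $third (@bases) {
            # Pair each codon with the letter at the matching position.
            $CodonTable{"$first$second$third"} = substr($amino_acids, $index++, 1);
        }
    }
}

print "$CodonTable{'ctt'}\n";            # prints "L" (leucine)
print scalar(keys %CodonTable), "\n";    # prints "64"
```

Writing the 64 key/value pairs out literally, as a textbook codon table would suggest, gives the same hash and may be easier to check by eye.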
  
  
Line 65: Line 65:
  
  
-easily assign amino acid from given codon which composed of three sets of nucleotide
+easily assign amino acid from given codon composed of three sets of nucleotides


-Next is to cut sequence into pieces of three nucleotides long by using for statement.
+Next is to cut sequence into pieces of three nucleotides long by using the "for" statement.
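
The cutting step might look like the following sketch, which is one possible shape and not necessarily the loop used on the page. To stay short it uses an abbreviated %CodonTable holding only the four codons of the “atgcttctggtg” example; a real table needs all 64 entries. The C-style loop and the "X" placeholder for unknown codons are assumptions made for illustration.

```perl
use strict;
use warnings;

# Abbreviated table for the example only; a complete table has 64 entries.
my %CodonTable = (atg => 'M', ctt => 'L', ctg => 'L', gtg => 'V');

sub trans_codon {
    my ($seq) = @_;
    $seq = lc $seq;
    my $amino = '';
    # Advance three nucleotides at a time; a trailing fragment
    # shorter than one codon is ignored.
    for (my $i = 0; $i + 3 <= length($seq); $i += 3) {
        my $codon = substr($seq, $i, 3);
        # Assumption: codons missing from the table are reported as "X".
        $amino .= exists $CodonTable{$codon} ? $CodonTable{$codon} : 'X';
    }
    return $amino;
}

my $amino_acid = trans_codon('atgcttctggtg');
print "$amino_acid\n";    # prints "MLLV"
```

Keeping $codon as a separate variable makes each step visible, at the cost of an extra line inside the loop.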
  
  
Line 90: Line 90:
   }

-===== Problem 2-1 =====
+===== Practice 2-1 =====


-The translator is ready so let’s make a script that loads a genome and switch codon into an amino acid by following process
+The translator is ready, so let’s make a script that loads a nucleotide sequence and switch codons into amino acids by the following processes
-  - Load target DNA sequence by the script made in the Problem 1 and assign into variable $seq
+  - Load target DNA sequence by the script made in the Practice 1 and assign to variable $seq
   - Translate $seq into amino acid by function trans_codon()
   - Print the result!!
Line 100: Line 100:
  
  
-===== Refine Problem 2-1 =====
+===== Refine Practice 2-1 =====
-Here, we will provide some advanced Perl technique to refine Problems 2-0 and 2-1.
+Here, we will provide some advanced Perl techniques to refine Practices 2-0 and 2-1.

-There were few lines of for statement in the process of subroutine trans_codon which is highly redundant. Users can rewrite and combine these processes into a single line like in following way by saving up variables.
+There were few lines of "for" statement in the process of subroutine trans_codon which is highly redundant. Users can rewrite and combine these processes into a single line as follows by avoiding the use of variables.

   $amino .= $CodonTable{substr($seq, $i, 3)};

-For statement also can be rewritten as following.
+"for" statement also can be rewritten as the following.

   for(?????){ $amino .= $CodonTable{substr($seq, $i, 3)};}
   or
-  $amino .= $CodonTable{substr($seq, $i, 3)} for ?????;
+  $amino .= $CodonTable{substr($seq, $i, 3)} for (?????);
problem_2a.txt · Last modified: 2014/01/18 07:44 (external edit)