G::Seq Operon
Summary
G::Seq::Operon - Perl extension for adding RegulonDB operon information to genome data
Package variables
No package variables defined.
Included modules
G::Messenger
LWP::Simple
SelfLoader
SubOpt
Inherit
Exporter
Synopsis
  use G::Seq::Operon;
  set_operon($gb);
Description
G::Seq::Operon provides set_operon, which retrieves operon information from
RegulonDB and adds it to the given genome data. It currently works only for
E. coli.
Methods
set_operon
Methods description
set_operon
  Name: set_operon   -   set operon information from RegulonDB

  Description:
    This method retrieves operon information from RegulonDB and adds it to the
    given genome data. Note: this method currently works only for E. coli.

    Two attributes are added to each CDS hash.
        $genome->{$cds}->{operon}
    contains the name of the operon to which the gene belongs, and 
        $genome->{$cds}->{operonN}
    contains the rank order of the gene within the operon.

  Usage:
    set_operon($gb);

  Options:
    None.

  References:
   1. Salgado H et al. (2006) "RegulonDB (version 5.0): Escherichia coli K-12
      transcriptional regulatory network, operon organization, and growth
      conditions", Nucleic Acids Res. 34(Database issue):D394-7

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)

  History:
    20070829-01 patched to match the latest version of the RegulonDB format
                (patch by Hiroyuki Nakamura <t04632hn@sfc.keio.ac.jp>)
    20061003-01 updated to use data from RegulonDB
    20020207-01 initial posting
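
The two attributes described above can be read back like any other CDS field once set_operon has run. The following is a minimal sketch that uses a plain hash in place of a real G-language genome object; the feature IDs, gene names, operon name, and the helper genes_by_operon are all made up for illustration and are not part of G::Seq::Operon.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a genome object after set_operon() has run.
# A real object is built by the G-language loader; these IDs and values are made up.
my $genome = {
    CDS1 => { gene => 'geneA', operon => 'opX', operonN => 1 },
    CDS2 => { gene => 'geneB', operon => 'opX', operonN => 2 },
    CDS3 => { gene => 'geneC', operonN => 0 },    # not assigned to any operon
};

# Group feature IDs by operon name, ordered by their rank within the operon.
# Features with operonN == 0 are skipped.
sub genes_by_operon {
    my ($g) = @_;
    my %members;
    for my $cds (sort keys %$g) {
        next unless $g->{$cds}{operonN};
        push @{ $members{ $g->{$cds}{operon} } }, $cds;
    }
    for my $op (keys %members) {
        @{ $members{$op} } =
            sort { $g->{$a}{operonN} <=> $g->{$b}{operonN} } @{ $members{$op} };
    }
    return \%members;
}

my $members = genes_by_operon($genome);
print "$_: @{ $members->{$_} }\n" for sort keys %$members;
```

With a real genome object, the loop would presumably iterate over $gb->cds() rather than over all hash keys, since the object also holds non-CDS entries such as LOCUS.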
Methods code
set_operon
sub set_operon {
    my @args = opt_get(@_);
    my $gb = shift @args;

    if ($gb->{LOCUS}->{id} eq 'U00096' || $gb->{LOCUS}->{id} eq 'NC_000913'){

	my $url = "http://regulondb.ccg.unam.mx:80/data/OperonSet.txt";
	my $dir = $ENV{HOME} . '/.glang/data/OperonSet.txt';
	mirror($url, $dir);
	die("set_operon: cannot retrieve data from RegulonDB.") unless(-e $dir);

	my $flag = 0;
	open(FILE, $dir) || die($!);
	while (<FILE>) {
	    chomp;

	    if (/^Columns\:/) {
	      $flag++;
	      next;
	    }
	    elsif(/^\t\(\d\)\s/) {
	      $flag++;
	      next;
	    }

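	    # $flag reaches 5 once the "Columns:" line and the four column
	    # descriptions have been read; the lines after that are data rows.
	    if($flag == 5){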

		my %geneOrder;

		my ($operon, $num, $direction, $genes) = split(/\t/, $_, 4);
		next unless($num >= 2);

		foreach my $genepair (split(/,/, $genes)){
		    my ($gene, $locustag) = split(/\|/, $genepair, 2);
		    my $cds = $gb->gene2id($locustag);

		    $cds = $gb->gene2id($gene) unless(length $cds);

		    if($cds){
			$gb->{$cds}->{operon} = $operon;
			$geneOrder{$cds} = $gb->{$cds}->{start};
		    }
		}

		my $i = 1;
		if($direction eq 'forward'){
		    foreach my $cds (sort {$geneOrder{$a} <=> $geneOrder{$b}} keys %geneOrder){
			$gb->{$cds}->{operonN} = $i;
			$i ++;
		    }
		}else{
		    foreach my $cds (sort {$geneOrder{$b} <=> $geneOrder{$a}} keys %geneOrder){
			$gb->{$cds}->{operonN} = $i;
			$i ++;
		    }
		}

	    }else{
		msg_error($_, "\n");
	    }
	}
	close(FILE);

	foreach my $cds ($gb->cds()){
	    $gb->{$cds}->{operonN} = 0 unless(length $gb->{$cds}->{operon});
	}

    }else{
	msg_error("No Operon data for this species.\n\n");
    }

    return 1;
}
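
The per-line parsing and ranking step in the code above can be tried in isolation. The sketch below is not part of G::Seq::Operon: rank_operon_genes and the %start_of lookup are hypothetical stand-ins for the gene2id() calls and the {start} fields of the genome object, the sample line is made up rather than taken from RegulonDB, and only the locus tag is used for lookup (the real method falls back to the gene name).

```perl
use strict;
use warnings;

# Parse one data line of the form split by set_operon:
#   operon <TAB> gene count <TAB> direction <TAB> name|locustag,name|locustag,...
# $start_of is a hypothetical hash ref mapping locus tag to start coordinate.
sub rank_operon_genes {
    my ($line, $start_of) = @_;
    my ($operon, $num, $direction, $genes) = split /\t/, $line, 4;
    return unless defined $genes && $num >= 2;    # skip single-gene operons

    my %start;
    for my $pair (split /,/, $genes) {
        my ($gene, $locustag) = split /\|/, $pair, 2;
        next unless defined $locustag && exists $start_of->{$locustag};
        $start{$locustag} = $start_of->{$locustag};
    }

    # Forward operons are ranked by ascending start coordinate;
    # any other direction uses the reverse of that order.
    my @ordered = sort { $start{$a} <=> $start{$b} } keys %start;
    @ordered = reverse @ordered unless $direction eq 'forward';

    my %rank;
    my $i = 1;
    $rank{$_} = $i++ for @ordered;
    return ($operon, \%rank);
}

# Made-up sample data for illustration only.
my %start_of = (t0001 => 100, t0002 => 900, t0003 => 500);
my $line = join "\t", 'opX', 3, 'reverse', 'geneA|t0001,geneB|t0002,geneC|t0003';

my ($operon, $rank) = rank_operon_genes($line, \%start_of);
print "$operon: $_ => $rank->{$_}\n"
    for sort { $rank->{$a} <=> $rank->{$b} } keys %$rank;
```

Keeping the ranking in a separate function makes the strand-dependent ordering easy to check without downloading OperonSet.txt.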
General documentation
AUTHOR
Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
SEE ALSO
perl(1).