How to fail your deadline? – Starting learning MySQL thinking you can master it in one day!


Recently, I touched upon MySQL because I wanted it to be the backend for my upcoming Django-based data browser. After 4 days of prolonged fiddling I finally managed to make some notes on its functionality. It is very fast but, equally importantly, very different from Python! The other lesson is that one should allocate enough time for learning when starting a new language (depending on [single linkage distance](https://en.wikipedia.org/wiki/Single-linkage_clustering)).

Today’s blog will go through:
1. Loading a table into MySQL, be it fixed-width or regularly delimited.
2. Creating relations between entries.
3. Dealing with errors, and fixing your relationships:

Continue reading “How to fail your deadline? – Starting learning MySQL thinking you can master it in one day!”

How to make RasMol work on Mac OS

Due to some unhappy events I had to switch to a MacBook recently. This led to a very interesting adventure in the new world of Mac OS.

The main goal was to have RasMol and rasscripts set up in a way that will open the DomChop .rasscript file with one click.

Installing RasMol itself required some knowledge of the command line; however, since Mac OS is Unix-based, it was not too hard.

First I had to download the RasWin binaries from the official website.

Continue reading “How to make RasMol work on Mac OS”

Quick guide to collaborative development of scientific code

As summer students in the Orengo group we will be working together on a set of projects related to the CATH database and collaboration is key to ensuring the success of our projects.

The first step towards good collaboration is defining a set of rules that everyone should adhere to. This could facilitate the process of code review and minimize the time spent writing code by making the code reusable.

Write code for people, not just for computers.

Continue reading “Quick guide to collaborative development of scientific code”

Python modules are fun, once you understand it



#### Brief:
Having the privilege of joining Orengo’s group as a summer student, I feel urged to improve my coding as well as documentation skills. Thus I intend to log my coding and thinking in a series of blog posts, on a weekly basis.

Today’s blog post will cover some basics of organising Python code, in the hope of easing future maintenance. We will go through:

Continue reading “Python modules are fun, once you understand it”

Varnishing all my troubles away

TL;DR

  • Varnish routes incoming web traffic to a port on backend server
  • Page shows: “Error 503 Backend fetch failed”
  • Problem was SELinux setup
  • Actually investigate/fix SELinux issue rather than turning it off

Varnish?

For our research work at UCL, we host a bunch of different web sites, web services and applications that run on a bunch of different ports on a bunch of different backend machines (and virtual machines). All requests arrive on a single IP, and we use varnish to sit on the frontline (port 80) and make sense of the incoming traffic.

Varnish is a web accelerator – it sits in front of whatever is actually generating the content for your web pages and caches whatever content it deems safe to cache. The next time someone requests that same page, the content is served from the cache (fast) rather than going off and generating content from the backend (slow). So it speeds up your web pages and generally reduces load on your backend databases and applications.

This is all great, but varnish also provides a really simple and flexible tool for routing traffic to different backends, which is actually the point of this post.

What’s the problem?

I eventually managed to get round to moving our frontline varnish server from a decaying machine running CentOS 4(!) to a brand new VM running CentOS 7. This allowed varnish to be upgraded from v2.0 to version 4.1 which required a few minor adjustments, but nothing too crazy.

I did get stuck with one app that wasn’t working – the following varnish config was meant to direct traffic through to a backend application listening to port 5001 on a backend server.

vcl 4.0;
backend myapp_server_5001 {
  .host = "123.456.789.123";
  .port = "5001";
}

sub vcl_recv {
  if ( req.http.host == "myapp.domain.com" ) {
    set req.backend_hint = myapp_server_5001;
    return (pass);
  }
}

Checking the web page “myapp.domain.com” just gave me the standard Varnish error:

Error 503 Backend fetch failed

It looked like varnish couldn’t contact the backend server; however, I could sit on the same server that varnish was running from and access the web page just fine.

$ ssh varnishserver
$ wget http://myapp_server:5001/
Connecting to 123.456.678.123:5001... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5294 (5.2K) [text/html]
Saving to: ‘index.html’

So…

  • The application was running on the backend server
  • I could retrieve the content from the varnish server directly (wget)
  • I couldn’t retrieve this content through varnish
  • Lots of other varnish rerouting was working

GIYF

Googling around suggested that the problem might be security settings in SELinux, which took me to a nice blog post about how to get varnish to play nicely with SELinux.

In my experience, SELinux generates an incredibly strong SEP (Somebody Else’s Problem) field: my general practice has been to switch SELinux to permissive mode and rely on our main firewall (SEP) to deal with security issues. This isn’t as terrible as it sounds (our IT team were okay with it), but it’s not great.

With this being a genuinely front-facing server, I figured I should actually do the right thing and learn how to get SELinux working properly. Turns out it really wasn’t that hard.

Is my problem related to SELinux?

Easiest way to find out:

$ ssh varnishserver
$ sudo grep varnish /var/log/audit/audit.log

This showed a bunch of output like:

type=AVC msg=audit(1478175339.950:37802): avc: denied { name_connect } for pid=9111 comm="varnishd" dest=5001 scontext=system_u:system_r:varnishd_t:s0 tcontext=system_u:object_r:commplex_link_port_t:s0 tclass=tcp_socket

So, yes – my problem does seem to be related to SELinux.
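
If the raw AVC line is hard to read, a few lines of Perl can pull out the parts that matter. This is only a rough sketch: the sub name `parse_avc_denial` is made up for illustration, the regular expressions are written against the single example line shown above, and other audit records may well be laid out differently.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Pull the interesting fields out of one SELinux AVC denial line.
# Returns a hash reference, or undef if the line is not a denial.
# (hypothetical helper, written against the example line in this post)
sub parse_avc_denial {
    my ($line) = @_;

    # the denied permission sits between the braces: "denied { name_connect }"
    return unless $line =~ /avc:\s+denied\s+\{\s*([^}]*?)\s*\}/;
    my %info = ( denied => $1 );

    # everything else is key=value, with some values wrapped in double quotes
    while ( $line =~ /(\w+)=("[^"]*"|\S+)/g ) {
        my ( $key, $value ) = ( $1, $2 );
        $value =~ s/^"|"$//g;    # strip the surrounding quotes, if any
        $info{$key} = $value;
    }

    return \%info;
}

my $example_line = 'type=AVC msg=audit(1478175339.950:37802): avc: denied { name_connect } for pid=9111 comm="varnishd" dest=5001 scontext=system_u:system_r:varnishd_t:s0 tcontext=system_u:object_r:commplex_link_port_t:s0 tclass=tcp_socket';

my $avc = parse_avc_denial($example_line);

printf "%s was denied '%s' on port %s (%s)\n",
    $avc->{comm}, $avc->{denied}, $avc->{dest}, $avc->{tclass};
```

For the example line this reports that varnishd was denied 'name_connect' on port 5001, which is exactly the backend port from the varnish config.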

How do I fix my SELinux problem (without just turning the whole thing off)?

Turns out the clever people on the interwebz have written a tool audit2allow to help troubleshoot this kind of thing. This can be installed through the setroubleshoot package (which kind of makes sense).

$ sudo yum install setroubleshoot

This tool can be used to translate the output of the audit log to a more useful message:

$ sudo grep varnishd /var/log/audit/audit.log | audit2allow -w

Which provides messages like:

type=AVC msg=audit(1478177584.127:38275): avc: denied { name_connect } for pid=9118 comm="varnishd" dest=5001 scontext=system_u:system_r:varnishd_t:s0 tcontext=system_u:object_r:commplex_link_port_t:s0 tclass=tcp_socket
 Was caused by:
 The boolean varnishd_connect_any was set incorrectly. 
 Description:
 Allow varnishd to connect any

Allow access by executing:
 # setsebool -P varnishd_connect_any 1

Now of course I read up on exactly what this command will do before executing it (no, really).

$ sudo setsebool -P varnishd_connect_any 1
$ sudo systemctl restart varnish

Sorted.

Now I just need to add all this to the puppet configuration…

The Minimal Cancer Network

A metaphor is a figure of speech that describes a subject by comparing it to another otherwise unrelated object. In addition to their use as a figure of style in speech and writing, metaphors are very useful to help us understand complex subject matters.

What does a metaphor do? Basically we try to explain something by evoking familiar images of other things; sometimes we even invent these images of simple familiar things, in order to help ourselves grasp some meaning of complex, difficult material we need to deal with. I believe that the meaning of certain diverse, and complex concepts can be grasped only with the help of metaphors.

Cancer is a set of very similar yet different complex diseases and a complex biological system. War and battle metaphors stand out when we look at cancer, the diseases. These metaphors have been extensively discussed elsewhere. On the one hand, these metaphors should help patients to confront the physical and psychological ordeals they need to go through. On the other hand, it is the aim of researchers to achieve an understanding of the system deep enough to assist them in devising strategies to cure and prevent the diseases summarised as cancer.

When researchers try to understand and describe cancer, they have to deal with a challenge of communication: instead of one description that gives a specific understanding of cancer, we end up with a collection of metaphors related to cancer. In a recent BioEssays paper, Solé et al. review the many-sided views of cancer going from ecological systems to swarms. They focus on their own perspective of a cancer cell: a molecular network operating at an optimal instability level. Molecular networks are the Ace of metaphors in describing biological complex systems; their fundamental assumption is that one can describe the biological system, i.e. the cell, in terms of its molecular components and the relationships between the components of the network. Networks show emergent properties that are correlated with the behaviour of the cell.

The genetic networks in cancer cells considered by Solé et al. are reduced versions of the genetic networks in healthy cells. Cancer gets rid of many network components which keep healthy cells living and working together “giving place to minimal set of intracellular components able to operate in a robust manner under noisy conditions”. In this sense, cancer cells reduce their complexity and become individuals in competition for rapid population growth rather than parts of a team working together: “cancer cells [revert] to unicellular selfishness, as a major transition from a cohesive system to individuality”. However, this involves a loss of stability that would threaten the cancer cell’s survival. Thus, the population of cancer cells thrives by getting close to an optimal instability level in which cell proliferation is maximised but not enough to trigger the mechanisms of cell degradation and cancer death.

What is the benefit of the idea posed by Solé et al. of a minimal cancer network devoted to self-replication? It offers several testable hypotheses that would define a robust and useful metaphor of cancer, the system. One that could guide us through the path to reduce the devastation caused by cancer, the diseases. However, it is important to keep in mind that the “minimal cancer network” is just a metaphor (although a very useful one) that does not provide a complete understanding of cancer.

Solé RV et al. Can a minimal replicating construct be identified as the embodiment of cancer? BioEssays 36: 503–12 2014 DOI: 10.1002/bies.201300098

Perl for Bioinformatics: Day 2 – querying a database (DBI)

So in Day 1, we learned how to use Perl to parse a file. Today we are going to learn how to extract information from a database.

A database is an organised collection of data. Since lots of Bioinformatics resources store their data in a database, it’s pretty useful to find out early on how to go about using them.

There are lots of different types of databases (e.g. MySQL, PostgreSQL, Oracle) and each of them has slight differences in the way that you interact with them. To make life easier, the good people of Perl have written a library called DBI that provides a common way of accessing them (feel free to have a good look around the DBI documentation on CPAN and come back when you’re ready).

Accessing a database with DBI

The following script provides a very simple example of how you might go about using the DBI library to extract data from your database. We are extracting OMIM data from one of our local Oracle databases, but you should be able to see how it can be extended to your own situation.

Note: you’ll need to ask your database administrator for suitable values to replace ‘?????’

#!/usr/bin/env perl

use strict;
use warnings;

use DBI;

# information that we need to specify to connect to the database
my $dsn         = "dbi:Oracle:host=?????;sid=?????";  # what type of database (Oracle) and where to find it (sinatra)
my $db_username = "?????";                            # we connect as a particular user
my $db_password = "?????";                            # with a password

# connect to the database
my $gene3d_dbh = DBI->connect( $dsn, $db_username, $db_password )
	or die "! Error: failed to connect to database: $DBI::errstr";

# this is the query that will get us the data
my $omim_sql = <<"_SQL_";
SELECT
	OMIM_ID, UNIPROT_ACC, RESIDUE_POSITION, NATIVE_AA, MUTANT_AA, VALID, DESCRIPTION, NATIVE_AA_SHORT
FROM
	gene3d_12.omim
WHERE
	valid = 't'
_SQL_

# prepare the SQL (returns a "statement handle")
my $omim_sth = $gene3d_dbh->prepare( $omim_sql )
	or die "! Error: encountered an error when preparing SQL statement:\n"
		. "ERROR: " . $gene3d_dbh->errstr . "\n"
		. "SQL:   " . $omim_sql . "\n";

# execute the SQL
$omim_sth->execute
	or die "! Error: encountered an error when executing SQL statement:\n"
		. "ERROR: " . $omim_sth->errstr . "\n"
		. "SQL:   " . $omim_sql . "\n";

# go through each row
while ( my $omim_row = $omim_sth->fetchrow_hashref ) {
	printf "%-10s %-10s %-10s %-10s %-10s %s\n",
		$omim_row->{OMIM_ID},
		$omim_row->{UNIPROT_ACC},
		$omim_row->{RESIDUE_POSITION},
		$omim_row->{MUTANT_AA},
		$omim_row->{NATIVE_AA},
		$omim_row->{DESCRIPTION}
		;
}

This prints out:

100650     P05091     504        LYS        GLU        ALCOHOL SENSITIVITY - ACUTE ALCOHOL DEPENDENCE - PROTECTION AGAINST - INCLUDED;; HANGOVER - SUSCEPTIBILITY TO - INCLUDED;; SUBLINGUAL NITROGLYCERIN - SUSCEPTIBILITY TO POOR RESPONSE TO - INCLUDED;; ESOPHAGEAL CANCER - ALCOHOL-RELATED - SUSCEPTIBILITY TO - INCLUDED ALDH2 - GLU504LYS (dbSNP rs671)
100690     P02708     262        LYS        ASN        MYASTHENIC SYNDROME - CONGENITAL - SLOW-CHANNEL CHRNA1 - ASN217LYS
100690     P02708     201        MET        VAL        MYASTHENIC SYNDROME - CONGENITAL - SLOW-CHANNEL CHRNA1 - VAL156MET
...
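
Each row that `fetchrow_hashref` hands back is a reference to a hash keyed by column name, so it can be tried out without a database at all. The sketch below uses hand-written stand-ins for two of the rows printed above (the values are copied from that output, with native and mutant residues assigned according to the order of the `printf` arguments in the script); the sub `format_mutation` is made up for illustration.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# hand-written stand-ins for the hash references DBI would return
# (no database involved; values copied from the output above)
my @rows = (
    { OMIM_ID => 100650, UNIPROT_ACC => 'P05091', RESIDUE_POSITION => 504, NATIVE_AA => 'GLU', MUTANT_AA => 'LYS' },
    { OMIM_ID => 100690, UNIPROT_ACC => 'P02708', RESIDUE_POSITION => 262, NATIVE_AA => 'ASN', MUTANT_AA => 'LYS' },
);

# build a short "native-position-mutant" label from one row
# (hypothetical helper, not part of DBI)
sub format_mutation {
    my ($row) = @_;
    return sprintf "%s: %s%d%s",
        $row->{UNIPROT_ACC},
        $row->{NATIVE_AA},
        $row->{RESIDUE_POSITION},
        $row->{MUTANT_AA};
}

# the arrow dereferences the hash reference, just as in the loop above
print format_mutation($_), "\n" for @rows;
```

The first line printed is `P05091: GLU504LYS`. Swapping `@rows` for the `while ( my $omim_row = $omim_sth->fetchrow_hashref )` loop should give the same kind of output against a real database.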

Improvements

The first thing to notice was that this was quite a lot of typing: writing out the SQL, setting up database handles/statement handles, checking return values, printing out decent error messages, etc. Lots of typing means lots of code to maintain and far more chance of repeating yourself (which you really shouldn’t be doing).

When faced with the prospect of lots of typing, any decent (i.e. lazy) programmer will be instantly thinking about how they can avoid it: what shortcuts they can make, what libraries they can reuse. As luck would have it the good people of Perl have already thought of this and come up with DBIx::Class which will be the basis of a future post.

Discussion

There is a lot of value in understanding how raw DBI works. However, when you start writing and maintaining your own code, there is a huge amount of value in using a library (such as DBIx::Class) that builds on DBI and helps to keep you away from interacting with DBI directly.

Perl for Bioinformatics: Day 1 – parsing a file

You don’t have to hang around too long in a Bioinformatics lab before someone asks you to parse data from a <insert your favourite data format here> file. Since we’ve just had some people join the lab who are new to coding – parsing a file seemed a good place to start.

The following is intended as a “Day 1” introduction to a typical Bioinformatics task in Perl.

Caveats

Some things to take into account before we start:

  1. It’s very likely that somebody, somewhere has already written a parser for your favourite data format. It’s also likely that they’ve already gone through the pain of dealing with edge cases that you aren’t aware of. You should really consider using their code or at least looking at how it works. If you’re writing in Perl (and in this case, we are) then you should have a rummage around CPAN (http://www.cpan.org) and BioPerl (http://www.bioperl.org).
  2. The following script is not intended as an example of “best practice” code – the intention here is to keep things simple and readable.

Getting the data

Okay so it’s our first day and we’ve just been asked to do the following:

Parse “genemap” data from OMIM

Err.. genemap? OMIM? If in doubt, the answer is nearly always the same: Google Is Your Friend.

Googling “download OMIM” gets us what we want. Now we just have to read the instructions, follow the instructions, fill in the forms, point our web browser at the link that gets sent in an email, and download the data.

If you get stuck, don’t be afraid to ask – either the person sitting next to you or by emailing the “contact” section of the website you’re using. However, also remember that you are here to do research – and a lot of that comes down to rummaging around, trying to figure stuff out for yourself.

It’s really useful to keep things tidy so we’re going to create a local directory for this project by typing the following into a terminal (note: lines that start with ‘#’ are comments, stuff that comes after the ‘>’ are linux commands).

# go to my home directory
> cd ~/

# create a directory that we're going to work from
> mkdir omim_project

# move into this new directory
> cd omim_project

# create a directory for the data 
# note: the date we downloaded the data will definitely be useful to know
> mkdir omim_data.2014_09_16

# look for the files we've just downloaded
> ls -rt ~/Downloads

# copy the ones we want into our data directory
> cp ~/Downloads/genemap ./omim_data.2014_09_16

Step 1. Setting up the script

Now we can write our first Perl script which is going to parse this file – i.e. extract the data from the text file, organise the data into a meaningful structure, output the information we need.

There are loads of different text editors you can use – I’m assuming you have access to ‘kate’.

# open up 'kate' with a new file for our script called 'parse_genemap.pl'
> kate parse_genemap.pl

Here’s the first bit of code – we’ll go through it line by line.

#!/usr/bin/env perl

use strict;
use warnings;

use File::Basename qw/ basename /;

# 'basename' is imported from File::Basename
my $PROGNAME = basename( $0 );

my $USAGE =<<"_USAGE";
usage: $PROGNAME <genemap_file>

Parses OMIM "genemap" file

_USAGE

my $genemap_filename = shift @ARGV or die "$USAGE";

Line 1 (called ‘hashbang’) tells the linux terminal that we want this file to be run as a Perl script.

#!/usr/bin/env perl

The next commands make sure that we find out straight away if we’ve made any mistakes in our code. It’s generally a good thing for our programs to “die early and loudly” as soon as a problem happens. This makes debugging much easier when things get more complicated.

use strict;
use warnings;

The following command imports a function ‘basename’ that we’ll use to get the name of the current script.

use File::Basename qw/ basename /;

Note: you can find out lots more about what a module does by entering the following into a terminal:

perldoc File::Basename

Perl stores lots of useful information in special variables. To get the path of the script we are currently running (as it was invoked), we can use ‘$0’.

This is what Perl’s documentation pages have to say about it:

$0
Contains the name of the program being executed.

Feeding this into ‘basename’ will take the directory path off the script and just leave us with the script name (i.e. ‘parse_genemap.pl’). This is handy when we want to provide a simple note on how this script should be run.

# 'basename' is imported from File::Basename
my $PROGNAME = basename( $0 );

my $USAGE =<<"_USAGE";
usage: $PROGNAME <genemap_file>

Parses OMIM "genemap" file

_USAGE
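
To see what ‘basename’ actually does without having to run the script from different places, here is a tiny sketch. The path is made up and simply stands in for whatever ‘$0’ happens to hold; ‘dirname’ is a second function from the same module, shown only for contrast.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use File::Basename qw/ basename dirname /;

# made-up path, standing in for whatever $0 holds when the script runs
my $script_path = '/home/me/omim_project/parse_genemap.pl';

print "basename: ", basename( $script_path ), "\n";   # parse_genemap.pl
print "dirname:  ", dirname( $script_path ),  "\n";   # /home/me/omim_project
```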

Step 2. Gather data from the command line

We’ve set this program up to take a single argument on the command line which will be the location of the ‘genemap’ file to parse. This gives us some flexibility if we want to parse different genemap files, or if the genemap files are likely to move around in the file system.

The arguments on the command line are stored in another special variable called ‘@ARGV’. The ‘@’ symbol means this is an array (or set of values) rather than a single value. We’ll use the built-in function ‘shift’ to get the first command line argument from that list.

my $genemap_filename = shift @ARGV or die "$USAGE";

If the list is empty then it means we’ve run the script without any arguments. If this happens we want to end the program with a useful message on what the script is doing and how it should be run.
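
One detail worth knowing about the ‘shift ... or die’ idiom: ‘or’ tests whether the value is true, so an argument of ‘0’ (or an empty string) would also trigger the usage message. That is unlikely to matter for a filename, but the sketch below shows a slightly stricter variant that only complains when the argument is genuinely missing. The sub name ‘first_argument’ and the error text are made up for illustration.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# take the first value off a list of arguments (passed as an array reference)
# and only complain if there was nothing there at all
# (hypothetical helper, not part of the script above)
sub first_argument {
    my ($args) = @_;
    my $first = shift @$args;    # removes and returns the first element
    die "missing argument: genemap file\n" unless defined $first;
    return $first;
}

my @args = ( 'genemap', 'extra' );
my $file = first_argument( \@args );

print "first argument: $file\n";                  # genemap
print "arguments left: ", scalar( @args ), "\n";  # 1
```

In the real script the same sub would be called as ‘first_argument( \@ARGV )’.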

Step 3. Reading the data

The following creates a “file handle” that can be used for reading and writing to a file. There are lots of ways of creating file handles in Perl (I suggest looking at ‘Path::Class’).

# create a file handle that we can use to input the contents of
# the genemap file
# (and complain if there's a problem)
# note: '<' means "input from this file" in linux shells

open( my $genemap_fh, '<', $genemap_filename )
or die "! Error: failed to open file $genemap_filename: $!";

Again, if there’s a problem (e.g. the file we are given doesn’t exist) then we want to know about it straight away with a sensible error message.

Now we are going to read the file line-by-line and create a data structure for each row. Most of the following code is just made up of comments.


# create an array that will contain our genemap entries
my @genemap_entries;

# go through the file line by line
while( my $line = $genemap_fh->getline ) {

  # an example line from file 'genemap' looks like:
  # 1.1|5|13|13|1pter-p36.13|CTRCT8, CCV|P|Cataract, congenital, Volkmann type||115665|Fd|linked to Rh in Scottish family||Cataract 8, multiple types (2)| | ||

  # the keys for each column are specified in 'genemap.key':
  # 1  - Numbering system, in the format  Chromosome.Map_Entry_Number
  # 2  - Month entered
  # 3  - Day     "
  # 4  - Year    "
  # 5  - Cytogenetic location
  # 6  - Gene Symbol(s)
  # 7  - Gene Status (see below for codes)
  # 8  - Title
  # 9  - Title, cont.
  # 10 - MIM Number
  # 11 - Method (see below for codes)
  # 12 - Comments
  # 13 - Comments, cont.
  # 14 - Disorders (each disorder is followed by its MIM number, if
  #      different from that of the locus, and phenotype mapping method (see
  #      below).  Allelic disorders are separated by a semi-colon.
  # 15 - Disorders, cont.
  # 16 - Disorders, cont.
  # 17 - Mouse correlate
  # 18 - Reference

  # split up the line based on the '|' character
  # note: we use '\|' since writing '|' on its own has a special meaning
  my @cols = split /\|/, $line;

  # create a HASH / associative array to provide labels for these values
  # note: arrays start from '0' so we take one away from the columns mentioned above
  my %genemap_entry = (
    id                   => $cols[0],
    month_entered        => $cols[1],
    day_entered          => $cols[2],
    year_entered         => $cols[3],
    date_entered         => "$cols[2]-$cols[1]-$cols[3]",   # "Day-Month-Year"
    cytogenetic_location => $cols[4],   # column 5 in 'genemap.key'
    gene_symbol          => $cols[5],   # column 6 in 'genemap.key'
    # add more labels for the rest of the columns
  );

  # put a *reference* to this HASH onto our growing array of entries
  push @genemap_entries, \%genemap_entry;
}

It’s really important to add useful comments into your code. Not just what you are doing, but why you are doing it. In a few months’ time, you won’t remember any of this and if you don’t put these comments in, you’ll need to figure it out all over again.

Step 4. Process the data

Usually we would want to do something interesting with the data – such as filter out certain rows, sort these entries, etc. This would be a good place to do it, but we’ll save that for a different day.

Step 5. Output the data

We’re going to check that everything has gone okay by simply printing out the entries that we’ve parsed from the file. Again, the code has lots of comments so I won’t go through it line by line.

# note: the following section is going to print out the following:
#
#   1.1    13-5-13          CTRCT8, CCV
#   1.2    25-9-01      ENO1, PPH, MPB1
#   1.3   22-12-87          ERPL1, HLM2
#   ...        ...                  ...
# 24.51    25-8-98        GCY, TSY, STA
# 24.52    20-3-08                DFNY1
# 24.53     8-2-01                  RPY
#
# Number of Genemap Entries: 15037
#

# go through these entries one by one...
foreach my $gm_entry ( @genemap_entries ) {
  # we can use the keys that we defined when creating the HASH
  # to access the values for each entry in a meaningful way
  # note: $gm_entry is a HASH *reference*
  #       to access the data in the HASH: $gm_entry->
  printf "%5s %10s %20s\n", $gm_entry->{ id }, $gm_entry->{ date_entered }, $gm_entry->{ gene_symbol };
}

print "\n"; # new line
print "Number of Genemap Entries: ", scalar( @genemap_entries ), "\n";
print "\n";

All done.
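
One detail of the ‘split’ call used when reading the data is worth a quick experiment: by default ‘split’ throws away empty columns at the end of a line, and the newline at the end of each line stays attached unless we ‘chomp’ it. The sketch below uses a shortened, made-up line in the genemap style (not a real OMIM record) to show the difference.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# shortened, made-up line in the genemap style; the last two columns are empty
my $line = "1.1|5|13|13|1pter-p36.13|CTRCT8, CCV|P||\n";

# remove the trailing newline so it can't end up inside the last column
chomp $line;

my @default = split /\|/, $line;        # trailing empty columns are dropped
my @all     = split /\|/, $line, -1;    # a negative limit keeps them

print "default split: ", scalar( @default ), " columns\n";   # 7
print "with limit -1: ", scalar( @all ),     " columns\n";   # 9
```

This doesn’t affect the columns we label in the script (they are all near the start of the line), but it matters as soon as you rely on every row having the same number of columns.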

Here’s the listing of the program in full:

 

#!/usr/bin/env perl

use strict;
use warnings;

use File::Basename qw/ basename /;

# 'basename' is imported from File::Basename
my $PROGNAME = basename( $0 );

my $USAGE =<<"_USAGE";
usage: $PROGNAME <genemap_file>

Parses OMIM "genemap" file

_USAGE

my $genemap_filename = shift @ARGV or die "$USAGE";

# create a file handle that we can use to input the contents of
# the genemap file
# (and complain if there's a problem)
# note: '<' means "input from this file" in linux shells

open( my $genemap_fh, '<', $genemap_filename )
or die "! Error: failed to open file $genemap_filename: $!";

# create an array that will contain our genemap entries
my @genemap_entries;

# go through the file line by line
while( my $line = $genemap_fh->getline ) {

  # an example line from file 'genemap' looks like:
  # 1.1|5|13|13|1pter-p36.13|CTRCT8, CCV|P|Cataract, congenital, Volkmann type||115665|Fd|linked to Rh in Scottish family||Cataract 8, multiple types (2)| | ||

  # the keys for each column are specified in 'genemap.key':
  # 1  - Numbering system, in the format  Chromosome.Map_Entry_Number
  # 2  - Month entered
  # 3  - Day     "
  # 4  - Year    "
  # 5  - Cytogenetic location
  # 6  - Gene Symbol(s)
  # 7  - Gene Status (see below for codes)
  # 8  - Title
  # 9  - Title, cont.
  # 10 - MIM Number
  # 11 - Method (see below for codes)
  # 12 - Comments
  # 13 - Comments, cont.
  # 14 - Disorders (each disorder is followed by its MIM number, if
  #      different from that of the locus, and phenotype mapping method (see
  #      below).  Allelic disorders are separated by a semi-colon.
  # 15 - Disorders, cont.
  # 16 - Disorders, cont.
  # 17 - Mouse correlate
  # 18 - Reference

  # split up the line based on the '|' character
  # note: we use '\|' since writing '|' on its own has a special meaning
  my @cols = split /\|/, $line;

  # create a HASH / associative array to provide labels for these values
  # note: arrays start from '0' so we take one away from the columns mentioned above
  my %genemap_entry = (
    id                   => $cols[0],
    month_entered        => $cols[1],
    day_entered          => $cols[2],
    year_entered         => $cols[3],
    date_entered         => "$cols[2]-$cols[1]-$cols[3]",   # "Day-Month-Year"
    cytogenetic_location => $cols[4],   # column 5 in 'genemap.key'
    gene_symbol          => $cols[5],   # column 6 in 'genemap.key'
    # add more labels for the rest of the columns
  );

  # put a *reference* to this HASH onto our growing array of entries
  push @genemap_entries, \%genemap_entry;
}

# note: the following section is going to print out the following:
#
#   1.1    13-5-13          CTRCT8, CCV
#   1.2    25-9-01      ENO1, PPH, MPB1
#   1.3   22-12-87          ERPL1, HLM2
#   ...        ...                  ...
# 24.51    25-8-98        GCY, TSY, STA
# 24.52    20-3-08                DFNY1
# 24.53     8-2-01                  RPY
#
# Number of Genemap Entries: 15037
#

# go through these entries one by one...
foreach my $gm_entry ( @genemap_entries ) {
  # we can use the keys that we defined when creating the HASH
  # to access the values for each entry in a meaningful way
  # note: $gm_entry is a HASH *reference*
  #       to access the data in the HASH: $gm_entry->
  printf "%5s %10s %20s\n", $gm_entry->{ id }, $gm_entry->{ date_entered }, $gm_entry->{ gene_symbol };
}

# let people know how many entries we've processed
print "\n"; # new line
print "Number of Genemap Entries: ", scalar( @genemap_entries ), "\n";
print "\n";
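
As a taster for the “process the data” step that was skipped above, here is a rough sketch of filtering and sorting entries. It is not part of the original script: the subs ‘filter_by_symbol’ and ‘sort_by_id’ are made up for illustration, and the three entries are hand-written in the same shape as the hashes built by the parser (with ids and gene symbols taken from the expected output).

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# hand-written entries, in the same shape as the hashes built by the parser
my @genemap_entries = (
    { id => '1.3', gene_symbol => 'ERPL1, HLM2' },
    { id => '1.1', gene_symbol => 'CTRCT8, CCV' },
    { id => '1.2', gene_symbol => 'ENO1, PPH, MPB1' },
);

# keep only the entries whose gene symbols match a pattern
sub filter_by_symbol {
    my ( $entries, $pattern ) = @_;
    return grep { $_->{gene_symbol} =~ $pattern } @$entries;
}

# sort by chromosome, then by entry number
# note: the id looks like a decimal number but isn't one ('1.10' comes
#       after '1.9'), so we split it into its two parts before comparing
sub sort_by_id {
    my (@entries) = @_;
    return map  { $_->[0] }
           sort { $a->[1] <=> $b->[1] or $a->[2] <=> $b->[2] }
           map  { my ( $chrom, $num ) = split /\./, $_->{id}; [ $_, $chrom, $num ] }
           @entries;
}

# entries mentioning ENO1
print "$_->{id}\t$_->{gene_symbol}\n" for filter_by_symbol( \@genemap_entries, qr/\bENO1\b/ );

# all entries, in id order: 1.1, 1.2, 1.3
print "$_->{id}\t$_->{gene_symbol}\n" for sort_by_id( @genemap_entries );
```

In the full program these calls would sit between the ‘while’ loop that reads the file and the ‘foreach’ loop that prints the entries.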

Prof Christine Orengo elected as member of EMBO

We are very pleased to announce that Prof Christine Orengo has been elected as a member of the European Molecular Biology Organisation (EMBO). EMBO is an organisation that promotes excellence across all aspects of the life sciences through courses, workshops, conferences and publications.

Prof. Orengo was one of 106 “outstanding researchers in the life sciences” that were elected to be EMBO members in 2014.

EMBO Director, Maria Leptin, spoke about the strategic decision to expand the scope of the membership and encourage collaborations across traditional scientific divides, “Great leaps in scientific progress often arise when fundamental approaches like molecular biology are applied to previously unconsidered or emerging disciplines. Looking forward, we want to ensure that all communities of the life sciences benefit from this type of cross-pollination.”

amazon cloud

There’s a nice new tool for using the Amazon Cloud:

http://star.mit.edu/cluster/index.html

 

The same group also have a PDB viewer:

http://star.mit.edu/biochem/index.html