BigBio Notes: data


Monday, 24 August 2015

Moving Bioinformatics to the Cloud

We constantly see new technologies being developed and released to the public. For researchers in molecular biology, the ones that catch our attention usually come from new laboratory methods or instruments. This post is about a different case: a technological innovation from the computational field that is drawing the attention of molecular biologists, and of bioinformaticians in particular, and that can have a great impact on how we do biological analysis with bioinformatics software.

A few years ago, a cloud startup called dotCloud developed a piece of software called Docker, initially for internal use only. It was so successful that, just two years after Docker was released to the public, the newly renamed Docker company has an estimated worth of $1 bn.

What Is Docker and Why Does It Matter?

Docker can be used in several ways and in different environments. Basically, it provides the user with isolated, containerized software that runs separately from the rest of the host operating system. This is very similar to what a virtual machine does; the difference is that there is no guest operating system. Containers share the kernel of the host, bring their own system libraries, and apply some abstraction layers to the execution of the software inside. In the end, you have an isolated environment with custom software inside that can be shared.

What Does This Have to Do with Bioinformatics?

Imagine that you are a senior researcher, or a recently accepted student, trying to learn how to run an analysis. You are a lab specialist, but computers are not your thing. Now imagine that the software you are trying to run needs a Linux operating system with version 4.9.3 of the gcc compiler and some libraries such as GD. Sounds bad, right? That’s where Docker comes in. Docker allows developers to ship software inside a container, that is, a custom environment with all the tools and configuration needed to run a specific program. All you have to do is download the container and execute the program inside. Running a Docker container is as simple as running a program from the command line.
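To make the last point concrete, here is a minimal sketch in Python of what "running a program inside a container" looks like from a script. It only assembles a `docker run` command line; the image name `example/mytool:1.0`, the tool name `mytool` and the helper names are hypothetical placeholders, not real BioDocker images or tools. The flags used (`--rm`, `-v`, `-w`) are standard `docker run` options.

```python
import shlex
import subprocess
from pathlib import Path


def build_docker_command(image, tool_args, data_dir, mount_point="/data"):
    """Return the argument list for running a tool inside a container.

    `image` and `tool_args` are placeholders chosen by the caller; nothing
    here assumes a particular bioinformatics tool.
    """
    host_dir = Path(data_dir).resolve()  # docker expects an absolute host path
    return [
        "docker", "run",
        "--rm",                             # remove the container when the tool exits
        "-v", f"{host_dir}:{mount_point}",  # make the host data directory visible inside
        "-w", mount_point,                  # start the tool in the mounted directory
        image,
        *tool_args,
    ]


def run_in_container(image, tool_args, data_dir):
    """Run the tool and raise an error if it exits with a non-zero status."""
    command = build_docker_command(image, tool_args, data_dir)
    return subprocess.run(command, check=True)


# Hypothetical image and tool names, shown only to illustrate the command line.
command = build_docker_command("example/mytool:1.0", ["mytool", "--version"], ".")
print(shlex.join(command))
```

The sketch prints the command instead of executing it, so it works on a machine without Docker. With Docker installed and a real image name, `run_in_container` would start the tool with the current directory mounted as `/data`.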

Benefits for Bioinformatics

For a bioinformatician, this brings several other benefits. A topic that is getting attention today is reproducible research in bioinformatics. Computers with different configurations, libraries and software versions can produce different results for the same analysis. If we could turn the environment from a variable into a constant, that problem would be greatly reduced.
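One way to see why the environment behaves like a variable is to write it down. The following is only an illustrative sketch, not part of Docker or BioDocker: the function names `describe_environment` and `fingerprint` are my own, and the list of packages to inspect is an assumption you would adapt to your analysis. It records the interpreter version, the platform and the installed versions of a few packages, and reduces them to a single hash that can be stored next to the results.

```python
import hashlib
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Collect parts of the environment that commonly differ between machines."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed on this machine
    return {
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
        "packages": versions,
    }


def fingerprint(environment):
    """Hash a canonical JSON form so two environments are easy to compare."""
    canonical = json.dumps(environment, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The package names below are examples only; list the ones your analysis uses.
env = describe_environment(["numpy", "biopython"])
print(json.dumps(env, indent=2))
print("fingerprint:", fingerprint(env))
```

Two machines that print different fingerprints are running the analysis in different environments. Inside the same container image, the recorded values should stay the same wherever the container runs, which is the sense in which the environment becomes a constant.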

The BioDocker Project

In 2014, a new project called BioDocker was founded. Recently, the project adopted a community-driven policy: the main idea is to get feedback from the community and to benefit from the expertise of each member. The goal is to provide containerized bioinformatics tools to the general public. For bioinformatics developers, the project also provides specifications, settings and guidelines on how to produce your own BioDocker containers. By defining guidelines like these, we hope that the use of Docker becomes more common, helping people to work more easily with different software and reducing the reproducibility problem in research.

Wrapping up

Docker is a new technology that is quickly gaining ground, and it is slowly gaining ground in bioinformatics as well. It is definitely worth taking some time to learn how to work with it.

Tuesday, 12 November 2013

My List of Most Active Twitter Users in Proteomics

Recently, I published a list of the authors I consider most influential in computational proteomics. The list was created using my PhD references and other resources such as LinkedIn, Twitter and Google Scholar. I will try to do the same here with the most active Twitter accounts that I follow. Twitter can be incredibly powerful for both consuming and contributing to the dialogue in your field, and it can be an excellent real-time source of new publications, fresh developments, and current opinion. If you like and use Twitter, these are some of the accounts I follow in proteomics (in no particular order):

Some Reasons to Rename my Blog as BioCode's Notes

Hi Dear Readers:

I’ve decided that it would be prudent, exposure-wise, to change the name of my professional blog to BioCode's Notes, for a number of reasons:

1. People into bioinformatics comprise a significant part of my (alas, still small) readership. They tend to be always hungry for code tips, language comparisons, and other things that do not fit neatly under the umbrella of “computational proteomics”.

2. My own work is straying more and more from computational proteomics per se into other problems linking biology (Proteomics, Genomics, Life Sciences) with programming (R, Java, Perl, C++). Biocoding is now my bread-and-butter…

3. I need a shorter, catchier name that is easy to use in coffee talks, presentations, or when sharing links with friends.

4. I also decided to add a mascot for the blog, our T-rex:
   Truth => Science is about Truth.
   Tea => UK Science.
   STaTisTics => OK, this one’s got as many ‘S’ as ‘T’, but the latter is more frequent in English.
   T-rex => The future belongs to Big Data, which we’ll use (and are already using) to trace back the march of evolution to our preferred species, including the dinosaurs. And last, but not least, this is Abel’s (my son) favorite animal.

I hope you enjoy this idea.
Yasset