We are pleased to announce a monograph on recent developments in the computational resources at IPPT. For nearly four years IPPT PAN has taken part in the project “Biocentrum Ochota – computational infrastructure for strategic development of the biology and medicine”. The main goal of the project is to create the computational infrastructure of the Biocentrum Ochota consortium. The infrastructure should allow the creation and integration of databases and applications based on existing, unique computing services.

There are six institutions taking part in the project:

  • Institute of Biochemistry and Biophysics Polish Academy of Sciences (the project coordinator),
  • Nałęcz Institute of Biocybernetics and Biomedical Engineering Polish Academy of Sciences,
  • Nencki Institute of Experimental Biology Polish Academy of Sciences,
  • Mossakowski Medical Research Centre Polish Academy of Sciences,
  • Institute of Fundamental Technological Research Polish Academy of Sciences (IPPT PAN),
  • The International Institute of Molecular and Cell Biology.

The Biocentrum Ochota consortium possesses scientific potential that is unique in Poland in state-of-the-art biological research and in the application of that research to medicine and biotechnology. As a result of their research, the consortium institutes have created numerous bioinformatics methods that are of interest to other research institutions, education, the health service and high-tech companies.

The cooperation of the institutes within Biocentrum Ochota creates an extraordinary opportunity to set up an infrastructure, integrated in both functionality and hardware, that gives access to the applications and the databases. Until now, access to such infrastructure has been limited, both by a shortage of equipment and by limited possibilities for cooperation on an adequate scale.

The completion of the project has stimulated the development of research methodologies and better interpretation of experimental data in broadly understood biomedicine and biotechnology. The synergy of the project participants' competencies made it possible to propose a range of services. They are dedicated to numerous groups of potential users, namely physicians, entrepreneurs, academic teachers, students and research scientists. Compared with these benefits, the financial outlay on the project is relatively small, because the project draws on the potential already accumulated in the institutions belonging to the consortium.

The integration of the existing computational tools is designed so that all instruments provided by the individual institutions are available via thematic portals (BioInfo, BioMed, BioTech). The portals are addressed to the user groups (science, education, health service, industry) and free them from constraints on computational resources. The necessary hardware and software were provided, installed and put into operation.

The system integrates the distinctive data processing services and analytical devices belonging to the members of the consortium. The hardware base for the applications consists of six clusters with a total computational performance of 4 TFLOPS. The clusters are linked by fast connections. The total disk space is about 4 petabytes.

The network gives access to virtual workstations whose resources are defined by the user on the fly.

Together, this constitutes a grid-type network. The entire system is accessible via dedicated graphics workstations from PCs or workstations connected locally to the institutional networks. The services can also be used by users outside the institutions, who are given access via VPN connections set up in the local networks of the institutions. Some of the services are available from tablets and mobile phones.

In the case of IPPT PAN, the HPC cluster named “Grafen” has been deployed along with several applications and services. The cluster is part of the Biocentrum Ochota grid and fulfils the assumptions of the project.

The main role of the “Grafen” cluster for IPPT PAN researchers is to act as a bridge to High Performance Computing. All HPC machines share a few common features: a large number of processors, a large amount of memory, Unix/Linux-type operating systems and queuing systems. The use of the Unix/Linux family of operating systems makes it relatively easy to switch from working on local PCs, workstations and laptops to HPC machines. However, a program benefits from this change only if it has a parallel structure.
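The requirement of a parallel structure can be illustrated with Amdahl's law, a standard textbook model that is not taken from the monograph itself. The sketch below assumes a program in which a fixed fraction of the run time can be spread over the available cores while the rest stays serial. The function name and the fractions and core counts used are purely illustrative and are not measurements from the “Grafen” cluster.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup predicted by Amdahl's law.

    parallel_fraction: share of the single-core run time that can be
                       parallelised (0.0 = fully serial, 1.0 = fully parallel).
    cores:             number of cores the parallel part is spread over.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of the core count;
    # only the parallel part shrinks as cores are added.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative values only: a laptop-sized and a cluster-sized core count.
    for fraction in (0.5, 0.9, 0.99):
        for cores in (4, 64):
            speedup = amdahl_speedup(fraction, cores)
            print(f"parallel fraction {fraction:.2f}, {cores:3d} cores: "
                  f"speedup {speedup:.2f}x")
```

Under this model a fully serial program gains nothing from additional cores, and even a program that is 90% parallel cannot run more than ten times faster, however many cores are used.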

In most cases, programmers develop their applications on a small number of cores. Nowadays, a few cores are available even on good laptops. However, to develop and test an application for a larger number of cores, the programmer needs a cluster of processors. The “Grafen” cluster serves as a platform for developing applications and executing smaller production runs. The cluster is also used for training purposes in a postgraduate course at IPPT PAN.

The report consists of three main parts. The first part describes the main features of the cluster, the tools implemented for software developers, the user’s handbook and a variety of software applications, both commercial and public domain. The second part covers applications that have already been completed, are under development, or are offered to potential users. The third part describes the software that is being developed or has already been developed by researchers from IPPT PAN. Together, these form a system of services for research teams in Biocentrum Ochota and beyond.

Tomasz A. Kowalewski, Eligiusz Postek (Editors)

Contents: