Description:
This course focuses on the development and execution of bioinformatics pipelines and on their optimization with regard to computing time and disk space. In an era where the data produced per analysis is on the order of terabytes, simple serial bioinformatics pipelines are no longer feasible; scalable, high-performance parallelization and analysis tools that can cope with large-scale datasets are needed. To this end we will study the performance bottlenecks that commonly emerge in everyday bioinformatics pipelines and see how to reduce execution times for effective data analysis on current and future supercomputers.
As a case study, two different bioinformatics pipelines (whole-exome and transcriptome analysis) will be presented and re-implemented on the Cineca supercomputers through dedicated hands-on sessions aimed at applying the concepts explained in the course.
Skills:
By the end of the course each student should be able to:
- Manage the transfer of large data files and/or large numbers of files between the local computer and the Cineca platforms (sketched below)
- Prepare the environment to analyse large amounts of biological data on a supercomputer
- Run individual parallel bioinformatics programs on a supercomputer
- Combine bioinformatics applications into pipelines on a supercomputer (sketched below)
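As an illustration of the data-transfer skill, here is a minimal sketch assuming an SSH-reachable Cineca login node; the hostname, username and paths are placeholders rather than the actual course setup:

    # Copy a directory of FASTQ files from the local machine to the cluster
    # (hostname and paths are placeholders); rsync resumes interrupted
    # transfers and skips files that are already up to date.
    rsync -avP fastq_samples/ \
        username@login.cluster.example.it:/scratch/project/fastq_samples/

    # Transfer in the opposite direction (results back to the local machine).
    rsync -avP \
        username@login.cluster.example.it:/scratch/project/results/ results/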
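As an illustration of the remaining skills (preparing the environment, running a parallel program and chaining tools into a pipeline), here is a minimal SLURM batch-script sketch; Cineca systems generally use SLURM and environment modules, but the account, module names, thread counts and file names below are illustrative assumptions:

    #!/bin/bash
    #SBATCH --job-name=wes_align        # one alignment step of a whole-exome pipeline
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=16          # threads available to the aligner
    #SBATCH --time=02:00:00
    #SBATCH --account=example_account   # placeholder project account

    # Prepare the environment: load the tools provided on the cluster
    # (module names are placeholders).
    module load bwa samtools

    # Run a multithreaded aligner and combine it with a second tool in a pipe:
    # bwa mem aligns the reads on the allocated cores, samtools sorts the
    # alignments and writes a coordinate-sorted BAM file.
    bwa mem -t "$SLURM_CPUS_PER_TASK" ref.fa sample_R1.fastq.gz sample_R2.fastq.gz \
        | samtools sort -@ 4 -o sample.sorted.bam -
    samtools index sample.sorted.bam

Such a script would typically be submitted with sbatch and monitored with squeue.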
Target audience:
Biologists, bioinformaticians and computer scientists interested in approaching large-scale NGS-data analysis for the first time.
Pre-requisites:
Basic knowledge of Python and the shell command line. A very basic knowledge of biology is recommended but not required.