The Bioinformatics unit assists research groups in molecular-biology-related fields by providing scientific data mining, sequence analysis services, software infrastructure, and training in bioinformatics.

Data analysis

We investigate large functional genomics and high-throughput biological datasets. Assistance is provided in experimental design and subsequent analysis of next-generation sequencing, microarray, and mass-spectrometry-based proteomics experiments. The current focus is on the analysis of small RNA-Seq, mRNA-Seq, and haploid ES cell screen data. Gene lists derived from publicly available studies or generated from in-house high-throughput experiments (NGS, microarray, proteomics) are analyzed for the overrepresentation of pathways, GO terms, or functional domains, or are placed in interaction networks to visualize their relationships. Genome-wide expression patterns are contextualized with known processes and pathways using Gene Set Enrichment Analysis (GSEA). Local instances of integrated model organism databases and genome annotation portals permit visualization and analysis of in-house data with dedicated resources and additional privacy. User-driven data exploration is supported by the Ingenuity Pathway Analysis System. Project-specific hands-on training in computational biology applications is provided on an individual basis.
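As an illustration of what an overrepresentation analysis of a gene list involves, the following minimal sketch computes a one-sided hypergeometric p-value for a single annotation term. It is not the unit's actual pipeline: the function name `hypergeom_enrichment_p` and all numbers in the usage example are hypothetical, and a real analysis would additionally correct for testing many terms at once.

```python
from math import comb


def hypergeom_enrichment_p(population_size, annotated_in_population,
                           list_size, annotated_in_list):
    """One-sided hypergeometric p-value for overrepresentation.

    Returns the probability of drawing at least `annotated_in_list`
    annotated genes when `list_size` genes are sampled without
    replacement from a background of `population_size` genes, of which
    `annotated_in_population` carry the annotation.
    """
    # Number of ways to draw any gene list of this size from the background.
    total = comb(population_size, list_size)
    # The overlap cannot exceed either the list size or the annotated set.
    upper = min(annotated_in_population, list_size)
    # Sum the probability mass of every overlap at least as large as observed.
    tail = sum(
        comb(annotated_in_population, k)
        * comb(population_size - annotated_in_population, list_size - k)
        for k in range(annotated_in_list, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, 150 of them in a pathway,
# and a list of 300 genes of which 12 belong to that pathway.
p_value = hypergeom_enrichment_p(20000, 150, 300, 12)
print(f"p = {p_value:.3g}")
```

Exact integer binomial coefficients are used here to keep the sketch free of dependencies; for large-scale screens a statistics library would be the more practical choice.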

Sequence analysis

Key insights into the molecular mechanisms of a protein's function are obtained by integrated sequence/structure analysis. This includes exploration of the protein family space, multiple sequence alignments, and discovery of deterministic motifs, combined with fold recognition, homology modeling, and 3D structure representation and analysis. The study of evolutionary relationships (phylogenetic reconstruction, orthology assignments, remote homology detection) provides complementary information for understanding functional conservation and diversity. Common nucleotide sequence analysis tasks comprise sequence alignment, promoter and gene identification, motif discovery and enrichment analysis, prediction of transcription factor binding sites, conservation detection, phylogenetic footprinting, de novo repeat identification, and de novo assembly. We provide genome-wide CRISPR gDNA design that can be enhanced by gene properties and functional annotations.
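To give a rough idea of the first step in guide design, the sketch below lists candidate target sites in a nucleotide sequence. It rests on assumptions that are not stated above: an SpCas9-style nuclease with a 20-nt protospacer followed by an NGG PAM, and a search restricted to the forward strand. The function name `find_candidate_guides` is hypothetical, and the sketch performs no off-target scoring or annotation-based filtering.

```python
import re


def find_candidate_guides(sequence, guide_length=20):
    """Return (start, protospacer) pairs for sites followed by an NGG PAM.

    Assumption: SpCas9-style targeting, forward strand only. A lookahead
    is used so that overlapping candidate sites are all reported.
    """
    seq = sequence.upper()
    # Capture `guide_length` bases that are immediately followed by N-G-G.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % guide_length)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


# Hypothetical toy sequence, for illustration only.
for start, guide in find_candidate_guides("ttgacctgaagctcatcgatcgatcgaggcta"):
    print(start, guide)
```

A genome-wide design would also scan the reverse complement and rank candidates by specificity, which is where gene properties and functional annotations can be brought in.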

Computing Cluster

For heterogeneous computational tasks, we manage a computing cluster with a state-of-the-art processing system using batch and parallel computing environments. 

A short introduction to the usage of the cluster management software at IMP/IMBA can be found on the Grid Engine Usage page.

Since December 2014, all cluster-related tasks have been handled by the IT department. Please contact Petar Forrai for inquiries.