Bioinformatics
LOCATION
Rm401
CONTACT
Suh-Yuen Liang syliang@gate.sinica.edu.tw +886-2-27855696,4015
Description

The Bioinformatics core facility provides computational solutions for high-throughput MS-based and NGS-based data analysis, advanced biostatistics, data mining, data visualization, omics data integration, application development, etc.  The facility has built and installed platforms, workflows, and tools on servers with the capacity to handle big data and parallel computing.  As bioinformatics technologies are constantly evolving, the facility is committed to providing the latest technologies to meet the needs of investigators in IBC. In addition, the facility offers on-site consultation and training for common bioinformatics tools and analyses.

Service request 
User guideline 
Service Charge

1. Servers

This facility maintains various servers to meet the different needs of bioinformatics.  The Linux server is specified for high-throughput analysis of next-generation sequencing data and for genome assembly.  Major tools installed on the server include SPAdes/SOAPdenovo for genome assembly, MAKER for genome annotation, Salmon/STAR for RNA-seq analysis, BLAST+ against the NCBI nr database, antiSMASH for gene cluster analysis in fungi/bacteria, mothur for 16S rRNA sequence data analysis, etc.

The in-house Galaxy server is a web-based platform for small-scale data analysis with a user-friendly interface and a wide range of bioinformatics tools. The R/Shiny server supports R-based statistics and interactive graphs for in-house web applications.

This facility utilizes the Taiwania 3 high-performance computing server from the National Center for High-performance Computing (NCHC) and the AI development tools provided by the Taiwan Computing Cloud (TWCC) to facilitate big data analysis and AI applications.

The AWS network server is rented from Amazon Web Services (AWS) and supports the deployment of internet applications developed for users.

 
2. Advanced Data Analysis

The facility provides services for advanced data analysis, including differential expression analysis for proteomics/genomics data, survival analysis, multivariate survival analysis, heatmaps, clustering, interactome analysis, pathway analysis, functional enrichment analysis, genotype-phenotype association analysis, Circos plots, parametric/non-parametric statistics, logistic regression, time series analysis, machine learning, etc.
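As an illustration of one analysis listed above, functional enrichment is commonly assessed with a one-sided hypergeometric test. The following is a minimal sketch in Python, not the facility's actual workflow; the function name and the gene counts are hypothetical and chosen only for illustration.

```python
from math import comb


def enrichment_p_value(population_size, annotated_in_population,
                       sample_size, annotated_in_sample):
    """Upper-tail hypergeometric probability P(X >= annotated_in_sample).

    population_size: number of background genes
    annotated_in_population: background genes annotated to the term
    sample_size: number of genes in the list of interest
    annotated_in_sample: genes in the list annotated to the term
    """
    max_hits = min(annotated_in_population, sample_size)
    total = comb(population_size, sample_size)
    # Sum the probabilities of observing k or more annotated genes.
    tail = sum(
        comb(annotated_in_population, k)
        * comb(population_size - annotated_in_population, sample_size - k)
        for k in range(annotated_in_sample, max_hits + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, 200 in a pathway,
# 500 differentially expressed genes, 15 of which fall in the pathway.
p = enrichment_p_value(20000, 200, 500, 15)
print(f"enrichment p-value: {p:.3g}")
```

In practice, the p-values from many terms would also be corrected for multiple testing before interpretation.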

 
3. Data Mining and Integration

The facility provides data mining services on public-domain databases, including cancer genomic data from the Genomic Data Commons (GDC) data portal, the Gene Expression Omnibus (GEO) and the Sequence Read Archive (SRA) at NCBI, the somatic mutation database COSMIC, proteomics data from the PRIDE Archive at EBI, post-translational modification databases, etc.

 
4. Streamline and automate data analysis

The facility collaborates with investigators in IBC to streamline data analysis workflows for complex and computation-intensive projects such as genomic analysis for pan-cancer studies, microbiome analysis, alternative splicing identification and quantitation, and automated genome assembly/annotation.

 
5. Application development and web deployment

This facility develops applications to assist with automated data analysis, data visualization, and information integration. Web applications can be deployed either on the internal network or on the internet via an AWS server, depending on the user's needs.

 
6. Data storage

This facility provides data storage on a Network Attached Storage (NAS) server equipped with high-performance 3.5" SATA hard drives and 96 TB of storage space.  User data will be backed up to the NAS.  However, for high-throughput data, such as raw files from mass spectrometry or next-generation sequencing, an annual fee for data storage on the NAS applies.