AACR (American Association for Cancer Research) 2023
April 4, 2023
Authors: Maximilien Colange; Guillaume Appe; Akpeli Nordor; Abdelkader Behdenna
Abstract
The recent exponential progress of sequencing technologies has dramatically impacted cancer research and paved the way to precision medicine in cancer care. In parallel, rapid progress in bioinformatics has been essential in allowing analysts to handle the vast amount of data yielded by high-throughput profiling machines, turn this data into cancer biology knowledge, and ultimately develop innovative approaches to cancer care. Still, computational complexity and tool interoperability remain major challenges for the advancement of -omic data-driven cancer research.
Despite the historical prevalence of R, Python is gaining momentum in the bioinformatics landscape. As a general purpose language, it offers numerous advantages:
These advantages motivate the further development of the Python bioinformatics ecosystem. We thus advocate porting reference tools from R to Python, with the ambition of shaping a comprehensive ecosystem. As a sign of this trend, state-of-the-art tools have been developed directly in Python (e.g. lifelines, a library dedicated to survival analysis) or quickly ported from R to Python (e.g. harmony, an R library for integrating single-cell data).
We introduce InMoose (Integrated Multi-Omics Open Source Environment), a unified open-source Python framework for every -omic data type. It is based on recognized tools and focuses on efficiency and user-friendliness. InMoose is accessible at https://github.com/epigenelabs/inmoose and is released under the GPL3 license. The first version of InMoose focuses on bringing transcriptomics tools, mostly based on the edgeR R package, to the Python world. It features batch-effect correction algorithms, as well as differential expression analysis, for microarray and RNA-seq data.
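To make the notion of batch-effect correction concrete, the sketch below shows a deliberately simplified, location-only adjustment: for each gene, every batch is re-centered on the gene's overall mean. This is an illustration only and is not InMoose's API or the algorithm it implements; established methods such as ComBat additionally adjust scale and apply empirical Bayes shrinkage. The function name center_batches and the toy data are hypothetical.

```python
from statistics import mean


def center_batches(expr, batches):
    """Re-center each batch on the per-gene overall mean (hypothetical helper).

    expr: list of per-gene lists of expression values (genes x samples).
    batches: list of batch labels, one per sample.
    """
    corrected = []
    for gene_values in expr:
        grand_mean = mean(gene_values)
        # Mean of this gene within each batch.
        batch_means = {
            b: mean(v for v, label in zip(gene_values, batches) if label == b)
            for b in set(batches)
        }
        # Remove the batch-specific offset, keep the overall expression level.
        corrected.append(
            [v - batch_means[label] + grand_mean
             for v, label in zip(gene_values, batches)]
        )
    return corrected


# Toy example: one gene, four samples, two batches with different offsets.
expr = [[1.0, 2.0, 5.0, 6.0]]
batches = ["a", "a", "b", "b"]
print(center_batches(expr, batches))  # [[3.0, 4.0, 3.0, 4.0]]
```

After adjustment, both batches share the same mean for the gene, while within-batch differences between samples are preserved.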
InMoose demonstrates the advantages of our approach:
We expect that this work will help foster a larger collaborative effort to build and grow a consistent, state-of-the-art Python platform for cancer bioinformatics.