Tissue engineering aims to recover and improve the functionality of damaged tissues and organs by constructing living components useful for regeneration. One of the most important steps in tissue engineering processes is the selection of appropriate cell sources for implantation. Although stem cells have been identified as a promising source, several issues must be addressed before they can be used clinically for tissue replacement. In this context, the application of bioinformatic approaches to genome-wide expression data may help to understand how tissues develop at a molecular level.
Our research activities aim to develop novel bioinformatic methods that provide insights into cellular development by simultaneously exploiting microarray-based data and knowledge repositories. Thanks to the collaborations of our laboratory with other research centres, the proposed methods have been applied to different fields, including stem cell differentiation and oocyte development.
Stem cells
Stem cells are self-renewing populations characterized by pluripotency, i.e. the ability to evolve into diverse mature cell types. In mammals, embryonic stem cells (ESCs) can be isolated, proliferated and differentiated in vitro into a potentially unlimited variety of tissues. Whilst ESCs have the greatest potential for clinical applications in terms of pluripotency, their use raises several ethical issues.
The recent discovery that adult somatic cells can be reprogrammed in vitro to obtain induced pluripotent stem cells (iPSCs) has paved the way for new opportunities to study diseases and develop patient-specific therapies. However, one of the challenges concerns ensuring that reprogrammed cells are actually pluripotent and have not moved into partially differentiated states.
In this context, we found that dimension reduction techniques can be successfully applied to transcriptome data to obtain predictive models of the differentiation stage of stem cells and reprogrammed cells. Used in combination with gene selection strategies, these models map the temporal gene expression data of samples grown in standard culturing conditions to a one-dimensional space, yielding a graphical tool that we call the Differentiation scale. Uncharacterized samples, such as iPSCs, can be projected onto this scale to determine their actual pluripotency with respect to the normal dynamics of differentiation.
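The idea of a one-dimensional scale can be illustrated with a minimal sketch. It assumes that the dimension reduction step is a principal component analysis restricted to the first component, which is only one of several possible choices and is not necessarily the technique used in our models. The expression values, the three "selected genes" and the function names (`center`, `first_axis`, `project`) are hypothetical and serve only as an illustration.

```python
import math

def center(rows):
    """Subtract the per-gene mean (rows = samples, columns = genes)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    return centered, means

def first_axis(centered, iterations=200):
    """Approximate the first principal axis by power iteration on X^T X.

    The start vector is the difference between the last and the first
    reference sample, so the axis points from early to late time points.
    """
    v = [b - a for a, b in zip(centered[0], centered[-1])]
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("first and last reference samples are identical")
    v = [x / norm for x in v]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        nxt = [sum(s * row[j] for s, row in zip(scores, centered))
               for j in range(len(v))]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        v = [x / norm for x in nxt]
    return v

def project(sample, means, axis):
    """Position of one sample on the one-dimensional scale."""
    return sum((x - m) * w for x, m, w in zip(sample, means, axis))

# Hypothetical reference time course (rows ordered by time, 3 selected genes):
# one marker decreasing and two markers increasing during differentiation.
reference = [
    [9.0, 1.0, 2.0],  # earliest time point
    [7.0, 3.0, 3.0],
    [4.0, 6.0, 5.0],
    [1.0, 9.0, 7.0],  # latest time point
]
centered, means = center(reference)
axis = first_axis(centered)
scale = [project(s, means, axis) for s in reference]

# Hypothetical uncharacterized sample, e.g. a reprogrammed cell line.
query = [8.5, 1.5, 2.2]
query_position = project(query, means, axis)
```

In this toy example the reference samples are ordered along the scale according to their time point, and the position of the query sample relative to them indicates how close it is to the undifferentiated state.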
The integration of multiple experiments with networks and knowledge on embryonic development highlights the most influential pathways during specific phases of differentiation. In particular, we are investigating the utility of methods for combining multiple gene expression data sets in order to obtain a reliable signature of cellular identity. In addition, we have studied different prioritization strategies that exploit literature-derived gene annotations and network properties. Borrowing ideas from text mining and Information Retrieval, candidate marker genes emerge from the study of their annotations and from the analysis of network connectivity patterns.
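As an illustration of an Information Retrieval style prioritization, the following sketch ranks candidate genes by the similarity of their annotation terms to those of known markers. It assumes TF-IDF weighting and cosine similarity, which are standard choices in text mining but are not necessarily the ones adopted in our strategies, and it covers only the annotation part, not the network connectivity analysis. Gene names and annotation terms are placeholders.

```python
import math
from collections import Counter

def tfidf_vectors(annotations):
    """Map each gene to a sparse TF-IDF vector over its annotation terms."""
    n = len(annotations)
    doc_freq = Counter(t for terms in annotations.values() for t in set(terms))
    vectors = {}
    for gene, terms in annotations.items():
        counts = Counter(terms)
        vectors[gene] = {t: c * math.log(n / doc_freq[t])
                         for t, c in counts.items()}
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_candidates(annotations, known_markers):
    """Rank non-marker genes by their best similarity to any known marker."""
    vectors = tfidf_vectors(annotations)
    scores = {
        gene: max(cosine(vec, vectors[m]) for m in known_markers)
        for gene, vec in vectors.items()
        if gene not in known_markers
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Placeholder annotations (gene names and terms are hypothetical).
annotations = {
    "geneA": ["pluripotency", "transcription", "stem"],
    "geneB": ["pluripotency", "stem", "chromatin"],
    "geneC": ["metabolism", "mitochondrion"],
    "geneD": ["transcription", "metabolism"],
}
ranking = rank_candidates(annotations, known_markers={"geneA"})
```

Genes whose annotations overlap most with those of the known marker appear at the top of `ranking`, while genes with no shared terms receive a score of zero.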
Developmental Biology
One developmental process that has not yet been fully explored is the differentiation of the mammalian oocyte during folliculogenesis. Recently, there have been increasing efforts to characterize oogenesis and early embryogenesis by means of knowledge repositories. Results from microarray-based studies have been analysed with bioinformatics tools for annotation and association of molecules, with the common aim of hypothesizing unknown entities that play important roles in cellular development.
A subset of “maternal effect” genes has been identified for its important role in the early stages of development: these factors can modify the developmental competence of the oocyte and gene expression in the zygote. However, the complete network of key regulator genes in mammals remains unclear. In the analysis of data from oocytes, added value comes from evidence that a known maternal-effect gene is related to another gene that has not been previously considered. Such evidence can be obtained from gene annotations and association networks.
The developmental competence of oocytes has also been related to chromatin organization. Based on the presence of a ring of heterochromatin surrounding the nucleolus, two different types of oocytes have been identified in the mouse ovary:
- surrounded nucleolus oocytes (SN)
- not surrounded nucleolus oocytes (NSN)
These two types of oocytes have different developmental competence: in the mouse, NSN oocytes arrest development at the 2-cell stage, whereas SN oocytes may develop to term. This characterization provides a useful model for determining a priori the oocyte developmental competence, which guides the subsequent phases of cellular growth. One of the most important questions in this context concerns the transcriptional changes influenced by a specific chromatin configuration, which may be responsible for a different behaviour in terms of cellular maturation.
In order to address these issues, we have developed knowledge-based bioinformatics approaches to compare the transcriptome data of developmentally competent oocytes (SN) with those of oocytes that cease development at the 2-cell stage (NSN). Using keywords extracted from the Gene Ontology and from the publications referencing each gene in PubMed, we built gene association networks that allowed us to identify a core set of factors guiding the transition from oocytes to embryos. Our activities in this field are currently focused on the integration of multiple knowledge sources that would help to assess the role of each gene during development and to uncover the transcriptional link between embryogenesis and stem cell differentiation.
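A keyword-based gene association network can be sketched as follows. The example assumes that two genes are linked when they share at least a given number of keywords and that highly connected genes are candidates for a core set. The threshold `min_shared`, the gene names and the keywords are all hypothetical, and the scoring used in our networks may differ.

```python
from itertools import combinations

def build_network(keywords, min_shared=2):
    """Link two genes when they share at least `min_shared` keywords.

    keywords: dict mapping a gene name to its set of keywords.
    Returns a dict mapping each linked gene pair to the shared keywords.
    """
    edges = {}
    for gene_a, gene_b in combinations(sorted(keywords), 2):
        shared = keywords[gene_a] & keywords[gene_b]
        if len(shared) >= min_shared:
            edges[(gene_a, gene_b)] = shared
    return edges

def degree(edges):
    """Count how many associations each gene takes part in."""
    counts = {}
    for gene_a, gene_b in edges:
        counts[gene_a] = counts.get(gene_a, 0) + 1
        counts[gene_b] = counts.get(gene_b, 0) + 1
    return counts

# Placeholder keyword sets, standing in for terms extracted from
# Gene Ontology annotations and PubMed abstracts.
keywords = {
    "gene1": {"chromatin", "transcription", "oocyte"},
    "gene2": {"chromatin", "oocyte", "meiosis"},
    "gene3": {"transcription", "oocyte", "embryo"},
    "gene4": {"metabolism"},
}
edges = build_network(keywords)
connectivity = degree(edges)

# Genes ordered from most to least connected; ties are broken by name.
core_candidates = sorted(connectivity, key=lambda g: (-connectivity[g], g))
```

In this toy network `gene1` is linked to both `gene2` and `gene3`, so it ranks first among the core candidates, whereas `gene4` shares no keywords with the others and stays outside the network.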