CA18131 - Statistical and machine learning techniques in human microbiome studies
In recent years, the human microbiome has been characterised in great detail in several large-scale studies as a key player in intestinal and non-intestinal diseases, e.g. inflammatory bowel disease, diabetes and liver cirrhosis, along with brain development and behaviour. As more associations between microbiome and phenotypes are elucidated, research focus is now shifting towards causality and clinical use for diagnostics, prognostics and therapeutics, where some promising applications have recently been showcased. Microbiome data are inherently convoluted, noisy and highly variable, and non-standard analytical methodologies are therefore required to unlock their clinical and scientific potential. While a range of statistical modelling and Machine Learning (ML) methods are now available, sub-optimal implementation often leads to errors, over-fitting and misleading results, due to a lack of good analytical practices and ML expertise in the microbiome community. Thus, this COST Action network will create a productive symbiosis between discovery-oriented microbiome researchers and data-driven ML experts, through regular meetings, workshops and training courses. Together, they will first optimise and then standardise the use of these techniques, following the creation of publicly available benchmark datasets. Correct usage of these approaches will allow for better identification of predictive and discriminatory ‘omics’ features, increase study repeatability, and provide mechanistic insights into possible causal or contributing roles of the microbiome. This Action will also investigate automation opportunities and define priority areas for novel development of ML/Statistics methods targeting microbiome data. In doing so, it will open novel and exciting avenues within the fields of both ML/Statistics and microbiome research.
e-Business Optimization with Big Data
Goal: In the context of Future Investments Development of the Digital Economy in France, the e-Business optimization with Big Data project (eBob) aims to revolutionize consumer goods business management. Its goal is to create new technologies for analyzing purchase data by combining, for the first time, structured and unstructured data, so that the tool's functionality can be modified dynamically according to the context revealed by the resulting analyses. Structured data (from database users), which contain information on the internal management of company purchases, will be combined with unstructured data (from the internet) and with data from the business platform. Studying these data will allow us to analyze trends by taking into account not only the situation of the sellers but also the reality of the market. This analysis, powered by multiple sources of information, aims to provide decision support for users by guiding them in their choices (Dynamic change).
New data intensive algorithms and structures for GPU processors
The goal of the project is to design new parallel structures and algorithms, or adapt existing ones, for multi-core general-purpose graphics processing units. The expected results include a significant acceleration of existing solutions and an increase in the volume of data that can be processed. The scientific hypothesis is that this goal can be achieved with popular GPU devices and personal computers.
The dissertation also discusses an exemplary application of time series databases: the analysis of zebra mussel (Dreissena polymorpha) behaviour based on observations of the change of the gap between the valves, collected as a time series. We propose a new algorithm based on wavelets and kernel methods that detects relevant events in the collected data. This algorithm allows us to extract elementary behaviour events from the observations. Moreover, we propose an efficient framework for automatic classification to separate the control and stressful conditions. Since zebra mussels are well-known bioindicators this is an important step towards the creation of an advanced environmental biomonitoring system.