Experience

Positions after PhD

 
 
 
 
 

Research director

LIRMM, CNRS

Jan 2016 – Present Montpellier, France

Responsibilities include:

  • Head of Institute of Computational Biology (2015-2019)
  • Head of French Molecular Bioinformatics network (2010-2016)
  • Head of computer science dpt (2007-2010)
 
 
 
 
 

Reasearcher

LIRMM, CNRS

Jan 2008 – Oct 1999 Montpellier, France
Bioinformatics and algorithmics research.
 
 
 
 
 

Postdoctoral fellow

German Cancer Research Center (Deutsches Krebsforschung Zentrum - DKFZ)

Sep 1996 – Oct 1999 Heidelberg, Germany
Algorithms for transcriptomics.

News

News about software, results, publications, or collaborations

Marion gets a permanent position at University

New results on the mechanism that creates resistance in cancer cells

1st publication in 2024

Projects

ALL THINGS ARE DIFFICULT BEFORE THEY ARE EASY

*

ALPACA

EU funded Int’al Training Network on Computational Pan-Genomics

Superstring

Algorithms for shortest superstring questions

Fast_place

Software for efficient metagenomics and applications to virus metagenomics

GEM

Information fuelled biophysical models for the control of gene expression

Recent Publications

Ancestral sequence reconstruction is an important task in bioinformatics, with applications ranging from protein engineering to the study of genome evolution. When sequences can only undergo substitutions, optimal reconstructions can be efficiently computed using well-known algorithms. However, accounting for indels in ancestral reconstructions is much harder. First, for biologically-relevant problem formulations, no polynomial-time exact algorithms are available. Second, multiple reconstructions are often equally parsimonious or likely, making it crucial to correctly display uncertainty in the results. Here, we consider a parsimony approach where any indel event has the same cost, irrespective of its size or the branch where it occurs. We thoroughly examine the case where only deletions are allowed, while addressing the aforementioned limitations. First, we describe an exact algorithm to obtain all the optimal solutions. The algorithm runs in polynomial time if only one solution is sought. Second, we show that all possible optimal reconstructions for a fixed node can be represented using a graph computable in polynomial time. While previous studies have proposed graph-based representations of ancestral reconstructions, this result is the first to offer a solid mathematical justification for this approach. Finally we discuss the relevance of the deletion-only case for the general case.Competing Interest StatementThe authors have declared no competing interest.

All routine clinical treatments for colorectal cancer include 5-fluorouracil (5-FU), which cannot counteract recurrence and metastases formation. As the pyrimidine analog 5-FU can impact multiple pathways including both DNA and RNA metabolism, studying its mode of actions could lead to improved therapies. Using a dedicated reporter system for lineage-tracing and deep translatome profiling we demonstrate that 5-FU causes some colorectal cancer cells to tolerate the drug, due to a durable translational reprogramming that sustains cell plasticity. This period of drug tolerance coincides with specific translational activation of genes coding for proteins with major pro-tumoral functions. We unravel a major unexpected translational overexpression of the pro-inflammatory and pro-tumoral IL-8 cytokine, alongside other anti-apoptotic, senescence-associated secretory phenotype and cancer-related senescence phenotype genes. Given the adverse prognostic implications of elevated IL-8 levels across various cancers, our findings suggest IL-8 targeting could counteract 5-FU resistance.

A correlation is a binary vector that encodes all possible positions of overlaps of two words, where an overlap for an ordered pair of words (u,v) occurs if a suffix of word u matches a prefix of word v. As multiple pairs can have the same correlation, it is relevant to count how many pairs of words share the same correlation depending on the alphabet size and word length n. We exhibit recurrences to compute the number of such pairs – which is termed population size – for any correlation; for this, we exploit a relationship between overlaps of two words and self-overlap of one word. This theorem allows us to compute the number of pairs with a longest overlap of a given length and to show that the expected length of the longest border of two words asymptotically diverges, which solves two open questions raised by Gabric in 2022. Finally, we also provide bounds for the asymptotic of the population ratio of any correlation. Given the importance of word overlaps in areas like word combinatorics, bioinformatics, and digital communication, our results may ease analyses of algorithms for string processing, code design, or genome assembly.

Overlaps between words are crucial in many areas of computer science, such as code design, stringology, and bioinformatics. A self overlapping word is characterized by its periods and borders. A period of a word $u$ is the starting position of a suffix of $u$ that is also a prefix $u$, and such a suffix is called a border. Each word of length, say $n>0$, has a set of periods, but not all combinations of integers are sets of periods. Computing the period set of a word $u$ takes linear time in the length of $u$. We address the question of computing, the set, denoted $Γ_n$, of all period sets of words of length $n$. Although period sets have been characterized, there is no formula to compute the cardinality of $Γ_n$ (which is exponential in $n$), and the known dynamic programming algorithm to enumerate $Γ_n$ suffers from its space complexity. We present an incremental approach to compute $Γ_n$ from $Γ_n-1$, which reduces the space complexity, and then a constructive certification algorithm useful for verification purposes. The incremental approach defines a parental relation between sets in $Γ_n-1$ and $Γ_n$, enabling one to investigate the dynamics of period sets, and their intriguing statistical properties. Moreover, the period set of a word $u$ is the key for computing the absence probability of $u$ in random texts. Thus, knowing $Γ_n$ is useful to assess the significance of word statistics, such as the number of missing words in a random text.

Cancer onset and progression are known to be regulated by genetic and epigenetic events, including RNA modifications (a.k.a. epitranscriptomics). So far, more than 150 chemical modifications have been described in all RNA subtypes, including messenger, ribosomal, and transfer RNAs. RNA modifications and their regulators are known to be implicated in all steps of post-transcriptional regulation. The dysregulation of this complex yet delicate balance can contribute to disease evolution, particularly in the context of carcinogenesis, where cells are subjected to various stresses. We sought to discover RNA modifications involved in cancer cell adaptation to inhospitable environments, a peculiar feature of cancer stem cells (CSCs). We were particularly interested in the RNA marks that help the adaptation of cancer cells to suspension culture, which is often used as a surrogate to evaluate the tumorigenic potential. For this purpose, we designed an experimental pipeline consisting of four steps: (1) cell culture in different growth conditions to favor CSC survival; (2) simultaneous RNA subtype (mRNA, rRNA, tRNA) enrichment and RNA hydrolysis; (3) the multiplex analysis of nucleosides by LC-MS/MS followed by statistical/bioinformatic analysis; and (4) the functional validation of identified RNA marks. This study demonstrates that the RNA modification landscape evolves along with the cancer cell phenotype under growth constraints. Remarkably, we discovered a short epitranscriptomic signature, conserved across colorectal cancer cell lines and associated with enrichment in CSCs. Functional tests confirmed the importance of selected marks in the process of adaptation to suspension culture, confirming the validity of our approach and opening up interesting prospects in the field.

Popular Topics

68W32 adaptation Aho-Corasik algebraic technique algorithm algorithms alignment alignment score ALPACA alphabet size Anchor-based strategy ancient DNA Approximability approximate match approximate pattern matching approximate repeats approximation Approximation algorithm approximation algorithms APX assembly autocorrelation award bacteria Bacterial genomes Basic Period binary alphabet Binary Vector binding binding site bioinformatics Biological Physics (physics.bio-ph) biophysics BLAST bounds Burrows-Wheeler cancer cancer stem cell cDNA character Characterisation chemical mark chromatin chromosome circular permutation cluster analysis clustering clustering algorithms coding coiled coil Collinear fragment chaining common word Comparative genomics complexity compressed data structures compression compression algorithms compression gain computer science Concat-Cycles concensus string conformation Connectivity cross-over cyclic cover Cyclic string cyclic strings Cytoplasmic Male Sterility Data compression data structure Data structures Data Structures and Algorithms (cs.DS) database DCJ de Bruijn graph diagnostic discrete line DNA dominance order double cut and join duplication dynamic programming edit distance encoding enumeration epitranscriptome equality EST Eulerian tour evaluation evolution Exact Match exponential F.2.2 filtration FOS: Computer and information sciences FOS: Physical sciences Gapped seed genetics genome genome rearrangement genome sequencing genomics Golomb ruler graph greedy greedy algorithm Greedy conjecture Haemophilus influenzae Hamiltonian path heuristic algorithms Hi-C homologous sequences human hybrid zone Hypergraph incomplete lineage sorting indexing information content information theory input string INS insulin integer sequence internal duplication intragenic recombination irreducible factor kinship Kolmogorov complexity lattice LCS Levenshtein distance linear superstring linear time linear time algorithm Longest common subsequence machine learning malaria mapping tool matroid maximal chain Maximum coverage Maximum independent set Maximum stable set medecine memory metagenome metagenomics microorganisms microsatellite evolution Minimum assignment minisatellite minisatellite locus minisatellites MIS modification monkey test motif motif size mouse mRNA MS-Align multiple alignment multiple read mutation MVR N-gram NGS NP-complete NP-hard On-line algorithms optimal coding oryza line overlap overlap graph Pairwise alignment parameterized complexity path pattern Pattern recognition pattern search perfect detection periods Permutation phylogenetic profile Polynomial Time Approximation Scheme price proportional length protein domain proteome publication PWM python radish genome random-access memory random text Read mapping rearrangement Recognition recombinant regular expression regularity detection regulation relative compression repeats reverse complementary sequence RLE RNA search algorithm seed segment tree seminar sequence sequence alignment sequence classification sequence comparison Sequence graph short tandem repeats Shortest Common Superstring Problem Shortest cyclic cover of strings shortest DNA cyclic cover problem Shortest Superstring Problem similarity similarity metrics single cell soft software spaced-seed Statistical Mechanics (cond-mat.stat-mech) string String matching stringology Stringology Text Algorithms Indexing Data Structures De Bruijn Graph Assembly Space Complexity Dynamic Update strings student sturgeon phylogeny subset system suffix array suffix tree superstring sweep line tandem duplication tandem repeat tandem repeat alignment tandem repeats team text text compression Text indexing Tiling time complexity tool training transcription factor transcriptome transcriptomics translation tree tree alignment validation score virus VNTR W[1]-hard web resource web server Whole genome alignment word word enumeration word RAM model workshop Yakuts zebra fish

Contact

Connect with me

  • (33) 04 67 41 86 64
  • LIRMM - UMR 5506 & University Montpellier (CC 05016) 860 rue de St Priest - 34095 Montpellier cedex 5 FRANCE