Strand 5: Computational semantic analysis (Leader: Thierry Charnois, LIPN Paris 13; co-leader: Benoît Crabbé, LLF University of Paris)

In the field of computational linguistics, Strand 5 focuses on semantic analysis and its implementation in content access tools. Methods and tools have so far been proposed to describe and model various semantic phenomena, but these phenomena have generally been considered in isolation and with heterogeneous computational models. The ultimate challenge of Strand 5 is to evaluate these different semantic models, operationalize them and integrate them so as to produce rich and consistent semantic analyses with broad coverage of textual sources.

In practice, text understanding involves not only “what the texts say” and the linguistic determinants at work in the construction of meaning, but also “what we know otherwise”, namely extra-linguistic knowledge related to the context (especially the domain and the speech situation). One important issue is how to articulate these two kinds of semantics, linguistic and extra-linguistic, when analysing texts, acquiring knowledge and integrating this knowledge into semantic analysis.

Strand 5 focuses on the analysis of French, but different varieties of French (e.g. tweets or language for special purposes) are taken into account, and particular attention is paid to identifying language-independent methods that can be applied to a wide range of languages for which resources are available. Such methods are also a promising approach for the low-resource languages studied in Strands 2 and 3.

For the coming period, the goal is to intensify research on the integration of semantic models and methods, with a particular focus on five interdependent axes, each implemented as a work package (WP):

1) Deep learning for NLP (DLNLP) for semi-supervised parsing, information extraction or retrieval, the challenge being to master these new machine learning methods and their conditions of use in Natural Language Processing (NLP);

2) Syntax-semantics interface (SSI) for developing robust and efficient parsers based on semantic meta-grammars and (semi-)supervised machine learning;

3) Integration of linguistic and extra-linguistic semantics (ILES) in context-enhanced NLP tools, knowledge extraction and mining, cross-lingual alignment of semantic resources and domain-specific text understanding;

4) Semantic-based corpus annotation (SCA) for training models of word-sense disambiguation and information extraction;

5) Computational cognitive models (CCM) to test the cognitive plausibility of semantic computation models and integrate the cognitive dimension in NLP.

List of current research operations:

Historical and Reflective Perspective (HP)

  • Theory of meaning and NLP (HP1, resp. J. Léon)

Specific analyses for semantic processing (SSP)

  • Specific oral characteristics for semantic processing (SSP2, resp. M. Adda-Decker)
  • Filtering non-discursive uses of connectors (SSP3, resp. L. Danlos)
  • Deep syntactic annotation (SSP4, resp. M. Candito)

Knowledge acquisition methods (KA)

  • Automatic induction of representative lexico-grammatical patterns from texts (KA2, resp. T. Poibeau)

  • Acquiring domain knowledge from texts (KA3, resp. A. Nazarenko)

Semi-supervised learning (SSL)

  • Contribution of parsing to information extraction: domain adaptation and multi-objective learning (SSL3, resp. J. Le Roux)

Integrative approaches (IA)

  • Annotation platform (IA1)

Applications: towards enriched access to textual content (APP)

  • Design and development of new methods of accessing textual content (APP2, resp. H. Zargayouna)