Information Extraction

The process of automatically extracting relevant information from documents.

Resources

Intended audience: Programmers, Researchers and Students, Text and Data miners
Level: Advanced: apply

The objective of this component is to scan tokenized text and detect BabelNet entries in the input document. The component is the basis of entity linking and word sense disambiguation (WSD), as it detects the candidates to be disambiguated. It produces WSD item annotations as defined in the DKPro WSD type system. Disambiguation can then be performed by other components (such as DKPro WSD). The component has been shared as a Docke...

Intended audience: Industry and Business, Programmers, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

This tutorial has three parts that describe how to use the Wheat Phenotypic Information Extractor and the two end-user applications, WheatIS and AlvisIR, that integrate its results for the use case developed by Inra during the OpenMinTeD project. The application extracts wheat-related information on phenotypes, genes, markers, species, and varieties.

Intended audience: Policy makers and Funders, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

The objective of this tutorial is to showcase how Social Science researchers can take full advantage of the OpenMinTeD TDM platform for Detecting and Linking Variables in Scientific Publications.
 

Intended audience: Industry and Business, Programmers, Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

In this course we explain how IXA pipes (http://ixa2.si.ehu.es/ixa-pipes/) have been integrated as Docker images in the OpenMinTeD (OMTD) platform and how they can be used.

The aim of IXA pipes is to provide a modular set of ready-to-use Natural Language Processing (NLP) tools. IXA pipes uses the same approach across NLP tasks in order to create processors that are robust across both domains and languages.
 


Intended audience: Industry and Business, Policy makers and Funders, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

The objective of this tutorial is to showcase how the Neuroscience use case available on the OpenMinTeD platform can facilitate the curation of neuroscience entities from the literature, with the aim of supporting ongoing curation efforts in the Blue Brain Project (BBP) at the École Polytechnique Fédérale de Lausanne (EPFL).
 

Intended audience: Industry and Business, Policy makers and Funders, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

The objective of this tutorial is to showcase the “Extract Metabolites and their Properties and Modes of Actions” use case. The tutorial describes, step by step, how to create a workflow in the OpenMinTeD platform that reads input from a source and annotates entities useful for the curation of the ChEBI database.
 

Intended audience: Industry and Business, Programmers, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

This tutorial explains how to use the “Arabidopsis Gene Regulation Extractor” application available from the OpenMinTeD platform. It also explains the scientific issues it addresses, and how the results of the TDM process can be exploited by researchers through the FlagDB++ application. It is related to the AS-D “Information Extraction of Mechanisms Involved in Plant Development” use case developed in the OpenMinTeD project.
 

Intended audience: Programmers, Industry and Business, Project Managers, Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

This tutorial explains how to use the “Habitat-Phenotype Relation Extractor for Microbes” application available from the OpenMinTeD platform. It also explains the scientific issues it addresses, and how the results of the TDM process can be queried and exploited by researchers through the Florilège application. It is related to the AS-C “Microbial biodiversity” use case developed in the OpenMinTeD project.
 

Intended audience: Programmers, Researchers and Students, Text and Data miners
Level: Intermediate: able to

BO-LSTM is a model based on biomedical ontologies and long short-term memory (LSTM) networks. The model was developed in Python using Keras. To demonstrate its utility, we trained a classification model on the DDI corpus, using ChEBI as the reference ontology. This tutorial shows how to install BO-LSTM, classify any text using thi...
