Text and Data miners


Resources

By  Marjan Grootveld
Publication year: 2018  |  Research Data Management  |  Open Repositories
Intended audience Programmers, Researchers and Students, Text and Data miners
Level: Advanced: apply

The Freeling component provides basic language analysis functionality (tokenization, lemmatization, POS tagging, and dependency parsing) for the languages that Freeling supports (English, Spanish, Portuguese, Italian, French, German, Russian, Catalan, Galician, Croatian, Slovene). The specific usage scenario for this component concerns scientific publications in non-English languages. The component has been shared as a Docker a...

Intended audience Programmers, Researchers and Students, Text and Data miners
Level: Advanced: apply

The objective of this component is to scan tokenized text and detect BabelNet entries in the input document. This component is the basis for entity linking and word sense disambiguation, as it detects the candidates to be disambiguated. The component produces WSD item annotations as defined in the DKPro WSD typesystem. Afterwards, disambiguation can be performed by other components (like DKPro WSD). The component has been shared as a Docke...

Intended audience Industry and Business, Programmers, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

This tutorial has three parts that describe how to use the Wheat Phenotypic Information Extractor and the two end-user applications, WheatIS and AlvisIR, that integrate its results for the use case developed by Inra during the OpenMinTeD project. The application extracts wheat-related information on phenotypes, genes, markers, species, and varieties.

Intended audience Policy makers and Funders, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

The objective of this tutorial is to show how Social Science researchers can take full advantage of the OpenMinTeD TDM platform for detecting and linking variables in scientific publications.
 

Intended audience Policy makers and Funders, Programmers, Researchers and Students, Text and Data miners
Level: Introductory: aware of

This tutorial describes how to use TDM to build a recommender system for scholarly resources and how to use the OpenMinTeD platform to build and annotate corpora for this purpose.

Intended audience Researchers and Students, Text and Data miners
Level: Introductory: no previous knowledge is required

This tutorial explains how to use the Bio Term Hub, an aggregator of biomedical terminologies sourced from manually curated databases, to create a terminology suited to the user's needs. This terminology can then be forwarded to OGER, a dictionary-based named entity recogniser. After the tutorial, the user should be able to use both tools.
 

OpenMinTeD

Intended audience Policy makers and Funders, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: aware of

This tutorial walks users through the simple process of creating a workflow in the OpenMinTeD platform that allows them to identify acknowledged projects (i.e. funding information) from scientific publications.
 

Intended audience Policy makers and Funders, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: aware of

This tutorial walks users through the simple process of creating a workflow in the OpenMinTeD platform that allows them to extract links to DataCite (https://www.datacite.org) - mainly citations to datasets - from scientific publications.

Intended audience Policy makers and Funders, Project Managers, Publishers, Researchers and Students, Text and Data miners
Level: Introductory: aware of

This tutorial walks users through the simple process of creating a workflow in the OpenMinTeD platform that allows them to perform content-based document classification on scientific publications, based on the arXiv, MeSH, ACM and DCC taxonomies.