The Dutch NewsReader pipeline

NAF layers

NAF annotations in the Dutch pipeline consist of the following layers:

  • raw: raw text
  • text: tokenized words
  • terms: word senses combined with morphosyntactic information
  • deps: dependency parses
  • constituents: phrase-structure parses
  • entities: people, locations, organizations and numeric expressions
  • srl: semantic-role labels
  • opinions: opinion triplets (holder, target, expression)
  • factualities: veracity or factuality of relevant expressions
  • coreferences: marks coreferent term spans
  • timeExpressions: standardized time expressions
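As a rough illustration, the layers present in a NAF document can be listed by inspecting the top-level elements of its XML. The sketch below assumes that every layer is stored as a direct child of the root element and uses a toy document. The function name present_layers and the sample content are hypothetical, and element names in real NAF files may differ from the layer names listed above.

```python
import xml.etree.ElementTree as ET

# Toy NAF-like document (hypothetical content, for illustration only).
SAMPLE = """<NAF>
  <raw>Dit is een zin.</raw>
  <text><wf id="w1">Dit</wf></text>
  <terms><term id="t1" lemma="dit"/></terms>
</NAF>"""


def present_layers(naf_xml: str) -> list[str]:
    """Return the tag names of the top-level elements, in document order."""
    root = ET.fromstring(naf_xml)
    return [child.tag for child in root]


if __name__ == "__main__":
    print(present_layers(SAMPLE))
```

Such a check can help decide which components still need to run on a partially annotated document.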

Components

Our version of the Dutch NewsReader pipeline uses the following components:

Component versions

The versions of the components used by the pipeline are stored in ./cfg/component_versions. This file is loaded by the installation script.

Component dependencies

Each component either generates one or more layers or modifies an existing layer. A component depends on one or more input layers, and may additionally require specific components to run first, beyond those needed to produce its input layers. The following table summarizes the dependencies of the Dutch NewsReader pipeline:

component                     input layers                               required components          output layers
----------------------------  -----------------------------------------  ---------------------------  -------------------------
text2naf                      -                                          -                            raw
ixa-pipe-tok                  raw                                        -                            text
vua-alpino                    text                                       -                            terms, deps, constituents
ixa-pipe-nerc                 text, terms                                -                            entities
ixa-pipe-ned                  entities                                   -                            entities
vuheideltimewrapper           text, terms                                -                            timeExpressions
vua-wsd                       text, terms                                -                            terms
vua-ontotagging               terms                                      vua-wsd                      terms
vua-srl-nl                    terms, deps, constituents                  -                            srl
vua-framenet-classifier       terms, srl                                 vua-srl-nl, vua-ontotagging  srl
vua-nominal-event-detection   srl, terms                                 -                            srl
vua-srl-dutch-nominal-events  terms, deps, srl                           vua-nominal-event-detection  srl
vua-eventcoreference          srl, terms                                 -                            coreferences
opinion-miner                 text, terms, deps, constituents, entities  -                            opinions
multilingual-factuality       terms, coreferences, opinions              -                            factualities
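The dependency table can be written down as plain data and checked for consistency. The following is a minimal sketch, not the pipeline's own code: the names TABLE and check_order are hypothetical, and the dependency layer is written as deps throughout. It verifies that, in the listed order, every component finds its input layers and required components already available.

```python
# Each row: (component, input layers, required components, output layers),
# transcribed from the dependency table.
TABLE = [
    ("text2naf", [], [], ["raw"]),
    ("ixa-pipe-tok", ["raw"], [], ["text"]),
    ("vua-alpino", ["text"], [], ["terms", "deps", "constituents"]),
    ("ixa-pipe-nerc", ["text", "terms"], [], ["entities"]),
    ("ixa-pipe-ned", ["entities"], [], ["entities"]),
    ("vuheideltimewrapper", ["text", "terms"], [], ["timeExpressions"]),
    ("vua-wsd", ["text", "terms"], [], ["terms"]),
    ("vua-ontotagging", ["terms"], ["vua-wsd"], ["terms"]),
    ("vua-srl-nl", ["terms", "deps", "constituents"], [], ["srl"]),
    ("vua-framenet-classifier", ["terms", "srl"],
     ["vua-srl-nl", "vua-ontotagging"], ["srl"]),
    ("vua-nominal-event-detection", ["srl", "terms"], [], ["srl"]),
    ("vua-srl-dutch-nominal-events", ["terms", "deps", "srl"],
     ["vua-nominal-event-detection"], ["srl"]),
    ("vua-eventcoreference", ["srl", "terms"], [], ["coreferences"]),
    ("opinion-miner",
     ["text", "terms", "deps", "constituents", "entities"], [], ["opinions"]),
    ("multilingual-factuality",
     ["terms", "coreferences", "opinions"], [], ["factualities"]),
]


def check_order(table):
    """Return a list of messages for unmet dependencies, given the row order."""
    available_layers = set()   # layers produced so far
    finished = set()           # components that have already run
    problems = []
    for name, inputs, required, outputs in table:
        for layer in inputs:
            if layer not in available_layers:
                problems.append(f"{name}: missing input layer {layer}")
        for component in required:
            if component not in finished:
                problems.append(f"{name}: requires {component} first")
        available_layers.update(outputs)
        finished.add(name)
    return problems


if __name__ == "__main__":
    print(check_order(TABLE) or "order is consistent")
```

Run on the rows in table order, the check reports no unmet dependencies, so the listed order is itself one valid execution order.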

Pipeline graph

These dependencies result in the following execution graph:

[Figure: pipeline execution graph (_images/pipe-graph.png)]

The pipeline wrapper represents these dependencies as a directed acyclic graph, which allows the pipeline to be filtered, executed and rescheduled.
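To make the idea of filtering and scheduling concrete, here is a minimal sketch using Python's standard graphlib module (Python 3.9 or later). It is not the wrapper's actual API: the PREDECESSORS mapping covers only a subset of the components, its edges are hand-derived from the dependency table as an assumption, and the helper names ancestors and schedule are hypothetical.

```python
from graphlib import TopologicalSorter

# component -> components that must run before it
# (subset of the pipeline; edges are an assumption derived from the table)
PREDECESSORS = {
    "text2naf": set(),
    "ixa-pipe-tok": {"text2naf"},
    "vua-alpino": {"ixa-pipe-tok"},
    "ixa-pipe-nerc": {"vua-alpino"},
    "ixa-pipe-ned": {"ixa-pipe-nerc"},
    "vua-wsd": {"vua-alpino"},
    "vua-ontotagging": {"vua-wsd"},
    "vua-srl-nl": {"vua-alpino"},
    "vua-framenet-classifier": {"vua-srl-nl", "vua-ontotagging"},
}


def ancestors(graph, goal):
    """Return the goal plus every component it transitively depends on."""
    needed, stack = set(), [goal]
    while stack:
        node = stack.pop()
        if node not in needed:
            needed.add(node)
            stack.extend(graph[node])
    return needed


def schedule(graph, goal):
    """Filter the graph down to what the goal needs, then order it."""
    keep = ancestors(graph, goal)
    subgraph = {node: graph[node] & keep for node in keep}
    # static_order yields every node after all of its predecessors.
    return list(TopologicalSorter(subgraph).static_order())


if __name__ == "__main__":
    print(schedule(PREDECESSORS, "ixa-pipe-ned"))
```

Under these assumptions, rescheduling after a failure amounts to calling schedule again on a graph from which the already completed components have been removed.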