Daystream annotated text corpus of traffic messages

Open data API in a single place

Provided by GKSt GovData Freie und Hansestadt Hamburg

Dataset information

Catalog
Country of origin
Updated
2023.03.06 11:04
Created
2020.06.09
Available languages
German
Keywords
mcloud_category_railway, mfund-fkz-19f2031a-e, mfund-projekt-daystream---datenanalytik-und-ki-für-sichere-und-zuverlässige-mobilität, mcloud_id723c0e26-2832-46c7-85cd-1e9424e56240, mcloud_category_roads
Quality scoring
250

Dataset description

The Daystream Corpus is a dataset of 3541 traffic messages in which proper names (e.g. roads, lines, stops), their reference IDs (e.g. DHID, DLID, OSM IDs), and relations (e.g. traffic jam, accident, rail replacement service) are manually annotated. The dataset can be used as a training or test corpus for information extraction tasks such as named entity recognition, entity linking and relation extraction.

Dataset statistics:

                                 Twitter      RSS    Total
docs                                2825      716     3541
tokens                             69188    34630   103818
entities                           15280     8112    23392
relations                            365      427      792
docs with annotated relations        305      338      643
linked entities (org|loc)           5138     3331     8469
NIL entities                        4764     1698     6462

The Daystream corpus is released under the CC-BY 4.0 license. If you use this data, please cite the following publication:

A German Corpus for Fine-Grained Named Entity Recognition and Relation Extraction of Traffic and Industry Events. Martin Schiersch, Veselina Mironova, Maximilian Schmitt, Philippe Thomas, Aleksandra Gabryszak, Leonhard Hennig. Proceedings of LREC, 2018.

Further information and details: https://github.com/DFKI-NLP/daystream-corpus/
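
The counts in the dataset statistics can be cross-checked and turned into simple per-document ratios. The following is a minimal sketch in Python that uses only the numbers from the description; the names `STATS`, `totals_consistent` and `per_source_ratio` are illustrative assumptions and are not part of the corpus or of any API.

```python
# Counts copied from the dataset statistics: (Twitter, RSS, Total).
# The dictionary layout and the helper names are hypothetical.
STATS = {
    "docs": (2825, 716, 3541),
    "tokens": (69188, 34630, 103818),
    "entities": (15280, 8112, 23392),
    "relations": (365, 427, 792),
    "docs with annotated relations": (305, 338, 643),
    "linked entities (org|loc)": (5138, 3331, 8469),
    "NIL entities": (4764, 1698, 6462),
}
SOURCES = ("Twitter", "RSS", "Total")


def totals_consistent(stats):
    """Return True if Twitter + RSS equals Total in every row."""
    return all(tw + rss == total for tw, rss, total in stats.values())


def per_source_ratio(stats, numerator, denominator):
    """Divide one row by another, column by column."""
    return {
        src: n / d
        for src, n, d in zip(SOURCES, stats[numerator], stats[denominator])
    }


if __name__ == "__main__":
    print("totals consistent:", totals_consistent(STATS))
    for label, num, den in [
        ("tokens per doc", "tokens", "docs"),
        ("entities per doc", "entities", "docs"),
        ("share of docs with relations", "docs with annotated relations", "docs"),
    ]:
        ratios = per_source_ratio(STATS, num, den)
        line = ", ".join(f"{src}: {value:.2f}" for src, value in ratios.items())
        print(f"{label}: {line}")
```

Ratios of this kind show, for example, that relation annotations cover only a minority of the documents, which may matter when the corpus is split into training and test sets for relation extraction.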
Frequently Asked Questions

Some basic information about API Store®.

The operation and development of the APIs are currently fully funded by Apitalks, and their use is free of charge.
Yes, you can.
All important information, such as the time of the last update and the license, is included in the response of each API call.
In case of a major update that is not compatible with the previous version of an API, we keep both versions available for 30 days so that you have enough time to migrate to the new version. We will inform you about the changes in advance by e-mail.

API Store provides access to European open data via a scalable and reliable REST API.