Translation Memory from Doffin

Open data API in a single place

Provided by difi

Get early access to Translation Memory from Doffin API!

Let us know and we will figure it out for you.

Dataset information

Country of origin:
Updated: 2020.11.04 00:00
Created:
Available languages: Norwegian
Keywords: tekst, korpus, språkbanken, språkteknologi, språkforskning
Quality scoring: 245

Dataset description

This corpus contains data from Doffin, the Norwegian web-based database for notices of public procurement and procurement in the utility sector, managed by the Norwegian Agency for Public and Financial Management. The Language Bank received the data as an XML database dump consisting of 41,143 document pairs (original and translation). Of these, 40,631 were translations from Norwegian to English, and only those are included in the corpus. Of the original Norwegian documents, 39,893 were in Norwegian Bokmål and 736 in Norwegian Nynorsk. Originals and translations were first aligned at document level using an internal document identifier; the sentences were then extracted using the NLTK Punkt Sentence Tokenizer and aligned using Hunalign. Exact duplicate translations were discarded. This yielded a total of 293,649 translation units (TUs) for Norwegian Bokmål to English and 6,342 TUs for Norwegian Nynorsk to English. A TU is a translation pair consisting of an original text and its aligned translation. It corresponds to a more or less meaningful linguistic unit, typically a clause or a heading, but it may also consist of a single word or several clauses. The translation units for the two source languages are distributed as two separate files, both in TMX 1.4 format (an XML-based format).
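As a rough illustration of how the TMX files described above might be consumed, the sketch below reads translation units from a small TMX 1.4 snippet using only the Python standard library and drops exact duplicate pairs. The sample sentences, the language codes `nb` and `en`, and the function name `read_translation_units` are assumptions made for this example; the actual files may use different language codes and header values.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample in TMX 1.4 layout; the sentences are invented.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="nb" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="nb"><seg>Kunngjøring av konkurranse</seg></tuv>
      <tuv xml:lang="en"><seg>Contract notice</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="nb"><seg>Tjenester</seg></tuv>
      <tuv xml:lang="en"><seg>Services</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="nb"><seg>Tjenester</seg></tuv>
      <tuv xml:lang="en"><seg>Services</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_translation_units(tmx_text, src_lang="nb", tgt_lang="en"):
    """Return unique (source, target) pairs in document order."""
    root = ET.fromstring(tmx_text)
    pairs = []
    seen = set()
    for tu in root.iter("tu"):
        # Collect the segment text of each language variant in this unit.
        segments = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                segments[tuv.get(XML_LANG)] = "".join(seg.itertext()).strip()
        pair = (segments.get(src_lang), segments.get(tgt_lang))
        # Skip units missing either language, and exact duplicates.
        if None in pair or pair in seen:
            continue
        seen.add(pair)
        pairs.append(pair)
    return pairs


if __name__ == "__main__":
    for source, target in read_translation_units(SAMPLE_TMX):
        print(f"{source} -> {target}")
```

For the full files, `ET.iterparse` would likely be a better fit than loading the whole document into memory at once.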
Built on reliable and scalable technology
Revolgy, Amazon Web Services, Google Cloud

Frequently Asked Questions

Some basic information about API Store ®.

The operation and development of the APIs are currently fully funded by Apitalks, and their use is free.
Yes, you can.
All important information, such as the time of the last update and the license, is included in the response of each API call.
In the case of a major update that is not compatible with the previous version of an API, we keep both versions available for 30 days so that you have enough time to move to the new version. We will inform you about the changes in advance by e-mail.

Didn't find the API you need?

Let us know and we will figure it out for you.

API Store provides access to European Open Data via a scalable and reliable REST API.
Copyright © 2024. Made with ♥ by Apitalks