Named Entity Recognition


About this dataset

This dataset gives you access to 24 categories of annotated named entities, ranging from typical person names, locations, and company names to markers for date, time, and duration, among many others. Use it to train models that identify the entities relevant to your chatbot or NLP application.

The dataset features 150,000 sentences in Norwegian (Bokmål), Finnish, Turkish, Hindi, Arabic, Danish, Swedish, Hebrew, Russian, and Czech.

License Information

This dataset is covered by the Defined.ai standard Data License Agreement. The license is perpetual and allows commercialization of all models built on the data.

Sample

NER_RU_Short_Sample.PNG



© 2025 DefinedCrowd. All rights reserved.