NLLU Datasets
Paracrawl English 15M
15 million English sentences randomly sampled from Paracrawl, translated using NLLB 3.3B and filtered.
Target Language | Sentences | Link |
---|---|---|
Italian | ~14M | Download |
Dutch | ~14M | Download |
Want more languages added to this list? Get in touch.
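As a rough illustration of how a download might be consumed, the sketch below pairs English sentences with their translations. It assumes the data unpacks into two line-aligned UTF-8 text files, one sentence per line; that layout, the function name `read_parallel`, and the file names in the usage note are assumptions for illustration only, so check the actual archive contents before relying on it.

```python
from itertools import zip_longest
from typing import Iterator, Tuple


def read_parallel(source_path: str, target_path: str) -> Iterator[Tuple[str, str]]:
    """Yield (english, translation) pairs from two line-aligned UTF-8 files.

    Assumption: line N of the source file corresponds to line N of the
    target file. This is a hypothetical layout, not a documented format.
    """
    with open(source_path, encoding="utf-8") as src, \
         open(target_path, encoding="utf-8") as tgt:
        for line_no, (en, tr) in enumerate(zip_longest(src, tgt), start=1):
            # zip_longest pads the shorter file with None, which signals
            # that the two files do not have the same number of lines.
            if en is None or tr is None:
                raise ValueError(f"files differ in length at line {line_no}")
            en, tr = en.strip(), tr.strip()
            # Skip pairs where either side is blank.
            if en and tr:
                yield en, tr
```

With hypothetical file names, `for en, it in read_parallel("source.en", "target.it"): ...` streams the pairs one at a time, which avoids loading roughly 14 million sentence pairs into memory at once.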
License
The source text comes from Paracrawl (https://paracrawl.eu/).
We do not own any of the source text from which this data has been translated.
We license the translated text and packaging of this parallel data under the Creative Commons Attribution 4.0 International (CC BY 4.0). Please cite "LibreTranslate" if you use the translated data.