Network-DC aims to establish a global multilingual language resources network through a collaboration agreement between the Evaluations and Language resources Distribution Agency (ELDA) and its US counterpart, the Linguistic Data Consortium (LDC). The resources will be available over a network to the research, education, and industry sectors.
Network-DC will initiate a large-scale collaborative data collection project between ELDA and LDC, involving the production, acquisition, normalisation, certification and distribution of spoken and written language resources for research and technology development. This pragmatic approach will ensure that the project’s ideas are workable and will, at the same time, provide a future model for co-operation in licensing, distribution and common standards.
Network-DC will set up a network of data centres, thus facilitating access to electronic language resources currently managed by many different regional data centres.
In so doing, the project will establish new principles and practices for co-operation between the European Language Resources Association (EU) and the Linguistic Data Consortium (US), covering several areas of language resources management.
The European side of the project is in charge of creating up to five broadcast news corpora in different languages, while the US side is responsible for creating a linguistic corpus that includes significant samples of the 45 broadcast languages used by the Voice of America.
From December 2000 to May 2001
- Evaluations and Language resources Distribution Agency - ELDA
- Linguistic Data Consortium - LDC (US)
- Speech Processing Expertise Centre - SPEX (NL)