Nenek: a cloud-based collaboration platform for the management of Amerindian language resources Article uri icon

abstract

  • This article presents Nenek: A cloud-based collaboration platform for language documentation of underresourced languages. Nenek is based on a crowdsourcing scheme that supports native speakers, indigenous associations, government agencies and researchers in the creation of virtual communities of minority language speakers on the Internet. Nenek includes a set of web tools that enables users to work collaboratively on language documentation tasks, build lexicographic assets and produce new language resources. This platform includes a three-stage management model to control the acquisition of existent language resources, the manufacturing of new resources, as well as their distribution within the virtual community and to the general public. In the acquisition stage, existent language resources are either automatically extracted from the web by a crawler or received through donations from users who participate in a monolingual social network. In the manufacturing stage, lexicographic and collaborative tools enable users to build new resources while the acquired and manufactured resources are published in the diffusion stage, either within the virtual community or publicly. We present a life cycle mapping scheme that registers the transformations of the language resources at each of the three stages of language resource management. This scheme also traces the utilization and diffusion of each resource produced by the virtual community. The paper includes a case study in which we present the use of the Nenek platform in a language documentation project of a Mayan language spoken in Mexico%27s Gulf coast region called Huastec. This case study reveals Nenek%27s efficiency in terms of acquisition, annotation, manufacturing and diffusion of language resources; it also discusses the participation of the members of the virtual community. © 2016, Springer Science Business Media Dordrecht.
  • This article presents Nenek: A cloud-based collaboration platform for language documentation of underresourced languages. Nenek is based on a crowdsourcing scheme that supports native speakers, indigenous associations, government agencies and researchers in the creation of virtual communities of minority language speakers on the Internet. Nenek includes a set of web tools that enables users to work collaboratively on language documentation tasks, build lexicographic assets and produce new language resources. This platform includes a three-stage management model to control the acquisition of existent language resources, the manufacturing of new resources, as well as their distribution within the virtual community and to the general public. In the acquisition stage, existent language resources are either automatically extracted from the web by a crawler or received through donations from users who participate in a monolingual social network. In the manufacturing stage, lexicographic and collaborative tools enable users to build new resources while the acquired and manufactured resources are published in the diffusion stage, either within the virtual community or publicly. We present a life cycle mapping scheme that registers the transformations of the language resources at each of the three stages of language resource management. This scheme also traces the utilization and diffusion of each resource produced by the virtual community. The paper includes a case study in which we present the use of the Nenek platform in a language documentation project of a Mayan language spoken in Mexico's Gulf coast region called Huastec. This case study reveals Nenek's efficiency in terms of acquisition, annotation, manufacturing and diffusion of language resources; it also discusses the participation of the members of the virtual community. © 2016, Springer Science%2bBusiness Media Dordrecht.

publication date

  • 2017-01-01