e-Infrastructure for scientific data in psycholinguistics: a survey by PARSE.Insight

By Beate Sturm, Goettingen State and University Library and Max Planck Digital Library, Germany

Scientific data are at risk: they are scattered among various institutions and administered by researchers themselves. As a result, they are rarely accessible or preserved for the future.
To counter this, the EU-funded project PARSE.Insight will develop a roadmap and recommendations for shaping an e-Infrastructure to maintain the long-term accessibility of scientific digital information in Europe.

As part of PARSE.Insight, several case studies have been conducted, including one on the use of data in psycholinguistics. Psycholinguists study the relationship between the human mind and language; they create data through experiments and observations, or use data stored in language archives. The case study covered respondents both inside and outside the European Union.

Psycholinguistics survey coverage
A survey conducted as part of the psycholinguistics case study generated 401 responses (205 complete; 196 partial), the vast majority of which (90.1%) came from researchers. The largest group were individual researchers (almost 50%), followed by research group leaders/managers (27.7%) and members of research groups (24.1%). The researchers usually answered on behalf of their department (40.5%) or their working group or project (39.9%), and rarely on behalf of their entire institute (less than 20%).
Almost 70% of respondents work in the EU, three quarters of them in Germany, the United Kingdom, the Netherlands or France. Eastern Europe was underrepresented, and more than half of responses from outside the EU came from America. The survey therefore best represents the central-European psycholinguistic community, where significant psycholinguistic institutes are located.
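The response counts reported above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the two counts are taken from the text, while the completion shares are derived here for illustration and are not figures reported by the survey.

```python
# Response counts as reported in the survey coverage paragraph.
complete = 205
partial = 196

# The two groups should add up to the reported total of 401 responses.
total = complete + partial

# Derived shares (an illustration, not figures stated in the survey report).
complete_share = 100 * complete / total
partial_share = 100 * partial / total

print(f"total responses: {total}")
print(f"complete: {complete_share:.1f}%")
print(f"partial: {partial_share:.1f}%")
```

Roughly half of the responses were complete, which is worth keeping in mind when reading the percentages in the following sections, since the text does not state how many respondents answered each individual question.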

e-Infrastructure for psycholinguistic data
When asked about the relevance of specific e-Infrastructure components (tools for automated data generation, research tools for language resources, automated data exchange and conversion, metadata generation, persistent identifier systems, storage, recommended standards, preservation policies and training in data curation), most respondents rated all components as fundamental or important. The components rated most important are those used in basic day-to-day work (“research tools for language resources” and “storage”), while components highly relevant to the long-term preservation of data (such as “tools for metadata generation” and “persistent identifier systems”) are judged less important.
“Research tools for language resources” and “storage” are the only components that respondents believe already exist. It can therefore be assumed that these components are the ones most frequently provided within existing e-Infrastructures for psycholinguistic data. Hence, the most important components are also the most frequently available and, presumably, the most frequently used.
The researchers are not sure whether the other components exist. Hence, it can be assumed that they do not use these components, even if they understand their importance.
“Training in data curation” is the component researchers miss most, even though most consider it an important component.
Nearly 40% of researchers do not think that existing components fit their needs, and 37.3% are not sure. Even though researchers are not sure which e-Infrastructure components already exist, more than 23% are satisfied with them. Although almost a quarter state that the components fit their needs, more than 54% see the need for improvements; more than 42% are uncertain about it, and only 3% think nothing should be improved.
Although most researchers consider all the listed components fundamental or important, most of them do not know whether existing e-Infrastructures provide these components, whether the components fit their needs, or what should be improved. Despite this uncertainty, 76% express the need for a networked and transnational e-Infrastructure for psycholinguistic digital research data. Only 7.5% do not see this need, and about 16% are not sure.

Summary
Within the surveyed psycholinguistic community there is interest in the long-term preservation of data and awareness that efforts are needed to shape e-Infrastructures for these data. However, respondents are uncertain about the existence of current e-Infrastructure components and how to use them – they state a need for more training.
Hence, there is a need for more information on how to use the e-Infrastructure components required for the long-term preservation, (re)use and exchange of scientific data.

Zero-In - Issue 3 - 3 e 4
