Abstract

Brentel, I. (2018, October 26). A case study of processing large scale data – A method to accomplish reproducibility. BigSurv18 – Big Data Meets Survey Science. Research and Expertise Center in Survey Methodology (RECSM), Universitat Pompeu Fabra, Barcelona, Spain. [Tandem 1]

Relevance & Research Question:
Our project attempts to fill a gap in communication studies by creating a harmonized longitudinal dataset (since 1954) on media use in Germany, drawing on the Media-Analysis-Data, which is based on representative surveys with 30,000 respondents each year. The relevance of this project lies in making large-scale media use data accessible for academic research with high standards of data documentation. The research question, therefore, is how to make the Media-Analysis-Data, as a form of ‘big data’, accessible for academic research while remaining transparent enough to ensure reproducibility.

Methods & Data:
This paper presents the theoretical and practical uses of CharmStats, a digital harmonization software package used throughout this project. The goal of the data processing was to create a scientific use file that sets excellent documentation standards with the help of CharmStats, and to continue the harmonizations up to 2009. Using CharmStats, we review the challenges encountered and the solutions developed in large-scale data processing, treating the project as a case study in mass variable harmonization. With more than 1.5 million cases per dataset (there are two harmonized datasets in total) and almost 30,000 variables covering over 60 years of press media, almost 40 years of radio, and now eight years of online media, the Media-Analysis-Data can be counted as the largest dataset on media use in Germany made available to academics. The methodological approach of this project can therefore serve as a use case for documenting and harmonizing big data for academic research in a way that secures traceability.

Results:
The aim of the project is to make the complex and labour-intensive data processing procedure for large-scale data fully transparent and traceable. CharmStats makes it possible to fulfil the project's goals because it produces syntax files for proprietary statistical software packages for data processing, plus a report for documentation. In the presentation we will portray the different steps taken to fulfil the project's goals and answer the research question:

  1. Finding a structure to work with,
  2. Setting standards for data documentation that make data processing traceable with CharmStats,
  3. Producing a harmonized dataset, and
  4. Making the dataset reproducible and, moreover, making it an accessible and sustainable source for academic research through the Library of Online Harmonization (scheduled for release in 2019).
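
To make the idea of traceable harmonization in the steps above more concrete, the following is a minimal sketch in Python, not the CharmStats workflow itself. It assumes that every recode is stored as an explicit rule, so the same table both drives the processing and serves as documentation. All variable names, waves, codes, and labels (`q7`, `press_freq`, `reader`) are invented for illustration and do not come from the Media-Analysis-Data.

```python
from collections import namedtuple

# One documented recode rule: in a given survey wave, a code of a source
# variable is mapped to a code of the harmonized target variable.
# All names and codes below are hypothetical examples.
Recode = namedtuple("Recode", "wave source_var source_code target_code")

RECODES = [
    Recode(1954, "q7", 1, 1),          # "daily" -> regular reader
    Recode(1954, "q7", 2, 1),          # "several times a week" -> regular reader
    Recode(1954, "q7", 3, 0),          # "rarely or never" -> not a regular reader
    Recode(2009, "press_freq", 1, 1),  # "regularly" -> regular reader
    Recode(2009, "press_freq", 2, 0),  # "not regularly" -> not a regular reader
]


def harmonize(record, recodes, target_var="reader"):
    """Return a copy of record with target_var added (None if no rule applies)."""
    result = dict(record)
    result[target_var] = None
    for rule in recodes:
        same_wave = rule.wave == record["wave"]
        if same_wave and record.get(rule.source_var) == rule.source_code:
            result[target_var] = rule.target_code
            break
    return result


def documentation_report(recodes):
    """Render the recode rules as text lines so every step can be traced."""
    return [
        f"{r.wave}: {r.source_var}={r.source_code} -> {r.target_code}"
        for r in recodes
    ]


records = [
    {"wave": 1954, "q7": 2},
    {"wave": 2009, "press_freq": 2},
    {"wave": 2009, "press_freq": 9},  # a code without a rule stays unmapped
]
harmonized = [harmonize(r, RECODES) for r in records]
```

Because the rules are data rather than ad hoc code, rerunning `harmonize` on the original records reproduces the harmonized variable, and `documentation_report(RECODES)` lists exactly which source codes fed into it.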