Ethical and practical challenges of big data in evidence-based health research: a scoping review

Authors
1 Evidence-based Health Post-graduation Program, Universidade Federal de São Paulo, São Paulo
2 Administration Post-graduation Program, Universidade Nove de Julho, São Paulo
Abstract
Background: 'big data' is the name given to data sets so large that they require computerized and/or analytical processing. The field is at an initial stage and can bring an explosive availability of diverse data that are 'big' in volume, velocity and variety, supporting research and decision-making in several fields within health care. However, because studies using 'big data' rely on secondary data collected for purposes other than research, measurement error and spurious associations may be common. Although experimental designs such as randomized controlled trials (RCTs) are considered the gold standard for determining causal relationships, it is not ethical to conduct an RCT when there is no clinical equipoise. RCTs also have strict eligibility criteria, which limits their external validity. Evidence demonstrating how 'big data' can improve health research is scant.

Objectives: to identify the ethical and practical challenges of using 'big data' in evidence-based health (EBH) research.

Methods: we performed a sensitive search, without language or publication date restrictions, of the CENTRAL, MEDLINE, Scopus, LILACS and IBECS databases to retrieve studies exploring the use of 'big data' in health research, its ethical and practical issues and/or its relation to EBH research. Two authors screened, selected and summarized the eligible studies.

Results: we retrieved 1494 records, of which 66 were duplicates and 1232 were not related to the main subject. After reading 196 full-text articles, we included 63 studies. Regarding ethical aspects, the most frequent concern by far was the lack of privacy regulation. Other concerns included informed consent, security, data ownership, transparency and trust. Among practical challenges, bias and confounding were reported as major threats to the validity of data in research. These aspects require further reflection. Overall, the use of large databases and related algorithms should not supplant EBH, but rather complement it.
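
The screening counts reported above can be checked with simple arithmetic. The sketch below is only an illustrative tally, not part of the review's methods: the variable names are hypothetical, and the number of full-text exclusions is derived here by subtraction because the abstract does not report it.

```python
# Screening counts as reported in the Results paragraph.
records_retrieved = 1494
duplicates = 66
off_topic = 1232
full_text_read = 196
studies_included = 63

# Records left after removing duplicates and records unrelated to the subject.
remaining_after_screening = records_retrieved - duplicates - off_topic

# Derived value (an assumption of this sketch, not reported in the abstract):
# full-text articles excluded after reading.
full_text_excluded = full_text_read - studies_included

# The reported full-text count should match the computed remainder.
counts_are_consistent = remaining_after_screening == full_text_read

print(remaining_after_screening, full_text_excluded, counts_are_consistent)
```

Under these figures the records remaining after screening equal the number of full-text articles read, so the reported flow is internally consistent.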

Conclusions: 'big data' is constantly expanding and will help to produce more accurate analyses in the coming years. To achieve this, it will be necessary to combine current EBH practice with the precision of 'big data' and to share data across centers and countries in a standardized format. By combining EBH and precision approaches, evidence-based precision health can make 'big data' truly big. The challenges ahead are numerous, but so are the rewards, chief among them that every individual should be able to benefit from precision health.

Patient or healthcare consumer involvement: EBH involves formulating relevant questions and searching for studies from which to extract data. 'Big data' has become a useful tool for capturing information from people, increasing the range of variables that can be analyzed and making data quickly accessible. Thus, 'big data' can support studies that identify risk factors and adverse events and that assess the safety, efficiency and effectiveness of interventions that affect the lives of patients and consumers.