The National Health and Nutrition Examination Survey (NHANES) is a population survey implemented by the Centers for Disease Control and Prevention (CDC) to monitor the health of the United States population; its data are publicly available in hundreds of files. This Data Descriptor describes a single unified and universally accessible data file, merging across 255 separate files and stitching data across 4 surveys, encompassing 41,474 individuals and 1,191 variables. The variables consist of phenotype and environmental exposure information on each individual, specifically (1) demographic information, (2) physical exam results (e.g., height, body mass index), (3) laboratory results (e.g., cholesterol, glucose, and environmental exposures), and (4) questionnaire items. The Data Descriptor also describes a dictionary that enables analysts to find variables by category and human-readable description. The datasets are available on DataDryad and a hands-on analytics tutorial is available on GitHub. Through a new big data platform, BD2K Patient Centered Information Commons (http://pic-sure.org), we provide a new way to browse the dataset via a web browser (https://nhanes.hms.harvard.edu) and provide an application programming interface for programmatic access.
The recent announcement of the Precision Medicine Initiative by President Obama has brought precision medicine (PM) to the forefront for healthcare providers, researchers, regulators, innovators, and funders alike. As technologies continue to evolve and datasets grow in magnitude, a strong computational infrastructure will be essential to realize PM's vision of improved healthcare derived from personal data. In addition, informatics research and innovation affords a tremendous opportunity to drive the science underlying PM. The informatics community must lead the development of technologies and methodologies that will increase the discovery and application of biomedical knowledge through close collaboration between researchers, clinicians, and patients. This perspective highlights seven key areas that are in need of further informatics research and innovation to support the realization of PM.
INTRODUCTION: Existing observational data are increasingly used to produce empirical evidence in health care research quickly and transparently. Multiple databases are often used to increase power, to assess rare exposures or outcomes, or to study diverse populations. For privacy and sociological reasons, original data on individual subjects cannot be shared, requiring a distributed network approach where data processing is performed prior to data sharing.
CASE DESCRIPTIONS AND VARIATION AMONG SITES: We created a conceptual framework distinguishing three steps in local data processing: (1) data reorganization into a data structure common across the network; (2) derivation of study variables not present in original data; and (3) application of study design to transform longitudinal data into aggregated data sets for statistical analysis. We applied this framework to four case studies to identify similarities and differences in the United States and Europe: Exploring and Understanding Adverse Drug Reactions by Integrative Mining of Clinical Records and Biomedical Knowledge (EU-ADR), Observational Medical Outcomes Partnership (OMOP), the Food and Drug Administration's (FDA's) Mini-Sentinel, and the Italian network-the Integration of Content Management Information on the Territory of Patients with Complex Diseases or with Chronic Conditions (MATRICE).
FINDINGS: National networks (OMOP, Mini-Sentinel, MATRICE) all adopted shared procedures for local data reorganization. The multinational EU-ADR network needed locally defined procedures to reorganize its heterogeneous data into a common structure. Derivation of new data elements was centrally defined in all networks but the procedure was not shared in EU-ADR. Application of study design was a common and shared procedure in all the case studies. Computer procedures were embodied in different programming languages, including SAS, R, SQL, Java, and C++.
CONCLUSION: Using our conceptual framework, we found several areas that would benefit from research to identify optimal standards for production of empirical knowledge from existing databases.
... an opportunity to advance evidence-based care management. In addition, formalized CM outcomes assessment methodologies will enable us to compare CM effectiveness across health delivery settings.
Due to the heterogeneity of existing European sources of observational healthcare data, data source-tailored choices are needed to execute multi-data source, multi-national epidemiological studies. This makes transparent documentation paramount. In this proof-of-concept study, a novel standard data derivation procedure was tested in a set of heterogeneous data sources. Identification of subjects with type 2 diabetes (T2DM) was the test case. We included three primary care data sources (PCDs), three record linkage of administrative and/or registry data sources (RLDs), one hospital and one biobank. Overall, data from 12 million subjects from six European countries were extracted. Based on a shared event definition, sixteen standard algorithms (components) useful to identify T2DM cases were generated through a top-down/bottom-up iterative approach. Each component was based on one single data domain among diagnoses, drugs, diagnostic test utilization and laboratory results. Diagnoses-based components were subclassified considering the healthcare setting (primary, secondary, inpatient care). The Unified Medical Language System was used for semantic harmonization within data domains. Individual components were extracted and the proportion of the population identified was compared across data sources. Drug-based components performed similarly in RLDs and PCDs, unlike diagnoses-based components. Using components as building blocks, logical combinations with AND, OR, AND NOT were tested and local experts recommended their preferred data source-tailored combination. The population identified per data source by the resulting algorithms varied from 3.5% to 15.7%; however, age-specific results were fairly comparable. The impact of individual components was assessed: diagnoses-based components identified the majority of cases in PCDs (93-100%), while drug-based components were the main contributors in RLDs (81-100%).
The proposed data derivation procedure allowed the generation of data source-tailored case-finding algorithms in a standardized fashion, facilitated transparent documentation of the process and benchmarking of data sources, and provided bases for interpretation of possible inter-data source inconsistency of findings in future studies.
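The combination of components with AND, OR and AND NOT described above can be illustrated with set operations. The following is a minimal sketch, not the study's implementation: the component names, the subject IDs and the population size are all hypothetical, and each component is assumed to return the set of subject IDs it identifies in one data source.

```python
# Hypothetical component outputs: each standard component maps to the set of
# subject IDs it identifies in one data source (names and IDs are invented).
components = {
    "diagnosis_primary_care": {1, 2, 3, 5},
    "diagnosis_inpatient": {2, 8},
    "antidiabetic_drug": {2, 3, 4, 9},
    "insulin_only": {9},
}
POPULATION_SIZE = 20  # hypothetical number of subjects in the data source


def proportion_identified(subject_ids, population_size):
    """Share of the source population identified by a component or algorithm."""
    return len(subject_ids) / population_size


# OR: a diagnosis recorded in any healthcare setting.
any_diagnosis = (
    components["diagnosis_primary_care"] | components["diagnosis_inpatient"]
)

# AND NOT: drug users, excluding subjects identified by the insulin-only component.
drug_not_insulin_only = components["antidiabetic_drug"] - components["insulin_only"]

# Two candidate data source-tailored algorithms built from the same blocks.
broad_algorithm = any_diagnosis | drug_not_insulin_only             # OR
strict_algorithm = any_diagnosis & components["antidiabetic_drug"]  # AND

for name, ids in [("broad", broad_algorithm), ("strict", strict_algorithm)]:
    print(name, sorted(ids), proportion_identified(ids, POPULATION_SIZE))
```

Keeping each component as a plain set makes it easy to report the contribution of individual components alongside the final algorithm, which is the kind of benchmarking the procedure is meant to support.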
OBJECTIVE: The purpose of this study is to determine whether the posterior radioscaphoid angle, a marker of posterior displacement of the scaphoid, is associated with degenerative joint disease in patients with scapholunate ligament tears.
MATERIALS AND METHODS: Images from 150 patients with wrist pain who underwent CT arthrography and radiography were retrospectively evaluated. Patients with and without scapholunate ligament ruptures were divided into two groups according to CT arthrography findings. The presence of degenerative changes (scapholunate advanced collapse [SLAC] wrist) was evaluated and graded on conventional radiographs. Images were evaluated by two readers independently, and an adjudicator analyzed the discordant cases. Posterior radioscaphoid angle values were correlated with CT arthrography and radiographic findings. The association between posterior radioscaphoid angle and degenerative joint disease was evaluated. Scapholunate and radiolunate angles were considered in the analysis.
RESULTS: The posterior radioscaphoid angle was measurable in all patients, with substantial interobserver agreement (intraclass correlation coefficient, 0.75). The posterior radioscaphoid angle performed better than did the scapholunate and radiolunate angles in the differentiation of patients with and without SLAC wrist (p < 0.02). Posterior radioscaphoid angles greater than 114° presented an 80.0% sensitivity and 89.7% specificity for the detection of SLAC wrist.
CONCLUSION: Posterior radioscaphoid angles were strongly associated with degenerative wrist disease, with potential prognostic implications in patients with wrist trauma and scapholunate ligament ruptures.
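Sensitivity and specificity at an angle cutoff, as reported above for 114°, can be computed as in the following minimal sketch. The angle measurements and SLAC labels are invented for illustration and are not the study data; only the 114° threshold comes from the abstract.

```python
def sensitivity_specificity(angles, has_slac, threshold=114.0):
    """Classify an angle above `threshold` as test-positive and compare
    against the reference label (True = SLAC wrist present)."""
    tp = fn = tn = fp = 0
    for angle, diseased in zip(angles, has_slac):
        positive = angle > threshold
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical measurements (degrees) and reference labels.
angles = [120, 118, 110, 125, 116, 100, 105, 117, 98, 112]
has_slac = [True, True, True, True, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(angles, has_slac)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```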
OBJECTIVE: To evaluate the impact of computerized provider order entry (CPOE) at the bedside on medical students' training.
MATERIALS AND METHODS: We conducted a randomized cross-controlled educational trial on medical students during two clerkship rotations in three departments, assessing the impact of the use of CPOE on their ability to place adequate monitoring and therapeutic orders using a written test before and after each rotation. Students' satisfaction with their practice and the order placement system was surveyed. A multivariate mixed model was used to take individual students and chief resident (CR) effects into account. Factorial analysis was applied on the satisfaction questionnaire to identify dimensions, and scores were compared on these dimensions.
RESULTS: The thirty-six students showed no better progress during their rotation in either group (beginning and final test means = 69.87 and 80.98 points out of 176 for the control group, 64.60 and 78.11 for the CPOE group, p = 0.556), even after adjusting for each student and CR, but showed greater satisfaction with patient care and greater involvement in the medical team in the CPOE group (p = 0.035). Both groups had a favorable opinion of CPOE as an educational tool, especially because the supervisor reviews the orders.
CONCLUSION: This is the first randomized controlled trial assessing the effect of CPOE on both students' progress in prescribing ability and their satisfaction. The absence of effect on medical skills must be weighed against the short time scale and small sample size. However, students are more satisfied when using CPOE than with usual training.
Graft-versus-host disease (GVHD) is a known risk factor for invasive aspergillosis (IA), but remains poorly studied in relation to Clostridium difficile infection (CDI). We report the case of a 58-year-old patient who developed IA within a protected room, CDI and GVHD after allogeneic peripheral blood stem cell transplantation (PBSCT). Factors associated with this complex condition in patients receiving allogeneic PBSCT need to be identified.
This work proposes an integrated workflow for secondary use of medical data to serve feasibility studies, and the prescreening and monitoring of research studies. All research requests are initially addressed by the Clinical Research Office through a research portal and subsequently redirected to relevant experts in the relevant field. For secondary use of data, the workflow is then based on the clinical data warehouse of the hospital. A datamart with potentially eligible research candidates is constructed. Datamarts can produce aggregated data, de-identified data, or identified data, according to the kind of study concerned. In conclusion, integrating the secondary use of data into a general research workflow gives visibility to information technologies and improves the accessibility of clinical data.
In this paper, we present a semantic, metadata-based knowledge discovery methodology for identifying teams of researchers from diverse backgrounds who can collaborate on interdisciplinary research projects: projects in areas that have been identified as high-impact areas at The Ohio State University. This methodology involves the semantic annotation of keywords and the postulation of semantic metrics to improve the efficiency of the path exploration algorithm as well as to rank the results. Results indicate that our methodology can discover groups of experts from diverse areas who can collaborate on translational research projects.
OBJECTIVE: Matching healthcare staff resources to patient needs in the ICU is a key factor for quality of care. We aimed to assess the impact of the staffing-to-patient ratio and workload on ICU mortality.
DESIGN: We performed a multicenter longitudinal study using routinely collected hospital data.
SETTING: Information pertaining to every patient in eight ICUs from four university hospitals from January to December 2013 was analyzed.
PATIENTS: A total of 5,718 inpatient stays were included.
MEASUREMENTS AND MAIN RESULTS: We used a shift-by-shift varying measure of the patient-to-caregiver ratio in combination with workload to establish their relationships with ICU mortality over time, excluding patients with a decision to forego life-sustaining therapy. Using a multilevel Poisson regression, we quantified the relative risk of ICU mortality, adjusted for patient turnover, severity, and staffing levels. The risk of death was increased by a factor of 3.5 (95% CI, 1.3-9.1) when the patient-to-nurse ratio was greater than 2.5, and by a factor of 2.0 (95% CI, 1.3-3.2) when the patient-to-physician ratio exceeded 14. The highest ratios occurred more frequently during the weekend for nurse staffing and during the night for physicians (p < 0.001). High patient turnover (adjusted relative risk, 5.6 [2.0-15.0]) and the volume of life-sustaining procedures performed by staff (adjusted relative risk, 5.9 [4.3-7.9]) were also associated with increased mortality.
CONCLUSIONS: This study proposes evidence-based thresholds for patient-to-caregiver ratios, above which patient safety may be endangered in the ICU. Real-time monitoring of staffing levels and workload is feasible for adjusting caregivers' resources to patients' needs.
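As an illustration of shift-level monitoring against the thresholds reported above (patient-to-nurse ratio greater than 2.5, patient-to-physician ratio greater than 14), here is a minimal sketch. The `Shift` record, its field names and the example shifts are hypothetical; the abstract does not describe the monitoring system itself, and the sketch assumes at least one nurse and one physician per shift.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract; everything else is hypothetical.
NURSE_RATIO_THRESHOLD = 2.5
PHYSICIAN_RATIO_THRESHOLD = 14


@dataclass
class Shift:
    label: str
    patients: int
    nurses: int      # assumed to be at least 1
    physicians: int  # assumed to be at least 1


def exceeded_thresholds(shift):
    """Return the caregiver categories whose patient-to-caregiver ratio
    is strictly above the reported threshold for this shift."""
    exceeded = []
    if shift.patients / shift.nurses > NURSE_RATIO_THRESHOLD:
        exceeded.append("nurse")
    if shift.patients / shift.physicians > PHYSICIAN_RATIO_THRESHOLD:
        exceeded.append("physician")
    return exceeded


shifts = [
    Shift("weekday day", patients=12, nurses=6, physicians=2),
    Shift("weekend day", patients=12, nurses=4, physicians=2),
    Shift("night", patients=15, nurses=6, physicians=1),
]
for shift in shifts:
    print(shift.label, exceeded_thresholds(shift))
```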
BACKGROUND: Children with inflammatory bowel disease are at risk of vaccine-preventable diseases mostly due to immunosuppressive drugs.
AIM: To evaluate coverage after an awareness campaign informing patients, their parents and general practitioner about the vaccination schedule.
METHODS: Vaccination coverage was first evaluated, followed by an awareness campaign on the risk of infection sent via postal mail. The trial is a case-control study on the same patients before and after the awareness campaign. Overall, 92 children were included. A questionnaire was then completed during a routine appointment to collect data including age at diagnosis, age at data collection, treatment history, and vaccination status.
RESULTS: Vaccination rates significantly increased for vaccines against diphtheria-tetanus-poliomyelitis (92% vs. 100%), Haemophilus influenzae (88% vs. 98%), hepatitis B (52% vs. 71%), pneumococcus (36% vs. 57%), and meningococcus C (17% vs. 41%) (p<0.05). Children who were older at diagnosis were 1.26 times more likely to be up-to-date with a minimum vaccination schedule (diphtheria-tetanus-poliomyelitis, pertussis, H. influenzae, measles-mumps-rubella, tuberculosis) (p=0.002).
CONCLUSION: Informing inflammatory bowel disease patients, their parents and general practitioner about the vaccination schedule via postal mail is easy, inexpensive, reproducible, and increases vaccination coverage. This method reinforces information on the risk of infection during routine visits.
OBJECTIVES: In France, medical students regularly complain about the shortcomings of their theoretical training and the need to adapt it to better fit their needs. The goal was to evaluate theoretical teaching practices in postgraduate medical studies by: 1) collecting data from medical students in different medical faculties in France; 2) comparing these data with expected practices where possible; and 3) proposing several lines of improvement.
METHODS: A survey of theoretical practices in the 3rd cycle of medical studies was conducted by self-administered questionnaires which were free of charge, anonymous, and administered electronically from July 3 to October 31, 2013 to all medical students in France.
RESULTS: National, inter-regional, regional and field internship educational content was absent in respectively 50.5%, 42.8%, 26.0% and 30.2% of cases. Medical students follow complementary training due to insufficient DES and/or DESC 2 training in 43.7% of cases or as part of a professional project in 54.9% of cases. The knowledge sought by medical students concerns the following crosscutting topics: career development (58.9%), practice management (50.7%), medical English (50.4%) and their specialty organization (49.9%). Fifty-four point one percent would like to be evaluated on their theoretical training on an annual basis.
CONCLUSION: The results of this first national survey give insights into the theoretical teaching conditions in postgraduate medical education in France and the aspirations of medical students.
While risk of acute kidney injury (AKI) is a well documented adverse effect of some drugs, few studies have assessed the relationship between drug-drug interactions (DDIs) and AKI. Our objective was to develop an algorithm capable of detecting potential signals on this relationship by retrospectively mining data from electronic health records.
MATERIAL AND METHODS: Data were extracted from the clinical data warehouse (CDW) of the Hôpital Européen Georges Pompidou (HEGP). AKI was defined as the first level of the RIFLE criteria, that is, an increase of ≥50% over baseline creatinine. Algorithm accuracy was tested on 20 single drugs, 10 nephrotoxic and 10 non-nephrotoxic. We then tested 45 pairs of non-nephrotoxic drugs, among the most prescribed at our hospital and representing distinct pharmacological classes for DDIs.
Sensitivity and specificity were 50% (95% confidence interval [CI] 23.66-76.34) and 90% (95% CI 59.58-98.21), respectively, for single drugs. Our algorithm confirmed a previously identified signal concerning clarithromycin and calcium-channel blockers (unadjusted odds ratio [ORu] 2.92; 95% CI 1.11-7.69, p = 0.04). Among the 45 drug pairs investigated, we identified a signal concerning 55 patients in association with bromazepam and hydroxyzine (ORu 1.66; 95% CI 1.23-2.23). This signal was not confirmed after a chart review. Even so, AKI and co-prescription were confirmed for 96% (95% CI 88-99) and 88% (95% CI 76-94) of these patients, respectively.
Data mining techniques on CDW can foster the detection of adverse drug reactions when drugs are used alone or in combination.
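The source does not say how its confidence intervals for sensitivity and specificity were computed. The sketch below assumes a Wilson score interval and assumes the underlying counts were 5 of 10 nephrotoxic drugs detected and 9 of 10 non-nephrotoxic drugs correctly not flagged (inferred from the 50% and 90% figures and the 10 + 10 design). Under those assumptions the result agrees with the reported intervals to within rounding.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z = 1.959964 gives 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumed counts, inferred from the reported percentages (not stated in the source).
sens_low, sens_high = wilson_interval(5, 10)   # compare with 23.66-76.34
spec_low, spec_high = wilson_interval(9, 10)   # compare with 59.58-98.21

print(f"sensitivity 50%: {100 * sens_low:.2f}-{100 * sens_high:.2f}")
print(f"specificity 90%: {100 * spec_low:.2f}-{100 * spec_high:.2f}")
```

The Wilson interval behaves better than the simple normal approximation for small samples such as n = 10, which may be why the reported interval for 90% is not symmetric around the point estimate.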
Background: The objective of this study was to measure the prevalence of inflammatory bowel disease (IBD) among patients with autism spectrum disorders (ASD), which has not been well described previously.
Methods: The rates of IBD among patients with and without ASD were measured in 4 study populations with distinct modes of ascertainment: a health care benefits company, 2 pediatric tertiary care centers, and a national ASD repository. The rates of IBD (established through International Classification of Diseases, Ninth Revision, Clinical Modification [ICD-9-CM] codes) were compared with respective controls and combined using a Stouffer meta-analysis. Clinical charts were also reviewed for IBD among patients with ICD-9-CM codes for both IBD and ASD at one of the pediatric tertiary care centers. This expert-verified rate was compared with the rate in the repository study population (where IBD diagnoses were established by expert review) and in nationally reported rates for pediatric IBD.
Results: In all of the case-control study populations, the rates of IBD-related ICD-9-CM codes for patients with ASD were significantly higher than those of their respective controls (Stouffer meta-analysis, P < 0.001). Expert-verified rates of IBD among patients with ASD were 7 of 2728 patients in one study population and 16 of 7201 in a second study population. The age-adjusted prevalence of IBD among patients with ASD was higher than in their respective controls and than nationally reported rates of pediatric IBD.
Conclusions: Across populations with different kinds of ascertainment, there was a consistent and statistically significant higher prevalence of IBD in patients with ASD than in their respective controls and than nationally reported rates for pediatric IBD.
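Stouffer's method, used above to combine the per-population comparisons, converts each one-sided p-value to a z-score and pools the z-scores. The following is a minimal sketch with invented p-values; whether the study used weights, and which ones, is not stated in the abstract, so the optional `weights` argument is an assumption.

```python
import math
from statistics import NormalDist


def stouffer(p_values, weights=None):
    """Combine one-sided p-values with Stouffer's Z method.

    Returns the combined z-score and the combined one-sided p-value.
    With weights=None every study gets equal weight.
    """
    std_normal = NormalDist()
    if weights is None:
        weights = [1.0] * len(p_values)
    z_scores = [std_normal.inv_cdf(1 - p) for p in p_values]
    combined_z = sum(w * z for w, z in zip(weights, z_scores)) / math.sqrt(
        sum(w * w for w in weights)
    )
    return combined_z, 1 - std_normal.cdf(combined_z)


# Hypothetical one-sided p-values from four study populations.
example_p_values = [0.04, 0.03, 0.06, 0.02]
combined_z, combined_p = stouffer(example_p_values)
print(f"combined z = {combined_z:.2f}, combined p = {combined_p:.5f}")
```

Because the z-scores add, several modest but concordant results can yield a combined p-value well below any individual one, which is the situation the abstract describes.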
The rise of personalized medicine and the availability of high-throughput molecular analyses in the context of clinical care have increased the need for adequate tools for translational researchers to manage and explore these data. We reviewed the biomedical literature for translational platforms allowing the management and exploration of clinical and omics data, and identified seven platforms: BRISK, caTRIP, cBio Cancer Portal, G-DOC, iCOD, iDASH and tranSMART. We analyzed these platforms along seven major axes. (1) The community axis grouped information regarding initiators and funders of the project, as well as availability status and references. (2) Under the information content axis, we grouped the nature of the clinical and omics data handled by each system. (3) The privacy management environment axis encompassed functionalities allowing control over data privacy. (4) In the analysis support axis, we detailed the analytical and statistical tools provided by the platforms. We also explored (5) interoperability support and (6) system requirements. The final axis, (7) platform support, listed the availability of documentation and installation procedures. Considerable heterogeneity was observed in the platforms' capability to manage phenotype information in addition to omics data, and in their security and interoperability features. The analytical and visualization features depend strongly on the platform considered. Similarly, the availability of the systems is variable. This review aims to provide the reader with the background to choose the platform best suited to their needs. To conclude, we discuss the desiderata for optimal translational research platforms, in terms of privacy, interoperability and technical features.
BACKGROUND: Medline/PubMed is the most frequently used medical bibliographic research database. The aim of this study was to propose a new generic method to limit any Medline/PubMed query based on the relative impact factor and the A & B categories of the SIGAPS score. MATERIAL AND METHODS: The entire PubMed corpus was used for the feasibility study, followed by ten frequent diseases in terms of PubMed indexing and the citations of four Nobel prize winners. The relative impact factor (RIF) was calculated by medical specialty defined in Journal Citation Reports. The two queries, which included all the journals in category A (or A OR B), were added to any Medline/PubMed query as a central point of the feasibility study. RESULTS: Limitation using the SIGAPS category A was larger than when using the Core Clinical Journals (CCJ): 15.65% of PubMed corpus vs 8.64% for CCJ. The response time of this limit applied to the entire PubMed corpus was less than two seconds. For five diseases out of ten, limiting the citations with the RIF was more effective than with the CCJ. For the four Nobel prize winners, limiting the citations with the RIF was more effective than with the CCJ. CONCLUSION: The feasibility study to apply a new filter based on the relative impact factor on any Medline/PubMed query was positive.
OBJECTIVE: Our objective was to study HIV management (hospital, ambulatory, and mixed) and to assess compliance using a health insurance database.
METHOD: We conducted a retrospective study using the French Social Security (CPAM) database. The inclusion criteria were age >18 years and at least 2 prescriptions of antiretroviral therapy.
RESULTS: Five hundred and seventy-five patients were included: extra-hospital (12), hospital (162), mixed (401). The prescriptions were exclusively hospital issued for 76.2% of the patients. Among the mixed group patients, 91% of treatments were delivered at least once in the community, and 45.6% of biological tests were performed in private laboratories at least once. The sex ratio (2.1 vs. 1.3), the number of patients having switched antiretroviral therapy (36.7% vs. 27.8%), and the frequency of biological tests (3.1 vs. 2.6) were significantly higher in the mixed group compared to the hospital group. The mean compliance was 90% in the hospital group and 91.8% in the mixed group. The compliance was <80% for 104 patients (21.8%). Patients with ≥80% compliance were older (46.1 years vs. 42.7 years), with more frequent biological tests (3 per year vs. 2.5 per year), and more frequent switches in treatment (35.4% vs. 26.0%).
CONCLUSION: Prescriptions of ARV were almost exclusively hospital issued. Their dispensation and biological tests were split between hospital and extra-hospital settings. Most patients demonstrated an optimal compliance. The CPAM database allows describing HIV management and assessing compliance.
PURPOSE: To test an automated method to decrease the number of false-positive (FP) signals of disproportionate reporting (SDRs) generated by co-prescription.
METHODS: Automated backward stepwise removal of reports concerning the drug associated with the highest ranked SDR for an event was tested for gastric and oesophageal haemorrhages (GOH), central nervous system haemorrhages and cerebrovascular accidents (CNSH), ischaemic coronary artery disorders and muscle pains (MP) using the reporting odds ratio in the French spontaneous reporting research database. After ranking SDRs detected in the complete dataset on the lower limit of the reporting odds ratio 95% confidence interval, reports concerning the drug with the highest ranked SDR were removed. In the dataset thus generated, SDRs were again identified, ranked and reports related to the drug involved in the newly highest ranked SDR removed. The process was repeated until no signal was detected. Initially detected SDRs eliminated using this technique were assessed regarding the summary of products characteristics and the literature to determine their FP nature.
RESULTS: Seventeen SDRs were successively eliminated for GOH, 37 for CNSH, 15 for ischaemic coronary artery disorders, and 36 for MP. Four were FP for GOH, 29 for CNSH, 7 for ischaemic coronary artery disorders and none for MP. The positive predictive value of the backward stepwise removal procedure in identifying FP SDRs ranged from 0% (MP) to 78.4% (CNSH).
CONCLUSIONS: Although further adjustment is needed to improve the method presented herein, our results suggest that numerous FP signals due to co-prescription bias could be eliminated using an automated method.
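The backward stepwise removal described in the methods can be sketched as follows. This is one reading of the procedure, not the authors' code: it assumes a report is a pair (set of drug names, event flag), that an SDR is a drug whose reporting odds ratio has a 95% lower confidence limit above 1, and that no minimum case count is required. The drug names and report counts are invented so that drug "B" appears as a signal only because it is co-reported with drug "A".

```python
import math


def ror_lower_limit(reports, drug):
    """Lower limit of the 95% CI of the reporting odds ratio, or None
    when a cell of the 2x2 table is empty."""
    a = b = c = d = 0
    for drugs, has_event in reports:
        exposed = drug in drugs
        if exposed and has_event:
            a += 1
        elif exposed:
            b += 1
        elif has_event:
            c += 1
        else:
            d += 1
    if 0 in (a, b, c, d):
        return None
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(math.log((a * d) / (b * c)) - 1.96 * standard_error)


def detect_signals(reports):
    """Map each drug with an SDR to its ROR lower confidence limit."""
    all_drugs = set().union(*(drugs for drugs, _ in reports))
    limits = {drug: ror_lower_limit(reports, drug) for drug in all_drugs}
    return {drug: low for drug, low in limits.items() if low is not None and low > 1}


def backward_stepwise_removal(reports):
    """Repeatedly drop all reports mentioning the highest-ranked SDR drug.

    Returns (initial SDR drugs, drugs removed in order). Initial SDRs that
    were never removed as top-ranked vanished along the way and are
    candidates for false positives.
    """
    initial = set(detect_signals(reports))
    removed, remaining = [], list(reports)
    while True:
        signals = detect_signals(remaining)
        if not signals:
            return initial, removed
        top = max(signals, key=signals.get)
        removed.append(top)
        remaining = [r for r in remaining if top not in r[0]]


# Hypothetical spontaneous reports: (drugs mentioned, event of interest reported).
reports = (
    [(frozenset({"A", "B"}), True)] * 30 + [(frozenset({"A"}), True)] * 10
    + [(frozenset({"A", "B"}), False)] * 5 + [(frozenset({"A"}), False)] * 5
    + [(frozenset({"B"}), True)] * 5 + [(frozenset({"B"}), False)] * 20
    + [(frozenset({"C"}), True)] * 10 + [(frozenset({"C"}), False)] * 60
)
initial_sdrs, removed_drugs = backward_stepwise_removal(reports)
eliminated_sdrs = initial_sdrs - set(removed_drugs)
print(sorted(initial_sdrs), removed_drugs, sorted(eliminated_sdrs))
```

In this toy dataset, the signal for "B" disappears once the reports mentioning "A" are removed, which is the co-prescription pattern the method targets. Eliminated SDRs still need expert review, as the reported positive predictive values show.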