Last updated: 2022-07-05
16 Publications
GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing.

Nucleic Acids Res. 2022; 50 (5)
DOI: 10.1093/nar/gkac076
The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we help to fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high-coverage (30x) whole genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel can impute up to 14 360 728 SNVs/indels and 23 179 SVs, showing a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SV analysis is shown through an imputed rare Alu element located in a new locus associated with Mononeuritis of lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include SVs in genome-wide genetic studies.
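As a quick consistency check on the figures quoted above, the three variant classes should add up to the stated catalogue total. The sketch below uses only the counts given in the abstract; the variable names are illustrative and not taken from the paper.

```python
# Variant counts as quoted in the abstract.
counts = {
    "SVs": 89_178,
    "SNVs": 30_325_064,
    "indels": 5_017_199,
}
reported_total = 35_431_441

# The three classes should account for the whole catalogue.
total = sum(counts.values())
print("classes sum to reported total:", total == reported_total)

# Share of each class in the catalogue, in percent.
shares = {name: 100 * n / total for name, n in counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```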
pyFoldX: enabling biomolecular analysis and engineering along structural ensembles.

Bioinformatics. 2022;
DOI: 10.1093/bioinformatics/btac072
Recent years have seen an increase in the number of structures available, not only for new proteins but also for the same protein crystallized with different molecules and proteins. While protein design software has proven to be successful in designing and modifying proteins, it can also be overly sensitive to small conformational differences between structures of the same protein. To cope with this, we introduce here pyFoldX, a Python library that allows the integrative analysis of structures of the same protein using FoldX, an established forcefield and modeling software. The library offers new functionalities for handling different structures of the same protein, an improved molecular parametrization module, and an easy integration with the data analysis ecosystem of the Python programming language. pyFoldX relies on the FoldX software for energy calculations and modelling, which can be downloaded upon registration at http://foldxsuite.crg.eu/ and its licence is free of charge for academics. The pyFoldX library is open-source. Full details on installation, tutorials covering the library functionality, and the scripts used to generate the data and figures presented in this paper are available at https://github.com/leandroradusky/pyFoldX. Supplementary data are available at Bioinformatics online.
Evidence for shared genetic risk factors between lymphangioleiomyomatosis and pulmonary function.

ERJ Open Res. 2022; 8 (1)
DOI: 10.1183/23120541.00375-2021
Lymphangioleiomyomatosis (LAM) is a rare low-grade metastasising disease characterised by cystic lung destruction. The genetic basis of LAM remains incompletely determined, and the disease cell-of-origin is uncertain. We analysed the possibility of a shared genetic basis between LAM and cancer, and LAM and pulmonary function. The results of genome-wide association studies of LAM, 17 cancer types and spirometry measures (forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC), FEV1/FVC ratio and peak expiratory flow (PEF)) were analysed for genetic correlations, shared genetic variants and causality. Genomic and transcriptomic data were examined, and immunodetection assays were performed to evaluate pleiotropic genes. There were no significant overall genetic correlations between LAM and cancer, but LAM correlated negatively with FVC and PEF, and a trend in the same direction was observed for FEV1. 22 shared genetic variants were uncovered between LAM and pulmonary function, while seven shared variants were identified between LAM and cancer. The LAM-pulmonary function shared genetics identified four pleiotropic genes previously recognised in LAM single-cell transcriptomes: ADAM12, BNC2, NR2F2 and SP5. We had previously associated NR2F2 variants with LAM, and we identified its functional partner NR3C1 as another pleiotropic factor. NR3C1 expression was confirmed in LAM lung lesions. Another candidate pleiotropic factor, CNTN2, was found to be more abundant in the plasma of LAM patients than in that of healthy women. This study suggests the existence of a common genetic aetiology between LAM and pulmonary function.
A community challenge for a pancancer drug mechanism of action inference from perturbational profile data.

Cell Rep Med. 2022; 3 (1)
DOI: 10.1016/j.xcrm.2021.100492
The Columbia Cancer Target Discovery and Development (CTD2) Center is developing PANACEA, a resource comprising dose-responses and RNA sequencing (RNA-seq) profiles of 25 cell lines perturbed with ∼400 clinical oncology drugs, to study a tumor-specific drug mechanism of action. Here, this resource serves as the basis for a DREAM Challenge assessing the accuracy and sensitivity of computational algorithms for de novo drug polypharmacology predictions. Dose-response and perturbational profiles for 32 kinase inhibitors are provided to 21 teams who are blind to the identity of the compounds. The teams are asked to predict high-affinity binding targets of each compound among ∼1,300 targets cataloged in DrugBank. The best performing methods leverage gene expression profile similarity analysis as well as deep-learning methodologies trained on individual datasets. This study lays the foundation for future integrative analyses of pharmacogenomic data, reconciliation of polypharmacology effects in different tumor contexts, and insights into network-based assessments of drug mechanisms of action.
Identification and drug-induced reversion of molecular signatures of Alzheimer's disease onset and progression in AppNL-G-F, AppNL-F, and 3xTg-AD mouse models.

Genome Med. 2021; 13 (1)
DOI: 10.1186/s13073-021-00983-y


In spite of many years of research, our understanding of the molecular bases of Alzheimer's disease (AD) is still incomplete, and the medical treatments available mainly target the disease symptoms and are hardly effective. Indeed, the modulation of a single target (e.g., β-secretase) has proven to be insufficient to significantly alter the physiopathology of the disease, and we should therefore move from gene-centric to systemic therapeutic strategies, where AD-related changes are modulated globally.


Here we present the complete characterization of three murine models of AD at different stages of the disease (i.e., onset, progression and advanced). We combined the cognitive assessment of these mice with histological analyses and full transcriptional and protein quantification profiling of the hippocampus. Additionally, we derived specific Aβ-related molecular AD signatures and looked for drugs able to globally revert them.


We found that AD models show accelerated aging and that factors specifically associated with Aβ pathology are involved. We discovered a few proteins whose abundance increases with AD progression, while the corresponding transcript levels remain stable, and showed that at least two of them (i.e., Ifit3 and Syt11) co-localize with Aβ plaques in the brain. Finally, we found two NSAIDs (dexketoprofen and etodolac) and two anti-hypertensives (penbutolol and bendroflumethiazide) that overturn the cognitive impairment in AD mice while reducing Aβ plaques in the hippocampus and partially restoring AD signature genes to wild-type levels.


The characterization of three AD mouse models at different disease stages provides an unprecedented view of AD pathology and how this differs from physiological aging. Moreover, our computational strategy to chemically revert AD signatures has shown that NSAID and anti-hypertensive drugs may still have an opportunity as anti-AD agents, challenging previous reports.
Software Application Profile: exposomeShiny—a toolbox for exposome data analysis

Int J Epidemiol. 2021; 51 (1)


Studying the role of the exposome in human health and its impact on different omic layers requires advanced statistical methods. Many of these methods are implemented in different R and Bioconductor packages, but their use may require strong expertise in R, in writing pipelines and in using new R classes which may not be familiar to non-advanced users. ExposomeShiny provides a bridge between researchers and most of the state-of-the-art exposome analysis methodologies, without the need for advanced programming skills.


ExposomeShiny is a standalone web application implemented in R. It is available as source files and can be installed on any server or computer, avoiding problems with data confidentiality. It is executed in RStudio, which opens a browser window with the web application.

General features

The presented implementation allows the conduct of: (i) data pre-processing: normalization and missing imputation (including limit of detection); (ii) descriptive analysis; (iii) exposome principal component analysis (PCA) and hierarchical clustering; (iv) exposome-wide association studies (ExWAS) and variable selection ExWAS; (v) omic data integration by single association and multi-omic analyses; and (vi) post-exposome data analyses to gain biological insight into the exposures and genes using the Comparative Toxicogenomics Database (CTD) and pathway analysis.


The exposomeShiny source code is freely available on Github at [https://github.com/isglobal-brge/exposomeShiny], Git tag v1.4. The software is also available as a Docker image [https://hub.docker.com/r/brgelab/exposome-shiny], tag v1.4. A user guide with information about the analysis methodologies as well as information on how to use exposomeShiny is freely hosted at [https://isglobal-brge.github.io/exposome_bookdown/].
Detailed stratified GWAS analysis for severe COVID-19 in four European populations

medRxiv; 2021.
DOI: 10.1101/2021.07.21.21260624


Given the highly variable clinical phenotype of Coronavirus disease 2019 (COVID-19), a deeper analysis of the host genetic contribution to severe COVID-19 is important to improve our understanding of underlying disease mechanisms. Here, we describe an extended GWAS meta-analysis of a well-characterized cohort of 3,260 COVID-19 patients with respiratory failure and 12,483 population controls from Italy, Spain, Norway and Germany/Austria, including stratified analyses based on age, sex and disease severity, as well as targeted analyses of chromosome Y haplotypes, the human leukocyte antigen (HLA) region and the SARS-CoV-2 peptidome. By inversion imputation, we traced a reported association at 17q21.31 to a highly pleiotropic ∼0.9-Mb inversion polymorphism and characterized the potential effects of the inversion in detail. Our data, together with the 5th release of summary statistics from the COVID-19 Host Genetics Initiative, also identified a new locus at 19q13.33, including NAPSA, a gene which is expressed primarily in alveolar cells responsible for gas exchange in the lung.
Use of Electronic Health Record Patient Portal Accounts Among Patients With Smartphone-Only Internet Access.

JAMA Netw Open. 2021; 4 (7)
DOI: 10.1001/jamanetworkopen.2021.18229
Bioactivity descriptors for uncharacterized chemical compounds.

Nat Commun. 2021; 12 (1)
DOI: 10.1038/s41467-021-24150-4
Chemical descriptors encode the physicochemical and structural properties of small molecules, and they are at the core of chemoinformatics. The broad release of bioactivity data has prompted enriched representations of compounds, reaching beyond chemical structures and capturing their known biological properties. Unfortunately, bioactivity descriptors are not available for most small molecules, which limits their applicability to a few thousand well characterized compounds. Here we present a collection of deep neural networks able to infer bioactivity signatures for any compound of interest, even when little or no experimental information is available for them. Our signaturizers relate to bioactivities of 25 different types (including target profiles, cellular response and clinical outcomes) and can be used as drop-in replacements for chemical descriptors in day-to-day chemoinformatics tasks. Indeed, we illustrate how inferred bioactivity signatures are useful to navigate the chemical space in a biologically relevant manner, unveiling higher-order organization in natural product collections, and to enrich mostly uncharacterized chemical libraries for activity against the drug-orphan target Snail1. Moreover, we implement a battery of signature-activity relationship (SigAR) models and show a substantial improvement in performance, with respect to chemistry-based classifiers, across a series of biophysics and physiology activity prediction benchmarks.
The DisGeNET cytoscape app: Exploring and visualizing disease genomics data.

Comput Struct Biotechnol J. 2021; 19
DOI: 10.1016/j.csbj.2021.05.015
Thanks to the unbiased exploration of genomic variants at large scale, hundreds of thousands of disease-associated loci have been uncovered. In parallel, network-based approaches have proven to be essential to understand the molecular mechanisms underlying human diseases. The use of these approaches has been boosted by the abundance of information about disease-associated genes and variants, high-quality human interactomics data, and the emergence of new types of omics data. The DisGeNET Cytoscape App combines the capabilities of Cytoscape with those of DisGeNET, a knowledge platform based on a comprehensive catalogue of disease-associated genes and variants. The DisGeNET Cytoscape App contains functions to query, analyze, and visualize different network representations of the gene-disease and variant-disease associations available in DisGeNET. It supports a wide variety of applications through its query and filter functionalities, including the annotation of foreign networks generated by other apps or uploaded by the user. The new release of the DisGeNET Cytoscape App has been designed to support Cytoscape 3.x and incorporates novel distinctive features such as visualization and analysis of variant-disease networks, disease enrichment analysis for genes and variants, and analytic support through Cytoscape Automation. Moreover, the DisGeNET Cytoscape App features an API to access its core functionalities via the REST protocol, fostering the development of reproducible and scalable analysis workflows based on DisGeNET data.
Orchestrating privacy-protected big data analyses of data from different resources with R and DataSHIELD.

PLoS Comput Biol. 2021; 17 (3)
DOI: 10.1371/journal.pcbi.1008880
Combined analysis of multiple, large datasets is a common objective in the health- and biosciences. Existing methods tend to require researchers to physically bring data together in one place or follow an analysis plan and share results. Developed over the last 10 years, the DataSHIELD platform is a collection of R packages that reduce the challenges of these methods. These include ethico-legal constraints which limit researchers' ability to physically bring data together and the analytical inflexibility associated with conventional approaches to sharing results. The key feature of DataSHIELD is that data from research studies stay on a server at each of the institutions that are responsible for the data. Each institution has control over who can access their data. The platform allows an analyst to pass commands to each server and the analyst receives results that do not disclose the individual-level data of any study participants. DataSHIELD uses Opal which is a data integration system used by epidemiological studies and developed by the OBiBa open source project in the domain of bioinformatics. However, until now the analysis of big data with DataSHIELD has been limited by the storage formats available in Opal and the analysis capabilities available in the DataSHIELD R packages. We present a new architecture ("resources") for DataSHIELD and Opal to allow large, complex datasets to be used at their original location, in their original format and with external computing facilities. We provide some real big data analysis examples in genomics and geospatial projects. For genomic data analyses, we also illustrate how to extend the resources concept to address specific big data infrastructures such as GA4GH or EGA, and make use of shell commands. Our new infrastructure will help researchers to perform data analyses in a privacy-protected way from existing data sharing initiatives or projects. 
To help researchers use this framework, we describe selected packages and present an online book (https://isglobal-brge.github.io/resource_bookdown).
Association Between Hormone-Modulating Breast Cancer Therapies and Incidence of Neurodegenerative Outcomes for Women With Breast Cancer.

JAMA Netw Open. 2020; 3 (3)
DOI: 10.1001/jamanetworkopen.2020.1541
Importance: The association between exposure to hormone-modulating therapy (HMT) as breast cancer treatment and neurodegenerative disease (NDD) is unclear.
Objective: To determine whether HMT exposure is associated with the risk of NDD in women with breast cancer.
Design, Setting, and Participants: This retrospective cohort study used the Humana claims data set from January 1, 2007, to March 31, 2017. The Humana data set contains claims from private-payer and Medicare insurance data sets from across the United States with a population primarily residing in the Southeast. Patient claims records were surveyed for a diagnosis of NDD starting 1 year after breast cancer diagnosis for the duration of enrollment in the claims database. Participants were 57 843 women aged 45 years or older with a diagnosis of breast cancer. Patients were required to be actively enrolled in Humana claims records for 6 months prior to and at least 3 years after the diagnosis of breast cancer. The analyses were conducted between January 1 and 15, 2020.
Exposure: Hormone-modulating therapy (selective estrogen receptor modulators, estrogen receptor antagonists, and aromatase inhibitors).
Main Outcomes and Measures: Patients receiving HMT for breast cancer treatment were identified. Survival analysis was used to determine the association between HMT exposure and diagnosis of NDD. A propensity score approach was used to minimize measured and unmeasured selection bias.
Results: Of the 326 485 women with breast cancer in the Humana data set between 2007 and 2017, 57 843 met the study criteria. Of these, 18 126 (31.3%; mean [SD] age, 76.2 [7.0] years) received HMT, whereas 39 717 (68.7%; mean [SD] age, 76.8 [7.0] years) did not receive HMT. Mean (SD) follow-up was 5.5 (1.8) years. In the propensity score-matched population, exposure to HMT was associated with a decrease in the number of women who received a diagnosis of NDD (2229 of 17 878 [12.5%] vs 2559 of 17 878 [14.3%]; relative risk, 0.89; 95% CI, 0.84-0.93; P < .001), Alzheimer disease (877 of 17 878 [4.9%] vs 1068 of 17 878 [6.0%]; relative risk, 0.82; 95% CI, 0.75-0.90; P < .001), and dementia (1862 of 17 878 [10.4%] vs 2116 of 17 878 [11.8%]; relative risk, 0.88; 95% CI, 0.83-0.93; P < .001). The number needed to treat was 62.51 for all NDDs, 93.61 for Alzheimer disease, and 69.56 for dementia.
Conclusions and Relevance: Among patients with breast cancer, tamoxifen and steroidal aromatase inhibitors were associated with a decrease in the number who received a diagnosis of NDD, specifically Alzheimer disease and dementia.
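For readers unfamiliar with the two effect measures quoted above, the following minimal sketch shows how a crude relative risk and a number needed to treat follow from two-group event counts. It uses the Alzheimer disease counts from the matched population; the function names are illustrative, and the published estimates may have been derived from a fitted model, so a crude calculation like this need not reproduce every reported figure exactly.

```python
def relative_risk(events_exposed, n_exposed, events_control, n_control):
    """Crude relative risk: risk in the exposed group over risk in the control group."""
    return (events_exposed / n_exposed) / (events_control / n_control)


def number_needed_to_treat(events_exposed, n_exposed, events_control, n_control):
    """Reciprocal of the absolute risk reduction (control risk minus exposed risk)."""
    absolute_risk_reduction = events_control / n_control - events_exposed / n_exposed
    return 1 / absolute_risk_reduction


# Alzheimer disease counts from the propensity score-matched population.
rr = relative_risk(877, 17_878, 1068, 17_878)
nnt = number_needed_to_treat(877, 17_878, 1068, 17_878)
print(f"RR = {rr:.2f}, NNT = {nnt:.1f}")
```

For these counts the crude values come out at roughly 0.82 and 93.6, in line with the figures reported for Alzheimer disease.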
ESICM LIVES 2019 : Berlin, Germany. 28 September - 2 October 2019.

Intensive Care Med Exp. 2019; 7 (Suppl 3)
DOI: 10.1186/s40635-019-0265-y
Treatment strategies and clinical outcomes of locally advanced pancreatic cancer patients treated at high-volume facilities and academic centers.

Adv Radiat Oncol. 2019; 4 (2)
DOI: 10.1016/j.adro.2018.10.006


Locally advanced pancreatic cancer (LAPC) treatment has varying practice patterns with poor outcomes. We investigated treatment using single-agent chemotherapy and multiagent chemotherapy (MAC) with or without radiation therapy (RT) at high-volume facilities (HVFs) and academic centers (ACs).

Methods and materials

The National Cancer Database was used to obtain data on 10,139 patients with LAPC. HVF was defined as the top 5% of facilities per number of patients treated at each facility. Univariate and multivariable (MVA) analysis Cox regressions were performed to identify the impact of HVF, AC, MAC, and RT on overall survival (OS).
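The high-volume facility definition above (top 5% of facilities by number of patients treated) can be sketched as follows. This is only an illustration: the facility IDs and volumes are hypothetical, and the handling of rounding and ties is an assumption, since the abstract does not specify either.

```python
import math


def high_volume_facilities(patients_per_facility, top_fraction=0.05):
    """Return the facility IDs in the top `top_fraction` by patient count.

    Assumptions (not from the source): the cutoff is rounded up to a whole
    number of facilities, and ties are broken by facility ID.
    """
    n_top = math.ceil(top_fraction * len(patients_per_facility))
    # Sort by descending volume, then by facility ID for a deterministic order.
    ranked = sorted(patients_per_facility.items(), key=lambda kv: (-kv[1], kv[0]))
    return {facility for facility, _ in ranked[:n_top]}


# Hypothetical volumes for 40 facilities: "F00" treats 1 patient, "F39" treats 40.
volumes = {f"F{i:02d}": i + 1 for i in range(40)}
hvf = high_volume_facilities(volumes)
print(sorted(hvf))
```

With 40 facilities, a 5% cutoff selects the two facilities with the largest patient counts; every other facility would be treated as low-volume in a comparison like the one reported here.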


The median age of patients was 66 years (range, 22-90); 50.1% were male and 49.9% female. Of the patients, 46.1% received MAC, 53.8% received single-agent chemotherapy, 45.7% received RT, 54.3% did not receive RT, and 5% underwent surgical resection. The median follow-up was 48.8 months. On MVA, treatment at HVFs and ACs remained significantly associated with improved OS, with a hazard ratio (HR) of 0.84 (P < .001) and 0.94 (P = .004), respectively. The median OS for HVF treatment compared with low-volume facilities was 14.3 versus 11.2 months, respectively (P < .001). The median OS for AC treatment versus non-AC was 12.1 versus 10.8 months, respectively (P < .001). Additionally, on MVA, receipt of RT and MAC remained significantly associated with improved OS (HR: 0.76; P < .001; and HR: 0.73; P < .001, respectively). MVA for receipt of surgery showed that MAC is a significant predictor for receiving surgery (odds ratio: 1.29; P = .009).


Our results build on a growing literature supporting RT and MAC in treating LAPC. Additionally, we believe that, in the absence of prospective data, this makes a strong case for considering MAC with RT at ACs and HVFs for treating LAPC.
Improvement in Mortality and End-Stage Renal Disease in Patients With Type 2 Diabetes After Acute Kidney Injury Who Are Prescribed Dipeptidyl Peptidase-4 Inhibitors.

Mayo Clin Proc. 2018; 93 (12)
DOI: 10.1016/j.mayocp.2018.06.023


To examine the potential benefit of the pleiotropic effects of dipeptidyl peptidase-4 inhibitors (DPP4is) in attenuating progression of diabetic kidney disease and reducing the long-term effect of the acute kidney injury (AKI) to chronic kidney disease (CKD) transition.

Patients and methods

Data from the National Health Insurance Research Database from January 1, 1999, to July 31, 2011, were analyzed, and patients with diabetes weaning from dialysis-requiring AKI were identified. Cox proportional hazards models and inverse-weighted estimates of the probability of treatment were used to adjust for treatment selection bias. The outcomes were incident end-stage renal disease (ESRD) and mortality, major adverse cardiovascular events, and hospitalized heart failure.


Of a total of 6165 patients with diabetes weaning from dialysis-requiring AKI identified, 5635 (91.4%) patients were DPP4i nonusers and 530 (8.6%) patients were DPP4i users. Compared with DPP4i nonusers, DPP4i users had a lower risk of ESRD (hazard ratio, 0.81; 95% CI, 0.70-0.94; P=.04) and all-cause mortality (hazard ratio, 0.28; 95% CI, 0.23-0.34; P<.001) after adjustments for CKD, advanced CKD, and angiotensin-converting enzyme inhibitor or angiotensin II receptor blocker use. In contrast, the risk of major adverse cardiovascular events and hospitalized heart failure did not differ significantly between groups.


Dipeptidyl peptidase-4 inhibitor users had a lower risk of ESRD and mortality than did nonusers among patients with diabetes after weaning from dialysis-requiring AKI. Therefore, a prospective study of AKI to CKD transitions after episodes of AKI is needed to optimally target DPP4i interventions.
Chest Radiograph Findings in Childhood Pneumonia Cases From the Multisite PERCH Study.

Clin Infect Dis. 2017; 64 (suppl_3)
DOI: 10.1093/cid/cix089


Chest radiographs (CXRs) are frequently used to assess pneumonia cases. Variations in CXR appearances between epidemiological settings and their correlation with clinical signs are not well documented.


The Pneumonia Etiology Research for Child Health project enrolled 4232 cases of hospitalized World Health Organization (WHO)-defined severe and very severe pneumonia from 9 sites in 7 countries (Bangladesh, the Gambia, Kenya, Mali, South Africa, Thailand, and Zambia). At admission, each case underwent a standardized assessment of clinical signs and pneumonia risk factors by trained health personnel, and a CXR was taken that was interpreted using the standardized WHO methodology. CXRs were categorized as abnormal (consolidation and/or other infiltrate), normal, or uninterpretable.


CXRs were interpretable in 3587 (85%) cases, of which 1935 (54%) were abnormal (site range, 35%-64%). Cases with abnormal CXRs were more likely than those with normal CXRs to have hypoxemia (45% vs 26%), crackles (69% vs 62%), tachypnea (85% vs 80%), or fever (20% vs 16%) and less likely to have wheeze (30% vs 38%; all P < .05). CXR consolidation was associated with a higher case fatality ratio at 30-day follow-up (13.5%) compared to other infiltrate (4.7%) or normal (4.9%) CXRs.


Clinically diagnosed pneumonia cases with abnormal CXRs were more likely to have signs typically associated with pneumonia. However, CXR-normal cases were common, and clinical signs considered indicative of pneumonia were present in substantial proportions of these cases. CXR-consolidation cases represent a group with an increased likelihood of death at 30 days post-discharge.