Journal: Scientific Data

Abbreviation

Sci Data

Publisher

Nature

ISSN

2052-4463

Search Results

Publications 1 - 10 of 111
  • Rosenstock, Todd S.; Steward, Peter; Joshi, Namita; et al. (2024)
    Scientific Data
    Information on the effects of changing agricultural management on crop and livestock performance is critical for developing evidence-based policies, investments, and programs. Evidence for Resilient Agriculture (ERA) v1.0.1 presents a dataset that harmonizes and aggregates 112,859 observations from 2,011 agricultural studies conducted in Africa between 1934 and 2018. The dataset includes information on the effect of 364 combinations of management practices and technologies on 87 environmental, social, and economic indicators of outcomes. Observations are geolocated and temporally tagged and thus can be linked to other datasets such as historical weather, soil properties, and road networks. ERA offers a new resource for understanding the impacts of changing farming practices under diverse environmental contexts, providing data to support strategic interventions aimed at enhancing the productivity, resilience, and sustainability of African agriculture.
  • González, Asier; Pandey, Muskan; Schlusser, Niels; et al. (2025)
    Scientific Data
    The limited correlation between mRNA and protein levels within cells highlighted the need to study mechanisms of translational control. To decipher the factors that determine the rates of individual steps in mRNA translation, machine learning approaches are currently applied to large libraries of synthetic constructs, whose properties are generally different from those of endogenous mRNAs. To fill this gap and thus enable the discovery of elements driving the translation of individual endogenous mRNAs, we here report steady-state and dynamic multi-omics data from human liver cancer cell lines, specifically (i) ribosome profiling data from unperturbed cells as well as following the block of translation initiation (ribosome run-off, to trace translation elongation), (ii) protein synthesis rates estimated by pulsed stable isotope labeled amino acids in cell culture (pSILAC), and (iii) mean ribosome load on individual mRNAs determined by mRNA sequencing of polysome fractions (polysome profiling). These data will enable improved predictions of mRNA sequence-dependent protein output, which is crucial for engineering protein expression and for the design of mRNA vaccines.
  • Gilbert, Marius; Nicolas, Gaëlle; Cinardi, Giusepina; et al. (2018)
    Scientific Data
    Global data sets on the geographic distribution of livestock are essential for diverse applications in agricultural socio-economics, food security, environmental impact assessment and epidemiology. We present a new version of the Gridded Livestock of the World (GLW 3) database, reflecting the most recently compiled and harmonized subnational livestock distribution data for 2010. GLW 3 provides global population densities of cattle, buffaloes, horses, sheep, goats, pigs, chickens and ducks in each land pixel at a spatial resolution of 0.083333 decimal degrees (approximately 10 km at the equator). They are accompanied by detailed metadata on the year, spatial resolution and source of the input census data. Two versions of each species distribution are produced. In the first version, livestock numbers are disaggregated within census polygons according to weights established by statistical models using high resolution spatial covariates (dasymetric weighting). In the second version, animal numbers are distributed homogeneously with equal densities within their census polygons (areal weighting) to provide spatial data layers free of any assumptions linking them to other spatial variables.
  • Oostwegel, Laurens J.N.; Schorlemmer, Danijel; Guéguen, Philippe (2025)
    Scientific Data
    Buildings play a critical role in understanding settlement patterns and are essential for crisis management, urban planning, energy efficiency, and multi-hazard risk assessment. To address the need for accessible global building data, we introduce a dataset containing 2.7 billion building footprints classified using the building taxonomy of the Global Earthquake Model. By conflating the AI-derived Google Open Buildings and the Microsoft Global ML Building Footprints datasets, and the crowd-sourced OpenStreetMap, we created the most detailed and extensive building dataset to date. This conflation helps balance out the completeness bias in OpenStreetMap, in which mapping is most complete in countries with a high Human Development Index. We validated occupancy types and building height estimation using Kullback-Leibler divergence across specific cities, and through cadaster data from Slovenia and Greece, revealing that, while some misclassifications occur due to definitional differences or data limitations, the dataset overall provides reliable and valuable building information at global scale. Example uses of the data include identifying the vulnerability of buildings to natural hazards and modeling population distributions.
  • Cramer, Estee Y.; Huang, Yuxin; Wang, Yijin; et al. (2022)
    Scientific Data
    Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
  • Halbritter, Aud H.; Vandvik, Vigdis; Bison, Nicole N.; et al. (2025)
    Scientific Data
    The Afromontane region harbors ancient grasslands with high levels of endemism, now under threat from land-use change, biological invasions and encroachment, and climate warming. As part of an international Plant Functional Traits Course we collected comprehensive trait data in five sites along an elevation gradient from 2,000–2,800 m a.s.l. and in a climate warming experiment at 3,064 m a.s.l. in the Maloti-Drakensberg, South Africa. We sampled 24,405 aboveground and 94 root trait measurements from 171 vascular plant taxa paired with 11 other datasets reflecting vegetation and structure, leaf and ecosystem carbon and water fluxes, leaf hyperspectral reflectance, and microclimatic and environmental data. Our data provide the first recorded trait data for 47 vascular plant species and more than double the trait data coverage from the Maloti-Drakensberg (106% increase). This study offers insights into plant and ecosystem functioning, provides a baseline for assessing impacts of environmental change, builds local competence, and aligns with similar data from China, Svalbard, Peru, and Norway.
  • Külling, Nathan; Adde, Antoine; Fopp, Fabian; et al. (2024)
    Scientific Data
    Standard and easily accessible cross-thematic spatial databases are key resources in ecological research. In Switzerland, as in many other countries, available data are scattered across computer servers of research institutions and are rarely provided in standard formats (e.g., different extents or projection systems, inconsistent naming conventions). Consequently, their joint use can require heavy data management and geomatic operations. Here, we introduce SWECO25, a Swiss-wide raster database at 25-meter resolution gathering 5,265 layers. The 10 environmental categories included in SWECO25 are: geologic, topographic, bioclimatic, hydrologic, edaphic, land use and cover, population, transportation, vegetation, and remote sensing. SWECO25 layers were standardized to a common grid sharing the same resolution, extent, and geographic coordinate system. SWECO25 includes the standardized source data and newly calculated layers, such as those obtained by computing focal or distance statistics. SWECO25 layers were validated by a data integrity check, and we verified that the standardization procedure had a negligible effect on the output values. SWECO25 is available on Zenodo and is intended to be updated and extended regularly.
  • Chaabane, Sonia; de Garidel-Thoron, Thibault; Giraud, Xavier; et al. (2023)
    Scientific Data
    Planktonic Foraminifera are unique paleo-environmental indicators through their excellent fossil record in ocean sediments. Their distribution and diversity are affected by different environmental factors including anthropogenically forced ocean and climate change. Until now, historical changes in their distribution have not been fully assessed at the global scale. Here we present the FORCIS (Foraminifera Response to Climatic Stress) database on foraminiferal species diversity and distribution in the global ocean from 1910 until 2018 including published and unpublished data. The FORCIS database includes data collected using plankton tows, continuous plankton recorder, sediment traps and plankton pump, and contains ~22,000, ~157,000, ~9,000, and ~400 subsamples, respectively (one single plankton aliquot collected within a depth range, time interval, size fraction range, at a single location) from each category. Our database provides a perspective of the distribution patterns of planktonic Foraminifera in the global ocean on large spatial (regional to basin scale, and at the vertical scale), and temporal (seasonal to interdecadal) scales over the past century.
  • Saetta, Gianluca; Cognolato, Matteo; Atzori, Manfredo; et al. (2020)
    Scientific Data
    Despite recent advances in prosthetics, many upper limb amputees still use prostheses with some reluctance. They often do not feel able to incorporate the artificial hand into their bodily self. Furthermore, prosthesis fitting is not usually tailored to accommodate the characteristics of an individual’s phantom limb sensations. These are experienced by almost all persons with an acquired amputation and comprise the motor and postural properties of the lost limb. This article presents and validates a multimodal dataset including an extensive qualitative and quantitative assessment of phantom limb sensations in 15 transradial amputees, surface electromyography and accelerometry data of the forearm, and measurements of gaze behavior during exercises requiring pointing or repositioning of the forearm and the phantom hand. The data also include acquisitions from 29 able-bodied participants, matched for gender and age. Special emphasis was given to tracking the visuo-motor coupling between eye-hand/eye-phantom during these exercises.
  • Gu, Tianyu; Bijelić, Nenad; Katircioglu, Isinsu; et al. (2025)
    Scientific Data
    This paper presents STEEL-3dPointClouds, a dataset of deformed steel beam-columns obtained using high-fidelity physics-based numerical simulations. These simulations trace the inelastic deformations of hot-rolled wide-flange steel beam-columns under different loading protocols covering a range of responses, starting with no strength loss and up to at least 60% loss of load-bearing capacity for each considered steel member. Each of the ~323k samples is a unique point extracted from the hysteretic response of the loaded member and consists of the deformed shape (represented as a 3D point cloud) along with the corresponding reserve capacity and stress/strain fields. To exemplify the use of this dataset, machine learning models are implemented to quantify the reserve capacity of deformed steel members solely using point clouds as inputs and to estimate their key deformation characteristics based on geometric properties. Furthermore, the dataset is used to extract deformations at critical response stages to characterize the geometric tolerances for potential member reuse. The dataset is shared to facilitate development of automated inspection methodologies and benchmarking of computer vision tools.