Select data resources

Once the assessment team has selected indicators and metrics, the next step is to evaluate available data resources and address any data gaps. Collecting data to measure metrics is often one of the most time-intensive components of a landscape assessment. To streamline this process, the LandScale platform offers a structured workflow and various resources to support users. By documenting all data resources and assessing their limitations on the platform, users can facilitate the assessment and validation processes while preserving valuable information for reassessments.

Throughout this section, the following terms are frequently used:

Data resource: This refers to a wide range of materials that support the measurement of a metric result, including general data, specific datasets, data tools, frameworks, guidelines, and other relevant resources.
Data: This encompasses both general data and specific datasets, as well as the individual data values recorded for observations within these datasets.
Dataset: This specifically refers to a single data file or a linked set of files (e.g., GIS shapefiles). For example, a 'land cover' dataset might consist of one or more files that collectively provide detailed information on land cover types within a landscape.

Track and manage data

The Data Manager on LandScale ensures consistent tracking and handling of data resources across assessments. It serves as a centralized workspace for the assessment team to manage all datasets used and highlights areas that require attention—helping to streamline the assessment process. The Data Manager can be accessed from the assessment dashboard at any time.

Identify and procure data resources

The process of identifying and procuring data should consider and draw from multiple types of data and data sources. For each metric, at least one dataset must be selected. Once a dataset is chosen, a limitations analysis must be completed for that specific metric. This step is essential even when the same dataset is used across multiple metrics, as the limitations may vary depending on what each metric measures and how well the data aligns with the intended purpose of that metric.
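
The rule above (at least one dataset per metric, and one limitations analysis per metric-dataset pair) can be sketched as a small bookkeeping check. This is an illustration only and does not reflect the LandScale platform's actual data model; `MetricRecord`, `find_gaps`, and the metric and dataset names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetricRecord:
    """Hypothetical record of the datasets chosen for one metric."""
    name: str
    datasets: list = field(default_factory=list)
    # Datasets whose limitations analysis has been completed for THIS metric.
    # The analysis is tracked per metric, not once per dataset.
    limitations_done: set = field(default_factory=set)


def find_gaps(records):
    """Return metrics with no dataset, and (metric, dataset) pairs lacking an analysis."""
    no_dataset = [r.name for r in records if not r.datasets]
    missing_analysis = [
        (r.name, d)
        for r in records
        for d in r.datasets
        if d not in r.limitations_done
    ]
    return no_dataset, missing_analysis


records = [
    MetricRecord("tree cover loss", ["land_cover_2020"], {"land_cover_2020"}),
    # Same dataset reused, but its limitations were not yet analyzed for this metric.
    MetricRecord("ecosystem extent", ["land_cover_2020"]),
    MetricRecord("household income"),
]
no_dataset, missing_analysis = find_gaps(records)
print("Metrics without a dataset:", no_dataset)
print("Missing limitations analyses:", missing_analysis)
```

Keeping the completed analyses inside each metric's record, rather than on the dataset, mirrors the point that a reused dataset still needs a separate analysis for every metric it supports.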

To support this process, LandScale provides a curated Data Resources Library as part of its platform. This library provides a directory of diverse data resources applicable to assessments worldwide, making them accessible to all LandScale users. When available, data resources tagged as potentially relevant to a specific metric and geography will automatically appear on the platform as suggested options for assessment teams to select.

Because new data resources continually become available, the Data Resources Library should not be considered exhaustive. Assessment teams are encouraged to prioritize localized data before relying on global data, as localized data are often superior in terms of thematic detail (e.g., level of ecosystem classification), spatial resolution (e.g., smaller raster cell size), currentness, and time-series frequency. The Data Resources Library also includes tools and methods to generate LandScale-relevant data from other sources.

Assessment teams may source their own data and link them directly to the platform, associating them with the corresponding metric(s). Once linked, these datasets are stored in the team's Data Manager on the platform for easy access and management. Assessment teams can also nominate data resources they discover for inclusion in the Data Resources Library for broader use. The LandScale team reviews these nominations and approves those that meet the platform's standards and can benefit other assessment teams.

The assessment team is strongly encouraged to engage stakeholders and local experts in identifying suitable data for the selected metrics. Stakeholder input can save time, provide insights into existing datasets relevant to the landscape, and help identify limitations, gaps, or quality issues in data coverage. Stakeholder engagement regarding data can take place at any time up to and including Step C.

Recommended stakeholder input

Engaging with landscape stakeholders can significantly enhance the data collection process for a LandScale assessment. Stakeholders often possess valuable insights or direct access to relevant datasets. To maximize efficiency and data quality, assessment teams are advised to:

Identify and consult stakeholders early: Begin consultations as early as possible to accommodate the lead time often needed to procure existing datasets. Data-related consultations may be integrated into earlier stakeholder engagement activities or conducted as standalone bilateral or group discussions.
Document findings on the platform: Data identified through stakeholder engagement can be logged on the LandScale platform at any stage. This ensures early integration and efficient tracking of potential data resources.

Stakeholder groups who may be especially relevant to consult about data include:

Data developers and data users: Entities such as government agencies, research institutions, and NGOs frequently utilize datasets similar to those required for LandScale assessments. These organizations may have already conducted thorough data searches and evaluations to support planning and assessment needs. For example, a municipal land-use planning agency may have curated data for land use/land cover, ecological and hydrological features, and social factors.
Companies operating in the landscape: Companies involved in producing or sourcing commodities often collect extensive data on production systems and producers. While such data are typically proprietary, these companies can provide valuable insights into specific metrics or indicators.

By engaging stakeholders early, assessment teams can broaden access to critical datasets, foster collaboration, strengthen assessment credibility, and better align data collection processes with the landscape's unique context and needs.

Types of data

LandScale's holistic approach to assessments necessitates the use of a variety of data, including maps (spatial data), tabular data, and qualitative data. These datasets may vary by type and characteristics, often combining multiple features. For instance, a dataset on tree cover loss might be secondary, quantitative, modeled, and geospatial, while data on household income could be primary, quantitative, surveyed, and non-geospatial (if geographic locations were not recorded alongside each observation).

Below are key data types and their defining characteristics:

Secondary vs primary data

Secondary data are any data that already exist at the time of the assessment and are collected by others outside the LandScale assessment process.

Primary data are those that are newly collected within a given landscape for the purpose (in part or in whole) of conducting a LandScale assessment.

Quantitative vs qualitative data

Quantitative data are numerical in form and can be used for statistical analyses to measure observable states, changes, trends, and comparisons.

Qualitative data, such as notes from an interview or a case study, are non-numerical and particularly useful for exploring multi-faceted phenomena that cannot be reduced to specific quantities. These include human perceptions, motivations, interpersonal interactions, or characteristics and functioning of human institutions and groups.

Measured vs modeled data

Measured data are derived from direct measurement, such as water quality readings from a river sample.

Modeled data are created through inference by using input data, theoretical frameworks, modeling assumptions, and statistical methods to estimate findings about a variable of interest. For example, weather forecasts are produced by integrating measured data, such as historical weather records, with statistical and computational models to predict future conditions.

Survey data

Survey data are collected through standardized surveys or polls, often focused on socio-economic or demographic factors. Surveys can be implemented through enumerators (i.e., interviewers), paper questionnaires, or online forms. Surveys may target entire populations of interest (e.g., census data) or sampled subsets, using statistics to infer findings about the whole population.

Self-reported data

Self-reported data rely on participants to report data about themselves, without independent measurement or validation. The lack of independent validation can introduce errors or bias, especially if respondents have a strategic reason to provide false information (e.g., under-reporting rights violations because of fear of reprisal) or if questions are posed in a subjective way that leads to different understandings of the phenomenon in question (e.g., asking respondents to characterize their own health status without more specific questions or guidance).

Geospatial data

Geospatial data consist of attributes that describe data values and their corresponding locations. Geospatial data are generally categorized into two forms:

Raster data: Composed of pixels, each associated with a specific geographic location and corresponding data values for a given theme, such as land cover type or percentage of tree cover.
Vector data: Represented as points, lines, or polygons with specific geographic locations. A common example is political boundaries.

A geospatial dataset is often referred to as a 'data layer'; the two terms are used interchangeably. Any of the previously described data types may be geospatial.

Geospatial data can be visualized as map layers and analyzed individually or in combination with other geospatial layers—such as those representing land use, population characteristics, infrastructure, or others—using geographic information systems (GIS). For example, an assessment team can determine the proportion of ecosystem types within established protected areas by overlaying an ecosystem type data layer with a protected areas boundary layer in GIS.
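
As a rough illustration of the overlay described above, the sketch below computes the share of each ecosystem type that falls inside protected areas. It assumes both layers have already been rasterized onto the same grid with equal-area cells; a real assessment would normally do this in GIS software. The grids, the type names, and the function `protected_share_by_type` are hypothetical.

```python
from collections import Counter

# Hypothetical 4x4 ecosystem raster: each cell holds an ecosystem type.
ecosystem = [
    ["forest", "forest", "wetland", "wetland"],
    ["forest", "forest", "wetland", "grass"],
    ["forest", "grass", "grass", "grass"],
    ["grass", "grass", "grass", "grass"],
]
# Hypothetical protected-area mask on the same grid (True = inside a protected area).
protected = [
    [True, True, True, False],
    [True, True, False, False],
    [False, False, False, False],
    [False, False, False, False],
]


def protected_share_by_type(ecosystem, protected):
    """Return, for each ecosystem type, the fraction of its cells that are protected."""
    total = Counter()
    inside = Counter()
    for eco_row, mask_row in zip(ecosystem, protected):
        for eco, is_protected in zip(eco_row, mask_row):
            total[eco] += 1
            if is_protected:
                inside[eco] += 1
    return {eco: inside[eco] / total[eco] for eco in total}


shares = protected_share_by_type(ecosystem, protected)
for eco, share in sorted(shares.items()):
    print(f"{eco}: {share:.0%} of cells protected")
```

Counting cells is only a valid proxy for area when every cell covers the same ground area, which is why the equal-area assumption matters here.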

Time-series data

Time-series data are collected consistently at two or more successive points or periods in time for the same phenomenon. These data are essential for metrics that involve calculating changes from a prior baseline year to the most recent value or for averaging values across multiple time periods.
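
The two calculations mentioned above, change from a baseline year and averaging across periods, can be sketched as follows. The yearly values, the baseline year, and the function name `change_from_baseline` are hypothetical and are not taken from the Performance Metrics Descriptions Table.

```python
from statistics import mean

# Hypothetical yearly values for one metric (e.g., hectares of natural ecosystem).
series = {2015: 1200.0, 2018: 1150.0, 2021: 1092.5}


def change_from_baseline(series, baseline_year):
    """Return the latest year plus absolute and percent change since the baseline year."""
    latest_year = max(series)
    baseline = series[baseline_year]
    latest = series[latest_year]
    absolute = latest - baseline
    percent = 100.0 * absolute / baseline
    return latest_year, absolute, percent


latest_year, absolute, percent = change_from_baseline(series, baseline_year=2018)
period_average = mean(series.values())

print(f"Change 2018 to {latest_year}: {absolute:+.1f} ({percent:+.1f}%)")
print(f"Average across all periods: {period_average:.1f}")
```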

For metrics requiring a prior baseline value, specific guidance on selecting the baseline year is provided in the Performance Metrics Descriptions Table. When prior measures are not mandatory for a given metric, assessment teams may still incorporate data from earlier periods to enable trend analysis, as long as the conditions for those prior periods can be reliably determined or estimated.

Sources of data

The assessment team is encouraged to explore a wide range of data sources, including those identified by LandScale in the Data Resources Library, those known to the assessment team, and those recommended through stakeholder consultations and expert advice.

Below is an overview of typical sources for secondary datasets that can support a LandScale assessment:

Government agencies

National statistics offices, ministries, and departments of agriculture, forestry, environment, planning, etc.
National or subnational data atlases often contain key datasets that may cover many of the data needs for a LandScale assessment.

Geospatial data platforms

NASA, Google Earth Engine, European Space Agency, Esri Living Atlas, Trends.Earth, Global Forest Watch, and MapBiomas all host geospatial datasets covering land and water features, land use, and land cover.
Emerging sources of geospatial data are continuously identified and added to the LandScale Data Resources Library.

Intergovernmental and international organizations

Organizations such as the World Bank (Open Data, MicroData Catalogue), OECD, and UN provide extensive data resources.

Research institutes, universities, and NGOs/CSOs

Organizations such as the World Resources Institute (WRI) and the Center for International Forestry Research (CIFOR) offer valuable data resources.

Commercial and private data providers

Third-party sources such as IBAT or other paid subscription portals may offer relevant and specialized data resources.

Media and qualitative sources

Local, national, or international newspapers and other media can provide qualitative data. Examples include United Nations News on Human Rights, Human Rights Watch country reports, the Global Forest Watch blog, and Resource Watch stories.

Data deficiency

If, after reviewing the available data resources, the assessment team finds none suitable for measuring a metric, the metric may be marked as 'data deficient'. Metrics marked as data deficient do not require a result input for the current assessment. Indicators with data-deficient essential metrics will not be marked as complete at the end of the assessment, but they will not require further action. To support future assessments, the assessment team should consider how this data gap might be addressed, such as through targeted studies, additional research, or alternative data collection methods.

Data ownership, use, and privacy

Assessment teams are encouraged to prioritize publicly available data whenever they meet the needs of the LandScale assessment. Public data are typically the easiest to access, are usually well documented and vetted, and allow users of the assessment results to inspect the source data directly.

If publicly available secondary data are unavailable or do not fully meet the data requirements for a metric, assessment teams may consider obtaining data from private sources or conducting primary data collection. For private data owned by another party, it is important to establish a data-sharing agreement with the data provider that outlines how the data can be used and represented in the assessment results.

Where there are no suitable secondary datasets for a given metric, assessment teams should explore options for generating primary data, such as:

Short-term primary data collection: Data generated within the assessment's timeframe and scope that can be used in the current assessment.
Long-term primary data collection: When data collection extends beyond the assessment period, assessment teams or other landscape stakeholders may initiate data-generation processes or monitoring systems. Although these data may not be ready for the current assessment, they can inform future assessments in the same landscape.

For additional details on primary data, refer to the 'Seek supplemental data' section.

When collecting primary data, the assessment team must:

Comply with laws: Ensure all applicable laws and regulations are adhered to, especially those concerning data collection and the maintenance of privacy, with particular attention to data collected from human subjects.
Protect security and privacy: Safeguard sensitive and confidential information in their data management systems. Note that while the LandScale platform allows limited uploads of supporting information, its security cannot be guaranteed. Therefore, documents that cannot be publicly shared must not be uploaded to the platform.

Choosing among competing datasets

The LandScale platform requires the selection of one dataset (or a group of non-overlapping datasets) to measure each metric. However, the assessment team's search may turn up multiple datasets that could be used to measure a given metric.

In such cases, the usual course of action is to select the best dataset for the metric based on the data limitations analysis criteria and other relevant factors. Less commonly, the assessment team may decide to combine multiple datasets to create a composite dataset that provides a more comprehensive metric result. This approach should be undertaken with caution, as methodological differences between datasets can introduce inconsistencies.

Selecting the best dataset(s) often requires balancing multiple factors. For example, the assessment team may need to choose between an older ecosystem map with a finer classification scheme (e.g., more ecosystem types) and higher spatial resolution, or a more current global map with coarser resolution and fewer ecosystem types. These decisions may involve judgment calls based on the specific needs of the assessment, the characteristics of the landscape (e.g., how quickly conditions are changing), and the complementarity or gaps among the datasets being used.

Example: Combining multiple datasets to measure a metric

Several early users of LandScale have successfully combined information from multiple datasets to create composite datasets that best meet the requirements of a given metric.

For example, in a LandScale assessment in Peru led by Rainforest Alliance, the assessment team sought to measure the flow rate of key water sources (volume per unit time) as a metric for assessing the water quantity indicator. However, the relevant data were spread across institutions. The assessment team reviewed numerous technical reports on rivers in the landscape, published by different scientists and organizations. To generate a representative value for the water flow metric, the assessment team compiled data from six different rivers in the landscape and calculated the average flow rate for the baseline year of 2018.

Similarly, to assess water flow, an assessment team in Costa Rica led by IUCN gathered data from all the relevant water service providers, power companies, and regulatory bodies in the landscape. Using these diverse datasets, the assessment team calculated the average flow rate from 27 rivers or springs in the landscape. Whenever possible, the assessment team utilized records from the previous five years and compared the average flow rate to historical data to assess trends in water flow, helping to identify whether flows were increasing or decreasing over time.

In both cases, combining datasets allowed the assessment teams to capture a more comprehensive picture of water flow across the landscape, improving the representativeness of their results. However, such an approach requires careful handling of the different data sources to minimize inconsistencies and ensure the final dataset accurately reflects the metric being measured.
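
A minimal sketch of this kind of compilation is shown below. It is not the method used by either assessment team: the readings, units, river labels, and function names are hypothetical, and it assumes the main inconsistency between sources is the unit of measurement. It harmonizes units, averages across rivers, and compares a recent average with a historical one.

```python
from statistics import mean

# Hypothetical flow readings compiled from different reports for one year.
# Sources report in different units, so each record carries its own unit.
readings = [
    {"river": "A", "flow": 12.0, "unit": "m3/s"},
    {"river": "B", "flow": 8500.0, "unit": "L/s"},
    {"river": "C", "flow": 3.5, "unit": "m3/s"},
]

# Conversion factors to a common unit (cubic meters per second).
TO_M3_PER_S = {"m3/s": 1.0, "L/s": 0.001}


def average_flow_m3s(readings):
    """Convert every reading to m3/s, then average across rivers."""
    flows = [r["flow"] * TO_M3_PER_S[r["unit"]] for r in readings]
    return mean(flows)


def trend(recent_avg, historical_avg):
    """Label the direction of change between two average flow rates."""
    if recent_avg > historical_avg:
        return "increasing"
    if recent_avg < historical_avg:
        return "decreasing"
    return "stable"


recent = average_flow_m3s(readings)
historical = 9.2  # hypothetical historical average, in m3/s
print(f"Average flow: {recent:.2f} m3/s, trend: {trend(recent, historical)}")
```

A simple unweighted mean treats every river equally. Whether to weight by catchment size or river importance is a judgment call that belongs in the documentation of the composite dataset's limitations.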

