Introduction

The terrestrial carbon sink removes about a quarter of anthropogenic CO2 emissions1 but is highly variable in time and space depending on climate. The magnitude of gross primary productivity (GPP) is therefore one of the largest sources of uncertainty in predicting future trajectories of global temperature2. For example, GPP is a first-order control on plant turnover times, a dominant uncertainty term in the terrestrial carbon sink3. There has been substantial progress in quantifying and constraining GPP and other major global carbon fluxes, typically using models driven by satellite remote sensing4,5,6,7 and upscaled in situ ecosystem-scale flux measurements8,9. Recent syntheses7,10 suggest that global GPP is 120–125 Pg C yr−1, and such estimates from the literature (GPPlit) have been incorporated into synthesis efforts such as the Global Carbon Project1 as well as model benchmarking frameworks11. The magnitude of terrestrial GPP thus has implications for the dynamics and resilience of the terrestrial C sink in the face of global environmental change12,13.

Global GPP is roughly balanced by ecosystem-to-atmosphere respiratory fluxes. The difference between these two major fluxes, minus smaller fluxes such as fire and lateral (e.g., dissolved, particulate) organic carbon losses, comprises the terrestrial C balance1. Terrestrial ecosystem respiration is dominated by the soil-to-atmosphere CO2 flux (soil respiration or RS), the combined flux generated by microbial and plant root respiration. Respiration is rarely estimated, even indirectly, from satellite observations, and thus global RS is generally derived by upscaling in situ measurements14,15,16. Published RS estimates from the literature (Rslit) range from 68 to 109 Pg C yr−1 (Supplementary Table 1), with a central range of 85–90 Pg C yr−1 17. Because GPP and RS are physiologically linked, the biophysical balance between GPP and RS could be used as a constraint on the global carbon budget. To date, however, no attempt has been made to quantify how consistent these independent GPP and RS estimates are at the global scale. This study compares these two large carbon fluxes and the results emphasize the importance of cross-comparing datasets and models to understand terrestrial carbon cycling as well as future climate change.

Results and discussion

Inconsistency between photosynthesis and soil respiration

We partitioned global Rslit estimates into microbial and root respiration based on all available (published) partitioning values, and calculated distributions of the resulting implied GPP (GPPRs) using literature estimates of net primary production (NPP) and root-to-shoot respiration ratios (Supplementary Figs. 17). Using a nonparametric bootstrap, we generated 10,000 such GPPRs estimates based on random draws from Rslit, NPP, the partitioning parameters (see Methods and Supplementary Figs. 5, 810 and Supplementary Tables 1, 2), and the corresponding uncertainties. The resulting GPPRs distribution was \({149}_{-23}^{+29}\) Pg C yr−1 (mean ± 95% confidence interval; Fig. 1), which contrasts with the GPPlit average of \({113}_{-18}^{+18}\) Pg C yr−1. The intersection of these two distributions is 127.6 Pg C yr−1 (Fig. 1), a point at the 95.2% quantile of GPPlit and the 9.8% quantile of GPPRs. The null hypothesis (that these distributions are from the same underlying population) is highly unlikely: t49 = −12.68; P < 0.001. What characterizes the small number of estimates consistent with both GPPlit and GPPRs? Bootstrap draws in the overlap region were characterized by low root contribution to RS (averaging 34% below the intersection point, versus 42% above it) and high root contribution to autotrophic respiration (45 vs. 38%, respectively; Supplementary Fig. 11), resulting in low GPPRs values.

Fig. 1: Distribution and comparison of annual global soil respiration (RS) and gross primary productivity (GPP).
figure 1

a Distributions of global gross primary productivity (GPPlit and GPPRs); b Joint distribution of annual global soil respiration (RS) and gross primary productivity (GPP); c Distribution of global soil respiration (Rslit and RsGPP) estimates. Two distributions are shown: literature-reported GPP (GPPlit) versus GPP implied by those RS estimates (GPPRs); or literature-reported RS (Rslit) versus RS implied by those GPP estimates (RsGPP); Distributions are based on 10,000 random draws of the underlying estimates from published literature (summarized in supplementary Fig. 8). The red arrow represents from GPPlit to calculate RsGPP, the light-blue arrow represents from Rslit to calculate GPPRs, and the blue dots and line represent RS from the random forest model developed in this study and based on that to calculate the GPPRs. The arrows and direction corresponding to the arrows in supplementary Fig. 1.

We performed a comparative analysis of published data to derive RS from GPP, partitioning GPPlit into NPP and belowground autotrophic respiration components, while accounting for other carbon loss pathways (see Methods). The resulting implied RsGPP (i.e., the global RS as implied by GPPlit, \({68}_{-8}^{+10}\) Pg C yr−1; Fig. 1) is highly unlikely to be consistent with Rslit values (\({87}_{-8}^{+9}\) Pg C yr−1; see Methods). Only 1.8% of the Rslit distribution in Fig. 1 is below the intersection point of 78.2 Pg C yr−1, and only 2.5% of the RsGPP distribution is above it. This is strong evidence against the null hypothesis that these curves are mutually consistent (i.e., that they represent the same underlying population, t23 = −11.59; P < 0.001). The overlap between these distributions is characterized by high GPPlit (averaging 125.6 Pg C yr−1, versus 112.5 Pg C yr−1 below the intersection point), high NPP, and a high contribution of roots to overall autotrophic respiration (46 and 39% for above and below the intersection point, respectively; supplementary Fig. 12). The cumulative result of these values produced the small percentage of RsGPP draws consistent with Rslit.

We identified sources of variability in Fig. 1 using a variance decomposition procedure to explore which parameters were both uncertain and influential in the distribution of GPPRs and RsGPP (Table 1). Variability in GPPRs was dominated (63% of total variance) by uncertainties in the ratio of root respiration to total autotrophic respiration, for which field measurements are limited. Other influential variables were variance in global Rslit (12%) and the root contribution to total RS of a desert, wetland, and savanna (other, 7%). For bootstrapped RsGPP, uncertainty in GPPlit was the largest (35%) contributor to variability, with root contribution to total RA of cropland, savanna, grassland, and wetland (other, 32%) and global NPP (28%) also large. No other factor contributed more than 2% for variability in GPPRs.

Table 1 Variance decomposition for the calculation of gross primary productivity (GPP) from soil respiration (RS) reported in the literature (Rslit), and calculation of RS from literature GPP (GPPlit).

We also employed a second, complementary approach, one independent of any assumptions about carbon partitioning. In this step, we compared site-level measurements of RS and GPP from a global soil respiration database (SRDB18) and FLUXNET19. These were compared against the same global GPPlit and Rslit estimates shown in Fig. 1. The site-level RS:GPP ratios (i.e., the values directly reported by investigators and compiled in SRDB) averaged 0.56 ± 0.26 (Fig. 2), very similar to the RS:GPP ratios from combining SRDB and FLUXNET data (0.54 ± 0.85). These were both significantly (P < 0.001 based on a nonparametric Wilcoxon test) lower than the Rslit:GPPlit ratios of 0.72 ± 0.11.

Fig. 2: Observations, estimates, and model results of the ratio of soil respiration (RS) or heterotrophic respiration (RH) to gross primary productivity (GPP), at different spatial scales and from different sources.
figure 2

a Observations, estimates, and model results of the ratio of RS to GPP at grid cell site-level; b Observations, estimates, and model results of the ratio of RS to GPP at a global scale; c Observations, estimates, and model results of the ratio of RH to GPP at grid cell site-level; d Observations, estimates, and model results of the ratio of RH to GPP at a global scale. Observational site-level data are from the global Soil Respiration Database (SRDB) and FLUXNET data (see Methods). The ratio of global RS and RH to global GPP is shown in red (and emphasized by the horizontal dashed lines), while results from the Coupled Model Intercomparison Project Phase 6 (CMIP6) at both the local grid cell site-level (values were extracted at coordinates corresponding to specific SRBD and FLUXNET sites) and global scale are shown in blue. Note that the odd distribution of the former results from the diversity of model ensemble realization used. Each point grouping is arranged distributionally, with overlaid box-and-whisker plots summarizing the mean, 25 and 75% quantiles, and extreme values. There are 16 models from CMIP6 with RH data; RS from CMIP6 models was calculated based on RH and Rroot:RS ratio using a bootstrap approach.

We found no evidence that this difference was driven by a lack of spatial representativeness in the global distribution of SRDB data. For example, the arithmetic mean of the RS:GPP ratio in the SRDB is 0.56, and 0.57 when weighted by vegetation areas globally. We highlight that this does not mean that the difference cannot be influenced by sampling errors related to the sparsity of the underlying measurements. Figure 2 also shows RS:GPP and RH:GPP values from models in the Coupled Model Intercomparison Project phase 6 (CMIP6)20 at both local (grid cell site-level) and global scales. These models are global in extent, similar to satellite data products, but their explicit physiological processes mean that their RS outputs are constrained by GPP. In the CMIP6 models examined, RS:GPP values were 0.609 ± 0.11 at both the global scale (i.e., the ratio of the models’ global fluxes) and the scale of individual grid cell site-level, which were significantly lower (W = 375,206, P < 0.001) than global Rslit:GPPlit values shown in Fig. 2.

The RH:GPP ratios from CMIP6 models do not significantly differ from the global RH:GPP ratio from the literature (P = 0.93, Fig. 2d), indicating that the low RS:GPP ratio of the CMIP6 models (Fig. 2b) is likely due to too-low Rroot values, eitther because the fluxes are incorrectly parameterized, or because the allocation of carbon across different pools is incorrectly represented. Carbon allocation is a notable weak link in current ESMs due to both a lack of empirical observations and uncertainty over the underlying physiological mechanisms, and the RS:GPP ratio could be a valuable model benchmark to constrain root allocation. An even stronger approach, in our view, is to use data assimilation in model benchmarking efforts11 to estimate multiple C and biogeochemical fluxes simultaneously, so that they are constrained by each other.

These independent lines of the analysis demonstrate that GPPlit and Rslit, the historical global flux estimates reported in the published scientific literature, are almost certainly inconsistent with each other. One possible interpretation of this problem is that many published global GPP estimates are biased low. If the mean of the GPPRs distribution (149 Pg C yr−1) in Fig. 1 is the actual global flux, for example, that would be close to that implied by atmospheric 18O:16O ratios of CO2, which suggest that a global GPP of 150–175 Pg C yr−1 is needed to explain rapid CO2 cycling times21. A similar conclusion was reached in recent studies using novel methods such as O2:CO2 ratios associated with the land carbon exchange22 as well as GPP derived using solar-induced fluorescence (SIF) data assimilation5.

In an effort to derive new and independent estimates of RS and GPP, we used RS data from a recently updated global daily RS database (DGRsD) to parameterize Random Forest (RF) models for each month, and estimated global monthly RS at a spatial resolution of 0.1° (Supplementary Figs. 13, 14). Such daily data can provide more robust estimates than do annual numbers used until now to estimate global-scale RS23. The resulting global annual RS was 93 Pg C yr−1, with a corresponding GPPRs of 157 Pg C yr−1 (Fig. 1), close to the mean Rslit (\({87}_{-8}^{+9}\) Pg C yr−1) and GPPRs (\({149}_{-23}^{+29}\) Pg C yr−1). This also suggests that higher GPP is a possible explanation for any discrepancy between GPPlit and Rslit, but it should be noted that DGRsD is not independent of SRDB, and therefore more evidence is needed to ensure there are no systematic biases in Rslit.

Possibilities to close the gap

A number of factors might produce too-low global GPPlit estimates (Table 2). We found that purely remote-sensing derived GPP values, in particular from MODIS, tended to be smaller than estimates from site-level upscaling or a mixture of remote sensing and site-based measurements (Supplementary Fig. 5), consistent with recent work on the uncertainties in GPP estimation7,12. Note however that if GPPlit groups are weighted equally (i.e., aggregated into six different groups before bootstrap resampling), the bootstrapped results (GPPlit-group) are higher and closer to the GPPRs (Supplementary Fig. 5). This suggests that older remote sensing approaches may underestimate sub-pixel spatial heterogeneity, and do not fully account for understory production24 or belowground C allocation25. Second, products such as FLUXCOM are produced from eddy covariance measurements that are themselves spatially biased26. Furthermore, these measurements do not account for all carbon loss pathways or long-term CO2 fertilization effects9, and probably underestimate GPP in the highly-uncertain tropics9,27, as well as in managed and fertilized croplands28 where there are limited measurements to parameterize FLUXCOM. Finally, there are substantial uncertainties and mismatches in the algorithms that partition towers’ net ecosystem exchange into GPP and respiration (Supplementary Table 3)29, and also mismatches between these respiration estimates with direct measurements of RS (Table 2 and Supplementary Table 3).

Table 2 Summary of uncertainties and possible biases: factors that might explain why gross primary production (GPP) would be biased low, and/or soil respiration (RS) too high.

Conversely, it is possible that Rslit estimates are biased consistently high (Table 2). One important factor may be that RS data are less diverse than those of GPP, with almost all Rslit ultimately deriving from a large but single global database of thousands of small-scale studies using generally similar methods18. This database is based on published data of annual fluxes, most of which are extrapolated (to an annual flux) from sporadic daytime measurements made at widely varying intervals, which might introduce bias30. Nevertheless, when additional newly published daily time scale in situ measurements were included to parameterize the RF models, global RS was predicted to be 93 Pg C yr−1, very close to Rslit (Fig. 1). Finally, the local- and/or large-scale models used to upscale measured RS temporally and spatially may not accurately represent soil moisture responses (e.g., due to hysteresis effects) because of its confounding effect with temperature31.

A common potential problem affecting large-scale estimates of both GPP and RS concerns spatial coverage and representativeness of the terrestrial land surface and climate space26. GPP and RS measurements have differing tradeoffs in this regard. The former is characterized by a spatially complete and large measurement domain (hundreds of m2 to km2, depending on the eddy covariance tower or pixel), but also nontrivial measurement uncertainties (e.g., the algorithms used to calculate GPP from the measured net flux). By contrast, RS is upscaled from spatially small (~1 m2) but locally accurate chamber measurements dispersed in time that are, however, with better global coverage. Sites in both FLUXNET and SRDB are biased (Supplementary Fig. 6) towards the mid-latitudes of the northern hemisphere32,33. Both global GPP and RS are thought to be dominated by fluxes from highly-productive tropical forests, where eddy covariance towers are scarce and measurements, particularly uncertain26,34. Many of these factors could in theory produce systematic biases in the measurement and scaling of both GPP and RS9,35.

In addition, estimates of GPPlit and Rslit have varied among studies (see Supplementary Fig. 8 and refs. 15,36), reflecting methodological and technological differences, but uncertainty in these estimates have remained high (Supplementary Tables 1, 3); see also ref. 37. We highlight that more recent GPP estimates have tended towards higher estimates but still with high uncertainty. There is also a temporal disparity when comparing literature estimates: while GPPlit and Rslit cover a similar period overall (1980–2020), most GPPlit values are centered between 2000 and 2010, but a majority of Rslit occurs between 1985 and 1995. If GPPlit and Rslit are weighted equally by time (i.e., aggregated by the same breakpoints before bootstrap resampling, Supplementary Fig. 8), bootstrapped GPPlit-agg and Rslit-agg are closer to GPPRs and RsGPP (by ~10 Pg C yr−1, Supplementary Fig. 8), although significant disparities remain. Furthermore, when considering the temporal coverage and changing methods for GPP, we found that the gaps between carbon-cycle flux collected from the literature (GPPlit and Rslit) and the results implied by the other fluxes (GPPRs and RsGPP) decreases, but still significantly differed from each other (P < 0.01, Supplementary Fig. 9).

Perspective view

How could we address these discrepancies and close the terrestrial C budget once and for all? The distribution of our GPPRs and RsGPP results is driven by a few key variables (Tables 12), some of which are relatively rarely measured. These include the ratio of root respiration to total autotrophic respiration38; the ratio of root respiration to total soil respiration, and the ratio of autotrophic respiration to GPP; those data came from sites covering a similar range compared with global GPP, but lack measurements for regions with low photosynthesis (Supplementary Fig. 7). Acquiring (via field measurements or other approaches) additional constraints on these ratios may be a particularly fruitful way to resolve the inconsistencies identified in this study. For example, increasing numbers of studies have separated RS into its autotrophic and heterotrophic components in the last decade, enabling large-scale heterotrophic respiration synthesis efforts upscaling global estimates16. Recent studies have shown that RS are relatively less measured in low-productivity regions, arctic regions, and Tibetan Plateau, and that this uneven spatial distribution of data may create large uncertainties when scaling up and estimating global RS33,39, inferring GPP from Rslit and inferring RS from GPPlit (Table 1) also show that Rroot:RS and Rroot:RA measurements from the desert, wetland, cropland, and savanna are key variables to close the gap between productivity and respiration fluxes in the global terrestrial carbon cycle. In addition, arctic regions and the Tibetan Plateau store a large amount of organic matter and are experiencing fast climate change. In the future, increasing field measurements of Rroot:RS, Rroot:RA, and RA:GPP, especially in low-productivity regions, arctic regions, and Tibetan Plateau is important to close the terrestrial carbon budget.

Here, we show large discrepancies between published estimates of global GPP and RS, producing uncertainties that hamper our capacity to close the global C budget. Despite substantial efforts to understand carbon-climate feedbacks2,35 in the last decades, changes to carbon uptake rates in response to climate change remain uncertain. Importantly, more recent GPP estimation methods—in particular, moving from MODIS-derived information to alternative measurements of plant photosynthetic activity (i.e., SIF)—seem to be closing the gap between our estimates of these two dominant terrestrial carbon fluxes. This is crucial, as, without accurate estimates of the largest terrestrial C fluxes, it will be impossible to correctly determine the land carbon sink and its variability. Resolving the inconsistency between global GPP and RS is a necessary precondition for understanding the future of the global carbon cycle, and thus the possible future global climate change.

Methods

Carbon cycle terms and consistency

This study explored the consistency of global gross primary productivity (GPP) and soil respiration (RS) estimates in the global carbon (C) cycle. Terrestrial GPP is the photosynthetic gain of C by plants; soil respiration, the soil-to-atmosphere CO2 flux, the sum of root respiration and heterotrophic respiration as measured at the soil surface, and represents carbon fixed by plants at some point in the past. While GPP and RS may diverge significantly at local scales and for short time periods, they should however be coupled to a degree consistent with our understanding of the C cycle40. Plant autotrophic respiration (including leaf and stem respiration, Rshoot, and root respiration, Rroot) consumes part of GPP, and the remainder is termed net primary productivity (NPP). Parts of NPP are consumed by heterotrophs (RH) and herbivores (Cherb), burned in fires (Cfire), exported as dissolved organic carbon (DOC), or returned to the atmosphere by plants’ biogenic volatile organic compound emissions (BVOC). The remainder comprises long-term carbon storage–the terrestrial carbon sink (Csink). Theoretically, if we know how GPP is partitioned at each of these steps, we can produce an estimate of the RS implied by a GPP value (here termed RsGPP) at site or global scales; a similar process can be used to derive GPP from RS.

Data sources

Global RS and GPP were collected from published literature. We collected 23 global RS estimates (Supplementary Table 1) from published articles, the majority of which upscaled site RS measurements based on a global database41. Approximately 100 scientific manuscripts estimated global GPP, and we used the following criteria to determine whether the GPP estimate should be included: (1) the GPP year (or middle year if GPP was averaged across a period, Supplementary Table 2) was after 1980; (2) GPP was estimated from satellite remote sensing products or upscaled from global flux data (as opposed to process-based modeling). With those criteria, 49 GPP estimates from published articles were used in this study (Supplementary Table 2).

Our primary source of global NPP estimates was a literature survey42 that compiled 251 global NPP estimates. We noticed that there are several extreme NPP values within the dataset, we thus detected outliers using R, whatever an NPP estimate above 75% quantile + 1.5 interquartile range or below 25% quantile—1.5 interquartile range were considered as outliers. After outliers were removed, total 237 global NPP estimates were used in this study (Supplementary Fig. 1), similar to GPP. Cherb, Cfire, Csink, DOC, and BVOC emissions were also collected from published literature (Supplementary Table 4). Ratios of root respiration to autotrophic respiration (Rroot:RA), autotrophic respiration to GPP (RA:GPP), and root respiration to total soil surface respiration (Rroot:RS) were gathered from values in the global soil respiration database (SRDB18). Additional Rroot:RA ratio data were collected from a literature search (Supplementary Table 5). We used the ISI Web of Science for all literature searches.

Site-level data

A number of site-specific data were used (the results of which appear in Fig. 1). The RS:GPP ratio was computed based on observational data reported in the SRDB. To broaden the sources of available data for this analysis, we also used the FLUXNET-SRDB data combination from ref. 43. Briefly, Tier 1 FLUXNET2015 data were downloaded 30 January 2017 from http://fluxnet.fluxdata.org/data/fluxnet2015-dataset/ and filtered for quality (NEE_VUT_REF_QC ≥ 0.5). FLUXNET GPP was linked to an SRDB RS measurement if both measurements occurred within 5 km, in the same vegetation type, and in the same year. We realized that if a land conversion occurred in the last decades, RS will not be in equilibrium with GPP making the Rs:GPP ratio incorrect, however, we believe this do not introduce an important bias because (1) usually RS and GPP are reported from the same study in SRDB, and thus land use and measurement year are the same; and (2) RS:GPP ratio from SRDB are similar as that from FLUXNET (Fig. 2). This part of the analysis used eddy covariance data acquired and shared by the FLUXNET community, including these networks: AmeriFlux, AfriFlux, AsiaFlux, CarboAfrica, CarboEuropeIP, CarboItaly, CarboMont, ChinaFlux, Fluxnet-Canada, GreenGrass, ICOS, KoFlux, LBA, NECC, OzFlux-TERN, TCOS-Siberia, and USCCC.

CMIP6 data processing

Monthly historical GPP, heterotrophic respiration (RH), and autotrophic respiration (RA) outputs were obtained for the 16 models (104 model × ensemble combinations) currently available under the Coupled Model Intercomparison Project, version 6 (CMIP6)20, from the Earth System Grid Federation archive (https://esgf.llnl.gov/, accessed February 23, 2020). But there are only two models have root respiration, therefore, we estimated root respiration of all CMIP6 models based on RA and Rroot:RA ratio (Supplementary Fig. 3). To calculate the annual RS and RH to GPP ratio, monthly outputs were processed using CDO 1.9.844 and R to obtain a global annual time series of C flux, weighted by land area and the number of days in each month. This mean flux rate was converted to a total global flux by multiplying by the total land area and the number of seconds in a year, calculating RS as the sum of heterotrophic respiration and root respiration. To be consistent with the SRBD and FLUXNET observations, only data from those 1043 FLUXNET sites (Fig. 2) were extracted, the mean CMIP6 RH and RS to GPP ratio was calculated using flux data from 2005 to 2014.

For the ecosystem-scale CMIP6 analysis, we used monthly GPP, heterotrophic respiration, and root respiration outputs from 16 models. These were extracted at latitude and longitude coordinates corresponding to specific SRBD and FLUXNET sites. The total annual fluxes (weighted by days in a month) were used to calculate the average RS to GPP ratio from 2005 to 2014 at each coordinate. The final results consist of ratios at 362 latitude and longitude coordinates for 104 model × ensemble combinations. All CMIP6 processing code is available in the main repository at https://github.com/PNNL-TES/GlobalC.

GPP implied by RS (GPPRs)

In the past decades, global RS rates have generally been estimated by upscaling site RS measurements (producing values here termed Rslit, meaning “RS estimates from literature”). We collected and summarized these estimates from published articles (Supplementary Table 1, n = 23); approximately half also reported RS 95% confidence interval or standard deviation (N = 10) and a rate of change during the study period (N = 8). The reported RS values ranged from 68 to 109 Pg C yr−1, with an average of 85.4 Pg C yr−1.

Some studies also separated RS into its heterotrophic (RH) and root respiration (Rroot) source fluxes; the resulting Rroot:RS ratios have been compiled into the SRDB-V518 (Supplementary Fig. 2c). We used all of these Rroot:RS ratios from SRDB-V5, in total 911 separate records between 0 and 1.0. These covered nine vegetation types, but the majority were from forest, grassland, cropland, and shrubland; all other vegetation types (desert, wetland, and savanna) had only 49 samples combined (Supplementary Fig. 2c).

Autotrophic respiration is made up of aboveground (Rshoot) and belowground (Rroot) components. Many studies have separated RA into Rroot and Rshoot (Supplementary Fig. 3 and Supplementary Table 5), and thus Rroot:RA ratio and Rroot:Rshoot ratio can be calculated. GPP can be calculated (GPPRs, Supplementary Fig. 1 and Eqs. 13) from the Rslit estimates according to Rroot:RS ratio (RC), Rroot:Rshoot ratio (data from both the SRDB and an additional literature search, Supplementary Table 5) and NPP.

We then compared the GPPRs with GPP from publications in past decades (i.e., GPPlit) to determine the consistency between the GPPlit and GPPRs. The following equations were used to calculate GPPRs, i.e., the GPP implied by Rslit:

$${{{{{{\rm{R}}}}}}}_{{{{{{\rm{root}}}}}}}={{{{{{\rm{Rs}}}}}}}_{{{{{{\rm{lit}}}}}}}\times {{{{{{\rm{R}}}}}}}_{{{{{{\rm{root}}}}}}}{:}{{{{{{\rm{R}}}}}}}_{{{{{{\rm{S}}}}}}}{{{{{\rm{ratio}}}}}}$$
(1)
$${{{{{{\rm{R}}}}}}}_{{{{{{\rm{shoot}}}}}}}={{{{{{\rm{R}}}}}}}_{{{{{{\rm{root}}}}}}}\times {{{{{{\rm{R}}}}}}}_{{{{{{\rm{shoot}}}}}}}{:}{{{{{{\rm{R}}}}}}}_{{{{{{\rm{root}}}}}}}{{{{{\rm{ratio}}}}}}$$
(2)
$${{{{{{\rm{GPP}}}}}}}_{{{{{{\rm{Rs}}}}}}}={{{{{\rm{NPP}}}}}}+{{{{{{\rm{R}}}}}}}_{{{{{{\rm{root}}}}}}}+{{{{{{\rm{R}}}}}}}_{{{{{{\rm{shoot}}}}}}}$$
(3)

RS implied by GPP (RsGPP)

GPP has been estimated based on both remote sensing, FLUXNET data, and atmospheric inversions (Supplementary Table 2). We collected 49 such estimates from published articles; only 11 of these estimates reported the corresponding SD, and 14 reported corresponding temporal trends (Supplementary Table 2). The reported GPP estimates were from 1980 to 2015 and ranged from 100.2 to 167.0, with an average of 120.7 Pg C yr−1.

GPP can be separated into NPP, Cherb, Cfire, RA, DOC, BVOC, and Csink. Our global NPP source was a previous meta-analysis42, with outlier (outside 1.5 times the interquartile range above the upper quartile and below the lower quartile) removed, resulted in 237 estimates averaged 56.2 ± 9.6 Pg C yr−1. After subtracting carbon consumed by herbivores, fire, and the land sink from NPP, global RH can be estimated (RH = NPP − Cherb − Csink − Cfire − DOC − BVOC, Supplementary Fig. 1 and Supplementary Table 4).

The precise chain of reasoning and computation was as follows. The difference between GPP and NPP is RA, meaning that an RA:GPP ratio was required to estimate RA based on GPP (Eq. 4). The RA:GPP ratios used in this study were from two sources: (1) a literature search that produced 123 RA:GPP ratio estimates45,46,47,48; and (2) an additional 123 RA:GPP ratio estimates from SRDB-V5. These RA:GPP ratios covered nine vegetation types, mainly from forest and grassland; all the other vegetation types (cropland, wetland, and tundra) only had 14 samples combined (Supplementary Fig. 4). RA can also be calculated by subtracting NPP from GPP (Eq. 5), and calculated RA was very similar when computed by the above two methods. We used the average RA from these two methods.

In turn, RA consists of root respiration (Rroot) and shoot respiration (Rshoot), and thus Rroot:RA and Rshoot:RA ratios are required to calculate Rroot and Rshoot from RA. The Rroot:RA ratios used in this study were from two sources: (1) 35 Rroot:RA estimates from 28 literature studies (Supplementary Table 5); and (2) an additional 94 estimates from SRDB-V5. The Rroot:RA values covered seven vegetation types (Supplementary Fig. 3), mainly from forests; all other vegetation types (cropland, savanna, grassland, and wetland) had only 18 samples.

Finally, starting with the GPPlit values, and using NPP, RA:GPP, Rroot:RA, and Rshoot:RA, GPP can be separated into RH, Rshoot, and Rroot and thus the implied global RS calculated (RsGPP; lower panel in Supplementary Fig. 1 and Eqs. 49 below). We then compared this RsGPP with Rslit to determine their consistency.

$${{{{{{\rm{R}}}}}}}_{{{{{{\rm{A}}}}}}}={{{{{{\rm{GPP}}}}}}}_{{{{{{\rm{lit}}}}}}}\times {{{{{{\rm{R}}}}}}}_{{{{{{\rm{A}}}}}}}{:}{{{{{\rm{GPP}}}}}}$$
(4)
$${{{{{{\rm{R}}}}}}}_{{{{{{\rm{A}}}}}}}={{{{{\rm{GPP}}}}}}-{{{{{\rm{NPP}}}}}}$$
(5)
$${{{{{{\rm{R}}}}}}}_{{{{{{\rm{H}}}}}}}={{{{{\rm{NPP}}}}}}-{{{{{{\rm{C}}}}}}}_{{{{{{\rm{sink}}}}}}}-{{{{{{\rm{C}}}}}}}_{{{{{{\rm{fire}}}}}}}-{{{{{{\rm{C}}}}}}}_{{{{{{\rm{herb}}}}}}}-{{{{{\rm{DOC}}}}}}-{{{{{\rm{BVOC}}}}}}$$
(6)
$${{{{{{\rm{R}}}}}}}_{{{{{{\rm{root}}}}}}}={{{{{{\rm{R}}}}}}}_{{{{{{\rm{A}}}}}}}\times {{{{{{\rm{R}}}}}}}_{{{{{{\rm{root}}}}}}}{{:}{{{{{\rm{R}}}}}}}_{{{{{{\rm{A}}}}}}}$$
(7)
$${{{{{{\rm{R}}}}}}}_{{{{{{\rm{shoot}}}}}}}={{{{{{\rm{R}}}}}}}_{{{{{{\rm{A}}}}}}}\times {{{{{{\rm{R}}}}}}}_{{{{{{\rm{shoot}}}}}}}{{:}{{{{{\rm{R}}}}}}}_{{{{{{\rm{A}}}}}}}$$
(8)
$${{{{{{\rm{Rs}}}}}}}_{{{{{{\rm{GPP}}}}}}}={{{{{{\rm{R}}}}}}}_{{{{{{\rm{root}}}}}}}+{{{{{{\rm{R}}}}}}}_{{{{{{\rm{H}}}}}}}$$
(9)

Bootstrap resampling

A critical factor is uncertainty that compounds at each step in this process. We used a bootstrap resampling approach to estimate GPPRs and RsGPP, as the sample size of each step is different, and many of the input data do not follow a normal distribution (Supplementary Figs. 15). For each bootstrap sample, we first generated a new estimate of GPP or RS by sampling from the published data (Supplementary Tables 1, 2, and 4, 5). We evaluated four different resampling methods, differing in how they treated the presence and absence of errors associated with each flux estimate. Method 1 did not use error information (i.e., any error estimate associated with each published RS or GPP value) when resampling. Methods 2–4 used errors but handled missing values differently. Method 2 replaced missing errors with values calculated from the median coefficient of variability (CV) of non-missing values; method 3 replaced missing errors with values calculated from the maximum CV across the dataset; and method 4 set missing errors to zero. We used method 3 in the main analysis, which is the most conservative (produces the widest distribution for both RS and GPP; cf. Supplementary Fig. 10).

In addition, a random value for each partitioning coefficient (e.g., above- to belowground autotrophic respiration ratio or herbivory fraction) was used in each bootstrap sample; note that errors are seldom reported for these data, and so were not considered here. We separated the Rroot:RS, Rroot:RA, and RA:GPP ratios by vegetation type, weighted by global vegetation area (from the IGBP vegetation land classification, https://climatedataguide.ucar.edu/climate-data/ceres-igbp-land-classification). Starting from the randomly-drawn RS or GPP value, and randomly-drawn partitioning coefficients, the resulting RS or GPP was then calculated following Eqs. 19 described above.

Variable importance analysis

As noted above, many variables related to C partitioning were used to derive GPP from RS (Eqs. 13) or to derive RS from GPP (Eqs. 49). To determine the relative contribution of each variable to the overall distributional uncertainty, as well as the sensitivity of the estimate to that variable, we fixed each variable (e.g., NPP) in turn to the median of all its observations. All other variables were randomly drawn, as normal, in the bootstrap process, and the output variable (GPPRs or RsGPP) mean and distribution were calculated. We then compared the output variance with the result when no variables were fixed, i.e., that shown in Fig. 1, to determine the importance of each variable: larger decreases in output variance when a particular parameter was fixed to be constant, imply greater importance for this parameter.

Representativeness analysis

We connect the Rroot:RS, Rroot:RA, and RA:GPP sites with external global GPP data from FLUXCOM (https://www.fluxcom.org/, last accessed on 2021/06/22) through latitude and longitude to obtain mean GPP between 2001 and 2015. We then compared the GPP of sites used in this study with the global GPP (spatial resolution of 0.5°) to test the representation of the sites (Supplementary Fig. 7).

Overlap calculation

We calculated the overlap between the GPPlit distribution and the distribution of GPPRs to quantify the agreement between GPPlit and GPPRs. If a sample was not significantly different from a normal distribution (based on a Shapiro–Wilk test in R), we used a normal distribution with sample mean and variance to approximate the distribution; if a sample was significantly different from a normal distribution, we used a numerical approximation based on linear interpolation (approxfun in R) to approximate the distribution’s probability density function. We then calculated the intersection point of these probability density functions, as well as the proportion of each curve that overlapped with the other using a trapezoidal rule numerical integration. Finally, we sampled each approximated distribution for the original number of GPP or RS values. With these samples, a two-sample Welch’s t-test (t.test with var.equal = FALSE in R) was performed to determine if the means of the two distributions differed significantly.

Global soil respiration modeling

Following a similar approach as Jian et al. (2018)23, measurements from a global daily soil respiration database (DGRsD) and nine environmental factors (i.e., nitrogen deposition, monthly precipitation, monthly air temperature, soil bulk density, soil organic carbon, soil clay percentage, aboveground biomass, belowground biomass, and Enhanced Vegetation Index, details please see supplementary Table 6) were used to build Random Forest (RF) models for each month. Only RS measurements with no field manipulation were used, totally 27,214 samples were separated into two datasets, 80% of samples were used to train the models, and the rest 20% were used to test the model performance. The results showed that the RF models can explain ~66% RS variability, and the performance is consistent with both training and validation datasets. RS for each month with a spatial resolution of 0.1° were predicted by the RF models, estimated monthly RS were then summarized to estimate global annual RS. Permanent ice sheets and bare soils were removed according to the MODIS landcover map49.

Other statistical analyses

All analyses were conducted using R 3.6.150. Bootstrap means were compared using a two-sided Student’s t-test. A one-sided, nonparametric Wilcoxon rank-sum test with continuity correction was used to compare RS to GPP ratios calculated from global estimates, the SRDB, and CMIP6 outputs.