Chapter 4: Cancer survival in Africa, Asia, Caribbean and Central America: database and attributes

Swaminathan R, Lucas E and Sankaranarayanan R

open this chapter (vol 2, 2011) in PDF format
open this chapter (vol 1, 1998) in PDF format

Abstract

Thirty-one registries in 17 countries submitted data for systematic and centralized scrutiny. Data were reported on 564 606 cases of different cancers, spanning 1–56 sites/types, from 27 registries in 14 low-/medium-resource countries in Eastern and Western Africa, the Caribbean, Central America and four regions of Asia; the cases were registered during 1990–2001 (period varying for individual registries). The database for this survival study comprised data that were classified as mandatory and optional. Mandatory variables provided by all registries included case-ID, age at diagnosis, sex, incidence date, most valid basis of diagnosis, cancer site/type (ICD-10 codes C00-96), vital status at follow-up and corresponding date. Clinical extent of disease was prominent among the optional variables; it was provided by 17 registries and analysed. The grouping of cancer sites for analysis was based on standard norms, and only categories with at least 25 cases were reported. Cases registered based on a death certificate only, cases lacking any follow-up after initial registration, or cases rejected based on validation checks were excluded from the survival analysis. An easy guide to contents in subsequent chapters, especially tables and graphs describing data quality indices, survival statistics and online dynamic functions, is provided.

Introduction

The database for the survival study was conceived on the basis of data routinely collected in population-based cancer registries in most countries. Accordingly, the variables needed were classified under three major headings: the person, the disease and the follow-up. Under each heading the variables were classified as either mandatory or optional. The variables are summarized in Table 1.

Table 1: Summary of data variables requested from the registries for the survival study

Registries were invited to participate mainly from among those whose data on cancer incidence and mortality had been published in any volume of Cancer Incidence in Five Continents[1]. The response was overwhelming: thirty-one registries in 17 countries submitted data for centralized scrutiny. One or more of the mandatory data variables required were not provided by two of the registries and in four, there was a significant incompleteness in follow-up. Hence the data from these four registries were rejected after preliminary inspection. Thus data submitted by 27 registries from 14 countries were included for further systematic scrutiny.

Inclusion and exclusion criteria

The broad inclusion and exclusion criteria fixed for this study are given in Table 2. The processing of data for individual registries with a pre-specified set of minimal checks for validity and consistency of data revealed the different procedures followed by some of the registries[2]. The first step undertaken was to standardize the norms to facilitate an unambiguous exclusion of cases from the study.

Table 2: Inclusion and exclusion criteria for the survival study

The distinction between death-certificate-initiated cases (DCN), for which further information is traced back, and cases finally registered on the basis of a death certificate only (DCO), without any additional information, was established after several exchanges of correspondence with the registries concerned. Such DCO cases, with the incidence date the same as the date of death, were excluded from survival analysis. In rare instances, the date of follow-up was mistakenly recorded as the date of the follow-up attempt rather than the date corresponding to the vital status. Such cases were mostly flagged by a code for loss to follow-up or were not coded for vital status in the data. These discrepancies were resolved in consultation with the respective registries, and the cases were classified as having incomplete or no follow-up. Cases lacking any follow-up were then excluded from the analysis. Cases with multiple primaries were identified both by the registry and by routine checks, and were excluded from survival analysis. Thus, a compact set of validation checks for mandatory and optional data variables, as described in Table 3, was developed and carried out systematically for all the registries. Cases rejected on the basis of these checks were then excluded from the survival analysis.

Table 3: Details of validation checks carried out and the decision made

Data processing

The data sent by the registries were not in a uniform format. They were all converted to database files (dbf) for uniformity.

Registry code & name

A two-digit code based on the ascending alphabetical order of the participating countries was assigned.
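The coding rule can be sketched as follows. This is only an illustration: the function name is hypothetical, the numbering is assumed to start at 01, and the text does not say how several registries within one country were distinguished.

```python
def assign_country_codes(countries):
    """Return {country: two-digit code}, numbered in ascending alphabetical order."""
    ordered = sorted(set(countries))
    # Assumption: numbering starts at 01; the text does not state the first code.
    return {name: f"{rank:02d}" for rank, name in enumerate(ordered, start=1)}


# Illustrative subset of countries, not the full study list.
codes = assign_country_codes(["India", "China", "Thailand", "Cuba", "India"])
print(codes)
```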

Cancer site or type

The data submitted by the registries did not have a uniform coding format, even for mandatory variables. The most prominent example was the coding of the primary site and/or histology type of cancer. The calendar period of case registration for this study coincided with the transition in coding practices that most registries were undergoing: from one version of ICD-O to another and thereby to ICD-10. Some of them remained with ICD-9 coding even after having changed to higher versions of ICD-O [3] [4] [5] [6] [7]. This prompted the conversion of the cancer diagnosis into a uniform format of four-digit codes following ICD-10 [8]. The case listings of warnings and invalid conversions were sent to the respective registries, and the queries were resolved by mutual consent.

Table 4: Description of cancer sites or types included for analysis of survival

The classification of cancer sites or types was based on the same lines as in Cancer Incidence in Five Continents, Volume VIII[9] and is described in Table 4. Only categories with at least 25 cases were considered for analysis and reporting.
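A minimal sketch of the 25-case reporting threshold is given below. The function name and the sample site codes are illustrative assumptions; only the threshold itself comes from the text.

```python
from collections import Counter

MIN_CASES = 25  # reporting threshold stated in the text


def reportable_sites(site_codes, min_cases=MIN_CASES):
    """Count cases per site category and keep categories with enough cases."""
    counts = Counter(site_codes)
    return {site: n for site, n in counts.items() if n >= min_cases}


# Hypothetical case listing: one site category per registered case.
sample = ["C50"] * 30 + ["C53"] * 25 + ["C62"] * 4
print(reportable_sites(sample))
```

A category with exactly 25 cases is retained, since the text specifies "at least 25".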

Age at diagnosis

This refers to the age in completed years on the incidence date. This was verified against the date of birth when provided. Cases with unknown age were excluded, and ages above 97 years were coded as 98.
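The age rules above could be expressed roughly as follows. The function names are hypothetical, and representing an unknown age as None is an assumption made for the example.

```python
from datetime import date


def completed_years(birth_date, incidence_date):
    """Age in completed years on the incidence date."""
    years = incidence_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the incidence year.
    if (incidence_date.month, incidence_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def code_age(age_years):
    """Return the coded age, or None when the case is excluded (age unknown)."""
    if age_years is None:  # assumption: unknown age is represented as None
        return None
    if age_years > 97:
        return 98  # ages above 97 years share the single code 98
    return age_years
```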

Clinical extent of disease

Though this is an optional variable in this study, it has the greatest significance in correlating local factors with the estimated survival. This data is routinely available or collected by most registries, and in this study seventeen registries submitted this information. The broad norms adopted in classifying this variable into four categories are as follows:
  • Localized: Tumour confined to the organ of origin without invasion into the surrounding tissue/organ and without involvement of any regional or distant lymph nodes or organs;

  • Regional: Tumour not confined to the organ of origin with invasion into the surrounding tissue/organ, with or without the involvement of the regional lymph nodes and not involving or spread to the non-regional lymph nodes or organs;

  • Distant metastasis: Tumour involving or spread to the non-regional lymph nodes or distant organs;

  • Unknown: The above information is unknown.

Index date

The starting date for calculating survival in this study is the incidence date. The definition of incidence date did not vary substantially between registries. Most of the registries used the first date of unequivocal diagnosis of cancer, by any means, as the incidence date. Other alternatives encountered were the hospital admission date or the date of histological verification. Such a variation might result in minimal differences in short-term survival (say <2 years) and negligible differences in long-term survival[10]. Data on the incidence date were submitted as exact dates or to the level of the month and year of diagnosis. In this study, the index date varied between 1st January 1990 and 31st December 2001, with the period varying for individual registries.

Closing date or date of last follow-up

This date varied between registries and ranged between 31st December 1999 and 31st December 2003. The vital status of each patient was classified as dead, alive or lost to follow-up corresponding to this date. This information was submitted by the registries as exact dates or to the level of month and year of follow-up. To measure the extent of incompleteness in follow-up, especially for registries that employed active methods of follow-up, a variable called 'Follow-up' was created, and the extent of loss to follow-up, in years of survival time from the index date, was classified as <1 year, 1-3 years, 3-5 years and >5 years.
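The grouping behind the 'Follow-up' variable might look like the sketch below. The text does not say which group receives a value falling exactly on a boundary, so the half-open intervals used here (for example, exactly 3 years goes to "3-5 years" and exactly 5 years to ">5 years") are an assumption, as is the function name.

```python
def loss_to_followup_category(years_from_index):
    """Classify the time from index date to loss to follow-up, in years."""
    # Assumption: each interval includes its lower bound and excludes its upper bound.
    if years_from_index < 1:
        return "<1 year"
    if years_from_index < 3:
        return "1-3 years"
    if years_from_index < 5:
        return "3-5 years"
    return ">5 years"
```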

Survival time

This was calculated as the time (in months) between the index date and the date of death from any cause, date of loss to follow-up or the closing date, whichever was earliest.
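As an illustration, the calculation can be sketched as below. The conversion from days to months (365.25 / 12 days per month), the function name and the status labels are assumptions; the text states only that survival time was measured in months up to the earliest of the three dates.

```python
from datetime import date

DAYS_PER_MONTH = 365.25 / 12  # assumption: average month length


def survival_time(index_date, closing_date, death_date=None, lost_date=None):
    """Return (months, status), where the end point is the earliest applicable date."""
    # Death is listed first so that it takes precedence when dates tie.
    endpoints = []
    if death_date is not None:
        endpoints.append((death_date, "dead"))
    if lost_date is not None:
        endpoints.append((lost_date, "lost to follow-up"))
    endpoints.append((closing_date, "alive"))
    end_date, status = min(endpoints, key=lambda pair: pair[0])
    months = (end_date - index_date).days / DAYS_PER_MONTH
    return months, status


# Hypothetical patient: diagnosed 1 January 1995, died 1 January 1996.
months, status = survival_time(
    date(1995, 1, 1), date(2003, 12, 31), death_date=date(1996, 1, 1)
)
print(round(months, 1), status)
```

Where registries supplied only the month and year, a day of the month would have to be imputed before such a calculation; the text does not describe how this was done.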

Most valid basis of diagnosis

The codes for the most valid basis of diagnosis also differed among registries. Based on the key to these codes, a new variable, 'Histological verification', was created for clarity and uniformity.

Inclusion status

Systematic validation of the data was undertaken for all registries by performing the checks listed in Table 3. Further customized checks were performed on the optional variables provided by each registry, depending on the data type. A list of potential errors was returned to the registries for clarification. Registries undertaking follow-up predominantly by passive methods were urged to improve follow-up by resorting to feasible active methods like repeated scrutiny of medical records and linkage of data at sources of registration of cases. After the rectification of errors, if any, the checks were repeated on the revised data. The inclusion status was then classified as follows: Included (I) or excluded for reasons of being a DCO case (D), with lack of any follow-up (F) or due to any other reasons (O) on validation checks.
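The four-way classification could be coded along these lines. The order in which the exclusion reasons are tested is an assumption (the text does not say which reason wins when more than one applies), and the function and argument names are hypothetical.

```python
def inclusion_status(is_dco, has_followup, passed_checks):
    """Return I (included), D (DCO), F (no follow-up) or O (other exclusion)."""
    # Assumed precedence: DCO first, then lack of follow-up, then other checks.
    if is_dco:
        return "D"
    if not has_followup:
        return "F"
    if not passed_checks:
        return "O"
    return "I"
```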

Data quality indicators

The indices that would determine the data quality can be summarized as follows:
  • The frequency of cases, expressed as number and percentage, that were registered as a DCO;

  • The frequency of cases, expressed as percentage, that had a histologically confirmed cancer diagnosis;

  • The frequency of cases, expressed as number and percentage, that were excluded from survival analysis including those with lack of any follow-up or other errors;

  • The frequency of cases, expressed as number and percentage, with incomplete follow-up.

All of the above, by classified cancer site or type, are included as standard tables in the chapters dealing with individual registry data.
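A rough sketch of how these indices could be tabulated for one cancer site follows. The record layout (keys "status", "hv" and "incomplete_fu") is hypothetical, and using all registered cases as the denominator for every percentage is an assumption.

```python
def quality_indices(cases):
    """Return {index: (number, percentage)} for a list of case records."""
    total = len(cases)

    def count_and_pct(predicate):
        n = sum(1 for case in cases if predicate(case))
        pct = round(100.0 * n / total, 1) if total else 0.0
        return n, pct

    return {
        "dco": count_and_pct(lambda c: c["status"] == "D"),
        "histologically_verified": count_and_pct(lambda c: c["hv"]),
        "excluded": count_and_pct(lambda c: c["status"] != "I"),
        "incomplete_followup": count_and_pct(lambda c: c["incomplete_fu"]),
    }


# Four hypothetical cases; status uses the codes I, D, F and O.
sample_cases = [
    {"status": "I", "hv": True, "incomplete_fu": False},
    {"status": "I", "hv": True, "incomplete_fu": True},
    {"status": "D", "hv": False, "incomplete_fu": False},
    {"status": "F", "hv": True, "incomplete_fu": False},
]
indices = quality_indices(sample_cases)
print(indices)
```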

Study database

Two databases were created for analysis and reporting of results:
  • The file SURVD.DBF deals with all cases submitted for scrutiny and includes 16 variables. This file is essentially for eliciting the data quality.

  • The file SURVDB2.DBF deals with cases included for analysis and comprises 10 variables. This file is essentially for eliciting data on survival.

The description of the variables is given in Table 5.

Table 5: Cancer Survival in Africa, Asia, Caribbean and Central America, database

Map 1: World map showing study locations

Map 2: Asian map showing study locations

Map 3: African map showing study locations

Map 4: Caribbean and Central America map showing study locations

A lead to the chapters on individual registries

An overview of the background to the survival data and of the analysis carried out for each registry is given in Table 6, as a lead to the forthcoming chapters on individual registries. Cancer registration and follow-up are carried out entirely by active methods in eleven registries (all five from India (Barshi, Bhopal, Chennai, Karunagappally and Mumbai), two from Thailand, and one each from Pakistan, Turkey, Uganda and Zimbabwe). Among the remaining four registries in which cancer registration is done entirely by active methods, follow-up is carried out in a mixed manner: predominantly by active methods with a minimal passive component in Manila and Rizal, Philippines, and predominantly by passive methods in Tianjin, China and Incheon, South Korea. Cancer registration and follow-up are done entirely by passive methods only in Singapore. In Hong Kong, cancer registration is mixed while follow-up is entirely by passive methods. Cancer registration and follow-up are both carried out by a mixture of active and passive methods in all the other registries.

Table 6: An overview of basic characteristics of survival data and analysis done by registry

The various approaches to estimating survival probability are illustrated in Chapter 2. The analysis by the semi-complete approach involving two calendar periods was done for all the registries except Qidong, for which the analysis was done by the complete approach. The analysis of survival trend by the semi-complete approach was done for eleven registries, while a comparative analysis of survival trend by cohort and period approaches for several calendar periods was possible in Qidong and Tianjin, China and Singapore. Analysis of survival by clinical extent of disease for selected cancer sites was possible in 17 registries.

A guide to the tables and graphs in the individual registry chapters

A chapter is dedicated to each participating registry. It comprises a concise summary describing the background and salient features of the results combined with standard/optional tables and figures.

Table 1 deals with the main data quality indices before and after the commencement of follow-up. For each classified cancer site/type category, it gives the total number of cases registered, the proportion (%) with a histologically verified diagnosis, the frequency of each type of exclusion from the study (DCOs, lack of follow-up and others), and the total number of cases excluded from and included in the study.

Table 1: Sample from the Songkhla (Thailand) registry: Data quality indices - Proportion (%) of histologically verified and death certificate only cases, number and proportion of included and excluded cases by site, Songkhla, Thailand, 1990–1999 cases followed-up until 2003

Table 2 refers to the data quality index on completeness of follow-up. For registries that resorted to passive means of follow-up entirely, this table gives the distribution of vital status (alive/dead) by classified cancer site/type. For others, this table gives the frequency of cases with complete follow-up at the closing date, as well as at 5 years from the index date, and the extent of incompleteness in follow-up by duration (classified years from diagnosis) of loss to follow-up for every classified cancer site/type. The non-randomness of loss to follow-up or informative censoring is indicated wherever encountered. The median follow-up (in months) is given for every registry and classified cancer site/type.

Table 2: Sample from the Busan (South Korea) registry: Number and proportion of cases by vital status and median follow-up (in months) by site, Busan, South Korea, 1996–2001 cases followed-up until 2003

Table 3 gives the crude and age-adjusted survival statistics. Absolute and relative survival (%) at one, three and five years from index date and 5-year age-standardized relative survival for all ages together and for the age interval 0–74 years are given for every classified cancer site/type.

Table 3: Sample from the Mumbai (India) registry: Comparison of 1-, 3- and 5-year absolute and relative survival and 5-year age-standardized relative survival (ASRS) by site, Mumbai, India, 1992–1994 cases followed through 1999 and 1995–1999 cases followed-up until 2003

Figure 1a portrays the top five or ten cancers ranked by 5-year relative survival for those registries that contributed data on a sufficient number of cancer sites/types.
Figure 1a: Sample from the Shanghai (China) registry: Top ten cancers (ranked on survival), Shanghai, China, 1992–1995

Figure 1b displays the top five cancers ranked by 5-year relative survival among males.
Figure 1b: Sample from the Shanghai (China) registry: Top five cancers (ranked on survival), Male, Shanghai, China, 1992–1995

Figure 1c represents the top five cancers ranked by 5-year relative survival among females.
Figure 1c: Sample from the Shanghai (China) registry: Top five cancers (ranked on survival), Female, Shanghai, China, 1992–1995

Table 4 deals with survival statistics by sex and classified age groups. The frequency of cases and five-year absolute and relative survival (%) by sex and the frequency of cases and five-year relative survival (%) by the age groups 0–44, 45–54, 55–64, 65–74 and 75+ years are given for every classified cancer site/type.
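Assignment of a case to one of these age groups can be sketched with the standard-library bisect module. The function name is hypothetical, and the labels use plain hyphens for convenience.

```python
from bisect import bisect_right

AGE_LOWER_BOUNDS = [45, 55, 65, 75]  # lower bounds of the second to fifth groups
AGE_LABELS = ["0-44", "45-54", "55-64", "65-74", "75+"]


def age_group(age_years):
    """Return the Table 4 age-group label for an age in completed years."""
    # bisect_right counts how many lower bounds are <= age, which indexes the label.
    return AGE_LABELS[bisect_right(AGE_LOWER_BOUNDS, age_years)]
```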

Table 4: Sample from the Seoul (South Korea) registry: Site-wise number of cases, 5-year absolute and relative survival by sex and relative survival by age group, Seoul, South Korea, 1993–1997 cases followed-up until 2001

Table 5: Analysis of survival by clinical extent of disease

This is carried out for selected cancer sites only: cancers of the head and neck, female breast, cervix and ovary. The frequency (%) of cases by classified clinical extent of disease categories and the corresponding 5-year absolute survival are presented as a table.

Figure 2 depicts either the absolute survival by clinical extent of disease for available cancer sites and registries or the trend of survival by cohort and period approaches, as appropriate.

Figures 2: Sample from the Costa Rica registry: Up-to-date 5-year relative survival estimates over the calendar periods by period and cohort approaches for selected cancers, Costa Rica, 1995–2000

Table 6: Analysis of trend of survival

This is done in two ways:

  • For registries that provided data for any one preceding calendar period, the 5-year absolute and relative survival were estimated by the semi-complete approach for the two periods for the available cancer sites/types and presented as a table.

  • For registries that provided data for more than two 5-year calendar periods preceding the latest one, the 5-year absolute and relative survival were estimated for the latest two calendar periods for the available cancer sites/types and presented as a table. Additionally, the frequency of cases by 5-year calendar period and cancer site/type is given in a table. Correspondingly, the five-, ten- and fifteen-year relative survival were estimated by cohort and period approaches for the available cancer sites/types, depending on the availability of data, and presented as one or two tables, as necessary. These are also depicted as figures.

Table 6: Sample from the Cuba registry: Comparison of 5-year absolute and relative survival of cases diagnosed between 1988–1989 and 1994–1995, Cuba

Figures 3: Sample from the Singapore registry: Up-to-date 5-year relative survival of selected cancers by period and cohort approaches, Singapore

Online features of the publication

A dedicated website has been designed to host the new version of Cancer Survival in Africa, Asia, the Caribbean and Central America (survCan), available at http://survcan.iarc.fr. Users will be able to access all the chapters of the publication, including abstracts, tables and figures, and will be able to export each chapter in full or part in PDF format. Users will also be able to access the previous volume of the publication (1998) if available. References cited in the chapters are directly linked to PubMed or to the specific website of the publication as appropriate. Online dynamic functions are also supplied to let users generate comparative statistics: users will be able to list all the available tables/figures for a registry, compare values between registries for a specific cancer site, based on ICD-10 codes, and generate specific dynamic figures on survival statistics (5-year absolute survival or 5-year relative survival, etc.) according to the sex, age group and extent of disease. An online help tool is available to facilitate the use of the online statistical functions.


References

  1. Parkin DM, Whelan SL, Ferlay J and Storm H. Cancer Incidence in Five Continents, Vol I to VIII: IARC Cancerbase No. 7. IARCPress, Lyon, 2005.
    (link to CI5)

  2. Swaminathan R, Black RJ and Sankaranarayanan R. Database on Cancer Survival from Developing Countries. In: Cancer Survival in Developing Countries (eds) R.Sankaranarayanan, RJ Black and DM Parkin. IARC Scientific Publications No. 145. IARCPress, Lyon, 1998.
    (link to Cancer Survival, volume 1)

  3. WHO. International Classification of Diseases for Oncology (ICD-O), First edition. World Health Organization, Geneva, 1976.
    (link to WHO ICD-O website)

  4. WHO. International Classification of Diseases for Oncology (ICD-O), Second edition. World Health Organization, Geneva, 1990.
    (link to WHO ICD-O website)

  5. WHO. International Classification of Diseases for Oncology (ICD-O), Third edition. World Health Organization, Geneva, 2000.
    (link to WHO ICD-O website)

  6. WHO. International Classification of Diseases, Ninth Revision (ICD-9). World Health Organization, Geneva, 1976.
    (link to WHO ICD-10 website)

  7. WHO. International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10), Volume 1. World Health Organization, Geneva, 1992.
    (link to WHO ICD-10 website)

  8. Ferlay J. IARCcrgTools, Version 1.01. IARCPress, Lyon, 2003.
    (link to IARCcrgTools website)

  9. Parkin DM, Whelan SL, Ferlay J, Teppo L and Thomas DB. Cancer Incidence in Five Continents, vol VIII. IARC Scientific publications No 155. IARCPress, Lyon, 2002.
    (link to CI5)

  10. Berrino F, Sant M, Verdecchia A, Capocaccia R, Hakulinen T and Esteve J. (eds) Survival of Cancer Patients in Europe: the EUROCARE Study. IARC Scientific Publications No. 132. IARCPress, Lyon, 1995.
    (link to EUROCARE-4)