Reports and guidance documents; availability, etc.: National Center for Health Statistics; Research Data Center use; operational procedures and costs

[Federal Register: November 18, 2004 (Volume 69, Number 222)]

[Notices]

[Page 67584-67592]

From the Federal Register Online via GPO Access [wais.access.gpo.gov]

[DOCID:fr18no04-91]

DEPARTMENT OF HEALTH AND HUMAN SERVICES

Centers for Disease Control and Prevention

Procedures and Costs for Use of the Research Data Center

AGENCY: National Center for Health Statistics, Centers for Disease Control and Prevention (CDC), Department of Health and Human Services (HHS).

ACTION: Notice and request for comments.

SUMMARY: This notice provides information about the Research Data Center (RDC) operated by the National Center for Health Statistics (NCHS) within the Centers for Disease Control and Prevention (CDC). The Research Data Center was established in 1998 to provide a mechanism whereby researchers can access detailed data files in a secure environment, without jeopardizing the confidentiality of respondents. Historically, the data files accessed in the RDC have consisted of NCHS survey data. RDC has recently begun accepting data files that were not produced from NCHS survey data. In order to assure that all data files are processed in a consistent manner, the original guidelines for accessing files in the RDC are being reviewed and revised as necessary. As part of the revision process, potential users are being given the opportunity to provide input on how the procedures of the RDC can best serve their research needs. This notice describes how to submit proposals requesting use of the data, mechanisms to access the RDC, requirements, use of outside data sets, costs for using the RDC, and other pertinent topics. We are seeking comments on these procedures and will post the final procedures on the NCHS Web site.

DATES: Submit comments on or before December 9, 2004.

ADDRESSES: Send comments concerning this notice to Ken Harris, National Center for Health Statistics, 3311 Toledo Road, Room 3210, Hyattsville, MD 20782, or e-mail to kwharris@cdc.gov.

FOR FURTHER INFORMATION CONTACT: Ken Harris at (301) 458-4262.

SUPPLEMENTARY INFORMATION:

Operational Procedures for Use of the Research Data Center; National Center for Health Statistics; Centers for Disease Control and Prevention

Table of Contents

Purpose
Background
Research Data Center--Operations
Submission of Research Proposals Using NCHS Data
Researcher-Supplied Data
General Requirements for Guest Researchers
General Requirements for Remote Access
Use of RDC/NCHS
Costs for Using the RDC
Disclosure Review Process
Appendix I--Examples of Data Available through the NCHS RDC
Appendix II--Requirements for the Release of NCHS Micro Data
Appendix III--Disallowed SAS Functions, Statements, and Procedures
Appendix IV--Project-Specific Requirements
  Vaccine Safety Datalink Files
Appendix V--Agreement Regarding Conditions of Access to Confidential Data in the Research Data Center of the National Center for Health Statistics
Appendix VI--Researcher Affidavit of Confidentiality

Operational Procedures for the Use of the Research Data Center, National Center for Health Statistics (NCHS); Centers for Disease Control and Prevention (CDC)

Purpose

This document provides information about the National Center for Health Statistics' (NCHS) Research Data Center (RDC), including how to submit proposals requesting use of data, mechanisms to access the RDC, requirements, use of outside data sets, costs for using the RDC, and other pertinent topics. The Guidelines pertain to use of data produced by NCHS and non-NCHS entities. If, after reading these guidelines, you have further questions, you may seek clarification through e-mail (RDCA@cdc.gov) or by contacting Ken Harris at (301) 458-4262 or by e-mail at kwharris@cdc.gov. The procedures described for use of the RDC are under constant review to improve RDC operations and to be responsive to changes in the environment that affect confidentiality protections. Please check the NCHS Web site or contact the RDC to determine if modifications have been made.

Background

In order to advance knowledge on the health and well-being of the nation and its health care system, NCHS and other organizational entities in the Department of Health and Human Services release statistical micro data containing health and related variables. These files allow outside researchers and analysts to develop statistics and conduct independent research. However, any release of data, whether micro data files or the results of statistical analyses, must be consistent with the confidentiality provisions under which the data were collected. For the case of data collected or obtained by NCHS, Section 308(d) of the Public Health Service Act (42 U.S.C. 242m(d)) and the NCHS Staff Manual on Confidentiality do not permit the release of data that are either identified or identifiable to persons outside of NCHS. In order to preserve privacy and confidentiality, details that might identify or facilitate the identification of persons and organizations participating in surveys and data systems are suppressed in published data products. Examples of data elements that might be abridged are geographic identifiers, details of sample design, and variables such as age or income that might exist in other databases.

Despite the wide dissemination of data through publications, CD-ROMs, etc., the inability to release files with, for instance, lower levels of geography severely limits the utility of some data for research, policy, and programmatic purposes and sets a boundary on one of the goals of the U.S. Department of Health and Human Services, i.e., to increase our capacity to provide state and local area estimates. In pursuit of this goal and in response to the research community's interest in restricted data, NCHS established the Research Data Center (RDC), a mechanism whereby researchers can access detailed data files in a secure environment, without jeopardizing the confidentiality of respondents. The RDC provides restricted access to NCHS data. The RDC also accepts outside data sets. Appendix I contains information about some of the data sets currently available in the RDC.

Special requirements for use of non-NCHS data can be found in Appendix IV, Project-Specific Requirements.

Authority: Sections 306 and 308 of the Public Health Service Act (42 U.S.C. 242k and 242m).

Research Data Center (RDC)--Operations

The NCHS RDC is a research facility located at the NCHS headquarters in Hyattsville, MD, where researchers meeting certain qualifications are allowed access, under strict supervision, to restricted statistical micro data files. To gain access to the RDC, researchers must submit a proposal for review and approval. Researchers can use one of three access methods (described below): (1) direct access through local computing resources in the RDC that accommodate visiting researchers; (2) a remote program submission system through which researchers can submit work to be done in the RDC with the output returned to them by e-mail; or (3) programming services for outside researchers provided by RDC staff. In all three methods, confidential data files remain in the RDC where access to unit records is restricted, and output is inspected before it leaves the RDC.

As currently designed, the NCHS RDC facility in Hyattsville has four user workstations and a secure room for the RDC printer. In addition, there is office space for the RDC staff and long-term outside researchers.

The RDC computers have no electronic link to the NCHS network, the CDC-NCHS mainframe, or the Internet. The RDC workstations consist of Pentium III 933 MHz computers running Windows 2000. There is sufficient storage on the workstations and the server for any confidential data. PC-SAS®, SUDAAN®, Watcom Fortran 77®, and Stata® are installed on the workstations, and additional programming/analytic languages can be added as needed.

The computers have been configured so that removable media such as floppy disks are inaccessible to users. All print output is routed to a central printer which is monitored by RDC staff while the RDC is open to external researchers. Further, the system's workstations are configured such that researchers are given read-only access to requested data files and can write only onto the local workstation's hard disk. These restrictions ensure that users cannot remove information that has not been subjected to a review for confidentiality.

The three methods of access to restricted data through the Data Center include:

(1) Guest Researcher (on site)--The researcher submits a research proposal to the RDC and, upon approval, conducts his/her research on site at NCHS in the RDC. RDC staff constructs the necessary data files before the guest researcher arrives and ensures that no restricted data leave the facility. Data from virtually all of the NCHS data collection systems may be made available through the RDC. Also available are data from other data collection systems.

PC-SAS®, SUDAAN®, Watcom Fortran 77®, and Stata® are installed on the RDC workstations. Other programming languages or data analysis packages can be made available with sufficient lead time.

Researchers may take the results of their analyses off-site only after disclosure review by NCHS RDC staff. Disclosure review consists of looking for tabular cells with fewer than 5 observations, tables with geographic variables in any dimension, models with geographic variables (or variables tantamount to geographic variables) as outcome variables, and case listings. In general, disclosure review is consistent with the guidelines published in the NCHS Staff Manual on Confidentiality (see Appendix II, Requirements for the Release of NCHS Micro Data Files).

(2) Remote Access--Users electronically submit analytical computer programs written in SAS®. After their proposals are approved, researchers are registered with the RDC remote access system and introduced to the procedures and programming limitations to be followed in accessing data. Researchers send programs to the RDC and receive output by e-mail. RDC staff prepares the requested data files, which may consist of confidential data merged with user data. Both submitted programs and output undergo a programmed disclosure limitation review and are also subject to a manual review. Certain procedures and SAS® functions are not allowed (see Appendix III, Disallowed SAS® Functions, Statements, and Procedures, for a complete list). For example, users cannot use PROC TABULATE or PROC IML, nor are functions allowed that are capable of producing listings of individual cases, such as LIST and PRINT. Additionally, functions that may select individual cases are not allowed (_N_, FIRST., LAST., and others). The output is scanned for cells containing fewer than five observations. If any are found, not only is that cell suppressed, but several additional cells are also suppressed (complementary suppression). Alternatively, the researcher may be asked to revise and resubmit his/her analyses. The job log is also scanned with particular attention to certain types of error conditions that may spawn case listings. Some projects are not suitable for the remote access method. Stewards of the file(s), in consultation with RDC staff, make this determination.

(3) RDC Staff-Assisted Research--This is mainly useful for those planning to use statistical software programming languages other than SAS® or who are not able to travel to the RDC facility. Under this method, an approved researcher e-mails a statistical software program to the assigned RDC staff person, who runs the program and, after disclosure review, provides the output to the researcher by e-mail. More extensive programming services are also available.
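The small-cell check described under Remote Access can be illustrated with a short sketch. The threshold of five comes from the text above; the complementary suppression rule shown here is only an assumed heuristic (whenever a row or column has exactly one suppressed cell, the smallest remaining cell in that row or column is also suppressed, so that the hidden value cannot be recovered from a marginal total). The notice does not state which rule the RDC actually applies, and the function names are hypothetical.

```python
THRESHOLD = 5  # from the notice: cells with fewer than five observations


def primary_suppression(table, threshold=THRESHOLD):
    """Return the (row, col) positions whose count is below the threshold."""
    return {
        (r, c)
        for r, row in enumerate(table)
        for c, count in enumerate(row)
        if count < threshold
    }


def complementary_suppression(table, suppressed):
    """Assumed heuristic, not the RDC's documented rule.

    Repeat until stable: if a row or column contains exactly one suppressed
    cell, also suppress its smallest visible cell.
    """
    suppressed = set(suppressed)
    n_rows, n_cols = len(table), len(table[0])
    # Every row and every column, expressed as a list of cell positions.
    lines = [[(r, c) for c in range(n_cols)] for r in range(n_rows)]
    lines += [[(r, c) for r in range(n_rows)] for c in range(n_cols)]
    changed = True
    while changed:
        changed = False
        for line in lines:
            hidden = [p for p in line if p in suppressed]
            visible = [p for p in line if p not in suppressed]
            if len(hidden) == 1 and visible:
                extra = min(visible, key=lambda p: table[p[0]][p[1]])
                suppressed.add(extra)
                changed = True
    return suppressed


def render(table, suppressed):
    """Replace suppressed counts with '*' for display."""
    return [
        ["*" if (r, c) in suppressed else str(v) for c, v in enumerate(row)]
        for r, row in enumerate(table)
    ]


counts = [[12, 3, 20], [15, 9, 30], [40, 25, 18]]  # made-up example counts
primary = primary_suppression(counts)
final = complementary_suppression(counts, primary)
for row in render(counts, final):
    print(" ".join(row))
```

In this made-up table only one cell falls below five, yet four cells end up suppressed, which shows why a single small cell can remove a noticeable share of a table.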

Each of the access methods outlined above has an associated cost which includes equipment and space rental, staff overhead, and setup. The staff overhead and setup include the time and resources necessary for monitoring progress, setting up equipment and data files, disclosure limitation review, and file management. Since these reflect varying demands on resources, accurate cost estimates cannot be given without complete knowledge of the proposed research. In general, though, the setup fee is $500 per day of effort (see Costs for Using the RDC, below).

Submission of Research Proposals Using NCHS Data

Researchers must submit proposals that are detailed enough in their data specifications to permit RDC staff to easily determine what data elements are required. Prospective researchers are encouraged to check with RDC staff prior to writing their proposals to ensure that the data of interest can be made available to them. Researchers should develop their proposals in a way that facilitates the ability of the RDC staff to create the analytic files required by the project. Proposals should be explicit regarding the variables needed as well as any case selection required. Only those data items required to conduct the proposed analyses will be included in the analytic data file and the proposals should address why the requested data are needed for the proposed study. Overly large and complex projects or poorly defined projects will require extensive communication between RDC staff and the researchers proposing the project, and this can cause the process to move slowly. Work to prepare data files can be accomplished most expeditiously if large, complex projects are subdivided into manageable parts and requested data are clearly defined.

Researchers wishing to link data in the RDC with external data should provide the external data to RDC staff in advance of their entry to and use of the RDC (a minimum of 7 days prior to the approved date for access to the RDC).

The RDC expects that all researchers will adhere to established standards and principles for carrying out statistical research and analyses. Researchers must conduct only those analyses which received approval. Failure to comply will result in cancellation of the research activity and potential debarment from future research activities in the RDC. In the case where Institutional Review Board (IRB) approval is required to conduct research, RDC staff will notify relevant IRBs of infringements of protocol approvals.

Appendix IV (Project-Specific Requirements) contains information on submitting a research proposal requesting use of data other than those produced by NCHS. The format detailed below pertains specifically to use of NCHS data. If no project specific requirements are provided for non-NCHS data, the format below is to be used.

(1) The research proposal must contain the following information:

  1. Cover letter.

  2. Project Title.

  3. Abstract: approximately 100-300 words summarizing the project.

  4. Full personal identification, institutional affiliation, mailing addresses (including overnight express mail address), phone, and e-mail address. Applicants who are students must append a letter from the department chair or advisor stating that the applicant is a student working under the direction of the department.

  5. Dates of proposed tenure at the RDC (or use of the remote access system). Proposals requesting remote access should include an appendix describing the computer and e-mail account that will receive output as well as the security provisions established for them.

  6. Source of funding for the proposed project.

  7. Background of study:

    1. Key study questions or hypotheses.

    2. Public health benefits.

  8. A summary of the data requirements for the proposed research along with an explanation of why the data are needed for the proposed study.

    1. Identification of cases to be included in the analytic file.

    2. Identification of variables to be included in the analytic file.

    3. Data to be supplied by the researcher and merged with NCHS or other data.

    4. A description of why publicly available data are insufficient.

  9. Methods for the study:

    1. Analytic strategy and statistical methods to be used.

    2. Software requirements (currently, PC-SAS® for Windows®, Stata®, SUDAAN®, LIMDEP®, HLM®, SPSS®, and Watcom Fortran 77® are available in the RDC; other languages can be made available with sufficient lead time).

  10. A description of the output that the researcher intends to have reviewed for non-disclosure. This should include table shells, model equations, or test statistics of any output that the researcher plans to remove from the RDC. This will help the reviewers to determine the risk of disclosure.

  11. Appendices.

    1. A current resume or Curriculum Vitae for each person who will participate in the research activity. Resumes or CVs must specify nationality.

    2. A letter from student applicant's department chair or academic advisor stating that student is working under the direction of the department.

    3. A data dictionary: a complete listing of the specific data requested--data system, files, years, cases, variables, matching or linking variables, etc.

    4. A data dictionary for researcher-supplied data, if any, to be merged with the confidential data. This includes identifying the source of the data, variable names, variable codes or ranges, file layout, number of records, and restrictions on NCHS use of the data (currently the RDC policy prohibits release of merged data to anyone other than the prospective researcher).

    5. A description of the computer and e-mail system to be used to receive output from the remote access system as well as the security provisions established for them.

    Portions of doctoral proposals or grant applications with appropriate modifications may suffice for the research proposal.

    Proposals to use the Research Data Center should be sent to:

    Research Data Center, National Center for Health Statistics, 3311 Toledo Road, Suite 4113, Hyattsville, MD 20782, RDCA@cdc.gov.

    Upon receipt, the Research Proposal will be evaluated by a review committee convened for that purpose. The Proposal Review Committee consists of (at minimum) the director of the NCHS RDC, the RDC staff liaison, the NCHS Confidentiality Officer, and the director (or designee) of the NCHS data division whose data are requested in the proposal. Proposals for use of non-NCHS data undergo review as determined by the steward/s of those data.

    (2) The following criteria apply to proposal review for projects requesting use of NCHS data:

  1. Scientific and technical feasibility of the project;

  2. Availability of resources at the RDC;

  3. Risk of disclosure of restricted information; and

  4. For projects using NCHS data, whether the proposed project is in accordance with the mission of the NCHS to provide statistical information that will guide actions and policies to improve the health of the American people.

    Researchers should note that approval of their application does not constitute endorsement by NCHS of the substantive, methodological, theoretical, or policy relevance or merit of the proposed research. NCHS approval only constitutes a judgment that this research, as described in the application, is not an illegal use of the requested data file and that there is high probability that the project can be successfully done in the RDC.

    Researcher-Supplied Data

    The RDC allows researchers to supply their own data to be linked with RDC data sets to create merged data sets that will be stored in the RDC. The researcher-supplied data may consist of proprietary data collected and "owned" by the researcher or other publicly available data obtained by the researcher, such as census data. Researchers MUST provide RDC staff with complete documentation of any data proposed to be merged with RDC data. Researchers expecting to use merged files are responsible for interacting with RDC staff to ensure that their data can be merged with the data resident at the RDC and that the format of the data is consistent with the RDC data. The RDC will accept user data files in SAS®, Stata®, or ASCII format (flat files) with variables either column-delimited or column-specific. Other formats may also be proposed. RDC staff will merge researcher-supplied data with RDC data sets before the researcher arrives. Identifying information in linking fields will be removed after the merge and will not be made available to the researchers.
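The merge step described above (link on a shared field, then remove the linking field before the file reaches the researcher) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the field names (`county_fips`, `bmi`, `median_income`), and the choice to keep every RDC record even when no researcher-supplied record matches are all hypothetical, not taken from RDC procedures.

```python
def merge_and_strip(rdc_records, user_records, link_field):
    """Attach researcher-supplied fields to RDC records, then drop the link field.

    Assumption: every RDC record is kept (a left join); records without a
    match simply gain no extra fields.
    """
    user_by_key = {rec[link_field]: rec for rec in user_records}
    merged = []
    for rec in rdc_records:
        extra = user_by_key.get(rec[link_field], {})
        combined = {**rec, **extra}
        # The identifying link field is removed after the merge.
        combined.pop(link_field, None)
        merged.append(combined)
    return merged


# Made-up records for illustration only.
rdc = [
    {"county_fips": "24033", "bmi": 27.1},
    {"county_fips": "24031", "bmi": 23.4},
]
user = [{"county_fips": "24033", "median_income": 52000}]

analytic_file = merge_and_strip(rdc, user, "county_fips")
print(analytic_file)
```

The analytic file carries the contextual variable but no longer carries the geographic identifier that was used to attach it.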

    Owners or stewards of RDC data sets make the determination of whether and how the resultant merged files would be made available to other researchers. For RDC files that are owned by NCHS, this determination is made by the owners of the researcher-supplied data that will be merged with the NCHS-owned RDC files. For files that are NOT owned by NCHS, the determination is made by the stewards or owners of the RDC files. The owners of these files can require that any merged files be made available to all interested researchers or allow this determination to be made by the owners of the researcher-supplied data.

    The RDC periodically creates and maintains backup copies of all computer files. Backup files are stored in a secure storage area accessible by RDC staff only, although they may be made available to researchers who need to return for additional analyses. These backup files will contain user-supplied data as well as the merged files. These backup files will be destroyed only upon the written request of the user.

    General Requirements for Guest Researchers

    1. Researchers must work under the supervision of RDC staff and only during normal working hours (Monday-Friday, 8:30 a.m.-5 p.m.). Admittance to the RDC will be limited to the researchers whose names are included in the Research Proposal (Section D). Researchers will be required to show photo identification before admittance. A maximum of 3 collaborating researchers can sit at a computer station in the RDC.

    2. Computers will be pre-loaded with the approved datasets by NCHS staff approximately one day prior to the external researcher's use of the RDC. Once the analysis is completed, NCHS staff will remove the datasets from the RDC computer.

    3. Guest researchers must be able to conduct their analyses with the software specified in their research proposal.

    4. External researchers are not allowed to bring documents, manuals, books, etc., that may enable them to identify and disclose confidential information they access in the RDC. Neither are they allowed to bring into the RDC cell phones, pagers, or other devices which would enable them to communicate with persons outside of the RDC.

    5. All logs will be printed or electronically archived and will be kept by NCHS. NCHS will retain only the programs and procedures run by external researchers. The logs will not include results from their research.

    6. All computer output generated by statistical programs and all hand-written notes based on such computer output are subject to disclosure review by NCHS staff before removal from the RDC. Output is restricted to summary tables of geographic or patient-level data; for example, line listings of diagnoses by study identifier are prohibited.

    7. Guest researchers may not save output, files, or programs to transportable electronic media. RDC staff can copy output or programs to transportable media, if requested.

    8. Researchers proposing multiple analyses that employ multiple data sets will have access to only one dataset at a time. Under no circumstances will researchers be permitted any opportunity to merge datasets on their own.

    General Requirements for Remote Access

    1. Researchers must register an e-mail address that is credibly secure. Although programs can be sent to the RDC from any address, results will always be returned to the registered e-mail address.

    2. Data requests must be in the form of SAS® programs (Version 8.2). However, certain SAS® commands/statements are not allowed through remote access. A list of such commands/statements is included in Appendix III. This list is periodically reviewed and may be modified as necessary. The SAS® program must be in plain ASCII format.

    3. During the first week of registration, researchers' data requests are executed in a manual mode, requiring RDC staff to review the program and resulting output before its release. During this period, remote access is available only during normal working hours. After the first week, researchers may submit data requests any time (day or night) and receive prompt response, except when the CDC e-mail system is down or when the remote access system is taken off-line for maintenance.

    4. The remote access system does not allow users to write permanent datasets in its disk space. Jobs that attempt to create permanent datasets or files are flagged and terminated, and an error message is sent to the researcher.

    5. The remote access system limits researchers' time and storage. No single program is allowed more than one hour to complete execution or to generate output in excess of 1.5 MB.

    6. With one exception, macros are not allowed through the remote access system. The exception, GLIMMIX®, requires special permission.
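Researchers may find it useful to screen a program for disallowed constructs before submitting it. The sketch below is an assumption-laden illustration, not the RDC's scanner: it checks only the constructs named in this notice (PROC TABULATE, PROC IML, LIST, PRINT, FIRST., LAST., and macro definitions), the regular expressions are guesses at how those constructs appear in SAS code, and the complete list in Appendix III is not reproduced here.

```python
import re

# (label, pattern) pairs. The patterns are assumed for illustration and cover
# only constructs named in the notice, not the full Appendix III list.
DISALLOWED = [
    ("PROC TABULATE", r"\bproc\s+tabulate\b"),
    ("PROC IML", r"\bproc\s+iml\b"),
    ("PROC PRINT", r"\bproc\s+print\b"),
    ("LIST statement", r"\blist\s*;"),
    ("FIRST.", r"\bfirst\.\w+"),
    ("LAST.", r"\blast\.\w+"),
    ("macro definition", r"%macro\b"),
]


def screen_program(sas_source):
    """Return labels of disallowed constructs found in the program text.

    This is a plain text search: it does not parse SAS, so a keyword inside
    a comment or a quoted string is flagged as well.
    """
    return [
        label
        for label, pattern in DISALLOWED
        if re.search(pattern, sas_source, flags=re.IGNORECASE)
    ]


flagged = screen_program(
    "data a; set b; by id; if first.id; run;\n"
    "proc tabulate data=a; class x; table x; run;"
)
clean = screen_program("proc means data=a; var x; run;")
print(flagged)
print(clean)
```

A check of this kind can catch an obvious rejection before submission, but the RDC's own programmed and manual reviews remain the authority on what is allowed.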

      Use of the RDC

      In order to get access to restricted data files in the RDC, researchers must include in their proposals a signed "Agreement Regarding Conditions of Access to Confidential Data in the Research Data Center of the National Center for Health Statistics" (Appendix V). All researchers participating on an approved project must sign the agreement, which clearly states the penalties for violating the conditions of agreement. In addition, each researcher must sign an "Affidavit of Confidentiality" (Appendix VI). The RDC reserves the right to terminate any project at any time that it deems that an investigator's actions will compromise confidentiality or ethical standards of behavior in a research environment.

      Statistical micro data files are collections of data from individual units such as persons or providers. Statistical agencies worldwide are bound by ethical and legal requirements to preserve the privacy of individual respondents and the confidentiality of data provided to the agency by them or otherwise pertaining to them. As mentioned earlier, confidentiality protection at NCHS is governed by Section 308(d) of the Public Health Service Act (42 U.S.C. 242m). This section states that:

      No information, if an establishment or person supplying the information or described in it is identifiable, obtained in the course of activities undertaken or supported under section 304, 306, or 307 may be used for any purpose other than the purpose for which it was supplied unless such establishment or person has consented (as determined under regulations of the Secretary) to its use for such other purpose and in the case of information obtained in the course of health statistical or epidemiological activities under section 304 or 306, such information may not be published or released in other form if the particular establishment or person supplying the information or described in it is identifiable unless such establishment or person has consented (as determined under regulations of the Secretary) to its publication or release in other form.

      Having read and familiarized themselves with the Researcher Affidavit of Confidentiality, including Section 308(d) of the Public Health Service Act (42 U.S.C. 242m) (see below), researchers agree:

    1. To make no copies of any files or portions of files to which they are granted access except those authorized by NCHS Research Data Center staff.

    2. To return to RDC staff all NCHS restricted materials with which they may be provided during the conduct of their research at NCHS and other materials as requested.

    3. Not to use ANY technique in an attempt to learn the identity of any person, establishment, or sampling unit not identified on public use data files.

    4. To hold in strictest confidence the identification of any establishment or individual that may be inadvertently revealed in any documents or discussion, or analysis. Such inadvertent identification revealed in their analyses will be immediately brought to the attention of RDC staff.

    5. Not to remove any printouts, electronic files, documents, or media until they have been scanned for disclosure risk by RDC staff.

    6. Not to remove from NCHS any written notes pertaining to the identification of any establishment, individual, or geographic area that may be revealed in the conduct of their research at NCHS.

    7. To the inspection of any material they may bring to or remove from the NCHS Research Data Center.

    8. To comport themselves in a manner consistent with principles and standards appropriate to a scientific research establishment.

      Appendix V, Agreement Regarding Conditions of Access to Confidential Data in the Research Data Center of the National Center for Health Statistics, signed by all investigators on the project, must be submitted with the initial proposal.

      Deliberate violation of any of these conditions may result in cancellation of the data access, and the researcher may be escorted from the premises by the duly authorized Federal protection service on duty at NCHS. The researcher may also be barred from any future use of the RDC upon review and determination by the Director of NCHS that this is necessary to protect the integrity and confidentiality of the RDC.

      The RDC technical monitor will perform a disclosure review and must provide approval to the researcher before removal of any data from the RDC, whether it is in electronic or paper form. Any violation by the researcher may be punishable by fine or imprisonment for up to 5 years or both under Title 18 U.S.C. 1001.

      As noted above, the RDC contains workstations with computers pre-loaded by NCHS staff with the requested dataset(s) to be analyzed with statistical software. External researchers must schedule time for use of the RDC, pay the appropriate user fees, and abide by the standard practices of the RDC. The requirements include restrictions on equipment that can be brought into the RDC, signed agreements to maintain confidentiality, and review of all results for potential breaches of confidentiality.

      Costs for Using the RDC

      Time in the RDC can be scheduled in increments ranging from a consecutive 2-day minimum to a consecutive 10-day maximum. Extensions can be negotiated with RDC staff subject to scheduling requirements. Scheduling time at the RDC is on a first-come, first-served basis.

      Researchers using the NCHS RDC will be charged for space and equipment rental and staff time necessary for supervision, disclosure limitation review, maintenance of computer facilities (including both hardware and software), and the creation and maintenance of data files required by the researcher. The cost per project (or creation of an analytic file) is given in the table below:

      Guest Researcher (on site)............. $200 per day (2-day minimum).
      Remote Access.......................... $500 per month for files with fewer than 130,000 records.
                                              $1,000 per month for files with 130,000 records or more.
                                              $500 per year for selected standard files.*

      * There are selected files that have been developed for repeat and multiple users which require minimal set up procedures and involve minimal content changes to the file when preparing for different users. For that reason, charges for accessing these files are considerably less expensive than the regular fees. Two files fall under this category: the contextual data file for the National Survey of Family Growth (NSFG-CDF) and the Polio file for the National Health Interview Survey (NHIS-Polio). The cost for accessing standard files of this type will be published as the files are developed.

      There is a minimum setup charge of $500 per day for new file creation. An additional $500 per day is charged as needed for file creations and for special handling, such as the merging of additional data or creating custom file formats.
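
      The fee schedule above can be sketched as simple arithmetic. The following Python is an illustration only, not an NCHS tool: the function names are hypothetical, and it assumes that the daily and monthly rates are multiplied by the time used and that file creation charges are added on top. Actual charges, particularly for complex projects, are determined with RDC staff.

```python
# Rates taken from the fee schedule in this notice (U.S. dollars).
ON_SITE_DAILY_FEE = 200          # guest researcher, on site
ON_SITE_MINIMUM_DAYS = 2         # 2-day minimum
REMOTE_MONTHLY_FEE_SMALL = 500   # files with fewer than 130,000 records
REMOTE_MONTHLY_FEE_LARGE = 1000  # files with 130,000 records or more
LARGE_FILE_RECORDS = 130_000
FILE_CREATION_DAILY_FEE = 500    # setup and special handling, per day


def on_site_fee(days: int) -> int:
    """On-site fee for a visit of the given number of days (hypothetical helper)."""
    if days < ON_SITE_MINIMUM_DAYS:
        raise ValueError("on-site visits have a 2-day minimum")
    return ON_SITE_DAILY_FEE * days


def remote_fee(months: int, record_count: int) -> int:
    """Remote access fee; the rate depends on the number of records in the file."""
    if record_count < LARGE_FILE_RECORDS:
        rate = REMOTE_MONTHLY_FEE_SMALL
    else:
        rate = REMOTE_MONTHLY_FEE_LARGE
    return rate * months


def file_creation_fee(setup_days: int, special_handling_days: int = 0) -> int:
    """File creation charge: $500 per day of setup plus $500 per day of special handling.

    Assumption: both kinds of work are billed at the same daily rate and summed.
    """
    return FILE_CREATION_DAILY_FEE * (setup_days + special_handling_days)
```

      For example, under these assumptions a 3-day on-site visit with one day of file setup would come to on_site_fee(3) + file_creation_fee(1), or $1,100.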

      More complex projects may require discussion between the researcher and RDC staff to determine the cost of file creation. Researchers are encouraged to develop their proposals in a way that facilitates the ability of the RDC staff to create the analytic files required by the project. Proposals should be explicit regarding the variables needed as well as any case selection required. Overly large and complex projects will require extensive communication between RDC staff and the researchers proposing the project, and this can cause the process to move slowly. Work to prepare data files can be accomplished most expeditiously if large, complex projects are subdivided into manageable parts.

      Payment is expected in advance of the use of the RDC. A cashier's check or money order made payable to NCHS RDC must be received seven business days prior to the start date scheduled for use of the RDC. Payments should be mailed to: NCHS RDC, Attn: RDC Director, 3311 Toledo Road, Suite 4113, Hyattsville, MD 20782.
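
      The payment deadline above can be estimated by counting back seven business days from the scheduled start date. The sketch below is a rough illustration with a hypothetical function name; it assumes that business days are Monday through Friday and ignores Federal holidays, so researchers should confirm the actual deadline with RDC staff.

```python
from datetime import date, timedelta


def payment_due_date(start: date, business_days: int = 7) -> date:
    """Latest date by which payment should be received (hypothetical helper).

    Counts back the given number of business days from the start date,
    skipping Saturdays and Sundays. Federal holidays are NOT handled.
    """
    current = start
    remaining = business_days
    while remaining > 0:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current
```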

      Disclosure Review Process

      The disclosure review process in the RDC is centered on a rigorously

      [[Page 67589]]

      conducted research base. Briefly, RDC staff, either independently or in collaboration with staff from other areas of the NCHS, other government agencies, and non-governmental researchers, conduct research into the use of technological and statistical advances to develop and refine additional methods to access restricted data such as the use of the internet or encrypted data, assessment of disclosure risk through statistical and automated procedures, and the use of disclosure limitation methodologies (e.g., statistical noise) to enable the release of otherwise restricted data files. The results of these research activities are applied to disclosure review activities in the RDC.

      Researchers may take the results of their analyses off-site after disclosure review by RDC staff. Disclosure review consists of looking for tabular cells with fewer than 5 observations, tables with geographic variables in any dimension, models with geographic variables (or variables tantamount to geographic variables) as outcome variables, or line listings. In general, disclosure review is consistent with the guidelines published in the NCHS Staff Manual on Confidentiality (see Appendix II, Requirements for the Release of Micro Data).

      RDC staff review data summaries to assure maintenance of respondent confidentiality. In no case may any table contain cells with fewer than 5 observations. If found, these small cells are suppressed, generally by obliterating the cell. To assure that small cells cannot be calculated from the other cells in the same row or column, staff also make illegible the totals for the row and column corresponding to the small cell. Once disclosure review is completed, researchers receive a photocopy of the final tabulations.
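
      The small-cell suppression rule described above can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure RDC staff actually use: the function name is hypothetical, a table is represented as a list of rows of counts, empty cells (count of zero) are treated as small cells, and a suppressed value is represented by None.

```python
from typing import List, Optional, Tuple

SMALL_CELL_THRESHOLD = 5  # cells with fewer than 5 observations are suppressed

Counts = List[List[int]]
Suppressed = List[List[Optional[int]]]


def suppress_small_cells(
    table: Counts,
) -> Tuple[Suppressed, List[Optional[int]], List[Optional[int]]]:
    """Return (cells, row_totals, column_totals) with small cells blanked out.

    A cell below the threshold is replaced by None, and the totals for its
    row and its column are also replaced by None so that the small count
    cannot be recovered by subtraction within that row or column.
    """
    n_cols = len(table[0]) if table else 0
    row_totals: List[Optional[int]] = [sum(row) for row in table]
    col_totals: List[Optional[int]] = [
        sum(row[c] for row in table) for c in range(n_cols)
    ]
    cells: Suppressed = [list(row) for row in table]

    for r, row in enumerate(table):
        for c, count in enumerate(row):
            if count < SMALL_CELL_THRESHOLD:
                cells[r][c] = None
                row_totals[r] = None
                col_totals[c] = None
    return cells, row_totals, col_totals
```

      The sketch deliberately reports no grand total, since a grand total combined with the remaining visible cells could allow a suppressed count to be recalculated.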

      When reviewing cross-tabulations for small cells, RDC staff use the following procedures:

    1. Shred all tables having fewer than five total observations (table total);

    2. Shred all tables having fewer than five observations in each cell;

    3. If the table passes the first two criteria, RDC staff will review the table one row at a time;

    4. Make illegible all counts and percents for cells with four or fewer observations;

    5. If one row cell is [...]

      Some IRBs mandate that datasets be destroyed after research is completed.

      The principal investigator may no longer be affiliated with VSD or the collaborating MCOs; therefore, the location of the dataset is unknown.

      Rapidly changing technology can mean that data are on obsolete media.

      Following receipt of a proposal for a reanalysis, the RDC will verify that the data variables requested from the published study are available. If these data are not available (for one or more of the reasons stated above), the RDC will notify the external researcher. Documentation for variables and datasets used in VSD studies completed after August 2002 is maintained according to the CDC data sharing policy regarding archival of data, which is available on the Web at http://www.cdc.gov/od/ads/pol-385.htm.

      All proposals requesting use of VSD data should contain the following information:

  A. Project Title.

  B. Name of proposed investigator and collaborators (RDC rules limit the number of persons at a work station to 3 at a time).

  C. Name of point of contact, address, telephone number, and e-mail address.

  D. Summary of proposed study (i.e., background, reasons for conducting the study, public health benefits).

  E. Specific hypothesis for new vaccine safety studies to be investigated or title of published VSD study to be reanalyzed.

  F. Proposed methodology for new vaccine safety studies or the specification of the methods used in published VSD studies:

    1. Definition of the study population of interest and type of study to be conducted:

      a. Descriptive studies: specify the variables and values for those variables to be used to select the study population.

      b. Case-control studies: specify criteria for cases and controls.

      c. Cohort studies: specify criteria for the exposed and unexposed population.

      d. For all new vaccine safety studies, please include the following information as

      [[Page 67591]]

      part of the definition of the study population of interest:

      i. Pediatric or adult data (ages 0-17 or 18+).

      ii. Study years of interest (e.g., 199X-2000). Please note that the study years available vary by HMO site.

      iii. How the study population will be selected from the VSD data files based on available fields in the VSD data dictionary.

    2. Specification of the variables that will be required including:

      a. Exposures: Specific criteria defining exposures based on the VSD data dictionary should be included. For instance, specific vaccines given within 14 days of the outcome of interest.

      b. Outcomes: Specific criteria defining those outcomes based on the VSD data dictionary should be included. For instance, specific ICD-9 codes for outcomes of interest and type of health care encounter (hospitalization, outpatient encounter, emergency room visit).

      c. Person Time or Enrollment: Specify criteria to determine calculation of person time, follow-up time, or MCO enrollment restrictions.

      d. Confounding or control variables, including:

      i. Demographic information.

      ii. Pre-existing or co-morbid conditions.

      iii. Concurrent vaccinations.

      iv. MCO Site.

      e. Other required variables to perform the proposed analysis.

  G. Proposed analytic strategies.

      The RDC staff will notify the external researcher whether his/her proposal is complete and whether the requested variables are available. If all the requested data variables can be located for the proposed new vaccine safety studies or proposed reanalysis, review of the proposal by the appropriate MCO IRBs takes place. In compliance with federal law and regulations, access by external researchers to a portion of the VSD data files or to datasets from VSD published studies requires review and approval by the appropriate IRBs of the relevant MCOs. The MCO IRBs have the responsibility to protect the confidentiality and privacy of their members' medical records and to adhere to the rules and regulations applicable to their respective institution(s). Consequently, each of the MCO IRBs must review any request for access to the VSD data files that contain information on its MCO members. Any appeal by the requestor of an IRB decision must follow the national, federal procedures for IRBs. CDC is not involved in the MCO IRB process at any time. General information pertaining to the rules and regulations of IRB submission can be found at http://www.cdc.gov/od/ads/hsr2.htm/.

      Submission of Proposals to MCO IRBs

      Review by an MCO IRB of a proposal submitted by an external researcher does not imply that CDC approves or endorses the external researcher's proposed research. IRB applications may require a more detailed description of the proposed vaccine safety study and may vary according to individual IRB requirements. Furthermore, various IRBs may have different time lines for submission of proposals for review. Each IRB may have specific policies or requirements for data sharing that have not been adopted by the other MCO IRBs. These policies may include required collaboration with an MCO investigator, fees associated with the IRB review process, or differing criteria for the IRB review process.

      MCO IRBs will use their established procedures and time lines to review the proposed research and to consider any appeals. As a rule, IRBs attempt to inform researchers as to the status of their proposals. Approval for access to MCO data contained within the VSD data files does not indicate approval for obtaining additional data contained within the MCO's member medical records or elsewhere, if such data are not contained within the VSD data files that reside in the RDC.

      For new vaccine safety studies, it is possible that an external researcher will receive approval for access to VSD data from some, but not all, relevant IRBs. If this occurs, then the dataset(s) needed to conduct the new vaccine safety study will still be created, but only with data from the MCOs whose IRBs approved access. VSD data sets for new vaccine safety studies must contain data from two or more MCOs. Access will not be provided to data from only one MCO. For reanalysis of a published VSD study, all relevant IRBs from the MCOs that participated in the published study must approve the proposal for reanalysis; therefore, if one or more IRBs do not approve access to VSD data used in the published study, the final dataset cannot be provided.

      Once the external researcher has received a response from all of the appropriate IRBs, the RDC will begin the process of creating or formatting the approved dataset(s). The RDC will not create or prepare the dataset(s) until it receives copies of all final IRB dispositions.
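
      The approval rules above amount to a small decision procedure, which the following Python sketches. It is an illustration only: the function name and the MCO labels used in the example are hypothetical, and the sketch assumes that every relevant IRB has already returned a final disposition recorded as approved or not approved.

```python
from typing import Dict, List


def mcos_for_dataset(approvals: Dict[str, bool], reanalysis: bool) -> List[str]:
    """Return the MCOs whose data may go into the dataset, or [] if none can.

    approvals maps each relevant MCO (hypothetical labels) to its IRB's
    final disposition: True for approved, False for not approved.
    """
    approved = sorted(mco for mco, ok in approvals.items() if ok)

    if reanalysis:
        # Reanalysis of a published study: every participating MCO's IRB
        # must approve, otherwise the dataset cannot be provided.
        all_approved = bool(approvals) and len(approved) == len(approvals)
        return approved if all_approved else []

    # New vaccine safety study: use only the approving MCOs, and require
    # data from at least two MCOs.
    return approved if len(approved) >= 2 else []
```

      For example, a new study approved by the IRBs of "MCO A" and "MCO B" but not "MCO C" would proceed with data from the first two only, while the same outcome for a reanalysis would mean that no dataset is provided.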

      Publication of Research Using VSD Data

      When an external researcher has completed his/her work at the RDC and wishes to publish research results and findings using VSD data, there are specific requirements that must be followed:

      External researchers are required to submit a copy of these data sharing guidelines with any manuscript submitted to a journal.

      External researchers are required to submit (to the journal) a copy of the Confidentiality Agreement they signed prior to conducting research at the RDC.

      Disclaimers must be included in the manuscript which state:

      The research was conducted using data from the Vaccine Safety Datalink Project, through the data sharing program at the Centers for Disease Control and Prevention.

      Any published material using VSD data must acknowledge CDC as the original data source.

      Additionally, disclaimers must be included that state:

      The analysis, interpretations, and conclusions are the responsibility of the authors and do not represent the views and opinions of the CDC, the Federal Government, or the managed care organization providing the data.

      Appendix V--Agreement Regarding Conditions of Access to Confidential Data in the Research Data Center of the National Center for Health Statistics

      I ---------------------- (please print name) am aware that the information contained in the (name of data file) has been provided to NCHS in accordance with the provisions of Section 308(d) of the Public Health Service Act (42 U.S.C. 242m), with the assurance that it will be used only for health statistical reporting and analysis and will not be published or released in identifiable form. I am also aware that I can be held legally liable for any harm incurred by individuals or establishments who have provided or are described in the information contained in the above work files to which I will have access.

      Having read and familiarized myself with the Researcher Affidavit of Confidentiality, including Section 308(d) of the Public Health Service Act (42 U.S.C. 242m) (attached), I agree:

    1. To make no copies of any files or portions of files to which I am granted access except those authorized by NCHS Research Data Center staff.

    2. To return to RDC staff all NCHS restricted materials with which I may be provided during the conduct of my research at NCHS and other materials as requested.

    3. Not to use ANY technique in an attempt to learn the identity of any person, establishment, or sampling unit not identified on public use data files.

    4. To hold in strictest confidence the identification of any establishment or individual that may be inadvertently revealed in any documents or discussion, or analysis. Such inadvertent identification revealed in my analysis will be immediately brought to the attention of RDC staff.

    5. Not to remove any printouts, electronic files, documents, or media until they have been scanned for disclosure risk by RDC staff.

    6. Not to remove from NCHS any written notes pertaining to the identification of any establishment, individual, or geographic area that may be revealed in the conduct of my research at NCHS.

    7. To the inspection of any material I may bring to or remove from the NCHS Research Data Center.

    8. To comport myself in a manner consistent with the principles and standards appropriate to a scientific research establishment.

      Deliberate violation of any of these conditions may result in cancellation of the data access agreement, and the researcher may be escorted from the premises by the duly authorized Federal protection service on duty at NCHS. The researcher may also be barred from any future use of the RDC upon review and determination by the Director of NCHS that this is necessary to protect the integrity and confidentiality of the RDC.

      Researcher's Signature

      Date

      NCHS Witness

      Date

      [[Page 67592]]

      Appendix VI--Researcher Affidavit of Confidentiality

      I certify that no confidential data or information viewed or otherwise obtained while I am a researcher in the National Center for Health Statistics (NCHS) Research Data Center (RDC) will be removed from NCHS. Further, I understand that NCHS will perform a disclosure review and must provide approval to me before I remove any data from the RDC, whether they are in electronic or paper form. I acknowledge NCHS Confidentiality Statute, Sec. 308(d) of the Public Health Service Act (42 U.S.C. 242m) stated below and fully understand my legal obligations to NCHS to protect all confidential data. Further, I understand that any violation may be punishable by fine or imprisonment for up to 5 years or both under Title 18 U.S.C. 1001.

      NCHS Confidentiality Statute--No information, if an establishment or person supplying the information or described in it is identifiable, obtained in the course of activities undertaken or supported under section 304, 306, or 307 may be used for any purpose other than the purpose for which it was supplied unless such establishment or person has consented (as determined under regulations of the Secretary) to its use for such other purpose and in the case of information obtained in the course of health statistical or epidemiological activities under section 304 or 306, such information may not be published or released in other form if the particular establishment or person supplying the information or described in it is identifiable unless such establishment or person has consented (as determined under regulations of the Secretary) to its publication or release in other form.

      Title 18 U.S.C. 1001--Deliberately making a false statement in any matter within the jurisdiction of any Department or Agency of the Federal Government violates Title 18 U.S.C. 1001 and is punishable by a fine or up to 5 years in prison or both.

      Researcher's Signature

      Date

      NCHS Witness

      Date

      Dated: November 9, 2004. James D. Seligman, Associate Director for Program Services, Centers for Disease Control and Prevention.

      [FR Doc. 04-25537 Filed 11-17-04; 8:45 am]

      BILLING CODE 4163-18-P
