Finnish health and social welfare registers in epidemiological research

Finland introduced a personal identification number system in 1964, and since then practically all administrative registers have included this unique identification code. Finland currently has strict data protection laws, which as a general rule prohibit the collection of sensitive health and social information. Health data that includes identifiers can be gathered by obtaining informed consent from the patients or clients, or under special legislation. An important exception to this general principle, however, is the collection and use of such data for statistical and scientific purposes, for example in epidemiological studies. Thus, these registers can be used as a basis for efforts to improve health, welfare, health services, and social welfare services. This article describes the health and social welfare register system and data protection legislation in Finland, and gives some examples of epidemiological register studies. It also presents methods for promoting register research, including the newly launched Finnish Information Centre for Register Research.

LONG TRADITION OF MAINTAINING REGISTERS
The management, organisation, planning, evaluation, control and protection of individuals as well as the identification, selection and enumeration of cases have been listed as good reasons to collect administrative health and welfare data (Gissler 1998). The use of existing administrative data in research is attractive, since the total study costs and the time spent on data collection can be reduced significantly.
Record keeping in general has a long tradition in Finland. The registration of vital statistics, including for example births, deaths and marriages, was initiated as early as 1749. Most national data collection was originally based on aggregated data, but the development of computer technology made the collection of personal-level register information feasible. The first nation-wide, computerised disease register, the Cancer Register, was started in 1952 (Table 1). Other administrative registers were also computerised early, including the Central Register on Health Care Personnel, the Register on New Cases of Tuberculosis, and the Register of New Cases of Sexually Transmitted Diseases, as well as the first hospital discharge reporting systems covering tuberculosis sanatoriums (since 1956), psychiatric hospitals (since 1957) and general hospitals (since 1960).
In the 1960s, the National Board of Health introduced the Register on Congenital Malformations (1963), the Register of Adverse Drug Reactions (1966), and the Mass Screening Register for cervical and breast cancers (1968), and merged the different hospital discharge registrations into a new Hospital Discharge Register covering all public hospitals (1967). The Finnish Institute of Occupational Health started the registration of occupational diseases in 1964.
A system of unique identification numbers was launched in 1964 along with general sickness insurance, and by 1968 all Finnish citizens and permanent residents had received their own number. The Finnish Central Population Register was created at the same time. Currently, the register covers all inhabitants who are Finnish citizens or permanent residents of Finland, as well as their family relations (since 1973). This development provided good opportunities for increasing the compilation of health and social welfare registers. In general, the collection of personal identification numbers improves the data quality of any statistics and augments the available information, for example on the aggregation of service utilisation. It also enables more efficient secondary use of data, for example in research.
A study from the 1990s showed that identification numbers can be obtained retrospectively when they are missing from original data that predates the personal identification number system. In that study, consisting of a cohort of 4431 women who were pregnant in 1954-1963, only 0.6% of the cohort remained unidentified, but the process was very complicated, time-consuming and expensive. The total cost of allocating the identification numbers retrospectively was estimated at 23 000 €, whereas validating the correctness of numbers that were already available would have cost only 925 € (Hemminki et al. 1998, Gissler 1999a). Finland, along with Denmark, is one of the few countries that base their census on already compiled register information instead of collecting similar information from all citizens by postal questionnaires and/or interviews.
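Part of validating an identification number can be automated, because the number carries a check character. The sketch below is an illustration only and is not taken from the studies cited above. It assumes the publicly documented layout of the Finnish personal identity code (date of birth as DDMMYY, a century sign, a three-digit individual number and a check character computed modulo 31), and it handles only the classic century signs `+`, `-` and `A`. The function name and the sample code are made up for this example.

```python
from datetime import date

# Assumption: the 31 check characters of the publicly documented scheme.
CHECK_CHARS = "0123456789ABCDEFHJKLMNPRSTUVWXY"
# Assumption: only the classic century signs are handled in this sketch.
CENTURY_SIGNS = {"+": 1800, "-": 1900, "A": 2000}


def is_valid_identity_code(code: str) -> bool:
    """Check the format, birth date and check character of a code DDMMYYCZZZQ."""
    if len(code) != 11 or not code.isascii():
        return False
    birth_part, sign, individual, check = code[:6], code[6], code[7:10], code[10]
    if sign not in CENTURY_SIGNS or not (birth_part + individual).isdigit():
        return False
    day, month, year = int(birth_part[:2]), int(birth_part[2:4]), int(birth_part[4:6])
    try:
        # Reject impossible dates such as 31 February.
        date(CENTURY_SIGNS[sign] + year, month, day)
    except ValueError:
        return False
    # The nine digits, read as one number, determine the check character.
    return CHECK_CHARS[int(birth_part + individual) % 31] == check


print(is_valid_identity_code("010180-9004"))  # made-up code; prints True
print(is_valid_identity_code("010180-9005"))  # wrong check character; prints False
```

A check of this kind catches typing errors only. It cannot confirm that a number belongs to the right person, which is why the retrospective identification described above required far more work.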

LEGISLATION
The first health registers were compiled under legislation covering the data collecting institution, but there was no separate legislation on health registers. Such legislation was passed in the Finnish Parliament in 1987. It ensured citizens' rights to privacy despite the increased use of computerised registries containing sensitive data, but also recognised the need to collect health and medical information. These statutes, which are still in force, gave health authorities the right to gather and register relevant information at the individual level, including personal identification numbers, and obliged both public and private health care personnel to provide this data to them. The legislation listed all the health registers that national authorities may maintain (Table 1) and described their content on a general level.
Since this data protection legislation was passed, no new health registers have been introduced. However, in 1989 the different data collections regarding infectious diseases were merged into a single register, and in 1994 the National Implant Register was widened to include dental implants and the Hospital Discharge Register (re-named the Care Register) was widened to include day surgery cases.
This legislation does not cover the three existing social welfare registers. In 2001, the Parliament passed legislation on statistical activities at the National Research and Development Centre for Welfare and Health (STAKES). The law stated that STAKES has the right to maintain these registers, and it listed their contents in great detail, variable by variable, unlike the legislation regarding health registers.
The Finnish health and social information system, based on registers that include personal identification numbers, is in accordance with the EC directive on the protection of personal data. Finland revised its legislation on the protection of privacy to meet the EU requirements in 1999. According to the Personal Data Act, health and social information can only be gathered by informed consent from the client or patient, with the exception of data collected for statistics and historical or scientific research. The legislation also states that the Finnish nation-wide health and social welfare registers cannot be used in decision-making about registered individuals. Nor can registered persons themselves access these registers to check and correct their information. However, they have the right to do so at the primary source of information. In most cases these are the health care facilities, social care facilities, and municipal social welfare offices that keep registers on their patients and clients. In addition, the data protection legislation states that the register-keeping institutions (controllers) have to maintain general information (a so-called description of the file) on the existing registers, their contents and their purpose of use, and that this description has to be easily available to citizens.

DATA QUALITY IS SHOWN TO BE GOOD
One of the main prerequisites for the utilisation of register data is good data quality. A register with good data quality has good coverage and validity. In other words, all events are included in the database, and the registered data corresponds to reality. This has been shown to be true for several Finnish administrative registers in studies that examine the internal validity of a register (e.g. Gissler and Shelley 2002) as well as in studies comparing register information with patient records or other information from the primary source (e.g. Keskimäki and Aro 1991, Teperi 1993, Teppo et al. 1994, Gissler et al. 1995, 1996 and 1997, Isohanni et al. 1997).
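The two quality measures can be illustrated with a small sketch. It is not the method of any of the studies cited above. It assumes that coverage is measured as the share of events in the primary source that are also found in the register, and validity as the share of matched events where a chosen variable agrees. The function name, field names and data are hypothetical.

```python
def coverage_and_agreement(register, primary_source, variable):
    """Compare a register with its primary source (e.g. patient records).

    Both arguments map an event id to a dict of recorded variables.
    Returns (coverage, agreement) as fractions between 0 and 1.
    """
    if not primary_source:
        raise ValueError("primary_source must contain at least one event")
    # Coverage: events in the primary source that also appear in the register.
    matched = [event_id for event_id in primary_source if event_id in register]
    coverage = len(matched) / len(primary_source)
    # Agreement: among matched events, how often the variable is identical.
    agreeing = sum(
        1
        for event_id in matched
        if register[event_id].get(variable) == primary_source[event_id].get(variable)
    )
    agreement = agreeing / len(matched) if matched else float("nan")
    return coverage, agreement


# Hypothetical data: five events in the patient records, four of them registered.
patient_records = {
    1: {"diagnosis": "A"},
    2: {"diagnosis": "B"},
    3: {"diagnosis": "C"},
    4: {"diagnosis": "A"},
    5: {"diagnosis": "B"},
}
register = {
    1: {"diagnosis": "A"},
    2: {"diagnosis": "B"},
    3: {"diagnosis": "A"},  # disagrees with the patient record
    4: {"diagnosis": "A"},
}

print(coverage_and_agreement(register, patient_records, "diagnosis"))  # (0.8, 0.75)
```

Published validation studies typically report further measures as well, so this sketch should be read as the simplest possible version of such a comparison.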
Even when the overall data quality is good, the quality of individual variables may vary. For example, information on events occurring before or after the actual registered event, information on rare diagnoses, interventions, or events, and information on events occurring in a unit or institution other than the reporting one may have quality problems, which limit its utilisation both in statistics and in research. Active collaboration between the reporting institutions and the register controller, and active use of register data in decision-making and research, seem to improve the data quality of health registers. In cases where the data quality is problematic, the compilation of such a register has to be critically evaluated. A Finnish example is the Register on the Mentally Retarded, which was run by the National Board of Social Welfare between 1981 and 1986. Its re-establishment in 1991 was unsuccessful due to poor data quality, which resulted in its discontinuation.

RESEARCHERS CAN APPLY FOR PERMISSION TO USE REGISTER DATA
According to the basic legislation on social and health care services, health care and social welfare organisations and institutions can register patient and client data which is necessary for their care. Otherwise, informed consent from the person is needed for the use of health or social welfare data in research. This principle also applies to the compilation of local, regional and national registers not mentioned in the previously described legislation. Previously collected health information may, however, be used in research without informed consent if the data set is large or the collection of such informed consents is not feasible, for example due to a retrospective collection of information, or due to the patients' age or health status. This exception applies only to research that uses register data alone. Usually the institution which maintains the register has the right to grant this kind of authorisation. If hospital records are linked to register data, then permission always has to be applied for from the Ministry of Social Affairs and Health. Biological samples can be combined with register data, but a statement from an ethical board is mandatory. In register research, a similar statement is recommended, but it is not obligatory.
The data protection legislation includes the following requirements, which all have to be fulfilled before authorisation can be received:
1. An application form has to be filled in, and it has to include a study plan which is sufficiently detailed and scientifically sound.
2. The study follows generally accepted guidelines and methods for scientific research.
3. The study has a leader or a group of persons who take responsibility for the project.
4. The researcher(s) have to observe a duty of care, ensuring that the study does not compromise the privacy of registered persons and that personal-level information is managed legally and carefully. After the research data has been collected, the research group has to describe the data content and provide this description to anyone who requests it.
5. The data protection authority (data protection ombudsman) has the right to comment on the study plan.
6. A plan has to be drawn up for archiving or destroying the data after the study.
In cases where researchers wish to contact the registered persons, e.g. for clinical or anthropometric measurements, interviews or postal questionnaires, the first contact can only be made through the physician in the health care institution where the patient or client was treated. At the same time, the diagnoses available in the health registers can be double-checked and thus validated.
After authorisation has been granted, the researcher(s) have to apply for the data. The data protection issues are thought through once more when the register controllers prepare the research data for the researchers. As a general rule, the information which is needed to identify an individual person (such as the personal identification number and name) is removed from the data set. The data can also be made coarser, either by removing the exact date of birth or by aggregating the residential information, to ensure that no violations of privacy and confidentiality occur by accident.
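The preparation step described above can be sketched as follows. This is a minimal illustration and not the procedure of any register controller. It assumes that the exact date of birth is replaced by the year of birth and that the municipality of residence is aggregated to a larger region. All field names, the region mapping and the sample record are hypothetical.

```python
from datetime import date

# Hypothetical names of the fields that directly identify a person.
DIRECT_IDENTIFIERS = ("personal_id", "name")


def prepare_research_record(record, municipality_to_region):
    """Return a copy of `record` without identifiers and with coarsened fields."""
    # Drop the direct identifiers.
    cleaned = {key: value for key, value in record.items() if key not in DIRECT_IDENTIFIERS}
    # Coarsen the exact date of birth to the year of birth.
    cleaned["birth_year"] = cleaned.pop("date_of_birth").year
    # Aggregate the residential information to a larger area.
    municipality = cleaned.pop("municipality")
    cleaned["region"] = municipality_to_region.get(municipality, "unknown")
    return cleaned


# Hypothetical example record and region mapping.
record = {
    "personal_id": "010180-9004",
    "name": "Example Person",
    "date_of_birth": date(1980, 1, 1),
    "municipality": "Oulu",
    "diagnosis": "S72",
}
regions = {"Oulu": "Northern Finland"}

print(prepare_research_record(record, regions))
```

How coarse the data has to be depends on the study. Rare diagnoses in small areas can identify a person even without a name or number, so the choice is made case by case.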
The personal identification numbers are released only if the data has to be further linked to other registers or other information sources. These data linkages can also be done by the register keeper (or by a trusted third party, if the data includes extremely sensitive information), in which case the researcher(s) receive unidentified data only. By law, some institutions (e.g. Statistics Finland, and STAKES in the case of social welfare registers) cannot deliver personal identification numbers to the researchers. In these cases, all data linkages have to be made by the register-keeping organisation.
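A linkage carried out on behalf of the researchers can be sketched as follows. This is an illustration under stated assumptions and not the procedure of any Finnish register keeper. It assumes that each register is keyed by the personal identification number, that only persons found in every register are kept, and that a random study identifier replaces the personal identification number in the delivered data. The function name, register names, variables and identification numbers are hypothetical.

```python
import secrets


def link_registers(registers):
    """Link registers on the personal id and return records without that id.

    `registers` maps a register name to a dict of personal id -> variables.
    """
    if not registers:
        raise ValueError("at least one register is required")
    names = list(registers)
    # Keep only persons who appear in every register (an inner join).
    common_ids = set(registers[names[0]])
    for name in names[1:]:
        common_ids &= set(registers[name])
    linked = []
    for personal_id in common_ids:
        # A random study id stands in for the personal identification number.
        row = {"study_id": secrets.token_hex(8)}
        for name in names:
            for variable, value in registers[name][personal_id].items():
                row[f"{name}.{variable}"] = value
        linked.append(row)
    # Order by the random study id so that row order reveals nothing about the ids.
    linked.sort(key=lambda row: row["study_id"])
    return linked


# Hypothetical registers with made-up identification numbers.
births = {
    "010180-9004": {"birth_weight_g": 3500},
    "131052-308T": {"birth_weight_g": 3200},
}
discharges = {"010180-9004": {"diagnosis": "S72"}}

for row in link_registers({"births": births, "discharges": discharges}):
    print(row)
```

The party doing the linkage would also have to decide whether to keep a protected key between study identifiers and personal identification numbers, for example for later follow-up. The sketch does not keep one.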
According to current legislation, a register maintained for research is never permanent, and it has to be destroyed after the research has been completed. However, important research data owned by authorities or by academic or research institutions can be archived, and the filing can be done in those academic or research institutions which have the right to archive such data.
Research data can be distributed in certain instances to other researcher(s) after its primary research use has been finalised. This can be done if the registered persons have given their informed consent, but it requires the parties to have made a contract covering the exchange of information. This kind of data transfer is also possible to other countries, but the data protection authority has to be notified if the data protection legislation in that country is considered to be inferior to Finnish legislation. The latter does not, however, apply to the countries of the European Economic Area, including all five Nordic countries.
In the vast majority of cases researchers do get authorisation for the use of confidential register data, and the process usually takes from three to six months. It can also take much longer. The most common problems are that the study questions and/or research framework are unfocused; the researcher(s) wish to have information on every available data source irrespective of its relevance; or the requirements stated in the data protection legislation are not fulfilled. In such cases, instead of directly rejecting the application, the authorities usually consult and negotiate with the researchers, which may be time-consuming.
Some epidemiological studies involve extremely sensitive information, for example when studying themes such as induced abortion, sterilisation, infertility, congenital anomalies, psychiatric disorders, or genetics, or when the studies involve family members or consecutive generations. In these cases, a focused and scientifically justified study plan that includes consideration of all ethical questions is essential, and unofficial negotiations between the study group and the data protection authorities, as well as with experts who are familiar with register research, are recommended.
Even though all the requirements for register research may be fulfilled, the process for obtaining permission to use sensitive health information in research may be complicated and cumbersome. An example was the School Age Study, which followed all the 60 254 children born in Finland in 1987 up until the age of nine years by linking the Medical Birth Register to five national registers and 18 regional registers on intellectual disabilities, as well as to 38 school registers in one county (Gissler et al. 1999b). The study idea and the utilisation of existing register data were generally accepted, but it remained unclear under which legislation the data gathering and the necessary data linkages could be performed. The process was started in April 1994, when a positive statement was received from the ethical board of the institution which was in charge of the study (STAKES). Due to the size of the study, the Data Protection Authority required that the researchers apply for special permission from the Board of Data Protection. The Board's interpretation of the data protection legislation was that no special permission was needed. The latter interpretation was accepted only after the issue was finalised by a decision from the Supreme Administrative Court in August 1996, more than two years after the start of the three-year project (Gissler et al. 1998). Since then, the data protection legislation has been updated, and the practices and processes for obtaining permission for the use of health data in research have been streamlined to be more research-friendly.

ADMINISTRATIVE REGISTER DATA IS WIDELY USED
The Finnish register system provides good opportunities for utilising data in research. For the registers kept by STAKES, more than 400 authorisations for data use were given in 1999-2003 (Table 2). These studies represent a diversity of scientific disciplines, such as epidemiology, clinical medicine, demography, health care research, jurisprudence, occupational health, pharmacology, social policy, sociology, and statistics. The most often used registers are the Hospital Discharge Register (47% of the authorisations given in 1999-2003), the Cancer and Mass Screening Registers (24%), and the reproductive registers, including the Medical Birth Register, the Register of Congenital Malformations, and the Register on Induced Abortions and Sterilisations (23%). The methods used in these studies were varied, ranging from cross-sectional studies utilising one single health register to retrospective and prospective follow-up studies combining several administrative registers and medical records gathered from hospitals and other data sources.
These authorisations also included studies in which information other than register or patient information was combined with health registers. Questionnaire data could be linked to register information without informed consent prior to 1999. In one study, data from the Adolescent Health and Lifestyle Survey 1987-1998, which is a self-administered questionnaire mailed every second year to independent samples of 12-, 14-, 16- and 18-year-old adolescents, was combined with register data on their pregnancies (births, miscarriages and induced abortions) to identify groups with a higher probability of pregnancy (Vikat et al. 2002).
Another example is the study on relationships between natural well fluoride and hip fracture in Finland (Kurttio et al. 1999). The authors combined data from the Population Census of Statistics Finland, the Central Population Register, and the Hospital Discharge Register to form a population cohort in which hip fracture cases could be identified. To calculate individual exposures, they used ground water fluoride measurements of 8 972 wells by the Geological Survey of Finland and another study of well-water carried out by the Ministry of Social Affairs and Health and the National Board of Waters and the Environment. The study showed that fluoride increased the risk of hip fractures only among women.
The use of register-based data in the early stages of development of new drugs or medical practices is attractive, because early development can in some cases be shortened substantially when early hypotheses can be tested without clinical trials. The utilisation of register data can also be useful in designing clinical trials. Kilkinen et al. (2002) combined the data from a cross-sectional health survey carried out in Finland in 1997 with antimicrobial prescriptions reimbursed by the National Social Insurance Institution, and showed that the use of oral antimicrobials decreases serum enterolactone concentration.
The new data protection legislation has made the data compilation on longitudinal prospective cohorts more complicated, but not impossible. In Northern Finland, two large birth cohorts have been compiled, in 1966 and in 1985-1986 (Gissler et al. 2000, Pouta et al. 2004). These cohorts, consisting of 12 058 and 9 479 births, respectively, have been tracked through follow-up questionnaires and clinical measurements and by compiling information from hospital records, but also by linking several administrative registers, such as the Central Population Register, the Cause-of-Death Register, the Hospital Discharge Register, and the registers on social benefits gathered by the Social Insurance Institution. Several important findings regarding the connection of prenatal and infant health to subsequent health and welfare in adolescence and in adulthood have been made by using these unique data.
No informed consent from the cohort members was needed prior to the late 1990s, but currently the following two statements have been included in the survey questionnaires and asked before the clinical examination:
1. Any information that has been gathered on me previously, or will be gathered in the course of this study can be used now, or later, in scientific and public health research carried out at the Faculty of Medicine at Oulu University for the purpose of promoting well-being and health.
2. Any information that has been gathered on me previously, or will be gathered in the course of this study can be handed over without name and personal identification number, that is, without identifying data, for the use of researchers working at cooperating research institutes for the purposes of scientific research.
The majority of participants permitted the use of their data, and only a few participants refused to give their informed consent. With regard to the 1985-1986 birth cohort, 2.2% of adolescents and 1.7% of their parents did not give permission to use their data for scientific research (Question 1), and rather more refused the delivery of their data to collaborating units, in total 5.1% of adolescents and 4.4% of their parents (Question 2). Almost everyone who participated in the clinical examination gave their permission to use the compiled data in research, and only 0.5% refused the delivery of their unidentified data to other research institutions.

FINNISH INFORMATION CENTRE FOR REGISTER RESEARCH
As in the other Nordic countries, the significant possibilities for register-based research have been noted in Finland. Some obstacles hindering the more effective use of existing register data have, however, been identified. First, many researchers find the current data protection legislation cumbersome and difficult to follow. Second, the knowledge of existing health and social welfare registers and their contents is still limited. Third, there is a need to further improve the methodological skills of researchers and students in the field of register research. Fourth, the high costs of obtaining information from some data sources (e.g. the Central Population Register and the databases kept by Statistics Finland) have been said to preclude the utilisation of existing register information.
To further promote the use of administrative registers in scientific research, the Finnish Society for Social Medicine and the Finnish Society of Epidemiology proposed to the Academy of Finland in 2001 that a centre for register research should be established. A committee consisting of a chairperson, two secretaries, and 16 members from different research and information organisations was nominated to explore the possibilities of setting up such a centre in Finland, to clarify the interests among the relevant actors who would participate in the activities of the centre, and to find sustainable financing for the centre.
After the initial working period from 1 March 2002 to 31 December 2002, the committee proposed that the Finnish Information Centre for Register Research be launched by means of a two-year grant from the Academy of Finland. The Academy of Finland approved this grant in April 2003, and the Centre was initiated at STAKES in August 2003. After the two-year support from the Academy of Finland ends, the research institutions that come under the Ministry of Social Affairs and Health have agreed to finance these functions. Moreover, universities and other academic research units, as well as Statistics Finland, can participate and are participating in the development of the centre.
The aim of the Finnish Information Centre for Register Research is to promote the use of national administrative registers in research, especially in health and social sciences, by
• supporting planning and implementation of register-based research,
• improving the capabilities for using register data among researchers,
• increasing co-operation between different registers, and
• improving practices on the utilisation of register data.
The first task of this newly launched centre was to create a board. The main goals of this board are to promote register research, to promote financing of register-based research both nationally and internationally, and to reduce the costs of utilising register data. Besides maintaining this board, the Centre has created a network of contact persons in the register-keeping organisations and research institutions. The centre will introduce an internet portal (http://www.rekisteritutkimus.fi) presenting the existing administrative registers, data protection legislation and practices, and methods in register-based research. In the future, the centre will support training for students and researchers in register-based research in the Finnish universities and research institutions. The centre may also assist in the process of obtaining authorisation for data access and financing, and give assistance in the writing of study plans and in data linkages. The Finnish Information Centre for Register Research serves the whole country, and these basic tasks are free for all users. The activities of the Centre will start as a limited network, but the aim from the beginning has to be to expand the range of activities. Possible future tasks for the Centre are to finance register-based research with special funds, to perform data linkages for studies having the required authorisation, to analyse register data, and to archive data sets without identification numbers, possibly in cooperation with the Finnish Social Science Data Archive (FSD 2004). Some of these services may be liable to a charge.

FUTURE CHALLENGES
There have been some proposals to introduce new health care and social welfare registers, but these preparatory plans have not been implemented. One important reason has been the wish to avoid any changes in the current data protection legislation. This recommendation was also followed by the governmental working party which reviewed the current and planned future health and social welfare information system in Finland (Gissler et al. 2004).
The working party emphasised the importance of continuing the compilation of individual-based data, and supported more active utilisation of the nation-wide registers. The current health and social welfare information system with the registers given in Table 1 will be kept unchanged. The only exception is the Register on Sterilisations, which can be terminated if all the cases are found to be in the Hospital Discharge Register. The working party proposed only one new register: the National Public Health Institute may initiate a nation-wide Vaccination Register to monitor immunisation coverage and the possible harmful effects of vaccinations, if the pilot data collection demonstrates clear feasibility and if the necessary change to the data protection legislation is accepted by Parliament. Some further augmentations of the existing information system were proposed, but these can be performed without legislative modifications. With regard to health care services, the most important reform is the improvement of primary health care statistics: sample-based register data on the patient background, on the reasons for primary health care visits, and on interventions will be gathered to complement the existing aggregated data collection on the number of primary health care visits, also covering rehabilitation and occupational health visits.
The compilation and maintenance of health registers and their use in research have been widely accepted in Finland, but critical voices have also been raised (e.g. Lehtonen 2002). A threat to the current register practice and to epidemiological research is the tightening of data protection legislation. This may happen, for example, if a single leak occurs from one of the protected data sources, such as the national health registers, or from a research register. The decision-making in such a scenario is political and its endpoint is thus hard to predict, even though the Finnish Parliament has repeatedly accepted the justifications of data compilation with personal identification numbers. Finally, the use of sensitive information in research is justifiable only when the studies serve widely acceptable aims and are designed and carried out to the highest possible standards of quality.

Table 1. The nation-wide health and social welfare registers in Finland.
1) Technical management has been given to the Finnish Cancer Societies.
2) The data collection also included previous hospital discharge registers, which included care in tuberculosis sanatoriums (since 1956), psychiatric hospitals (since 1957) and general hospitals (since 1960). Complete identification numbers are available since 1969.
3) Technical management has been given to the Finnish Federation of the Visually Impaired.
4) Register on New Cases of Tuberculosis and Sexually Transmitted Diseases started in 1958.
5) The data collection includes information on special reimbursements of medicine since 1964, on national pensions (guaranteed minimum pension), family pensions and child disability allowances since 1970, on sickness allowances and reimbursed visits in private health care services since 1971, on rehabilitation since 1978, on conscripts' allowances and basic unemployment allowances since 1985, on family allowances and child care subsidies since 1993, on maternal grants, labour market subsidies and housing allowances since 1994, on prescribed medicine and reimbursed interventions in private health care services since 1996, and on students' allowances since 1997.

Table 2. The number of authorisations given for the utilisation of personal-level data from health and social welfare registers for research purposes at STAKES in 1999-2003.