Research does not always involve collecting data from participants. A huge amount of data is already collected through routine management information systems, surveys and other research activities. This existing data can be analyzed to generate new hypotheses or answer critical research questions, which saves considerable time, money and other resources. Data from large sample surveys may also be of higher quality and more representative of the population. Detailed exploration of existing research data avoids repetition of research and wastage of resources, and it also ensures that sensitive topics or hard-to-reach populations are not over-researched (1). However, certain ethical issues pertaining to secondary data analysis must be addressed before such data is handled.
Secondary data analysis
Secondary analysis refers to the use of existing research data to answer a question different from that of the original work (2). Secondary data can come from large-scale surveys or from data collected as part of personal research. Although there is general agreement about sharing the results of large-scale surveys, little agreement exists about sharing data from personal research. While the fundamental ethical issues related to secondary use of research data remain the same, they have become more pressing with the advent of new technologies. Sharing, compiling and storing data have become much faster and easier. At the same time, there are fresh concerns about data confidentiality and security.
Issues in Secondary data analysis
Concerns about secondary use of data mostly revolve around potential harm to individual subjects and the question of returning to them for consent. Secondary data vary in the amount of identifying information they contain. If the data contains no identifying information, or is coded so that the researcher does not have access to the codes, it does not require a full review by the ethics board; the board only needs to confirm that the data is actually anonymous. However, if the data contains identifying information on participants, or information that could be linked to identify participants, the board will make a complete review of the proposal. The researcher will then have to explain why identifying information is unavoidable for answering the research question and must also indicate how participants’ privacy and the confidentiality of the data will be protected. If these concerns are satisfactorily addressed, the researcher can then request a waiver of consent.
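The review rule described above can be sketched as a small decision function. This is only an illustrative sketch: the `DIRECT_IDENTIFIERS` list, the `ReviewLevel` names and the `review_level` function are assumptions made for the example and do not come from any ethics guideline. It also does not detect indirect identifiers that could be linked to reveal a participant, which still need human judgement.

```python
from enum import Enum


class ReviewLevel(Enum):
    CONFIRM_ANONYMITY = "board confirms the data is anonymous"
    FULL_REVIEW = "complete review of the proposal by the board"


# Hypothetical list of column names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "address", "phone", "email", "national_id"}


def review_level(columns, coded=False, researcher_has_key=False):
    """Suggest a review level from a dataset's column names.

    coded: participants are represented by codes instead of identities.
    researcher_has_key: the secondary researcher can access those codes.
    """
    has_identifiers = any(c.lower() in DIRECT_IDENTIFIERS for c in columns)
    # Coded data counts as anonymous only if the researcher cannot
    # get at the key that links codes back to participants.
    if coded and researcher_has_key:
        return ReviewLevel.FULL_REVIEW
    if has_identifiers:
        return ReviewLevel.FULL_REVIEW
    return ReviewLevel.CONFIRM_ANONYMITY
```

For example, `review_level(["age", "district"])` suggests only a confirmation of anonymity, whereas `review_level(["Name", "age"])` suggests a full review.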
If the data is freely available on the Internet, in books or in another public forum, permission for further use and analysis is implied. However, the ownership of the original data must be acknowledged. If the data comes from another research project and is not freely available except to the original research team, explicit written permission to use the data must be obtained from that team and included in the application for ethical clearance.
However, there are certain other issues pertaining to data procured for secondary analysis. The data obtained should be adequate and relevant but not excessive. In secondary data analysis, the original data was not collected to answer the present research question. The data should therefore be evaluated against criteria such as the methodology of data collection, accuracy, period of data collection, the purpose for which it was collected and its content. The data should be kept no longer than is necessary for the purpose for which it was obtained, and it must be kept safe from unauthorized access, accidental loss or destruction. Hard copies should be kept in locked cabinets, whereas soft copies should be stored as encrypted files on computers. It is the responsibility of the researcher conducting the secondary analysis to ensure that the further analysis is appropriate. In some cases the original consent form provides for secondary analysis, on the condition that the secondary study is approved by the ethics review committee. According to the British Sociological Association’s Statement of Ethical Practice (2004), researchers must inform participants about the use of data and also obtain consent for future use of the material. However, it also says that consent is not a once-and-for-all event but is subject to renegotiation over time (3). There appear to be no guidelines about the specific conditions that require further consent.
Issues in Secondary analysis of Qualitative data
In qualitative research, the culture of data archiving is absent (4). There is also a concern that archiving data exposes subjects’ personal views. The best practice, however, is to plan anonymisation at the time of initial transcription. Pseudonyms or replacements can protect subjects’ identities. A log of all replacements, aggregations or removals should be made and stored separately from the anonymised data files. Nevertheless, because of the circumstances under which qualitative data is produced, reinterpreting it at a later date can be challenging and raises further ethical concerns.
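The pseudonym-and-log practice described above can be sketched as follows. This is a minimal illustration, not a complete anonymisation tool: the `pseudonymise` function, the sample transcript and the names in it are all hypothetical, and it handles only exact whole-word matches of names supplied by the researcher.

```python
import re


def pseudonymise(transcript, replacements):
    """Replace each real name with its pseudonym and log every change.

    replacements maps a real name to the pseudonym that stands in for it.
    Returns the anonymised text and a log to be stored separately from it.
    """
    log = []
    for real, pseudonym in replacements.items():
        # Match the name only as a whole word, treating it as literal text.
        pattern = re.compile(r"\b" + re.escape(real) + r"\b")
        # A function replacement keeps the pseudonym from being read
        # as a regex template.
        transcript, count = pattern.subn(lambda m, p=pseudonym: p, transcript)
        if count:
            log.append(
                {"original": real, "replacement": pseudonym, "occurrences": count}
            )
    return transcript, log


# Hypothetical transcript fragment and participant names.
text = "Asha said the clinic was far. Asha walked there with Ravi."
clean, log = pseudonymise(text, {"Asha": "Participant A", "Ravi": "Participant B"})
```

Because the log links pseudonyms back to real names, it should be written to a different, access-controlled location from the anonymised files. Replacements are applied one after another, so a pseudonym should not itself contain any of the real names being replaced.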
Internal sources of data are those that are internal to the organisation in question. For instance, if you are doing a research project for an organisation (or research institution) where you are an intern, and you want to reuse some of their past data, you would be using internal data sources.
The benefit of using these sources is that they are easily accessible and there is no associated financial cost of obtaining them.
External sources of data, on the other hand, are those that are external to an organisation or a research institution. This type of data has been collected by “somebody else”, in the literal sense of the term. The benefit of external sources is that they provide comprehensive data; however, obtaining it may sometimes take more effort (or money).
Let’s now focus on different types of internal and external secondary data sources.
There are several types of internal sources. For instance, if your research focuses on an organisation’s profitability, you might use its sales data. Each organisation keeps track of its sales records, and thus your data may provide information on sales by geographical area, types of customer, product prices, types of product packaging, time of the year, and the like.
Alternatively, you may use an organisation’s financial data. The purpose of using this data could be to conduct a cost-benefit analysis and understand the economic opportunities or outcomes of hiring more people, buying more vehicles, investing in new products, and so on.
Another type of internal data is transport data. Here, you may focus on outlining the safest and most effective transportation routes or vehicles used by an organisation.
Alternatively, you may rely on marketing data, where your goal would be to assess the benefits and outcomes of different marketing operations and strategies.
Some other ideas would be to use customer data to ascertain the ideal type of customer, or to use safety data to explore the degree to which employees comply with an organisation’s safety regulations.
The list of the types of internal sources of secondary data can be extensive; the most important thing to remember is that this data comes from the organisation itself, the one within which you are doing your research.
The list of external secondary data sources can be just as extensive. One example is the data obtained through government sources. These can include social surveys, health data, agricultural statistics, energy expenditure statistics, population censuses, import/export data, production statistics, and the like. Government agencies tend to conduct a lot of research and therefore cover almost any kind of topic you can think of.
Another external source of secondary data is national and international institutions, including banks, trade unions, universities, health organisations, etc. As with government, such institutions dedicate a lot of effort to conducting up-to-date research, so you simply need to find an organisation that has collected data on your topic of interest.
Alternatively, you may obtain your secondary data from trade, business, and professional associations. These usually have data sets on business-related topics and are likely to be willing to provide you with secondary data if they understand the importance of your research. If your research is built on past academic studies, you may also rely on scientific journals as an external data source.
Once you have specified what kind of secondary data you need, you can contact the authors of the original study.
As a final example of a secondary data source, you can rely on data from commercial research organisations. These usually focus their research on media statistics and consumer information, which may be relevant if, for example, your research is within media studies or you are investigating consumer behaviour.
TABLE 5 summarises the two sources of secondary data and associated examples: