Ethics
The study was conducted according to the principles expressed in the Declaration of Helsinki. Written informed consent was obtained from each participant, or from his or her guardian, if the participant was less than 18 years of age at the time of the study. The Zanzibar Research Council Ethics Committee and the Institutional Review Board of the International Vaccine Institute in Seoul, Republic of Korea approved this project.
Study site
The study was conducted on Pemba, one of the main islands of the Zanzibar archipelago (Tanzania), located approximately 60 kilometres off the eastern coast of mainland Tanzania (Figure 1). The district hospitals in Chake-Chake, Mkoani, and Wete were centres for participant enrolment into the study. Pemba is a mainly rural area with an approximate population of 500,000 and a total land area of 984 square kilometres [11]. Much of its terrain is hilly, heavily vegetated, and poorly accessible, mainly because of unpaved roads. The northern region of Pemba is divided into two districts, Micheweni and Wete; the southern region is divided into Mkoani and Chake-Chake districts. The administrative centre is Chake-Chake, which is located in the district of the same name.
Study procedures
The two data collection methods were compared as part of a study that assessed the burden of community-acquired bloodstream infections in febrile patients [12]. All patients seeking treatment at one of the three district hospitals were registered and screened for eligibility by study staff upon arrival at the outpatient department (OPD) or the inpatient ward. Inclusion criteria for outpatients were age over 2 months and a recorded axillary temperature of ≥ 37.5°C, whereas inclusion criteria for inpatients were age over 2 months and any history of fever. Since registration and screening were performed before the decision to admit a patient to the ward or to treat him or her as an outpatient, current body temperature and history of fever were recorded for all patients. The majority of participants were enrolled in the outpatient department. A patient who did not fulfil the inclusion criteria at the time of presentation but was later admitted to the ward was enrolled into the study as an inpatient. After OPD office hours, patients who were directly admitted were recruited from the inpatient ward. Each enrolled patient was assigned a unique identification (ID) number. The directing, registration, and screening of participants, following the flow of questions appearing in the PDA, are shown in Figure 2.
The surveillance was implemented using a stepwise approach, starting at Chake-Chake Hospital with paper-based case report forms (CRFs) in September 2008. Direct data entry using PDAs was later implemented at Chake-Chake Hospital (in March 2009), at Mkoani Hospital (in May 2009), and at Wete Hospital (in August 2009), and data collection with PDAs continued at all three locations until December 2010.
Paper-based case report form and data entry
Data were collected using a 14-page CRF that consisted of four sections: registration, case record (clinical history, physical examination, and bedside test results for malaria, glucose, and haemoglobin), laboratory results, and outcome. There were a total of 74 paper-based fields to be completed, consisting of 44 multiple-choice and 30 open-ended questions. Each CRF was labelled with a consecutive serial number. A unique study ID number was manually assigned to the participant at the time of enrolment. After completion of all four sections, each form was sent to the data management team, which manually checked it for errors or omissions. Any detected error was referred back to the fieldworker who had completed the respective section of the CRF. Completed forms then underwent double data entry in Microsoft Access (Microsoft, Seattle, WA, USA), that is, each form was entered independently by two different individuals. The two data sets were compared to detect keypunch errors, and any discrepancies were resolved by referring to the source document (CRF). The computerized data were validated by checking for range and logic errors. Finally, the four sections of the CRF were linked together using the unique ID number.
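The comparison step of double data entry and the subsequent range check can be illustrated with a minimal Python sketch. It is not the Microsoft Access procedure used in the study; the function names, field names, example values, and plausibility limits below are all hypothetical and serve only to show the logic.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]  # field name -> value as typed by the data entry clerk


def compare_entries(
    first: Dict[str, Record], second: Dict[str, Record]
) -> List[Tuple[str, str, str, str]]:
    """Return (study_id, field, first_value, second_value) for every mismatch."""
    discrepancies = []
    for study_id in sorted(set(first) | set(second)):
        rec_a = first.get(study_id, {})
        rec_b = second.get(study_id, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            val_a = rec_a.get(field, "").strip()
            val_b = rec_b.get(field, "").strip()
            if val_a != val_b:
                discrepancies.append((study_id, field, val_a, val_b))
    return discrepancies


def range_errors(record: Record, limits: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the fields whose value is non-numeric or outside its limits."""
    flagged = []
    for field, (low, high) in limits.items():
        raw = record.get(field, "").strip()
        if raw == "":
            continue  # omissions are counted separately from range errors
        try:
            value = float(raw)
        except ValueError:
            flagged.append(field)
            continue
        if not low <= value <= high:
            flagged.append(field)
    return flagged


# Hypothetical example: the second clerk misplaced a decimal point.
entry_a = {"P001": {"haemoglobin": "11.2", "weight": "14.5"}}
entry_b = {"P001": {"haemoglobin": "1.12", "weight": "14.5"}}
# Illustrative limits only, not the limits used in the study.
limits = {"haemoglobin": (2.0, 25.0), "weight": (2.0, 200.0)}

print(compare_entries(entry_a, entry_b))
print(range_errors(entry_b["P001"], limits))
```

Each reported discrepancy would then be resolved against the paper CRF, as described above.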
PDA-based direct data entry
A total of nine Hewlett Packard (Palo Alto, CA, USA) iPAQ 214 Enterprise Handheld personal digital assistants (PDAs) with a 4-inch TFT touch-screen display and the Microsoft Windows Mobile® 5.2 operating system were used at the three hospitals. Each PDA had a 2200 mAh rechargeable lithium-ion main battery that provided at least six hours of usage. In addition, a backup battery was provided for each operator. Each PDA unit cost approximately USD 340. The PDAs were used at each of the hospitals five days a week from about 7 am to 4 pm, and the batteries were charged overnight.
The software for direct data entry of the CRF was developed using a combination of Visual Studio.Net and Visual Basic.Net (Microsoft, Seattle, WA, USA). A separate data management program for uploading and managing the data on a desktop computer was developed using Microsoft FoxPro 7.0 (Microsoft, Seattle, WA, USA).
The first two sections of the paper-based CRF (registration and case record) were converted into digital versions for direct data entry. Sample screens from the PDA program are shown in Figure 3. The system offered a structured questionnaire to record each patient's information. Data entry was restricted to option buttons, check boxes, or fields that accepted only appropriate data. Dropdown menus, skip patterns, and mandatory fields were programmed into the system to prevent errors while navigating through the questionnaire. Once the individual's information was saved, the name and the census ID of the person were shown in the window.
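How skip patterns, mandatory fields, and restricted answer options can work together is sketched below in Python. This is an illustration only: the study software was written in Visual Basic.Net, and the `Question` class, the `validate` function, and the two example questions are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Answers = Dict[str, str]


@dataclass
class Question:
    name: str
    choices: Optional[List[str]] = None  # None means a free-entry field
    required: bool = True
    # Skip pattern: the question is asked only if this predicate is true.
    ask_if: Callable[[Answers], bool] = lambda answers: True


def validate(questions: List[Question], answers: Answers) -> List[str]:
    """Return a list of problems; an empty list means the form can be saved."""
    problems = []
    for q in questions:
        if not q.ask_if(answers):
            continue  # question skipped, so no answer is expected
        value = answers.get(q.name, "")
        if value == "":
            if q.required:
                problems.append(f"{q.name}: required")
            continue
        if q.choices is not None and value not in q.choices:
            problems.append(f"{q.name}: not an allowed option")
    return problems


# Hypothetical questionnaire fragment.
questions = [
    Question("fever_history", choices=["yes", "no"]),
    Question("fever_days", ask_if=lambda a: a.get("fever_history") == "yes"),
]

print(validate(questions, {"fever_history": "no"}))
print(validate(questions, {"fever_history": "yes"}))
```

In this sketch the duration of fever is requested only when a history of fever was recorded, so a "no" answer yields a complete form while a "yes" answer without a duration is rejected.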
The two remaining sections of the CRF (laboratory results and outcomes) were not replaced with direct data entry methods. These laboratory data were obtained at various times and in locations where PDAs could not always be made available. The paper-based laboratory results and outcome forms were later double-entered into the database as described above.
The data entered in the PDAs were collected by a roving field worker on a secure digital memory card, uploaded to a central data management desktop computer at the end of each day, and integrated into the database. The two completed PDA modules and the two paper forms from each patient were linked through the individual ID number in the data management unit. The data were processed in a relational database environment. Further checks, such as checks of data integrity and inter-record consistency, which could not be implemented in the PDA system, were completed on the central desktop computer immediately after the data were uploaded. Queries were sent back to the hospital or laboratory on the following day for resolution. The data entry staff used the edit module in the PDA system to correct erroneous data. The hospital staff could not change data once they had been entered and saved in the PDA.
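The linkage of the four sections by study ID, together with a simple inter-record completeness check, can be sketched as follows. This Python example is only an illustration of the idea, not the FoxPro data management software used in the study; the function name `link_sections` and the example IDs are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def link_sections(
    sections: Dict[str, Dict[str, Record]]
) -> Tuple[Dict[str, Dict[str, Record]], Dict[str, List[str]]]:
    """Join records from all sections on the study ID.

    `sections` maps a section name to {study_id: record}. Returns the fully
    linked patients and, for the others, the names of the missing sections.
    """
    all_ids = set().union(*(recs.keys() for recs in sections.values()))
    linked: Dict[str, Dict[str, Record]] = {}
    incomplete: Dict[str, List[str]] = {}
    for study_id in sorted(all_ids):
        missing = [name for name, recs in sections.items() if study_id not in recs]
        if missing:
            incomplete[study_id] = missing  # candidate for a query to the site
        else:
            linked[study_id] = {name: recs[study_id] for name, recs in sections.items()}
    return linked, incomplete


# Hypothetical upload: patient P002 has no laboratory or outcome form yet.
sections = {
    "registration": {"P001": {"age": "3"}, "P002": {"age": "27"}},
    "case_record": {"P001": {"temperature": "38.4"}, "P002": {"temperature": "37.9"}},
    "laboratory": {"P001": {"blood_culture": "negative"}},
    "outcome": {"P001": {"status": "discharged"}},
}

linked, incomplete = link_sections(sections)
print(sorted(linked))
print(incomplete)
```

IDs listed as incomplete correspond to the kind of inconsistency that would generate a query to the hospital or laboratory.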
Data security and storage
All PDAs and computers were password-protected and, like the completed CRFs, were kept in a secure locker. All data were transferred on a regular basis from the three study hospitals to the central data management unit and uploaded into the central database. The central database was backed up on a regular schedule.
Comparison
The two data collection methods were compared with regard to training, acceptability, data entry time (minutes per patient file, using an average value of 1.4 min/page), data turnaround time (days), omissions, accuracy, cost (US dollars), and knowledge transfer. User-friendliness, acceptability, and ease of implementation were assessed in informal interviews with staff members. Omissions were defined as missing entries. The percentage of omissions was calculated for a subset of 32 variables, including age, address, history of fever, weight, temperature, heart rate, blood pressure, and clinical signs and symptoms. Accuracy was defined as the absence of typographical errors, decimal point faults, and illogical values. It was determined by assessing the percentage of such errors for the variables most affected by them (glucose, haemoglobin, blood pressure, heart rate, and weight of the blood culture bottle before and after addition of blood).
Cost was calculated from the costs of personnel, hardware, printing, and database development for paper-based data collection and for our electronic data collection (including the cost of the two sections of the CRF that remained on paper). Frequencies were compared using the chi-square test.
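The omission percentage and the per-file data entry time can be made concrete with a short Python sketch. The function names and example records are hypothetical, and the sketch assumes that the 1.4 min/page average is applied to every page of a file in a single entry pass, which the text does not state explicitly.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def omission_percentage(records: List[Record], variables: List[str]) -> float:
    """Percentage of expected entries that are missing (absent, None, or blank)."""
    expected = len(records) * len(variables)
    if expected == 0:
        raise ValueError("no records or no variables to assess")
    missing = 0
    for record in records:
        for variable in variables:
            value = record.get(variable)
            if value is None or str(value).strip() == "":
                missing += 1
    return 100.0 * missing / expected


def entry_minutes(pages: int, minutes_per_page: float = 1.4) -> float:
    """Entry time for one patient file under the stated per-page assumption."""
    return round(pages * minutes_per_page, 1)


# Hypothetical records: 3 of the 8 expected entries are missing.
records = [
    {"age": "3", "weight": ""},
    {"age": "", "weight": "12.0"},
    {"age": "5", "weight": "13.1"},
    {"age": "7"},
]

print(omission_percentage(records, ["age", "weight"]))  # 37.5
print(entry_minutes(14))  # 19.6 for a 14-page CRF under this assumption
```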
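For a comparison of two frequencies, such as the number of omitted entries under each data collection method, the chi-square test reduces to a 2×2 table. The Python sketch below uses the standard Pearson statistic without continuity correction; whether the authors applied a correction is not stated, and the counts shown are hypothetical, not study results.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows = method (paper, PDA), columns = (omitted, complete).
statistic, p_value = chi_square_2x2(10, 20, 20, 10)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

When expected cell counts are small, an exact test would be the more usual choice than this approximation.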