Survey Research and the Internet

Kenneth W. Mentor, J.D., Ph.D.
Department of Sociology and Criminology
University of North Carolina Wilmington

Presentation made at the Annual Meetings of the
American Society of Criminology,
November 2002, Chicago, IL

ABSTRACT
The internet provides a cost-effective means of administering surveys to a large number of respondents. Early surveys relied on e-mail for survey delivery and response. This initial method of internet surveying offered advantages over traditional methods but had many limitations. Web-based surveys offer an attractive alternative to e-mail surveys while providing advantages over traditional survey methodology. Using an online survey as an example, this paper discusses the process of developing and administering a web-based survey. The problems and potentials of this data collection method are discussed as internet-based surveys are compared to more traditional methods.

The Internet provides many new opportunities for survey research. In particular, the internet offers an economical alternative to paper or telephone surveys. However, there are advantages and disadvantages that must be considered when evaluating the viability of Internet research.

Three types of surveys are compared in this presentation. The first type, and the one we are most familiar with, is the mailed or telephone survey. The second type is delivered by e-mail. As discussed below, e-mail surveys can take several forms, typically including the survey in the text of the e-mail or as an attachment. The third type is web-based. Web-based surveys may be announced by e-mail, but the respondent goes to a web site to complete the survey.

This presentation compares these methods and outlines the advantages and limitations of each. It also includes discussion of the specific steps taken in conducting a web-based survey. Finally, the problems and successes of this web-based survey are discussed. In some cases the limitations are based on the newness of the internet; other problems are similar to those faced by researchers using more traditional tools.

Stanton (1998) suggests that sampling problems, response consistency problems, and respondent motivation are the primary issues with Internet data collection. Each of these is discussed below.

1. Sampling Problems

In most cases the entire population cannot be surveyed. Sampling techniques are used to assure that those who are surveyed are representative of the larger population. A variety of sampling techniques are accepted in survey research, although each introduces error. Couper distinguishes coverage error, a "mismatch between the target population and the frame population" (2000:467), from sampling error, which inevitably occurs when not all members of the frame population are, or can be, measured. Phone, email, and web-based surveys each present different sampling problems.

Phone: To generate a sample, the researcher must be able to access phone numbers, or be able to generate numbers, for the entire population and then take a sample. Approximately 90-95% of all households have telephones (Miller, 2001). This increases the likelihood of obtaining an adequate telephone sample that may be generalized to the general population. Phone surveys use a sampling method similar to that used for mailed surveys. In each case it is relatively easy to obtain a sample that is representative of the larger population. Internet survey options present different challenges.

Internet-based Surveys

E-mail: Many people do not have e-mail addresses. Others have multiple addresses. As a result, it is difficult to obtain a random sample that is comparable to the larger population. The researcher can only choose from those with e-mail addresses, and this sub-group may differ from the group as a whole. In comparison to traditional methods:

  • It is possible to establish a list of telephone numbers for telephone surveys, or create a list of addresses for a mail survey. It is difficult to locate or create a list of email addresses.
  • Random generation of telephone numbers is common in sampling. There is no method at present that allows for random construction of valid email addresses.
  • A comprehensive listing of all email addresses for the entire Internet population is non-existent.
  • Researchers can track the status of phone contacts. Mail surveys can provide return verification when the address is not valid. In contrast, e-mail is sent in the hope that the intended recipient is contacted. There is no reliable way to determine whether the e-mail was delivered and/or read.
  • Assuming the number or address is valid, phone calls or mail typically reach the intended recipient. The recipient decides how to respond to the call or mail upon receipt. E-mail may not reach the recipient due to several factors, not all of which are under the recipient's control. E-mail programs may be configured to automatically delete certain messages before they are delivered. Another possibility is that Internet Service Providers may block an emailed survey perceived as "spam." 

Web-page survey as alternative to e-mail: This method relies on a level of computer knowledge that is a step above receiving and replying to an e-mail. If the initial contact and invitation to participate are provided in an e-mail, the issues outlined above remain.

As with e-mail surveys, the sample group is not randomly drawn. The demographic patterns of Internet users result in a sample that varies from the larger population. For example, 1998 Current Population Survey data provided by the U.S. Census Bureau indicate that while 42.3 percent of U.S. households had at least one computer, only 25.6 percent of all households reported having access to the Internet from home. The same survey found that Internet users generally come from households earning $75,000 and higher, and tend to be Anglo and highly educated. Other research has documented a range of demographic characteristics among internet users.

  • Internet users are generally younger than the population average (Couper, 2000).
  • Stanton (1998) reports that 66% of Internet users are male, and that half of all users are professionals or managers.
  • Internet users are more politically interested and active, voting at higher levels than non-users (Robinson and Kaye, 2000).
  • However, Internet respondents are geographically dispersed, enabling their data to be linked to the U.S. Census (Bainbridge, 1999).
  • Further, several studies have concluded that there are no significant response biases between email and mail respondents (Dommeyer and Moriarty, 2000).

Given the problems inherent in any effort to randomly sample Internet users, online surveys appear more appropriate when non-probability sampling will meet the requirements of the research. Kaye and Johnson (1999) reach the same conclusion, noting that there is no mechanism for randomly sampling Internet users. Similarly, Weible and Wallace (1998) argue that email surveys are only practical for specific or target groups.

E-mail would be useful for pre-testing survey instruments, where sample validity is not as critical (Weible and Wallace, 1998). It may also be argued that internet-based surveys are appropriate when the population of interest has demographic characteristics similar to those of internet users, for example, student populations with easy access to the internet.

2. Response Consistency/Cost Comparison

Several studies analyzed response rates of e-mail and mail surveys. E-mail surveys typically fail to reach the response rates of mail surveys. Research suggests two reasons for this:

  • Confidentiality and anonymity: E-mail addresses are readily available and lack confidentiality (Bainbridge, 1999; Dommeyer and Moriarty, 2000). Arguably, another issue is a general distrust of internet privacy. Many people are suspicious of internet technology and may overestimate the potential for identification of individual internet users.
  • Delivery of the survey: Weible and Wallace (1998) report that the undeliverable rate for a mail survey was 2 percent, as compared to 19.5 percent for an email survey and 24.5 percent for a web-based survey.

Other issues include response time, response rate, accuracy, and the relative costs of traditional postal mail surveys and email or web-based surveys.

  • The average response time for the email group was only 18 days, compared to 33 days for the postal mail group. In addition, savings were realized, since the cost of a postal mail survey was found to be 27% higher than that of an email survey (Raziano, Jayadevappa, Valenzula, Weiner, and Lavizzo-Mourey, 2001).
  • Other research calculated the cost of a postal survey at three times that of an e-mail or web-based survey. These authors also pointed to cost savings by suggesting that responses can be increased with very little cost by increasing the sample size of an e-mail survey (Weible and Wallace, 1998).
  • Current research indicates a response rate ranging from 70-75% for postal surveys. In contrast, response rates for email surveys range from 34-76%, a much larger range (Raziano et al., 2001).
  • Targeted email surveys produce data from a known sample of respondents. In one case a targeted email survey found that response rates were comparable to traditional postal survey response rates when offering incentives and follow-up contacts (Stanton, 1998). In effect, response rates can be increased by using strategies developed for more traditional methods of survey research.
  • Stanton (1998) surveyed 231 respondents. Fifty completed a web-based survey and 181 completed a paper version. His research concluded that the web-based data contained fewer missing values than the data from the conventional survey. This is another area for significant cost savings. Web-based surveys can be designed to create a database that can be quickly imported into SPSS or other programs, reducing error while speeding up the data entry process.

3. Respondent Motivation

There is no mechanism to prevent a respondent from answering when tired, bored, or intoxicated, which could affect the accuracy of the responses. Variations in psychological state could also result in missing data or bias (Stanton, 1998).

Internet survey respondents were less likely to use scale endpoints when answering, choosing answers at the "definite" ends less often than "probably" or "maybe" (Miller, 2001). Similarly, Taylor (2000) found that, compared to surveys administered aloud, fewer people choose the extreme ends of a scale.

However, open-ended questions in quantitative studies produce more detailed replies in Internet surveys, and those replies may also be more revealing (Curasi, 2001). Replies to open-ended questions tend to be richer and longer (Taylor, 2000). Taylor suggests that respondents may be more willing to address sensitive issues in an internet-based survey.

Comparing Internet Surveys

Three general types:

Embedded surveys are those that are included in the body of an email sent to the survey respondent. This type of survey allows the respondent to answer with relative ease: they must simply reply to the email, including the original survey in their reply, and answer the questions in the space provided.

Embedded surveys yield a significantly higher response rate than attached surveys, 37 percent as compared to 8 percent. However, there were no significant differences in response speed, items omitted, or bias. A disadvantage is that there is limited ability to affect the visual appearance of the survey, and graphics are not an option (Dommeyer and Moriarty, 2000).

Attached document surveys are those that are attached as a document to an email. This type of survey is more complicated, requiring that the respondent have the knowledge to complete multiple steps to retrieve, complete, and return the survey.

Attached survey programs allow the survey to be completed without leaving the e-mail program. This type of survey requires programming knowledge that may be beyond the researcher's capability, necessitating the expense of hiring programmers to design the survey.

Each of the attachment options leads to potential non-response problems since many Internet users know viruses are often delivered in attachments. This may cause respondents to delete these emails without opening the attachment. In other cases organizations have policies that require that all attachments be removed before delivery. In spite of these problems this type of survey has the advantage of allowing for format and appearance changes, making the survey more visually pleasing.

Web-based surveys are often introduced through emails sent to a potential sample set. The e-mail invites participation and provides instructions. These surveys may include color graphics, audio, and skip patterns.

A distinct advantage is that responses can be captured directly in a database and read into the statistical program, which may reduce data entry error.
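To illustrate this advantage, a minimal sketch follows. It assumes the web survey appends each submission as a row in a comma-delimited file with a column for each item; the file name, column names, and the use of Python are illustrative assumptions rather than a description of any particular survey package.

    # A minimal sketch: tabulate one item directly from the survey's exported
    # results file, with no manual data entry step.
    # The file and column names (results.csv, q01) are hypothetical.
    import csv
    from collections import Counter

    with open("results.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    # Frequency count for item q01, exactly as respondents entered it.
    counts = Counter(row["q01"] for row in rows)
    for answer, n in sorted(counts.items()):
        print(answer, n)

The same export could instead be opened directly in SPSS, Access, or Excel; the point is simply that no retyping step stands between the respondent and the analysis.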

Web-based surveys are available to anyone who locates the page through a browser. Access control techniques, including passwords or other access keys, may be used to limit access. This requires the time and expense of customized programming. Further, when a respondent uses an individualized password, provided via the invitational email, anonymity may be lost. 

Summary

Internet research is promising and has several advantages, namely cost savings, faster response times, and the ability to sample large geographic areas for linking to the U.S. Census. However, problems of sampling error, demographic skew, and overall response rates reduce the ability to generalize the results and, as a result, the confidence that can be placed in them. Key findings and questions include:

  • Web-based surveys offer distinct advantages over e-mail surveys.
  • These surveys may not be appropriate for all populations.
  • Students and others with easy internet access, combined with relatively strong computer skills, may be good targets for this type of survey.
  • Will web-based surveys always be dismissed as "convenience samples?"
  • What methods can be used to minimize sampling issues?

Web-based Survey of Criminal Justice Faculty

In an examination of the effectiveness of web-based survey research, Criminal Justice faculty throughout the United States were invited to participate in a web-based survey. This survey assessed opinions regarding distance education. The results of the survey are presented elsewhere. For now, we focus on the process of developing and administering a web-based survey.

The Sample

A list of criminal justice programs, along with web sites and contact information, was assembled in January 2002. Contacts included faculty, department heads, and, in a few cases, admissions office personnel. The contacts were determined by reviewing program information provided by the various institutions. Programs were identified by examining various internet listings and searching for programs by state. The final list includes 172 programs that offered criminal justice, justice studies, or related degrees. Each of the 172 contact people was invited to participate in the survey.

The Survey

The survey was developed using Microsoft FrontPage. Although the learning curve can be a bit steep, the program enables a "non-programmer" to create online surveys. The process is somewhat repetitive and time consuming but can be completed with minimal expense. The survey tools provided in FrontPage require that the survey be published on a server with "FrontPage extensions" installed. In addition, database support is required so that the results can be collected in a form that can be imported into Access, Excel, SPSS, or another data management program.

The survey contains 45 items plus an open-ended question at the end. Most questions follow a similar format and are answered by clicking on circles located above each possible response. You are welcome to complete the survey, although your responses will not be used in the data analysis.

Invitation to Participate

The initial contact of each institution, through e-mail, occurred in January 2002. Each institution was asked to verify the information we had collected. In addition, we asked about current and planned distance education courses or degrees. Information about these offerings is included with the program listings.

Based on replies from the programs, contact information was revised after the initial correspondence. This list of e-mail addresses was used for the web-based survey discussed in this presentation. On May 3, 2002, an invitation e-mail was sent to the contact person at each of the programs. A copy of the e-mail is included below:

May 3, 2002

Criminal Justice Educator,

Over the past few months we have been working to identify all Criminal Justice related programming in the nation. You may remember my last e-mail requesting information regarding your institution's program offerings and whether your institution offers, or plans to offer, distance courses or degrees. Thank you for your participation. The results of this research can be found at http://cjstudents.com/cj_programs.htm

I have been working on my research with Dr. Kenneth Mentor. We are especially interested in issues related to distance education in the field of criminal justice. We would like to continue our research by asking you to fill out a short survey.

Prior to May 10, please take a few minutes to go to http://www.cjstudents.com/cjdist02/index.htm and complete our online survey. The single page survey should take about 5 minutes to complete. The online survey allows us to collect data without identifying the respondents. The survey is not password protected so anyone can participate. In order to determine whether the participant is part of the invited sample, we are asking for a verification word at the top of the survey. This word will be used by all invited participants and can not be used to identify individuals. The verification word is "snow." Please enter this word at the beginning of the survey.

Thank you for being of assistance in our research.

Sincerely,

Jennifer Lovett
Criminal Justice Graduate Student
New Mexico State University

Kenneth Mentor
Assistant Professor
New Mexico State University

Reminder e-mails were sent on May 9 and May 21. Knowing that some of these contacts were with admissions or other administrative entities, we asked that the invitation be forwarded to the appropriate person when necessary. In some cases recipients replied that they had already completed the survey. We replied with a thank you and an apology for sending a reminder.

The May 21 reminder informed respondents that the response rate, at that time, was "around 25 percent." Recipients were told that we "would like to increase that rate and are sending this final reminder in the hope that those who have not completed the survey can spare a few minutes to help with our research. We are sorry to bother you with repeated requests and will not be contacting you again."

Note that the invitation includes a "verification word" that is to be entered at the beginning of the survey. The original plan was to password protect the directory so that only those with an invitation would be allowed access. Programming skill, and/or server configuration, became a problem at this point. In spite of numerous efforts and a string of communications with the web provider, the password access would not work. The password idea was abandoned, making the web site accessible to all. In order to recognize valid entries the "verification word" was provided to those who were invited to participate in the survey.
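A minimal sketch of how valid entries might be screened follows. It assumes the exported results include columns holding the verification word and the respondent's IP address; the column names, and the use of Python rather than the FrontPage results database itself, are illustrative assumptions.

    # A minimal sketch: keep only submissions that entered the verification
    # word ("snow") and flag possible repeat submissions from one address.
    # Column names (verify, ip_address) are hypothetical.
    import csv
    from collections import Counter

    with open("results.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    valid = [r for r in rows if r["verify"].strip().lower() == "snow"]

    # Dynamic IP addressing makes this only a rough check, so repeats are
    # flagged for review rather than dropped automatically.
    ip_counts = Counter(r["ip_address"] for r in valid)
    repeats = [ip for ip, n in ip_counts.items() if n > 1]
    print(len(valid), "valid submissions;", len(repeats), "addresses appear more than once")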

Response Rates and Other Issues

Between May 3 and May 29, 2002, the survey was completed by 63 respondents. All but 6 entered the correct verification word. One respondent entered the correct word but did not complete any other items. Two other respondents also entered the verification word but did not complete the survey on the first try. Each immediately tried again and successfully completed the survey.

Out of the initial group of 172, the survey was successfully completed by 54 respondents. The response rate of 31 percent was lower than we would have liked, but we acknowledge that there were problems with the initial sample. Several of the addresses were out of date, even though they had been verified a few months earlier. These e-mails came back undeliverable. The undeliverable rate was less than 10 percent, which is lower than found in previous internet surveys (Weible and Wallace, 1998).

Another problem was that the survey did not always reach a criminal justice faculty member. Several people replied that they were employed in admissions or in another capacity and they would not complete a survey that was intended for criminal justice educators. As with other forms of survey research, we can only speculate as to why others did not complete the survey.

Raziano et al. (2001) reported a large range of response rates for e-mail surveys. The response rate for the present research was at the low end of this range. This rate would most likely be improved with a larger, and more focused, sample. For example, a sampling of ASC or ACJS members would probably result in a higher response rate.

Arguably, response rate becomes less of an issue with e-mail and internet surveys since the cost of increasing the sample size is minimal. In contrast, the cost of increasing the sample size in a mail survey rises quickly, and the researcher is forced to make trade-offs. Using the ASC/ACJS example, e-mail contact of the entire membership would cost the same as sending an e-mail to a percentage of the members.

Klez?

The Klez worm began making the rounds in early May. This worm was highly publicized, struck many universities, and was likely to make people especially suspicious of unsolicited e-mail. This may have had a negative impact on return rates.

The worm began circulating in the days immediately following our initial e-mail. The initial invitation was sent on May 3, and the first of many failed attempts to deliver the Klez worm to my computer occurred on May 5. I know that the worm was not present on my computer or on my research assistant's computer.

However, I became suspicious of the timing and wondered whether our e-mail invitations had anything to do with the sudden Klez attacks. I recognized many of the names that were sending me the infected attachment. Using e-mail addresses and other available information, I searched for information about those I did not know. In nearly every case, the sender had some connection to the criminal justice system. Most were criminal justice educators. I worried that our mailing, although I know it did not contain the worm, may have been used to send the worm to the list of participants who were contacted about the survey.

As we know, this worm was sent as an e-mail attachment and spread very rapidly. Like many viruses, this worm takes addresses from an e-mail program (especially Outlook) and sends mail to all addresses. The worm also picks one of the addresses as the sender, so the mail does not actually come from the person listed in the "from" box. I received this virus many times over the next few weeks. I also received several messages indicating that mail I had sent had not reached the intended recipient. Since I had not tried to contact these people, I believe my e-mail address may have been listed as the sender of a Klez-initiated e-mail.

These Klez attempts were very frequent over the next few weeks. Since my research assistant's computer was not similarly attacked, it seems logical to assume that our e-mail had nothing to do with the efforts of this aggressive worm. However, I remain curious about the fact that so many of these e-mails came from criminologists and others associated with the justice system. I cross-checked the names with those who were sent the invitation and could find no pattern of overlap. While at this time it appears that this was merely a coincidence, I would be curious to hear from other criminologists who received multiple Klez attacks during this time period.

Returning to the issue of return rates, if the criminal justice educators who were invited to participate in this survey were also struck with the Klez worm, it is possible that they assumed our unsolicited e-mail was the culprit. This is not a good way to motivate strangers to take the time to fill out an internet survey. Researchers often include a monetary reward or a token gift in the hope that the recipient will complete the survey. Delivery of the Klez worm, or perhaps even a suspicion about the timing of two unrelated events, would have had the opposite effect.

Response Dates and Reminders

The initial invitation to participate was sent to potential respondents late in the afternoon of May 3. The first reminder was sent late in the day on May 9. A final reminder was sent on the morning of May 21. Visits to the survey web site peaked just after these mailings, with the highest response rates occurring on May 6 (11 respondents), May 10 (12), and May 21 (17). No other day had more than five respondents (May 7 and May 22).

The reminders clearly increased the response rate. The final reminder, in which the estimated response rate was mentioned, resulted in an increase of 25 respondents. The initial invitation and the first reminder were descriptive, while the final reminder included a strong plea for assistance. This plea, and its timing, nearly doubled the response rate.

Raziano et al. (2001) reported an average response time of 18 days for an e-mail survey. In the present research responses peaked on certain days while the web site received few visits on other days. No responses were entered on days 13 through 18. The final reminder motivated respondents to return to the site on the 19th and 20th day but only two more respondents completed the survey after the 20th day.
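A short sketch of how this daily pattern might be tallied follows, assuming each stored submission carries a timestamp field; the field name and its format are illustrative assumptions.

    # A minimal sketch: count completed surveys per calendar day so that
    # spikes can be compared with the invitation and reminder dates.
    # The timestamp column name and format (submitted, MM/DD/YYYY HH:MM)
    # are hypothetical.
    import csv
    from collections import Counter
    from datetime import datetime

    with open("results.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    per_day = Counter(
        datetime.strptime(row["submitted"], "%m/%d/%Y %H:%M").date()
        for row in rows
    )
    for day in sorted(per_day):
        print(day, per_day[day])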

Demographics

The survey was completed by 36 males and 17 females. Respondent age was assessed on an eight-point scale. No respondents were under 26 years old. Other results for age:



AGE          Frequency    Percent
26-30                2        3.6
31-35                9       16.4
36-40                4        7.3
41-45                7       12.7
46-50               10       18.2
51-60               19       34.5
over 60              2        3.6
Total               53       96.4

Respondents were asked to report their academic rank:



RANK                      Frequency    Percent
Full Professor                   17       30.9
Associate Professor              16       29.1
Assistant Professor              12       21.8
Full time - Non-tenure            6       10.9
Part time - Non-tenure            2        3.6
Total                            53       96.4

Respondents were also asked about the highest degrees offered in their department:



DEGREE          Frequency    Percent
Ph.D.                  10       18.2
Master's               25       45.5
Bachelor's             17       30.9
Associate's             1        1.8
Total                  53       96.4

These demographics indicate that a range of programs and educators were reached through this method of data collection. The programs grant degrees ranging from Associate's to Doctoral. It would appear that there is some bias toward Doctoral programs as community colleges and other smaller institutions are under-represented in the sample.

The sample included more men than women, as might be expected given statistics regarding internet usage (Stanton, 1998). However, these statistics would also predict that a sample of internet users would be younger than the general population (Couper, 2000). In the current survey 60% of the respondents were older than 45. The methods used in identifying the contact people for various programs most likely resulted in the names of department heads and senior faculty. This could be expected to be an older population. This appears to be verified by the fact that the majority of respondents hold Full or Associate faculty ranks.

IP Addresses and Anonymity

The entry page for the online survey includes the following statement:

As you know, we used e-mail to contact you and other educators. We have no way of knowing which of these educators eventually visit this page. We cannot identify you, your institution, or any other factors that could be used to identify individual respondents. Responses to this survey are anonymous and will be analyzed in aggregate form.

Upon collecting the data, we must admit that this statement may not be entirely true. We did not mean to be dishonest - this statement was made without full knowledge of the potential for identifying some information about the respondents. We apologize for making an inaccurate statement.

The database allowed for the collection of IP addresses. This information allowed the researcher to examine individual responses and determine whether individuals had completed the survey more than once - assuming the respondent used the same computer for each visit to the site. This was helpful information but came at a cost to anonymity. All IP addresses follow the same format. They include 4 sets of numbers separated by periods. An IP address looks like this:

123.123.123.123

The first two sets of numbers may, in some cases, be used to identify the ISP (internet service provider) or university that provides access to the internet. Since many institutions use "dynamic" IP addressing, in which an address is assigned from a pool each time a computer connects, most individual computers do not keep a unique IP address that could be used to identify individuals. In some cases computers are assigned a "static" IP address. It may be possible to identify an individual computer user by having access to this address.

The majority of the statement regarding anonymity is true. Responses to the survey were anonymous and were analyzed in aggregate form. Regrettably, it is not accurate to say that there is no way to identify the individual institution. The issue of IP address collection presents problems for researchers who want to promise complete anonymity.

The distance education survey developed for this research does not provide an opportunity to make embarrassing or otherwise harmful statements. This is not always the case, especially in criminal justice related research. Researchers may need to identify online survey methods that prevent the collection of IP addresses. This may involve a third party that assures anonymity through filtering or encryption programs. For now, this issue is unresolved.
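One possible approach, not used in the study described here, is sketched below: instead of storing raw IP addresses, the survey could store a salted one-way hash, which still allows repeat submissions to be flagged but does not retain an address that might identify an institution. The function and salt value are hypothetical, and because the space of possible addresses is small, hashing reduces rather than eliminates the risk of identification.

    # A minimal sketch: replace each IP address with a salted one-way hash
    # before it is stored. The salt would be kept secret by the researcher.
    import hashlib

    SALT = "replace-with-a-secret-value"  # hypothetical secret

    def masked_ip(ip_address):
        """Return a token that is stable for repeat visits but not reversible."""
        return hashlib.sha256((SALT + ip_address).encode("utf-8")).hexdigest()

    seen = set()
    for ip in ["123.123.123.123", "123.123.123.123", "10.0.0.7"]:
        token = masked_ip(ip)
        if token in seen:
            print("possible repeat submission")
        seen.add(token)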

Conclusion

Remember that sampling problems, response consistency problems, and respondent motivation have been identified as the primary issues with Internet data collection. The experience described above represents an effort to minimize these problems through the use of a simple online survey. Online surveys are relatively inexpensive tools for data collection and have the potential to be powerful when used in appropriate situations. No survey methodology is perfect, and online surveys are no exception. However, the ease of construction and use, combined with low cost, makes this method of data collection a very attractive option. As such, we can expect greater reliance on web-based data collection in the future. Along with this reliance, we can expect the rapid evolution of the internet to continue, bringing new tools that are likely to bring web-based surveys to the point where the data collected are regarded as equal in validity to data generated by more traditional methods.

Bibliography

Bainbridge, W.S. (1999). International network for integrated social science. Social Science Computer Review, 17(4), 405-420.

Couper, M.P. (2000). Web surveys: A review of issues and approaches. Public Opinion Quarterly, 64(4), 464-494.

Curasi, C.F. (2001). A critical exploration of face-to-face interviewing vs. computer-mediated interviewing. International Journal of Market Research, 43(4), 361+.

Dommeyer, C.J. & E. Moriarty. (2000). Comparing two forms of an e-mail survey: Embedded vs. attached. Journal of the Market Research Society, 42(1), 39-50.

Kaye, B.K. & T.J. Johnson. (1999). Research methodology: Taming the cyber frontier. Social Science Computer Review, 17(3), 323-337.

Miller, T.W. (2001). Can we trust the data of online research? Marketing Research, 13(2), 26-32.

Raziano, D.B., R. Jayadevappa, D. Valenzula, M. Weiner, & R. Lavizzo-Mourey. (2001). E-mail versus conventional postal mail survey of geriatric chiefs. The Gerontologist, 41(6), 799-804.

Robinson, T.J. & B.K. Kaye. (2000). Using is believing: The influence of reliance on the credibility of online political information among politically interested Internet users. Journalism and Mass Communication Quarterly, 77(4), 865-879.

Stanton, J.M. (1998). An empirical assessment of data collection using the Internet. Personnel Psychology, 51(3), 709-725.

Taylor, H. (2000). Does Internet research work? Journal of the Market Research Society, 42(1), 51-63.

Weible, R. & J. Wallace. (1998). Cyber research: The impact of the Internet on data collection. Marketing Research, 10(3), 19-24+.


Creative Commons License