Completeness and Reliability of Location Data Collected on the Web: Assessing the Quality of Self-Reported Locations in an Internet Sample of Men Who Have Sex With Men
Background: Place is critical to our understanding of human immunodeficiency virus (HIV) infections among men who have sex with men (MSM) in the United States. Within the scientific literature, however, place is almost always represented by residential location, which implicitly assumes that neighborhood of residence, place of risk, and place of prevention are equivalent. Yet the locations of behaviors among MSM show significant spatial variation, and theory has posited the importance of nonresidential contextual exposures. The focus on residential locations has been at least partially necessitated by the difficulty of collecting the detailed geolocated data required to explore nonresidential locations.
Objective: This study examines the completeness and reliability of data collected with a Web-based map tool on locations that may be relevant to the daily lives and health behaviors of MSM.
Methods: MSM were recruited on the Web and completed a Web-based survey. Within this survey, men used a map tool embedded within a question to indicate their homes and multiple nonresidential locations, including locations of work, sex, socialization, and physician visits, among others. We assessed data quality by examining data completeness and reliability. We used logistic regression to identify demographic, contextual, and location-specific predictors of answering all eligible map questions and of answering specific map questions. We assessed data reliability by comparing selected locations with other participant-reported data.
Results: Of 247 men completing the survey, 167 (67.6%) answered the entire set of eligible map questions. Each specific map question was answered by more than 80% of participants, with sex locations being the least reported (80.6%). Participants with no college education were less likely than those with a college education to answer all map questions (prevalence ratio, 0.4; 95% CI, 0.2-0.8). Participants who reported sex at their partner’s home were less likely to indicate the location of that sex (prevalence ratio, 0.8; 95% CI, 0.7-1.0). Overall, 83% of participants placed their home’s location within the boundaries of their reported residential ZIP code. Of locations having a specific text description, the median distance between the participant-selected location and the location determined using the specific text description was 0.29 miles (25th and 75th percentiles, 0.06-0.88).
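The distance-based reliability check reported above can be illustrated with a minimal sketch. It assumes that each participant-selected point and each point derived from the text description is available as a latitude/longitude pair in decimal degrees, and it uses the haversine great-circle formula as a stand-in; the abstract does not state which distance method the authors used. The function names, the Earth radius constant, and the choice of quartile method are illustrative assumptions, not the study's implementation.

```python
import math
from statistics import quantiles

# Mean Earth radius in miles (an assumed constant for this sketch).
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def summarize_distances(pairs):
    """Return (25th percentile, median, 75th percentile) of distances in miles.

    `pairs` is a list of ((lat, lon), (lat, lon)) tuples: the map-selected
    point and the point derived from the text description (hypothetical input
    format). At least two pairs are required.
    """
    distances = [haversine_miles(*selected, *described)
                 for selected, described in pairs]
    q1, median, q3 = quantiles(distances, n=4, method="inclusive")
    return q1, median, q3
```

Different percentile conventions give slightly different quartiles on small samples, so a reimplementation would not necessarily reproduce the reported 0.06-0.88 range exactly.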
Conclusions: Using this Web-based map tool, this Web-based sample of MSM was generally willing and able to provide accurate data regarding both home and nonresidential locations. This tool provides a mechanism to collect data that can be used in more nuanced studies of place and sexual risk and preventive behaviors of MSM.