Methodology
HOUSES Index
The HOUSES composite index is derived from individual housing features by linking address information to enumerated real property data that is available from local government assessors' offices.
The program applies principal component factor analysis to real property data features of housing and to neighborhood socioeconomic status (SES) items. The factor analysis results are then pared down to four real property feature variables:
- Housing value.
- Square footage of housing unit.
- Number of bedrooms.
- Number of bathrooms.
In formulating the HOUSES Index, individuals' addresses at the time of interest are geocoded. Geocoding allows users to match study addresses to geographic reference data and to the real property data of a housing unit. Each property item corresponding to an individual's address is standardized into a z-score, and the z-scores for the four variables listed above are aggregated into an overall z-score, such that a higher HOUSES score indicates higher SES. HOUSES is standardized within each county based on the real property data available for a given year, because real property data is ascertained and updated by the county assessor's office for tax purposes. The HOUSES z-score can then be converted to percentiles, quartiles or deciles, if needed. For example, some studies use HOUSES in quartiles, with Q1 representing the underserved population with the lowest HOUSES scores and Q4 representing the population with the highest SES.
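The standardization and quartile steps described above can be illustrated with a minimal Python sketch. It is not the program's actual implementation. It assumes that the four z-scores are aggregated by a simple sum, that standardization uses the population standard deviation within one county and year, and that quartiles are assigned by rank. The field names (`value`, `sqft`, `bedrooms`, `bathrooms`) and the function names are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical field names for the four real property features.
FEATURES = ("value", "sqft", "bedrooms", "bathrooms")


def houses_scores(properties):
    """Return one aggregate z-score per property for a single county and year.

    Assumption: the aggregate is the plain sum of the four feature z-scores.
    """
    # Mean and population standard deviation of each feature in the county.
    stats = {}
    for feature in FEATURES:
        column = [p[feature] for p in properties]
        stats[feature] = (mean(column), pstdev(column))

    scores = []
    for p in properties:
        total = 0.0
        for feature in FEATURES:
            mu, sigma = stats[feature]
            # A feature with no variation contributes nothing to the score.
            total += (p[feature] - mu) / sigma if sigma > 0 else 0.0
        scores.append(total)
    return scores


def quartile(score, all_scores):
    """Map a score to 1..4 (Q1 = lowest) by its rank within the county."""
    below = sum(1 for s in all_scores if s < score)
    return min(4, 1 + (4 * below) // len(all_scores))
```

Under these assumptions, a property that is above the county average on all four features receives a positive score and falls in a higher quartile, which matches the convention that a higher HOUSES score indicates higher SES.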
HOUSES can overcome the paucity of conventional SES measures in commonly used data sets, such as administrative data sets derived from medical records. The unavailability of SES measures in commonly used data sets has been an important impediment to health disparities research.
HOUSES Cloud for the United States
To make HOUSES scalable, an automated cloud-based system that formulates the HOUSES Index using the established algorithms was developed through a National Institute on Aging grant (NIH R21AG065639). Preliminary data showed a nearly perfect match with manual formulation of the HOUSES Index. For example, after an iterative training process with data from Olmsted County, Minnesota, a HOUSES Index was formulated for Ramsey County, Minnesota, where the capital city of St. Paul is located. The HOUSES Index z-scores formulated by the HOUSES Cloud were then compared against gold standard human-formulated data, and the average difference between the two was negligible. Because property data is generally updated once a year, HOUSES can be updated automatically each year so that it more accurately reflects the current socioeconomic status of people living at a given address.
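A comparison of this kind can be sketched as follows. The source does not state which difference measure was used, so this example shows two common choices, the mean signed difference and the mean absolute difference, over paired z-scores for the same addresses. The function names are hypothetical.

```python
from statistics import mean


def _check_paired(cloud, gold):
    # Both lists must hold z-scores for the same addresses, in the same order.
    if len(cloud) != len(gold):
        raise ValueError("cloud and gold must have the same length")


def mean_difference(cloud, gold):
    """Average signed difference (cloud minus gold); reveals systematic bias."""
    _check_paired(cloud, gold)
    return mean(c - g for c, g in zip(cloud, gold))


def mean_absolute_difference(cloud, gold):
    """Average size of the difference, ignoring its direction."""
    _check_paired(cloud, gold)
    return mean(abs(c - g) for c, g in zip(cloud, gold))
```

The signed version can be near zero even when individual scores disagree, because positive and negative errors cancel, so reporting the absolute version alongside it gives a fuller picture of agreement.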
With the availability of the HOUSES Cloud, the HOUSES Index can be calculated for all counties in all 50 states of the U.S. Users authenticate and access the web application to perform HOUSES lookups through search and small batch uploads. Application programming interface (API) clients, such as those with electronic health records, authenticate directly to the API server to perform HOUSES lookup requests. Communication among HOUSES services and the database is secure and nonpublic.