The sqpr package gives easy access to the API of the Survey Quality
Prediction website, a database that contains over 40,000 predictions
on the quality of questions.
Following feedback received at the ESRA 2019 conference, the sqpr
package has been split into two packages. The first (sqpr) is
responsible only for obtaining data from the SQP API, and the second
will implement all the measurement error corrections separately. This
way, measurement error corrections can be applied independently of the
SQP software by anyone who has other survey quality information. This
new package is still under construction and will be uploaded as soon as
it is in working order.
Currently, the sqpr package is not usable. It is being tested against
the new SQP 3.0 API rather than the published SQP website.
If you have any questions, please contact us at cimentadaj@gmail.com.
sqpr is not currently on CRAN, but you can install the development
version from GitHub with:
# install.packages("devtools")
devtools::install_github("sociometricresearch/sqpr")
Register on the SQP website and confirm your registration through your email.
First, load the package in R and provide your registered credentials.
library(sqpr)
sqp_login('your username', 'your password')
For details on the login process see the Accessing the SQP 3.0 API vignette
from the package.
Once you’ve run sqp_login(), you’re all set to work with the SQP 3.0
API! There is no need to run it again unless you close the R session.
To explore the SQP 3.0 API quickly, get_sqp will be your main
function. Assuming you know the study, question, country and language
that you’re looking for, you can make one call to the SQP 3.0 API. Let’s
try to get the question tvtot in Round 1 for Spain in Spanish.
sp <-
  get_sqp(
    study = "ESS Round 1",
    question_name = "tvtot",
    country = "es",
    lang = "spa"
  )
sp
#> # A tibble: 1 x 4
#> question reliability validity quality
#> <chr> <dbl> <dbl> <dbl>
#> 1 tvtot 0.731 0.939 0.686
The country must be given as a two-letter code and the language as a three-letter code (here, "es" and "spa"). A quick web search will turn up lists of two-letter country codes and three-letter language codes.
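As a rough illustration of the expected shape of these codes, the sketch below checks only their length and that they consist of letters. The helper check_codes is hypothetical and not part of sqpr; it does not verify that a code actually exists in the SQP database.

```r
# Hypothetical helper, not part of sqpr: it checks the shape of the
# codes only (two letters for country, three letters for language).
check_codes <- function(country, lang) {
  ok_country <- grepl("^[A-Za-z]{2}$", country)
  ok_lang <- grepl("^[A-Za-z]{3}$", lang)
  ok_country && ok_lang
}

valid <- check_codes("es", "spa")    # codes used in the example above
swapped <- check_codes("esp", "es")  # lengths the wrong way around

print(valid)
print(swapped)
```

A check like this can catch swapped arguments before a request is sent.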
get_sqp also lets you request several questions at once:
sp <-
  get_sqp(
    study = "ESS Round 1",
    question_name = c("tvtot", "trstprl", "ppltrst"),
    country = "es",
    lang = "spa"
  )
sp
#> # A tibble: 3 x 4
#> question reliability validity quality
#> <chr> <dbl> <dbl> <dbl>
#> 1 tvtot 0.731 0.939 0.686
#> 2 ppltrst 0.724 0.956 0.692
#> 3 trstprl 0.828 0.902 0.747
Additionally, you can also use regular expressions:
sp <-
  get_sqp(
    study = "ESS Round 1",
    question_name = "^tv",
    country = "es",
    lang = "spa"
  )
sp
#> # A tibble: 2 x 4
#> question reliability validity quality
#> <chr> <dbl> <dbl> <dbl>
#> 1 tvtot 0.731 0.939 0.686
#> 2 tvpol NA NA NA
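To preview which names a pattern such as "^tv" would select, you can try it with base R’s grepl on a vector of names. This is only a sketch: the vector below is made up for illustration, and it assumes get_sqp interprets the pattern the same way grepl does, which is not confirmed here.

```r
# Hypothetical question names, for illustration only
question_names <- c("tvtot", "tvpol", "ppltrst", "prtvtxx")

# "^tv" anchors the match at the start of the name
starts_with_tv <- question_names[grepl("^tv", question_names)]

# "tv" without the anchor matches anywhere in the name
contains_tv <- question_names[grepl("tv", question_names)]

print(starts_with_tv)
print(contains_tv)
```

The anchored pattern keeps only names that begin with tv, while the unanchored one also picks up names such as prtvtxx.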
The examples above assume you already know which studies and
country/language combinations are available. Alternatively, you can
query all the questions in a specific study to check whether a specific
question has quality predictions. Use find_studies to check whether
your study is in the SQP 3.0 database.
find_studies("ESS Round 4")
#> # A tibble: 1 x 2
#> id name
#> <int> <chr>
#> 1 4 ESS Round 4
Great! Once we have that, we can use it to find all of its questions
with find_questions. find_questions accepts the study that you’re
looking for and a string that specifies the questions that you’re
looking for. Let’s search for all questions that have tv in the name:
q_ess <- find_questions("ESS Round 4", "tv")
That might take a while because it’s downloading all of the data to your
computer. If you want to know all the questions in that study
beforehand, use get_questions("ESS Round 4").
Let’s narrow the results down to the questions asked in Spanish:
sp_tv <- q_ess[q_ess$language_iso == "spa", ]
sp_tv
#> # A tibble: 3 x 5
#> id study_id short_name country_iso language_iso
#> <int> <int> <chr> <chr> <chr>
#> 1 7999 4 TvTot ES spa
#> 2 27699 4 TvPol ES spa
#> 3 27638 4 PrtVtxx ES spa
The hard part is done now. Once we have the id of each question of
interest, we supply those ids to get_estimates and it will return the
quality predictions for those questions.
predictions <- get_estimates(sp_tv$id)
predictions
#> # A tibble: 3 x 4
#> question reliability validity quality
#> <chr> <dbl> <dbl> <dbl>
#> 1 tvtot 0.713 0.926 0.66
#> 2 tvpol NA NA NA
#> 3 prtvtxx NA NA NA
get_estimates returns all question names in lower case to increase the
chances that they match the names in the study’s questionnaire.
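A minimal sketch of why lower-case names help when matching predictions to your own data. The predictions data frame below is typed in by hand from the output shown above rather than fetched from the API, and survey_columns is a hypothetical set of column names.

```r
# Predictions typed in from the printed output above (not fetched live)
predictions <- data.frame(
  question = c("tvtot", "tvpol", "prtvtxx"),
  quality = c(0.66, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical column names from a survey data set, in mixed case
survey_columns <- c("TvTot", "TvPol", "PplTrst")

# Lower-case the survey names before matching against the predictions
matched <- predictions[predictions$question %in% tolower(survey_columns), ]

# Keep only the matched questions that actually have a quality prediction
usable <- matched[!is.na(matched$quality), ]

print(matched)
print(usable)
```

Filtering out the NA rows at the end leaves only the questions for which a quality prediction is available.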