Package 'careless' reference manual

Title:	Procedures for Computing Indices of Careless Responding
Description:	When taking online surveys, participants sometimes respond to items without regard to their content. These types of responses, referred to as careless or insufficient effort responding, constitute significant problems for data quality, leading to distortions in data analysis and hypothesis testing, such as spurious correlations. The 'R' package 'careless' provides solutions designed to detect such careless / insufficient effort responses by allowing easy calculation of indices proposed in the literature. It currently supports the calculation of longstring, even-odd consistency, psychometric synonyms/antonyms, Mahalanobis distance, and intra-individual response variability (also termed inter-item standard deviation). For a review of these methods, see Curran (2016) <doi:10.1016/j.jesp.2015.07.006>.
Authors:	Richard Yentes [cre, aut] , Francisco Wilhelm [aut]
Maintainer:	Richard Yentes <[email protected]>
License:	MIT + file LICENSE
Version:	1.2.2
Built:	2025-02-23 03:31:18 UTC
Source:	https://github.com/ryentes/careless

careless: A package providing procedures for computing indices of careless responding

Description

Careless or insufficient effort responding, i.e. responding to items without regard to their content, is a common occurrence in surveys. These types of responses constitute significant problems for data quality, leading to distortions in data analysis and hypothesis testing, such as spurious correlations. The R package careless provides solutions designed to detect such careless / insufficient effort responses by allowing easy calculation of indices proposed in the literature. It currently supports the calculation of Longstring, Even-Odd Consistency, Psychometric Synonyms/Antonyms, Mahalanobis Distance, and Intra-individual Response Variability (also termed Inter-item Standard Deviation).

Statistical outlier function

mahad computes Mahalanobis Distance, which gives the distance of a data point relative to the center of a multivariate distribution.

Consistency indices

evenodd computes the Even-Odd Consistency Index. It divides unidimensional scales using an even-odd split; two scores, one for the even and one for the odd subscale, are then computed as the average response across subscale items. Finally, a within-person correlation is computed based on the two sets of subscale scores for each scale.
psychsyn computes the Psychometric Synonyms Index, or, alternatively, the Psychometric Antonyms Index. Psychometric synonyms are item pairs with a high positive correlation, whereas psychometric antonyms are item pairs with a high negative correlation. A within-person correlation is then computed based on these item pairs.
psychant is a convenience wrapper for psychsyn that computes psychometric antonyms.
psychsyn_critval is a helper designed to set an adequate critical value (i.e. magnitude of correlation) for the psychometric synonyms/antonyms index.

Response pattern functions

longstring computes the longest (and optionally, average) length of consecutive identical responses given.
irv computes the Intra-individual Response Variability (IRV), the "standard deviation of responses across a set of consecutive item responses for an individual" (Dunn et al. 2018).

Datasets

careless_dataset, a simulated dataset with 200 observations and 10 subscales of 5 items each.
careless_dataset2, a simulated dataset with 1000 observations and 10 subscales of 10 items each.

The sample datasets differ in the types of careless responding simulated.

Author(s)

Richard Yentes [email protected], Francisco Wilhelm [email protected]

Simulated dataset with insufficient effort responses.

Description

A simulated dataset mimicking insufficient effort responding. Contains three types of responses: (a) normal responses with answers centering around a trait/attitude value (80 percent probability per simulated observation), (b) straightlining responses (10 percent probability per simulated observation), and (c) random responses (10 percent probability per simulated observation). The dataset simulates 10 subscales of 5 items each (50 variables).

Usage

careless_dataset

Format

A data frame with 200 observations (rows) and 50 variables (columns).

Simulated dataset with careless responses.

Description

A simulated dataset mimicking insufficient effort responding. Contains three types of responses: (a) normal responses with answers mimicking a diligent respondent, (b) some number of longstring careless responders, and (c) some number of generally careless responders. The dataset simulates 10 subscales of 10 items each (100 variables).

Usage

careless_dataset2

Format

A data frame with 1000 observations (rows) and 100 variables (columns).

Calculates the even-odd consistency score

Description

Takes a matrix of item responses and a vector of integers representing the length of each factor. The even-odd consistency score is then computed as the within-person correlation between the even and odd subscales over all the factors.
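The computation described above can be sketched in base R. This is an illustrative sketch only, not the package's implementation: the function name even_odd_sketch and the toy data are hypothetical, every factor is assumed to have at least two items, and the sketch returns the raw within-person correlation, so any additional correction or rescaling that evenodd may apply is not reproduced.

```r
# Hypothetical helper: raw even-odd consistency for each row of x.
# `factors` gives the number of consecutive columns belonging to each scale.
even_odd_sketch <- function(x, factors) {
  x <- as.matrix(x)
  stopifnot(sum(factors) == ncol(x), all(factors >= 2))
  ends <- cumsum(factors)
  starts <- ends - factors + 1
  n_scales <- length(factors)
  odd_scores <- matrix(NA_real_, nrow(x), n_scales)
  even_scores <- matrix(NA_real_, nrow(x), n_scales)
  for (k in seq_len(n_scales)) {
    cols <- starts[k]:ends[k]
    odd_cols <- cols[seq(1, length(cols), by = 2)]   # 1st, 3rd, ... item of the scale
    even_cols <- cols[seq(2, length(cols), by = 2)]  # 2nd, 4th, ... item of the scale
    odd_scores[, k] <- rowMeans(x[, odd_cols, drop = FALSE], na.rm = TRUE)
    even_scores[, k] <- rowMeans(x[, even_cols, drop = FALSE], na.rm = TRUE)
  }
  # Within-person correlation between the odd and even half-scale scores
  vapply(seq_len(nrow(x)), function(i) {
    suppressWarnings(
      stats::cor(odd_scores[i, ], even_scores[i, ], use = "pairwise.complete.obs")
    )
  }, numeric(1))
}

# Toy data: 2 respondents, 3 scales of 4 items each
x_eo <- rbind(
  consistent   = c(1, 1, 1, 1,  3, 3, 3, 3,  5, 5, 5, 5),
  inconsistent = c(1, 5, 1, 5,  3, 3, 3, 3,  5, 1, 5, 1)
)
eo <- even_odd_sketch(x_eo, factors = rep(4, 3))
```

In the toy data the first respondent answers each scale's odd and even items alike, while the second answers them in opposite directions, so the two raw correlations fall at opposite ends of the range.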

Usage

evenodd(x, factors, diag = FALSE)

Arguments

`x`	a matrix of data (e.g. survey responses)
`factors`	a vector of integers specifying the length of each factor in the dataset
`diag`	optionally returns a column with the number of available (i.e., non-missing) even/odd pairs per observation. Useful for datasets with many missing values.

Author(s)

Richard Yentes [email protected], Francisco Wilhelm [email protected]

References

Johnson, J. A. (2005). Ascertaining the validity of individual protocols from web-based personality inventories. Journal of Research in Personality, 39, 103-129. doi:10.1016/j.jrp.2004.09.009

Examples

careless_eo <- evenodd(careless_dataset, rep(5,10))
careless_eodiag <- evenodd(careless_dataset, rep(5,10), diag = TRUE)

Calculates the intra-individual response variability (IRV)

Description

The IRV is the "standard deviation of responses across a set of consecutive item responses for an individual" (Dunn, Heggestad, Shanock, & Theilgard, 2018, p. 108). By default, the IRV is calculated across all columns of the input data. Additionally it can be applied to different subsets of the data. This can detect degraded response quality which occurs only in a certain section of the questionnaire (usually the end). Whereas Dunn et al. (2018) propose to mark persons with low IRV scores as outliers - reflecting straightlining responses, Marjanovic et al. (2015) propose to mark persons with high IRV scores - reflecting highly random responses (see References).
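The definition above amounts to a row-wise standard deviation, optionally repeated on consecutive blocks of columns. The following base R sketch illustrates this under stated assumptions: the function name irv_sketch, the argument num_split, the column name irv_total, and the toy data are all hypothetical, and the way columns are assigned to blocks is one plausible choice rather than the package's exact rule.

```r
# Hypothetical helper: row-wise SD over all items, plus SDs per block of columns.
irv_sketch <- function(x, num_split = NULL) {
  x <- as.matrix(x)
  out <- data.frame(irv_total = apply(x, 1, stats::sd, na.rm = TRUE))
  if (!is.null(num_split)) {
    # Assign each column to one of num_split consecutive blocks
    block <- ceiling(seq_len(ncol(x)) * num_split / ncol(x))
    for (k in seq_len(num_split)) {
      out[[paste0("irv", k)]] <-
        apply(x[, block == k, drop = FALSE], 1, stats::sd, na.rm = TRUE)
    }
  }
  out
}

# Toy data: row 1 straightlines throughout; row 2 starts varied, then straightlines
x_irv <- rbind(
  c(3, 3, 3, 3, 3, 3, 3, 3),
  c(1, 5, 2, 4, 3, 3, 3, 3)
)
irv_res <- irv_sketch(x_irv, num_split = 2)
```

For the second respondent the total IRV is above zero, but the IRV of the second block is zero, which is the kind of late-questionnaire degradation that splitting is meant to expose.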

Usage

irv(x, na.rm = TRUE, split = FALSE, num.split = 3)

Arguments

`x`	a matrix of data (e.g. survey responses)
`na.rm`	logical indicating whether to calculate the IRV for a person with missing values.
`split`	logical indicating whether to additionally calculate the IRV on subsets of columns (of equal length).
`num.split`	the number of subsets the data is to be split in.

Author(s)

Francisco Wilhelm [email protected]

References

Dunn, A. M., Heggestad, E. D., Shanock, L. R., & Theilgard, N. (2018). Intra-individual Response Variability as an Indicator of Insufficient Effort Responding: Comparison to Other Indicators and Relationships with Individual Differences. Journal of Business and Psychology, 33(1), 105-121. doi:10.1007/s10869-016-9479-0

Marjanovic, Z., Holden, R., Struthers, W., Cribbie, R., & Greenglass, E. (2015). The inter-item standard deviation (ISD): An index that discriminates between conscientious and random responders. Personality and Individual Differences, 84, 79-83. doi:10.1016/j.paid.2014.08.021

Examples

# calculate the irv over all items
irv_total <- irv(careless_dataset)

#calculate the irv over all items + calculate the irv for each quarter of the questionnaire
irv_split <- irv(careless_dataset, split = TRUE, num.split = 4)
boxplot(irv_split$irv4) #produce a boxplot of the IRV for the fourth quarter

Identifies the longest string of identical consecutive responses for each observation

Description

Takes a matrix of item responses and, beginning with the second column (i.e., second item), compares each column with the previous one to check for matching responses. For each observation, the length of the longest uninterrupted string of identical responses is returned. Optionally, the average length of the uninterrupted strings of identical responses can be returned as well.
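The same quantities can be illustrated with run-length encoding from base R. This is a sketch under assumptions, not the package's code: the function name longstring_sketch and the toy data are hypothetical, and missing values are not treated specially (rle counts each NA as its own run).

```r
# Hypothetical helper: longest and average run of identical consecutive responses.
longstring_sketch <- function(x) {
  x <- as.matrix(x)
  t(apply(x, 1, function(row) {
    runs <- rle(as.vector(row))$lengths  # lengths of consecutive identical runs
    c(longstr = max(runs), avgstr = mean(runs))
  }))
}

# Toy data: row 1 has runs of length 3, 2 and 1; row 2 is a single run of 6
x_ls <- rbind(
  c(1, 1, 1, 2, 2, 3),
  c(4, 4, 4, 4, 4, 4)
)
ls_res <- longstring_sketch(x_ls)
```

The result is a matrix with one row per respondent and the columns longstr and avgstr, mirroring the column names used in the examples of this help page.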

Usage

longstring(x, avg = FALSE)

Arguments

`x`	a matrix of data (e.g. item responses)
`avg`	logical indicating whether to additionally return the average length of identical consecutive responses

Author(s)

Richard Yentes [email protected], Francisco Wilhelm [email protected]

References

Johnson, J. A. (2005). Ascertaining the validity of individual protocols from web-based personality inventories. Journal of Research in Personality, 39, 103-129. doi:10.1016/j.jrp.2004.09.009

Examples

careless_long <- longstring(careless_dataset, avg = FALSE)
careless_avg <- longstring(careless_dataset, avg = TRUE)
boxplot(careless_avg$longstr) #produce a boxplot of the longstring index
boxplot(careless_avg$avgstr)

Find and graph Mahalanobis Distance (D) and flag potential outliers.

Description

Takes a matrix of item responses and computes Mahalanobis D. Can additionally return a vector of binary outlier flags. Mahalanobis distance is calculated using the function psych::outlier of the psych package, an implementation which supports missing values.
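For complete data, the idea can be sketched with stats::mahalanobis, which returns squared distances. This is an illustration under assumptions rather than the package's method: the function name mahad_sketch and the toy data are hypothetical, missing values are not handled, and the flagging rule (comparing the squared distance with a chi-square quantile whose degrees of freedom equal the number of items) is a common convention assumed here.

```r
# Hypothetical helper: squared Mahalanobis distance and a chi-square based flag.
mahad_sketch <- function(x, confidence = 0.99) {
  x <- as.matrix(x)
  d_sq <- stats::mahalanobis(x, center = colMeans(x), cov = stats::cov(x))
  cutoff <- stats::qchisq(confidence, df = ncol(x))  # assumed flagging convention
  data.frame(d_sq = d_sq, flagged = d_sq > cutoff)
}

# Toy data: 200 simulated respondents on 3 items plus one extreme response pattern
set.seed(123)
x_md <- rbind(matrix(rnorm(600), ncol = 3), c(10, -10, 10))
md_res <- mahad_sketch(x_md, confidence = 0.99)
tail(md_res, 1)  # the appended extreme row
```

Raising confidence (for example to 0.999) raises the cutoff and therefore flags fewer observations.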

Usage

mahad(x, plot = TRUE, flag = FALSE, confidence = 0.99, na.rm = TRUE)

Arguments

`x`	a matrix of data
`plot`	Plot the resulting QQ graph
`flag`	Flag potential outliers using the confidence level specified in parameter `confidence`
`confidence`	The desired confidence level of the result
`na.rm`	Should missing data be deleted

Author(s)

Richard Yentes [email protected], Francisco Wilhelm [email protected]

References

Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17(3), 437-455. doi:10.1037/a0028085

Examples

mahad_raw <- mahad(careless_dataset) #only the distances themselves
mahad_flags <- mahad(careless_dataset, flag = TRUE) #additionally flag outliers
mahad_flags <- mahad(careless_dataset, flag = TRUE, confidence = 0.999) #Apply a strict criterion

Computes the psychometric antonym score

Description

A convenient wrapper that calls psychsyn with argument anto = TRUE to compute the psychometric antonym score.

Usage

psychant(x, critval = -0.6, diag = FALSE)

Arguments

`x`	is a matrix of item responses
`critval`	is the minimum magnitude of the correlation between two items in order for them to be considered psychometric antonyms. Defaults to -.60
`diag`	additionally return the number of item pairs available for each subject. Useful if dataset contains many missing values.

Author(s)

Richard Yentes [email protected], Francisco Wilhelm [email protected]

Examples

antonyms <- psychant(careless_dataset2, .50)
antonyms <- psychant(careless_dataset2, .50, diag = TRUE)

Computes the psychometric synonym/antonym score

Description

Takes a matrix of item responses and identifies item pairs that are highly correlated within the overall dataset. What defines "highly correlated" is set by the critical value (e.g., r > .60). Each respondent's psychometric synonym score is then computed as the within-person correlation between the identified item pairs. Alternatively, computes the psychometric antonym score, a variant that uses item pairs that are highly negatively correlated.
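The two steps described above (select item pairs by their sample-level correlation, then correlate each respondent's answers across those pairs) can be sketched in base R. The function name psychsyn_sketch and the simulated data are hypothetical, only the synonym case is shown, and the resampling controlled by resample_na is not reproduced.

```r
# Hypothetical helper: within-person correlation across highly correlated item pairs.
psychsyn_sketch <- function(x, critval = 0.6) {
  x <- as.matrix(x)
  r <- stats::cor(x, use = "pairwise.complete.obs")
  r[upper.tri(r, diag = TRUE)] <- NA            # keep each pair only once
  pairs <- which(r > critval, arr.ind = TRUE)   # item pairs above the critical value
  if (nrow(pairs) < 3) stop("too few item pairs exceed critval")
  apply(x, 1, function(row) {
    suppressWarnings(
      stats::cor(row[pairs[, 1]], row[pairs[, 2]], use = "pairwise.complete.obs")
    )
  })
}

# Toy data: 3 latent traits, each measured by 2 noisy items (columns 1-2, 3-4, 5-6)
set.seed(42)
n <- 100
traits <- matrix(rnorm(n * 3), ncol = 3)
items <- traits[, rep(1:3, each = 2)] + matrix(rnorm(n * 6, sd = 0.3), ncol = 6)
x_ps <- rbind(
  items,
  c(-2, -2, 0, 0, 2, 2),   # answers synonym items alike
  c(-2,  2, 0, 0, 2, -2)   # answers synonym items in opposite directions
)
ps_scores <- psychsyn_sketch(x_ps, critval = 0.6)
tail(ps_scores, 2)
```

With only a handful of item pairs the within-person correlation is very unstable, which is one reason to inspect the number of available pairs (the diag argument) on real data.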

Usage

psychsyn(x, critval = 0.6, anto = FALSE, diag = FALSE, resample_na = TRUE)

Arguments

`x`	is a matrix of item responses
`critval`	is the minimum magnitude of the correlation between two items in order for them to be considered psychometric synonyms. Defaults to .60
`anto`	determines whether psychometric antonyms are returned instead of psychometric synonyms. Defaults to `FALSE`
`diag`	additionally return the number of item pairs available for each observation. Useful if dataset contains many missing values.
`resample_na`	if psychsyn returns NA for a respondent, resample to attempt to obtain a non-NA result.

Author(s)

Richard Yentes [email protected], Francisco Wilhelm [email protected]

References

Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17(3), 437-455. doi:10.1037/a0028085

Examples

synonyms <- psychsyn(careless_dataset, .60)
antonyms <- psychsyn(careless_dataset2, .50, anto = TRUE)
antonyms <- psychant(careless_dataset2, .50)

#with diagnostics
synonyms <- psychsyn(careless_dataset, .60, diag = TRUE)
antonyms <- psychant(careless_dataset2, .50, diag = TRUE)

Compute the correlations between all possible item pairs and order them by the magnitude of the correlation

Description

A function intended to help find adequate critical values for psychsyn and psychant. Takes a matrix of item responses and returns a data frame giving the correlations of all item pairs, ordered by the magnitude of the correlation.
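A table of this kind can be built in base R from the lower triangle of the correlation matrix. The sketch below is illustrative only: the function name critval_sketch, the output column names var1, var2 and cor, and the toy data are hypothetical and need not match what psychsyn_critval returns.

```r
# Hypothetical helper: all item-pair correlations, sorted for choosing a critical value.
critval_sketch <- function(x, anto = FALSE) {
  x <- as.matrix(x)
  r <- stats::cor(x, use = "pairwise.complete.obs")
  r[upper.tri(r, diag = TRUE)] <- NA   # keep each pair only once
  pairs <- as.data.frame(as.table(r), stringsAsFactors = FALSE)
  names(pairs) <- c("var1", "var2", "cor")
  pairs <- pairs[!is.na(pairs$cor), ]
  # Largest positive correlations first, or largest negative first if anto = TRUE
  pairs[order(pairs$cor, decreasing = !anto), ]
}

# Toy data: b closely tracks a, c is a reversed
x_cv <- cbind(
  a = c(1, 2, 3, 4, 5, 6),
  b = c(1, 2, 3, 4, 6, 5),
  c = c(6, 5, 4, 3, 2, 1)
)
cv_syn <- critval_sketch(x_cv)
cv_ant <- critval_sketch(x_cv, anto = TRUE)
```

Scanning the top rows of such a table shows how many item pairs would survive a given critical value before committing to one in psychsyn or psychant.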

Usage

psychsyn_critval(x, anto = FALSE)

Arguments

`x`	a matrix of item responses.
`anto`	logical. If `FALSE` (the default), item pairs are ordered starting with the largest positive correlation; if `TRUE`, starting with the largest negative correlation.

Author(s)

Francisco Wilhelm [email protected]

Examples

psychsyn_cor <- psychsyn_critval(careless_dataset)
psychsyn_cor <- psychsyn_critval(careless_dataset, anto = TRUE)
