Investigating the validity of the Human Resource Practices Scale in South Africa : Measurement invariance across gender


Introduction
Gender (used in this text to refer to men and women) is a prominent variable within the workplace and life in general. Several journals are dedicated to the topic (see Gender, Work, and Organisations [Wiley], Gender in Management: An International Journal [Emerald Publishing], as well as the International Journal of Gender and Entrepreneurship [Emerald Insight]). In some articles published in these journals the perceptions of men and women are compared, or measures of perceptions are used in models to test hypotheses related to gender differences (Eagly, 1997; Eagly & Wood, 1999), often reporting differential outcomes based on gender. Gender is also relevant to human resource management (HRM) practices. HRM practices are those practices traditionally associated with HRM functions, ranging from job design to service termination (Albrecht, Bakker, Gruman, Macey, & Saks, 2015). Within this context, some authors report that gender discrimination exists widely, regardless of gender equality policies (Patterson, Bae, & Lim, 2013). The persistence of gender inequality therefore makes it important to see gender inequality in organisations as a complex phenomenon (Stamarski & Son Hing, 2015), one that requires sophisticated models if it is to be explained (Lips, 2013). Lee Cooke and Xiao (2014) also express their concern and state that observed gender differences have serious repercussions for HRM practices, affecting job design, work organisation, career support, as well as work-life balance enterprises.
Despite the aforementioned concerns, Dickens (1998) states that most writing and research on HRM does not make gender noticeable (except when the primary submission concerns women at work or equal opportunities) and that writing and research on the nature and perceptions regarding HRM practices tend to be gender-blind. In such writings employees are usually presented as disembodied. As Acker (1992, p. 259) notes, the 'fiction of the universal worker obscures the gendered effects of ostensibly gender-neutral processes and helps banish gender from theorising about the fundamental character of complex organisations'. Dickens (1998) concludes that assuming equality across genders in the HRM domain is a matter of rhetoric rather than reality and states that apparently gender-neutral HRM concepts and policies are in reality gendered and perpetuate, rather than contest, gender inequality.
Focusing on human capital models, Lips (2013) states that there is a continuing debate in which various explanatory variables are used to explain the gender differences in workplace outcomes, arguing that many of the differences are the result not of discrimination but of other factors such as the different contributions men and women make in the workplace. Most significant for this research is Lips's (2013) questioning of the utility or validity of many of the human capital 'explanatory' variables, stating that they (the explanatory variables) beg explanation themselves.

Purpose
This research aims to analyse the validity of a measurement of HRM practices across men and women, testing if respondents interpret the measure in a conceptually similar manner. Stated more operationally, the research aims to test whether the relationships between manifest indicator variables (scale items, subscales) and the underlying construct are the same across groups (Bialosiewicz, Murphy, & Berry, 2013). The focus on measurement is important, as the Employment Equity Act (Act 55 of 1998) prohibits the use of instruments that have not been scientifically tested to demonstrate that they can be applied fairly to all employees and are not biased against any group. The focus on HRM practices is also important, as these practices are a major antecedent of organisational culture and knowledge management practices, leading to organisational innovation that is positively related to organisational performance (Al-bahussin & El-Garaihy, 2013). Furthermore, this research has been prompted by the work of Ismail and Nakkache (2015), who explored gender differences in the experiences of HRM policies and whose results disaffirm the stereotypical pro-men conceptualisations.

Literature review
Two matters are reviewed. Firstly, the contention that HRM practices constitute an antecedent to organisational outcomes is considered, and secondly the focus will be on ways in which HRM practices are measured. This review grounds the present research within the context of the present body of knowledge.
HRM practices can positively influence employees' attitudes and lift workplace performance, which will most likely affect organisational outcomes (Kehoe & Wright, 2013; Messersmith, Patel, Lepak, & Williams, 2011). Research has highlighted the role of effective HRM practices in organisational effectiveness (Combs, Liu, Hall, & Ketchen, 2006; Melton & Meier, 2017; van Esch, Wei, & Chiang, 2016). Brewster, Gooderham and Mayrhofer (2016) state that the bulk of HRM research focuses on strategic HRM, implying an emphasis on the impact of HRM on organisational performance. It is therefore not surprising that the Chartered Institute of Personnel and Development (2016) encourages debate on how HRM can amplify its contribution toward organisational performance or that Ulrich (2013, p. 16) urges executives to 'see their human resource practices as a source of competitive advantage' and a deliverer of results.
When considering quantitative methodologies, the measurement of constructs is important. Focusing specifically on the measurement of high-performance or effective HRM practices, some authors develop their own measures (e.g. Madmoli, 2016; Zhang & Jia, 2010; Ziyae, 2016) while others prefer to use standardised measures, such as the one developed by Sun, Aryee and Law (2007; e.g. Ahmed, 2016; Mustafa, Lundmark, & Ramos, 2016; Zhu, Warner, & Rowley, 2007) or Gould-Williams and Davies (2005; e.g. Alfes, Shantz, & Truss, 2012; Boekhorst, Singh, & Frawley, 2015; Jensen, Patel, & Messersmith, 2011). In this research, the focus will be on the Human Resource Practices Scale (HRPS) (Nyawose, 2009; Steyn, 2012), a measure of effective HRM practices previously successfully used in the South African workplace, displaying acceptable reliability and validity properties (Steyn, Bezuidenhout, & Grobler, 2017; Steyn & Grobler, 2014). Some researchers prefer to present measurement of high-performance or effective HRM practices as a single construct (e.g. Makongoso, Gichira, & Orwa, 2015; Tang, Wei, Snape, & Ng, 2015; Zhang & Jia, 2010), and this is how Becker, Huselid and Becker (1998) present it in their seminal paper. Others, however, perceive it as a multidimensional construct. In this regard, Sun et al. (2007) list broad job design, selective staffing, internal mobility, employment security, extensive training, results-oriented appraisal and rewards, as well as employee participation, as elements of the construct. Boadau and Gil-Ripoll's (2009) instrument assesses elements named values and culture, job, internal communication, training, appraisal of diligence and performance, recruitment and selection, pay, induction and exit processes, workforce planning, climate and motivation, teamwork, change, leadership, industrial relations and career plan. As a last example, Madmoli (2016) lists the following as elements to be assessed when one is interested in effective HRM: selection, training, job evaluation, rewarding, employees' participation in current affairs, hiring competent experts, as well as the tendency of managers to share implicit and explicit knowledge among themselves. The HRPS (Nyawose, 2009; Steyn, 2012) (the instrument used in this research) assesses seven HRM practices, namely training and development, compensation and rewards, performance management, supervisor support, staffing, diversity management, as well as internal communication.
It may be important to note that the evaluation of HRM practices depends on the degree to which employees experience HRM practices as effective (Kehoe & Wright, 2013). Building on this, and seeing the matter in the context created in the second and third paragraphs of this literature review, this research aims to analyse the extent to which men and women perceive concepts, as presented in the HRPS instrument (Nyawose, 2009; Steyn, 2012), equivalently. The focus on measurement invariance stems from the comparisons often drawn between men and women, something that also happens in entrepreneurship research (Haus, Steinmetz, Isidor, & Kabst, 2013; Henry, Foss, & Ahl, 2016; Lim & Envick, 2013) and when researching the HRM practices that act as antecedents to entrepreneurship (Amberg & McGaughey, 2016; Dabic et al., 2011; Mustafa et al., 2013). To date no research on the invariance of the HRPS across gender has been published, and this matter is thus unresolved. This research did not attempt to explain differences between men and women through identifying the most potent explanatory variables. Rather, it focused on the validity of the explanatory variables themselves, as Lips (2013) urges researchers to do. In asking questions regarding invariance, it considers whether differences in scores are real and whether the functioning of the measuring instruments is indeed equivalent for men and women. In some cases, instruments have indeed been found to function differently for males and females (Pässler, Beinicke, & Hell, 2014; Wetzel, Böhnke, Carstensen, Ziegler, & Ostendorf, 2013), while in other cases no such differentiation was noted (Baker, Caison, & Meade, 2007; Wei, Chesnut, Barnard-Brak, Stevens, & Olivárez Jr, 2014). Within the context of HRM practices, some research has been conducted regarding the differential functioning of measures of individual HR practices across men and women (Matthews & Ritter, 2016; Ployhart & Holtz, 2008; Xu, Wubbena, & Stewart, 2016), but no research could be located on measurement invariance in HRM practices scales that focus on multiple practices, nor on the HRPS. Ignoring the possibility of differential functioning has the potential to compromise any substantive gender-based comparisons resulting from the measurement (Salzberger, Newton, & Ewing, 2014). Moreover, the National Institute of Education and American Psychological Association Standards list differential validity and differential prediction as a major concern of test fairness (Pässler et al., 2014). Only once construct comparability (measurement invariance) is demonstrated does it become possible to interpret differences in test or scale scores as true representations of differences explained by group membership (Wu, Li, & Zumbo, 2007). The aforementioned is in line with the requirements of the South African Employment Equity Act (Act 55 of 1998), which takes a strong stance against the adverse impact of psychometric testing.

Research design
This study examines the HRPS structure across 1652 men and 1284 women employees of 52 companies in South Africa. Full data were available across all of the companies concerned. All participants completed the HRPS in English (which is the lingua franca of high school and post-school education, as well as of business, in South Africa). The objectives of the study were (1) to examine if the HRPS structure could be replicated across gender groups, (2) to examine the level of measurement invariance attained across the groups and (3) to report on the psychometric properties of the HRPS when used in South African organisations.
The matter of measurement invariance is central to this research and to this article. Measurement invariance relates to an observed score being reflective of an individual's standing on a construct, independent of his or her group membership (Mellenbergh, 1989; Meredith, 1993; Meredith & Millsap, 1992; Wu et al., 2007). Within the context of factor analysis, measurement invariance means that the same latent variables are measured across groups, allowing for cross-group factor scores to be comparable (Meredith, 1993; Wu et al., 2007). Typically four levels of measurement invariance are tested: (1) configural invariance, which tests if groups (men and women) have similar factor loading patterns; (2) weak invariance, testing for equality in unstandardised factor loadings; (3) strong invariance, testing for equal unstandardised factor loadings and intercepts (of the item regressions); and (4) strict invariance, testing for equal unstandardised factor loadings, intercepts and error variances (Vandenberg & Lance, 2000). As a final step, equivalence of the latent means of men and women on the seven factors was tested. Multigroup confirmatory factor analysis is the de facto standard (Chen, 2008) for use in investigating measurement invariance.
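The four levels of invariance listed above can be written as nested equality constraints on a multigroup factor model. The notation below is a conventional sketch and is an assumption of this illustration, not notation taken from the cited sources: $x_{ig}$ is the vector of 21 item scores for person $i$ in group $g \in \{m, w\}$ (men, women), $\tau_g$ the item intercepts, $\Lambda_g$ the factor loadings, $\xi_{ig}$ the seven latent factors with mean $\kappa_g$, and $\Theta_g$ the covariance matrix of the error terms $\delta_{ig}$.

```latex
\begin{align*}
  x_{ig} &= \tau_g + \Lambda_g\,\xi_{ig} + \delta_{ig},
  \qquad \mathrm{E}(\xi_{ig}) = \kappa_g,
  \qquad \mathrm{Cov}(\delta_{ig}) = \Theta_g \\[4pt]
  \text{Configural:}\quad & \Lambda_m \text{ and } \Lambda_w
      \text{ share the same pattern of fixed and free loadings} \\
  \text{Weak:}\quad & \Lambda_m = \Lambda_w \\
  \text{Strong:}\quad & \Lambda_m = \Lambda_w,\ \tau_m = \tau_w \\
  \text{Strict:}\quad & \Lambda_m = \Lambda_w,\ \tau_m = \tau_w,\ \Theta_m = \Theta_w \\
  \text{Latent means:}\quad & \text{strict constraints plus } \kappa_m = \kappa_w
\end{align*}
```

Each level keeps the constraints of the previous one and adds a further set, which is why the models can be compared sequentially.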

Method
Population and sampling
The target population consisted of employees, at different levels of responsibility, who are exposed to various HRM practices. Organisations with more than 50 employees were targeted as it was presumed that the HRM services would be formalised in these organisations and that a broad range of services would be available.

Measurement instrument
The HRPS (Nyawose, 2009; Steyn, 2012) was used to measure employees' satisfaction with the HRM services delivered to them. The items were developed on a rational basis by examining the literature on HRM (Nyawose, 2009). Seven HRM practices were measured in this study, and the questionnaire consisted of 21 items. The HRPS has a hierarchical structure, with each of the seven factors consisting of three items (see Appendix 1).
Participants responded to the items on a five-point Likert scale, ranging from 'disagree strongly' (1) to 'agree strongly' (5). For each of the seven HRM practices, the scores ranged from 3 to 15. A high score would be reflective of an individual who perceived the HRM practice as effective, whereas a low score would reflect that the participant was dissatisfied with the particular HRM practice. Nyawose (2009) reported internal consistency reliabilities varying from 0.74 to 0.93. Nyawose also reported statistically significant correlations with outcomes such as turnover intentions and occupational commitment. Steyn (2012), only using five of the HRPS scales, reported Cronbach's alphas of 0.88 for training and development, 0.87 for compensation and rewards, 0.81 for performance management, 0.74 for staffing and 0.75 for diversity management. Steyn (2012) also reported significant correlations with turnover intentions and occupational commitment, and additionally with job satisfaction and employee engagement. Overall, these results support the reliability and validity of the HRPS for research use.
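The scoring and reliability calculations described above can be sketched as follows. This is a minimal illustration using the standard formula for Cronbach's alpha; the five response rows are hypothetical data invented for the example, and the function names are not part of the HRPS or of any cited work.

```python
from statistics import variance

def subscale_scores(responses):
    """Sum the three item responses (each 1 to 5) of one subscale per respondent."""
    return [sum(row) for row in responses]

def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table of one subscale."""
    n_items = len(responses[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(n_items)]
    # Sample variance of the summed subscale score.
    total_variance = variance(subscale_scores(responses))
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses of five respondents to the three items of one subscale.
hypothetical_responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

scores = subscale_scores(hypothetical_responses)  # each score lies between 3 and 15
alpha = cronbach_alpha(hypothetical_responses)
print(scores, round(alpha, 3))
```

With three items scored 1 to 5, every subscale score necessarily falls in the 3 to 15 range mentioned in the text.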

Participants
The participants were 2936 employees (44.7% women), representing several public and private organisations based in South Africa. The distribution of participants with respect to race and ethnicity was approximately as follows: 8% Asian, 58% black people, 8% mixed ethnicity and 24% white people. The participants' ages ranged between 20 and 72 years, with a mean of 37.8 years and with a standard deviation of 9.1. Participants' tenure at their present companies ranged from 1 month to 42 years, with an average of just more than 9 years and a standard deviation of 7.5 years.

Analysis
The data were initially scanned for normality, after which measurement invariance was tested for. Following the recommendations of Vandenberg and Lance (2000), pairwise multigroup confirmatory factor analyses (Wu et al., 2007) with robust maximum likelihood estimation were used to examine configural, weak, strong and strict invariance across men and women, and as a final step equivalence of the latent means of men and women on the seven factors was tested.
The analysis only focused on measurement differences between self-identified men and women. This divide (mainly) represents the biological sex and more traditional gender role identification prevalent in the South African society. It is acknowledged that in the present era gender identification is more fluid and that identification as a lesbian, gay, bisexual or transgender (LGBT) individual may have more negative consequences (Badgett, Lau, Sears, & Ho, 2007; Grant, Mottet, Tanis, Harrison, Herman, & Keisling, 2011) than being labelled as a man or a woman. Granting this, the present custom in South Africa is to identify as a man or a woman in most formal organisational settings, and this custom was therefore followed in this study.
The analyses were performed with the lavaan package (Rosseel, 2012) in R (R Core Team, 2013). Maximum likelihood chi-square (MLχ²), comparative fit index (CFI), root-mean-square error of approximation (RMSEA) and Bayesian information criterion (BIC) were used to evaluate model fit across successively stringent levels of measurement invariance. The criteria were applied as follows:
• Although perfect fit is highly desirable, it was expected that the hypotheses of perfect fit for the measurement models would be rejected, given that the χ² statistic is very sensitive to sample size (in this case nearly 3000) and is no longer relied upon as a basis for acceptance or rejection of a model fit (Schermelleh-Engel, Moosbrugger, & Müller, 2003; Vandenberg, 2006). However, a statistically significant difference in χ² between a less constrained and a more constrained model was taken as evidence of a deteriorating model fit.
• A CFI > 0.95 is used as indicative of a good model fit (Vandenberg & Lance, 2000). When comparing models, Vandenberg and Lance (2000, p. 46) note that 'changes in CFI of -0.01 or less indicate that the invariance hypothesis should not be rejected, but when the differences lie between -0.01 and -0.02, the researcher should be suspicious that differences exist. Definite differences between models exist when the change in CFI is greater than -0.02'.
• Vandenberg and Lance (2000) suggest that a RMSEA < 0.08 is acceptable. RMSEA < 0.08 was used as indicative of overall fit. As no critical values for the change of RMSEA could be located, the same principles as for ΔCFI were followed, where consecutive model fits were compared.
• The BIC was used as a measure of comparative fit. Models that generate lower BIC values are generally preferred, and the absolute value was not interpreted. BIC was therefore used to assess model deterioration, which is indicated by an increase in BIC values.
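The decision rules in the list above can be expressed compactly. The sketch below encodes the ΔCFI bands quoted from Vandenberg and Lance (2000) and the lower-is-better reading of the BIC; the function names, the labels returned and the example fit values are hypothetical and chosen only for illustration.

```python
def classify_cfi_change(cfi_less_constrained, cfi_more_constrained):
    """Interpret the drop in CFI when invariance constraints are added."""
    drop = cfi_less_constrained - cfi_more_constrained  # positive = worse fit
    if drop <= 0.01:
        return "invariance not rejected"
    if drop <= 0.02:
        return "suspect differences"
    return "definite differences"

def bic_indicates_deterioration(bic_less_constrained, bic_more_constrained):
    """An increase in BIC under the more constrained model signals deterioration."""
    return bic_more_constrained > bic_less_constrained

# Hypothetical fit values for two nested models.
print(classify_cfi_change(0.970, 0.969))
print(classify_cfi_change(0.970, 0.955))
print(bic_indicates_deterioration(150000.0, 149800.0))
```

Comparisons that fall exactly on a threshold are sensitive to floating-point rounding, so in practice fit indices would usually be rounded to three decimals before such a rule is applied.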
These parameters were used when interpreting the measurement invariance results. Once measurement invariance is established, more descriptive statistics on the HRPS will be provided. These will include the factor loadings, descriptive statistics (including reliability information), and the correlations between the observed scores and between the latent factors. The latter will provide insight into the uni- or multidimensionality of the measurement of HRM practices.

Ethical consideration
Permission (2014_SBL_018_CA dated 27 February 2014) to conduct the research was obtained from the Research Ethics Review Committee of the Graduate School of Business Leadership at the University of South Africa before commencing with sampling. Once approval had been obtained, a list of staff members was requested from the organisation's HRM department. Respondents were selected randomly from this list. The selected respondents were invited to a meeting at which the purpose of the research was explained. They were informed as to the nature of their participation, including that participation was completely voluntary. Those who agreed to participate then completed a consent form specifying ethical issues, including confirmation regarding the anonymity of participation, confidentiality, the right to withdraw from participation at any time without any explanation or any adverse effects, and the fact that the data would be used for research purposes only. Only then did they complete a hard copy of the questionnaire.

Results
Preliminary analysis showed that the skewness and kurtosis of the HRPS items ranged from -0.97 to -0.08 and from -0.99 to 0.79, respectively. None of the items demonstrated excessive deviation from normality and they appeared appropriate for factor analysis with robust maximum likelihood estimation (cf. Loehlin & Beaujean, 2017; McDonald & Ho, 2002).
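Item-level skewness and kurtosis of the kind reported above can be computed from central moments. The text does not state which estimator was used, so the sketch below assumes the simple moment-based definitions with excess kurtosis (zero for a normal distribution); the function name and the response vector are hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis of a list of item responses."""
    n = len(values)
    mean = sum(values) / n
    # Second, third and fourth central moments (population form, divisor n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

# Hypothetical responses to one five-point Likert item.
hypothetical_item = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5]
skew, kurt = skewness_and_excess_kurtosis(hypothetical_item)
print(round(skew, 3), round(kurt, 3))
```

A negative skewness, as found for all HRPS items, means that responses cluster toward the 'agree' end of the scale with a tail toward the lower categories.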
In each group, a baseline independent cluster confirmatory factor analysis model was specified in accordance with the structure given in Appendix 1. The baseline models were identified by fixing the unstandardised factor loading of one item per targeted factor to unity. Factor loadings of items on non-target factors were fixed at zero. Factor loadings of the remaining items, factor covariances and error variances were freely estimated using robust maximum likelihood. MLχ², CFI, RMSEA and BIC were used to evaluate model fit. The results pertaining to BIC and χ² changes are presented in Table 1.
As expected, the hypothesis of perfect fit for the configural invariance model was rejected (χ²(326) = 1341, p < 0.001). However, as evident from Table 2, fit to the configural model as measured with CFI (= 0.97) and the RMSEA (= 0.045) suggested a good fit.
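To show how a χ² value that rejects perfect fit can still correspond to a small RMSEA in a large sample, the sketch below applies one common RMSEA formula, including the square-root-of-groups adjustment often used for multigroup models. The formula choice is an assumption of this example; the study used robust (scaled) statistics, so the unscaled result need not match the reported 0.045 exactly.

```python
import math

def rmsea(chi_square, degrees_of_freedom, n_total, n_groups=1):
    """Point estimate of RMSEA from an unscaled chi-square statistic."""
    # Misfit per degree of freedom and per observation, floored at zero.
    misfit = max(chi_square / n_total / degrees_of_freedom - 1.0 / n_total, 0.0)
    # Multigroup adjustment: multiply by the square root of the number of groups.
    return math.sqrt(misfit) * math.sqrt(n_groups)

# Values reported for the configural model: chi-square 1341 on 326 degrees
# of freedom, 2936 participants in two groups.
configural_rmsea = rmsea(1341, 326, 2936, n_groups=2)
print(round(configural_rmsea, 3))
```

Because the sample size appears in the denominator, a given amount of excess χ² translates into a smaller RMSEA as the sample grows, which is why the index is preferred over the χ² test alone in samples of this size.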
Tables 1 and 2 summarise the changes in fit across successively more stringent measurement invariance models with respect to the BIC, CFI and RMSEA. For each comparison, very small ΔCFI and ΔRMSEA values were found (≤ 0.001 for all comparisons; see Table 2). The lowest RMSEA and BIC values were observed for the strict invariance model (i.e. equal loadings, intercepts and error terms), suggesting that this model has the best chance of being successfully replicated in future studies. As a final step, the constraint of equal latent means across men and women was added, producing a statistically non-significant Δχ² (p = 0.998). In addition, the ΔCFI and ΔRMSEA of ≤ 0.001 and ≤ 0.001, respectively (see Table 2), indicated that the latent means of the males and females could be treated as equal.
Against the background of the support yielded by the ΔCFI and ΔRMSEA for strict measurement invariance, Table 3 shows the standardised factor loadings obtained for the total group (n = 2936). Each factor was well defined and each item was a statistically significant (p < 0.001) indicator of its target factor. Standardised loadings varied from 0.54 to 0.89.
Noting that latent means were assessed to be invariant, descriptive statistics on the observed HRPS construct scores for men and women and reliability coefficients are presented in Table 4.
The Cronbach's alpha reliability coefficients of the HRPS scales ranged from 0.735 to 0.845 for men and from 0.710 to 0.853 for women. The reliabilities of the seven scales were uniformly satisfactory and similar across men and women. Given the evidence in support of strict measurement invariance these reliabilities can be assumed to be invariant across the groups. As a last step the correlations between the latent constructs as well as the scale scores were calculated and are presented in Table 5.
Across the groups, medium-sized correlations between factors were observed, which points to some, but not excessive, overlap of the seven factors. This affirms the interrelatedness of the HRM functions (see Becker et al., 1998) but shows that each scale measures a distinct aspect of HRM practices.

Discussion
The objectives of the study were (1) to examine if the HRPS structure could be replicated across gender groups, (2) to examine the level of measurement invariance attained across men and women and (3) to report on the psychometric properties of the HRPS when used in South African organisations.
The results of the maximum likelihood χ² suggest that the hypothesis of perfect fit for all the measurement models had to be rejected (see Table 1). The CFI and RMSEA evidenced that the degree of misfit across the models was relatively small (see Table 2). This suggests that the HRPS structure could be replicated across gender groups, at a configural or baseline level (Objective 1).
The ΔCFI values in Table 2 revealed no detectable deteriorations in fit across successively stringent levels of measurement invariance (note that the CFI does not take model complexity into account). The ΔRMSEA values showed improved fit with successively stringent models. Indeed, the RMSEA and BIC, which both take model complexity into account, showed that the strict measurement invariance model yielded the best fit (see Table 2). Taken together, these results suggest that a measurement model with invariant factor loadings, intercepts and error variances for men and women is the most likely to be replicated across different studies. This also suggests that the highest level of invariance was achieved (Objective 2). Furthermore, the additional test of latent mean equality was met, which supplements the notion of invariance across men and women.
In conducting this research the seldom-answered call for questioning the assumption of measurement invariance (Tsaousis & Kazi, 2013) was answered. These results are similar to the studies that found invariance when applying the same instrument to men and women (Baker et al., 2007; Wei et al., 2014; Xu et al., 2016), suggesting that males and females are no different when they interpret the items of these instruments. As in the case of many other instruments, the HRPS showed high levels of invariance, implying that gender differences in this regard are not significant. The statistics (Objective 3) presented in Table 4 reflect this equivalence.
The research also affirms the multidimensional conceptualisation of HRM practices, as presented by Nyawose (2009) and Steyn (2012). Contrary to the seminal work of Becker et al. (1998), and many others (Makongoso et al., 2015; Tang et al., 2015; Zhang & Jia, 2010) who perceive HRM functioning as unidimensional, this research demonstrated that the HRM practices are distinct. This is in line with the conceptualisations of Boadau and Gil-Ripoll (2009), Madmoli (2016) and Sun et al. (2007). As far as measurement is concerned, the multidimensionality of HRM practices affirmed here implies that items need to be assigned to each HRM practice, which requires longer questionnaires than when HRM practices are presented as unidimensional.

Practical implications
This study contributes to addressing limitations in the existing literature and practice through validating the factorial structure of the HRPS and its invariance across the gender spectrum. The results empower industrial psychologists in South Africa to use the HRPS to assess the level at which employees are satisfied with the delivery of HRM services across gender. The HRPS is now in compliance with the specifications of the Employment Equity Act (Act 55 of 1998), specifying that gender comparisons be scientifically shown to be fair and not biased against either group. Cross-gender comparisons should be a matter of interest for practitioners involved in HRM efficiency, as some may be interested in reporting on discrimination related to gendered structures and practices.
The distribution of men and women in the sample presents an over-representation of women when considering the demographics of the South African workforce (Statistics South Africa, 2016). A further limitation is that the elements included in the HRPS may not comprehensively describe the entire HRM function. Both these matters should be taken into consideration when using the instrument. While the focus of this research was on traditional gender-centred differences, and the possible differential treatment of men and women, it should be noted that discrimination against LGBT individuals is rife and considerable (Badgett et al., 2007; Grant et al., 2011). The magnitude of the reported discrimination against LGBT individuals as compared to those in more traditional gender roles should promote debate and research on differences in workplace experiences based on gender-related matters.

Conclusion
The results provide ample evidence of measurement invariance of the HRPS across gender in the workplace context in South Africa and also support the veracity and stability of the elements among job incumbents in South Africa. With measurement invariance established, it is appropriate for researchers to proceed with testing substantive hypotheses about the means and interrelations between these latent constructs across groups (Hirschfeld & von Brachel, 2014).

TABLE 1: Chi-square test and change in chi-square statistics.

TABLE 2: Fit measures and changes in fit measures.

TABLE 3: Standardised factor loadings of the Human Resource Practices Scale items for men and women jointly.

TABLE 4: Scale means, standard deviations and reliability coefficients on the Human Resource Practices Scale per gender.

TABLE 5: Factor and scale correlations of the Human Resource Practices Scale.
Note: Factor correlations are below the diagonal. Scale correlations are above the diagonal. Coefficient alphas are on the diagonal, in parentheses. All correlations are statistically significant (p < 0.05). T&D, training and development; Rem, remuneration; PM, performance management; SS, supervisor support; Sta, staffing; Div, diversity management; Com, communication.