APPLICABILITY OF THE ONLINE SHORT SPATIAL ABILITY BATTERY TO UNIVERSITY STUDENT TESTING

Introduction. Multiple studies highlight the importance of spatial abilities (SA) for educational and occupational success, especially in STEM. Recently, an Online Short Spatial Ability Battery (OSSAB) was developed and normed for SA testing in adolescents. The battery includes mechanical reasoning, paper folding, pattern assembly, and shape rotation tests. It has shown good psychometric characteristics (high reliability and validity, low redundancy, discriminative power), is available in open access, and is free to use. Aim. The present research aims: 1) to examine the applicability of the OSSAB for university student testing; 2) to describe its psychometric properties and structure; and 3) to investigate links between SA and educational performance. Methods. A total of 772 university students (aged 18 to 26, mean age (SD) = 19.55 (1.51), 63.1% female) participated in the study. Participants provided information about their age, gender, university major, and academic achievement, and completed a battery of tests that included the OSSAB tests. Results. The study reports psychometric norms for using the OSSAB with university students. Students’ performance on the OSSAB was similar to that shown in previous research with adolescents in terms of means and variance. The OSSAB showed adequate psychometric properties in this sample: no floor or ceiling effects; low redundancy; moderate to high internal consistency; high discriminative power across university majors; and high external validity. The results indicated that around 6% of the students showed very high levels of SA (higher than 1.5 SD above the mean), and around 8% of students showed very low levels of SA (lower than 1.5 SD below the mean). In addition, the OSSAB scores were linked to educational profile choice and exam scores, with small-to-medium effect sizes. Scientific novelty. The study provides psychometric norms for a short, open, online measure of spatial ability in university students.
Practical significance. The OSSAB can be used to provide individual recommendations to students (e.g. SA training), to identify spatially gifted students, and for research purposes in university contexts.


Aim. The purpose of this study was to evaluate the structure and psychometric properties of the battery when testing university students, to compare them with data obtained from the adolescent sample on which the spatial ability battery (OSSAB) was developed, and to examine the relationships between spatial abilities and educational outcomes.

Introduction
Object-oriented spatial ability (SA) is defined by K. Rimfeld as an ability to produce, recall, store, and modify spatial relations among objects [1]. SA is linked to performance in STEM-related fields, such as mathematics [2], chemistry [3], biology and medicine [4], architecture [5], and IT [6,7]. Longitudinal studies suggest that SA predicts, beyond maths and verbal ability, the domain of future education and career and the associated levels of achievement [8,9]. A meta-analysis by D. H. Uttal and colleagues also shows that SA can be successfully developed through training, and this may lead to improvement in other domains [10]. Another meta-analysis by Z. C. K. Hawes and colleagues, based on 29 experimental studies, showed that spatial training had a positive effect on mathematics, with a small-to-medium effect size [11]. This makes SA an important target for educational testing and development.
However, in educational practice, testing is impeded by the lack of accessible instruments, especially in higher education. This paper aims to validate an existing Online Short Spatial Ability Battery (OSSAB) for administration in university student samples. Adapting the OSSAB for university students addresses three challenges of university education. First, there is a shortage of qualified engineering personnel, which is emphasised by current educational policies [12]. As spatial ability is linked with success in technical and engineering careers, its development can contribute to addressing this issue. Second, in line with a growing trend for personalised education [13], accounting for individual cognitive skills may help to build individual educational trajectories through elective courses and extra-curricular activities, based on one's strengths and weaknesses. Third, there is an identified lack of valid, reliable, psychometrically tested instruments for practitioners in Russia [14] that can be used with university students to support individualised learning. Availability of accessible normed instruments can help students to be active co-developers of their own learning experience. The following review analyses the existing literature on spatial ability testing and how it can support individualised learning.

Literature Review
A growing body of research highlights the importance of SA testing in educational settings (e.g. results from the Project TALENT [15]). J. M. Lakin and J. Wai showed that approximately 7% of US schoolchildren are spatially gifted and that many of these students, as well as their parents and teachers, do not know of this spatial strength [16]. For these students, especially if they do not show high verbal or maths ability, this lack of support might lead to lower academic motivation and behavioural problems. As suggested by H. Kell and D. Lubinski, once spatially gifted students are identified, they can benefit from more challenging activities (e.g. participating in science competitions), courses rich in hands-on content (e.g. robotics or modelling), and experimental laboratory work [17].
SA assessment can also benefit those with low levels of SA: according to D. H. Uttal and colleagues, SA can be improved through training using video games, special courses and spatial tasks training, with an average effect size of 0.47 (Hedges' g) [10]. This is consistent with literature that suggests that SA is a more malleable component of general intelligence (see e.g. J. B. Carroll [18] and E. Krapohl et al. [19]) compared to verbal ability, memory and speed (see the second-order meta-analysis by G. Sala et al. [20]). This finding is important for educators as SA training may bring greater and faster gains, especially since SA improvement may transfer to other domains, in particular to academic performance in STEM [21].
For example, longitudinal research by S. Sorby and colleagues showed that students with low SA, who attended an SA training course, demonstrated higher STEM final grades (GPA), especially grades for engineering problem solving, analysis and calculus; and higher graduation rates compared with their peers who also had low spatial skills but did not take the course [22,23].
In a university context, measuring students' SA may aid in efforts to reduce the dropout rates in higher education [24], especially in STEM areas where dropout can be as high as 30%. Retention rates of STEM students improved when SA was assessed in the first year and spatial training was provided to students who showed low results, as shown by S. Sorby and colleagues [25]. Such training could be implemented in educational programmes in many ways (see review by C. Zhu and colleagues [26]), including specifically developed visualisation courses [27].
Multiple instruments have been suggested to assess various aspects of SA. Already in the 1980s, J. Eliot reviewed approximately 500 existing spatial tests and concluded that classification of the tests was challenging, as the SA factorial structure was not clear [33,34]. Today, 40 years later, there is still no consensus on the number and nature of SA factors. K. Rimfeld and colleagues have shown a significant overlap (i.e. a single factor of SA) in small-scale spatial tasks, in which a viewer has to mentally represent and transform two- and three-dimensional images seen from a single vantage point (e.g. mental rotation, perspective changing, mechanical reasoning) [1]. In addition, there was also significant overlap between this factor and large-scale SA tasks, in which the viewer's perspective can change with respect to the larger environment but the spatial relationships among individual objects are fixed (e.g. wayfinding, navigation, orienting in space), as shown by M. Malanchini and colleagues and M. V. Likhanov and colleagues [35,36]. These data suggest that a small number of object-oriented SA tests can reasonably capture small-scale SA.
Although many batteries of spatial ability have been created (e.g. Pathfinder [37], Sea Hero Quest [38], Cantab [39], or Cognifit [40]), very few have been validated and normed for different samples, and made freely available. Recently, A. V. Budakova and colleagues developed a battery consisting of 4 measures, the Online Short Spatial Ability Battery (OSSAB), that can be used to test small-scale SA reliably [41] and is freely available. The battery includes four tests tapping into different aspects of SA: shape rotation (mental rotation of two-dimensional pictures), paper folding (mentally recreating a series of manipulations with a piece of paper), mechanical reasoning (questions about understanding and applying basic physical laws related to movement), and pattern assembly (combining several figures to build a whole). Two recent studies also used these 4 tests in university student samples as part of a larger SA battery, showing good psychometric properties and robust correlations with other SA tests (in a Russian sample by E. Esipenko et al. [42], and in Russian and Chinese samples by M. V. Likhanov et al. [43]). Later, M. V. Likhanov and colleagues developed psychometric norms for the use of the OSSAB with Russian schoolchildren and provided recommendations for adolescents with different levels of SA [44].
The aims of the current study are: 1) to extend this work and examine the applicability of the Online Short Spatial Ability Battery (OSSAB) for university student testing; 2) to describe its psychometric properties and structure; and 3) to examine links between SA and educational performance.
No reward was given for participation; data collection was anonymous. The students were fully informed about the testing procedure. All participants gave their written consent, and were informed that they could refuse to participate at any moment without explaining the reasons. The study was approved by the Ethics committee of the Interdisciplinary Research at Tomsk State University.

Measures
All participants completed a demographics questionnaire that included information on their age, gender, and educational profile with three options: technical studies; natural sciences; arts & humanities. The participants also reported their results in the Unified State Exam (USE) in Russian Language and Mathematics (two obligatory subjects), which is taken at the end of secondary education (11th grade) and graded anonymously. The USE final score varies from 0 to 100 for each school subject. The USE was chosen as the measure of academic achievement because its grading criteria are nationally standardised. In contrast, university grades cannot be directly compared because they vary across universities and majors.
Participants' SA was assessed with the 4 tests of the OSSAB battery (see Table 1 and description in [41]). We also created an OSSAB total score following the procedure suggested by A. V. Budakova and colleagues, by averaging the proportions of correct responses for the 4 tests.
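The averaging procedure described above can be illustrated with a minimal Python sketch. The item counts in the example are hypothetical (the actual test lengths are given in Table 1 and in [41]), and expressing the total on a 0 to 100 scale is an assumption based on the reported means.

```python
def proportion_correct(n_correct, n_items):
    """Proportion of correct responses on one test."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return n_correct / n_items


def ossab_total(scores):
    """Average the per-test proportions and scale to 0-100.

    `scores` maps a test name to (n_correct, n_items).
    The percent scaling is an assumption, not taken from the source.
    """
    proportions = [proportion_correct(c, n) for c, n in scores.values()]
    return 100 * sum(proportions) / len(proportions)


# Hypothetical participant; item counts are illustrative only.
example = {
    "shape_rotation": (5, 10),
    "paper_folding": (6, 12),
    "mechanical_reasoning": (3, 4),
    "pattern_assembly": (1, 4),
}
total = ossab_total(example)
```

Averaging proportions rather than raw counts gives each of the four tests equal weight in the total, regardless of how many items it contains.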

Statistical Analysis
All statistical analyses were performed using IBM SPSS Statistics (v. 26.0.0.0) and Jamovi (v. 2.4.6). Before the analyses, all data were standardised and screened for missing values and for univariate and multivariate outliers. The threshold of Z = 3.29 was used to exclude univariate outliers, as recommended by A. P. Field, and Mahalanobis distance was used to find multivariate outliers [45]. In total, less than 5% of the data were excluded as outliers.
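The univariate part of this screening can be sketched in Python as follows. This is an illustration only: the analyses in the study were run in SPSS and Jamovi, the function names are hypothetical, and the multivariate (Mahalanobis) step is not covered here.

```python
from statistics import mean, stdev

Z_THRESHOLD = 3.29  # cut-off reported in the text


def z_scores(values):
    """Standardise values using the sample mean and sample SD."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def univariate_outliers(values, threshold=Z_THRESHOLD):
    """Return the indices of values whose |z| exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]


# Illustrative data: 28 typical scores and one extreme value.
scores = [48, 52] * 14 + [150]
flagged = univariate_outliers(scores)
```

Note that with very small samples no value can reach |z| > 3.29, so this criterion is only meaningful for samples of a reasonable size.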

Descriptive Analysis
The distribution of the OSSAB total score was close to normal (see Figure 1); skewness and kurtosis varied within the acceptable range (from -2 to +2). The mean for the OSSAB total score was 52.99 (SD = 17.29), very similar to that previously reported for adolescents (Mean = 51.02; SD = 20.02) [44]. The descriptive statistics for the OSSAB subtests (Shape Rotation, Mechanical Reasoning, Paper Folding, and Pattern Assembly) are presented in Table 2.

Psychometric Properties of the OSSAB
Following A. V. Budakova and colleagues [41], we used six criteria to evaluate the applicability of the OSSAB battery to university students: differentiating power, high reliability, high external validity, specificity, low redundancy, and absence of floor and ceiling effects.
Cronbach's alpha and split-half reliability, presented in Table 2, showed an adequate level of reliability for the tests: Cronbach's alpha was above 0.7, the threshold suggested by M. Tavakol & R. Dennick [46], with slightly lower reliability for Mechanical Reasoning (Cronbach's alpha = 0.62). The 4 SA tests correlated positively with each other, with moderate effect sizes (r varying from 0.36 to 0.47), suggesting that the scales do not duplicate each other and meet the low redundancy criterion [41].
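For readers who wish to reproduce such reliability estimates on their own item-level data, the two coefficients can be sketched in Python as below. The odd/even split and the Spearman-Brown correction are assumptions made for illustration; the text does not state which split-half variant was used in the study.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per participant."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def split_half_reliability(rows):
    """Odd/even split-half reliability with the Spearman-Brown correction."""
    odd_half = [sum(row[0::2]) for row in rows]
    even_half = [sum(row[1::2]) for row in rows]
    r = pearson_r(odd_half, even_half)
    return 2 * r / (1 + r)


# Illustrative data: four perfectly consistent items, five participants.
consistent = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]
# Illustrative data: two uncorrelated items, four participants.
unrelated = [[1, 1], [1, 0], [0, 1], [0, 0]]
```

Perfectly consistent items give the upper bound of 1 for both coefficients, whereas uncorrelated items give an alpha of 0.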
We further explored whether the 4 OSSAB tests can distinguish among students with different majors: Technical Studies, Natural Sciences, and Arts and Humanities. The results of the MANOVA showed a weak but significant main effect of major: F(8, 1534) = 10.26, p < 0.001, ηp² = 0.05 (see Table 3). Follow-up ANOVAs showed similar effects for individual tests, with ηp² ranging from 0.02 to 0.09 (Table 4). Discriminant analysis was used to follow up the MANOVA, as recommended by A. P. Field [45]. The results of this analysis showed two discriminant functions: the first function explained 97.7% of the variance, canonical R² = 0.10; the second explained only 2.3%, canonical R² = 0.003. The functions at group centroids demonstrated that function 1 (positively loaded by all four tests, with the largest correlation with mental rotation, r₁ = 0.94) discriminated Technical Studies and Natural Sciences from Humanities and Arts. The second function (positively loaded by shape rotation and paper folding, negatively loaded by the other two tests) slightly differentiated Natural Sciences from STEM and Arts. In combination, these functions significantly differentiated among the three educational profiles: Wilks' Lambda = 0.90, χ²(8) = 82.03, p < 0.001.
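As a consistency check, the partial eta squared reported for the MANOVA can be recovered from the F statistic and its degrees of freedom using the standard conversion ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative; that the statistical software derived the reported value in exactly this way is an assumption.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported in the text for the main effect of major.
effect = partial_eta_squared(10.26, 8, 1534)
```

Rounded to two decimals, the result agrees with the reported effect size of 0.05.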
To check the external validity of the 4 tests, we examined correlations of the OSSAB with participants' Unified State Exam scores in Russian Language and Mathematics. The results showed adequate external validity for the OSSAB, with moderate positive correlations between the 4 OSSAB tests and Mathematics USE, and weak correlations with Russian USE (see Table 5). The means (SDs) for the OSSAB and for the USE in Russian Language and Mathematics are presented in Figure 2. The pattern of group differences mirrors the correlational pattern: for SA and mathematics, Technical Studies students showed the best performance, followed by Natural Sciences and Humanities; for the Russian Language exam, no group differences were observed.

Psychometric Norms
In order to establish norms for the OSSAB in a student sample, we calculated the quartiles and boundaries of 8 levels of SA (following previous research in intelligence [47] and spatial ability [44]). The results are presented in Tables 6 and 7. Finally, we examined how students from Natural Sciences, Technical Studies and Humanities & Arts are distributed across the established levels. The three groups differed in the proportion of students falling into the different levels.
For example, only about 3% of Technical Studies students demonstrated very low SA scores, compared to about 9% in Natural Sciences and about 12% in Humanities. For more details, see Figure 3.
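To illustrate how an individual total score relates to the sample statistics, the sketch below flags scores more than 1.5 SD above or below the mean, using the mean (52.99) and SD (17.29) reported for this sample. It covers only these two extreme bands; the boundaries of the full set of 8 levels are given in Table 7 and are not reproduced here. The function name and band labels are hypothetical.

```python
NORM_MEAN = 52.99  # OSSAB total score mean reported for the student sample
NORM_SD = 17.29    # corresponding standard deviation
CUTOFF_SD = 1.5    # distance from the mean used for the extreme bands


def sa_band(total_score, mean=NORM_MEAN, sd=NORM_SD, cutoff=CUTOFF_SD):
    """Classify a total score as very high, very low, or in between."""
    z = (total_score - mean) / sd
    if z > cutoff:
        return "very high"
    if z < -cutoff:
        return "very low"
    return "within 1.5 SD of the mean"


bands = [sa_band(score) for score in (80, 53, 25)]
```

With these statistics, the cut-offs fall at roughly 78.9 and 27.1 on the total score scale. For practical use, the published norms in Tables 6 and 7 should take precedence over this simplified rule.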

Discussion
The current study explored the applicability of the OSSAB to university students. The results of the student testing are in line with previously reported results from adolescents described by A. V. Budakova and colleagues [41]. Psychometric analysis demonstrated low redundancy, moderate split-half reliability, and absence of floor and ceiling effects for the four SA subtests. As for external validity, the correlations between OSSAB subtests and mathematics grades were positive and moderate, in line with the meta-analysis by K. Atit and colleagues that reported a correlation of 0.36 between spatial and mathematical skills [2]. In contrast, the correlations with exam scores in Russian language were weaker. As expected, the correlation between the two exams was moderate (r = 0.40, p < 0.001), reflecting multiple overlapping factors reported in previous studies by Y. Kovas et al. [48] and I. A. Voronin et al. [49], including overlapping genetic and environmental contributions, motivation and general cognitive ability. The stronger correlations between SA and mathematics, compared to those between SA and language, can be considered evidence of the external validity of the OSSAB.
The current study showed that the OSSAB tests have sufficient internal validity, comparable to that of other instruments tapping into spatial ability (e.g. Bricks or other tests from the King's Challenge battery [49,50]). The results of the current study also demonstrated that the OSSAB can differentiate students with majors in technical studies, natural sciences, and arts and humanities, with a small-to-medium effect size. Comparable effects of expertise were reported in previous studies: for example, ηp² = 0.16 by E. S. Tsigeman and colleagues [52] or ηp² = 0.07 by S. Y. Yoon & E. L. Mann [53].
The current study reports psychometric norms for the OSSAB in university students. Around 6% of the students demonstrated very high SA, scoring more than 1.5 SD above the mean. These results are in line with previous studies reporting that 4 to 7 percent of adolescent schoolchildren could be considered spatially gifted (see publications by J. M. Lakin & J. Wai for the US sample [16] and M. V. Likhanov and colleagues for the Russian sample [44]). Around 8 percent of the students scored lower than 1.5 SD below the mean, a smaller proportion than in the adolescent sample (15.32%). This lower proportion can be explained by implicit and explicit selection associated with educational choices and university entry criteria. Further research is needed to assess the generalisability of the results beyond the current sample, including in different countries, as the proportion of school graduates varies across countries. In Russia, according to A. Bessudnov and colleagues, approximately 50% of all schoolchildren in compulsory education (9th grade) go on to university [54,55]. In addition, higher scores in students can be attributed to slight age differences between the current sample and previously reported adolescent samples, as SA was shown to grow with age in longitudinal studies (e.g. M. V. Likhanov et al. [56] and M. Rodic et al. [50]). Relatedly, university students on average have had longer engagement with SA-developing activities, such as playing video games (B. Bediou et al. [57]), practising sports (D. Voyer & P. Jansen [58]), and reading maps (C. Davies & D. Uttal [59]).
The results of the study could be used to provide individual recommendations to students based on their OSSAB scores. For example, H. J. Kell et al. [9] and R. M. Webb and colleagues [60] suggested that students who score in the high to extraordinary giftedness range may consider pursuing STEM courses and degrees. Gender differences in SA have been examined in meta-analyses (e.g. [63]); in large-scale cross-cultural studies by H. J. Spiers et al. [38] and I. Silverman et al. [64]; and in studies from selected populations by E. S. Tsigeman et al. [52]. For example, in the study by M. Stieff and colleagues, spatial training targeting strategies of spatial problem solving [65] reduced gender differences and the overall gender gap in STEM [66].

Conclusion
The current study addresses the need for validated, normed, reliable tools for measuring cognitive abilities, and spatial abilities in particular. The literature reviewed in the current study suggests that spatial ability is a promising target for educational interventions, as it is malleable and linked to success in different academic domains. The study selected the OSSAB, a validated short online SA battery that is open, free, and can be used for research and practical purposes in educational contexts. Our research showed that the OSSAB, initially developed for adolescents, can be used for SA assessment in university students. We reported psychometric norms that can be used to provide recommendations to students with different levels of SA. Future research is needed to investigate SA in samples with different expertise and fields of study. This knowledge will help to evaluate whether the OSSAB can be used predictively for career guidance.

Fig. 3. OSSAB levels for Technical Studies, Natural Sciences and Humanities students
Students who demonstrate low SA scores can benefit from SA training. Several recent studies have shown positive effects of SA training on overall academic performance, as shown by N. Judd & T. Klingberg in a sample of children [61], and by N. Veurink & S. Sorby in a sample of university students [23]. Moreover, SA training may specifically target factors implicated in gender differences in SA (see meta-analyses by Y. Maeda & S. Y. Yoon [62] and A. Nazareth et al. [63]).
Spatial training can also be delivered through computerised gaming applications [28] or new classes with spatialised content, e.g. 3D modelling and design as suggested by F. Dilling and A. Vogler [29], an anatomy course as in M. A. T. M. Vorstenbosch et al. [30], a GIS course as in B. Kolvoord et al. [31], or robotics as in C. Julià and J. Ò. Antolì [29]. To provide such support to students, good and reliable measures of SA are needed.

Table 5
Correlations between the Unified State Exam Scores and the OSSAB
Note: * p < .05, ** p < .01, *** p < .001

Fig. 2. Means of the OSSAB total score, USE Russian Language and USE Mathematics results by profile
Note: Error bars show 95% CI

Table 6
Percentiles of the OSSAB total score distribution

Table 7
Norms for the OSSAB total score