Identifying Criteria for a Physical Literacy Screening Task: An Expert Delphi Process

Research Article Exercise Medicine Open Access eISSN: 2508-9056

INTRODUCTION
Increasing sedentary behavior and decreasing physical activity are growing concerns in children’s health promotion. The Canadian 24-hour Movement Guidelines recommend children aged 5-17 years perform at least 60 minutes of moderate-to-vigorous physical activity daily[1]. However, Canadian Health Measures Survey data indicate only 36% of Canadian children and youth meet these guidelines[2]. Physical literacy has recently come to the forefront as a concept describing factors that contribute to a healthy, active lifestyle. The Canadian consensus definition of physical literacy is “the motivation, confidence, physical competence, knowledge and understanding to value and take responsibility for engagement in physical activities for life”[3]. Physical literacy is an independent indicator of cardiovascular fitness[4] and children with higher physical literacy scores are more likely to achieve the physical activity guidelines[4]. In turn, guideline compliance is associated with enhanced physical, psychological, social, and cognitive health outcomes[5,6]. Being able to identify children with lower levels of physical literacy would enable targeted interventions to decrease their sedentary lifestyle morbidity risk. To date, multiple protocols have been developed to assess physical literacy[7], but none are suitable for use as a quick screening tool by “REACH” sector leaders. REACH refers to physical literacy supportive sectors where leaders often do not have specialist physical activity training. They include: Recreation (municipal, private, camps), Education (school-based), Allied health (fitness, health promotion), Coaching (lessons, sport), and Healthcare (doctors, nurses, therapists). REACH sector leaders are well-suited to assess and support the physical literacy of all children.
The Canadian Assessment of Physical Literacy (CAPL) assesses four domains of physical literacy: motivation and confidence, physical competence, daily activity behavior, and knowledge and understanding. The CAPL has high validity to measure the physical literacy of children aged 8 to 12 years [8] and each protocol has published reliability data for this age [9]. However, the CAPL requires significant time and space, and some protocols are designed to be performed by individuals with specialized training. The second edition of the CAPL (www.capl-eclp.ca) reduced examiner and participant burden, but still requires more time and space than would be appropriate for a screening test [10]. The Physical Literacy Assessment for Youth (PLAY), developed by Sport for Life, is designed to assess physical literacy for children aged 7 years and older. A sub-tool, called PLAYfun, and a shorter version, called PLAYbasic, have validity evidence [11], but only good inter-rater reliability for average measures and moderate-to-good reliability for single measures [12]. Moreover, the PLAYfun and PLAYbasic tools only assess the motor competence component of physical competence, and are thus not sufficient to be used as stand-alone screening tools for the broader concept of physical literacy. Other assessment tools, such as Passport for Life, developed by Physical Health Education (PHE) Canada, currently have no peer-reviewed evidence for reliability or validity [7].
Based on the limitations of these existing physical literacy assessments, a new approach to screening childhood physical literacy is required. The goal of this project was to establish the criteria for a physical literacy screening tool that could quickly and effectively identify children at risk of inactive lifestyles due to significant physical literacy deficits. The ideal tool would be suitable for use by leaders across all REACH sectors, within the time and space constraints of their work environments.

METHODS
A Delphi process is a communication framework based on multiple rounds of feedback from a panel of experts, in order to systematically elicit a reliable consensus of opinion on emerging or unexplored issues. The Delphi method relies on anonymity of the participants, structured information flow, and feedback between the facilitators and participants [13,14]. A 3-round Delphi method was used to establish the criteria for a physical literacy screening task suitable for all REACH sectors. Round 1 group discussion notes were recorded without identifying information. Responses for Rounds 2 and 3 were de-identified with coded ID numbers so that results could be tabulated without identifying specific participants. The study protocol was approved by the Research Ethics Board of the Children's Hospital of Eastern Ontario (#13/124X), and all participants provided written informed consent.

Round 1
Participants (n=32) with expertise in research, education, allied health, coaching, or healthcare, and with an interest in childhood physical activity were identified. A 1-day, in-person meeting was facilitated by PL, CB, and AA with assistance from 2 graduate students. The research team shared results of an environmental scan outlining current physical literacy assessment tools, including the Canadian Assessment of Physical Literacy (CAPL). A group discussion identified goals and criteria for potential screening tasks, and which domains of physical literacy (motivation, knowledge, behavior, and physical competence) should be reflected. The panel was then divided into five sub-groups. Each sub-group had 30 minutes to identify specific tasks that should be investigated as a potential screening task for children's physical literacy. These discussions were unstructured, and each sub-group comprised 5 or 6 experts plus one designated note-taker. Experts were assigned to sub-groups so that each group had representation from each REACH sector. The benefit of having multiple sub-groups rather than a single group discussion is that it is easier for each person to contribute their ideas in a small group. Combining the information from each small group makes it more likely that diverse points of view are recognized and considered. Two researchers (AA, CB) used qualitative research methods to independently analyze the meeting notes from the presentation sessions, large group discussion, and 5 small group discussions. Comments regarding the criteria for a screening task were identified.
Based on their expertise in kinesiology (i.e., graduate degree) and experience assessing physical literacy, the researchers identified each comment and categorized it by the physical literacy domain represented (overall, physical competence, motivation and confidence, knowledge and understanding, daily behavior) and whether it related to the task performed (e.g., balance on one leg) or the testing environment (e.g., small clinic room). Similar comments were grouped together to represent important concepts. Three researchers (CB, AA, PEL) then met to review the identified concepts, ensure consistency between the two analyses, and identify important themes. A list of statements outlining goals and criteria for potential physical literacy screening assessment tasks was compiled based on the identified themes.

Round 2
The list of statements developed from the analyses of Round 1 responses was circulated electronically for Round 2. Delphi participants were asked to rate each statement using a 5-point Likert scale (strongly disagree, disagree, neutral, agree, and strongly agree). A priori, the researchers defined positive consensus as ≥75% of participants responding "agree" or "strongly agree" to a statement. It was felt that this would identify tasks and criteria that a majority of experts agreed were important. Similarly, negative consensus was defined as ≥75% of participants responding "disagree" or "strongly disagree" to a statement. Non-consensus statements were defined as those that did not meet either positive or negative consensus criteria. Frequency tabulations determined the degree of consensus for each statement and mean statement ratings were calculated. Additional questions regarding length of time for screening, space requirements for screening, and time for interpretation of results were also included.
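The consensus rule described above can be illustrated with a minimal Python sketch. The function name, constants, and example responses are hypothetical and serve only to show how the 75% threshold separates positive consensus, negative consensus, and non-consensus; they are not study data or the researchers' own tabulation code.

```python
from collections import Counter

# A priori cut-off stated in the text (75% of respondents).
CONSENSUS_THRESHOLD = 0.75


def classify_consensus(responses, threshold=CONSENSUS_THRESHOLD):
    """Classify one statement as 'positive', 'negative', or 'non-consensus'.

    `responses` is a list of 5-point Likert labels, one per responding expert.
    """
    if not responses:
        raise ValueError("at least one response is required")
    counts = Counter(responses)
    n = len(responses)
    # Proportion of experts choosing either of the two agreement categories.
    agree = (counts["agree"] + counts["strongly agree"]) / n
    # Proportion of experts choosing either of the two disagreement categories.
    disagree = (counts["disagree"] + counts["strongly disagree"]) / n
    if agree >= threshold:
        return "positive"
    if disagree >= threshold:
        return "negative"
    return "non-consensus"


# Hypothetical panel of 28 experts (illustrative only, not study results).
example = (["strongly agree"] * 12 + ["agree"] * 10
           + ["neutral"] * 4 + ["disagree"] * 2)
print(classify_consensus(example))  # 22 of 28 agree (78.6%), so: positive
```

Under this rule a statement with exactly 75% agreement (e.g., 21 of 28 experts) still counts as positive consensus, because the threshold is inclusive.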

Round 3
The group mean was calculated for each statement based on 1 (strongly disagree) to 5 (strongly agree) points being assigned for each response. Statements from Round 2 were then recirculated to participants, accompanied by the group mean ± standard deviation and their previous individual response. Participants could then retain or change their original response based on further reflection and their knowledge of the group mean response. Final response distributions, to be used to guide future research, were calculated using the Round 3 responses. The same criteria used to define consensus in Round 2 were applied to the Round 3 responses.
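As a sketch of the feedback each participant received in Round 3, the following Python example converts Likert labels to points and reports the group mean and standard deviation alongside the participant's previous response. The function and variable names are hypothetical, the responses are invented for illustration, and the use of the sample (n-1) standard deviation is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev

# Points assigned to each response, as described in the text.
SCORES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def round3_feedback(group_responses, own_response):
    """Summarize one statement for recirculation to a single participant."""
    points = [SCORES[r] for r in group_responses]
    return {
        "group_mean": round(mean(points), 2),
        # Assumption: sample standard deviation (n-1 denominator).
        "group_sd": round(stdev(points), 2),
        "your_previous_response": own_response,
    }


# Hypothetical responses from eight experts (illustrative only).
responses = ["agree"] * 4 + ["neutral"] * 2 + ["strongly agree"] * 2
feedback = round3_feedback(responses, own_response="neutral")
print(feedback)
```

For these invented responses the group mean is 4.0 and the sample standard deviation rounds to 0.76, so a participant who previously answered "neutral" would see that their rating sat below the group average.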

Data Analyses
Qualitative inductive thematic analysis was used to identify key themes and statements from the written text summarizing the meeting discussions (presentations, large group, 5 small groups). All of the written text was reviewed multiple times by two researchers (CB, AA), and comments related to screening tasks or criteria were identified. Concepts were identified to represent similar comments. Three researchers (CB, AA, PEL) then met to review the identified concepts, ensure consistency in the two analyses, and combine concepts into important themes.
For Rounds 2 and 3, points were awarded to each response: 1=Strongly disagree, 2=Disagree, 3=Neutral, 4=Agree, 5=Strongly agree. Frequency distributions summarized the responses to each item and determined consensus. One-way ANOVA statistics were calculated for final Round 3 data to determine whether response distributions were significantly influenced by the experts' gender or REACH sector. Analyses were completed using IBM SPSS Version 26 (IBM Corp., released 2019, Armonk, NY), with statistical significance set at alpha=0.01 to account for the large number of comparisons.
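For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed as in the minimal Python sketch below. This is a plain-Python illustration and not the SPSS procedure used in the study; the function name and the example ratings grouped by sector are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    # Note: raises ZeroDivisionError if every group has zero variance.
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical 1-5 ratings of one statement from three sectors (not study data).
ratings_by_sector = [[4, 5, 4, 5], [3, 4, 3, 4], [2, 3, 2, 3]]
f_stat, df_b, df_w = one_way_anova_f(ratings_by_sector)
print(round(f_stat, 2), df_b, df_w)
```

The p-value would then be obtained from the F distribution with the returned degrees of freedom; statistical packages such as SciPy (`scipy.stats.f_oneway`) report the statistic and p-value together.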

RESULTS
Participants
Invitations were sent to 53 experts requesting their participation in the 3-round Delphi process, of whom 37 provided consent to participate. Twenty-eight participants (75.7%) attended Round 1, 31 (83.8%) completed Round 2, and 28 (75.7%) completed Round 3. Those who did not complete all three rounds either did not respond to follow-up or withdrew from specific rounds due to time constraints. The final panel for each round was composed of at least 7 experts with primary or secondary expertise for each sector (recreation, education, allied health, coaching, and healthcare). Descriptive statistics of participants for each round are provided in Table 1.

Round 1
The facilitated large group discussion among the 28 participants identified 45 screening task criteria. In addition, the experts agreed that the screening task should be flexible and easy to administer, require limited equipment and space, be accessible to all children and REACH sectors, and be complementary to existing resources and research. Five small groups, each with representation from multiple REACH sectors, separately developed a "top 10" list of specific screening tasks. Given the number of expert participants (n=32), the variation in their expertise and experience, and the deliberate construction of small groups with representatives from each REACH sector, it is not surprising that there was significant heterogeneity of tasks between groups (see Supplemental File 1). However, all groups agreed that a screening assessment should consist of both physical tasks that can be objectively observed and questionnaires for self-reporting of different aspects of physical literacy. All groups also supported the inclusion of at least one task from each of the four domains within the consensus definition of physical literacy: motivation and confidence, physical competence, knowledge and understanding, and daily behavior.

Round 3
When the questionnaire was circulated a second time with the mean scores for each question included, positive consensus (≥75% of respondents) was achieved for 47 of 90 (52.2%) statements. Round 3 consisted of statements regarding the screening task criteria (Table 2), what aspects of physical literacy should be measured (Table 3), the results the screening task should provide (Table 4), and details of screening task test administration (Table 4). Negative consensus was achieved for 4 statements (4.3%). The four statements that achieved negative consensus (i.e., consensus of disagreement) indicated that the screening task would require promotion/publication and should include more than one task, more than one component of physical literacy, and not be specific to one domain of physical literacy. Thirty-nine statements (43.5%) had diverse responses and did not demonstrate consensus. These diverse opinions were also reflected in the feedback regarding the time and space required to complete the screening task(s). The most frequent response (13/28, 46%) regarding the time required to screen one child was less than 10 minutes, but 11 (39%) experts thought that screening should be completed in less than 2 minutes (Figure 1). Most respondents felt that it would be appropriate to screen a group of 10 children in 25-30 minutes (n=20, 71%). Interpretation of results was recommended to be less than 2 minutes (n=14, 50%) or 5 to 10 minutes (n=14, 50%). The size of space required (Figure 2) was recommended as a small clinic room (n=9, 32%), a classroom (n=14, 50%) or a gymnasium (n=5, 18%). Participant gender did not significantly influence the results. Participant REACH sector also did not influence most results, with two exceptions. Leaders from the education and allied health sectors were less likely to agree (p=0.003) that the screening task should be appropriate for the child's chronological age. Whether the screening task should measure body composition also differed by sector, with leaders from allied health and coaching being less likely (p=0.01) to agree that the measurement of body composition was important (Table 5).

Round 3 response counts by statement (n=28), listed as agree, neutral, disagree where all three values are available:
Be based on skill progressions (e.g., standing long jump to triple jump): 13, 12, 3
Include skills that can be assessed in all REACH sectors: 17, 8, 3
Include only one component of physical literacy (e.g., handgrip or plank): 5, 22
Be specific to one domain of physical literacy (e.g., knowledge, motivation): 24
Be a circuit, reflecting all physical literacy domains: 16, 8, 4
Maintain the domains of the Canadian Assessment for Physical Literacy: 13, 13, 2
Be the same for each REACH sector: 4, 14, 10
Differ by REACH sector (i.e., different leaders choose depending on need): 18, 8, 2
Be something a child can practice on his or her own time: 14, 10, 4
Be able to be used by parents at home: 15, 10, 3
Be an active assessment that can be observed and scored: 22, 6, 0
Include a questionnaire to assess motivation: 23, 5, 0
Include a self-report measure of physical activity: 20, 6, 2
The physical literacy screening task(s) should include only 1 item: 0, 3, 25
The physical literacy screening task(s) should include no more than 3 items: 2, 10, 16
The physical literacy screening task(s) should include no more than 5 items: 13, 11, 4
Equipment needed should fit in a backpack or tool kit: 21
Results should be accompanied by information for leaders and parents: 26, 0, 2
Screening results should be compatible with the electronic medical record: 12, 15, 1
Screening results should be in a database available to REACH leaders: 20, 6, 2
Resources need to be available for follow-up after screening results: 26, 1, 1
Allied-health professional referrals should be available after screening: 24, 2, 2
A publication/promotion plan should be developed with the screening: 26, 2, 0
Screening should be publicized so REACH sectors know the benefits: 27, 0, 1
Screening task(s) do not require promotion, just word-of-mouth awareness: 3, 1, 24

DISCUSSION
Leaders from the Recreation, Education, Allied health, Coaching and Healthcare (REACH) sectors are strategically situated to proactively identify and support children at risk of low engagement in physical activity. To target supportive resources to children with the greatest need, REACH leaders require a physical literacy screening tool that can be used effectively in these settings. Through a Delphi process, this work identified the criteria for a physical literacy screening tool based on expert consensus. The ideal screening tool should assess multiple domains of physical literacy, including physical competence, knowledge and understanding, and motivation. It should be accompanied by educational information, suitable for use in all REACH sectors, based on objective measurements, appropriate for the chronological and developmental age of the participants and for use with children with and without disabilities. There was consensus that the screening tool should provide an indication of the child's motor skill, cardiorespiratory fitness, core strength, activity motivation and their active and sedentary daily behavior. To achieve these goals, the screening tool should be comprised of both active movement assessments and questionnaires assessing motivation and/or knowledge. The screening results should also provide information useful to REACH leaders and parents, and provide a decision tree and referral resources for follow-up action when needed. While a majority of the respondents (20/28, 71%) felt that 25 to 30 minutes would be a suitable time for screening a group of 10 children, the Delphi panel was unable to achieve consensus regarding the amount of time to screen one child or the space that should be required. The respondents were fairly evenly split between recommending that screening one child be completed in less than 2 minutes or that requiring up to 10 minutes would be appropriate. 
Similarly, many participants felt that the assessment tool should be suitable for administration in a small clinic space, while others indicated that a classroom-sized space would be appropriate. Surprisingly, the differences among participants were largely unrelated to their gender, REACH sector or years of service. Males tended to allow a longer time to screen one child (r=0.35, p=0.07) and the time required to assess a group of 10 children was higher among leaders with fewer years of experience (r=0.37, p=0.05). REACH sector was not associated with recommendations regarding the time or space required for screening (r<0.15, p>0.50) suggesting that the same screening tool could be suitable across sectors and settings.
That many experts recommended the screening be completed in less than 2 minutes would seem to somewhat contradict their recommendations for screening task content. Content recommendations focused on objective measurements and emphasized that the screening should incorporate more than one task, and assess more than one physical literacy domain. There was consensus that the screening task should provide an indication of motor skill, cardiorespiratory fitness, physical activity motivation, core strength and daily active and sedentary behaviors. For each of these 6 items to be objectively measured in 2 minutes would require each task to be a maximum of 20 seconds or the use of more complex tasks that combine multiple measures (e.g., a circuit of stations). Daily active and sedentary behaviors require multiple days of monitoring to obtain accurate and objective measures [15]. Similarly, measures of cardiorespiratory fitness require at least 2 to 3 minutes of steady state exercise [16]. Daily activity, screen time and cardiorespiratory fitness could be assessed quickly by asking children to self-report; however, it is recognized that self-report data represent perceived activities rather than actual participation. Investigations of potential screening tasks will need to give careful consideration to balancing the desire for a more comprehensive, objective screening tool with the desired timeline.
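The time constraint discussed above is simple arithmetic, sketched here in Python under the assumption that the total screening time is divided evenly among items with no transition time. The function name is hypothetical, and the 10-minute case is included only for comparison with the other commonly recommended limit.

```python
def per_item_budget_seconds(total_minutes, n_items):
    """Seconds available per item if the total time is split evenly."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return total_minutes * 60 / n_items


# Six items within a 2-minute screen, as discussed in the text.
print(per_item_budget_seconds(2, 6))   # 20.0 seconds per item
# The same six items within a 10-minute screen, for comparison.
print(per_item_budget_seconds(10, 6))  # 100.0 seconds per item
```

Any time needed to move between tasks or give instructions would reduce these per-item budgets further.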
The screening task content aligns closely with previously recommended components of physical literacy. The Canadian consensus definition of physical literacy [17], which aligns with the physical literacy philosophical writings of Whitehead [18], emphasizes the importance of physical competence (motor skill, strength, endurance, etc.), knowledge and motivation as it relates to the achievement of an active lifestyle (daily behavior). A previous Delphi process to establish the components of a detailed assessment of physical literacy [8] also emphasized the importance of movement skill, strength and cardiorespiratory fitness. The second edition of the Canadian Assessment of Physical Literacy also increased the importance of physical activity motivation within the overall physical literacy score [10]. The Canadian Agility and Movement Skill Assessment is an agility course that assesses fundamental, complex and combined movement skills, with a median completion time of 17 seconds per trial. Two practice trials and two timed/scored trials are required, for a total time requirement of 1.5 minutes per child [19]. The PLAYbasic assessment focuses on five motor tasks: running, jumping, throwing, kicking, and balance [20], which also align with the motor assessment consensus criteria developed by this Delphi panel. Both of these items could potentially be used as part of an effective screening task, although the moderate-to-good test-retest reliability reported for the PLAY tools [12] would be of concern. The motivation assessment of the Canadian Assessment of Physical Literacy-2 [20] or the Children's Self-perceived Adequacy and Predilection for Physical Activity [21] are existing measures of children's motivation for physical activity.

Strengths and Limitations
While an effort was made to have equal representation from each REACH sector on the panel, agreement for participation and completion of all rounds of the Delphi protocol could not be controlled. This resulted in unequal distribution of experts from the different sectors (Tables 1 & 2), which may have introduced bias into the recommendations and feedback. Variability in the requirements for administration of a screening test may be due to different resources available among REACH sectors, although our analyses did not indicate a systematic difference in recommendations by sector. At the start of the Round 1 meeting, the researchers presented the Canadian consensus definition of physical literacy, which matches that of the International Physical Literacy Association, as the framework for the discussions. It was recognized that there are multiple definitions for the concept of physical literacy, and therefore the Canadian consensus definition was presented and discussed in order to provide a common knowledge base among the participating experts. However, use of this definition may have influenced the direction of the recommendations provided. Finally, the Round 3 panel included more females (n=17, 61%), with higher representation from allied health (n=10, 36%) and healthcare (n=6, 21%) than other sectors (n<5). In addition, over half of the experts were from the allied health or healthcare sector, which may explain why there was a focus on cardiorespiratory fitness and body strength, since these are well-known indicators of physical health.

CONCLUSIONS
A physical literacy screening tool would enable leaders in recreation, education, allied health, coaching, and healthcare to identify children with limited capacity for a healthy, active lifestyle (i.e., low physical literacy). Expert consensus suggests that such an assessment should encompass multiple facets of physical literacy, such as motor competence, motivation, strength, endurance, and daily behavior. Objectively observed physical tasks could be combined with subjective questionnaire responses, as appropriate for each factor being assessed. Additional research is required to evaluate the efficacy of potential tasks meeting these criteria, which can be completed quickly and are suitable for use in each of the REACH sectors. The reliability of screening task results, their validity relative to an in-depth physical literacy assessment [10], and the generalizability for screening different ages and children with/without medical conditions or disabilities should also be evaluated.