
Supplement to Critical Thinking

How can one assess, for purposes of instruction or research, the degree to which a person possesses the dispositions, skills and knowledge of a critical thinker?

In psychometrics, assessment instruments are judged according to their validity and reliability.

Roughly speaking, an instrument is valid if it measures accurately what it purports to measure, given standard conditions. More precisely, the degree of validity is “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests” (American Educational Research Association 2014: 11). In other words, a test is not valid or invalid in itself. Rather, validity is a property of an interpretation of a given score on a given test for a specified use. Determining the degree of validity of such an interpretation requires collection and integration of the relevant evidence, which may be based on test content, test takers’ response processes, a test’s internal structure, relationship of test scores to other variables, and consequences of the interpretation (American Educational Research Association 2014: 13–21). Criterion-related evidence consists of correlations between scores on the test and performance on another test of the same construct; its weight depends on how well supported is the assumption that the other test can be used as a criterion. Content-related evidence is evidence that the test covers the full range of abilities that it claims to test. Construct-related evidence is evidence that a correct answer reflects good performance of the kind being measured and an incorrect answer reflects poor performance.
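To make the idea of criterion-related evidence concrete, the sketch below computes the correlation between scores on a test under study and scores on an assumed criterion test. It is a minimal illustration in Python with invented numbers, not data from any actual validation study.

```python
# Minimal sketch: criterion-related evidence as the correlation between
# scores on a new test and scores on an established test of the same
# construct. All numbers are invented for illustration.
import numpy as np

new_test = np.array([12, 15, 9, 20, 17, 11, 14, 18])    # scores on the test under study
criterion = np.array([14, 16, 10, 22, 15, 12, 13, 19])  # scores on the assumed criterion test

r = np.corrcoef(new_test, criterion)[0, 1]
print(f"criterion-related validity coefficient r = {r:.2f}")
# How much weight this r deserves depends on how well supported the
# criterion test itself is.
```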

An instrument is reliable if it consistently produces the same result, whether across different forms of the same test (parallel-forms reliability), across different items (internal consistency), across different administrations to the same person (test-retest reliability), or across ratings of the same answer by different people (inter-rater reliability). Internal consistency should be expected only if the instrument purports to measure a single undifferentiated construct, and thus should not be expected of a test that measures a suite of critical thinking dispositions or critical thinking abilities, assuming that some people are better in some of the respects measured than in others (for example, very willing to inquire but rather closed-minded). Otherwise, reliability is a necessary but not a sufficient condition of validity; a standard example of a reliable instrument that is not valid is a bathroom scale that consistently under-reports a person’s weight.
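The reliability notions mentioned above can be illustrated with a short sketch: Cronbach's alpha as an estimate of internal consistency and a simple correlation as a test-retest estimate. The score matrix is invented for illustration, and, as noted, alpha is an appropriate expectation only for an instrument measuring a single undifferentiated construct.

```python
# Sketch of two reliability estimates: Cronbach's alpha (internal
# consistency) and a test-retest correlation. The score matrix is invented.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: persons x items score matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

scores = np.array([[3, 4, 3, 5],
                   [2, 2, 3, 2],
                   [5, 4, 5, 4],
                   [1, 2, 1, 2],
                   [4, 5, 4, 4]])          # 5 persons x 4 items
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")

time1 = scores.sum(axis=1)                   # totals at first administration
time2 = time1 + np.array([1, -1, 0, 1, 0])   # hypothetical retest totals
print(f"test-retest r = {np.corrcoef(time1, time2)[0, 1]:.2f}")
```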

Assessing dispositions is difficult if one uses a multiple-choice format with known adverse consequences of a low score. It is pretty easy to tell what answer to the question “How open-minded are you?” will get the highest score and to give that answer, even if one knows that the answer is incorrect. If an item probes less directly for a critical thinking disposition, for example by asking how often the test taker pays close attention to views with which the test taker disagrees, the answer may differ from reality because of self-deception or simple lack of awareness of one’s personal thinking style, and its interpretation is problematic, even if factor analysis enables one to identify a distinct factor measured by a group of questions that includes this one (Ennis 1996). Nevertheless, Facione, Sánchez, and Facione (1994) used this approach to develop the California Critical Thinking Dispositions Inventory (CCTDI). They began with 225 statements expressive of a disposition towards or away from critical thinking (using the long list of dispositions in Facione 1990a), validated the statements with talk-aloud and conversational strategies in focus groups to determine whether people in the target population understood the items in the way intended, administered a pilot version of the test with 150 items, and eliminated items that failed to discriminate among test takers or were inversely correlated with overall results or added little refinement to overall scores (Facione 2000). They used item analysis and factor analysis to group the measured dispositions into seven broad constructs: open-mindedness, analyticity, cognitive maturity, truth-seeking, systematicity, inquisitiveness, and self-confidence (Facione, Sánchez, and Facione 1994). The resulting test consists of 75 agree-disagree statements and takes 20 minutes to administer. A repeated disturbing finding is that North American students taking the test tend to score low on the truth-seeking sub-scale (on which a low score results from agreeing to such statements as the following: “To get people to agree with me I would give any reason that worked”. “Everyone always argues from their own self-interest, including me”. “If there are four reasons in favor and one against, I’ll go with the four”.) Development of the CCTDI made it possible to test whether good critical thinking abilities and good critical thinking dispositions go together, in which case it might be enough to teach one without the other. Facione (2000) reports that administration of the CCTDI and the California Critical Thinking Skills Test (CCTST) to almost 8,000 post-secondary students in the United States revealed a statistically significant but weak correlation between total scores on the two tests, and also between paired sub-scores from the two tests. The implication is that both abilities and dispositions need to be taught, that one cannot expect improvement in one to bring with it improvement in the other.
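The kind of item screening described above, dropping statements that fail to discriminate or that correlate inversely with overall results, can be illustrated with a corrected item-total correlation. The sketch below is hypothetical and is not the actual CCTDI item-analysis procedure; the response matrix and the 0.2 cut-off are invented for illustration.

```python
# Hypothetical sketch of the kind of item screening described above: flag
# statements whose corrected item-total correlation is negative or near
# zero. This is an illustration, not the actual CCTDI procedure; the
# responses and the 0.2 cut-off are invented.
import numpy as np

responses = np.array([   # persons x statements, scored 1 (disagree) to 6 (agree)
    [6, 5, 6, 5, 2],
    [5, 6, 5, 6, 3],
    [2, 1, 2, 2, 5],
    [3, 2, 3, 3, 2],
    [6, 6, 5, 6, 1],
    [1, 2, 1, 2, 6],
])

for j in range(responses.shape[1]):
    rest = np.delete(responses, j, axis=1).sum(axis=1)   # total score without item j
    r = np.corrcoef(responses[:, j], rest)[0, 1]
    verdict = "keep" if r > 0.2 else "drop"
    print(f"item {j + 1}: corrected item-total r = {r:+.2f} -> {verdict}")
```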

A more direct way of assessing critical thinking dispositions would be to see what people do when put in a situation where the dispositions would reveal themselves. Ennis (1996) reports promising initial work with guided open-ended opportunities to give evidence of dispositions, but no standardized test seems to have emerged from this work. There are however standardized aspect-specific tests of critical thinking dispositions. The Critical Problem Solving Scale (Berman et al. 2001: 518) takes as a measure of the disposition to suspend judgment the number of distinct good aspects attributed to an option judged to be the worst among those generated by the test taker. Stanovich, West and Toplak (2011: 800–810) list tests developed by cognitive psychologists of the following dispositions: resistance to miserly information processing, resistance to myside thinking, absence of irrelevant context effects in decision-making, actively open-minded thinking, valuing reason and truth, tendency to seek information, objective reasoning style, tendency to seek consistency, sense of self-efficacy, prudent discounting of the future, self-control skills, and emotional regulation.

It is easier to measure critical thinking skills or abilities than to measure dispositions. The following currently available standardized tests purport to measure them: the Watson-Glaser Critical Thinking Appraisal (Watson & Glaser 1980a, 1980b, 1994), the Cornell Critical Thinking Tests Level X and Level Z (Ennis & Millman 1971; Ennis, Millman, & Tomko 1985, 2005), the Ennis-Weir Critical Thinking Essay Test (Ennis & Weir 1985), the California Critical Thinking Skills Test (Facione 1990b, 1992), the Halpern Critical Thinking Assessment (Halpern 2016), the Critical Thinking Assessment Test (Center for Assessment & Improvement of Learning 2017), the Collegiate Learning Assessment (Council for Aid to Education 2017), the HEIghten Critical Thinking Assessment (https://territorium.com/heighten/), and a suite of critical thinking assessments for different groups and purposes offered by Insight Assessment (https://www.insightassessment.com/products). The Critical Thinking Assessment Test (CAT) is unique among them in being designed for use by college faculty to help them improve their development of students’ critical thinking skills (Haynes et al. 2015; Haynes & Stein 2021). Also, for some years the United Kingdom body OCR (Oxford Cambridge and RSA Examinations) awarded AS and A Level certificates in critical thinking on the basis of an examination (OCR 2011). Many of these standardized tests have received scholarly evaluations at the hands of, among others, Ennis (1958), McPeck (1981), Norris and Ennis (1989), Fisher and Scriven (1997), Possin (2008, 2013a, 2013b, 2013c, 2014, 2020) and Hatcher and Possin (2021). Their evaluations provide a useful set of criteria that such tests ideally should meet, as does the description by Ennis (1984) of problems in testing for competence in critical thinking: the soundness of multiple-choice items, the clarity and soundness of instructions to test takers, the information and mental processing used in selecting an answer to a multiple-choice item, the role of background beliefs and ideological commitments in selecting an answer to a multiple-choice item, the tenability of a test’s underlying conception of critical thinking and its component abilities, the set of abilities that the test manual claims are covered by the test, the extent to which the test actually covers these abilities, the appropriateness of the weighting given to various abilities in the scoring system, the accuracy and intellectual honesty of the test manual, the interest of the test to the target population of test takers, the scope for guessing, the scope for choosing a keyed answer by being test-wise, precautions against cheating in the administration of the test, clarity and soundness of materials for training essay graders, inter-rater reliability in grading essays, and clarity and soundness of advance guidance to test takers on what is required in an essay. Rear (2019) has challenged the use of standardized tests of critical thinking as a way to measure educational outcomes, on the grounds that they (1) fail to take into account disputes about conceptions of critical thinking, (2) are not completely valid or reliable, and (3) fail to evaluate skills used in real academic tasks. He proposes instead assessments based on discipline-specific content.

There are also aspect-specific standardized tests of critical thinking abilities. Stanovich, West and Toplak (2011: 800–810) list tests of probabilistic reasoning, insights into qualitative decision theory, knowledge of scientific reasoning, knowledge of rules of logical consistency and validity, and economic thinking. They also list instruments that probe for irrational thinking, such as superstitious thinking, belief in the superiority of intuition, over-reliance on folk wisdom and folk psychology, belief in “special” expertise, financial misconceptions, overestimation of one’s introspective powers, dysfunctional beliefs, and a notion of self that encourages egocentric processing. They regard these tests along with the previously mentioned tests of critical thinking dispositions as the building blocks for a comprehensive test of rationality, whose development (they write) may be logistically difficult and would require millions of dollars.

A superb example of assessment of an aspect of critical thinking ability is the Test on Appraising Observations (Norris & King 1983, 1985, 1990a, 1990b), which was designed for classroom administration to senior high school students. The test focuses entirely on the ability to appraise observation statements and in particular on the ability to determine in a specified context which of two statements there is more reason to believe. According to the test manual (Norris & King 1985, 1990b), a person’s score on the multiple-choice version of the test, which is the number of items that are answered correctly, can justifiably be given either a criterion-referenced or a norm-referenced interpretation.

On a criterion-referenced interpretation, those who do well on the test have a firm grasp of the principles for appraising observation statements, and those who do poorly have a weak grasp of them. This interpretation can be justified by the content of the test and the way it was developed, which incorporated a method of controlling for background beliefs articulated and defended by Norris (1985). Norris and King synthesized from judicial practice, psychological research and common-sense psychology 31 principles for appraising observation statements, in the form of empirical generalizations about tendencies, such as the principle that observation statements tend to be more believable than inferences based on them (Norris & King 1984). They constructed items in which exactly one of the 31 principles determined which of two statements was more believable. Using a carefully constructed protocol, they interviewed about 100 students who responded to these items in order to determine the thinking that led them to choose the answers they did (Norris & King 1984). In several iterations of the test, they adjusted items so that selection of the correct answer generally reflected good thinking and selection of an incorrect answer reflected poor thinking. Thus they have good evidence that good performance on the test is due to good thinking about observation statements and that poor performance is due to poor thinking about observation statements. Collectively, the 50 items on the final version of the test require application of 29 of the 31 principles for appraising observation statements, with 13 principles tested by one item, 12 by two items, three by three items, and one by four items. Thus there is comprehensive coverage of the principles for appraising observation statements. Fisher and Scriven (1997: 135–136) judge the items to be well worked and sound, with one exception. The test is clearly written at a grade 6 reading level, meaning that poor performance cannot be attributed to difficulties in reading comprehension by the intended adolescent test takers. The stories that frame the items are realistic, and are engaging enough to stimulate test takers’ interest. Thus the most plausible explanation of a given score on the test is that it reflects roughly the degree to which the test taker can apply principles for appraising observations in real situations. In other words, there is good justification of the proposed interpretation that those who do well on the test have a firm grasp of the principles for appraising observation statements and those who do poorly have a weak grasp of them.

To get norms for performance on the test, Norris and King arranged for seven groups of high school students in different types of communities and with different levels of academic ability to take the test. The test manual includes percentiles, means, and standard deviations for each of these seven groups. These norms allow teachers to compare the performance of their class on the test to that of a similar group of students.
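A norm-referenced interpretation of the kind the manual supports amounts to locating a score within the distribution of a comparison group. The sketch below uses hypothetical norm-group scores, not the figures published in the Norris and King manual.

```python
# Sketch of a norm-referenced interpretation: locate a score within a
# comparison group's distribution. The norm-group scores are hypothetical,
# not the norms published in the Norris & King test manual.
import numpy as np

norm_group = np.array([28, 31, 33, 35, 36, 38, 40, 41, 43, 45])  # hypothetical 50-item totals
student_score = 39

percentile = (norm_group < student_score).mean() * 100
z = (student_score - norm_group.mean()) / norm_group.std(ddof=1)
print(f"score {student_score}: ~{percentile:.0f}th percentile, z = {z:+.2f} in this norm group")
```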

Copyright © 2022 by David Hitchcock <hitchckd@mcmaster.ca>



Performance Assessment of Critical Thinking: Conceptualization, Design, and Implementation

Henry I. Braun*

  • 1 Lynch School of Education and Human Development, Boston College, Chestnut Hill, MA, United States
  • 2 Graduate School of Education, Stanford University, Stanford, CA, United States
  • 3 Department of Business and Economics Education, Johannes Gutenberg University, Mainz, Germany

Enhancing students’ critical thinking (CT) skills is an essential goal of higher education. This article presents a systematic approach to conceptualizing and measuring CT. CT generally comprises the following mental processes: identifying, evaluating, and analyzing a problem; interpreting information; synthesizing evidence; and reporting a conclusion. We further posit that CT also involves dealing with dilemmas involving ambiguity or conflicts among principles and contradictory information. We argue that performance assessment provides the most realistic—and most credible—approach to measuring CT. From this conceptualization and construct definition, we describe one possible framework for building performance assessments of CT with attention to extended performance tasks within the assessment system. The framework is a product of an ongoing, collaborative effort, the International Performance Assessment of Learning (iPAL). The framework comprises four main aspects: (1) The storyline describes a carefully curated version of a complex, real-world situation. (2) The challenge frames the task to be accomplished. (3) A portfolio of documents in a range of formats is drawn from multiple sources chosen to have specific characteristics. (4) The scoring rubric comprises a set of scales each linked to a facet of the construct. We discuss a number of use cases, as well as the challenges that arise with the use and valid interpretation of performance assessments. The final section presents elements of the iPAL research program that involve various refinements and extensions of the assessment framework, a number of empirical studies, along with linkages to current work in online reading and information processing.

Introduction

In their mission statements, most colleges declare that a principal goal is to develop students’ higher-order cognitive skills such as critical thinking (CT) and reasoning (e.g., Shavelson, 2010 ; Hyytinen et al., 2019 ). The importance of CT is echoed by business leaders ( Association of American Colleges and Universities [AACU], 2018 ), as well as by college faculty (for curricular analyses in Germany, see e.g., Zlatkin-Troitschanskaia et al., 2018 ). Indeed, in the 2019 administration of the Faculty Survey of Student Engagement (FSSE), 93% of faculty reported that they “very much” or “quite a bit” structure their courses to support student development with respect to thinking critically and analytically. In a listing of 21st century skills, CT was the most highly ranked among FSSE respondents ( Indiana University, 2019 ). Nevertheless, there is considerable evidence that many college students do not develop these skills to a satisfactory standard ( Arum and Roksa, 2011 ; Shavelson et al., 2019 ; Zlatkin-Troitschanskaia et al., 2019 ). This state of affairs represents a serious challenge to higher education – and to society at large.

In view of the importance of CT, as well as evidence of substantial variation in its development during college, its proper measurement is essential to tracking progress in skill development and to providing useful feedback to both teachers and learners. Feedback can help focus students’ attention on key skill areas in need of improvement, and provide insight to teachers on choices of pedagogical strategies and time allocation. Moreover, comparative studies at the program and institutional level can inform higher education leaders and policy makers.

The conceptualization and definition of CT presented here is closely related to models of information processing and online reasoning, the skills that are the focus of this special issue. These two skills are especially germane to the learning environments that college students experience today when much of their academic work is done online. Ideally, students should be capable of more than naïve Internet search, followed by copy-and-paste (e.g., McGrew et al., 2017 ); rather, for example, they should be able to critically evaluate both sources of evidence and the quality of the evidence itself in light of a given purpose ( Leu et al., 2020 ).

In this paper, we present a systematic approach to conceptualizing CT. From that conceptualization and construct definition, we present one possible framework for building performance assessments of CT with particular attention to extended performance tasks within the test environment. The penultimate section discusses some of the challenges that arise with the use and valid interpretation of performance assessment scores. We conclude the paper with a section on future perspectives in an emerging field of research – the iPAL program.

Conceptual Foundations, Definition and Measurement of Critical Thinking

In this section, we briefly review the concept of CT and its definition. In accordance with the principles of evidence-centered design (ECD; Mislevy et al., 2003 ), the conceptualization drives the measurement of the construct; that is, implementation of ECD directly links aspects of the assessment framework to specific facets of the construct. We then argue that performance assessments designed in accordance with such an assessment framework provide the most realistic—and most credible—approach to measuring CT. The section concludes with a sketch of an approach to CT measurement grounded in performance assessment .

Concept and Definition of Critical Thinking

Taxonomies of 21st century skills ( Pellegrino and Hilton, 2012 ) abound, and it is neither surprising that CT appears in most taxonomies of learning, nor that there are many different approaches to defining and operationalizing the construct of CT. There is, however, general agreement that CT is a multifaceted construct ( Liu et al., 2014 ). Liu et al. (2014) identified five key facets of CT: (i) evaluating evidence and the use of evidence; (ii) analyzing arguments; (iii) understanding implications and consequences; (iv) developing sound arguments; and (v) understanding causation and explanation.

There is empirical support for these facets from college faculty. A 2016–2017 survey conducted by the Higher Education Research Institute (HERI) at the University of California, Los Angeles found that a substantial majority of faculty respondents “frequently” encouraged students to: (i) evaluate the quality or reliability of the information they receive; (ii) recognize biases that affect their thinking; (iii) analyze multiple sources of information before coming to a conclusion; and (iv) support their opinions with a logical argument ( Stolzenberg et al., 2019 ).

There is general agreement that CT involves the following mental processes: identifying, evaluating, and analyzing a problem; interpreting information; synthesizing evidence; and reporting a conclusion (e.g., Erwin and Sebrell, 2003 ; Kosslyn and Nelson, 2017 ; Shavelson et al., 2018 ). We further suggest that CT includes dealing with dilemmas of ambiguity or conflict among principles and contradictory information ( Oser and Biedermann, 2020 ).

Importantly, Oser and Biedermann (2020) posit that CT can be manifested at three levels. The first level, Critical Analysis , is the most complex of the three levels. Critical Analysis requires both knowledge in a specific discipline (conceptual) and procedural analytical (deduction, inclusion, etc.) knowledge. The second level is Critical Reflection , which involves more generic skills “… necessary for every responsible member of a society” (p. 90). It is “a basic attitude that must be taken into consideration if (new) information is questioned to be true or false, reliable or not reliable, moral or immoral etc.” (p. 90). To engage in Critical Reflection, one needs not only to apply analytic reasoning, but also to adopt a reflective stance toward the political, social, and other consequences of choosing a course of action. It also involves analyzing the potential motives of various actors involved in the dilemma of interest. The third level, Critical Alertness , involves questioning one’s own or others’ thinking from a skeptical point of view.

Wheeler and Haertel (1993) categorized higher-order skills, such as CT, into two types: (i) skills applied when solving problems and making decisions in professional and everyday life, for instance, related to civic affairs and the environment; and (ii) skills developed in situations where various mental processes (e.g., comparing, evaluating, and justifying) are cultivated through formal instruction, usually in a discipline. Hence, in both settings, individuals must confront situations that typically involve a problematic event, contradictory information, and possibly conflicting principles. Indeed, there is an ongoing debate concerning whether CT should be evaluated using generic or discipline-based assessments ( Nagel et al., 2020 ). Whether CT skills are conceptualized as generic or discipline-specific has implications for how they are assessed and how they are incorporated into the classroom.

In the iPAL project, CT is characterized as a multifaceted construct that comprises conceptualizing, analyzing, drawing inferences or synthesizing information, evaluating claims, and applying the results of these reasoning processes to various purposes (e.g., solve a problem, decide on a course of action, find an answer to a given question or reach a conclusion) ( Shavelson et al., 2019 ). In the course of carrying out a CT task, an individual typically engages in activities such as specifying or clarifying a problem; deciding what information is relevant to the problem; evaluating the trustworthiness of information; avoiding judgmental errors based on “fast thinking”; avoiding biases and stereotypes; recognizing different perspectives and how they can reframe a situation; considering the consequences of alternative courses of actions; and communicating decisions and actions clearly and concisely. The order in which activities are carried out can vary among individuals and the processes can be non-linear and reciprocal.

In this article, we focus on generic CT skills. The importance of these skills derives not only from their utility in academic and professional settings, but also from the many situations involving challenging moral and ethical issues – often framed in terms of conflicting principles and/or interests – to which individuals have to apply these skills ( Kegan, 1994 ; Tessier-Lavigne, 2020 ). Conflicts and dilemmas are ubiquitous in the contexts in which adults find themselves: work, family, civil society. Moreover, to remain viable in the global economic environment – one characterized by increased competition and advances in second generation artificial intelligence (AI) – today’s college students will need to continually develop and leverage their CT skills. Ideally, colleges offer a supportive environment in which students can develop and practice effective approaches to reasoning about and acting in learning, professional and everyday situations.

Measurement of Critical Thinking

Critical thinking is a multifaceted construct that poses many challenges to those who would develop relevant and valid assessments. For those interested in current approaches to the measurement of CT that are not the focus of this paper, consult Zlatkin-Troitschanskaia et al. (2018) .

In this paper, we have singled out performance assessment as it offers important advantages to measuring CT. Extant tests of CT typically employ response formats such as forced-choice or short-answer, and scenario-based tasks (for an overview, see Liu et al., 2014 ). They all suffer from moderate to severe construct underrepresentation; that is, they fail to capture important facets of the CT construct such as perspective taking and communication. High fidelity performance tasks are viewed as more authentic in that they provide a problem context and require responses that are more similar to what individuals confront in the real world than what is offered by traditional multiple-choice items ( Messick, 1994 ; Braun, 2019 ). This greater verisimilitude promises higher levels of construct representation and lower levels of construct-irrelevant variance. Such performance tasks have the capacity to measure facets of CT that are imperfectly assessed, if at all, using traditional assessments ( Lane and Stone, 2006 ; Braun, 2019 ; Shavelson et al., 2019 ). However, these assertions must be empirically validated, and the measures should be subjected to psychometric analyses. Evidence of the reliability, validity, and interpretative challenges of performance assessment (PA) is extensively detailed in Davey et al. (2015) .

We adopt the following definition of performance assessment:

A performance assessment (sometimes called a work sample when assessing job performance) … is an activity or set of activities that requires test takers, either individually or in groups, to generate products or performances in response to a complex, most often real-world task. These products and performances provide observable evidence bearing on test takers’ knowledge, skills, and abilities—their competencies—in completing the assessment ( Davey et al., 2015 , p. 10).

A performance assessment typically includes an extended performance task and short constructed-response and selected-response (i.e., multiple-choice) tasks (for examples, see Zlatkin-Troitschanskaia and Shavelson, 2019 ). In this paper, we refer to both individual performance- and constructed-response tasks as performance tasks (PT) (For an example, see Table 1 in section “iPAL Assessment Framework”).


Table 1. The iPAL assessment framework.

An Approach to Performance Assessment of Critical Thinking: The iPAL Program

The approach to CT presented here is the result of ongoing work undertaken by the International Performance Assessment of Learning collaborative (iPAL). iPAL is an international consortium of volunteers, primarily from academia, who have come together to address the dearth in higher education of research and practice in measuring CT with performance tasks ( Shavelson et al., 2018 ). In this section, we present iPAL’s assessment framework as the basis of measuring CT, with examples along the way.

iPAL Background

The iPAL assessment framework builds on the Council for Aid to Education’s Collegiate Learning Assessment (CLA). The CLA was designed to measure cross-disciplinary, generic competencies, such as CT, analytic reasoning, problem solving, and written communication ( Klein et al., 2007 ; Shavelson, 2010 ). Ideally, each PA contained an extended PT (e.g., examining a range of evidential materials related to the crash of an aircraft) and two short PT’s: one in which students critique an argument and another in which they propose a solution in response to a real-world societal issue.

Motivated by considerations of adequate reliability, the CLA was modified in 2012 to create the CLA+. The CLA+ includes two subtests: a PT and a 25-item Selected Response Question (SRQ) section. The PT presents a document or problem statement and an assignment based on that document which elicits an open-ended response. The CLA+ added the SRQ section (which is not linked substantively to the PT scenario) to increase the number of student responses and thereby obtain more reliable estimates of performance at the student level than could be achieved with a single PT ( Zahner, 2013 ; Davey et al., 2015 ).
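The rationale for adding the SRQ section can be illustrated with the classical Spearman-Brown prophecy formula, which projects how score reliability changes as a test is lengthened. The starting reliability below is an assumed value for illustration, not a figure reported for the CLA or CLA+.

```python
# Illustration of why adding items raises score reliability: the classical
# Spearman-Brown prophecy formula. The starting reliability is an assumed
# value, not a figure reported for the CLA or CLA+.
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

single_task_reliability = 0.40        # assumed reliability of a single performance task
for k in (1, 2, 5, 10):
    print(f"{k}x test length -> projected reliability {spearman_brown(single_task_reliability, k):.2f}")
```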

iPAL Assessment Framework

Methodological foundations.

The iPAL framework evolved from the Collegiate Learning Assessment developed by Klein et al. (2007) . It was also informed by the results from the AHELO pilot study ( Organisation for Economic Co-operation and Development [OECD], 2012 , 2013 ), as well as the KoKoHs research program in Germany (for an overview see, Zlatkin-Troitschanskaia et al., 2017 , 2020 ). The ongoing refinement of the iPAL framework has been guided in part by the principles of Evidence Centered Design (ECD) ( Mislevy et al., 2003 ; Mislevy and Haertel, 2006 ; Haertel and Fujii, 2017 ).

In educational measurement, an assessment framework plays a critical intermediary role between the theoretical formulation of the construct and the development of the assessment instrument containing tasks (or items) intended to elicit evidence with respect to that construct ( Mislevy et al., 2003 ). Builders of the assessment framework draw on the construct theory and operationalize it in a way that provides explicit guidance to PT’s developers. Thus, the framework should reflect the relevant facets of the construct, where relevance is determined by substantive theory or an appropriate alternative such as behavioral samples from real-world situations of interest (criterion-sampling; McClelland, 1973 ), as well as the intended use(s) (for an example, see Shavelson et al., 2019 ). By following the requirements and guidelines embodied in the framework, instrument developers strengthen the claim of construct validity for the instrument ( Messick, 1994 ).

An assessment framework can be specified at different levels of granularity: an assessment battery (“omnibus” assessment, for an example see below), a single performance task, or a specific component of an assessment ( Shavelson, 2010 ; Davey et al., 2015 ). In the iPAL program, a performance assessment comprises one or more extended performance tasks and additional selected-response and short constructed-response items. The focus of the framework specified below is on a single PT intended to elicit evidence with respect to some facets of CT, such as the evaluation of the trustworthiness of the documents provided and the capacity to address conflicts of principles.

From the ECD perspective, an assessment is an instrument for generating information to support an evidentiary argument and, therefore, the intended inferences (claims) must guide each stage of the design process. The construct of interest is operationalized through the Student Model , which represents the target knowledge, skills, and abilities, as well as the relationships among them. The student model should also make explicit the assumptions regarding student competencies in foundational skills or content knowledge. The Task Model specifies the features of the problems or items posed to the respondent, with the goal of eliciting the evidence desired. The assessment framework also describes the collection of task models comprising the instrument, with considerations of construct validity, various psychometric characteristics (e.g., reliability) and practical constraints (e.g., testing time and cost). The student model provides grounds for evidence of validity, especially cognitive validity; namely, that the students are thinking critically in responding to the task(s).

In the present context, the target construct (CT) is the competence of individuals to think critically, which entails solving complex, real-world problems, and clearly communicating their conclusions or recommendations for action based on trustworthy, relevant and unbiased information. The situations, drawn from actual events, are challenging and may arise in many possible settings. In contrast to more reductionist approaches to assessment development, the iPAL approach and framework rests on the assumption that properly addressing these situational demands requires the application of a constellation of CT skills appropriate to the particular task presented (e.g., Shavelson, 2010 , 2013 ). For a PT, the assessment framework must also specify the rubric by which the responses will be evaluated. The rubric must be properly linked to the target construct so that the resulting score profile constitutes evidence that is both relevant and interpretable in terms of the student model (for an example, see Zlatkin-Troitschanskaia et al., 2019 ).

iPAL Task Framework

The iPAL ‘omnibus’ framework comprises four main aspects: A storyline , a challenge , a document library , and a scoring rubric . Table 1 displays these aspects, brief descriptions of each, and the corresponding examples drawn from an iPAL performance assessment (Version adapted from original in Hyytinen and Toom, 2019 ). Storylines are drawn from various domains; for example, the worlds of business, public policy, civics, medicine, and family. They often involve moral and/or ethical considerations. Deriving an appropriate storyline from a real-world situation requires careful consideration of which features are to be kept in toto , which adapted for purposes of the assessment, and which to be discarded. Framing the challenge demands care in wording so that there is minimal ambiguity in what is required of the respondent. The difficulty of the challenge depends, in large part, on the nature and extent of the information provided in the document library , the amount of scaffolding included, as well as the scope of the required response. The amount of information and the scope of the challenge should be commensurate with the amount of time available. As is evident from the table, the characteristics of the documents in the library are intended to elicit responses related to facets of CT. For example, with regard to bias, the information provided is intended to play to judgmental errors due to fast thinking and/or motivational reasoning. Ideally, the situation should accommodate multiple solutions of varying degrees of merit.
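Read as a specification, the four aspects of the omnibus framework map naturally onto a simple data structure. The sketch below is only an illustration of that mapping; the class and field names are ours and are not drawn from any iPAL software.

```python
# Sketch of the four aspects of the iPAL omnibus framework as a data
# structure. The class and field names are ours, for illustration only;
# they are not taken from any iPAL software.
from dataclasses import dataclass
from typing import List

@dataclass
class Document:
    source: str            # e.g., news article, blog post, official report
    doc_format: str        # text, table, chart, video, ...
    characteristics: str   # intended properties, e.g., biased, trustworthy, irrelevant

@dataclass
class RubricDimension:
    facet: str             # facet of CT the scale is linked to
    anchors: List[str]     # behavioral anchors for the scale points

@dataclass
class PerformanceTask:
    storyline: str                         # curated version of a complex, real-world situation
    challenge: str                         # the task the respondent must accomplish
    document_library: List[Document]       # portfolio of sources with chosen characteristics
    scoring_rubric: List[RubricDimension]  # one scale per facet of the construct
```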

The dimensions of the scoring rubric are derived from the Task Model and Student Model ( Mislevy et al., 2003 ) and signal which features are to be extracted from the response and indicate how they are to be evaluated. There should be a direct link between the evaluation of the evidence and the claims that are made with respect to the key features of the task model and student model . More specifically, the task model specifies the various manipulations embodied in the PA and so informs scoring, while the student model specifies the capacities students employ in more or less effectively responding to the tasks. The score scales for each of the five facets of CT (see section “Concept and Definition of Critical Thinking”) can be specified using appropriate behavioral anchors (for examples, see Zlatkin-Troitschanskaia and Shavelson, 2019 ). Of particular importance is the evaluation of the response with respect to the last dimension of the scoring rubric; namely, the overall coherence and persuasiveness of the argument, building on the explicit or implicit characteristics related to the first five dimensions. The scoring process must be monitored carefully to ensure that (trained) raters are judging each response based on the same types of features and evaluation criteria ( Braun, 2019 ) as indicated by interrater agreement coefficients.

The scoring rubric of the iPAL omnibus framework can be modified for specific tasks ( Lane and Stone, 2006 ). This generic rubric helps ensure consistency across rubrics for different storylines. For example, Zlatkin-Troitschanskaia et al. (2019 , p. 473) used the following scoring scheme:

Based on our construct definition of CT and its four dimensions: (D1-Info) recognizing and evaluating information, (D2-Decision) recognizing and evaluating arguments and making decisions, (D3-Conseq) recognizing and evaluating the consequences of decisions, and (D4-Writing), we developed a corresponding analytic dimensional scoring … The students’ performance is evaluated along the four dimensions, which in turn are subdivided into a total of 23 indicators as (sub)categories of CT … For each dimension, we sought detailed evidence in students’ responses for the indicators and scored them on a six-point Likert-type scale. In order to reduce judgment distortions, an elaborate procedure of ‘behaviorally anchored rating scales’ (Smith and Kendall, 1963) was applied by assigning concrete behavioral expectations to certain scale points (Bernardin et al., 1976). To this end, we defined the scale levels by short descriptions of typical behavior and anchored them with concrete examples. … We trained four raters in 1 day using a specially developed training course to evaluate students’ performance along the 23 indicators clustered into four dimensions (for a description of the rater training, see Klotzer, 2018).

Shavelson et al. (2019) examined the interrater agreement of the scoring scheme developed by Zlatkin-Troitschanskaia et al. (2019) and “found that with 23 items and 2 raters the generalizability (“reliability”) coefficient for total scores to be 0.74 (with 4 raters, 0.84)” ( Shavelson et al., 2019 , p. 15). In the study by Zlatkin-Troitschanskaia et al. (2019 , p. 478) three score profiles were identified (low-, middle-, and high-performer) for students. Proper interpretation of such profiles requires care. For example, there may be multiple possible explanations for low scores such as poor CT skills, a lack of a disposition to engage with the challenge, or the two attributes jointly. These alternative explanations for student performance can potentially pose a threat to the evidentiary argument. In this case, auxiliary information may be available to aid in resolving the ambiguity. For example, student responses to selected- and short-constructed-response items in the PA can provide relevant information about the levels of the different skills possessed by the student. When sufficient data are available, the scores can be modeled statistically and/or qualitatively in such a way as to bring them to bear on the technical quality or interpretability of the claims of the assessment: reliability, validity, and utility evidence ( Davey et al., 2015 ; Zlatkin-Troitschanskaia et al., 2019 ). These kinds of concerns are less critical when PT’s are used in classroom settings. The instructor can draw on other sources of evidence, including direct discussion with the student.
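The dependence of the generalizability coefficient on the number of raters can be sketched with the standard persons-by-raters formula, using assumed variance components chosen so that the resulting values land near the 2-rater and 4-rater coefficients quoted above. This is a simplified illustration, not the G study reported by Shavelson et al. (2019).

```python
# Simplified illustration of how a generalizability coefficient for a
# persons-by-raters design depends on the number of raters. The variance
# components are assumed values chosen to land near the 2- and 4-rater
# coefficients quoted above; this is not the G study itself.
person_variance = 0.587                 # assumed person (true-score) variance component
rater_error_variance = 0.413            # assumed residual variance per single rating

for n_raters in (1, 2, 4):
    g_coefficient = person_variance / (person_variance + rater_error_variance / n_raters)
    print(f"{n_raters} rater(s): G coefficient ~ {g_coefficient:.2f}")
```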

Use of iPAL Performance Assessments in Educational Practice: Evidence From Preliminary Validation Studies

The assessment framework described here supports the development of a PT in a general setting. Many modifications are possible and, indeed, desirable. If the PT is to be more deeply embedded in a certain discipline (e.g., economics, law, or medicine), for example, then the framework must specify characteristics of the narrative and the complementary documents as to the breadth and depth of disciplinary knowledge that is represented.

At present, preliminary field trials employing the omnibus framework (i.e., a full set of documents) indicated that 60 min was generally an inadequate amount of time for students to engage with the full set of complementary documents and to craft a complete response to the challenge (for an example, see Shavelson et al., 2019 ). Accordingly, it would be helpful to develop modified frameworks for PT’s that require substantially less time. For an example, see a short performance assessment of civic online reasoning, requiring response times from 10 to 50 min ( Wineburg et al., 2016 ). Such assessment frameworks could be derived from the omnibus framework by focusing on a reduced number of facets of CT, and specifying the characteristics of the complementary documents to be included – or, perhaps, choices among sets of documents. In principle, one could build a ‘family’ of PT’s, each using the same (or nearly the same) storyline and a subset of the full collection of complementary documents.

Paul and Elder (2007) argue that the goal of CT assessments should be to provide faculty with important information about how well their instruction supports the development of students’ CT. In that spirit, the full family of PT’s could represent all facets of the construct while affording instructors and students more specific insights on strengths and weaknesses with respect to particular facets of CT. Moreover, the framework should be expanded to include the design of a set of short answer and/or multiple choice items to accompany the PT. Ideally, these additional items would be based on the same narrative as the PT to collect more nuanced information on students’ precursor skills such as reading comprehension, while enhancing the overall reliability of the assessment. Areas where students are under-prepared could be addressed before, or even in parallel with the development of the focal CT skills. The parallel approach follows the co-requisite model of developmental education. In other settings (e.g., for summative assessment), these complementary items would be administered after the PT to augment the evidence in relation to the various claims. The full PT taking 90 min or more could serve as a capstone assessment.

As we transition from simply delivering paper-based assessments by computer to taking full advantage of the affordances of a digital platform, we should learn from the hard-won lessons of the past so that we can make swifter progress with fewer missteps. In that regard, we must take validity as the touchstone – assessment design, development and deployment must all be tightly linked to the operational definition of the CT construct. Considerations of reliability and practicality come into play with various use cases that highlight different purposes for the assessment (for future perspectives, see next section).

The iPAL assessment framework represents a feasible compromise between commercial, standardized assessments of CT (e.g., Liu et al., 2014 ), on the one hand, and, on the other, freedom for individual faculty to develop assessment tasks according to idiosyncratic models. It imposes a degree of standardization on both task development and scoring, while still allowing some flexibility for faculty to tailor the assessment to meet their unique needs. In so doing, it addresses a key weakness of the AAC&U’s VALUE initiative that has achieved wide acceptance among United States colleges.

The VALUE initiative has produced generic scoring rubrics for 15 domains including CT, problem-solving and written communication. A rubric for a particular skill domain (e.g., critical thinking) has five to six dimensions with four ordered performance levels for each dimension (1 = lowest, 4 = highest). The performance levels are accompanied by language that is intended to clearly differentiate among levels. Faculty are asked to submit student work products from a senior level course that is intended to yield evidence with respect to student learning outcomes in a particular domain and that, they believe, can elicit performances at the highest level. The collection of work products is then graded by faculty from other institutions who have been trained to apply the rubrics.

A principal difficulty is that there is neither a common framework to guide the design of the challenge, nor any control on task complexity and difficulty. Consequently, there is substantial heterogeneity in the quality and evidential value of the submitted responses. This also causes difficulties with task scoring and inter-rater reliability. Shavelson et al. (2009) discuss some of the problems arising with non-standardized collections of student work.

In this context, one advantage of the iPAL framework is that it can provide valuable guidance and an explicit structure for faculty in developing performance tasks for both instruction and formative assessment. When faculty design assessments, their focus is typically on content coverage rather than other potentially important characteristics, such as the degree of construct representation and the adequacy of their scoring procedures ( Braun, 2019 ).

Concluding Reflections

Challenges to interpretation and implementation.

Performance tasks such as those generated by iPAL are attractive instruments for assessing CT skills (e.g., Shavelson, 2010 ; Shavelson et al., 2019 ). The attraction mainly rests on the assumption that elaborated PT’s are more authentic (direct) and more completely capture facets of the target construct (i.e., possess greater construct representation) than the widely used selected-response tests. However, as Messick (1994) noted, authenticity is a “promissory note” that must be redeemed with empirical research. In practice, there are trade-offs among authenticity, construct validity, and psychometric quality such as reliability ( Davey et al., 2015 ).

One reason for Messick’s (1994) caution is that authenticity does not guarantee construct validity. The latter must be established by drawing on multiple sources of evidence ( American Educational Research Association et al., 2014 ). Following the ECD principles in designing and developing the PT, as well as the associated scoring rubrics, constitutes an important type of evidence. Further, as Leighton (2019) argues, response process data (“cognitive validity”) are needed to validate claims regarding the cognitive complexity of PT’s. Relevant data can be obtained through cognitive laboratory studies involving methods such as think aloud protocols or eye-tracking. Although time-consuming and expensive, such studies can yield not only evidence of validity, but also valuable information to guide refinements of the PT.

Going forward, iPAL PT’s must be subjected to validation studies as recommended in the Standards for Psychological and Educational Testing by American Educational Research Association et al. (2014) . With a particular focus on the criterion “relationships to other variables,” a framework should include assumptions about the theoretically expected relationships among the indicators assessed by the PT, as well as the indicators’ relationships to external variables such as intelligence or prior (task-relevant) knowledge.

Complementing the necessity of evaluating construct validity, there is the need to consider potential sources of construct-irrelevant variance (CIV). One pertains to student motivation, which is typically greater when the stakes are higher. If students are not motivated, then their performance is likely to be impacted by factors unrelated to their (construct-relevant) ability ( Lane and Stone, 2006 ; Braun et al., 2011 ; Shavelson, 2013 ). Differential motivation across groups can also bias comparisons. Student motivation might be enhanced if the PT is administered in the context of a course with the promise of generating useful feedback on students’ skill profiles.

Construct-irrelevant variance can also occur when students are not equally prepared for the format of the PT or fully appreciate the response requirements. This source of CIV could be alleviated by providing students with practice PT’s. Finally, the use of novel forms of documentation, such as those from the Internet, can potentially introduce CIV due to differential familiarity with forms of representation or contents. Interestingly, this suggests that there may be a conflict between enhancing construct representation and reducing CIV.

Another potential source of CIV is related to response evaluation. Even with training, human raters can vary in accuracy and usage of the full score range. In addition, raters may attend to features of responses that are unrelated to the target construct, such as the length of the students’ responses or the frequency of grammatical errors ( Lane and Stone, 2006 ). Some of these sources of variance could be addressed in an online environment, where word processing software could alert students to potential grammatical and spelling errors before they submit their final work product.

Performance tasks generally take longer to administer and are more costly than traditional assessments, making it more difficult to reliably measure student performance ( Messick, 1994 ; Davey et al., 2015 ). Indeed, it is well known that more than one performance task is needed to obtain high reliability ( Shavelson, 2013 ). This is due to both student-task interactions and variability in scoring. Sources of student-task interactions are differential familiarity with the topic ( Hyytinen and Toom, 2019 ) and differential motivation to engage with the task. The level of reliability required, however, depends on the context of use. For use in formative assessment as part of an instructional program, reliability can be lower than use for summative purposes. In the former case, other types of evidence are generally available to support interpretation and guide pedagogical decisions. Further studies are needed to obtain estimates of reliability in typical instructional settings.

With sufficient data, more sophisticated psychometric analyses become possible. One challenge is that the assumption of unidimensionality required for many psychometric models might be untenable for performance tasks ( Davey et al., 2015 ). Davey et al. (2015) provide the example of a mathematics assessment that requires students to demonstrate not only their mathematics skills but also their written communication skills. Although the iPAL framework does not explicitly address students’ reading comprehension and organization skills, students will likely need to call on these abilities to accomplish the task. Moreover, as the operational definition of CT makes evident, the student must not only deploy several skills in responding to the challenge of the PT, but also carry out component tasks in sequence. The former requirement strongly indicates the need for a multi-dimensional IRT model, while the latter suggests that the usual assumption of local item independence may well be problematic ( Lane and Stone, 2006 ). At the same time, the analytic scoring rubric should facilitate the use of latent class analysis to partition data from large groups into meaningful categories ( Zlatkin-Troitschanskaia et al., 2019 ).
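For concreteness, the kind of multidimensional model indicated here can be sketched as a compensatory two-dimensional 2PL item response function, in which the probability of an adequate response draws on more than one latent skill at once. The parameter values below are illustrative only.

```python
# Sketch of a compensatory multidimensional two-parameter logistic (M2PL)
# item response function, the kind of model suggested when a task draws on
# several skills at once. All parameter values are illustrative.
import numpy as np

def m2pl_probability(theta: np.ndarray, a: np.ndarray, d: float) -> float:
    """P(adequate response) for ability vector theta, discriminations a, intercept d."""
    return 1.0 / (1.0 + np.exp(-(a @ theta + d)))

theta = np.array([0.5, -0.3])   # e.g., critical thinking and written communication
a = np.array([1.2, 0.8])        # the item loads on both dimensions
d = -0.2                        # item intercept (easiness)
print(f"P(adequate response) = {m2pl_probability(theta, a, d):.2f}")
```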

Future Perspectives

Although the iPAL consortium has made substantial progress in the assessment of CT, much remains to be done. Further refinement of existing PT’s and their adaptation to different languages and cultures must continue. To this point, there are a number of examples: The refugee crisis PT (cited in Table 1 ) was translated and adapted from Finnish to US English and then to Colombian Spanish. A PT concerning kidney transplants was translated and adapted from German to US English. Finally, two PT’s based on ‘legacy admissions’ to US colleges were translated and adapted to Colombian Spanish.

With respect to data collection, there is a need for sufficient data to support psychometric analysis of student responses, especially the relationships among the different components of the scoring rubric, as this would inform both task development and response evaluation ( Zlatkin-Troitschanskaia et al., 2019 ). In addition, more intensive study of response processes through cognitive laboratories and the like are needed to strengthen the evidential argument for construct validity ( Leighton, 2019 ). We are currently conducting empirical studies, collecting data on both iPAL PT’s and other measures of CT. These studies will provide evidence of convergent and discriminant validity.

At the same time, efforts should be directed at further development to support different ways CT PT’s might be used—i.e., use cases—especially those that call for formative use of PT’s. Incorporating formative assessment into courses can plausibly be expected to improve students’ competency acquisition ( Zlatkin-Troitschanskaia et al., 2017 ). With suitable choices of storylines, appropriate combinations of (modified) PT’s, supplemented by short-answer and multiple-choice items, could be interwoven into ordinary classroom activities. The supplementary items may be completely separate from the PT’s (as is the case with the CLA+), loosely coupled with the PT’s (as in drawing on the same storyline), or tightly linked to the PT’s (as in requiring elaboration of certain components of the response to the PT).

As an alternative to such integration, stand-alone modules could be embedded in courses to yield evidence of students’ generic CT skills. Core curriculum courses or general education courses offer ideal settings for embedding performance assessments. If these assessments were administered to a representative sample of students in each cohort over their years in college, the results would yield important information on the development of CT skills at a population level. For another example, these PA’s could be used to assess the competence profiles of students entering Bachelor’s or graduate-level programs as a basis for more targeted instructional support.

Thus, in considering different use cases for the assessment of CT, it is evident that several modifications of the iPAL omnibus assessment framework are needed. As noted earlier, assessments built according to this framework are demanding with respect to the extensive preliminary work required by a task and the time required to properly complete it. Thus, it would be helpful to have modified versions of the framework, focusing on one or two facets of the CT construct and calling for a smaller number of supplementary documents. The challenge to the student should be suitably reduced.

Some members of the iPAL collaborative have developed PT’s that are embedded in disciplines such as engineering, law and education (Crump et al., 2019; for teacher education examples, see Jeschke et al., 2019). These are proving to be of great interest to various stakeholders and further development is likely. Consequently, it is essential that an appropriate assessment framework be established and implemented. It is both a conceptual and an empirical question as to whether a single framework can guide development in different domains.

Performance Assessment in Online Learning Environment

Over the last 15 years, increasing amounts of time in both college and work have come to be spent using computers and other electronic devices. This has led to the formulation of models of the new literacies that attempt to capture some key characteristics of these activities. A prominent example is the model proposed by Leu et al. (2020). The model frames online reading as a process of problem-based inquiry involving five practices that occur during online research and comprehension:

1. Reading to identify important questions,

2. Reading to locate information,

3. Reading to critically evaluate information,

4. Reading to synthesize online information, and

5. Reading and writing to communicate online information.

The parallels with the iPAL definition of CT are evident and suggest there may be benefits to closer links between these two lines of research. For example, a report by Leu et al. (2014) describes empirical studies comparing assessments of online reading using either open-ended or multiple-choice response formats.

The iPAL consortium has begun to take advantage of the affordances of the online environment (for examples, see Schmidt et al. and Nagel et al. in this special issue). Most obviously, Supplementary Materials can now include archival photographs, audio recordings, or videos. Additional tasks might include the online search for relevant documents, though this would add considerably to the time demands. This online search could occur within a simulated Internet environment, as is the case for the IEA’s ePIRLS assessment (Mullis et al., 2017).

The prospect of having access to a wealth of materials that can add to task authenticity is exciting. Yet it can also add ambiguity and information overload. Increased authenticity, then, should be weighed against validity concerns and the time required to absorb the content in these materials. Modifications of the design framework and extensive empirical testing will be required to decide on appropriate trade-offs. A related possibility is to employ some of these materials in short-answer (or even selected-response) items that supplement the main PT. Response formats could include highlighting text or using a drag-and-drop menu to construct a response. Students’ responses could be automatically scored, thereby containing costs. With automated scoring, feedback to students and faculty, including suggestions for next steps in strengthening CT skills, could also be provided without adding to faculty workload. Therefore, taking advantage of the online environment to incorporate new types of supplementary documents, and perhaps to introduce new response formats as well, should be a high priority. Finally, further investigation of the overlap between this formulation of CT and the characterization of online reading promulgated by Leu et al. (2020) is a promising direction to pursue.
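As an illustration of how automated scoring of such response formats might work, the sketch below (Python) scores a hypothetical text-highlighting item by the overlap between the character spans a student highlights and the spans keyed as relevant evidence. The keyed spans and the partial-credit rule are invented; an operational scoring engine would have to be built and validated against the actual rubric.

```python
def overlap(span_a, span_b):
    """Number of overlapping characters between two (start, end) spans."""
    return max(0, min(span_a[1], span_b[1]) - max(span_a[0], span_b[0]))

def score_highlight_item(student_spans, keyed_spans, full_credit=2):
    """Partial-credit scoring of a highlighting item (illustrative rule only).

    Credit is proportional to how much of the keyed evidence the student
    highlighted; highlighting irrelevant text is not penalized here, though
    an operational rubric might subtract points for it.
    """
    keyed_total = sum(end - start for start, end in keyed_spans)
    covered = sum(
        min(end - start, sum(overlap((start, end), s) for s in student_spans))
        for start, end in keyed_spans
    )
    return round(full_credit * covered / keyed_total, 1)

# Hypothetical example: the key marks characters 100-180 and 400-450 as relevant evidence.
keyed = [(100, 180), (400, 450)]
student = [(95, 160), (410, 470)]
print(score_highlight_item(student, keyed))   # partial credit
```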

Data Availability Statement

All datasets generated for this study are included in the article/supplementary material.

Author Contributions

HB wrote the article. RS, OZ-T, and KB were involved in the preparation and revision of the article and co-wrote the manuscript. All authors contributed to the article and approved the submitted version.

Funding

This study was funded in part by the Spencer Foundation (Grant No. #201700123).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We would like to thank all the researchers who have participated in the iPAL program.

  • ^ https://www.ipal-rd.com/
  • ^ https://www.aacu.org/value
  • ^ When test results are reported by means of substantively defined categories, the scoring is termed “criterion-referenced”. This is in contrast to results reported as percentiles; such scoring is termed “norm-referenced”.

American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (2014). Standards for Educational and Psychological Testing. Washington, D.C: American Educational Research Association.


Arum, R., and Roksa, J. (2011). Academically Adrift: Limited Learning on College Campuses. Chicago, IL: University of Chicago Press.

Association of American Colleges and Universities (n.d.). VALUE: What is value? Available online at: https://www.aacu.org/value (accessed May 7, 2020).

Association of American Colleges and Universities [AACU] (2018). Fulfilling the American Dream: Liberal Education and the Future of Work. Available online at: https://www.aacu.org/research/2018-future-of-work (accessed May 1, 2020).

Braun, H. (2019). Performance assessment and standardization in higher education: a problematic conjunction? Br. J. Educ. Psychol. 89, 429–440. doi: 10.1111/bjep.12274


Braun, H. I., Kirsch, I., and Yamoto, K. (2011). An experimental study of the effects of monetary incentives on performance on the 12th grade NAEP reading assessment. Teach. Coll. Rec. 113, 2309–2344.

Crump, N., Sepulveda, C., Fajardo, A., and Aguilera, A. (2019). Systematization of performance tests in critical thinking: an interdisciplinary construction experience. Rev. Estud. Educ. 2, 17–47.

Davey, T., Ferrara, S., Shavelson, R., Holland, P., Webb, N., and Wise, L. (2015). Psychometric Considerations for the Next Generation of Performance Assessment. Washington, DC: Center for K-12 Assessment & Performance Management, Educational Testing Service.

Erwin, T. D., and Sebrell, K. W. (2003). Assessment of critical thinking: ETS’s tasks in critical thinking. J. Gen. Educ. 52, 50–70. doi: 10.1353/jge.2003.0019


Haertel, G. D., and Fujii, R. (2017). “Evidence-centered design and postsecondary assessment,” in Handbook on Measurement, Assessment, and Evaluation in Higher Education , 2nd Edn, eds C. Secolsky and D. B. Denison (Abingdon: Routledge), 313–339. doi: 10.4324/9781315709307-26

Hyytinen, H., and Toom, A. (2019). Developing a performance assessment task in the Finnish higher education context: conceptual and empirical insights. Br. J. Educ. Psychol. 89, 551–563. doi: 10.1111/bjep.12283

Hyytinen, H., Toom, A., and Shavelson, R. J. (2019). “Enhancing scientific thinking through the development of critical thinking in higher education,” in Redefining Scientific Thinking for Higher Education: Higher-Order Thinking, Evidence-Based Reasoning and Research Skills , eds M. Murtonen and K. Balloo (London: Palgrave MacMillan).

Indiana University (2019). FSSE 2019 Frequencies: FSSE 2019 Aggregate. Available online at: http://fsse.indiana.edu/pdf/FSSE_IR_2019/summary_tables/FSSE19_Frequencies_(FSSE_2019).pdf (accessed May 1, 2020).

Jeschke, C., Kuhn, C., Lindmeier, A., Zlatkin-Troitschanskaia, O., Saas, H., and Heinze, A. (2019). Performance assessment to investigate the domain specificity of instructional skills among pre-service and in-service teachers of mathematics and economics. Br. J. Educ. Psychol. 89, 538–550. doi: 10.1111/bjep.12277

Kegan, R. (1994). In Over Our Heads: The Mental Demands of Modern Life. Cambridge, MA: Harvard University Press.

Klein, S., Benjamin, R., Shavelson, R., and Bolus, R. (2007). The collegiate learning assessment: facts and fantasies. Eval. Rev. 31, 415–439. doi: 10.1177/0193841x07303318

Kosslyn, S. M., and Nelson, B. (2017). Building the Intentional University: Minerva and the Future of Higher Education. Cambridge, MA: The MIT Press.

Lane, S., and Stone, C. A. (2006). “Performance assessment,” in Educational Measurement, 4th Edn, ed. R. L. Brennan (Lanham, MD: Rowman & Littlefield Publishers), 387–432.

Leighton, J. P. (2019). The risk–return trade-off: performance assessments and cognitive validation of inferences. Br. J. Educ. Psychol. 89, 441–455. doi: 10.1111/bjep.12271

Leu, D. J., Kiili, C., Forzani, E., Zawilinski, L., McVerry, J. G., and O’Byrne, W. I. (2020). “The new literacies of online research and comprehension,” in The Concise Encyclopedia of Applied Linguistics , ed. C. A. Chapelle (Oxford: Wiley-Blackwell), 844–852.

Leu, D. J., Kulikowich, J. M., Kennedy, C., and Maykel, C. (2014). “The ORCA Project: designing technology-based assessments for online research,” in Paper Presented at the American Educational Research Annual Meeting , Philadelphia, PA.

Liu, O. L., Frankel, L., and Roohr, K. C. (2014). Assessing critical thinking in higher education: current state and directions for next-generation assessments. ETS Res. Rep. Ser. 1, 1–23. doi: 10.1002/ets2.12009

McClelland, D. C. (1973). Testing for competence rather than for “intelligence.”. Am. Psychol. 28, 1–14. doi: 10.1037/h0034092

McGrew, S., Ortega, T., Breakstone, J., and Wineburg, S. (2017). The challenge that’s bigger than fake news: civic reasoning in a social media environment. Am. Educ. 4, 4-9, 39.

Mejía, A., Mariño, J. P., and Molina, A. (2019). Incorporating perspective analysis into critical thinking performance assessments. Br. J. Educ. Psychol. 89, 456–467. doi: 10.1111/bjep.12297

Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educ. Res. 23, 13–23. doi: 10.3102/0013189x023002013

Mislevy, R. J., Almond, R. G., and Lukas, J. F. (2003). A brief introduction to evidence-centered design. ETS Res. Rep. Ser. 2003, i–29. doi: 10.1002/j.2333-8504.2003.tb01908.x

Mislevy, R. J., and Haertel, G. D. (2006). Implications of evidence-centered design for educational testing. Educ. Meas. Issues Pract. 25, 6–20. doi: 10.1111/j.1745-3992.2006.00075.x

Mullis, I. V. S., Martin, M. O., Foy, P., and Hooper, M. (2017). ePIRLS 2016 International Results in Online Informational Reading. Available online at: http://timssandpirls.bc.edu/pirls2016/international-results/ (accessed May 1, 2020).

Nagel, M.-T., Zlatkin-Troitschanskaia, O., Schmidt, S., and Beck, K. (2020). “Performance assessment of generic and domain-specific skills in higher education economics,” in Student Learning in German Higher Education , eds O. Zlatkin-Troitschanskaia, H. A. Pant, M. Toepper, and C. Lautenbach (Berlin: Springer), 281–299. doi: 10.1007/978-3-658-27886-1_14

Organisation for Economic Co-operation and Development [OECD] (2012). AHELO: Feasibility Study Report, Vol. 1: Design and Implementation. Paris: OECD.

Organisation for Economic Co-operation and Development [OECD] (2013). AHELO: Feasibility Study Report, Vol. 2: Data Analysis and National Experiences. Paris: OECD.

Oser, F. K., and Biedermann, H. (2020). “A three-level model for critical thinking: critical alertness, critical reflection, and critical analysis,” in Frontiers and Advances in Positive Learning in the Age of Information (PLATO) , ed. O. Zlatkin-Troitschanskaia (Cham: Springer), 89–106. doi: 10.1007/978-3-030-26578-6_7

Paul, R., and Elder, L. (2007). Consequential validity: using assessment to drive instruction. Found. Crit. Think. 29, 31–40.

Pellegrino, J. W., and Hilton, M. L. (eds) (2012). Education for life and work: Developing Transferable Knowledge and Skills in the 21st Century. Washington DC: National Academies Press.

Shavelson, R. (2010). Measuring College Learning Responsibly: Accountability in a New Era. Redwood City, CA: Stanford University Press.

Shavelson, R. J. (2013). On an approach to testing and modeling competence. Educ. Psychol. 48, 73–86. doi: 10.1080/00461520.2013.779483

Shavelson, R. J., Zlatkin-Troitschanskaia, O., Beck, K., Schmidt, S., and Marino, J. P. (2019). Assessment of university students’ critical thinking: next generation performance assessment. Int. J. Test. 19, 337–362. doi: 10.1080/15305058.2018.1543309

Shavelson, R. J., Zlatkin-Troitschanskaia, O., and Marino, J. P. (2018). “International performance assessment of learning in higher education (iPAL): research and development,” in Assessment of Learning Outcomes in Higher Education: Cross-National Comparisons and Perspectives , eds O. Zlatkin-Troitschanskaia, M. Toepper, H. A. Pant, C. Lautenbach, and C. Kuhn (Berlin: Springer), 193–214. doi: 10.1007/978-3-319-74338-7_10

Shavelson, R. J., Klein, S., and Benjamin, R. (2009). The limitations of portfolios. Inside Higher Educ. Available online at: https://www.insidehighered.com/views/2009/10/16/limitations-portfolios

Stolzenberg, E. B., Eagan, M. K., Zimmerman, H. B., Berdan Lozano, J., Cesar-Davis, N. M., Aragon, M. C., et al. (2019). Undergraduate Teaching Faculty: The HERI Faculty Survey 2016–2017. Los Angeles, CA: UCLA.

Tessier-Lavigne, M. (2020). Putting Ethics at the Heart of Innovation. Stanford, CA: Stanford Magazine.

Wheeler, P., and Haertel, G. D. (1993). Resource Handbook on Performance Assessment and Measurement: A Tool for Students, Practitioners, and Policymakers. Palm Coast, FL: Owl Press.

Wineburg, S., McGrew, S., Breakstone, J., and Ortega, T. (2016). Evaluating Information: The Cornerstone of Civic Online Reasoning. Executive Summary. Stanford, CA: Stanford History Education Group.

Zahner, D. (2013). Reliability and Validity–CLA+. Council for Aid to Education. Available online at: https://pdfs.semanticscholar.org/91ae/8edfac44bce3bed37d8c9091da01d6db3776.pdf.

Zlatkin-Troitschanskaia, O., and Shavelson, R. J. (2019). Performance assessment of student learning in higher education [Special issue]. Br. J. Educ. Psychol. 89, i–iv, 413–563.

Zlatkin-Troitschanskaia, O., Pant, H. A., Lautenbach, C., Molerov, D., Toepper, M., and Brückner, S. (2017). Modeling and Measuring Competencies in Higher Education: Approaches to Challenges in Higher Education Policy and Practice. Berlin: Springer VS.

Zlatkin-Troitschanskaia, O., Pant, H. A., Toepper, M., and Lautenbach, C. (eds) (2020). Student Learning in German Higher Education: Innovative Measurement Approaches and Research Results. Wiesbaden: Springer.

Zlatkin-Troitschanskaia, O., Shavelson, R. J., and Pant, H. A. (2018). “Assessment of learning outcomes in higher education: international comparisons and perspectives,” in Handbook on Measurement, Assessment, and Evaluation in Higher Education , 2nd Edn, eds C. Secolsky and D. B. Denison (Abingdon: Routledge), 686–697.

Zlatkin-Troitschanskaia, O., Shavelson, R. J., Schmidt, S., and Beck, K. (2019). On the complementarity of holistic and analytic approaches to performance assessment scoring. Br. J. Educ. Psychol. 89, 468–484. doi: 10.1111/bjep.12286

Keywords: critical thinking, performance assessment, assessment framework, scoring rubric, evidence-centered design, 21st century skills, higher education

Citation: Braun HI, Shavelson RJ, Zlatkin-Troitschanskaia O and Borowiec K (2020) Performance Assessment of Critical Thinking: Conceptualization, Design, and Implementation. Front. Educ. 5:156. doi: 10.3389/feduc.2020.00156

Received: 30 May 2020; Accepted: 04 August 2020; Published: 08 September 2020.


Copyright © 2020 Braun, Shavelson, Zlatkin-Troitschanskaia and Borowiec. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Henry I. Braun, [email protected]



Yes, We Can Define, Teach, and Assess Critical Thinking Skills


Jeff Heyck-Williams (He, His, Him), Director of the Two Rivers Learning Institute in Washington, DC


Today’s learners face an uncertain present and a rapidly changing future that demand far different skills and knowledge than were needed in the 20th century. We also know so much more about enabling deep, powerful learning than we ever did before. Our collective future depends on how well young people prepare for the challenges and opportunities of 21st-century life.

While the idea of teaching critical thinking has been bandied around in education circles since at least the time of John Dewey, it has taken greater prominence in the education debates with the advent of the term “21st century skills” and discussions of deeper learning. There is increasing agreement among education reformers that critical thinking is an essential ingredient for long-term success for all of our students.

However, there are still those in the education establishment and in the media who argue that critical thinking isn’t really a thing, or that these skills aren’t well defined and, even if they could be defined, they can’t be taught or assessed.

To those naysayers, I have to disagree. Critical thinking is a thing. We can define it; we can teach it; and we can assess it. In fact, as part of a multi-year Assessment for Learning Project, Two Rivers Public Charter School in Washington, D.C., has done just that.

Before I dive into what we have done, I want to acknowledge that some of the criticism has merit.

First, there are those who argue that critical thinking can only exist when students have a vast fund of knowledge; that is, a student cannot think critically if they don’t have something substantive about which to think. I agree. Students do need a robust foundation of core content knowledge to effectively think critically. Schools still have a responsibility for building students’ content knowledge.

However, I would argue that students don’t need to wait to think critically until after they have mastered some arbitrary amount of knowledge. They can start building critical thinking skills when they walk in the door. All students come to school with experience and knowledge which they can immediately think critically about. In fact, some of the thinking that they learn to do helps augment and solidify the discipline-specific academic knowledge that they are learning.

The second criticism is that critical thinking skills are always highly contextual. In this argument, the critics make the point that the types of thinking that students do in history are categorically different from the types of thinking students do in science or math. Thus, the idea of teaching broadly defined, content-neutral critical thinking skills is impossible. I agree that there are domain-specific thinking skills that students should learn in each discipline. However, I also believe that there are several generalizable skills that elementary school students can learn that have broad applicability to their academic and social lives. That is what we have done at Two Rivers.

Defining Critical Thinking Skills

We began this work by first defining what we mean by critical thinking. After a review of the literature and looking at the practice at other schools, we identified five constructs that encompass a set of broadly applicable skills: schema development and activation; effective reasoning; creativity and innovation; problem solving; and decision making.


We then created rubrics to provide a concrete vision of what each of these constructs looks like in practice. Working with the Stanford Center for Assessment, Learning and Equity (SCALE), we refined these rubrics to capture clear and discrete skills.

For example, we defined effective reasoning as the skill of creating an evidence-based claim: students need to construct a claim, identify relevant support, link their support to their claim, and identify possible questions or counter claims. Rubrics provide an explicit vision of the skill of effective reasoning for students and teachers. By breaking the rubrics down for different grade bands, we have been able not only to describe what reasoning is but also to delineate how the skills develop in students from preschool through 8th grade.
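To illustrate how such a rubric can be made concrete enough to score against, the sketch below (Python) represents the four components of an evidence-based claim with per-grade-band expectations and aggregates component-level ratings. The descriptors, bands, and rating scale are invented stand-ins, not the actual Two Rivers/SCALE rubric.

```python
# Illustrative only: component names follow the description above, but the
# descriptors, grade bands, and 0-3 scale are invented, not the actual rubric.
EFFECTIVE_REASONING_RUBRIC = {
    "construct_claim": {
        "K-2": "states an opinion about the question",
        "3-5": "states a clear, arguable claim",
        "6-8": "states a precise, arguable claim scoped to the evidence",
    },
    "identify_support": {
        "K-2": "gives a reason",
        "3-5": "selects relevant evidence",
        "6-8": "selects sufficient, credible evidence",
    },
    "link_support_to_claim": {
        "K-2": "connects the reason to the opinion with help",
        "3-5": "explains how the evidence supports the claim",
        "6-8": "explains the warrant connecting each piece of evidence to the claim",
    },
    "address_counterclaims": {
        "K-2": "notices a different opinion",
        "3-5": "identifies a possible question or counterclaim",
        "6-8": "responds to a counterclaim with evidence",
    },
}

def score_response(ratings):
    """Combine 0-3 ratings per component into a component profile and total.

    `ratings` maps component name -> integer 0 (missing) to 3 (exceeds band expectation).
    """
    assert set(ratings) == set(EFFECTIVE_REASONING_RUBRIC), "rate every component"
    return {"components": ratings, "total": sum(ratings.values())}

print(score_response({
    "construct_claim": 3,
    "identify_support": 2,
    "link_support_to_claim": 2,
    "address_counterclaims": 1,
}))
```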


Before moving on, I want to freely acknowledge that in narrowly defining reasoning as the construction of evidence-based claims we have disregarded some elements of reasoning that students can and should learn. For example, the difference between constructing claims through deductive versus inductive means is not highlighted in our definition. However, by privileging a definition that has broad applicability across disciplines, we are able to gain traction in developing the roots of critical thinking: in this case, the ability to formulate well-supported claims or arguments.

Teaching Critical Thinking Skills

The definitions of critical thinking constructs were only useful to us in as much as they translated into practical skills that teachers could teach and students could learn and use. Consequently, we have found that to teach a set of cognitive skills, we needed thinking routines that defined the regular application of these critical thinking and problem-solving skills across domains. Building on Harvard’s Project Zero Visible Thinking work, we have named routines aligned with each of our constructs.

For example, with the construct of effective reasoning, we aligned the Claim-Support-Question thinking routine to our rubric. Teachers then were able to teach students that whenever they were making an argument, the norm in the class was to use the routine in constructing their claim and support. The flexibility of the routine has allowed us to apply it from preschool through 8th grade and across disciplines from science to economics and from math to literacy.


Kathryn Mancino, a 5th grade teacher at Two Rivers, has deliberately taught three of our thinking routines to students using the anchor charts above. Her charts name the components of each routine and have a place for students to record when they’ve used it and what they have figured out about the routine. By using this structure with a chart that can be added to throughout the year, students see the routines as broadly applicable across disciplines and are able to refine their application over time.

Assessing Critical Thinking Skills

By defining specific constructs of critical thinking and building thinking routines that support their implementation in classrooms, we have operated under the assumption that students are developing skills that they will be able to transfer to other settings. However, we recognized both the importance and the challenge of gathering reliable data to confirm this.

With this in mind, we have developed a series of short performance tasks around novel discipline-neutral contexts in which students can apply the constructs of thinking. Through these tasks, we have been able to provide an opportunity for students to demonstrate their ability to transfer the types of thinking beyond the original classroom setting. Once again, we have worked with SCALE to define tasks where students easily access the content but where the cognitive lift requires them to demonstrate their thinking abilities.

These assessments demonstrate that it is possible to capture meaningful data on students’ critical thinking abilities. They are not intended to be high-stakes accountability measures. Instead, they are designed to give students, teachers, and school leaders discrete formative data on hard-to-measure skills.

While it is clearly difficult, and we have not solved all of the challenges to scaling assessments of critical thinking, we can define, teach, and assess these skills. In fact, knowing how important they are for the economy of the future and our democracy, it is essential that we do.

Jeff Heyck-Williams is the director of the Two Rivers Learning Institute and a founder of Two Rivers Public Charter School. He has led work around creating school-wide cultures of mathematics, developing assessments of critical thinking and problem-solving, and supporting project-based learning.


Assessment of Critical Thinking

  • First Online: 10 December 2023

Dirk Jahn and Michael Cursio


The term “to assess” has various meanings, such as to judge, evaluate, estimate, gauge, or determine. Assessment is therefore a diagnostic inventory of certain characteristics of a section of observable reality on the basis of defined criteria. In a pedagogical context, assessments aim to make learners’ knowledge, skills, or attitudes observable in certain application situations and to assess them on the basis of observation criteria.

To give an example: Holistic Critical Thinking Rubric from East Georgia College; Available at https://studylib.net/doc/7608742/east-georgia-college-holistic-critical-thinking-rubric-cr… (04/03/2020).


Jahn, D., Cursio, M. (2023). Assessment of Critical Thinking. In: Critical Thinking. Springer VS, Wiesbaden. https://doi.org/10.1007/978-3-658-41543-3_8


Christopher Dwyer Ph.D.

Critical Thinking About Measuring Critical Thinking

A list of critical thinking measures.

Posted May 18, 2018


In my last post, I discussed the nature of engaging the critical thinking (CT) process and made mention of individuals who draw a conclusion and wind up being correct. But, just because they’re right, it doesn’t mean they used CT to get there. I exemplified this through an observation made in recent years regarding extant measures of CT, many of which assess CT via multiple-choice questions. In the case of CT MCQs, you can guess the "right" answer 20-25% of the time, without any need for CT. So, the question is, are these CT measures really measuring CT?

As my previous articles explain, CT is a metacognitive process consisting of a number of sub-skills and dispositions that, when applied through purposeful, self-regulatory, reflective judgment, increase the chances of producing a logical solution to a problem or a valid conclusion to an argument (Dwyer, 2017; Dwyer, Hogan & Stewart, 2014). Most definitions, though worded differently, tend to agree with this perspective – it consists of certain dispositions, specific skills and a reflective sensibility that governs application of these skills. That’s how it’s defined; however, it’s not necessarily how it’s been operationally defined.

Operationally defining something refers to defining the terms of the process or measure required to determine the nature and properties of a phenomenon. Simply, it is defining the concept with respect to how it can be done, assessed or measured. If the manner in which you measure something does not match or assess the parameters set out in the way in which you define it, then you have not been successful in operationally defining it.

Though most theoretical definitions of CT are similar, the manner in which they vary often impedes the construction of an integrated theoretical account of how best to measure CT skills. As a result, researchers and educators must consider the wide array of CT measures available, in order to identify the best and the most appropriate measures, based on the CT conceptualisation used for training. There are various extant CT measures – the most popular amongst them include the Watson-Glaser Critical Thinking Assessment (WGCTA; Watson & Glaser, 1980), the Cornell Critical Thinking Test (CCTT; Ennis, Millman & Tomko, 1985), the California Critical Thinking Skills Test (CCTST; Facione, 1990a), the Ennis-Weir Critical Thinking Essay Test (EWCTET; Ennis & Weir, 1985) and the Halpern Critical Thinking Assessment (Halpern, 2010).

It has been noted by some commentators that these different measures of CT ability may not be directly comparable (Abrami et al., 2008). For example, the WGCTA consists of 80 MCQs that measure the ability to draw inferences; recognise assumptions; evaluate arguments; and use logical interpretation and deductive reasoning (Watson & Glaser, 1980). The CCTT consists of 52 MCQs which measure skills of critical thinking associated with induction; deduction; observation and credibility; definition and assumption identification; and meaning and fallacies. Finally, the CCTST consists of 34 multiple-choice questions (MCQs) and measures CT according to the core skills of analysis, evaluation and inference, as well as inductive and deductive reasoning.

As addressed above, the MCQ format of these three assessments is less than ideal – problematic even, because it allows test-takers to simply guess when they do not know the correct answer, instead of demonstrating their ability to critically analyse and evaluate problems and infer solutions to those problems (Ku, 2009). Furthermore, as argued by Halpern (2003), the MCQ format makes the assessment a test of verbal and quantitative knowledge rather than CT (i.e. because one selects from a list of possible answers rather than determining one’s own criteria for developing an answer). The measurement of CT through MCQs is also problematic given the potential incompatibility between the conceptualisation of CT that shapes test construction and its assessment using MCQs. That is, MCQ tests assess cognitive capacities associated with identifying single right-or-wrong answers and, as a result, this approach to testing is unable to provide a direct measure of test-takers’ use of metacognitive processes such as CT, reflective judgment, and disposition towards CT.
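The guessing problem can be made explicit with the classic correction-for-guessing formula, score = R - W/(k - 1), where R is the number of right answers, W the number wrong, and k the number of options per item. The snippet below (Python) applies it to a hypothetical 80-item test with five options per item; actual option counts vary across the tests discussed here, so the numbers are purely illustrative.

```python
def expected_score_from_guessing(n_items, n_options):
    """Expected number right if every item is answered by blind guessing."""
    return n_items / n_options

def corrected_score(n_right, n_wrong, n_options):
    """Classic correction for guessing: R - W / (k - 1)."""
    return n_right - n_wrong / (n_options - 1)

# Hypothetical 80-item test with 5 options per item
n_items, k = 80, 5
r = expected_score_from_guessing(n_items, k)   # 16 items right by chance alone
w = n_items - r
print(r, corrected_score(r, w, k))             # corrected score is 0 on average
```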

Instead of using MCQ items, a better measure of CT might ask open-ended questions, which would allow test-takers to demonstrate whether or not they spontaneously use a specific CT skill. One commonly used CT assessment, mentioned above, that employs an open-ended format is the Ennis-Weir Critical Thinking Essay Test (EWCTET; Ennis & Weir, 1985). The EWCTET is an essay-based assessment of the test-taker’s ability to analyse, evaluate, and respond to arguments and debates in real-world situations (Ennis & Weir, 1985; see Ku, 2009 for a discussion). The authors of the EWCTET provide what they call a “rough, somewhat overlapping list of areas of critical thinking competence”, measured by their test (Ennis & Weir, 1985, p. 1). However, this test, too, has been criticised – for its domain-specific nature (Taube, 1997), the subjectivity of its scoring protocol and its bias in favour of those proficient in writing (Adams, Whitlow, Stover & Johnson, 1996).

Another, more recent CT assessment that utilises an open-ended format is the Halpern Critical Thinking Assessment (HCTA; Halpern, 2010). The HCTA consists of 25 open-ended questions based on believable, everyday situations, followed by 25 specific questions that probe for the reasoning behind each answer. The multi-part nature of the questions makes it possible to assess the ability to use specific CT skills when the prompt is provided (Ku, 2009). The HCTA’s scoring protocol also provides comprehensible, unambiguous instructions for how to evaluate responses by breaking them down into clear, measurable components. Questions on the HCTA represent five categories of CT application: hypothesis testing (e.g. understanding the limits of correlational reasoning and how to know when causal claims cannot be made), verbal reasoning (e.g. recognising the use of pervasive or misleading language), argumentation (e.g. recognising the structure of arguments, how to examine the credibility of a source and how to judge one’s own arguments), judging likelihood and uncertainty (e.g. applying relevant principles of probability, how to avoid overconfidence in certain situations) and problem-solving (e.g. identifying the problem goal, generating and selecting solutions among alternatives).
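The structure Halpern describes can be pictured with a small data model. The sketch below (Python) pairs an open-ended prompt with a follow-up probe under one of the five categories and aggregates component-level ratings by category; the prompt, probe, and component list are invented illustrations, not actual HCTA items or its scoring protocol.

```python
# Illustrative data model only; prompts and component lists are invented,
# not actual HCTA items or the published scoring protocol.
hcta_like_item = {
    "category": "judging likelihood and uncertainty",
    "open_ended_prompt": "A friend says a coin that came up heads five times is 'due' for tails. "
                         "What would you tell them, and why?",
    "followup_probe": "Which of the following best describes the reasoning error involved?",
    "scored_components": ["identifies independence of tosses",
                          "names or describes the gambler's fallacy",
                          "justifies the answer with a probability argument"],
}

def category_scores(items, ratings):
    """Aggregate 0/1 component ratings into a per-category score profile.

    `ratings[i]` is a list of 0/1 judgments, one per scored component of items[i].
    """
    totals = {}
    for item, item_ratings in zip(items, ratings):
        totals.setdefault(item["category"], 0)
        totals[item["category"]] += sum(item_ratings)
    return totals

print(category_scores([hcta_like_item], [[1, 1, 0]]))
```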

Up until the development of the HCTA, I would have recommended the CCTST for measuring CT, despite its limitations. What’s nice about the CCTST is that it assesses the three core skills of CT: analysis, evaluation, and inference, which other scales do not (explicitly). So, if you were interested in assessing students’ sub-skill ability, this would be helpful. However, as we know, though CT skill performance is a sequence, it is also a collation of these skills – meaning that for any given problem or topic, each skill is necessary. Administering an analysis problem, an evaluation problem and an inference problem, on all three of which the student scores top marks, doesn’t guarantee that the student will apply these three to a broader problem that requires all three. That is, these questions don’t measure CT skill ability per se, but rather analysis skill, evaluation skill and inference skill in isolation. Simply, scores may predict CT skill performance, but they don’t measure it.


What may be a better indicator of CT performance is assessment of CT application. As addressed above, there are five general applications of CT: hypothesis testing, verbal reasoning, argumentation, problem-solving and judging likelihood and uncertainty – all of which require a collation of analysis, evaluation, and inference. Though the sub-skills of analysis, evaluation, and inference are not directly measured in this case, their collation is measured through five distinct applications; and, as I see it, provides a 'truer' assessment of CT. In addition to assessing CT via an open-ended, short-answer format, the HCTA measures CT according to the five applications of CT; thus, I recommend its use for measuring CT.

However, that’s not to say that the HCTA is perfect. Though it consists of 25 open-ended questions, followed by 25 specific questions that probe for the reasoning behind each answer, when I first used it to assess a sample of students, I found that in setting up my data file, there were actually 165 opportunities for scoring across the test. Past research suggests that the assessment takes roughly 45 to 60 minutes to complete. However, many of my participants reported it requiring closer to two hours (sometimes longer). It’s a long assessment – thorough, but long. Fortunately, adapted, shortened versions are now available, and it’s an adapted version that I currently administer to assess CT. Another limitation is that, despite the rationale above, it would be nice to have some indication of how participants get on with the sub-skills of analysis, evaluation, and inference, as I do think there’s a potential predictive element in the relationship among the individual skills and the applications. With that, I suppose it is feasible to administer both the HCTA and CCTST to assess such hypotheses.

Though it’s obviously important to consider how assessments actually measure CT and the nature in which each is limited, the broader, macro-problem still requires thought. Just as conceptualisations of CT vary, so too do the reliability and validity of the different CT measures, which has led Abrami and colleagues (2008, p. 1104) to ask: “How will we know if one intervention is more beneficial than another if we are uncertain about the validity and reliability of the outcome measures?” Abrami and colleagues add that, even when researchers explicitly declare that they are assessing CT, there still remains the major challenge of ensuring that measured outcomes are related, in some meaningful way, to the conceptualisation and operational definition of CT that informed the teaching practice in cases of interventional research. Often, the relationship between the concepts of CT that are taught and those that are assessed is unclear, and a large majority of studies in this area include no theory to help elucidate these relationships.

In conclusion, solving the problem of consistency across CT conceptualisation, training, and measurement is no easy task. I think recent advancements in CT scale development (e.g. the development of the HCTA and its adapted versions) have eased the problem, given that they now bridge the gap between current theory and practical assessment. However, such advances need to be made clearer to interested populations. As always, I’m very interested in hearing from any readers who may have any insight or suggestions!

Abrami, P. C., Bernard, R. M., Borokhovski, E., Wade, A., Surkes, M. A., Tamim, R., & Zhang, D. (2008). Instructional interventions affecting critical thinking skills and dispositions: A stage 1 meta-analysis. Review of Educational Research, 78(4), 1102–1134.

Adams, M.H., Whitlow, J.F., Stover, L.M., & Johnson, K.W. (1996). Critical thinking as an educational outcome: An evaluation of current tools of measurement. Nurse Educator, 21, 23–32.

Dwyer, C.P. (2017). Critical thinking: Conceptual perspectives and practical guidelines. Cambridge, UK: Cambridge University Press.

Dwyer, C.P., Hogan, M.J. & Stewart, I. (2014). An integrated critical thinking framework for the 21st century. Thinking Skills & Creativity, 12, 43-52.

Ennis, R.H., Millman, J., & Tomko, T.N. (1985). Cornell critical thinking tests. CA: Critical Thinking Co.

Ennis, R.H., & Weir, E. (1985). The Ennis-Weir critical thinking essay test. Pacific Grove, CA: Midwest Publications.

Facione, P. A. (1990a). The California critical thinking skills test (CCTST): Forms A and B;The CCTST test manual. Millbrae, CA: California Academic Press.

Facione, P.A. (1990b). The Delphi report: Committee on pre-college philosophy. Millbrae, CA: California Academic Press.

Halpern, D. F. (2003b). The “how” and “why” of critical thinking assessment. In D. Fasko (Ed.), Critical thinking and reasoning: Current research, theory and practice. Cresskill, NJ: Hampton Press.

Halpern, D.F. (2010). The Halpern critical thinking assessment: Manual. Vienna: Schuhfried.

Ku, K.Y.L. (2009). Assessing students’ critical thinking performance: Urging for measurements using multi-response format. Thinking Skills and Creativity, 4(1), 70–76.

Taube, K.T. (1997). Critical thinking ability and disposition as factors of performance on a written critical thinking test. Journal of General Education, 46, 129-164.

Watson, G., & Glaser, E.M. (1980). Watson-Glaser critical thinking appraisal. New York: Psychological Corporation.

Christopher Dwyer, Ph.D., is a lecturer at the Technological University of the Shannon in Athlone, Ireland.



Critical Thinking Tests


Critical thinking tests, sometimes known as critical reasoning tests, are often used by employers. They evaluate how a candidate makes logical deductions after scrutinising the evidence provided, while avoiding fallacies or non-factual opinions. Critical thinking tests can form part of an assessment day, or be used as a screening test before an interview.

What is a critical thinking test?

A critical thinking test assesses your ability to use a range of logical skills to evaluate given information and make a judgement. The test is presented in such a way that candidates are expected to quickly scrutinise the evidence presented and decide on the strength of the arguments.

Critical thinking tests show potential employers that you do not just accept data and can avoid subconscious bias and opinions – instead, you can find logical connections between ideas and find alternative interpretations.

This test is usually timed, so quick, clear, logical thinking will help candidates get the best marks. Critical thinking tests are designed to be challenging, and often used as part of the application process for upper-management-level roles.

What does critical thinking mean?

Critical thinking is the intellectual skill set that ensures you can process and consider information, challenge and analyse data, and then reach a conclusion that can be defended and justified.

In the most simple terms, critical reasoning skills will make sure that you are not simply accepting information at face value with little or no supporting evidence.

It also means that you are less likely to be swayed by ‘false news’ or opinions that cannot be backed with facts – which is important in high-level jobs that require logical thinking.

For more information about logical thinking, please see our article all about logical reasoning.

Which professions use critical thinking tests, and why?

Typically, critical thinking tests are taken as part of the application process for jobs that require advanced skills in judgement, analysis and decision making. The higher the position, the more likely that you will need to demonstrate reliable critical reasoning and good logic.

The legal sector is the main industry that uses critical thinking assessments – making decisions based on facts, without opinion and intuition, is vital in legal matters.

A candidate for a legal role needs to demonstrate their intellectual skills in problem-solving without pre-existing knowledge or subconscious bias – and the critical thinking test is a simple and effective way to screen candidates.

Another industry that uses critical thinking tests as part of the recruitment process is banking. In a similar way to the legal sector, those that work in banking are required to make decisions without allowing emotion, intuition or opinion to cloud coherent analysis and conclusions.

Critical thinking tests also sometimes comprise part of the recruitment assessment for graduate and management positions across numerous industries.

The format of the test: which skills are tested?

The test itself, no matter the publisher, is multiple choice.

As a rule, the questions present a paragraph of information for a scenario that may include numerical data. There will then be a statement and a number of possible answers.

The critical thinking test is timed, so decisions need to be made quickly and accurately; in most tests there is a little less than a minute for each question. Having experience of the test structure and what each question is looking for will make the experience smoother for you.

There are typically five separate sections in a critical thinking test, and each section may have multiple questions.

Inference

Inference questions assess your ability to judge whether a statement is true, false, or impossible to determine based on the given data and scenario. You usually have five possible answers: absolutely true, absolutely false, possibly true, possibly false, or not possible to determine.

Assumptions

In this section, you are being assessed on your ability to avoid taking things for granted. Each question gives a scenario including data, and you need to evaluate whether there are any assumptions present.

Deductions

Here you are given a scenario and a number of deductions that may be applicable. You need to assess the given deductions to see which is the logical conclusion – does it follow?

Interpretation

In the interpretation stage, you need to read and analyse a paragraph of information, then interpret a set of possible conclusions, to see which one is correct. You are looking for the conclusion that follows beyond reasonable doubt.

Evaluation of Arguments

In this section, you are given a scenario and a set of arguments that can be for or against. You need to determine which are strong arguments and which are weak, in terms of the information that you have. This decision is made based on the way they address the scenario and how relevant they are to the content.

How best to prepare for a critical thinking test

The best way to prepare for any type of aptitude test is to practice, and critical thinking tests are no different.

Taking practice tests, as mentioned above, will give you confidence as it makes you better understand the structure, layout and timing of the real tests, so you can concentrate on the actual scenarios and questions.

Practice tests should be timed. This will help you get used to working through the scenarios and assessing the conclusions under time constraints – which is a good way to make sure that you perform quickly as well as accurately.

In some thinking skills assessments, a timer will be built in, but you might need to time yourself.

Consistent practice will also enable you to pinpoint any areas of the critical thinking test that require improvement. Our tests offer explanations for each answer, similar to the examples provided below.

Publishers of critical thinking tests

The Watson Glaser critical thinking test

The Watson-Glaser Critical Thinking Appraisal (W-GCTA) is the most popular and widely used critical thinking test. This test has been in development for 85 years and is published by TalentLens.

The W-GCTA is seen as a successful tool for assessing cognitive abilities, allowing recruiting managers to predict job success, find good managers and identify future leaders. It is available in multiple languages including English, French and Spanish.

The test itself can be used as part of an assessment day or as a screening assessment before an interview. It consists of 40 questions on the 5 sections mentioned above, and is timed at 30 minutes. Click here for more information on Watson Glaser tests.

SHL critical reasoning test

SHL is a major aptitude test publisher, which offers critical thinking as part of its testing battery for pre-employment checks.

SHL tests cover all kinds of behavioural and aptitude tests, from logic to inference, verbal to numerical – and with a number of test batteries available online, they are one of the most popular choices for recruiters.

Cornell critical thinking test

The Cornell critical thinking test, first developed in 1985, was made to test students. It is an American system that helps teachers, parents and administrators to confidently predict future performance for college admission, gifted and advanced placement programs, and even career success.


5 Example critical thinking practice questions with answers

In this section, you need to deduce whether the inferred statement is true, false or impossible to deduce.

The UK Government has published data that shows 82% of people under the age of 30 are not homeowners. A charity that helps homeless people has published data that shows 48% of people that are considered homeless are under 30.

The lack of affordable housing on the sales market is the reason so many under-30s are homeless.

  • Definitely True
  • Probably True
  • Impossible to Deduce
  • Probably False
  • Definitely False

The information given does not imply the conclusion offered, so it is impossible to deduce whether the inference is correct – there is just not enough information to judge it either way.

The removal of the five-substitution rule in British football will benefit clubs with a smaller roster.

Clubs with more money would prefer the five-substitute rule to continue.

  • Assumption Made
  • Assumption Not Made

This is an example of a fallacy that could cause confusion for a candidate – it encourages you to bring in any pre-existing knowledge of football clubs.

It would be easy to assume the assumption has been made when you consider that the more money a club has, the more players they should have on the roster. However, the statement does not make the assumption that the clubs with more money would prefer to continue with the five-substitute rule.


All boys love football. Football is a sport, therefore:

  • All boys love all sports
  • Girls do not love football
  • Boys are more likely to choose to play football than any other sport

In this section we are looking for the conclusion that follows the logic of the statement. In this example, we cannot deduce that girls do not love football, because there is not enough information to support that.

In the same way the conclusion that all boys love all sports does not follow – we are not given enough information to make that assumption. So, the conclusion that follows is 3: boys are more likely to choose football than any other sport because all boys like football.

The British Museum has a range of artefacts on display, including the largest privately owned collection of WWII weaponry.

There is a larger privately owned collection of WWII weaponry in the USA.

  • Conclusion Follows
  • Conclusion Does Not Follow

The statement already tells us that the British Museum holds the largest privately owned collection of WWII weaponry, so there cannot be a larger private collection elsewhere – the conclusion does not follow.

The Department for Education should lower standards in examinations to make it fairer for less able students.

  • Yes – top grades are too hard for lower-income students
  • No – less fortunate students are not capable of higher standards
  • Yes – making the standards lower will benefit all students
  • No – private school students will suffer if grade standards are lower
Note that the strongest argument is the right answer, not the one that you might personally believe.

In this case, we need to assess which argument is most relevant to the statement. Both 1 and 4 refer to students in particular situations, which isn’t relevant to the statement. The same can be said about 2, so the strongest argument is 3, since it is relevant and addresses the statement given.

Sample critical thinking test questions – test your knowledge!

What implication can be drawn from the information in the passage?

A company’s internal audit revealed that departments with access to advanced analytics tools reported higher levels of strategic decision-making. These departments also showed a higher rate of reaching their quarterly objectives.

  • Strategic decision-making has no link to the achievement of quarterly objectives.
  • Access to advanced analytics does not influence a department's ability to make strategic decisions.
  • Advanced analytics tools are the sole reason for departments reaching their quarterly objectives.
  • Departments without access to advanced analytics tools are unable to make strategic decisions.
  • Advanced analytics tools may facilitate better strategic decision-making, which can lead to the achievement of objectives.

After reading the passage below, what conclusion is best supported by the information provided?

  • Job satisfaction increases when employees start their day earlier.
  • Starting early may lead to more efficient task completion and less job-related stress.
  • Workers who start their day later are more efficient at completing tasks.
  • There is a direct correlation between job satisfaction and starting work early.
  • The study concludes that job-related stress is unaffected by the start time of the workday.

Based on the passage below, which of the following assumptions is implicit?

  • Inter-departmental cooperation is the sole factor influencing project completion rates.
  • The increase in project completion rates is due entirely to the specialized team-building module.
  • Team-building exercises have no effect on inter-departmental cooperation.
  • The specialized team-building module may contribute to improvements in inter-departmental cooperation.
  • Departments that have not undergone the training will experience a decrease in project completion rates.

What is the flaw in the argument presented in the passage below?

  • The assumption that a casual dress code is suitable for all company types.
  • High-tech companies have a casual dress code to increase employee productivity specifically.
  • The argument correctly suggests that a casual dress code will increase employee morale in every company.
  • Morale and productivity cannot be affected by a company's dress code.
  • A casual dress code is more important than other factors in determining a company's success.

Which statement is an inference that can be drawn from the passage below?

  • Telecommuting employees are less productive than on-site workers.
  • The reduction in operational costs is directly caused by the increase in telecommuting employees.
  • Telecommuting may have contributed to the decrease in operational costs.
  • Operational costs are unaffected by employee work locations.
  • The number of telecommuting employees has no impact on operational costs.


Critical Thinking Tests Tips

1 Practice

The most important factor in your success will be practice. If you have taken some practice tests, not only will you start to recognise the way questions are worded and become familiar with what each question is looking for, but you will also be able to find out whether there are any parts you need extra practice with.

It is important to find out which test you will be taking, as some generic critical thinking practice tests might not help if you are taking specific publisher tests (see the section below).

2 Fact vs fallacy

Practice questions can also help you recognise the difference between fact and fallacy in the test. A fallacy is simply an error or something misleading in the scenario paragraph that encourages you to choose an invalid argument. This might be a presumption or a misconception, but if it isn’t spotted it can make finding the right answer impossible.

3 Ignore what you already know

There is no need for pre-existing knowledge to be brought into the test, so no research is needed. In fact, it is important that you ignore any subconscious bias when you are considering the questions – you need logic and facts to get the correct answer, not intuition or instinct.

4 Read everything carefully

Read all the given information thoroughly. This might sound straightforward, but knowing that the test is timed can encourage candidates to skim the content and risk misunderstanding it or missing crucial details.

During the test itself, you will receive instructions that will help you to understand what is being asked of you on each section. There is likely to be an example question and answer, so ensure you take the time to read them fully.

5 Stay aware of the time you've taken

This test is usually timed, so don’t spend too long on a question. If you feel it is going to take too much time, leave it and come back to it at the end (if you have time). Critical thinking tests are complex by design, so they do have quite generous time limits.

For further advice, check out our full set of tips for critical thinking tests.


Critical Thinking Tests FAQs

What are the basics of critical thinking?

In essence, critical thinking is the intellectual process of considering information on its merits, and reaching an analysis or conclusion from that information that can be defended and rationalised with evidence.

How do you know if you have good critical thinking skills?

You are likely to be someone with good critical thinking skills if you can build winning arguments; pick holes in someone’s theory if it’s inconsistent with known facts; reflect on the biases inherent in your own experiences and assumptions; and look at problems using a systematic methodology.

Reviews of our Watson Glaser tests

What our customers say about our Watson Glaser tests

Jozef Bailey

United Kingdom

April 05, 2022

Doesn't cover all aspects of Watson-Glaser tests but useful

The WGCTA uses more categories to assess critical thinking, but this was useful for the inference section.

April 01, 2022

Just practicing for an interview

Good information and liked that it had a countdown clock, to give you that real feel in the test situation.

Jerico Kadhir

March 31, 2022

Aptitude test

It was OK, I didn't understand personally whether or not the "cannot say" option was acceptable or not in a lot of the questions, as it may have been a trick option.

Salvarina Viknesuari

March 15, 2022

I like the test because the platform is simple and engaging while the test itself is different than most of the Watson Glaser tests I've taken.

Alexis Sheridan

March 02, 2022

Some of the ratios were harder than I thought!

I like how clear the design and layout is - makes things very easy (even if the content itself is not!)

Cyril Lekgetho

February 17, 2022

Mental arithmetic

I enjoyed the fact that there were multiple questions pertaining to one passage of information, rather than multiple passages. However I would've appreciated a more varied question type.

Madupoju Manish

February 16, 2022

Analytics are the best questions

I like the test because of its time schedule. The way the questions are prepared makes it easy to crack the original test.

Chelsea Franklin

February 02, 2022

Interesting

I haven't done something like this for ages. Very good for the brain - although I certainly experienced some fog whilst doing it.

[email protected]

January 04, 2022

Population/exchange rates were the hardest

Great test as it felt a bit time pressured. Very different types of questions in terms of difficulty.

faezeh tavakoli

January 02, 2022

More attention to detail + be more time conscious

It was asking about daily stuff we all deal with, but as an assessment it's scrutinising how we approach these problems.


A Brief Guide for Teaching and Assessing Critical Thinking in Psychology

In my first year of college teaching, a student approached me one day after class and politely asked, “What did you mean by the word ‘evidence’?” I tried to hide my shock at what I took to be a very naive question. Upon further reflection, however, I realized that this was actually a good question, for which the usual approaches to teaching psychology provided too few answers. During the next several years, I developed lessons and techniques to help psychology students learn how to evaluate the strengths and weaknesses of scientific and nonscientific kinds of evidence and to help them draw sound conclusions. It seemed to me that learning about the quality of evidence and drawing appropriate conclusions from scientific research were central to teaching critical thinking (CT) in psychology.

In this article, I have attempted to provide guidelines to psychology instructors on how to teach CT, describing techniques I developed over 20 years of teaching. More importantly, the techniques and approach described below are ones that are supported by scientific research. Classroom examples illustrate the use of the guidelines and how assessment can be integrated into CT skill instruction.

Overview of the Guidelines

Confusion about the definition of CT has been a major obstacle to teaching and assessing it (Halonen, 1995; Williams, 1999). To deal with this problem, we have defined CT as reflective thinking involved in the evaluation of evidence relevant to a claim so that a sound or good conclusion can be drawn from the evidence (Bensley, 1998). One virtue of this definition is that it can be applied to many thinking tasks in psychology. The claims and conclusions psychological scientists make include hypotheses, theoretical statements, interpretation of research findings, or diagnoses of mental disorders. Evidence can be the results of an experiment, case study, naturalistic observation study, or psychological test. Less formally, evidence can be anecdotes, introspective reports, commonsense beliefs, or statements of authority. Evaluating evidence and drawing appropriate conclusions along with other skills, such as distinguishing arguments from nonarguments and finding assumptions, are collectively called argument analysis skills. Many CT experts take argument analysis skills to be fundamental CT skills (e.g., Ennis, 1987; Halpern, 1998). Psychology students need argument analysis skills to evaluate psychological claims in their work and in everyday discourse.

Some instructors expect their students will improve CT skills like argument analysis skills by simply immersing them in challenging course work. Others expect improvement because they use a textbook with special CT questions or modules, give lectures that critically review the literature, or have students complete written assignments. While these and other traditional techniques may help, a growing body of research suggests they are not sufficient to efficiently produce measurable changes in CT skills. Our research on acquisition of argument analysis skills in psychology (Bensley, Crowe, Bernhardt, Buckner, & Allman, in press) and on critical reading skills (Bensley & Haynes, 1995; Spero & Bensley, 2009) suggests that more explicit, direct instruction of CT skills is necessary. These results concur with results of an earlier review of CT programs by Chance (1986) and a recent meta-analysis by Abrami et al. (2008).

Based on these and other findings, the following guidelines describe an approach to explicit instruction in which instructors can directly infuse CT skills and assessment into their courses. With infusion, instructors can use relevant content to teach CT rules and concepts along with the subject matter. Directly infusing CT skills into course work involves targeting specific CT skills, making CT rules, criteria, and methods explicit, providing guided practice in the form of exercises focused on assessing skills, and giving feedback on practice and assessments. These components are similar to ones found in effective, direct instruction approaches (Walberg, 2006). They also resemble approaches to teaching CT proposed by Angelo (1995), Beyer (1997), and Halpern (1998). Importantly, this approach has been successful in teaching CT skills in psychology (e.g., Bensley et al., in press; Bensley & Haynes, 1995; Nieto & Saiz, 2008; Penningroth, Despain, & Gray, 2007). Directly infusing CT skill instruction can also enrich content instruction without sacrificing learning of subject matter (Solon, 2007). The following seven guidelines, illustrated by CT lessons and assessments, explicate this process.

Seven Guidelines for Teaching and Assessing Critical Thinking

1. Motivate your students to think critically

Critical thinking takes effort. Without proper motivation, students are less inclined to engage in it. Therefore, it is good to arouse interest right away and foster commitment to improving CT throughout a course. One motivational strategy is to explain why CT is important to effective, professional behavior. Often, telling a compelling story that illustrates the consequences of failing to think critically can motivate students. For example, the tragic death of 10-year-old Candace Newmaker at the hands of her therapists practicing attachment therapy illustrates the perils of using a therapy that has not been supported by good empirical evidence (Lilienfeld, 2007).

Instructors can also pique interest by taking a class poll posing an interesting question on which students are likely to have an opinion. For example, asking students how many think that the full moon can lead to increases in abnormal behavior can be used to introduce the difference between empirical fact and opinion or commonsense belief. After asking students how psychologists answer such questions, instructors might go over the meta-analysis of Rotton and Kelly (1985). They found that almost all of the 37 studies they reviewed showed no association between the phase of the moon and abnormal behavior, with only a few, usually poorly controlled, studies supporting it. The effect size across all studies was very small (.01). Instructors can use this to illustrate how psychologists draw a conclusion based on the quality and quantity of research studies as opposed to what many people commonly believe. For other interesting thinking errors and misconceptions related to psychology, see Bensley (1998; 2002; 2008), Halpern (2003), Ruscio (2006), Stanovich (2007), and Sternberg (2007).

Attitudes and dispositions can also affect motivation to think critically. If students lack certain CT dispositions such as open-mindedness, fair-mindedness, and skepticism, they will be less likely to think critically even if they have CT skills (Halpern, 1998). Instructors might point out that even great scientists noted for their powers of reasoning sometimes fail to think critically when they are not disposed to use their skills. For example, Alfred Russel Wallace, who used his considerable CT skills to help develop the concept of natural selection, also believed in spiritualistic contact with the dead. Despite considerable evidence that mediums claiming to contact the dead were really faking such contact, Wallace continued to believe in it (Bensley, 2006). Likewise, the great American psychologist William James, whose reasoning skills helped him develop the seeds of important contemporary theories, believed in spiritualism despite evidence to the contrary.

2. Clearly state the CT goals and objectives for your class

Once students are motivated, the instructor should focus them on what skills they will work on during the course. The APA task force on learning goals and objectives for psychology listed CT as one of 10 major goals for students (Halonen et al., 2002). Under critical thinking they have further specified outcomes such as evaluating the quality of information, identifying and evaluating the source and credibility of information, recognizing and defending against thinking errors and fallacies. Instructors should publish goals like these in their CT course objectives in their syllabi and more specifically as assignment objectives in their assignments. Given the pragmatic penchant of students for studying what is needed to succeed in a course, this should help motivate and focus them.

To make instruction efficient, course objectives and lesson objectives should explicitly target CT skills to be improved. Objectives should specify the behavior that will change in a way that can be measured. A course objective might read, “After taking this course, you will be able to analyze arguments found in psychological and everyday discussions.” When the goal of a lesson is to practice and improve specific microskills that make up argument analysis, an assignment objective might read “After successfully completing this assignment, you will be able to identify different kinds of evidence in a psychological discussion.” Or another might read “After successfully completing this assignment, you will be able to distinguish arguments from nonarguments.” Students might demonstrate they have reached these objectives by showing the behavior of correctly labeling the kinds of evidence presented in a passage or by indicating whether an argument or merely a claim has been made. By stating objectives in the form of assessable behaviors, the instructor can test these as assessment hypotheses.

Sometimes when the goal is to teach students how to decide which CT skills are appropriate in a situation, the instructor may not want to identify specific skills. Instead, a lesson objective might read, “After successfully completing this assignment, you will be able to decide which skills and knowledge are appropriate for critically analyzing a discussion in psychology.”

3. Find opportunities to infuse CT that fit content and skill requirements of your course

To improve their CT skills, students must be given opportunities to practice them. Different courses present different opportunities for infusion and practice. Stand-alone CT courses usually provide the most opportunities to infuse CT. For example, the Frostburg State University Psychology Department has a senior seminar called “Thinking like a Psychologist” in which students complete lessons giving them practice in argument analysis, critical reading, critically evaluating information on the Internet, distinguishing science from pseudoscience, applying their knowledge and CT skills in simulations of psychological practice, and other activities.

In more typical subject-oriented courses, instructors must find specific content and types of tasks conducive to explicit CT skill instruction. For example, research methods courses present several opportunities to teach argument analysis skills. Instructors can have students critically evaluate the quality of evidence provided by studies using different research methods and designs they find in PsycINFO and Internet sources. This, in turn, could help students write better critical evaluations of research for research reports.

A cognitive psychology teacher might assign a critical evaluation of the evidence on an interesting question discussed in textbook literature reviews. For example, students might evaluate the evidence relevant to the question of whether people have flashbulb memories such as accurately remembering the 9/11 attacks. This provides the opportunity to teach them that many of the studies, although informative, are quasi-experimental and cannot show causation. Or, students might analyze the arguments in a TV program such as the fascinating Nova program Kidnapped by Aliens on people who recall having been abducted by aliens.

4. Use guided practice, explicitly modeling and scaffolding CT.

Guided practice involves modeling and supporting the practice of target skills, and providing feedback on progress towards skill attainment. Research has shown that guided practice helps students acquire thinking skills more efficiently than unguided and discovery approaches (Mayer, 2004).

Instructors can model the use of CT rules, criteria, and procedures for evaluating evidence and drawing conclusions in many ways. They could provide worked examples of problems, writing samples displaying good CT, or real-world examples of good and bad thinking found in the media. They might also think out loud as they evaluate arguments in class to model the process of thinking.

To help students learn to use complex rules in thinking, instructors should initially scaffold student thinking. Scaffolding involves providing product guidelines, rules, and other frameworks to support the process of thinking. Table 1 shows guidelines like those found in Bensley (1998) describing nonscientific kinds of evidence that can support student efforts to evaluate evidence in everyday psychological discussions. Likewise, Table 2 provides guidelines like those found in Bensley (1998) and Wade and Tavris (2005) describing various kinds of scientific research methods and designs that differ in the quality of evidence they provide for psychological arguments.

In the cognitive lesson on flashbulb memory described earlier, students use the framework in Table 2 to evaluate the kinds of evidence in the literature review. Table 1 can help them evaluate the kinds of evidence found in the Nova video Kidnapped by Aliens. Specifically, they could use it to contrast scientific authority with less credible authority. The video includes statements by scientific authorities like Elizabeth Loftus based on her extensive research contrasted with the nonscientific authority of Bud Hopkins, an artist turned hypnotherapist and author of popular books on alien abduction. Loftus argues that the memories of alien abduction in the children interviewed by Hopkins were reconstructed around the suggestive interview questions he posed. Therefore, his conclusion that the children and other people in the video were recalling actual abduction experiences was based on anecdotes, unreliable self-reports, and other weak evidence.

Modeling, scaffolding, and guided practice are especially useful in helping students first acquire CT skills. After sufficient practice, however, instructors should fade these and have students do more challenging assignments without these supports to promote transfer.

5. Align assessment with practice of specific CT skills

Test questions and other assessments of performance should be similar to practice questions and problems in the skills targeted but differ in content. For example, we have developed a series of practice and quiz questions about the kinds of evidence found in Table 1 used in everyday situations but which differ in subject matter from practice to quiz. Likewise, other questions employ research evidence examples corresponding to Table 2. Questions ask students to identify kinds of evidence, evaluate the quality of the evidence, distinguish arguments from nonarguments, and find assumptions in the examples with practice examples differing in content from assessment items.

6. Provide feedback and encourage students to reflect on it

Instructors should focus feedback on the degree of attainment of CT skill objectives in the lesson or assessment. The purpose of feedback is to help students learn how to correct faulty thinking so that in the future they monitor their thinking and avoid such problems. This should increase their metacognition or awareness and control of their thinking, an important goal of CT instruction (Halpern, 1998).

Students must use their feedback for it to improve their CT skills. In the CT exercises and critical reading assignments, students receive feedback in the form of corrected responses and written feedback on open-ended questions. They should be advised that paying attention to feedback on earlier work and assessments should improve their performance on later assessments.

7. Reflect on feedback and assessment results to improve CT instruction

Instructors should use the feedback they provide to students and the results of ongoing assessments to ‘close the loop,’ that is, use these outcomes to address deficiencies in performance and improve instruction. In actual practice, teaching and assessment strategies rarely work optimally the first time. Instructors must be willing to tinker with these to make needed improvements. Reflection on reliable and valid assessment results provides a scientific means to systematically improve instruction and assessment.

Instructors may find the direct infusion approach as summarized in the seven guidelines to be efficient, especially in helping students acquire basic CT skills, as research has shown. They may especially appreciate how it allows them to take a scientific approach to the improvement of instruction. Although the direct infusion approach seems to efficiently promote acquisition of CT skills, more research is needed to find out if students transfer their skills outside of the classroom or whether this approach needs adjustment to promote transfer.

Table 1. Strengths and Weaknesses of Nonscientific Sources and Kinds of Evidence

Commonsense belief: informal beliefs and folk theories of mind commonly assumed to be true

Strengths:
— is a view shared by many, not just a few people.
— is familiar and appeals to everyday experience.

Weaknesses:
— is not based on careful, systematic observation.
— may be biased by cultural and social influences.
— often goes untested.

Anecdote: a story or example, often biographical, used to support a claim

Strengths:
— can vividly illustrate an ability, trait, behavior, or situation.
— provides a ‘real-world’ example.

Weaknesses:
— is not based on careful, systematic observation.
— may be unique, not repeatable, and cannot be generalized for large groups.

Testimonial or introspective self-report: a report of one’s own experience

Strengths:
— tells what a person may be feeling, experiencing, or aware of at the time.
— is compelling and easily identified with.

Weaknesses:
— is often subjective and biased.
— may be unreliable because people are often unaware of the real reasons for their behaviors and experiences.

Statement of authority: a statement made by a person or group assumed to have special knowledge or expertise

Strengths:
— may be true or useful when the authority has relevant knowledge or expertise.
— is convenient because acquiring one’s own knowledge and expertise takes a lot of time.

Weaknesses:
— is misleading when the presumed authority does not have or pretends to have special knowledge or expertise.
— may be biased.

Table 2. Strengths and Weaknesses of Scientific Research Methods/Designs Used as Sources of Evidence

Case study: detailed description of one or a few subjects

Strengths:
— provides much information about one person.
— may inform about a person with special or rare abilities, knowledge, or characteristics.

Weaknesses:
— may be unique and hard to replicate.
— may not generalize to other people.
— cannot show cause and effect.

Naturalistic observation: observations of behavior made in the field or natural environment

Strengths:
— allows observations to be readily generalized to the real world.
— can be a source of hypotheses.

Weaknesses:
— allows little control of extraneous variables.
— cannot test treatments.
— cannot show cause and effect.

Survey: a method, like a questionnaire, that allows many questions to be asked

Strengths:
— allows economical collection of much data.
— allows for study of many different questions at once.

Weaknesses:
— may have problems of self-reports such as dishonesty, forgetting, and misrepresentation of self.
— may involve biased sampling.

Correlational study: a method for finding a quantitative relationship between variables

Strengths:
— allows the researcher to calculate the strength and direction of the relation between variables.
— can be used to make predictions.

Weaknesses:
— does not allow random assignment of participants or much control of subject variables.
— cannot test treatments.
— cannot show cause and effect.

Quasi-experiment: a method for comparing treatment conditions without random assignment

Strengths:
— allows comparison of treatments.
— allows some control of extraneous variables.

Weaknesses:
— does not allow random assignment of participants or much control of subject variables.
— cannot show cause and effect.

True experiment: a method for comparing treatment conditions in which variables can be controlled through random assignment

Strengths:
— allows true manipulation of treatment conditions.
— allows random assignment and much control of extraneous variables.
— can show cause and effect.

Weaknesses:
— cannot manipulate and test some variables.
— may control variables and conditions so much that they become artificial and not like the ‘real world’.

References

Abrami, P. C., Bernard, R. M., Borokhovski, E., Wade, A., Surkes, M. A., Tamim, R., et al. (2008). Instructional interventions affecting critical thinking skills and dispositions: A stage 1 meta-analysis. Review of Educational Research, 78(4), 1102–1134.

Angelo, T. A. (1995). Classroom assessment for critical thinking. Teaching of Psychology, 22(1), 6–7.

Bensley, D. A. (1998). Critical thinking in psychology: A unified skills approach. Pacific Grove, CA: Brooks/Cole.

Bensley, D. A. (2002). Science and pseudoscience: A critical thinking primer. In M. Shermer (Ed.), The Skeptic encyclopedia of pseudoscience (pp. 195–203). Santa Barbara, CA: ABC-CLIO.

Bensley, D. A. (2006). Why great thinkers sometimes fail to think critically. Skeptical Inquirer, 30, 47–52.

Bensley, D. A. (2008). Can you learn to think more like a psychologist? The Psychologist, 21, 128–129.

Bensley, D. A., Crowe, D., Bernhardt, P., Buckner, C., & Allman, A. (in press). Teaching and assessing critical thinking skills for argument analysis in psychology. Teaching of Psychology.

Bensley, D. A., & Haynes, C. (1995). The acquisition of general purpose strategic knowledge for argumentation. Teaching of Psychology, 22, 41–45.

Beyer, B. K. (1997). Improving student thinking: A comprehensive approach. Boston: Allyn & Bacon.

Chance, P. (1986). Thinking in the classroom: A review of programs. New York: Teachers College Press.

Ennis, R. H. (1987). A taxonomy of critical thinking dispositions and abilities. In J. B. Baron & R. J. Sternberg (Eds.), Teaching thinking skills: Theory and practice (pp. 9–26). New York: Freeman.

Halonen, J. S. (1995). Demystifying critical thinking. Teaching of Psychology, 22, 75–81.

Halonen, J. S., Appleby, D. C., Brewer, C. L., Buskist, W., Gillem, A. R., Halpern, D. F., et al. (APA Task Force on Undergraduate Major Competencies). (2002). Undergraduate psychology major learning goals and outcomes: A report. Washington, DC: American Psychological Association. Retrieved August 27, 2008, from http://www.apa.org/ed/pcue/reports.html

Halpern, D. F. (1998). Teaching critical thinking for transfer across domains: Dispositions, skills, structure training, and metacognitive monitoring. American Psychologist, 53, 449–455.

Halpern, D. F. (2003). Thought and knowledge: An introduction to critical thinking (3rd ed.). Mahwah, NJ: Erlbaum.

Lilienfeld, S. O. (2007). Psychological treatments that cause harm. Perspectives on Psychological Science, 2, 53–70.

Mayer, R. E. (2004). Should there be a three-strikes rule against pure discovery learning? The case for guided methods of instruction. American Psychologist, 59, 14–19.

Nieto, A. M., & Saiz, C. (2008). Evaluation of Halpern’s “structural component” for improving critical thinking. The Spanish Journal of Psychology, 11(1), 266–274.

Penningroth, S. L., Despain, L. H., & Gray, M. J. (2007). A course designed to improve psychological critical thinking. Teaching of Psychology, 34, 153–157.

Rotton, J., & Kelly, I. (1985). Much ado about the full moon: A meta-analysis of lunar-lunacy research. Psychological Bulletin, 97, 286–306.

Ruscio, J. (2006). Critical thinking in psychology: Separating sense from nonsense. Belmont, CA: Wadsworth.

Solon, T. (2007). Generic critical thinking infusion and course content learning in introductory psychology. Journal of Instructional Psychology, 34(2), 972–987.

Stanovich, K. E. (2007). How to think straight about psychology (8th ed.). Boston: Pearson.

Sternberg, R. J. (2007). Critical thinking in psychology: It really is critical. In R. J. Sternberg, H. L. Roediger, & D. F. Halpern (Eds.), Critical thinking in psychology (pp. 289–296). Cambridge, UK: Cambridge University Press.

Wade, C., & Tavris, C. (2005). Invitation to psychology (3rd ed.). Upper Saddle River, NJ: Prentice Hall.

Walberg, H. J. (2006). Improving educational productivity: A review of extant research. In R. F. Subotnik & H. J. Walberg (Eds.), The scientific basis of educational productivity (pp. 103–159). Greenwich, CT: Information Age.

Williams, R. L. (1999). Operational definitions and assessment of higher-order cognitive constructs. Educational Psychology Review, 11, 411–427.


About the Author

D. Alan Bensley is Professor of Psychology at Frostburg State University. He received his Master’s and PhD degrees in cognitive psychology from Rutgers University. His main teaching and research interests concern the improvement of critical thinking and other cognitive skills. He coordinates assessment for his department and is developing a battery of instruments to assess critical thinking in psychology. He can be reached by email at [email protected]

Association for Psychological Science, December 2010 — Vol. 23, No. 10



Critical thinking definition


Critical thinking, as described by Oxford Languages, is the objective analysis and evaluation of an issue in order to form a judgement.

The critical thinking process requires an active and skillful evaluation, assessment and synthesis of information obtained from, or generated by, observation, knowledge, reflection, acumen or conversation, used as a guide to belief and action, which is why it is so often emphasized in education and academia.

Some may even view it as a backbone of modern thought.

However, it is a skill, and skills must be trained and encouraged so they can be used to their full potential.

People turn to various approaches to improve their critical thinking, such as:

  • Developing technical and problem-solving skills
  • Engaging in more active listening
  • Actively questioning their assumptions and beliefs
  • Seeking out more diversity of thought
  • Opening up their intellectual curiosity, and so on

Is critical thinking useful in writing?

Critical thinking can help in planning your paper and making it more concise, but this is not obvious at first. We have carefully pinpointed some of the questions you should ask yourself when boosting critical thinking in your writing:

  • What information should be included?
  • Which information resources should the author look to?
  • What degree of technical knowledge should the report assume its audience has?
  • What is the most effective way to show information?
  • How should the report be organized?
  • How should it be designed?
  • What tone and level of language difficulty should the document have?

Using critical thinking comes down not only to the outline of your paper; it also raises the question of how critical thinking can help solve problems within your writing's topic.

Let's say you have a PowerPoint presentation on how critical thinking can reduce poverty in the United States. You will first have to define critical thinking for your audience, and then use plenty of critical thinking questions and related terms to familiarize them with your methods and start the thinking process behind them.

Are there any services that can help me use more critical thinking?

We understand that it's difficult to learn how to use critical thinking more effectively in just one article, but our service is here to help.

We are a team specializing in writing essays and other assignments for college students and anyone else who needs a helping hand. We cover a great range of topics, offer high-quality work, always deliver on time and aim to leave our customers completely satisfied with what they ordered.

The ordering process is fully online, and it goes as follows:

  • Select the topic and the deadline of your essay.
  • Provide us with any details, requirements, statements that should be emphasized or particular parts of the essay writing process you struggle with.
  • Leave the email address, where your completed order will be sent to.
  • Select your preferred payment type, sit back and relax!

With lots of experience on the market, professionally degreed essay writers, online 24/7 customer support and incredibly low prices, you won't find a service offering a better deal than ours.

CriticalThinking.NET

thinking in practice

Critical Thinking Definition, Instruction, and Assessment: A Rigorous Approach

Ever wonder how to think better for improved decisions and actions? The purpose of this site is to answer questions like these from a personal perspective.

  • What is critical thinking?
  • Is critical thinking worth teaching?
  • What are the options for textbooks, incorporating it into the curriculum, and self-teaching?
  • How can critical thinking skills be assessed?
  • Who can help you if you want to know more?

In a brief definition, critical thinking is reasonable reflective thinking focused on deciding what to believe or do. 

See here for an outline of critical thinking dispositions and abilities.

Critical Thinking Assessment: 4 Ways to Test Applicants

Post Author - Juste Semetaite

In the current age of information overload, critical thinking (CT) is a vital skill for sifting fact from fiction. Fake news, scams, and disinformation can have a negative impact on individuals as well as businesses. Ultimately, those with stronger CT skills can help lead their team with logical thinking, evidence-based motivation, and smarter decisions.

Today, most roles require critical thinking skills. And understanding how to test and evaluate critical thinking skills can not only help to differentiate candidates but may even predict job performance.

This article will cover:

  • What is critical thinking?
  • Critical thinking vs problem-solving
  • 5 critical thinking sub-skills
  • The importance of assessing critical thinking skills
  • 4 ways to leverage critical thinking assessments

What is critical thinking?

Critical thinking is the process of analyzing and evaluating information in a logical way. Although it has been a valuable skill since as far back as the early philosophers' era, it is just as vital today. For candidates to succeed in the digital economy, they need thinking skills that help them evaluate information critically.

Whether we realize it or not, we process tons of data and information on a daily basis: everything from social media and online news to data from apps like Strava – and that's on top of all the key metrics relating to our professional roles.

Without a shadow of a doubt, correctly interpreting information — and recognizing disinformation — is an essential skill in today’s workplace and everyday life. And that’s also why teaching critical thinking skills in education is so important to prepare the next generation for the challenges they will face in the modern workplace.

Critical thinking isn’t about being constantly negative or critical of everything. It’s about objectivity and having an open, inquisitive mind. To think critically is to analyze issues based on hard evidence (as opposed to personal opinions, biases, etc.) in order to build a thorough understanding of what’s really going on. And from this place of thorough understanding, you can make better decisions and solve problems more effectively. Bernard Marr | Source

Today, candidates with CT skills think and reason independently, question data, and use their findings to contribute actively to their team rather than passively taking in or accepting information as fact.

Why are critical thinking skills important?

In the workplace, those with strong CT skills no longer rely on their gut or instinct for their decisions. They’re able to problem-solve more effectively by analyzing situations systematically.

With these skills, they think objectively about information and other points of view and look for evidence to support their findings rather than simply accepting opinions or conclusions as facts.

When employees can turn critical thinking into a habit, it ultimately reduces personal bias and helps them be more open to their teammates’ suggestions — improving how teams collaborate and collectively solve problems.

Critical thinking vs. Problem solving – what’s the difference?

Let’s explore the difference between these two similar concepts in more detail.

Critical thinking is about processing and analyzing information to reach an objective, evidence-based conclusion. Let’s take a look at an example of critical thinking in action:

  • A member of the team suggests using a new app they've heard about to automate and speed up candidate screening. Some like the idea, but others in the team share reasons why they don't support it. So you visit the software website and look at the key features and benefits yourself, then you might look for reviews about it and ask your HR counterparts what they think of it. The reviews look promising, and a few of your fellow practitioners say it's worked well for them. Next, you look into the costs compared to the solution your team is already using and calculate that the return on investment (ROI) is good (a rough sketch of this calculation follows below). You arrive at the conclusion that it'd be worth testing the platform with the free trial version and recommend this to your team.
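To make the ROI step in the scenario above concrete, here is a minimal sketch of the arithmetic in Python. Every figure in it (licence costs, hours saved, hourly rate) is an invented assumption for illustration only, not real pricing for any screening tool.

    # Hypothetical ROI comparison for a candidate-screening tool.
    # All figures below are invented assumptions, not real pricing.

    def annual_roi(benefit: float, cost: float) -> float:
        """Return ROI as a fraction: (benefit - cost) / cost."""
        return (benefit - cost) / cost

    current_tool_cost = 4_800.0   # assumed yearly licence for the current solution
    new_tool_cost = 6_000.0       # assumed yearly licence for the new platform
    hours_saved_per_week = 6      # assumed screening time saved by automation
    recruiter_hourly_rate = 35.0  # assumed fully loaded hourly rate

    benefit = hours_saved_per_week * 52 * recruiter_hourly_rate  # value of time saved
    extra_cost = new_tool_cost - current_tool_cost               # added spend vs. today

    print(f"Estimated annual benefit: {benefit:,.0f}")
    print(f"Extra annual cost:        {extra_cost:,.0f}")
    print(f"ROI on the extra spend:   {annual_roi(benefit, extra_cost):.1%}")

If the estimated return comfortably exceeds the extra spend, the free trial recommended in the example becomes an easy, low-risk next step.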

On the other hand, problem solving can involve many of the same skills as critical thinking, such as observing and evaluating. Still, it focuses on identifying business obstacles and coming up with solutions. So, let's return to the example of the candidate screening software and see how it might work differently in the context of problem-solving:

  • For weeks, the talent acquisition team has complained about how long it takes to screen candidates manually. One of the team members decides to look for a solution to their problem. They assess the team’s current processes and resources and how to best solve the issues. In their research, they discover the new candidate screening platform and test out its functionality for a few days. They feel it would benefit the team and suggest it at the next meeting. Great problem solving, HR leader!


What are the 5 sub-skills that make up critical thinking?


Now that we've established what CT is, let's break it down into the 5 core sub-skills that make up a critical thinking mindset.

  • Observation: Being observant of your environment is the first step to thinking critically. Observant employees can even identify a potential problem before it becomes a problem.
  • Analysis: Once you've observed the issue or problem, you can begin to analyze its parts. It's about asking questions, researching, and evaluating the findings objectively. This is an essential skill, especially for someone in a management role.
  • Inference: Drawing a conclusion from limited information. To do this effectively may require in-depth knowledge of a field. Candidates with this skill can contribute a lot of value to a startup, for instance, where initially, there may be little data available for information processing.
  • Communication: This pertains to expressing ideas and your reasoning clearly and persuasively, as well as actively listening to colleagues' suggestions or viewpoints. When all members of a team or department can communicate and consider different perspectives, it helps tasks (and, well, everything) move along swiftly and smoothly.
  • Problem solving: Once you begin implementing a chosen solution, you may still encounter teething problems. At that point, problem solving skills will help you decide on the best solution and how to overcome the obstacles to bring you closer to your goal.

What is a critical thinking assessment test?

Though there are a few different ways to assess critical thinking, such as the Collegiate Learning Assessment, one of the most well-known tests is the Watson Glaser™ Critical Thinking Appraisal.

Critical thinking tests, or critical reasoning tests, are psychometric tests used in recruitment at all levels, graduate, professional and managerial, but predominantly in the legal sector. However, it is not uncommon to find companies in other sectors using critical thinking tests as part of their selection process. This is an intense test, focusing primarily on your analytical, or critical thinking, skills. Source

These tests are usually timed and typically include multiple choice items, short answers or short scenario-based questions to assess students or prospective candidates. They test candidates' ability to interpret data without bias, find logical links between information, and separate facts from false data.

Critical thinking example questions from the Watson Glaser test rubric

But how do these tests measure critical thinking?

In addition to educational and psychological testing, many employers today use critical thinking tests to assess a person’s ability to question information — to ask What , Why , and How of the data. A standard critical thinking test breaks down this aptitude by examining the following 5 components:

  • assumption – analyzing a scenario to determine if there are any assumptions made
  • deduction – the ability to choose which deductions are logical
  • evaluating evidence – in support of and against something
  • inference – conclusions, drawn from observed facts
  • interpretation – interpreting the accuracy of a stated conclusion (based on a scenario)
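As a minimal sketch of how results might be broken down by the five components listed above, the Python snippet below tallies per-component and overall scores. The component names come from the list; the items, answers, and scoring scheme are invented for illustration and do not reflect any publisher's actual rubric.

    # Sketch: per-component scoring for a critical thinking test.
    # Component names follow the list above; items and answers are invented.
    from collections import defaultdict

    # Each item: (component, candidate_answer, correct_answer)
    responses = [
        ("assumption", "Assumption Not Made", "Assumption Not Made"),
        ("deduction", "Conclusion Follows", "Conclusion Does Not Follow"),
        ("evaluating evidence", "Strong", "Strong"),
        ("inference", "Probably True", "Impossible to Deduce"),
        ("interpretation", "Conclusion Follows", "Conclusion Follows"),
    ]

    scores = defaultdict(lambda: [0, 0])  # component -> [correct, attempted]
    for component, given, correct in responses:
        scores[component][1] += 1
        if given == correct:
            scores[component][0] += 1

    for component, (right, total) in scores.items():
        print(f"{component:20s} {right}/{total}")

    overall = sum(right for right, _ in scores.values()) / len(responses)
    print(f"overall: {overall:.0%}")

Real tests weight and norm these components far more carefully; the point of the sketch is only that a single overall score usually hides a per-component profile.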

Why is it important to assess critical thinking skills during the recruitment process?

Critical thinking may be considered a soft skill, but it has become a prerequisite in certain industries, like software, and for many roles. Marketing managers, project managers, accountants, and healthcare professionals, for example, all require a degree of CT skills to perform their roles.

The kinds of businesses that require critical thinking include technology, engineering, healthcare, the legal sector, scientific research, and education. These industries are typically very technical and rely on data. People working in these fields research and use data to draw logical conclusions that help them work smarter and more efficiently.

In the hiring process, test takers with good critical thinking skills stand out. Why? Because they are able to demonstrate their ability to collaborate, problem-solve, and manage pressure in a rational, logical manner. As a result, they're more likely to make the right business decisions that boost efficiency and, ultimately, a business's bottom line.


Examples of jobs that rely on critical thinking skills

Critical thinking is not rocket science, but it is an important skill when making decisions — especially when the correct answer is not obvious. Here are a few examples of job roles that rely on critical thinking dispositions:

  • computer programmers or developers: may use critical thinking and other advanced skills in a variety of ways, from debugging code to analyzing the problem, finding potential causes, and coming up with suitable solutions. They also use CT when there is no clear roadmap to rely on, such as when building a new app or feature.
  • criminologists: must have critical thinking abilities to observe criminal behavior objectively and to analyze the problem in such a way that they can be confident in the conclusions they present to the authorities.
  • medical professionals: need to diagnose their patients' condition through observation, communication, analysis and solving complex problems to decide on the best treatment.
  • air traffic controllers: need a super clear, calm head to deal with their high-stress job. They observe traffic, communicate with pilots, and constantly problem-solve to avoid airplane collisions.
  • legal professionals: use logic and reasoning to analyze various cases – even before deciding whether they'll take on a case – and then use their excellent communication skills to sway people over to their reasoning in a trial setting.
  • project managers: have to deal with a lot of moving parts at the same time. To successfully keep projects on time and budget, they continually observe and analyze the progress of project components, communicate continually with the team and external stakeholders and work to solve any problems that crop up.

What are the risks of not testing for critical thinking?

By not evaluating critical thinking beforehand, you may end up hiring candidates with poor CT skills. Especially when hiring business leaders and for key positions, this has the potential to wreak havoc on a business. Their inaccurate assumptions are more likely to lead to bad decisions, which could cost the company money.

Weak critical thinking can result in a number of issues for your organization and justifies the expense or added effort of asking your candidate to complete critical thinking tests in the hiring process. For example, poor CT skills may result in:

  • making mistakes
  • not being able to take action when needed
  • working off false assumptions
  • unnecessary strain on work relationships

4 ways to assess critical thinking skills in candidates

Now that we’ve seen how important it is for most candidates today to have strong critical thinking skills, let’s take a look at some of the assessment instruments the talent acquisition team can use.

#1 – A homework assignment

A homework assignment is a task that assesses whether test takers have the right skills for a role. If critical thinking is essential for a particular job, you could provide candidates with a homework assignment that specifically tests their ability to do the following (a rough scoring sketch follows the list):

  • accurately interpret relevant evidence
  • reach logical conclusions
  • judge information skeptically
  • communicate their own viewpoint and others’ backed by facts
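As noted above, here is a rough scoring sketch for such an assignment, in Python. The criteria mirror the four abilities listed; the weights, 0-5 ratings, and band labels are assumptions made up for illustration, not an established rubric.

    # Hypothetical rubric for scoring a critical thinking homework assignment.
    # Criteria mirror the list above; weights and bands are illustrative only.

    RUBRIC = {
        "interprets relevant evidence accurately": 0.30,
        "reaches logical conclusions": 0.30,
        "judges information skeptically": 0.20,
        "communicates viewpoints backed by facts": 0.20,
    }

    def weighted_score(ratings: dict) -> float:
        """Combine 0-5 ratings per criterion into a weighted 0-5 score."""
        return sum(RUBRIC[criterion] * rating for criterion, rating in ratings.items())

    reviewer_ratings = {  # example ratings given by one reviewer
        "interprets relevant evidence accurately": 4,
        "reaches logical conclusions": 5,
        "judges information skeptically": 3,
        "communicates viewpoints backed by facts": 4,
    }

    score = weighted_score(reviewer_ratings)
    band = "strong" if score >= 4 else "adequate" if score >= 3 else "weak"
    print(f"weighted score: {score:.2f}/5 ({band})")

Scoring every candidate's assignment against the same weighted criteria keeps the review comparable and makes it easier to defend the final hiring decision.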


Tip: use Toggl Hire's skills screening tests to easily filter out the good candidates first and speed up your hiring process.

#2 – Behavioral and situational interview questions

Ask the candidate to provide examples of situations when they used CT for solving problems or making a decision. This can provide insight into the candidate’s ability to analyze information and make informed decisions. For example:

Critical thinking example questions:

  • Tell me about a time when you had to make a really difficult decision at work.
  • What would you do in a situation where your manager made a mistake in a presentation or report?
  • How would you respond if a colleague shared a new idea or solution with you?
  • How do you evaluate the potential outcomes of different actions or decisions?
  • Can you describe a situation where you had to think on your feet and come up with a creative solution to a problem?
  • How do you ensure that your decision-making is based on relevant and accurate information?


#3 – Discuss the candidate’s critical thinking skills with their references

Additionally, the hiring manager can ask the candidate’s references about how the candidate demonstrated CT skills in the past.

  • Can you recall a time when (the candidate) had to convince you to choose an alternative solution to a problem?
  • Tell me about a time when (the candidate) had to solve a team disagreement regarding a project.

#4 – Critical thinking tests

Ask the candidate to complete a critical thinking test and score it against critical thinking rubrics. You can then share feedback on their test scores with them and explore their willingness to improve their score, if necessary. Or compare their score to other applicants', and prioritize those with higher scores if the role truly requires a critical thinker.

Create your next critical thinking assessment with Toggl Hire

Assessing critical thinking skills is becoming a key component of the hiring process, especially for roles that require a particularly advanced skillset. Critical thinking is a predictor of future performance. Candidates who clearly demonstrate these skills have a lot to offer companies, from better decision-making to more productive relationships and cost savings.

If your team needs help automating the screening process and creating custom skills tests based on specific roles, try Toggl Hire’s skills test questions engine or the Custom Test Builder to create the exact questions you want from scratch.

Juste Semetaite

Juste loves investigating through writing. A copywriter by trade, she spent the last ten years in startups, telling stories and building marketing teams. She works at Toggl Hire and writes about how businesses can recruit really great people.

Critical Thinking Test

A Critical Thinking test, also known as a critical reasoning test, determines your ability to reason through an argument logically and make an objective decision. You may be required to assess a situation, recognize assumptions being made, create hypotheses, and evaluate arguments.

What questions can I expect?

Questions are likely based on the Watson-Glaser Critical Thinking Appraisal model, which contains five sections designed to assess how well an individual reasons analytically and logically. The five sections are:

Arguments: In this section, you are tested on your ability to distinguish between strong and weak arguments. For an argument to be strong, it must be both significant and directly related to the question. An argument is considered weak if it is not directly related to the question, is of minor importance, or confuses correlation with causation (the incorrect assumption that correlation implies causation).

Assumptions: An assumption is something taken for granted. People often make assumptions that may not be correct. Being able to identify these is a key aspect of critical reasoning. A typical assumption question will present a statement and several assumptions, and you are required to identify whether an assumption has been made.

Deductions: Deduction questions require you to draw conclusions based solely on the information provided in the question, disregarding your own knowledge. You will be given a passage of information and must evaluate whether a conclusion made from that passage is valid.

Interpretation: In these questions, you are given a passage of information followed by a proposed conclusion. You must treat the information as true and decide whether the proposed conclusion logically and undoubtedly follows.

Inferences: Inference involves drawing conclusions from observed or supposed facts. It is about deducing information that is not explicitly stated but is implied by the given information. For example, if we find a public restroom door locked, we infer that it is occupied.

Critical Thinking example:

Read the following statement and decide whether the conclusion logically follows from the information given.

Statement: Every librarian at the city library has completed a master’s degree in Library Science. Sarah is a librarian at the city library.

Conclusion: Sarah has completed a master’s degree in Library Science.

Does this conclusion logically follow from the statement?

Answer Options:

  • Yes, the conclusion follows
  • No, the conclusion does not follow

Explanation:

The statement establishes that every librarian at the city library has completed a master’s degree in Library Science. Since Sarah is identified as a librarian at this library, it logically follows that she has completed a master’s degree in Library Science. The conclusion is a direct inference from the given information.
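
For readers who want the deduction spelled out formally, the item is an instance of universal instantiation followed by modus ponens. Here is a minimal sketch in Lean 4, where the predicate names Librarian and HasMLIS are hypothetical labels introduced purely for this illustration:

```lean
-- A formal sketch of the example item: if every librarian at the city
-- library holds the degree, and Sarah is such a librarian, then Sarah
-- holds the degree. Predicate names are illustrative only.
example (Person : Type) (Librarian HasMLIS : Person → Prop) (sarah : Person)
    (h1 : ∀ p, Librarian p → HasMLIS p)  -- premise 1: all librarians have the degree
    (h2 : Librarian sarah)               -- premise 2: Sarah is a librarian
    : HasMLIS sarah :=                   -- conclusion: Sarah has the degree
  h1 sarah h2
```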

Where are critical thinking tests used?

Critical thinking tests are commonly used in educational institutions for admissions and assessments, particularly in courses requiring strong analytical skills. In the professional realm, they are a key component of the recruitment process for roles demanding problem-solving and decision-making abilities, and are also utilized in internal promotions and leadership development. Additionally, these tests are integral to professional licensing and certification in fields like law and medicine, and are employed in training and development programs across various industries.

Practice Critical Thinking Test

Try a free critical thinking test. This free practice test contains 10 test questions and has a time limit of 6 minutes.



ScholarWorks@GVSU, Grand Valley State University

Culminating Experience Projects

The Importance of Critical Thinking Skills in Secondary Classrooms

Clinton T. Sterkenburg, Grand Valley State University

Degree Name

Education-Instruction and Curriculum: Secondary Education (M.Ed.)

Degree Program

College of Education

First Advisor

Sherie Klee

Abstract

According to research, many students lack effective critical thinking skills. The ability to think critically is crucial for individuals to be successful and responsible. Many students have difficulty understanding this important skill and especially lack the ability to initiate and apply the process. Although it is a difficult task, educators have the responsibility to teach critical thinking skills to students and to discern when certain instructional methods or activities are not helping students. Each student is different, and their needs, which shape how they learn and process information, must be considered. Research has shown that traditional teaching methods that require students to regurgitate information do not prove helpful in teaching students to apply and understand the critical thinking process. Therefore, effective teachers expand upon traditional teaching methods and differentiate instructional and activity design for imparting critical thinking skills to students. This project presents some of the possible reasons students have difficulty thinking critically and provides examples of instructional and lesson design methods that are proven to help students understand critical thinking. The goal of this project is to provide a guide for secondary teachers to address the lack of critical thinking skills in many students. The ability to think critically will greatly benefit students and help them become productive members of society.

ScholarWorks Citation

Sterkenburg, Clinton T., "The Importance of Critical Thinking Skills in Secondary Classrooms" (2024). Culminating Experience Projects. 456. https://scholarworks.gvsu.edu/gradprojects/456


American Psychological Association

How to cite ChatGPT

Timothy McAdoo


We, the APA Style team, are not robots. We can all pass a CAPTCHA test, and we know our roles in a Turing test. And, like so many nonrobot human beings this year, we’ve spent a fair amount of time reading, learning, and thinking about issues related to large language models, artificial intelligence (AI), AI-generated text, and specifically ChatGPT. We’ve also been gathering opinions and feedback about the use and citation of ChatGPT. Thank you to everyone who has contributed and shared ideas, opinions, research, and feedback.

In this post, I discuss situations where students and researchers use ChatGPT to create text and to facilitate their research, not to write the full text of their paper or manuscript. We know instructors have differing opinions about how or even whether students should use ChatGPT, and we’ll be continuing to collect feedback about instructor and student questions. As always, defer to instructor guidelines when writing student papers. For more about guidelines and policies about student and author use of ChatGPT, see the last section of this post.

Quoting or reproducing the text created by ChatGPT in your paper

If you’ve used ChatGPT or other AI tools in your research, describe how you used the tool in your Method section or in a comparable section of your paper. For literature reviews or other types of essays or response or reaction papers, you might describe how you used the tool in your introduction. In your text, provide the prompt you used and then any portion of the relevant text that was generated in response.

Unfortunately, the results of a ChatGPT “chat” are not retrievable by other readers, and although nonretrievable data or quotations in APA Style papers are usually cited as personal communications, with ChatGPT-generated text there is no person communicating. Quoting ChatGPT’s text from a chat session is therefore more like sharing an algorithm’s output; thus, credit the author of the algorithm with a reference list entry and the corresponding in-text citation.

When prompted with “Is the left brain right brain divide real or a metaphor?” the ChatGPT-generated text indicated that although the two brain hemispheres are somewhat specialized, “the notion that people can be characterized as ‘left-brained’ or ‘right-brained’ is considered to be an oversimplification and a popular myth” (OpenAI, 2023).

OpenAI. (2023). ChatGPT (Mar 14 version) [Large language model]. https://chat.openai.com/chat

You may also put the full text of long responses from ChatGPT in an appendix of your paper or in online supplemental materials, so readers have access to the exact text that was generated. It is particularly important to document the exact text created because ChatGPT will generate a unique response in each chat session, even if given the same prompt. If you create appendices or supplemental materials, remember that each should be called out at least once in the body of your APA Style paper.

When given a follow-up prompt of “What is a more accurate representation?” the ChatGPT-generated text indicated that “different brain regions work together to support various cognitive processes” and “the functional specialization of different regions can change in response to experience and environmental factors” (OpenAI, 2023; see Appendix A for the full transcript).
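
Because a chat transcript is not retrievable by other readers, some authors capture the exact prompt and response at generation time so it can be reproduced in an appendix. The following is a minimal sketch using the openai Python package; the model name, output file, and overall workflow are assumptions for illustration, not an APA Style requirement.

```python
# Minimal sketch: record the exact prompt and response so the transcript
# can be reproduced in an appendix. Assumes the openai Python package
# (v1.x interface) and an OPENAI_API_KEY in the environment; the model
# name and output file are placeholders.
from datetime import datetime, timezone
from openai import OpenAI

client = OpenAI()
prompt = "Is the left brain right brain divide real or a metaphor?"

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
text = response.choices[0].message.content

with open("chatgpt_transcript_appendix_a.txt", "a", encoding="utf-8") as f:
    f.write(f"Retrieved: {datetime.now(timezone.utc).isoformat()}\n")
    f.write(f"Prompt: {prompt}\n")
    f.write(f"Response:\n{text}\n\n")
```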

Creating a reference to ChatGPT or other AI models and software

The in-text citations and references above are adapted from the reference template for software in Section 10.10 of the Publication Manual (American Psychological Association, 2020, Chapter 10). Although here we focus on ChatGPT, because these guidelines are based on the software template, they can be adapted to note the use of other large language models (e.g., Bard), algorithms, and similar software.

The reference and in-text citations for ChatGPT are formatted as follows:

  • Reference: OpenAI. (2023). ChatGPT (Mar 14 version) [Large language model]. https://chat.openai.com/chat
  • Parenthetical citation: (OpenAI, 2023)
  • Narrative citation: OpenAI (2023)

Let’s break that reference down and look at the four elements (author, date, title, and source):

Author: The author of the model is OpenAI.

Date: The date is the year of the version you used. Following the template in Section 10.10, you need to include only the year, not the exact date. The version number provides the specific date information a reader might need.

Title: The name of the model is “ChatGPT,” so that serves as the title and is italicized in your reference, as shown in the template. Although OpenAI labels unique iterations (i.e., ChatGPT-3, ChatGPT-4), they are using “ChatGPT” as the general name of the model, with updates identified with version numbers.

The version number is included after the title in parentheses. The format for the version number in ChatGPT references includes the date because that is how OpenAI is labeling the versions. Different large language models or software might use different version numbering; use the version number in the format the author or publisher provides, which may be a numbering system (e.g., Version 2.0) or other methods.

Bracketed text is used in references for additional descriptions when they are needed to help a reader understand what’s being cited. References for a number of common sources, such as journal articles and books, do not include bracketed descriptions, but things outside of the typical peer-reviewed system often do. In the case of a reference for ChatGPT, provide the descriptor “Large language model” in square brackets. OpenAI describes ChatGPT-4 as a “large multimodal model,” so that description may be provided instead if you are using ChatGPT-4. Later versions and software or models from other companies may need different descriptions, based on how the publishers describe the model. The goal of the bracketed text is to briefly describe the kind of model to your reader.

Source: When the publisher name and the author name are the same, do not repeat the publisher name in the source element of the reference, and move directly to the URL. This is the case for ChatGPT. The URL for ChatGPT is https://chat.openai.com/chat. For other models or products for which you may create a reference, use the URL that links as directly as possible to the source (i.e., the page where you can access the model, not the publisher’s homepage).

Other questions about citing ChatGPT

You may have noticed the confidence with which ChatGPT described the ideas of brain lateralization and how the brain operates, without citing any sources. I asked for a list of sources to support those claims and ChatGPT provided five references—four of which I was able to find online. The fifth does not seem to be a real article; the digital object identifier given for that reference belongs to a different article, and I was not able to find any article with the authors, date, title, and source details that ChatGPT provided. Authors using ChatGPT or similar AI tools for research should consider making this scrutiny of the primary sources a standard process. If the sources are real, accurate, and relevant, it may be better to read those original sources to learn from that research and paraphrase or quote from those articles, as applicable, than to use the model’s interpretation of them.

We’ve also received a number of other questions about ChatGPT. Should students be allowed to use it? What guidelines should instructors create for students using AI? Does using AI-generated text constitute plagiarism? Should authors who use ChatGPT credit ChatGPT or OpenAI in their byline? What are the copyright implications?

On these questions, researchers, editors, instructors, and others are actively debating and creating parameters and guidelines. Many of you have sent us feedback, and we encourage you to continue to do so in the comments below. We will also study the policies and procedures being established by instructors, publishers, and academic institutions, with a goal of creating guidelines that reflect the many real-world applications of AI-generated text.

For questions about manuscript byline credit, plagiarism, and related ChatGPT and AI topics, the APA Style team is seeking the recommendations of APA Journals editors. APA Style guidelines based on those recommendations will be posted on this blog and on the APA Style site later this year.

Update: APA Journals has published policies on the use of generative AI in scholarly materials.

We, the APA Style team humans, appreciate your patience as we navigate these unique challenges and new ways of thinking about how authors, researchers, and students learn, write, and work with new technologies.

American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). https://doi.org/10.1037/0000165-000
